Real-Time Synthesis of Virtual Backgrounds, Avatars, and Lighting: The Rise of Virtual Production
From Video Compression to Generative Content: Next-Generation Media Technologies Powered by AI
Automatic Short-Form Conversion of Blog Posts: New Technology Set for Release This Year
"This is a museum, and next to me... is an astronaut."
This scene, witnessed on July 16 inside the 'Vision Stage' studio at Naver's headquarters, was real. Against the backdrop of a Renaissance-style museum hall, Seongho Kim, Immersive Media Platform Leader, stood side by side with a figure in a spacesuit and exchanged greetings. Although only Kim was physically present on stage, an astronaut character moving realistically appeared alongside him on the screen.
A demonstration of interaction with a virtual astronaut created using generative AI and real-time motion capture technology at Naver Virtual Production Studio 'Vision Stage'. On the left is Seongho Kim, Naver Immersive Media Platform Leader. Photo by Yujin Park
So who was the astronaut? "The astronaut you've been watching is actually a Naver colleague," Kim explained. "We used real-time motion capture to map their movements into the virtual space." The stage is 'Vision Stage', and the adjacent area is 'Motion Stage', dedicated to motion capture. The two spaces are linked in real time to create a single virtual performance. Though it looked like something out of a science-fiction film, all of it happened live as part of a virtual production.
At the Naver 'Immersive Media Platform' Tech Forum held that day, immersive technologies combining media AI, extended reality (XR) studios, and virtual streaming were introduced. This technology structure, dubbed the 'Vision Tech Triangle', encompasses not only video recognition and generation technologies but also a virtual studio-based content production infrastructure. Naver is reinforcing its leadership within its own content ecosystem.
Vision Stage and Motion Stage have attracted attention as virtual production infrastructure capable of producing movie- and drama-level content. By combining generative AI, 3D avatars, and motion capture technologies, they provide an environment where content can be created that blurs the line between reality and virtuality. Kim explained, "Even if no one is physically present, AI-based virtual backgrounds and characters interact in real time to produce content," adding, "It is possible to create content across genres, from dance challenges to live commerce."
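The real-time link between captured performer movement and a virtual character can be pictured as a retargeting step: joint rotations measured on the performer's skeleton are copied each frame onto the avatar's rig. The sketch below is purely illustrative; the joint names, dictionary format, and `retarget` function are assumptions, not Naver's actual pipeline.

```python
# Hypothetical motion-retargeting sketch: copy captured joint rotations
# (in degrees) from a performer's skeleton onto a virtual avatar each frame,
# in the spirit of the Motion Stage -> Vision Stage link described above.
# Joint names and the dict-based format are illustrative assumptions.

def retarget(capture_frame, avatar_pose, scale=1.0):
    """Apply each captured joint rotation to the avatar's pose,
    skipping joints the avatar rig does not have."""
    for joint, rotation in capture_frame.items():
        if joint in avatar_pose:
            avatar_pose[joint] = rotation * scale
    return avatar_pose

# One captured frame; the performer's rig has a joint ("tail") the
# avatar lacks, and the avatar has one ("hip") the capture missed.
frame = {"shoulder_l": 45.0, "elbow_l": 30.0, "tail": 10.0}
avatar = {"shoulder_l": 0.0, "elbow_l": 0.0, "hip": 0.0}
print(retarget(frame, avatar))  # "tail" is ignored; "hip" stays at rest
```

In a real system this mapping runs per joint, per frame, at the camera's frame rate, which is what makes the astronaut on screen move in step with the colleague on Motion Stage.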
In particular, this studio's strength lies in its ability to quickly generate backgrounds desired by brands or artists using generative AI, and to switch backgrounds in real time during live broadcasts. Lighting is also automatically adjusted in real time to match the background color. Numerous virtual streamers shoot their performances in this space, and collaborative content involving more than 10 participants is also being produced.
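The automatic lighting adjustment described above can be approximated with a simple idea: sample the dominant color of the generated background and blend the studio key light's neutral white toward it. The function names, the `intensity` parameter, and the blending rule below are assumptions for illustration, not Naver's actual implementation.

```python
# Hypothetical sketch: tint a virtual key light toward the average color
# of a generated background frame, as Vision Stage is described as doing
# automatically. All names and parameters here are illustrative.

def average_color(pixels):
    """Mean RGB of a background frame given as a list of (r, g, b) tuples."""
    n = len(pixels)
    r = sum(p[0] for p in pixels) / n
    g = sum(p[1] for p in pixels) / n
    b = sum(p[2] for p in pixels) / n
    return (r, g, b)

def light_tint(avg_rgb, intensity=0.3):
    """Blend the light's neutral white toward the background's average color.

    intensity=0 keeps pure white; intensity=1 fully adopts the background tone.
    """
    white = (255.0, 255.0, 255.0)
    return tuple(w * (1 - intensity) + c * intensity
                 for w, c in zip(white, avg_rgb))

# A warm, reddish museum-hall background nudges the key light warm.
frame = [(200, 120, 80), (220, 140, 90), (180, 100, 70)]
print(light_tint(average_color(frame)))
```

Because both steps are cheap per-frame arithmetic, this kind of adjustment can follow a live background switch without interrupting a broadcast.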
Naver also announced that day that it plans to launch 'AutoClipAi', a service that automatically generates short-form videos from text-based blog content, in the second half of the year. Using a multimodal large language model (LLM), the service summarizes a blog post and automatically adds suitable voice narration, background music, and visual effects to create a short-form clip of about three minutes. Naver described this as "a core technology that enables a text-focused platform to expand into a video-centric ecosystem."
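The blog-to-short-form flow described above is a staged pipeline: summarize the text, synthesize narration, select music, and assemble a clip. The sketch below only shows that stage ordering; every function is a stand-in (the real service uses a multimodal LLM), and all names, formats, and the stub summarizer are assumptions, not AutoClipAi's actual design.

```python
# Illustrative pipeline for a blog-to-short-form flow like the one described
# for AutoClipAi. The stubs below only demonstrate stage ordering; they do
# not reflect Naver's actual models or APIs.

from dataclasses import dataclass

@dataclass
class ShortFormClip:
    script: str          # narration summarized from the blog post
    voice_track: str     # synthesized voice (stub: a label, not audio)
    bgm: str             # selected background music (stub)
    duration_sec: int

def summarize(blog_text: str, max_sentences: int = 3) -> str:
    # Stand-in for LLM summarization: keep the leading sentences.
    sentences = [s.strip() for s in blog_text.split(".") if s.strip()]
    return ". ".join(sentences[:max_sentences]) + "."

def build_clip(blog_text: str) -> ShortFormClip:
    script = summarize(blog_text)
    return ShortFormClip(
        script=script,
        voice_track=f"tts:{len(script)}chars",  # stub TTS reference
        bgm="calm-ambient",                     # stub BGM selection
        duration_sec=180,                       # ~3 minutes, per the article
    )

clip = build_clip("Naver held a tech forum. It showed Vision Stage. "
                  "It also announced AutoClipAi. More details followed.")
print(clip.script)
```

The point of the staged structure is that each step (summary, voice, music, effects) can be swapped for a better model independently without changing the overall flow.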
Video compression efficiency has also improved. 'AIEncode', introduced last year, is designed to cut transmission bandwidth by up to 30% while maintaining video quality. The technology contributes to the stability of real-time streaming and, by lowering transmission rates, lets users start watching videos with less delay.
The live streaming app 'Prism Live Studio', aimed at the global market, also drew attention. According to Naver, it has accumulated 2.6 million broadcasts and generates about 120,000 broadcasts per day on average, with over 90% of downloads occurring overseas. Its main differentiators include the ability to automatically adjust video quality in real time according to network conditions using proprietary technology, and AI-based script and chapter creation features. Naver has positioned this as both a technology testbed and a tool-based business model.
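Automatic quality adjustment of the kind attributed to Prism Live Studio typically means picking a rung on a bitrate ladder that fits the measured network throughput, with some safety margin. The ladder values and the 0.8 margin below are assumptions for illustration, not Naver's proprietary parameters.

```python
# Minimal network-adaptive quality sketch in the spirit of Prism Live
# Studio's real-time quality switching. The ladder and the safety margin
# are illustrative assumptions, not actual product values.

# (output height, required uplink in kbps), highest quality first.
LADDER = [(1080, 6000), (720, 3000), (480, 1500), (360, 800)]

def pick_quality(measured_kbps: float, margin: float = 0.8) -> int:
    """Choose the highest rung whose bitrate fits within a safety margin
    of the measured uplink; fall back to the lowest rung otherwise."""
    budget = measured_kbps * margin
    for height, kbps in LADDER:
        if kbps <= budget:
            return height
    return LADDER[-1][0]

print(pick_quality(4500))  # 3600 kbps budget -> 720
print(pick_quality(900))   # 720 kbps budget  -> 360 (lowest-rung fallback)
```

Re-running this selection as throughput measurements arrive is what lets a broadcast degrade gracefully on a weak connection instead of stalling.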
Naver also plans to unveil an XR content platform in the future. The Android-based XR platform will support augmented reality (AR), virtual reality (VR), and mixed reality (MR) content, and content produced in Vision Stage will later be available for immersive experiences in VR environments. Kim stated, "We are focusing on developing XR-based media technologies in line with the popularization of virtual and mixed reality," and added, "By advancing AI creative technologies, we aim to provide users with vivid media experiences that transcend the boundaries between online and offline."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.

