AI Startup Cleon 'Clone' Service
Creating Virtual Humans with Face and Voice in 10 Minutes
Advancing Virtual Human Production Technology
[Asia Economy Reporter Seungjin Lee] “Hello. I am your clone.”
From just one photo, a virtual human resembling me appeared on the screen. The virtual me greets viewers in my own voice. Judging by the facial expressions and gestures, it is unmistakably me. It took only 10 minutes to create a virtual human resembling me from a single photo and a few voice samples. Virtual humans, which were previously created by companies for advertisements and other uses, have now entered everyday life.
One Face Photo + 30 Seconds of Voice = Virtual Human Created in 10 Minutes
AI startup Cleon showcased its deep human technology ‘Camelo’ at IFA 2022, Europe’s largest consumer electronics exhibition, held in Berlin, Germany. Camelo creates a virtual person who looks and sounds like the user from just one photo and 30 seconds of voice. While deepfake technology is familiar mainly as face synthesis, deep human technology can change not only the face but also the body shape and voice.
Cleon aims to popularize virtual humans. It is developing a service that lets anyone easily create a virtual human to serve as their avatar in the coming metaverse.
Currently, Cleon is running a beta service called ‘Clone,’ where anyone can experience creating a virtual human. Clone can create a virtual human with a face similar to the user’s in just 10 minutes from a single photo. By inputting a script, users can produce presentation videos of about 10 minutes, and the virtual human changes facial expressions and performs various human-like gestures.
When I tried the Clone service myself, a virtual human resembling me was created in about a minute from just one photo. Currently, to prevent misuse of other people’s photos, the service works by overlaying the face from the photo onto the face of a pre-existing model.
The biggest feature of the Clone service is that it has drastically reduced the cost of creating a virtual human, from somewhere between tens of millions and hundreds of millions of Korean won to around 5 million won. This is thanks to ‘zero-shot learning’ technology (which eliminates the need for large-scale data training). The underlying deep learning model has already been trained repeatedly on numerous faces, so that even from a single frontal face photo it can predict how the side profile looks. This enables the creation of a virtual human resembling the user from just one photo and significantly reduces production costs.
Additionally, the Clone service applies face and voice synthesis technology and lip shape generation technology to simulate a real person pronouncing the input text. Users can freely select the gender, language, pitch, and background, and simple body gestures can be expressed through body shape generation technology. All of this is completed in just 10 minutes.
An actual use case of the Clone service. After a model with the desired appearance, gender, age, race, and body type is selected, the face to be synthesized onto the model is uploaded, and the generated virtual human is used in a video.
Virtual Humans Active as Docents, Reporters, and Professional Educators... Transcending Time and Space
Currently, virtual humans mostly serve to promote specific companies or products on social media (SNS) or in advertisements. However, as technology advances, their activity areas are rapidly expanding.
Cleon provides virtual reporters to several domestic media outlets. When staff are stretched thin during breaking news, a virtual reporter delivers the news from an input script. Virtual humans are also used wherever the same content must be delivered repeatedly. Beyond corporate chatbot services, AI docents guide tours at a museum in Singapore in place of human docents.
The potential uses of virtual humans are expected to be limitless in the future. At this year’s IFA, Cleon unveiled the AI video automatic dubbing solution ‘Kling’ for the first time. Kling can dub dialogue into multiple languages using the original voice of the person in the video and synchronizes lip movements to match the dialogue. The demo version previously won the Innovation Award in the Software & Mobile Apps category at CES 2022.
Applying this to virtual humans lets companies that export goods engage foreign buyers in multiple languages at once. For example, a clothing company’s virtual human can explain fabrics and design concepts to overseas buyers in English, Chinese, Spanish, and other languages simultaneously. Cleon plans to provide virtual human services across various fields through technological innovation and collaboration with multiple companies.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
![[Virtual Human 2.0] A 'Virtual Human' Resembling Me Is Created Within 10 Minutes](https://cphoto.asiae.co.kr/listimglink/1/2022091408433430893_1663112615.png)

