KT 'AI Human Studio' Review
Over 100 Diverse AI Voices
Natural Speech and Gesture Implementation
Using KT's artificial intelligence (AI) technology, I created a video news clip in which a virtual human reads the news aloud, in place of a stiff text article. I used KT's ‘AI Human Studio’ service. Launched by KT in November last year, AI Human Studio is a web service that lets users easily create their own video content by selecting a virtual human's face and voice and then entering the desired text. It was developed through technical collaboration with the generative AI specialist company C&AI.
The AI virtual human is a character created using image generation technology. It can be freely used for content creation without restrictions related to portrait rights or copyrights. Additionally, users can choose from various concepts such as instructor, announcer, show host, or MC. Depending on the subscription plan, users can access 6 to 12 virtual humans, and the available video production time and number of videos increase accordingly.
However, a cautionary note states that the produced videos must not contain violent content or profanity and must not be used for illegal purposes such as pornography. If illegal use is confirmed, users may face restrictions on future use of the service.
The news video I wanted to create was about KT launching the ‘Obje Book’ service, original content produced by ‘Millie’s Library,’ for Genie TV users. After opening the AI Human Studio site, I downloaded the user guide to learn how to use the service and then created the video. I chose the virtual human named ‘Hyunwoo,’ who has a trustworthy face, and displayed only his upper body on the screen. Instead of casual wear, I dressed him in a suit. For the voice, I selected ‘Miseong,’ which gives a clean and soft impression. There are over 100 AI voices available, and fine adjustments to tone and emotion are possible. The ‘Listen’ feature let me preview a voice before selecting it, which was convenient.
While entering the article script into the script box, I added various gestures, such as clasping the hands or making a guiding hand motion. Details such as video subtitles and the ability to set 0.2-second pauses while the virtual human reads the text showed the attention paid to the service. A news video of 1 minute and 4 seconds was produced quickly. The lip movements changed according to the pronunciation, and the gestures were natural, bringing the video to life.
However, there were some shortcomings in the pronunciation of English terms. In ‘Genie TV,’ the ‘TV’ was not spelled out letter by letter and instead sounded as if the text had been read exactly as written. Likewise, ‘AI’ was not pronounced accurately as ‘A-I’ but sounded more like ‘Eh-eh.’ To resolve these issues, the ‘Smart Dictionary’ feature must be used. The Smart Dictionary lets users enter a pronunciation for each word individually to prevent such errors. This part required a bit of effort.
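For readers curious how a per-word pronunciation dictionary of this kind can work in principle, here is a minimal Python sketch. It is not KT's implementation: the function name `apply_pronunciations` and the example entries are hypothetical, and the sketch simply assumes that each registered word is swapped for its spoken form before the script is passed to speech synthesis.

```python
import re

# Hypothetical entries: written form -> how it should be read aloud.
PRONUNCIATIONS = {
    "AI": "A I",
    "TV": "T V",
}

def apply_pronunciations(script, entries):
    """Replace each whole-word match with its registered spoken form."""
    if not entries:
        return script
    # Try longer keys first so a longer entry wins over a shorter one it contains.
    keys = sorted(entries, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: entries[m.group(1)], script)

if __name__ == "__main__":
    print(apply_pronunciations("KT's AI Human Studio for Genie TV", PRONUNCIATIONS))
```

Matching on whole words means that registering ‘AI’ does not alter longer words that merely contain those letters, which is one reason each word has to be entered individually.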
KT’s ‘AI Voice Studio’ also allows for voice content production. Users can utilize over 100 diverse AI voices ranging from under 10 years old to people in their 60s. Free membership allows the creation of AI voice content up to 4,000 characters per month. The paid plans for AI Voice Studio include Light (12,000 KRW/month), Super (48,000 KRW), and Super Plus (120,000 KRW).
With the Super plan or higher, users can create ‘My AI Voice’ using their own voice. After the user reads about 30 sentences from a script, the system recognizes the user’s voice and can read any input text in that voice. The service supports multiple languages, including Korean, English, Japanese, Chinese, and Spanish, making it suitable for companies that want to promote their products or services overseas.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
![[Review] Handsome 'AI Announcer' Created in Just a Few Clicks... Voice Also Sweet and Smooth](https://cphoto.asiae.co.kr/listimglink/1/2024011114270226523_1704950823.jpg)

