Japan's NTT Develops Voice Conversion Technology
Convert Voice and Speaking Style to Your Preference
Media: "Needed in Call Centers with Loud and Abusive Calls"
Transforms Awkward English into Fluent Speech
Once Japan's NTT commercializes the technology, call centers will be able to convert the voice of even a shouting, malicious complainant into a pleasant, soft tone. The photo is an illustrative image unrelated to the specific content of the article.
A technology has been developed in Japan that makes the shouting voices of abusive complainants easier to listen to and transforms awkward English pronunciation into fluent speech.
On the 17th, Japan's NTT announced, "We have developed a voice conversion technology that instantly changes a voice and speaking style to a preferred style." The technology can be applied to a range of voice communication, whether face-to-face or remote. NTT stated, "This technology allows real-time voice conversion in web meetings and live streaming," adding, "In the future, it is expected to be applied in various scenarios, such as on smartphones and VR devices." NTT plans to showcase the technology at the NTT Communication Science Laboratories Open House 2024, starting on the 24th.
Japanese media expect that in Japan, where 'Kasu-hara' (customer harassment) is widespread, the technology will let call center employees respond more comfortably by converting the voices of abusive complainants in real time, and that it will also make conversations easier for people with speech impairments. In addition, during meetings or conversations in English, it can bring awkward pronunciation closer to that of a native speaker or steady a voice that trembles from nervousness.
NTT highlighted two technical features: converting a speaker's voice characteristics into those of another speaker, and low-latency conversion processing. Because meetings and conversations take place in real time, the low latency allows the converted voice to reach the other party without delay or interruption.
Japan's SoftBank has also recently developed an AI-based voice-alteration technology for phone calls, called 'Emotion Cancellation.' According to a report by Hong Kong's South China Morning Post (SCMP), SoftBank engineer Toshiyuki Nakatani developed the technology to protect call center employees who experience Kasu-hara. The technology detects when a caller is shouting or raising their voice and converts the voice into a natural, calm tone. It lowers the pitch of high-pitched female voices and softens male voices that may sound threatening.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.

