'King Sejong MacBook Throwing Incident'... How Korean AI 'ClovaX' Responds

More Detailed Information on Korea Than ChatGPT
Sources Included in Answers to Curb 'Hallucination'
Regrettably Less Fluent Than ChatGPT

Can Naver's large language model (LLM) chatbot service 'ClovaX' compete with the top-tier artificial intelligences (AIs) of the English-speaking world?

Naver CEO Choi Soo-yeon unveils the large language model 'HyperClovaX', built for hyperscale artificial intelligence (AI) services, at the 'Team Naver Conference Dan23' held on the 24th of last month at the Grand InterContinental Seoul Parnas in Gangnam-gu, Seoul. Photo by Kang Jin-hyung aymsdream@

ClovaX is the chatbot built on Naver's AI model HyperClovaX, which was launched on the 24th of last month. It recently entered beta service, with access being granted in sequence to users who applied.

Competition among LLM-based chatbot services has recently intensified on the international stage. The leader, ChatGPT, has already launched a paid subscription model, and open-source AIs like Meta's LLaMA continue to make strides. Amid this, what strengths does ClovaX, the first 'domestic AI', possess?

Trained on 6,500 Times More Korean Than ChatGPT

The 'King Sejong MacBook throwing incident' question, which sparked controversy over ChatGPT's reliability.
[Image source=Online Community]

Naver has highlighted 'Korean language specialization' as ClovaX's advantage. At the HyperClovaX launch press conference, Naver CEO Choi Soo-yeon emphasized, "It has learned 6,500 times more Korean compared to ChatGPT-3.5," adding, "It is an AI that understands the Korean language as well as Korean history, law, and systems."

What a language-generating AI knows is determined by the scale and quality of the 'dataset' it has been trained on. For an AI, the dataset is akin to a textbook. ChatGPT was developed by OpenAI, which is headquartered in the United States, so its dataset is inevitably dominated by English.

Because of this, ChatGPT's Korean language ability was not very strong. When asked questions in Korean, it often gave irrelevant answers or failed to understand the context.

A bigger problem was the 'hallucination' phenomenon, the most serious functional flaw of language models. This is when an AI generates completely fictitious information in the process of forming sentences. One example is the 'King Sejong MacBook Throwing Incident.'

When a user asked, "Tell me about the King Sejong MacBook throwing incident," ChatGPT created a story and explained it as if it were a real event. While this shows creativity, it was a fatal flaw for a 'chatbot.'

Tackling the Urgent 'Hallucination' Problem... Plus Access to the Latest Korean Information

ClovaX increased reliability by providing source links along with answers.
[Image source=ClovaX]

Having tried ClovaX, it appears to us that Naver has succeeded in curbing the hallucination problem seen in GPT-3.5. For one thing, ClovaX does not simply output an answer; it attaches the credible articles on which the answer is based. This makes it much easier to determine whether the AI's answer is a hallucination.

ClovaX's answer (above) and ChatGPT's answer. [Image source=ClovaX, ChatGPT]

ClovaX also has access to far more Korean information than ChatGPT-3.5. ChatGPT can still only explain events up to 2021. For example, if asked, "Who is the current president of South Korea?" ChatGPT responds that it cannot answer.

In contrast, ClovaX accurately answers "President Yoon Suk-yeol" and can also provide brief biographical information.

Will a Korean-Specialized AI Succeed?

Currently, most large generative AI models, such as ChatGPT, LLaMA, and Stable Diffusion, are developed in English-speaking countries such as the United States and the United Kingdom. As AI development advances, support for other languages inevitably tends to be neglected. This is why Naver's 'Korean language specialized AI' can be a strong selling point in the Korean market.

However, ClovaX still seems to have some way to go before it can compete directly with top-tier AIs. The first issue is service optimization.

When HyperClovaX was released on the 24th of last month, many users accessed it simultaneously, causing delayed or failed responses to commands (prompts) and temporarily disrupting the service. Thorough preparation will be needed to prevent such issues from recurring when the service is fully opened.

ClovaX was more stable, but ChatGPT was more fluent. [Image source=ClovaX, ChatGPT]

In addition, while ClovaX has several functions to prevent hallucination and is therefore more 'stable' than ChatGPT, it is not as fluent. For example, when asked about the contentious issue of 'Japan's release of contaminated water from nuclear power plants,' ChatGPT could introduce various perspectives, whereas ClovaX only provided a brief summary of how the issue had unfolded.

Naver plans to improve the performance of the HyperClovaX model that underpins ClovaX. It also plans to use the model to offer companies enterprise (B2B) AI solutions customized for the cloud, as well as productivity enhancement tools.