Flitto Adds Dataset to 'Korean LLM Leaderboard'

Flitto, a company specializing in AI language data, announced on the 22nd that it has provided the datasets used to build newly added evaluation metrics on the ‘Open Ko-LLM Leaderboard,’ which assesses Korean large language models (LLMs).

The ‘Open Ko-LLM Leaderboard’ is a public platform for evaluating the performance of domestic LLMs, jointly operated by the National Information Society Agency (NIA) and Upstage since September last year. More than 1,500 models now participate in the leaderboard, with companies and research institutions driving AI development continuing to take part.

Flitto provided Korean benchmark datasets for the two evaluation categories newly added to the ‘Open Ko-LLM Leaderboard’: ‘Sentiment Evaluation (Ko-EQBench)’ and ‘Instruction Following (Ko-Instruction Following).’ The ‘Sentiment Evaluation’ metric, applied since the 16th, tests a model's ability to understand various emotions and social interactions within conversational contexts, while the ‘Instruction Following’ metric assesses whether the model accurately follows the instructions it is given. Instruction following is expected to become an important test criterion, serving as a gauge of how well enterprise-specific language models comply with given directives.

Flitto took part last month in building the ‘Common Sense Reasoning’ and ‘Mathematical Reasoning’ metrics, and it has now supplied the datasets for these new evaluation categories as well, contributing to the Korean LLM ecosystem. By providing high-quality Korean datasets, Flitto has enabled performance evaluations that better reflect Korean culture and language, which in turn helps improve the reliability of Korean large language models.

Lee Jeong-su, CEO of Flitto, stated, “We expect that the addition of these evaluation categories will further accelerate the development of domestic generative AI models,” and added, “We will continue to support the provision of high-quality language data so that Korean language performance can be evaluated more objectively and systematically.”
