The government is accepting applications until August 7, 2025, from organizations seeking to serve as implementing agencies for the "Performance Evaluation Dataset Construction Project," which will build datasets for assessing the AI models developed by the elite teams participating in the "Independent AI Foundation Model" project.
This project, hosted by the Ministry of Science and ICT and the National Information Society Agency (NIA), aims to build datasets that will be used to evaluate the performance of domestic AI models.
Although various generative AI services have emerged, led by global big tech companies, most performance evaluations have relied on English-language benchmarks, which have not sufficiently reflected the service environment and context in Korea.
To address this limitation, and to objectively assess the performance of both domestic and international AI models while reflecting Korea's culture and social values, a total of 2.4 billion won (three projects, 800 million won per project) will be invested to build high-quality performance evaluation datasets.
This year, priority will go to three types of datasets: ▲ datasets for evaluating the mathematical problem-solving ability of LLMs (mathematics field); ▲ topic-specific question-answer and reasoning datasets for assessing Korea-specific knowledge (knowledge field); and ▲ datasets for evaluating performance on various tasks in long-form contexts (long-form comprehension field). In the future, datasets will also be developed to evaluate a wider range of generative AI models, including multimodal and agent-based models.
Each consortium of implementing agencies must include at least one company or organization with AI development capabilities based on large-scale datasets, such as large language models, natural language processing, or multimodal AI.
Kim Kyungman, Director-General for AI-Based Policy at the Ministry of Science and ICT, stated, "To secure high-performance, independent domestic AI foundation models that the public can truly experience, it is essential that the evaluation datasets also reflect Korea's social and cultural environment." He added, "The performance evaluation datasets established through this project will be made publicly available so that not only the elite teams but also future domestic AI development organizations can utilize them."
© The Asia Business Daily (www.asiae.co.kr). All rights reserved.


