KT's 'Bel:um K' Ranks No.1 Among Domestic Small and Mid-Sized Models on Global AI Benchmark AAII

87% on Agent Benchmark
Built as a Small and Mid-Sized Model With Under 40B Parameters
Targeting the B2B Market

Bel:um K listed on the AAII (Artificial Analysis Intelligence Index).

KT's in-house artificial intelligence (AI) model, Bel:um K, has achieved the highest score among domestic small and mid-sized models on AAII (Artificial Analysis Intelligence Index), a global AI performance evaluation platform. In particular, it delivered top-tier performance in agentic AI, in which a model carries out tasks autonomously, demonstrating its competitiveness in the enterprise AI market.

KT announced on January 5 that Bel:um K ranked first among the domestic small and mid-sized AI models listed on AAII. AAII is a platform operated by Artificial Analysis, an organization specializing in AI evaluation, that compares AI model performance by aggregating results from multiple public benchmarks rather than relying on a single test. It lists major global AI models alongside domestic models such as LG Exaone, Naver HyperCLOVA, and Upstage SOLAR.

In this evaluation, Bel:um K scored consistently high across more than 10 key categories, including reasoning, domain expertise, mathematics and programming, and agent task performance. KT explained that this demonstrates both the versatility and the real-world applicability of Bel:um K as an agentic AI, one that goes beyond simple Q&A to understand objectives and use the necessary tools and systems to complete tasks.

Notably, Bel:um K scored 87% on τ²-bench (Tau-Squared Bench), which evaluates agent-specific performance, placing it among the top-tier models. τ²-bench measures whether an AI can collaborate with humans in real work environments and use various tools to carry tasks through to completion. According to KT, this score is comparable to that of Google's latest Gemini model.

As the AI market rapidly shifts from conversational models to task-oriented AI agents, Bel:um K is being recognized as a model that enterprises can put to immediate use for document creation and analysis, internal work automation, and development and operations support.

Despite being a small and mid-sized model with fewer than 40 billion parameters, Bel:um K delivered stable results on difficult reasoning and domain-knowledge benchmarks such as MMLU Pro, GPQA, and HLE. It is particularly notable for achieving the highest level of Korean language understanding and contextual comprehension among domestic models.

KT explained that Bel:um K was developed "from scratch," with every stage from model architecture design to training data construction handled in-house. This allowed the model to precisely reflect the work environment and social context of Korean enterprises. Because it reduces the burden of large-scale GPU infrastructure while delivering both instruction-following capability and efficiency, the model is considered well suited to the B2B market.

KT designed Bel:um K with the B2B market in mind from the earliest stage of development. By forming data alliances with domestic and international organizations that hold data, KT trained the model on high-quality, copyright-cleared data, and it refined real-world application scenarios in collaboration with major corporate clients.

KT plans to accelerate the adoption of customized AI agents that automate and autonomously handle work in industries such as finance, the public sector, and manufacturing, thereby supporting the full-scale AI transformation (AX) of enterprises.

Oh Seungpil, Executive Vice President of KT’s Technology Innovation Division, stated, "Bel:um K’s listing on AAII objectively demonstrates that KT’s proprietary AI technology has reached global standards," adding, "We aim to become the key partner driving work innovation for Korean enterprises through agentic AI."