A.X K1 Technical Report Released
Outperforms DeepSeek in Mathematics and Coding
Enhanced Efficiency Through Selective Parameter Activation
On January 7, SK Telecom's elite team announced that it had released the technical report for its ultra-large artificial intelligence (AI) model, "A.X K1," which has 519 billion parameters (519B), on the open-source AI model platform Hugging Face.
The SK Telecom elite team developed A.X K1, the first Korean model to exceed 500 billion parameters, using approximately 1,000 graphics processing units (GPUs); the total training compute was estimated from the training period and the number of GPUs used.
A visitor uses the A.X K1 model at the SK Telecom elite team experience space set up at COEX, where the independent AI foundation model presentation was held on the 30th of last month. Provided by SK Telecom
Based on this estimate, the maximum model size was set according to scaling laws, which hold that model performance improves predictably with the resources invested. The training data amounted to roughly 10 trillion (10T) data points. The team developed the model without government support, relying solely on self-procured GPUs.
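The article does not say which scaling law the team applied. As an illustration only, one widely cited formulation (the compute-optimal "Chinchilla" law of Hoffmann et al., 2022) models training loss L as a function of parameter count N and training tokens D:

```latex
% Illustrative scaling law only; A.X K1's actual fitting procedure is not
% described in this article. E is the irreducible loss; A, B, \alpha, \beta
% are constants fitted from smaller training runs, and C is the compute budget.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}, \qquad C \approx 6ND
```

Fitting such a curve on smaller runs lets a team choose the largest model size its compute budget and collected data can support, which is the kind of calculation the passage above describes.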
According to the elite team, A.X K1 achieved performance comparable to or exceeding that of global models such as China's DeepSeek-V3.1 on major benchmarks. In particular, it showed its strength in fields that require complex calculation and reasoning, such as mathematics and coding.
A.X K1 scored 89.8 points on the "AIME25" benchmark, which measures a model's mathematical problem-solving ability, edging out DeepSeek-V3.1 (88.4 points) and reaching about 102% of its score. AIME25 draws its problems from the American Invitational Mathematics Examination (AIME), a U.S. high-school mathematics competition.
On the coding benchmark "LiveCodeBench," A.X K1 recorded 75.8 points on English-based tasks and 73.1 points on Korean-based tasks, demonstrating its ability to solve live coding problems. These scores are about 109% and 110%, respectively, of those of DeepSeek-V3.1, which scored 69.5 in English and 66.2 in Korean.
Operational efficiency has also been improved. A.X K1 adopts a Mixture of Experts (MoE) architecture, which provides both stability and efficiency during AI training: for any given input, only 33 billion of its 519 billion parameters are activated. MoE is an approach in which several smaller expert networks divide up the work, with a router sending each input only to the experts best suited to handle it.
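As a rough illustration of how such selective activation works (this sketch is not A.X K1's actual code, and the layer sizes, expert count, and top-k value below are arbitrary assumptions), a top-k routed MoE layer in PyTorch might look like this:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal top-k Mixture of Experts layer (illustrative sketch only)."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])
        # The router scores every expert for each token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -> flatten to individual tokens.
        tokens = x.reshape(-1, x.shape[-1])
        scores = self.router(tokens)                        # (n_tokens, n_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)   # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(tokens)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)

# Usage: only top_k of the n_experts feed-forward networks run for each token.
layer = MoELayer(d_model=64, d_hidden=256, n_experts=8, top_k=2)
y = layer(torch.randn(2, 16, 64))
```

Because only a few experts run per token, the parameters actually touched are a small fraction of the total, which is the same principle behind activating 33 billion of A.X K1's 519 billion parameters.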
SK Telecom plans to add multimodal capabilities to A.X K1 and expand it to a trillion-parameter scale within this year.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.

