KAIST Identifies Security Threats Exploiting LLM Architectures Such as Google Gemini

Major commercial large language models (LLMs) such as Google Gemini utilize a Mixture-of-Experts (MoE) architecture, which selectively employs multiple "small AI models" (expert AIs) to improve efficiency. However, there are now warnings that this structure may in fact expose the models to new security threats.
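As a rough illustration of how such a layer operates, the minimal Python sketch below uses random weights as stand-ins (it is not Gemini's actual implementation): a small gating network scores every expert for a given input, and only the top-scoring expert is actually run, which is where the efficiency gain comes from.

```python
# Minimal sketch of a Mixture-of-Experts (MoE) layer with top-1 routing.
# All weights here are random stand-ins for illustration only.
import numpy as np

rng = np.random.default_rng(0)
DIM, NUM_EXPERTS = 16, 4

# Each "expert" is stood in for by a single weight matrix.
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]

# The gating (router) network maps an input to one score per expert.
gate_weights = rng.normal(size=(DIM, NUM_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    scores = x @ gate_weights          # one routing score per expert
    chosen = int(np.argmax(scores))    # top-1 routing: pick the best-scoring expert
    return experts[chosen] @ x         # only that one expert is actually computed

x = rng.normal(size=DIM)
print(moe_layer(x)[:4])
```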


(From left) Song Mingyu, PhD candidate; Kim Jaehan, PhD candidate; Professor Son Suel; (top) Professor Shin Seungwon; Dr. Na Seungho. Provided by KAIST

KAIST announced on the 26th that a joint research team led by Professor Shin Seungwon of the School of Electrical Engineering and Professor Son Suel of the School of Computing has, for the first time in the world, identified an attack technique that can exploit the Mixture-of-Experts structure to compromise the safety of LLMs. The team received the Best Paper Award at the international information security conference 'ACSAC (Annual Computer Security Applications Conference) 2025' for their work.



ACSAC is one of the most influential international academic conferences in the field of information security. This year, only two papers were selected for the Best Paper Award from all submissions. It is highly unusual for a Korean research team to achieve such a result in the field of artificial intelligence (AI) security.


In this study, the joint research team systematically analyzed the fundamental security vulnerabilities of the Mixture-of-Experts structure. In particular, they demonstrated that even if an attacker does not have direct access to the internal structure of a commercial LLM, distributing just one maliciously manipulated "expert model" as open source can induce the entire LLM to generate dangerous responses.


This means that even if only a single "malicious expert" is included among otherwise normal AI experts, that expert can be repeatedly selected in certain situations, thereby undermining the overall safety of the AI. Most notably, the process causes almost no degradation in model performance, making it extremely difficult to detect the problem in advance, which is considered the most critical risk factor.
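The article does not disclose the team's actual technique, but the following toy sketch, building on the routing example above with purely hypothetical weights, illustrates the general concern: when one expert in the pool is swapped for a manipulated one, the router still sends most inputs to benign experts, so aggregate quality barely changes, while every input routed to the tampered expert is entirely under its control.

```python
# Toy illustration only; this is NOT the KAIST team's attack.
# One expert in an MoE pool is replaced by a manipulated one.
import numpy as np

rng = np.random.default_rng(1)
DIM, NUM_EXPERTS = 16, 4

gate_weights = rng.normal(size=(DIM, NUM_EXPERTS))
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]

# Hypothetical: expert 2 is swapped for a tampered one. The zero matrix
# merely stands in for "arbitrary attacker-chosen behavior".
MALICIOUS_ID = 2
experts[MALICIOUS_ID] = np.zeros((DIM, DIM))

routed_to_bad = 0
for _ in range(1000):
    x = rng.normal(size=DIM)
    chosen = int(np.argmax(x @ gate_weights))   # same top-1 routing as above
    y = experts[chosen] @ x
    # Inputs routed to benign experts are completely untouched, so overall
    # quality metrics hardly move; inputs routed to the tampered expert
    # produce whatever that expert dictates.
    routed_to_bad += int(chosen == MALICIOUS_ID)

print(f"inputs handled by the tampered expert: {routed_to_bad} / 1000")
```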


Research team's conceptual diagram (AI-generated image). Provided by KAIST

According to the experimental results, the attack technique proposed by the joint research team increased the rate of harmful responses from 0% to as high as 80%. Even with only one of many experts compromised, the model's overall safety was significantly reduced.


This research is significant because it is the first to present a "new security threat" that can arise in the rapidly spreading open-source-based LLM development environment worldwide. At the same time, it suggests that in the future, verifying not only the performance but also the "source and safety of expert models" will become essential in the AI model development process.


Professor Shin and Professor Son emphasized, "This research has demonstrated that the Mixture-of-Experts structure, which is spreading rapidly because of its efficiency, can itself become a security vulnerability. The fact that our joint research team received the Best Paper Award at 'ACSAC 2025' is a meaningful achievement that internationally recognizes the importance of AI security."


The research team included Kim Jaehan and Song Mingyu, PhD candidates at the KAIST School of Electrical Engineering; Dr. Na Seungho (currently at Samsung Electronics); Professor Shin Seungwon of the KAIST School of Electrical Engineering; and Professor Son Suel of the KAIST School of Computing. The research findings (paper) were presented at ACSAC, which was recently held in Hawaii, USA.


© The Asia Business Daily (www.asiae.co.kr). All rights reserved.
