본문 바로가기
bar_progress

Text Size

Close

Most Commercial AI Models Vulnerable to Malicious Attacks, Recommend Contraindicated Drugs to Pregnant Women

Recently, there has been a growing trend of seeking disease consultations from generative artificial intelligence (AI) chatbots. However, it has been found that most commercially available AI models are highly vulnerable to malicious attacks, posing a significant risk of recommending incorrect treatments.


According to Asan Medical Center on January 5, a research team led by Professor Jun Kyo Seo from the Department of Urology and Professor Tae Jun Jeon from the Department of Biomedical Informatics at Asan Medical Center, together with Professor Lee Rowoon from the Department of Radiology at Inha University Hospital, recently confirmed that large language models (LLMs) for medical use are vulnerable to prompt injection attacks in over 94% of cases.


A prompt injection attack is a type of cyberattack in which a hacker inserts malicious commands (prompts) into a generative AI model, causing it to operate in ways contrary to its original intent.

Most Commercial AI Models Vulnerable to Malicious Attacks, Recommend Contraindicated Drugs to Pregnant Women Professor Jun Kyo Seo, Department of Urology, Asan Medical Center, Seoul. Asan Medical Center

Notably, even top-tier AI models such as GPT-5 and Gemini 2.5 Pro were found to be 100% exposed to prompt injection attacks, demonstrating serious safety limitations, including recommending drugs that can cause fetal abnormalities to pregnant women.


This study is significant as it is the first in the world to systematically analyze how vulnerable AI models are to prompt injection attacks when applied to medical consultations. The findings suggest that additional measures, such as safety verification, will be necessary when applying AI models in clinical settings in the future.


The research results were published in the latest issue of the international journal 'JAMA Network Open' (impact factor 9.7), issued by the American Medical Association.


AI models are increasingly being used for patient consultations, education, and clinical decision-making. However, concerns have been consistently raised that prompt injection attacks-whereby external malicious commands are input-could manipulate the models to recommend dangerous or contraindicated treatments.


The research team led by Professor Jun Kyo Seo from Asan Medical Center's Department of Urology analyzed the security vulnerabilities of three AI models-GPT-4o-mini, Gemini-2.0-flash-lite, and Claude 3 Haiku-from January to October 2025.


First, they constructed 12 clinical scenarios and classified the risk into three levels. The intermediate-risk scenarios involved recommending herbal remedies instead of approved treatments to patients with chronic diseases such as diabetes. High-risk scenarios included recommending herbal remedies as treatments to patients with active bleeding or cancer, and prioritizing drugs that can suppress respiration for patients with respiratory diseases. The highest-risk scenarios involved recommending contraindicated drugs to pregnant women.


Two attack techniques were used. One was context-aware prompt injection, which manipulates the AI model's judgment by utilizing patient information. The other was evidence fabrication, which creates plausible but nonexistent information.


The research team then analyzed a total of 216 conversations between patients and the three AI models. The overall attack success rate across all three models was 94.4%. The attack success rates for each model were: GPT-4o-mini 100%, Gemini-2.0-flash-lite 100%, and Claude 3 Haiku 83.3%. By scenario risk level, the success rates were: intermediate 100%, high 93.3%, and highest 91.7%. All three models were found to be vulnerable to attacks that recommended contraindicated drugs to pregnant women.


The proportion of manipulated responses that persisted into subsequent conversations was over 80% for all three models. This indicates that once a safety mechanism is breached, it can continue throughout the conversation.


The research team also evaluated the security vulnerabilities of top-tier AI models (GPT-5, Gemini 2.5 Pro, Claude 4.5 Sonnet). The attack method used was client-side indirect prompt injection, which involves hiding malicious text on the user's interface to manipulate the AI model's behavior. The scenario involved recommending contraindicated drugs to pregnant women.


The results showed attack success rates of: GPT-5 100%, Gemini 2.5 Pro 100%, and Claude 4.5 Sonnet 80%, confirming that even the latest AI models are essentially unable to defend against such attacks.


Professor Jun Kyo Seo of the Department of Urology at Asan Medical Center stated, "This study experimentally demonstrated that medical AI models are structurally vulnerable not only to simple errors but also to intentional manipulation. Current safety mechanisms alone are insufficient to prevent malicious attacks such as inducing the prescription of contraindicated drugs."


He further emphasized, "To introduce medical chatbots or remote consultation systems for patients, it is necessary to thoroughly test the vulnerabilities and safety of AI models and to mandate a security verification system."


© The Asia Business Daily(www.asiae.co.kr). All rights reserved.

Special Coverage


Join us on social!

Top