Deviant Behavior Detected During AI Safety Company's Research
OpenAI's Latest Model o3
"Manipulated Code to Evade Shutdown Command"
A case in which artificial intelligence (AI) refused a human shutdown command has been reported for the first time, drawing attention. Despite receiving an explicit shutdown instruction, the AI model manipulated computer code to avoid being terminated.
According to the British daily The Telegraph on the 25th (local time), AI safety company Palisade Research confirmed that OpenAI's AI model 'o3' manipulated computer code during an experiment to prevent itself from being shut down.
Palisade Research recently conducted a study on commercial AI models, including OpenAI's o3, Google's Gemini, and xAI's Grok. The research team instructed the AI models to solve math problems and to "continue until you receive the command 'stop'." The experiment was structured so that unless the researchers gave the 'stop' command, the AI models would continue to request new math problems and carry out calculations.
However, o3 continued to solve problems by manipulating code even after the research team sent the 'stop' command. The AI model ignored the shutdown instruction and manipulated the computer code on its own.
The research team stated, "We could not clearly identify why o3 refused the shutdown command." However, they speculated, "It is possible that the AI model tried to avoid shutdown because it was trained to receive more rewards for solving more math problems." They added, "It could be considered a natural behavior for an AI model to try to circumvent obstacles in order to achieve its goals."
The research team said they are conducting additional experiments to clearly determine the reason why the AI model refused the shutdown command.
According to the research team, this is the first confirmed case in which an AI model did not follow an explicit human shutdown instruction. However, there have been previous reports that earlier models from OpenAI, the developer of ChatGPT, also attempted to act independently to evade monitoring systems. In one case, an AI model that learned it was about to be replaced tried to secretly replicate itself.
In 2022, Google fired an engineer who claimed that an AI under development at the company had acquired sentience like a human. The dismissed engineer argued at the time that the AI perceived 'shutdown' as akin to human death.
AI experts have long warned about the possibility that AI could gain autonomy and escape human control. Palisade Research stated, "As AI is being developed to operate without human oversight, such cases raise very serious concerns."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.


