Korean researchers have devised a new kind of artificial intelligence model that can independently form and test hypotheses, as humans and animals do. The achievement was made possible by uncovering how humans and animals form hypotheses, maintain consistent behavioral strategies, and then doubt and verify those hypotheses on their own to adapt to changing situations.
KAIST announced on the 27th that a joint research team led by Professors Sangwan Lee and Minhwan Jeong from the Department of Brain and Cognitive Sciences proposed a new reinforcement learning theory and elucidated its neuroscientific principles.
(From left) Professor Lee Sang-wan, Department of Brain and Cognitive Sciences, KAIST, and PhD candidate Yang Min-su. Courtesy of KAIST
According to the joint research team, the problem of striking an appropriate balance between behavioral consistency and flexibility in the current situation is known as the "stability-flexibility dilemma."
To resolve this dilemma, an agent must continuously check whether its current judgment is correct and revise it when it is not. Although many studies in neuroscience and artificial intelligence have addressed this problem, no complete solution had been found until now.
Noting that neither traditional reinforcement learning theories nor the latest AI algorithms adequately explain animal behavior, the joint research team devised a new method that dynamically profiles behavioral patterns based on hypotheses the agent forms on its own in order to predict and verify upcoming situations.
They also proposed a new adaptive reinforcement learning theory and model that asymmetrically updates behavioral strategies based on prediction errors of hypotheses animals form about their current situations.
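To make the idea concrete, the sketch below is a purely illustrative Python example, not the team's published model: an agent maintains a hypothesis about its current situation, measures how strongly each outcome contradicts that hypothesis, updates its behavioral strategy asymmetrically (larger steps for surprising outcomes), and abandons the hypothesis once accumulated doubt grows too large. All class names, parameters, and thresholds here are hypothetical.

```python
import numpy as np

class HypothesisRLAgent:
    """Illustrative sketch of hypothesis-based adaptive reinforcement
    learning with asymmetric updates (not the published KAIST model)."""

    def __init__(self, n_actions, lr_confirm=0.1, lr_surprise=0.6, doubt_threshold=2.0):
        self.q = np.zeros(n_actions)      # action values under the current hypothesis
        self.lr_confirm = lr_confirm      # small step when the hypothesis is confirmed
        self.lr_surprise = lr_surprise    # large step when the hypothesis is violated
        self.doubt_threshold = doubt_threshold
        self.doubt = 0.0                  # accumulated evidence against the hypothesis

    def act(self, epsilon=0.1):
        # epsilon-greedy choice under the current behavioral strategy
        if np.random.rand() < epsilon:
            return np.random.randint(len(self.q))
        return int(np.argmax(self.q))

    def update(self, action, reward):
        # hypothesis prediction error: how far the outcome is from expectation
        pred_error = reward - self.q[action]

        # asymmetric update: surprising outcomes drive larger changes
        lr = self.lr_surprise if abs(pred_error) > 0.5 else self.lr_confirm
        self.q[action] += lr * pred_error

        # accumulate doubt while outcomes keep contradicting the hypothesis
        self.doubt = 0.9 * self.doubt + abs(pred_error)
        if self.doubt > self.doubt_threshold:
            # hypothesis rejected: reset the strategy and start testing a new one
            self.q[:] = 0.0
            self.doubt = 0.0
```

In a task with occasional reward reversals, such as a two-armed bandit, an agent of this kind keeps its strategy stable while outcomes match the hypothesis but switches quickly once accumulated surprise crosses the doubt threshold, which is the flavor of stability-flexibility trade-off the article describes.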
Most state-of-the-art AI models focus on efficient problem-solving and often fail to properly explain human or animal behavior. In contrast, the proposed model was able to predict animal behavior in response to unexpected events with an average accuracy improvement of 15% (up to 31%) compared to the latest AI models.
These results were consistently reproduced in the analysis of four different animal experimental datasets previously published (two-step task, two-armed bandit task, T-maze task, two-armed bandit task with MSN inactivation).
During the research, the joint team also discovered that medium spiny neurons (MSNs) in the striatum of the basal ganglia (a brain region responsible for motor control and learning), which make up about 90% of the striatum and are inhibitory, are involved in this hypothesis-based adaptive reinforcement learning process.
The research results demonstrate that the brain’s method of contextual inference fundamentally differs from that of large-scale AI models.
For example, AI models such as ChatGPT and DeepSeek estimate contextual information from user input and, based on it, route the input to the required expert modules (DeepSeek uses reinforcement learning for this matching), assuming the inferred context is correct until new information arrives.
The brain, by contrast, doubts the context (hypothesis) it has inferred on its own and actively adopts a new context as soon as that doubt is confirmed. The joint research team emphasized that this points to a new direction for mitigating hallucination (the overconfidence shown by AI) and for building reasoning engines that work more like humans.
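The contrast can be illustrated with a minimal Python sketch (all function names and the tolerance value are made up for illustration): a "trusting" controller infers the context once and keeps routing to the same expert, while a "doubting" controller monitors how well the current context explains each new observation and re-infers it as soon as the mismatch grows.

```python
def trusting_controller(observations, infer_context, experts):
    """Infer the context once and keep routing to the same expert."""
    context = infer_context(observations[0])
    return [experts[context](obs) for obs in observations]

def doubting_controller(observations, infer_context, experts, fit, tol=0.3):
    """Re-infer the context whenever it no longer explains the data well."""
    context = infer_context(observations[0])
    outputs = []
    for obs in observations:
        if fit(context, obs) < tol:       # hypothesis doubted...
            context = infer_context(obs)  # ...and replaced immediately
        outputs.append(experts[context](obs))
    return outputs
```

The first strategy corresponds to the "assume the context is correct until told otherwise" behavior attributed to large models above; the second mirrors the brain-like loop of self-doubt and verification the researchers describe.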
The joint research team expects the results to find wide practical application through convergence research between brain science and artificial intelligence.
For instance, human dynamic behavioral profiling technology can analyze an individual's ability to form and test hypotheses, which can be applied directly to designing customized educational curricula, personnel and workforce management systems, and human-computer interaction.
Moreover, the proposed adaptive reinforcement learning model, as a "brain-like thinking AI" technology, is expected to be utilized in solving the value alignment problem between humans and AI.
Professor Sangwan Lee stated, "This research is a case that revealed the brain’s hypothesis-based adaptive learning principles, which were difficult to explain solely by existing AI reinforcement learning theories," and added, "Incorporating brain science theories of self-doubt and verification into the design and learning processes of large-scale AI systems could enhance their reliability."
Meanwhile, this research was conducted with support from the Ministry of Science and ICT's Institute of Information & Communications Technology Planning & Evaluation (IITP) SW Star Lab, the Frontier Challenge R&D project, the National Research Foundation of Korea's mid-career researcher program, and the KAIST Kim Jaechul Graduate School of AI project.

