KAIST Research Team, Google DeepMind, and Rutgers University Collaborate Internationally
Develop New AI Model That Understands Novel Concepts, Along with a Benchmark to Evaluate It
Artificial intelligence can now understand and imagine visual concepts it has never seen before, such as 'yellow grapes' or 'purple bananas.'
KAIST (President Kwang Hyung Lee) announced on the 30th that a research team led by Professor Seongjin Ahn of the School of Computing, in an international joint study with Google DeepMind and Rutgers University in the United States, has developed a new AI model that systematically combines visual knowledge to understand novel concepts, along with a benchmark to evaluate this ability.
(From left) Professor Seongjin Ahn, School of Computing, KAIST; Master's student Youngbin Kim, School of Computing, KAIST; PhD student Gautam Singh, Rutgers University; Master's student Junyoung Park, School of Computing, KAIST; DeepMind Senior Researcher Caglar Gulcehre (currently Professor at EPFL)
Humans have the ability to learn concepts like 'purple grapes' and 'yellow bananas,' separate them, and then recombine them to imagine concepts never seen before, such as 'yellow grapes' or 'purple bananas.' This ability is called systematic generalization or compositional generalization and is considered a key element in realizing general artificial intelligence.
The problem of systematic generalization has remained a major challenge in the field of deep learning for 35 years, ever since the renowned American cognitive scientists Jerry Fodor and Zenon Pylyshyn argued in 1988 that artificial neural networks could not solve it.
To fill this gap, the international joint research team led by Professor Seongjin Ahn developed a benchmark for studying the systematic generalization of visual information. Unlike language, visual information has no clear structure of 'words' or 'tokens,' which makes learning such a structure and achieving systematic generalization a significant challenge.
Professor Seongjin Ahn said, “Systematic generalization of visual information is an essential ability to achieve general artificial intelligence, and we expect this research to accelerate advancements in AI reasoning and imagination capabilities.”
Caglar Gulcehre, a lead researcher from DeepMind who participated in the study and is currently a professor at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland, stated, “Once systematic generalization becomes possible, it will be able to achieve higher performance with much less data than currently required.”
This research is scheduled to be presented at the 37th Conference on Neural Information Processing Systems (NeurIPS), held from December 10 to 16 in New Orleans, USA.
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
