AI Model 'HUSH' Developed to Simultaneously Extract Spatial Structure and Object Information from Panoramic Images
Utilized in AR, MR, Digital Twin Applications... Accepted at Leading Computer Vision Conference CVPR 2025
An artificial intelligence (AI) technology capable of understanding the three-dimensional information of indoor spaces and objects from a single 360-degree panoramic photo has been developed.
This technology is expected to be utilized in fields where an accurate understanding of spatial information is essential, such as augmented reality (AR), mixed reality (MR), and digital twins.
On July 1, a research team led by Professor Kyungdon Joo at the UNIST Graduate School of Artificial Intelligence announced the development of an AI model called 'HUSH (Holistic Panoramic 3D Scene Understanding using Spherical Harmonics)', which can simultaneously extract the spatial structure and three-dimensional information of internal objects from a 360-degree panoramic image.
The research team: Professor Kyungdon Joo (left) and researcher Jongsung Lee. Provided by UNIST
In AR or MR technologies, AI must be able to accurately understand and represent information such as the positions of walls or furniture and the distances between objects in order to combine real spaces with digital content. Previously, this required multiple photos taken from different angles or expensive equipment such as depth sensors.
The HUSH model developed by the research team can extract such information using only a single 360-degree panoramic image. While panoramic images capture a much wider area in a single shot than standard photos, their spherical distortion makes them difficult for AI to analyze accurately. Existing workarounds reduce the distortion by cropping the panorama into smaller patches and repeatedly applying standard AI models to each one, but this can lead to information loss and inefficient, repeated computation.
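To make the distortion concrete, here is a minimal NumPy sketch, illustrative only and not taken from the HUSH code, that maps each pixel of an equirectangular panorama to a direction on the unit sphere. It shows that pixels near the poles cover a far smaller patch of the sphere than pixels near the equator, which is the stretching a standard image model is not built to handle; the function name and image resolution are arbitrary choices for the example.

```python
import numpy as np

def equirect_to_sphere(h, w):
    """Map each pixel of an h x w equirectangular panorama to a unit-sphere
    direction (illustrative sketch, not the HUSH implementation)."""
    # Pixel centers -> longitude in [-pi, pi), latitude in [-pi/2, pi/2]
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit direction vectors on the sphere
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)
    return np.stack([x, y, z], axis=-1), lat

dirs, lat = equirect_to_sphere(512, 1024)
# Per-pixel solid angle shrinks like cos(latitude): rows near the poles cover
# far less of the sphere than rows near the equator, even though every row
# has the same number of pixels in the flat image.
solid_angle = np.cos(lat) * (2 * np.pi / 1024) * (np.pi / 512)
print(solid_angle[256, 0] / solid_angle[5, 0])  # equator vs. near-pole ratio
```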
To solve these problems, the research team utilized a mathematical representation called 'Spherical Harmonics (SH)', which naturally reflects the spherical geometry of panoramic images. The method decomposes a signal defined on the sphere into frequency components: broad, flat regions such as ceilings or floors are captured by low-frequency components, while detailed, complex structures such as the outlines of furniture or objects are captured by high-frequency components, improving accuracy.
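As a rough, self-contained illustration of this idea (the exact formulation is in the CVPR 2025 paper and is not reproduced here), the sketch below fits real spherical-harmonic basis functions, evaluated with scipy.special.sph_harm, to a toy signal sampled on an equirectangular grid. The toy signal, the maximum degree of 6, and the least-squares fitting are assumptions made purely for demonstration.

```python
import numpy as np
from scipy.special import sph_harm

# Sample the sphere on an equirectangular grid.
H, W = 64, 128
theta = (np.arange(W) + 0.5) / W * 2 * np.pi   # azimuth in [0, 2*pi)
phi = (np.arange(H) + 0.5) / H * np.pi          # polar angle in [0, pi]
theta, phi = np.meshgrid(theta, phi)

def sh_basis(l_max, theta, phi):
    """Stack real spherical-harmonic basis functions up to degree l_max."""
    basis = []
    for l in range(l_max + 1):
        for m in range(-l, l + 1):
            y = sph_harm(m, l, theta, phi)
            basis.append(np.real(y) if m >= 0 else np.imag(y))
    return np.stack(basis, axis=-1)             # shape (H, W, (l_max+1)**2)

# Toy spherical signal: a smooth ceiling-to-floor gradient plus a sharp bump,
# standing in for coarse room structure plus fine object detail.
signal = np.cos(phi) + 0.3 * np.exp(
    -((theta - np.pi) ** 2 + (phi - np.pi / 2) ** 2) / 0.02)

# Least-squares fit of SH coefficients, weighted by sin(phi) so that the
# equirectangular oversampling near the poles does not bias the fit.
B = sh_basis(6, theta, phi).reshape(-1, 49)     # 49 = (6 + 1)**2 functions
w = np.sin(phi).reshape(-1)
coef, *_ = np.linalg.lstsq(B * w[:, None], signal.reshape(-1) * w, rcond=None)
recon = (B @ coef).reshape(H, W)
print("reconstruction RMSE:", np.sqrt(np.mean((recon - signal) ** 2)))
```

Truncating the basis at a low maximum degree keeps only the smooth, large-scale structure, while raising the degree recovers the sharp bump as well, which is the low- versus high-frequency split described above.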
Researcher Jongsung Lee, the first author, explained, "Spherical harmonics is a technology originally used in virtual view generation to represent the color of objects or scenes, but we applied it for the first time to panoramic image-based spatial reconstruction, inspired by its ability to effectively analyze data on a sphere."
The HUSH model achieved higher accuracy than existing 3D scene reconstruction models in depth prediction and other tasks, and it also proved more computationally efficient, since it predicts multiple kinds of spatial information simultaneously from a single image.
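The shared-computation idea behind that efficiency can be pictured as one backbone feeding several prediction heads, so depth, surface normals, and layout cues all come from a single forward pass. The PyTorch module below is a purely hypothetical sketch of such a multi-task design; its layers, channel counts, and head names are invented for illustration and are not the HUSH architecture.

```python
import torch
import torch.nn as nn

class MultiTaskPanoHead(nn.Module):
    """Hypothetical multi-task design: one shared encoder feeds separate
    heads so several kinds of spatial information are predicted at once."""
    def __init__(self, feat_ch=64):
        super().__init__()
        self.encoder = nn.Sequential(                  # stand-in backbone
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
        )
        self.depth_head = nn.Conv2d(feat_ch, 1, 1)     # per-pixel depth
        self.normal_head = nn.Conv2d(feat_ch, 3, 1)    # per-pixel normals
        self.layout_head = nn.Conv2d(feat_ch, 1, 1)    # layout boundary map

    def forward(self, pano):
        feat = self.encoder(pano)                      # shared computation
        return {
            "depth": self.depth_head(feat),
            "normal": torch.nn.functional.normalize(self.normal_head(feat), dim=1),
            "layout": self.layout_head(feat),
        }

# One panorama in, several spatial predictions out of a single forward pass.
outputs = MultiTaskPanoHead()(torch.rand(1, 3, 256, 512))
print({k: tuple(v.shape) for k, v in outputs.items()})
```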
Professor Kyungdon Joo stated, "This technology can be widely applied in real life, such as accurately recognizing indoor spaces around users for AR and MR, or generating immersive media that users can interact with from a single image."
An artificial intelligence model capable of inferring spatial information such as depth and surface normals from a single panoramic image.
This research was accepted at CVPR 2025 (Conference on Computer Vision and Pattern Recognition), a prestigious conference in the field of computer vision. CVPR 2025 was held in Nashville, United States, for five days starting from June 11.
© The Asia Business Daily (www.asiae.co.kr). All rights reserved.

