Path Opened for AI Training Even Without NVIDIA GPUs

Developed by KAIST Professor Dongsoo Han's Team
Fast AI Training Possible on Regular PCs
Up to 104 Times Faster Performance Compared to Existing Data Parallel Training
Expected to Accelerate AI Research and Development in Academia and SMEs

A South Korean research team has developed a technology that enables efficient training of artificial intelligence (AI) models without expensive data center-grade graphics processing units (GPUs) or high-speed networks.

Professor Dongsoo Han, Department of Electrical Engineering, KAIST

KAIST (President Kwang Hyung Lee) announced on the 19th that a research team led by Professor Dongsoo Han from the Department of Electrical Engineering has developed a technology that can accelerate AI model training by tens to hundreds of times even in distributed environments with limited network bandwidth, using consumer-grade GPUs.


Training AI models typically requires high-performance server GPUs such as NVIDIA H100 or A100, which cost tens of millions of won each, or expensive infrastructure with 400Gbps high-speed networks to connect them. While capital-rich big tech companies purchase tens of thousands of GPUs to train AI, most companies and researchers have found it difficult to adopt such costly infrastructure due to financial constraints.


To address this issue, Professor Dongsoo Han's research team developed a distributed training framework called "StellaTrain." The technology harnesses multiple consumer-grade GPUs of the kind found in high-end ordinary PCs, enabling efficient distributed training even over general internet connections whose bandwidth is hundreds to thousands of times lower than that of high-speed dedicated networks. It addresses the problem that, without data center-grade GPUs and networks, large-scale AI model training can slow down by a factor of hundreds. The research team explained that StellaTrain can achieve up to 104 times the performance of conventional data-parallel training.
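To give a sense of the general idea, the sketch below is a minimal, illustrative example (not StellaTrain's actual code) of bandwidth-frugal data-parallel training: each worker sends only the largest-magnitude entries of its gradient, so far less traffic has to cross a slow consumer internet link before the averaged update is applied. Gradient sparsification is one standard way to cut communication in such settings; the function names (topk_sparsify, aggregate), the 1% sparsity ratio, and the two-worker simulation are assumptions made purely for illustration, and the details of how StellaTrain itself reduces communication are described in the team's SIGCOMM 2024 paper.

# Illustrative sketch only; not StellaTrain's implementation.
import torch

def topk_sparsify(grad: torch.Tensor, ratio: float = 0.01) -> torch.Tensor:
    """Keep only the largest-magnitude `ratio` fraction of gradient entries,
    shrinking what a worker must send over a slow network link."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    sparse = torch.zeros_like(flat)
    sparse[indices] = flat[indices]
    return sparse.view_as(grad)

def aggregate(worker_grads):
    """Average the sparsified gradients from all workers; this stands in for
    the all-reduce step of data-parallel training."""
    return torch.stack(worker_grads).mean(dim=0)

# Two simulated workers compute gradients on their own data shards.
model_weight = torch.randn(1000)
worker_grads = []
for shard_seed in (0, 1):
    torch.manual_seed(shard_seed)
    local_grad = torch.randn_like(model_weight)           # stand-in for backprop
    worker_grads.append(topk_sparsify(local_grad, 0.01))  # send ~1% of values

model_weight -= 0.01 * aggregate(worker_grads)            # SGD step with averaged update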


Professor Dongsoo Han said, "This research will make large-scale AI model training easily accessible to everyone," adding, "We plan to continue developing technologies that enable large-scale AI model training even in low-cost environments." The team has also released StellaTrain as open source on the developer platform GitHub so that anyone can use it.


This research was conducted jointly with Dr. Hwijoon Lim and PhD student Jooncheol Ye from KAIST, and Professor Sangeetha Abdu Jyothi from UC Irvine. The research results were presented at ACM SIGCOMM 2024 held in Sydney, Australia, last August.


© The Asia Business Daily (www.asiae.co.kr). All rights reserved.

