An Affordable AI Training Alternative for Corporate Labs Without High-End Data Center GPUs
PC bangs, the internet cafés found throughout South Korea, are frequented mostly by gamers. Because high-performance computers are expensive, many people prefer playing online games on a PC bang's high-end PCs and premium monitors rather than at home.
For PC bang users, the most important component is the graphics processing unit (GPU): the latest GPU is essential to experience a game's graphics in full. For this reason, many PC bangs compete by hanging banners advertising that the newest GPUs are installed.
However, the situation has changed with the soaring popularity of mobile games on smartphones. Empty seats in PC bangs are multiplying, and as operating costs rise, more of them are closing down. That leaves GPUs costing from hundreds of thousands of won to over a million won sitting unused.
Amid a shortage of the ultra-expensive GPUs used for artificial intelligence (AI) training, a South Korean research team has developed a technology that makes it possible to train AI models on these idle, low-cost consumer GPUs, and has released it for free.
The technology, developed by Professor Dongsu Han's research team in the School of Electrical Engineering at KAIST, accelerates AI model training by tens to hundreds of times using ordinary consumer GPUs, even in distributed environments with limited network bandwidth. That makes it highly relevant to the current situation of PC bangs in Korea.
Because AI models can be trained without expensive data center-grade GPUs or high-speed networks, the technology is expected to help companies and researchers who must do AI work with limited resources. Another benefit is that it does not require the large-scale power supply that GPU operation in a data center demands.
Professor Dongsu Han of KAIST (left) and a schematic diagram of the StellaTrain technology developed by his team (right).
In a way, this result recalls the starting point of NVIDIA's own AI story. NVIDIA noticed Ian Buck (now a vice president), then a graduate student chaining GPUs together for more performance, and recruited him, which eventually led to the birth of the CUDA ecosystem. AlexNet, the program often cited as the starting point of NVIDIA's AI revolution, was likewise trained on a pair of consumer GPUs.
Until now, training AI models has required expensive infrastructure: high-performance server GPUs such as NVIDIA's ‘H100’ and ‘A100,’ which cost tens of millions of won each, plus 400Gbps high-speed networks to connect them.
For most companies and researchers, apart from a few large IT corporations, such infrastructure is out of reach on cost grounds. Even those with the funds to buy NVIDIA GPUs struggle to obtain them on time and in the quantities required. One industry insider said, "NVIDIA's DGX systems are impossible to obtain." As a result, businesses that rent out GPUs in a cloud model have recently emerged.
PC bangs hold a large number of NVIDIA gaming GPUs (the RTX series), and putting them to work could enable AI training at low cost. The problem is that with gaming GPUs, AI model training is significantly slower than with high-performance server GPUs, owing to limited GPU memory and network bandwidth. Even so, many believe that RTX-class GPUs are sufficient for AI purposes.
The StellaTrain technology developed by Professor Han's research team tackles these problems on two fronts: it uses CPUs and GPUs in parallel to raise training speed, and it applies algorithms that compress and transmit data efficiently according to the available network speed. Together, these enable fast training across multiple low-cost GPUs even without a high-speed network.
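The article gives no implementation details, but the second idea, compressing gradient traffic to fit the available bandwidth, can be illustrated with a minimal sketch. The PyTorch snippet below is a hypothetical illustration, not StellaTrain's actual code: `topk_sparsify`, `choose_ratio`, and the time-budget heuristic are all assumptions made for the example.

```python
import torch

def topk_sparsify(grad: torch.Tensor, ratio: float):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Sending (indices, values) over a slow link is far cheaper than
    sending the dense tensor.
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    return indices, flat[indices]

def choose_ratio(bandwidth_mbps: float, grad_size_mb: float, budget_s: float = 0.1):
    """Pick a compression ratio so one gradient exchange fits a time budget.

    A purely illustrative heuristic; the article does not describe the
    real system's adaptive policy.
    """
    sendable_mb = bandwidth_mbps / 8 * budget_s  # MB transferable within the budget
    return min(1.0, sendable_mb / grad_size_mb)

# Example: a ~100 MB float32 gradient over a 1 Gbps link, 0.1 s budget.
grad = torch.randn(25_000_000)
ratio = choose_ratio(bandwidth_mbps=1000.0, grad_size_mb=100.0)
indices, values = topk_sparsify(grad, ratio)
print(f"sending {values.numel():,} of {grad.numel():,} entries (ratio={ratio:.3f})")
```

With these assumed numbers, the link carries about an eighth of the gradient entries per step; real systems in this family typically also add error feedback, accumulating the dropped entries locally so they are sent later rather than lost.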
The technology runs on consumer GPUs that cost one-tenth to one-twentieth as much as a high-performance H100. And instead of a dedicated high-speed interconnect such as NVIDIA's NVLink, it supports efficient distributed training over ordinary internet connections whose bandwidth is hundreds to thousands of times lower, which is another advantage.
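To put rough, assumed numbers on that gap (these figures are not from the article): NVLink on an H100-class system offers up to about 900GB/s of GPU-to-GPU bandwidth, while ordinary internet connections run at roughly 1 to 10Gbps, or about 0.125 to 1.25GB/s. That is a link on the order of 700 to 7,000 times slower, which is why aggressive compression is needed to keep the GPUs busy during distributed training.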
Professor Han explained, "Although this experiment was conducted with RTX-class GPUs, it can also be implemented with GPUs below that level." GPUs already owned by companies or schools can be put to use in the same way, opening a path to AI research for those with limited research funds.
According to the published results, StellaTrain achieves training speeds up to 104 times faster than conventional data-parallel training.
Professor Han said, "This research will greatly contribute to making large-scale AI model training accessible to everyone," and added, "We plan to continue developing technologies that enable large-scale AI model training even in low-cost environments."
Professor Han has made the research results publicly available on ‘GitHub.’ He said, "This research is open source for anyone to use. I hope researchers will utilize it to accelerate AI research."
Paid GPU-sharing services also exist. Software company Data Alliance (DA) has introduced, and is beta-testing, ‘gcube,’ a GPU sharing-economy service that pools idle GPUs in PC bangs; it is expected to be offered through Naver Cloud.
© The Asia Business Daily (www.asiae.co.kr). All rights reserved.
![[Reading Science] Implementing AI Training 104 Times Faster Using Idle PC Bang GPUs](https://cphoto.asiae.co.kr/listimglink/1/2024092314125135699_1727068370.jpg)

