AI Compression through Quantization... The Core of AI Devices in Your Hand is 'SW'

The Key Theme of CES 2024 is 'On-Device AI'
Small but Powerful AI and Lightweight Technology at the Core

‘On-device artificial intelligence (AI)’ is sweeping the globe. The idea is to equip devices, from smartphones to cars, with AI so that each device can collect and process information on its own. Achieving this requires technology to make AI models lighter, and that is the role of software (SW). Whereas last year AI models competed mainly on raw performance with ever-larger 'brains,' this year the dominant trend is embedding AI in devices ranging from laptops to vacuum cleaners, refrigerators, and smart cars.


On-device AI runs AI on the device itself, without going through a server or the cloud. From the user's perspective, personal information never leaves the device, an advantage for security, and that same local data enables personalized services. From the provider's perspective, AI services can be offered without incurring server operating costs.


If high-performance semiconductors are the hardware (HW) side of on-device AI, software is what sheds the weight. To run AI on a smartphone with limited compute and space, the model itself must be small, or a large model must be made lighter.


Making the model small from the start yields a small large language model (sLLM). While LLMs deliver high performance across general-purpose tasks with a large "brain," sLLMs are comparatively compact AI models, characterized by strong performance in specific domains and cost efficiency. LLMs typically have hundreds of billions of parameters (the values a model learns and uses to store information), while sLLMs have from a few billion to tens of billions.
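The practical difference these parameter counts make can be seen with simple arithmetic: weight storage is roughly parameter count times bytes per parameter. The sketch below uses common storage precisions (fp32 = 4 bytes, fp16 = 2 bytes, int8 = 1 byte); the model sizes are illustrative round numbers, not measurements of any particular product.

```python
# Rough memory-footprint arithmetic for LLM vs. sLLM parameter counts.
# Byte-per-parameter figures assume common precisions:
# fp32 = 4 bytes, fp16 = 2 bytes, int8 = 1 byte.

def model_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# A 10.7-billion-parameter sLLM stored in fp16:
sllm_fp16 = model_size_gb(10.7e9, 2)   # ~21.4 GB
# The same model quantized to int8 halves the footprint again:
sllm_int8 = model_size_gb(10.7e9, 1)   # ~10.7 GB
# An illustrative 1-trillion-parameter LLM in fp16:
llm_fp16 = model_size_gb(1e12, 2)      # ~2000 GB

print(f"10.7B sLLM, fp16: {sllm_fp16:.1f} GB")
print(f"10.7B sLLM, int8: {sllm_int8:.1f} GB")
print(f"1T LLM,    fp16: {llm_fp16:.0f} GB")
```

The gap explains why a hundreds-of-billions-parameter model cannot simply be copied onto a phone: even before any computation, its weights alone exceed the device's entire storage, while a quantized sLLM fits in the memory budget of high-end consumer hardware.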


To achieve good performance with a small model, optimization technology is necessary. AI startup Upstage found the performance-optimal ratio by splitting and merging smaller models when building its own LLM, 'Solar.' As a result, with only 10.7 billion parameters, Solar ranked first on the Hugging Face leaderboard, a global competition for open-source AI models, delivering high performance at roughly 1/100th the size of OpenAI's GPT-4, which is reported to have about 1 trillion parameters. AI technology company Conan Technology took a different route: instead of shrinking the model, it increased the volume of training data and trained only on high-quality data. Its own model, 'ConanLLM,' was trained on 270 times more Korean data than Meta's 'LLaMA 2.'


Lightweighting technology that makes large models smaller is also drawing attention. AI startup SqueezeBits compresses AI models through quantization: values computed in 32-bit precision are represented in smaller units, such as 8 bits, so calculations run faster while performance stays nearly the same. AI model optimization company Nota lightens models by reducing their computational load, skipping calculations that have relatively little impact on the output. There is also automated machine learning (AutoML) technology that lets AI automatically find efficient models for a given task.


Kim Hyung-jun, CEO of SqueezeBits, said, "How well models are lightweighted and optimized to fit various hardware is the competitiveness of on-device AI," adding, "Such technology is essential for AI to be used in more fields."


© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
