B2N Completes Quality Inspection of AI Training Data for 'Video Summary Data by Speech Recognition'

AI specialist company B2N (CEO Taeil An) announced on the 25th that it has completed quality inspection for the consortium in the ‘2023 AI Training Data Construction Project’ led by the Ministry of Science and ICT and promoted by the National Information Society Agency (NIA).


B2N collaborated in a consortium with companies such as PCN, Saltlux, and Timbel on tasks including ‘Video Summary Data by Speech Recognition,’ ‘Comic Webtoon Data,’ and ‘Building Crack Detection Images (Advanced)’ as part of the ‘2023 AI Training Data Construction Project.’


In particular, B2N served as the dedicated AI training data quality management company and quality management service provider for the three consortia. It handled overall quality management within each consortium, including establishing and executing quality management plans, inspecting quality management activities at each stage, and providing dedicated support for TTA quality verification.


Additionally, using its AI training data quality management solution ‘SDQ for AI,’ B2N secured high-quality AI training data from the initial construction stage. It ran syntactic accuracy checks on the data structure, input value ranges, and data formats specified in the NIA’s ‘AI Training Data Quality Management Guidelines,’ as well as statistical diversity tests that measure class and instance distribution, sentence length, and vocabulary count.


The AI training data that B2N inspected in this project totaled four types and 660,000 cases, including 630,000 images, 30,000 sub-labelings (super-large AI corpus, image captions), and 3,000 hours of voice data. By supporting quality management of training data across diverse fields such as Korean language, disaster safety environment, and cultural tourism, B2N once again demonstrated its technological capability in AI training data quality management.


Furthermore, to support the creation of a super-large AI ecosystem, B2N successfully conducted quality inspections on a large volume of high-quality corpus data, totaling 1.86 million sentences and 17.44 million tokens (word units), that can be used for language models.


Park Soonhyuk, head of B2N’s AIX Group, stated, “We participated in the AI training data construction project for four consecutive years, from 2020 to 2023, and gained recognition for the technological capability and stability of our quality verification services and the ‘SDQ for AI’ solution.”


He added, “In the 2024 super-large AI diffusion ecosystem creation project, we plan to take part in various roles, such as a participating company, a dedicated quality management service provider, and a third-party quality verification service provider (per inspection case). This year, we will expand quality management for large-scale corpus data to cover duplication, content similarity, and harmfulness, and in addition to the syntactic accuracy and statistical diversity tests we have provided so far, we also plan to support semantic accuracy tests.”


Meanwhile, B2N’s ‘SDQ for AI’ is the AI training data quality management solution with the most references in Korea, providing syntactic accuracy and statistical diversity tests. Through ‘Laflow,’ an integrated AI training data platform launched last year, it also supports semantic accuracy tests.


© The Asia Business Daily(www.asiae.co.kr). All rights reserved.

