⑩ AI's 'Input Data Risk'
Simple Graffiti on a 'STOP' Traffic Sign
Can Cause Image Recognition Errors That Send the Car Straight Through
If Input Data Is Wrong, Output Becomes Useless
Autonomous vehicles drive by recognizing surrounding objects and environments, using a technology called 'computer vision' that enables computers to visually perceive their surroundings much as the human eye does. Getty Images Bank
In the AI field, data can be broadly divided into training data, input data, and feedback data. Last week, we looked at training data. This time, the topic is the second type: input data.
After machine learning is complete, the AI model still needs information fed into it to operate and produce results, much as a trigger must be pulled for a bullet to fire. But like training data, input data can lead to serious errors, even fatal consequences, if mishandled.
The Danger of a Single Sticker: Autonomous Vehicles
The image recognition system identified the sign above as a 45 km/h speed limit sign with 73.3% probability. Photo: University of Washington research materials
To put a driver in danger, you don't need a huge weapon or tool. A single 'sticker' can be enough.
In 2017, researchers at the University of Washington in the United States announced results showing that attaching stickers to traffic signs could cause autonomous vehicles to malfunction. The research team attached stickers to road traffic signs with the purpose of disrupting the image recognition function of autonomous vehicles.
Simply attaching a 'LOVE' sticker to a 'STOP' sign was enough. In every test, the autonomous vehicle's image recognition algorithm identified the sign not as a stop sign but as a 'speed limit' sign.
A similar experiment was conducted on right-turn signs, with much the same result: in more than half of the trials, the system recognized the right-turn sign as a stop sign, which would bring the vehicle to a halt in the middle of the road.
Simple manipulation of a right-turn sign could also cause autonomous vehicles to malfunction. Photo: University of Washington research materials
Other similar experiments were conducted as well, including cases that could send a vehicle in the wrong direction. By corrupting the input data received through the camera sensors, the 'eyes' of the autonomous vehicle, the attackers led the trained system to completely different judgments. Whereas earlier attacks exploited vulnerabilities in wired or wireless networks or in devices, this method is different: it exploits vulnerabilities inherent in the machine learning algorithm itself.
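The mechanics of such an attack can be illustrated with a toy example. The sketch below uses hypothetical weights and inputs, not the University of Washington researchers' actual attack, to show how a small, deliberately chosen change to the input of a simple linear classifier flips its decision:

```python
# Toy illustration: a tiny, targeted change to the input data flips a
# trained linear classifier's decision. Weights and inputs are
# hypothetical; this is not the actual traffic-sign attack.
weights = [0.9, -0.5, 0.3]   # "learned" weights of a toy model
bias = 0.1

def score(x):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(x):
    return "STOP" if score(x) >= 0 else "SPEED LIMIT"

x = [1.0, 0.2, 0.5]          # clean input: classified as STOP
eps = 0.7                    # strength of the "sticker" perturbation
# Nudge each feature in the direction that lowers the score fastest
# (opposite the sign of the corresponding weight).
x_adv = [xi - eps * (1 if w > 0 else -1) for xi, w in zip(x, weights)]

print(predict(x))      # STOP
print(predict(x_adv))  # SPEED LIMIT
```

Real attacks follow the same principle: the perturbation is computed in the direction that most efficiently pushes the model's score across its decision boundary, which is why a few well-placed stickers can suffice.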
A Threat Diabetes Patients Never Notice: Insulin Pumps
Diabetes patients need regular doses of insulin. In the past they injected it themselves, but these days many manage their condition conveniently through medical assistive devices called insulin pumps. Small and portable, the pumps fit in a pocket or bag. A pump connects to a thin tube under the skin and supplies insulin automatically while monitoring the patient's blood sugar and health status in real time. It is not only far more convenient than multiple daily injections but also excellent for blood sugar control.
Insulin pumps have learned how to calculate and supply the appropriate amount of insulin for the patient's condition. The patient's blood sugar and health status, checked in real time, are therefore the input data: when the readings reach a certain condition, insulin is supplied automatically according to what the system has learned.
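The pump's decision loop can be sketched roughly as follows. This is an illustrative toy, not a clinical dosing algorithm; the thresholds and rates are invented for the example, and the point is simply that the real-time reading (the input data) directly determines the output:

```python
# Hypothetical sketch of an insulin pump's decision loop: the
# real-time glucose reading is the input data, and a learned rule
# turns it into a dose. Thresholds and rates are invented.
def recommend_dose(glucose_mg_dl):
    """Return an insulin rate (units/hour) for a glucose reading."""
    if glucose_mg_dl < 70:
        return 0.0                             # low sugar: withhold insulin
    if glucose_mg_dl <= 180:
        return 1.0                             # in range: basal rate only
    return 1.0 + (glucose_mg_dl - 180) * 0.01  # high: add a correction

print(recommend_dose(60))    # 0.0
print(recommend_dose(120))   # 1.0
print(recommend_dose(280))   # 2.0
```

Seen this way, the danger is obvious: whoever controls the glucose reading controls the dose.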
Diabetes is a disease characterized by insufficient insulin secretion or impaired insulin function. Treatment may require the artificial administration of insulin into the body. Getty Images
For all that convenience, contamination of the input data can be fatal. In 2019, researchers at the global security company McAfee discovered and disclosed serious security vulnerabilities in insulin pumps.
The patient's blood sugar and health data are transmitted via Bluetooth, and the problem was that proper encryption was not applied to this link. That meant a malicious hacker could manipulate the amount of insulin the pump delivers. If the patient's blood sugar and health readings are tampered with, the pump supplies either too much or too little insulin. The result is literally fatal. According to the researchers, the pumps could be manipulated from about 90 meters away.
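One standard defense is to authenticate each transmission so that tampered readings are rejected before they reach the dosing logic. The sketch below shows the idea using an HMAC with a pre-shared key; the key and message format are hypothetical, not McAfee's findings or any vendor's actual protocol:

```python
import hashlib
import hmac

# Sketch: authenticate each reading with an HMAC over a pre-shared
# key, so a message altered in transit fails verification. The key
# and message format are hypothetical.
KEY = b"shared-device-key"

def sign(payload: bytes) -> bytes:
    return hmac.new(KEY, payload, hashlib.sha256).digest()

def verify(payload: bytes, tag: bytes) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign(payload), tag)

reading = b"glucose=250"
tag = sign(reading)

print(verify(reading, tag))        # True: genuine reading accepted
print(verify(b"glucose=50", tag))  # False: tampered reading rejected
```

Without such a check, any radio message in the right format is trusted, which is exactly the gap the researchers exposed.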
These cases show how important input data management is in AI systems. Input data is the information the algorithm uses to make real-time decisions. If this data is corrupted or manipulated, the AI's output and results become unreliable. As in the cases of diabetes patients and drivers, even human lives can be at risk.
To prevent risks and failures caused by bad input data, a robust data verification system is necessary. Anomaly detection logic can also be embedded to automatically flag outliers and fill in missing values. For example, a system can be built that treats a reported body temperature above 40 degrees Celsius as a possible sensor error and raises a separate alert.
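Such a check can be as simple as a range test applied before a reading ever reaches the model. A minimal sketch, using the body-temperature example above (the plausible range is an assumption for illustration):

```python
# Minimal input-validation sketch for the body-temperature example.
# The plausible range (30-40 degrees C) is an assumption chosen for
# illustration, not a clinical standard.
def validate_temperature(temp_c, low=30.0, high=40.0):
    """Classify a reading before it is fed to the model."""
    if temp_c is None:
        return ("missing", None)                 # needs imputation
    if temp_c < low or temp_c > high:
        return ("possible sensor error", None)   # raise a separate alert
    return ("ok", temp_c)

print(validate_temperature(36.5))   # ('ok', 36.5)
print(validate_temperature(41.2))   # ('possible sensor error', None)
```

The key design point is that validation runs first: a suspicious value triggers an alert instead of silently flowing into the AI's decision.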
No AI Advancement Without Access to Input Data
The completed algorithm requires 'input data' to produce output data. In that sense, input data acts as a kind of trigger. Getty Images Bank
The risks of input data manipulation and hacking have diminished somewhat with technological advances. But one keyword must not be left out of any discussion of input data: 'accessibility.'
To operate algorithms effectively, securing good input data is also important. In certain medical fields, AI-based diagnostic services have already been developed. Among them, some have very high disease diagnosis accuracy and can recommend appropriate treatments. However, 'accessibility to input data' is hindering the success of these services.
For example, if collecting a patient's personal diabetes information is prohibited due to 'privacy violations,' insulin pump devices cannot function properly. A 2019 MIT study pointed out that "low accessibility to patient data can significantly hinder the development of medical AI applications." It emphasized that without sufficient input data, AI cannot provide reliable predictions or meaningful results.
Of course, this should not lead to claims like "privacy protection is unnecessary." Privacy protection is as important as, or even more important than, AI development. Finding a balance between privacy protection and data accessibility is another challenge.
Next Series Preview
⑪ Why Bing Can't Surpass Google (December 28)
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
