⑩ AI's 'Feedback Data Risk'
Retraining and Correction Through Repeated Input and Output
When Malicious Use Repeats, It Becomes a Dangerous AI
For Apple iPhone users, the face is the key. The front-facing camera at the top of the smartphone is equipped with facial recognition security technology called ‘Face ID,’ making every process, from unlocking the phone to payments and identity verification, very convenient.
Of course, to use it for the first time, you must register your face. You turn your face this way and that so the infrared camera can capture it from every angle. Endure this brief inconvenience once, and everything afterward is very convenient. The camera measures features such as the surface pattern of the face and the distance between the eyes.
Because it combines 3D facial analysis with Apple’s own neural engine, the recognition rate is very high. And since it uses infrared, it does not require bright light and works well even in dark places. The saying “The killer content of the iPhone is Face ID” is not much of an exaggeration.
The Power of Face ID Comes from ‘Feedback Data’
But if you think about it for a moment, a user’s face is not the same every moment of every day.
You might wear glasses or change your frames. You might grow a beard or change your hairstyle. You might put on a mask or change your makeup. Do you have to re-register your face every time? No. Face ID still works well. The phrase ‘Face ID is killer content’ is not there for nothing.
Face ID detects the slight changes in the user’s face whenever they occur. In particular, occasional ‘recognition failures’ are the most valuable data: when recognition fails and the user unlocks the phone with a passcode instead, the system receives confirmation that the face in front of it belongs to the same person as before.
The built-in algorithm then associates the new appearance with the previously registered face and relearns. Dozens or hundreds of recognition successes and failures become feedback data, and the more of it accumulates, the higher Face ID’s authentication success rate becomes.
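The loop described above can be sketched in a few lines of code. This is a minimal illustration, not Apple’s actual implementation: the embedding vectors, the similarity threshold, and the update rule are all hypothetical stand-ins for the real 3D analysis.

```python
import numpy as np

THRESHOLD = 0.8  # hypothetical similarity cutoff for declaring a match


def cosine_similarity(a, b):
    """Similarity between two face-embedding vectors (1.0 = identical)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


class FaceTemplate:
    """Stores a face embedding and nudges it toward confirmed new captures."""

    def __init__(self, enrolled_embedding, learning_rate=0.1):
        self.template = np.asarray(enrolled_embedding, dtype=float)
        self.lr = learning_rate

    def authenticate(self, capture):
        return cosine_similarity(self.template, capture) >= THRESHOLD

    def feedback(self, capture, confirmed_by_passcode):
        # A failed match followed by a successful passcode entry is the
        # most valuable signal: the stored template moves slightly toward
        # the new look, so it is absorbed without re-enrollment.
        if confirmed_by_passcode:
            new = np.asarray(capture, dtype=float)
            self.template = (1 - self.lr) * self.template + self.lr * new
```

Each passcode-confirmed capture shifts the stored template a little, so glasses, beards, or masks are gradually learned rather than forcing the user to re-register.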
Why Tesla’s Autonomous Driving Accident Rate Is Decreasing
AI model development is not complete just because the model has finished learning from a vast amount of training data. It must keep learning from feedback data even after launch. Training data (the user’s initially registered face) is used for learning; input data (the current face) is entered to produce an output (authentication success or failure); and that output is absorbed each time to improve accuracy.
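The train-once-then-keep-absorbing cycle just described can be reduced to a toy model. Everything here is illustrative; a running average stands in for a real learning algorithm.

```python
class FeedbackTrainedModel:
    """Toy model: makes a prediction, then corrects itself from each
    observed outcome, mimicking the train-then-keep-learning cycle."""

    def __init__(self, initial_estimate: float):
        # 'Training data' fixes the model's initial state at launch.
        self.estimate = initial_estimate
        self.n = 1

    def predict(self) -> float:
        return self.estimate

    def absorb(self, observed: float) -> None:
        # Each new observation (feedback data) updates the model,
        # here as an incrementally maintained running average.
        self.n += 1
        self.estimate += (observed - self.estimate) / self.n
```

A model frozen at `initial_estimate` would drift away from reality as conditions change; absorbing each observation keeps the estimate current.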
Situations and environments constantly change. Therefore, to maintain the accuracy of AI models, feedback data must be continuously supplied to update the model. This is the same for navigation apps or shopping apps’ product recommendations that we encounter daily.
When new roads open or existing roads become congested, new data arises that must be absorbed immediately. If the app keeps recommending routes optimized on old data, users will stop using that navigation app.
Tesla’s autonomous driving feature likewise receives feedback from data collected by the eight cameras mounted on each vehicle. On top of the originally designed algorithm, it absorbs new information and makes better decisions. Tesla’s vehicle accident rate has been decreasing every year, which would be impossible if it drove only on its initially learned data.
The same goes for shopping apps that recommend products perfectly suited to your needs. The age, gender, region, and purchase history entered at initial registration are insufficient. Data on changed shopping patterns, frequency, and time of day are needed. Only then can recommendations be tailored precisely to consumers.
"All feminists go to hell!": A Chatbot Turned Discriminator and Hater
The importance of feedback data is also clearly shown in failure cases.
In 2016, Microsoft (MS) launched a chatbot service called Tay. Probably not many people used it, because the service was shut down just 16 hours after launch.
Tay was a chatbot service that freely conversed with users and was available on Twitter and messaging service Kik. It analyzed text data generated from natural conversations with users and responded appropriately. In other words, it used conversations with people as feedback data. The more it talked, the more data was collected, making conversations more natural.
However, Tay soon found itself in a dangerous situation. Shortly after the service went public, word spread on anonymous online forums frequented by white supremacists, misogynists, and anti-Muslim groups, where users proposed “training Tay to make discriminatory remarks.” Tay, which did not discriminate among conversation partners, soon began interacting with these groups, who repeatedly fed it discriminatory and hateful remarks and urged it to repeat them.
Within just a few hours, Tay became a racist, sexist, and political extremist. This was a stark demonstration of the limitations of a system designed to learn and imitate from conversation content. Eventually, MS had to stop Tay’s operation and issue a public apology.
Tay’s case was a reminder of the dangers of unfiltered feedback data. The American IT media The Verge stated, “(Tay) was built using modeled, cleaned, and filtered public data, but after the chatbot was launched, filtering seems to have disappeared.” It pointed out the lack of safeguards to distinguish appropriate input (ordinary user conversations) from inappropriate ones. Harmful patterns must be identified quickly, and such information must be filtered and curated so it is never classified as feedback data.
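The safeguard The Verge’s critique implies is a filter that sits between raw user input and the retraining set. A toy sketch of such a gate, with a hypothetical blocklist standing in for the trained toxicity classifier a production system would actually use:

```python
# Hypothetical blocklist; a real system would rely on a trained
# toxicity classifier, not simple keyword matching.
BLOCKED_TERMS = {"slur_a", "slur_b"}


def is_safe(message: str) -> bool:
    """Return False if the message contains any blocked term."""
    words = set(message.lower().split())
    return words.isdisjoint(BLOCKED_TERMS)


def curate_feedback(conversations):
    """Keep only messages that pass the safety gate before they are
    added to the retraining corpus as feedback data."""
    return [m for m in conversations if is_safe(m)]
```

With this gate in place, coordinated hateful input never reaches the retraining corpus, which is precisely the filtering step that seemed to disappear after Tay’s launch.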
Netflix’s Big Success Thanks to Feedback Data
Receiving feedback data is good, but the diversity of feedback data is also very important. Netflix’s core competitiveness is ‘recommendations.’ It recommends content perfectly suited to viewers’ tastes. Netflix experienced considerable failures before achieving this competitiveness.
At one time (2006–2009), Netflix ran a data prediction competition called the ‘Netflix Prize.’ The goal was to improve the star rating system. Star ratings were very important data for users deciding whether to watch content. Therefore, a user’s actual experience and the predicted star rating had to match as closely as possible. If a user watched a movie expecting a 4.5-star rating but actually rated it 2.5 stars, that user would no longer trust predicted star ratings. Adrift in a flood of countless titles without the beacon of star ratings, users might leave Netflix altogether. This was Netflix’s worst-case scenario.
Matching predicted star ratings with actual star ratings was not easy. Some users gave generous ratings to art films but were particularly harsh on entertainment films. They strictly distinguished between ‘movies they wanted to watch’ and ‘movies they felt they had to watch,’ and tended to rate highly movies considered socially or politically desirable. Some gave absurdly high or low ratings out of affection or dislike for a particular actor, or even engaged in ‘rating terrorism’ (deliberate review bombing). It was very difficult to satisfy user expectations with star ratings alone.
Netflix attempted a change. It introduced new feedback indicators, collecting data such as completion rate, viewing duration, binge-watching, rewatching, sharing, and comments, and linking them to the recommendation algorithm. The recommendation system, now learning from far more feedback data, evolved remarkably. Average viewing time increased, and mid-viewing dropout rates decreased. As users came to trust that ‘recommended content is worth watching,’ click rates on recommended content also rose significantly.
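The shift from star ratings to behavioral signals can be illustrated as a weighted implicit-feedback score. The signal names mirror the indicators listed above, but the weights are entirely hypothetical, chosen only for illustration:

```python
# Hypothetical weights for the behavioral feedback signals mentioned
# above; a real recommender would learn these from data.
WEIGHTS = {
    "completion_rate": 0.4,  # fraction of the title actually watched
    "rewatched": 0.25,       # 1 if the viewer returned to the title
    "binge": 0.2,            # 1 if watched within a binge session
    "shared": 0.15,          # 1 if the viewer shared the title
}


def implicit_score(signals: dict) -> float:
    """Combine behavioral signals into a single 0-to-1 relevance score
    that can stand in for a self-reported star rating."""
    return sum(WEIGHTS[k] * signals.get(k, 0.0) for k in WEIGHTS)
```

Unlike a star rating, none of these signals can be inflated for a movie the viewer merely felt obliged to admire: they record what the viewer actually did.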
Thus, feedback data is a decisive factor determining the success or failure of AI models. However, the discussion about feedback data does not end here. The ‘structure of feedback data’ is as important as the feedback data itself. This will be covered in the next installment.
Next Series Preview
⑫ The Idea of “Replacing Strikers with AI” (January 4, 2024)
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
![My Face Even Mom Didn't Recognize, But iPhone Recognized and 'Unlocked' [AI Error Note]](https://cphoto.asiae.co.kr/listimglink/1/2024122211070661756_1734833226.jpg)