"Opinion Polls May Lie, but Data Tells the Truth"

[Asking the Way to the Data Economy] Doppelgangers, Developing Countries' GDP... Facts Revealed by Big Data

[Asia Economy Reporter Kim Heung-soon] "Everybody Lies," a book by Google data analyst Seth Stephens-Davidowitz, recounts the case of David Ortiz, the star hitter of Major League Baseball's Boston Red Sox. Ortiz played for Boston for 14 seasons starting in 2003. In 2004 he helped the team break the "Curse of the Bambino" and win its first World Series in 86 years, and he won another championship in 2007.


His performance fell off in 2008, when he was 33. His batting average dropped by 0.068, his on-base percentage by 0.076, and his slugging percentage by 0.114. In 2009 his numbers were even worse. What caught the Red Sox's attention was the "doppelganger" model developed by American statistician Nate Silver. Silver had compiled year-by-year data on about 18,000 Major League players, including height, age, position, home runs, and batting average. From this pool he selected the 20 players whose records most closely resembled Ortiz's between ages 24 and 33 and analyzed how their careers unfolded. Representative examples were Jorge Posada and Jim Thome. They, too, peaked in their late 20s, struggled in their early 30s, and then regained their form.
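The selection step described above amounts to a nearest-neighbor search over player statistics. The following is a minimal sketch of that idea, not Silver's actual model: the function name `most_similar`, the choice of features, the standardization step, and every number in the sample data are assumptions made up for illustration.

```python
import math

def most_similar(target, candidates, k=20):
    """Return the names of the k candidates closest to the target player.

    Distance is Euclidean after dividing each feature by its standard
    deviation across the candidates, so that no single statistic
    (for example home runs) dominates the comparison.
    """
    features = list(target)
    spread = {}
    for f in features:
        values = [stats[f] for stats in candidates.values()]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        spread[f] = math.sqrt(variance) or 1.0  # avoid dividing by zero

    def distance(stats):
        return math.sqrt(
            sum(((stats[f] - target[f]) / spread[f]) ** 2 for f in features)
        )

    ranked = sorted(candidates, key=lambda name: distance(candidates[name]))
    return ranked[:k]

# Hypothetical players with invented numbers, for illustration only.
target = {"age": 33, "home_runs": 23, "batting_avg": 0.265}
candidates = {
    "player_a": {"age": 33, "home_runs": 25, "batting_avg": 0.270},
    "player_b": {"age": 24, "home_runs": 5, "batting_avg": 0.310},
    "player_c": {"age": 32, "home_runs": 21, "batting_avg": 0.260},
    "player_d": {"age": 38, "home_runs": 40, "batting_avg": 0.220},
}
closest = most_similar(target, candidates, k=2)
```

With real data, the later seasons of the players returned by such a search would then serve as a rough forecast for the target player.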


Following the doppelganger prediction, the Red Sox decided to give Ortiz more opportunities, and the choice proved correct. At age 38, Ortiz hit 0.688 in the 2013 World Series, winning his third championship and being named Most Valuable Player (MVP). He retired in 2016 as one of the players most beloved by Boston fans, who nicknamed him "Big Papi." The doppelganger model shows that data collection and analysis can be put to use not only in baseball but also in a wide range of business fields. The more data sets are available, the more accurate the analysis and predictions can be.



In another case, researchers used alternative data to estimate economic output in developing countries, where gross domestic product (GDP) figures are unreliable. A research team at Brown University in the U.S., made up of Vernon Henderson, Adam Storeygard, and David Weil, argued in the paper "Measuring Economic Growth from Outer Space" that nighttime lights can help measure economic output in developing countries.


They reasoned that much economic activity in developing countries goes unrecorded and that government agencies have limited resources for measuring output, which makes traditional GDP figures unreliable. They therefore analyzed nighttime lights using images from U.S. Air Force satellites that orbit the Earth 14 times a day. The images showed a sharp decline in Indonesia's nighttime lights during the 1998 Asian financial crisis, and a 72% increase in South Korea's nighttime lights from 1992 to 2008. The researchers explained, "Combining weak government data with imperfect nighttime light data produces better estimates than using either alone."
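Why two flawed measurements can beat either one alone can be seen with a standard statistical device, inverse-variance weighting. The sketch below is a generic illustration and not the procedure used in the paper: it assumes both estimates are unbiased with independent errors, and the function name `combine` and the sample numbers are made up.

```python
def combine(estimate_a, var_a, estimate_b, var_b):
    """Combine two noisy estimates of the same quantity.

    Each estimate is weighted by the other's error variance, so the
    less noisy source counts for more. Assumes unbiased estimates
    with independent errors (an assumption of this sketch).
    """
    weight_a = var_b / (var_a + var_b)
    combined = weight_a * estimate_a + (1 - weight_a) * estimate_b
    combined_var = var_a * var_b / (var_a + var_b)
    return combined, combined_var

# Invented example: official statistics suggest 4.0% growth, nighttime
# lights suggest 6.0%, and both carry an error variance of 4.0.
growth, variance = combine(4.0, 4.0, 6.0, 4.0)
```

Under these assumptions the combined variance is always smaller than either input variance, which is the sense in which the blended estimate improves on both sources.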


With the three data laws (the Personal Information Protection Act, the Information and Communications Network Act, and the Credit Information Act) taking effect on August 5, expectations for a revitalized data economy are growing in South Korea. Some, however, remain skeptical that the economic value of data can actually be realized. Stephens-Davidowitz emphasized in his book, "Utilizing all available data and having a broad perspective on what is considered data holds great value for scholars and entrepreneurs." He added, "Photos of lines at a supermarket or images taken from space: all of these are data," stressing that "this new data can see through people's lies (incorrect predictions)."


© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
