Surge in Polling Agencies in the 2000s
Internet Polls Proliferate, Statistical Rigor Declines
Nonresponse Bias and Last-Minute Election Herding Also Noted
The U.S. presidential election, which drew unprecedented global attention, has come to an end. Polls predicted a narrow race with Vice President Kamala Harris holding an edge, but the result was a landslide victory for President-elect Donald Trump. In 2016, U.S. polls forecast a decisive win for former Secretary of State Hillary Clinton, yet the White House went to Trump. In 2020, polls projected an 8-percentage-point victory for President Joe Biden, but the actual margin was around 4 percentage points, marking the largest polling error in 40 years. Although President-elect Trump’s election record stands at two wins and one loss, his track record against the polls is three wins and zero losses. Why do polls miss the mark like this?
The Fading Tradition of Probability Sampling
To understand why polling predictions have been off, it is necessary to review the evolution of polling methods in the U.S. According to Emerson College Polling Center (ECPC), about 100 years ago, data in the U.S. was mainly collected through mail and face-to-face interviews. Then, in the 1970s, as landline telephones became widespread in American households, telephone polling based on Random Digit Dialing (RDD) began to flourish. Pollsters would select an area code and then randomly generate seven-digit phone numbers starting with that code, which were called by pre-hired call center interviewers.
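The number-generation step of RDD can be sketched in a few lines of Python. This is only an illustration of the idea described above, not any pollster's actual procedure: the function name `random_digit_dial`, the example area code, and the fixed seed are all assumptions, and the sketch does not check whether a generated number is actually in service.

```python
import random
from typing import List, Optional


def random_digit_dial(area_code: str, count: int, seed: Optional[int] = None) -> List[str]:
    """Return `count` randomly generated phone numbers within one area code."""
    rng = random.Random(seed)
    numbers = []
    for _ in range(count):
        # Draw seven random digits, keeping any leading zeros.
        local = f"{rng.randrange(10_000_000):07d}"
        numbers.append(f"{area_code}-{local[:3]}-{local[3:]}")
    return numbers


# "617" is an arbitrary example area code (an assumption for illustration).
call_list = random_digit_dial("617", count=5, seed=42)
for number in call_list:
    print(number)
```

Because every seven-digit combination is equally likely to be drawn, each household with a landline in that area code has the same chance of being called, which is what makes the method a form of probability sampling.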
In the 1990s, polling methods underwent another transformation. While random telephone surveys were good at ensuring sample representativeness, they had drawbacks: many calls reached businesses or government agencies that were irrelevant to the poll, which wasted time and money. To address this, 'stratified sampling' was introduced, drawing on publicly available voter registration lists that included gender, age, and education level. For example, if 40% of the population were college graduates, a sample of 100 would consist of 40 randomly selected college graduates and 60 randomly selected non-graduates, producing a sample that resembles the population.
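The 40-out-of-100 example can be reproduced with a minimal sketch of proportional stratified sampling. The toy voter list, the `college` field, and the function name `stratified_sample` are hypothetical stand-ins for a real voter registration file.

```python
import random


def stratified_sample(voters, key, sample_size, seed=None):
    """Draw a random sample whose strata match their share of the population."""
    rng = random.Random(seed)
    # Group voters into strata by the chosen attribute.
    strata = {}
    for voter in voters:
        strata.setdefault(voter[key], []).append(voter)
    sample = []
    for members in strata.values():
        # Allocate seats in proportion to the stratum's population share.
        # With uneven shares, rounding can make the total differ slightly
        # from sample_size; the example below divides evenly.
        seats = round(sample_size * len(members) / len(voters))
        # Selection inside each stratum is random, not hand-picked.
        sample.extend(rng.sample(members, seats))
    return sample


# Toy population (assumed): 10,000 voters, 40% of them college graduates.
voters = [{"id": i, "college": i < 4000} for i in range(10_000)]
sample = stratified_sample(voters, "college", sample_size=100, seed=1)
graduates = sum(v["college"] for v in sample)
print(f"{graduates} graduates, {len(sample) - graduates} non-graduates")
```

The key design point is that the strata only fix how many people come from each group; who is chosen within a group is still left to chance.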
Among the leading U.S. polling organizations, The New York Times (NYT) and Siena College have continued to conduct telephone polls using voter registration lists. According to them, polls typically survey about 1,000 respondents and carry a margin of error of ±3 to 4 percentage points. At a 95% confidence level, this means that if the same poll were conducted 100 times, about 95 of the results would be expected to fall within the margin of error of the true value.
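The ±3 to 4 point figure follows from the standard textbook formula for the sampling error of a proportion. The sketch below assumes a simple random sample and a candidate near 50% support; it ignores weighting and other design effects that real pollsters account for, and the function name `margin_of_error` is chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size: int, share: float = 0.5, confidence: float = 0.95) -> float:
    """Half-width of the confidence interval for a proportion (simple random sample)."""
    # Critical value of the normal distribution, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(share * (1 - share) / sample_size)


for n in (600, 1000):
    print(f"n = {n}: about ±{margin_of_error(n) * 100:.1f} percentage points")
```

Under these assumptions, 1,000 respondents give roughly ±3.1 points and 600 respondents roughly ±4.0 points, in line with the range quoted above. Because the error shrinks only with the square root of the sample size, halving it requires about four times as many respondents.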
Sample Bias in Online Polling
The problem begins here. Since the 2000s, the era of traditional telephone polling has waned, giving way to polls conducted via mobile phones, text messages, and internet platforms. With the rise of internet-based polling in particular, the previously strict stratified sampling has been replaced by 'quota sampling.' Like stratified sampling, quota sampling divides the population into subgroups, but it fills each subgroup with whoever is available rather than with randomly selected respondents, which reduces statistical rigor and increases sample bias.
The Pew Research Center noted, "In the early 2000s, there were only about 30 companies publishing nationwide election polls, but now there are over 60. The problem is that about half of these use 'opt-in' online surveys rather than traditional methods like telephone interviews using random phone numbers." This method is cheaper than telephone interviews but tends to survey groups with high political interest or familiarity with the internet, reducing sample representativeness. Pew Research Center pointed out, "Surveys using non-probability sampling can have, on average, twice the error of those using probability sampling."
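A small simulation can illustrate why meeting demographic quotas does not by itself remove this bias. Everything in the sketch is a made-up assumption for illustration: 30% of people are politically engaged, engaged people back candidate A at 60% versus 45% for everyone else, and only engaged people volunteer for the opt-in panel. None of these numbers come from Pew or any real poll.

```python
import random

rng = random.Random(0)


def make_person(college: bool, engaged: bool) -> dict:
    # Assumed: engaged people support candidate A more often (60% vs. 45%).
    p_support = 0.60 if engaged else 0.45
    return {"college": college, "engaged": engaged,
            "supports_a": rng.random() < p_support}


# Assumed population: 40% college graduates, about 30% politically engaged.
population = [make_person(college=(i % 10 < 4), engaged=(rng.random() < 0.30))
              for i in range(50_000)]


def draw(sample_size: int, opt_in: bool) -> list:
    """Fill the same education quotas, either at random or from volunteers only."""
    sample = []
    for college, share in ((True, 0.4), (False, 0.6)):
        pool = [p for p in population if p["college"] == college]
        if opt_in:
            # Quota sampling: only self-selected (engaged) people are reachable.
            pool = [p for p in pool if p["engaged"]]
        sample.extend(rng.sample(pool, round(sample_size * share)))
    return sample


def support(people: list) -> float:
    return sum(p["supports_a"] for p in people) / len(people)


stratified = draw(1000, opt_in=False)
quota = draw(1000, opt_in=True)
print(f"population: {support(population):.3f}")
print(f"stratified: {support(stratified):.3f}")
print(f"quota:      {support(quota):.3f}")
```

Both samples contain exactly 400 graduates and 600 non-graduates, yet under these assumptions the opt-in sample overstates support for candidate A because the people inside each quota cell are not typical of that cell.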
However, this cannot be blamed solely on online polling. Even well-regarded, statistically rigorous polls missed the mark in the last three elections in which Trump was a candidate. In 2016, most polls predicted a decisive win for Clinton. Nationally, Clinton received 2.8 million more votes, but Trump swept the battleground states, won the Electoral College with 304 votes, and entered the White House. In the 2020 election, polls predicted an 8-percentage-point victory for Biden, but the actual margin was around 4 percentage points, meaning the polls overstated his lead roughly twofold. The American Association for Public Opinion Research (AAPOR) reflected, "The 2020 polls had an unprecedented scale of error, the largest in 40 years for the national popular vote."
Polling organizations, determined to improve, predicted in this election that Vice President Harris held a narrow national lead within the margin of error and a slight edge in the battleground states. Instead, Trump won not only the national popular vote but also all seven battleground states. A common thread in the last three elections is that Trump’s support was underestimated. Pew Research explained, "Looking back over the past 20 years of election polling, elections with Trump as a candidate had large prediction errors, but those without Trump were generally accurately predicted."
The 'Silent Trump' Nonresponse
In the last two elections, experts pointed to 'shy Trump voters' who hid their support for Trump and falsely reported their preferred candidate as a reason for polling failures. However, Pew Research Center found no evidence supporting the shy Trump effect. Instead, experts are now focusing on 'nonresponse bias.'
Star statistician Nate Silver, known as a 'wizard' of U.S. election forecasting, noted, "Trump supporters often have low civic engagement and social trust, so they may be less willing to complete news organizations’ surveys," suggesting that nonresponse, rather than false responses from shy Trump voters, may reduce the representativeness of poll samples. Nate Cohn, a data analyst at NYT, also reported, "Recent NYT and Siena College polls showed that white Democrats were 16% more likely to respond than white Republicans."
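The effect of such a response gap can be worked out with simple arithmetic. The sketch below takes the 16% figure from Cohn's remark and adds two simplifying assumptions that are not from the source: the two groups are the same size in the electorate, and the poll applies no weighting to correct for the gap.

```python
def observed_share(true_share: float, relative_response: float) -> float:
    """Share of completed interviews coming from a group that responds
    `relative_response` times as often as everyone else."""
    responders = true_share * relative_response
    others = (1 - true_share) * 1.0
    return responders / (responders + others)


# Assumed: an evenly split electorate; Democrats respond 1.16 times as often.
dem_share = observed_share(true_share=0.50, relative_response=1.16)
skew = (2 * dem_share - 1) * 100
print(f"Democrats among respondents: {dem_share * 100:.1f}%")
print(f"apparent Democratic lead: {skew:.1f} points")
```

Under these assumptions, a 50-50 electorate shows up as roughly 53.7% to 46.3% among respondents, an apparent lead of about 7 points that exists only because one side answered the phone more often. Pollsters try to remove this with weighting, which works only if the people who do respond resemble the nonrespondents in their group.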
Polling Organizations’ 'Herding'
The 'herding' phenomenon among polling organizations near election day is also cited as a problem. Pollsters tend to avoid reporting statistical outliers that show unusually high or low values near the election, fearing damage to their reputation, and instead follow the consensus trend. Statistician Silver pointed out, "Too many polls in battleground states report a margin between Harris and Trump within 1 percentage point. Normally, there should be a larger difference."
For example, pollster Ann Selzer, known as the 'Midwest prophet,' reported an outlier showing Vice President Harris leading Trump by 3 percentage points in Iowa, a Trump stronghold, before the election. However, the actual result was a 14-percentage-point victory for Trump, denting her credibility.
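Silver's point that margins should normally vary more can be checked against sampling error alone. The sketch below is a rough normal approximation and not Silver's own calculation. It assumes a two-candidate race that is truly tied and polls of 800 respondents each; both figures are hypothetical, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def share_of_polls_within(points: float, sample_size: int, true_margin: float = 0.0) -> float:
    """Expected fraction of honest polls whose reported margin lands within
    ±`points` of a tie, given only random sampling error."""
    # In a two-candidate race the margin is 2p - 1, so its standard error
    # is twice the standard error of a single candidate's share.
    p = (1 + true_margin / 100) / 2
    se_points = 2 * sqrt(p * (1 - p) / sample_size) * 100
    dist = NormalDist(mu=true_margin, sigma=se_points)
    return dist.cdf(points) - dist.cdf(-points)


close = share_of_polls_within(points=1.0, sample_size=800)
print(f"polls expected within 1 point: {close * 100:.0f}%")
```

Under these assumptions, only about one poll in five should show a margin within 1 point even when the race really is tied. If a much larger fraction of published polls falls in that band, chance alone is an unlikely explanation, which is the statistical signature of herding.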
The Wall Street Journal (WSJ) commented, "Over the last three elections, Trump’s support improved among college graduates, workers, Latinos, and Black voters," and added, "Looking at the U.S. polls that underestimated Trump’s support three times in a row, it seems they still do not understand the political climate Trump has created in America."
© The Asia Business Daily(www.asiae.co.kr). All rights reserved.
![[Global Focus] Final Score '3 to 0'... Why Polls Show Trump’s Crushing Defeat](https://cphoto.asiae.co.kr/listimglink/1/2024111316095298804_1731481792.jpg)
![[Global Focus] Final Score '3 to 0'... Why Polls Show Trump’s Crushing Defeat](https://cphoto.asiae.co.kr/listimglink/1/2024110715523690977_1730962355.jpg)
![[Global Focus] Final Score '3 to 0'... Why Polls Show Trump’s Crushing Defeat](https://cphoto.asiae.co.kr/listimglink/1/2024111316213198855_1731482492.jpg)

