Third Way: The Divinity Of Data In The News Media
It was in 1936, as FDR defeated Alf Landon for the presidency, that George H. Gallup accurately predicted a victory for the Democrats, albeit while far underestimating the scale of FDR's sweep. Despite pollsters underestimating the vote totals in the 1948 presidential election, and then the 1952 election, and then the 1956 election, and so on, political polling had found a permanent home in United States journalism. With time, the public has grown sceptical of polls, largely because they have seemingly failed to provide spot-on predictions, especially in the realm of politics.
An Ironic Poll: US Adults' Trust In Polling
Yet the media and its consumers demand impossible political predictions based on minimal information. Predictions can be upended by just the slightest shift, whether they be about Senate seats or hurricanes. When these barely visible shifts are unaccounted for, the media and the public immediately point fingers at pollsters for not delivering an accurate picture of the future. What is even more frustrating is that the news media, which before election day in 2020 warned that the results would look different on the first night of vote counting and called on people to wait before making any electoral claims, immediately changed course and screamed heresy at pollsters for missing their mark. The science of polling is hard enough as it is, but with the media tearing down any faith in polls it becomes even more difficult for good pollsters to handle the herculean expectation that they are oracles. The apparent "failure" of polls in the minds of United States adults is nothing new, but it does reveal the need for a reevaluation of data collection in journalism and a dire need for a new understanding of the role of polls in journalism.
To understand the problems of polling, it is important to first tackle data as a whole. Data itself is not the main source of the public’s confusion. Big data comes with its faults, like reducing a life story into several data points and encouraging vicious algorithmic spirals that freeze social mobility. Despite this, the numbers themselves are not untrue, assuming the data is collected properly. What makes data-driven journalism tricky is rather that the reader must put their trust in the data scraping teams and the writers who tie a narrative to data. For newspapers and magazines with teams dedicated to this, like The Economist and The Guardian, among others, trusting the reliability of their data is not hard. Indeed, the papers have even fashioned themselves as hubs of data, as the New York Times COVID-19 tracker shows. This still leaves the obvious but pernicious problem of checking data, which must be left to experts in the field rather than United States citizens, kicking the problem of trust down the road. Nonetheless, the field is big enough that bad data can be picked out easily.
The power of data in journalism has more pernicious side effects. Data is similar to standpoint epistemology in that it can either validate a claim or eviscerate one that lacks supporting data. This is not too dangerous a stance; an argument backed by good data is inherently better off than one without, simply because data is more evidence. Yet this gets murkier when one questions what makes "good" data. Here is where partisan flames can ignite. An obvious example is former president Donald Trump doubting polls, election data, and any evidence that worked against his made-up, partisan view of reality. A second example is the Times' poor coverage of the 2016 election, which, by cherry-picking data points, led people to believe that Hillary Clinton was a shoo-in. Another example was the release of Irreversible Damage by the journalist Abigail Shrier in 2020 and earlier debates around a 2018 study by Brown researcher Lisa Littman, both of which dealt with questions of gender dysphoria and possible ill effects of gender reassignment surgeries. Unlike the clear-cut case of Trump, the issues raised by criticisms of these studies are more sensitive and more complex. In both of these cases the data about transgender medical interventions, which is poor, has been questioned. Underneath this criticism are denunciations of both authors as transphobic, creating the appearance of a leftist ideology that does not tolerate dissent. If data is to back up claims, then we find ourselves compelled to research sensitive topics no matter what truth emerges. A blockade on that data would stymie debate, research, and progress towards better methodologies. In journalism, the partisan viewing of data is something the reader must be cautious of.
The problems only increase when the focus moves from data to polling, where individuals are responsible for responding accurately and truthfully and questions deal with opinion rather than demographics. It is undeniable that people lie on polls, either because they are parroting a popular opinion or because they genuinely cannot remember a political stance. Opinions expressed in polls can also change quickly, meaning there will always be a characteristic uncertainty to polling data. Couple that with the fact that certain groups tend not to respond to polls (in 2020 it was Republicans), and the perceived predictive potential of polls is weakened. The "perceived" here is important since the truth is that most polls work pretty well on average. Indeed, both the 2020 and 2016 polls worked relatively normally. Put another way, the presidential election polls are not getting worse with time. The main problem with polls is how they are cast by the media as having an insanely high degree of certitude. A "nearly impossible" win for Trump, for example, should have been understood as a 70% chance of losing; a 30% chance of winning is firmly in the realm of possibility. Political predictability as a whole is something to be sceptical about, according to an in-depth look by Philip Tetlock in Expert Political Judgment. Through a rigorous investigation of experts, Tetlock presented data showing that experts were no better (and sometimes worse) than any regular person or formula at predicting a certain event. Above all, Tetlock's research showed that predicting the future is hard. It is impressive that polls, which are subject to many imperfections, can accurately predict a presidential victor, even if the margins are off by substantial percentage points. Give credit where credit is due.
The Polls Aren't Getting More Inaccurate
While data itself is reliable to an extent, it can be scrubbed and misrepresented easily by those who analyze it. Mistaken or malicious intent can quite easily flip results if one part of a data set is obscured. Presentation, then, is key, and it, like many other aspects of the news, can be influenced by media bubbles. Ideological sway, the gravity of public opinion, and over-education all contribute to a whittling away of public confidence in the news media. Worst of all is the news media's crippling of itself, and punditry’s poor track record of making impossible predictions. No sane person would trust someone who is always wrong.
Dispensing With Data?
What is likely to get confused during critiques of polling in journalism is the role of data journalists. It would be empirically wrong to look at "bad" polls and condemn data journalists for them. Polls themselves differ from data, which is collected by industries and represents patterns, trends, and demographics of consumers. Polls reflect mutable aspects of people (such as political policy beliefs). Data journalism reveals stories that could not be told without a set of skills suited to understanding and scraping data, and it helps make clear points that go below the surface of an issue. It is a valuable methodology of analysis, and having the statistics in one place can fundamentally change how the public understands the facts they are based on.
Still, it is frustrating how informed United States adults must be to navigate their political world. Data dominates citizens' daily lives and backs up stories, yet the everyday adult probably cannot decipher raw data wholly, meaning they must either trust the media to do the work properly or learn how to do it themselves to be sufficiently confident in the information they receive. What is even more infuriating is that the news media, which has the role of informer and educator in the United States, has continually failed to make the nuances of data and polling clear. The news has treated polling as divine word, and therefore its creators as failures when the polls, naturally, are not perfectly precise. Out of all the failures of the news media, this one stings the most.
Above all, the obvious solution to probability problems in the news media is for journalists and readers to understand what the polls are saying. Polls may provide clues about the future, but they can never give a certain prediction. That does not mean polls are worthless; polls provide a helpful picture of the general trends shaping the country and give a clearer picture of public desires.
It may be wiser to rely less on polls as a whole and instead seek information about the public from other sources. One alternative would be to investigate Google searches, which appear to be more accurate predictors of people's true colours than opinion polls. Although this diminishes the improper correlation of identity and beliefs, it still ignores nuances about data sets that may alter outcomes. Nonetheless, looking for new ways of gauging public opinion seems like a step in the right direction. If such sources are combined with polling, current areas of uncertainty in polling may be cleared up, although the polls will never predict the future perfectly.
Polls and data will not soon leave journalism. That is a good thing. At their best, data and polling provide insights into the mentality of the public and help determine where attention should be focused. Yet challenging data's territorial reign in journalism are ontological questions about what data truly says and how that story is communicated. Ideological warfare and simple data illiteracy can make trusting and understanding polls hard. Moreover, the news media has done a paltry job of learning how to factor polls and data into the stories it tells, and when to concede that its predictive power is as limited as most everyone else's. New ways of aggregating data may help increase their predictive value, yet, faced with less trust in polls and certain groups unwilling to answer them, the media should start to move away from opinion polls and towards more reliable and more refinable data. After all, the biggest faults that come from data-driven stories are human ones.