Guest Pollster | October 26, 2008
Topics: Climate Modelers, Disclosure, Likely Voters, Nate Silver
As Mark Blumenthal and Nate Silver have both noted in detail of late, the design of likely voter models can significantly impact how pollsters interpret and transform the raw data of voter samples into the topline results we see at pollster.com, fivethirtyeight.com, and other sites covering election polling. In turn, Mark and Nate observe, likely voter model design depends significantly on judgments that pollsters make about how to model the likelihood that any voter sampled will actually turn out and vote in the election. As we have all seen in the last few days, differences in how such judgments get made by different pollsters, combined with differences in the samples of voters collected by each poll, can mean the difference between a 1-point and a 14-point spread between the respective candidates for President.
A key challenge for consumers of polls - whether citizens, journalists, or politicians - is sorting out to what extent the likely voter model or the underlying raw data sample is responsible for variations in poll outcomes. Sorting out how modelers' judgments shape model design and outputs is, in fact, a general challenge in the use of science to inform policy choices, one I have studied for much of the past two decades. Judgments like these are inevitable in any scientific work, which is why policy officials turn to experts to make them on the basis of the best available knowledge, evidence, and theories.
One case that I have looked at in detail is the use of computer models of the Earth's climate to make predictions about whether the planet is experiencing global warming. As I'm sure most of you know, models of climate change have been viewed skeptically by many people. I believe the trials and tribulations of climate modelers - and also their approaches to addressing skepticism about their judgments - offer three useful insights for pollsters working with likely voter models.
- Transparency - climate models are far more complex than most polls, but climate modelers have made significant efforts to make their models transparent, in a way that many pollsters haven't. (In much the same way, computer scientists have called for the code used in voting machines to be open source.) By making their models transparent, i.e., by telling everyone the judgments they use to design their model, pollsters would enhance the capacity of other pollsters and knowledgeable consumers of polls to analyze how the models used shape the final reported polling outcome. They would also do well to publish the internal cross-tabs for their data.
- Sensitivity - climate modelers have also put a lot of effort into publishing the results of sensitivity analyses that test their models to see how they are impacted by embedded judgments (or assumptions). This is precisely what Gallup has done in the past week or so, in a limited fashion, with its "traditional" and "extended" LV models and its RV reporting. By conducting and publishing sensitivity analyses, Gallup has helped enhance all of our capacity to properly understand how their model responds to different assumptions regarding who can be expected to vote.
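To make the idea of a sensitivity analysis concrete, here is a minimal sketch in Python. The data, the turnout-score scale, and the cutoff thresholds are all hypothetical, invented for illustration; this is not Gallup's actual methodology, just a toy version of the same logic: report the topline under several different likely-voter screens rather than only one.

```python
# Toy sensitivity analysis: how a likely-voter cutoff shifts the topline.
# All data and thresholds here are hypothetical, for illustration only.
import random

random.seed(0)

# Simulated raw sample: each respondent has a candidate preference and a
# self-reported turnout-likelihood score from 0 (won't vote) to 10 (certain).
sample = [
    {"candidate": random.choice(["A", "B"]),
     "turnout_score": random.randint(0, 10)}
    for _ in range(1000)
]

def topline_margin(sample, cutoff):
    """Margin (A minus B, in points) among respondents at or above the cutoff."""
    likely = [r for r in sample if r["turnout_score"] >= cutoff]
    a = sum(r["candidate"] == "A" for r in likely)
    b = sum(r["candidate"] == "B" for r in likely)
    return 100.0 * (a - b) / len(likely)

# Publish the margin under several screens, not just one.
# A cutoff of 0 corresponds to reporting all registered voters.
for cutoff in (0, 5, 7, 9):
    print(f"cutoff >= {cutoff}: margin = {topline_margin(sample, cutoff):+.1f}")
```

Because preference and turnout score are drawn independently here, the margins will all hover near zero; in a real sample, where turnout likelihood correlates with candidate preference, tightening the screen can move the topline substantially, which is exactly what a published sensitivity analysis would reveal.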
- Comparison - climate modelers have also taken a third step of deliberate comparisons of their models using identical input data. The purpose of such comparison is to identify where scientific judgments were responsible for variations among models, and where those variations resulted from divergent input data. Since the purpose of polling is to figure out what the data are saying, it is essential to know how different models are interpreting that data, which can only be done if we know how different models respond to the same raw samples.
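A toy sketch of the comparison step, again with hypothetical data and invented models: run the same raw sample through two different likely-voter models, so that any gap between the toplines must come from modeling judgments rather than from sampling differences.

```python
# Toy model comparison: feed IDENTICAL raw data to two hypothetical
# likely-voter models. Any difference in output reflects model judgments,
# not sample variation. Illustrative only.
import random

random.seed(1)
sample = [
    {"candidate": random.choice(["A", "B"]),
     "turnout_score": random.randint(0, 10)}
    for _ in range(1000)
]

def cutoff_model(sample, cutoff=7):
    """Screen model: count only respondents at or above a turnout cutoff."""
    likely = [r for r in sample if r["turnout_score"] >= cutoff]
    a = sum(r["candidate"] == "A" for r in likely)
    b = sum(r["candidate"] == "B" for r in likely)
    return 100.0 * (a - b) / len(likely)

def weighted_model(sample):
    """Weighting model: every respondent counts, weighted by turnout score."""
    a = sum(r["turnout_score"] for r in sample if r["candidate"] == "A")
    b = sum(r["turnout_score"] for r in sample if r["candidate"] == "B")
    return 100.0 * (a - b) / (a + b)

print(f"cutoff model:   {cutoff_model(sample):+.1f}")
print(f"weighted model: {weighted_model(sample):+.1f}")
```

The design choice here mirrors the climate-model intercomparisons described above: holding the input fixed isolates the contribution of the model itself, which is impossible when each pollster's model only ever sees that pollster's own sample.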
Climate modelers have carried out these activities to help ensure that the use of climate model outputs in policy choices is as informed as possible. This can't prevent politicians, the media, or anyone else from misinterpreting the outputs of their models, but it does enable a more informed debate about what the models are actually saying and, therefore, about how to make sense of the underlying data. As polling grows in importance to elections, and therefore to how we practice democracy, pollsters should want their polls to be as informative as possible to journalists, politicians, and the public. Adopting model transparency, sensitivity analyses, and systematic model comparisons could go a long way toward creating such informed conversations.