

Bowers Vs. 538 Vs. Pollster

Topics: Chris Bowers , Fivethirtyeight , Poll Aggregation , Pollster.com

Chris Bowers posted a two-part series this week that compares the final estimate accuracy of his simple poll averaging ("simple mean of all non-campaign funded, telephone polls that were conducted entirely within the final eight days of a campaign") to the final pre-election estimates provided by this site and Fivethirtyeight.com.
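Bowers' averaging rule is simple enough to sketch in a few lines. The poll records and dates below are invented for illustration; only the filter (non-campaign-funded telephone polls conducted entirely within the final eight days) and the unweighted mean come from his description.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical poll records: (pollster, margin, start, end, campaign_funded, mode).
# Margins are Dem minus Rep, in points; all names and numbers are illustrative.
polls = [
    ("Poll A", +6.0, date(2008, 10, 28), date(2008, 10, 30), False, "phone"),
    ("Poll B", +4.5, date(2008, 10, 29), date(2008, 11, 1),  False, "phone"),
    ("Poll C", +9.0, date(2008, 10, 30), date(2008, 11, 2),  True,  "phone"),  # campaign-funded: excluded
    ("Poll D", +3.0, date(2008, 10, 20), date(2008, 10, 24), False, "phone"),  # started too early: excluded
]

election_day = date(2008, 11, 4)
window_start = election_day - timedelta(days=8)

# Keep only non-campaign-funded telephone polls conducted entirely
# within the final eight days before the election.
eligible = [
    margin
    for _, margin, start, end, funded, mode in polls
    if not funded and mode == "phone" and start >= window_start and end < election_day
]

estimate = mean(eligible)  # simple, unweighted average of the margins
print(round(estimate, 2))  # prints 5.25 for these invented polls
```

The point of the comparison is exactly that a filter-and-average this crude lands within about 10% of the more elaborate methods.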

Chris crunches the error on the margin in a variety of ways, but the bottom line is that there is very little difference among the methods. These are his conclusions:

  • 538 and Pollster.com even, I'm further back: Pollster was equal to 538 when all campaigns are included (the "1 or more" line) and with all campaigns except the outliers (the "2 or more" line). Kind of funny that not adjusting any of the polls, and adjusting all of the polls, results in the same rate of error. To no one's surprise, my method was much better among more highly polled campaigns, but still about 10% behind the other two once poll averaging (2 polls or more) comes into play. I make no pretense about my method needing polls in order to work.
  • Anti-conventional wisdom: 538 had the edge among higher-polled campaigns, which means Pollster.com was superior among lower-polled campaigns. This goes against conventional wisdom. Many thought Silver's demographic regression gave him an edge among less-polled campaigns, but that Pollster's method only worked well in heavily polled environments. Turns out the opposite was true, and I'm not sure why. Maybe Silver's demographic regressions don't work, but his poll weighting does. Or something.
  • Still very close: While I was a little behind, the difference between the methods is minimal. I'm a little disappointed, but clearly anyone can come very close to both 538 and Pollster.com in terms of prediction accuracy with virtually no effort. Just add up the polls and average them. It is about 90% as good as the best methods around, and anyone can do it.

You can see the full post for details, but his calculations are in line with what we found in our own quick (and as yet unblogged) look at the same data. We simply saw no meaningful differences when comparing the final, state-level estimates on Pollster to Fivethirtyeight.

Keep in mind that we designed our estimates, derived from the trend lines plotted on our charts, to provide the best possible representation of the underlying poll data -- nothing more and nothing less. So the accuracy of our estimates tells us that the poll data alone, once aggregated at the end of the campaign, provided remarkably accurate predictions of state-level election outcomes. The fact that the more complex models used at FiveThirtyEight were equally accurate raises the question: In terms of predictive accuracy, what value did FiveThirtyEight's extra steps (weighting by past poll performance and the various adjustments based on other data and regression models) provide?



Ah, I wondered when people would get around to this. I came to the same conclusion after doing much more analysis on it than it really deserved, although I included RCP where I could as well.

It was also interesting to me how well (generally) 538's regression model did _alone_. Not as well as using the polling data, really, but actually not bad.

I'm sure Mark's concluding question will raise some hackles, but it's an important one, and it is one of the reasons that I really like pollster.com's approach: not because I believe it will be more accurate, but because they are interested in just _presenting data_ elegantly.



Most of 538's regression techniques really applied only when looking toward the future, so when you use Election Day as the readout, most of his correction factors drop out. The polls were not moving all that much over the last week or so, and the frequency of polling went up, so all the fancy adjustments he made to try to catch the polling up became largely irrelevant down the stretch.

Now that's not to say they were not useful earlier in the race, in terms of estimating where the race stood at a given moment and/or where it was likely to move.

To evaluate that, one could analyze the times during the campaign when a simpler technique such as Pollster's deviated from the predictions given by 538, then look at the subsequent polls that came out and figure out whether 538 did a good job or not.

From observing 538 and the polls over the course of the election, I'd guess that he did OK, but not that great. It seemed there were a few states that would keep drifting in one direction due to his regression analysis as the polling data in that state got old, and then new polling would come in and the state would snap back in the other direction.

If his techniques were really working, you'd expect the new polling to come in closer to his regression than to the simple-minded average of past polls.
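The test proposed above can be sketched as a simple scoring loop: at each divergence between the two methods, check whether the next poll to come out landed closer to 538's projection or to the simple average. Every number below is invented for illustration; the divergence threshold is an assumption, not anything either site used.

```python
# Hypothetical snapshots for one state: at several points in time we record the
# simple poll average, 538's adjusted projection, and the margin of the next
# poll released afterward. All values are invented for illustration.
snapshots = [
    # (simple_avg, fte_projection, next_poll_margin)
    (4.0, 6.0, 5.5),
    (3.0, 5.0, 3.5),
    (7.0, 6.0, 6.8),
]

DIVERGENCE = 1.0  # assumed threshold: only score cases where the methods meaningfully disagree

fte_wins = 0
cases = 0
for simple_avg, fte, next_poll in snapshots:
    if abs(simple_avg - fte) < DIVERGENCE:
        continue  # methods agree; the next poll can't distinguish them
    cases += 1
    if abs(next_poll - fte) < abs(next_poll - simple_avg):
        fte_wins += 1  # the new poll vindicated 538's adjustment

print(f"538 closer to the next poll in {fte_wins} of {cases} divergent cases")
```

If the adjustments were adding real information, the win rate in divergent cases should run well above 50%.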



All reasonable poll aggregations are going to get the predictions pretty close if they rely on polls taken near the end of October. Bowers' results are therefore no surprise.

I think the value added of 538's method is that it may give a more accurate reading of how things are going (both in each state and nationally) when the polling is sparse, as well as earlier in the electoral season (July-August-September). Its method of using a demographic-historical baseline, as well as polling information from related ("similar") states or from national polls, to inform the estimates of state polls was a unique feature compared with other polling aggregations. This approach did a decent job of "smoothing" the trends as well. And I think he told a pretty good word story based on his estimates.

Much like Silver's "PECOTA" forecasts of baseball player performance, however, there is a certain irreducible uncertainty (randomness or unpredictability) about the underlying phenomenon. In those player forecasts, for example, he may be able to achieve, say, a 65% R-sq (and a small RMSE) for some player performance indicator, but not 70% or above; in doing so he fairly consistently (but not always) beats the competition by 2 or 3 percentage points in R-sq, and with a smaller RMSE.
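The two accuracy measures named above are standard and easy to compute from predicted-vs-actual pairs. The numbers here are invented purely to show the arithmetic; they are not Silver's figures.

```python
from math import sqrt
from statistics import mean

# Illustrative forecast-vs-actual pairs (invented values, e.g. vote margins in points).
predicted = [5.0, -2.0, 8.0, 1.0]
actual    = [6.0, -1.0, 7.0, 0.0]

n = len(actual)

# RMSE: root of the mean squared prediction error.
rmse = sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# R-sq: 1 minus the ratio of residual to total sum of squares.
a_bar = mean(actual)
ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
ss_tot = sum((a - a_bar) ** 2 for a in actual)
r_sq = 1 - ss_res / ss_tot

print(round(rmse, 3), round(r_sq, 3))  # prints 1.0 0.92 for these invented pairs
```

A "bettor's edge" in this vocabulary is exactly a consistently smaller RMSE (and higher R-sq) than the competition, even when both are far from perfect.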

In the end, Silver's mainly looking for what might be termed a "bettor's edge," not a decisive victory. In that sense, he acquitted himself quite well in his first-ever attempt to make electoral predictions -- though by late October he should not be expected to have any bettor's advantage over other highly sophisticated approaches, such as Pollster's. And he showed a lot of insight into the prediction business early on with some of his calls in the primary elections.

