

Predictive Accuracy of the 2008 Pennsylvania Primary Pollsters

Topics: 2008, ARG, Barack Obama, Berwood Yost, Hillary Clinton, IVR, SurveyUSA

Berwood Yost is Director of The Floyd Institute's Center for Opinion Research at Franklin & Marshall College. Kirk Miller is B.F. Fackenthal Professor of Biology and Senior Research Fellow at The Floyd Institute's Center for Opinion Research.

The 2008 Democratic presidential primary on April 22 put Pennsylvania in the national spotlight for a long six weeks. Members of the media followed the candidates into the Keystone State intending to learn more about its people and its politics. Not far behind the media came the pollsters--some media even brought their own pollsters. Pennsylvania voters were besieged by pollsters in unprecedented numbers. There were 39 publicly released surveys, which included more than 30,000 interviews with the state's voters, during only the last three weeks of the campaign. This is a tremendous increase in polling activity compared to the 26 polls released in the final three weeks of the 2004 presidential campaign in Pennsylvania or the 15 released during the final three weeks of the 2006 Senate campaign.

Taken together, the pollsters who pestered Pennsylvanians did an adequate job of predicting the final outcome: 36 of the 39 polls in April predicted a Clinton victory and the three outliers were all conducted by the same polling organization. We agree with Charles Franklin's assessment that the aggregate performance of the Pennsylvania pollsters was good. Figure 1 is a frequency distribution of the predictive accuracy of the 39 public polls released in Pennsylvania. It shows that there was a slight bias in the polling estimates toward Barack Obama (meaning the polls in Pennsylvania underestimated Hillary Clinton's margin of victory), but that this bias was small and, according to the exit polls, not surprising because late deciding voters moved in larger proportions toward Clinton.

Figure 1 Frequency Distribution of Predictive Accuracy


Some individual pollsters fared much better than others in the accuracy of their estimates. Figure 2 shows the predictive accuracy and corresponding confidence interval for each of the 39 polls conducted between April 1 and 22 in Pennsylvania, arranged by the number of days before the primary that each survey was completed. Those pollsters who produced a biased estimate, meaning the confidence interval for their estimate did not overlap zero, are labeled in Figure 2. Three of the four polls conducted by Public Policy Polling (PPP) were biased, and all three were biased toward Obama. Two of the three polls conducted by American Research Group (ARG) were biased, and one of SurveyUSA's three polls showed bias. One ARG poll showed that Clinton and Obama were tied; the other, seven days later, showed Senator Clinton ahead by 20 points. The SurveyUSA poll that missed also showed Senator Clinton ahead by 20 points. The measure of predictive accuracy we used shows that the pollsters' final estimates were mostly in line with the final election results.

Figure 2 Predictive Accuracy of Individual Polls by Date of Poll


The misses identified in Figure 2 are not related to sample size. Four of the six surveys that missed were among the eight largest samples; the other two had sample sizes only slightly below the median. As one would expect, sample size is related to the width of the confidence interval, but neither sample size nor interval width is related to the likelihood that a survey was biased. Without further examination of the pollsters' methodological choices, we cannot say which of those choices matter most in producing unbiased polls. Some might conclude that pollsters who use interactive voice response (IVR) technology to collect data are more prone to bias because two of the three pollsters who produced biased estimates use IVR, but not all IVR pollsters produced biased results.

Another interesting question we tried to answer is whether the polls converged on the end result as election day approached. Depending on the method used, the answer is a qualified "slightly." Figure 3 shows the predictive accuracy of each poll as a function of days before the Pennsylvania primary. The trend line fitted to the figure is produced by LOWESS, an iterative locally weighted least squares regression. The red dots identify the six biased polls noted earlier. The curve indicates that the polls converged until about two weeks before the election, remained relatively constant for about one week, and then began to converge again over the final days of the campaign. If the six biased polls are removed from the analysis, the convergence is not dramatically improved.
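The kind of smoothing behind a trend line like the one in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the figure: it performs a single pass of tricube-weighted local linear regression and omits the robustness iterations of full LOWESS. The function names, the `frac` span parameter, and its default value are assumptions chosen for the example.

```python
import math


def tricube(u):
    """Tricube kernel: weight falls from 1 at u = 0 to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_smooth(xs, ys, frac=0.5):
    """Single-pass locally weighted linear regression (no robustness steps).

    xs   -- predictor values, e.g. days before the primary
    ys   -- response values, e.g. each poll's ln odds accuracy
    frac -- assumed share of the points used in each local fit
    Returns one smoothed value per input point.
    """
    n = len(xs)
    k = max(2, int(math.ceil(frac * n)))  # neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        dists = sorted(abs(x - x0) for x in xs)
        h = max(dists[k - 1], 1e-12)
        w = [tricube((x - x0) / h) for x in xs]

        # Weighted least squares line through the neighbourhood.
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, xs)) / sw
        my = sum(wi * yi for wi, yi in zip(w, ys)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, xs))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

Passing each poll's days-before-primary value as `xs` and its ln odds value as `ys` would return one smoothed value per poll. A larger `frac` averages over more polls and gives a flatter curve.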

Figure 3 Predictive Accuracy of Individual Polls by Date of Poll with Fitted Regression Line


Measuring Predictive Accuracy

We used the measure of predictive accuracy developed by Martin, Traugott and Kennedy (2005), "A Review and Proposal for a New Measure of Poll Accuracy," Public Opinion Quarterly 69(3): 342-369. Their method compares the ratio of the two candidates' estimated vote shares in a poll with the ratio of the candidates' final vote tallies. The natural log of this odds ratio (ln odds) is used because of its favorable statistical properties and the ease of calculating confidence intervals for each estimate. The confidence interval for a poll that reasonably predicts the final outcome of the primary election will overlap zero. Senator Clinton's votes or projected votes were the numerators in all the ratios we calculated, so negative values for ln odds represent an overestimate in favor of Senator Obama and positive values represent an overestimate in favor of Senator Clinton. According to this measure, a poll is biased if its confidence interval does not overlap zero. The polling results used in this analysis were taken from
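The calculation described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the standard error uses the usual large-sample approximation for the log of a ratio of two multinomial proportions and treats the election result as fixed, the 95 percent confidence level (z = 1.96) is assumed, and the vote shares and sample size in the example are hypothetical, not taken from any poll in this analysis.

```python
import math


def ln_odds_accuracy(poll_clinton, poll_obama, vote_clinton, vote_obama,
                     n, z=1.96):
    """Return (ln_odds, lower, upper) for one poll.

    poll_* -- candidates' shares in the poll, in percent
    vote_* -- candidates' shares of the final vote, in percent
    n      -- poll sample size
    z      -- normal critical value (1.96 assumes a 95% interval)
    Clinton is the numerator, so negative values favor Obama.
    """
    poll_odds = poll_clinton / poll_obama
    vote_odds = vote_clinton / vote_obama
    ln_odds = math.log(poll_odds / vote_odds)

    # Assumed approximation: Var(ln(p1/p2)) ~ (1/p1 + 1/p2) / n,
    # with the election result treated as a fixed constant.
    p_c = poll_clinton / 100.0
    p_o = poll_obama / 100.0
    se = math.sqrt((1.0 / p_c + 1.0 / p_o) / n)
    return ln_odds, ln_odds - z * se, ln_odds + z * se


def is_biased(lower, upper):
    """A poll is biased if its confidence interval does not overlap zero."""
    return lower > 0 or upper < 0


if __name__ == "__main__":
    # Hypothetical inputs: a tied poll of 600 voters against a
    # hypothetical 55-45 result.
    a, lo, hi = ln_odds_accuracy(45.0, 45.0, 55.0, 45.0, n=600)
    print(a, lo, hi, is_biased(lo, hi))
```

Because the standard error shrinks with the square root of the sample size, a larger poll produces a narrower interval around the same ln odds value, so a miss of a given size is more likely to be flagged as biased.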


Kenny Easwaran:

You say "there is no relationship between sample size, width of the confidence interval, and the likelihood that a survey was biased." This seems exactly as it should be, right? I suppose I'm not totally clear on what "biased" means, but if it just means that the election day result was outside the confidence interval of that poll, then the fact that you're using the same confidence level for each poll suggests that each one should be just as likely to be biased. What you'd expect a relation between is the width of the confidence interval and the distance of the center point from the actual election outcome.

