May 01, 2008
Predictive Accuracy of the 2008 Pennsylvania Primary Pollsters
Berwood Yost is Director of The Floyd Institute's Center for Opinion Research at Franklin & Marshall College. Kirk Miller is B.F. Fackenthal Professor of Biology and Senior Research Fellow at The Floyd Institute's Center for Opinion Research.
The 2008 Democratic presidential primary on April 22 put Pennsylvania in the national spotlight for a long six weeks. Members of the media followed the candidates into the Keystone State intending to learn more about its people and its politics. Not far behind the media came the pollsters--some media even brought their own pollsters. Pennsylvania voters were besieged by pollsters in unprecedented numbers. There were 39 publicly released surveys, which included more than 30,000 interviews with the state's voters, during only the last three weeks of the campaign. This is a tremendous increase in polling activity compared to the 26 polls released in the final three weeks of the 2004 presidential campaign in Pennsylvania or the 15 released during the final three weeks of the 2006 Senate campaign.
Taken together, the pollsters who pestered Pennsylvanians did an adequate job of predicting the final outcome: 36 of the 39 polls in April predicted a Clinton victory, and the three outliers were all conducted by the same polling organization. We agree with Charles Franklin's assessment that the aggregate performance of the Pennsylvania pollsters was good. Figure 1 is a frequency distribution of the predictive accuracy of the 39 public polls released in Pennsylvania. It shows that there was a slight bias in the polling estimates toward Barack Obama (meaning the polls in Pennsylvania underestimated Hillary Clinton's margin of victory), but that this bias was small and, according to the exit polls, not surprising because late-deciding voters moved in larger proportions toward Clinton.
Figure 1 Frequency Distribution of Predictive Accuracy

Some individual pollsters fared much better than others in the accuracy of their estimates. Figure 2 shows the predictive accuracy and corresponding confidence interval for each of the 39 polls conducted between April 1 and 22 in Pennsylvania, arranged by the number of days prior to the primary the survey was completed. Those pollsters who produced a biased estimate, meaning the confidence interval for their estimate did not overlap zero, are labeled in Figure 2. Three of the four polls conducted by Public Policy Polling (PPP) were biased and all were biased toward Obama. Two of the three polls conducted by American Research Group (ARG) were biased and one of SurveyUSA's three polls showed bias. One ARG poll showed that Clinton and Obama were tied; the other, seven days later, showed Senator Clinton ahead by 20 points. The SurveyUSA poll that missed also showed Senator Clinton ahead by 20 points. The measure of predictive accuracy we used shows that the pollsters' final estimates were mostly in line with the final election results.
Figure 2 Predictive Accuracy of Individual Polls by Date of Poll
The misses identified in Figure 2 are not related to sample size. Four of the six surveys that missed were among the eight largest samples; the other two had sample sizes only slightly below the median. There is a relationship in these analyses, as one would expect, between sample size and the widths of the confidence intervals, but there is no relationship between sample size, width of the confidence interval, and the likelihood that a survey was biased. We cannot say which methodological choices matter most in producing unbiased polls without examining the pollsters' methods more closely. Some might conclude that pollsters who use interactive voice response (IVR) technology to collect data are more prone to bias because two of the three pollsters who produced biased estimates use IVR, but not all IVR pollsters produced biased results.
Another interesting question we tried to answer is whether the polls converged on the end result as election day approached. Depending on the method used, the answer is a qualified "slightly." Figure 3 shows the predictive accuracy of each poll as a function of days before the Pennsylvania primary. The trend line fitted to the figure is produced by LOWESS, an iterative locally weighted least-squares regression. The red dots identify the six biased polls noted earlier. The curve indicates that the polls began to converge until about two weeks prior to the election, that they remained relatively constant for about a one-week period, and then began to converge again over the final days of the campaign. If the six biased polls are removed from the analysis, the convergence is not dramatically improved.
Figure 3 Predictive Accuracy of Individual Polls by Date of Poll with Fitted Regression Line
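For readers who want to see what a LOWESS trend line involves, here is a minimal pure-Python sketch. It is not the code used to produce Figure 3; it assumes a tricube neighborhood weight, a local straight-line fit, and bisquare robustness weights, which is the usual textbook formulation. The function names, the `frac` and `iterations` parameters, and the sample data are all hypothetical.

```python
import math
from statistics import median


def tricube(u):
    """Tricube kernel: weight falls from 1 at u=0 to 0 at u>=1."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear(x, y, w, x0):
    """Weighted least-squares line through (x, y), evaluated at x0."""
    sw = sum(w)
    if sw == 0.0:
        return None  # no usable neighbors
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx < 1e-12:
        return my  # all weighted x values coincide: use weighted mean
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + (sxy / sxx) * (x0 - mx)


def lowess(x, y, frac=0.5, iterations=2):
    """Return smoothed y values at each x (illustrative LOWESS sketch)."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # neighbors per local fit
    scale = (max(y) - min(y)) or 1.0
    robust = [1.0] * n
    fitted = list(y)
    for _ in range(iterations + 1):
        new_fit = []
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[k - 1] or 1e-12  # bandwidth: k-th nearest distance
            w = [r * tricube(d / h) for r, d in zip(robust, dist)]
            value = local_linear(x, y, w, x[i])
            new_fit.append(fitted[i] if value is None else value)
        fitted = new_fit
        resid = [abs(yi - fi) for yi, fi in zip(y, fitted)]
        s = median(resid)
        if s <= 1e-12 * scale:
            break  # fit is essentially exact; nothing to down-weight
        # Bisquare weights shrink the influence of large residuals.
        robust = [(1.0 - (r / (6.0 * s)) ** 2) ** 2 if r < 6.0 * s else 0.0
                  for r in resid]
    return fitted


# Hypothetical data: days before the primary and ln odds for each poll.
days_before = [21, 19, 17, 15, 14, 12, 10, 8, 7, 5, 3, 2, 1]
ln_odds = [-0.30, -0.10, -0.22, -0.15, 0.25, -0.12, -0.14,
           -0.10, -0.13, -0.08, -0.09, -0.05, -0.06]
trend = lowess(days_before, ln_odds, frac=0.6)
```

The robustness iterations are what keep a single outlying poll from pulling the trend line toward it, which is why removing outliers afterward may change such a curve only modestly.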
Measuring Predictive Accuracy
We used the measure of predictive accuracy developed by Martin, Traugott and Kennedy (2005), "A Review and Proposal for a New Measure of Poll Accuracy," Public Opinion Quarterly 69(3): 342-369. Their method takes the ratio of the two candidates' estimated shares in a poll and divides it by the same ratio in the final vote tally. The natural log of this odds ratio (ln odds) is used because of its favorable statistical properties and the ease of calculating a confidence interval for each estimate. Senator Clinton's votes or projected votes were the numerators in all the ratios we calculated, so negative values for ln odds represent an overestimate in favor of Senator Obama and positive values represent an overestimate in favor of Senator Clinton. A poll that reasonably predicts the final outcome of the primary election has a confidence interval that overlaps zero; according to this measure, a poll is biased if its confidence interval does not overlap zero. The polling results used in this analysis were taken from Pollster.com.
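The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the standard error uses the usual delta-method approximation for a log odds, sqrt(1 / (n p (1 - p))), with p taken as Clinton's share of the two-candidate total and n as the poll's full sample size. The function names and all input numbers are hypothetical.

```python
import math


def poll_accuracy(clinton_poll, obama_poll, clinton_vote, obama_vote, n, z=1.96):
    """Return (ln_odds, lower, upper) for one poll.

    Shares may be percentages or proportions; only their ratios matter.
    n is the poll's sample size (assumed here to apply to the
    two-candidate comparison). z=1.96 gives a 95% interval.
    """
    # Clinton is the numerator, so negative values favor Obama.
    ln_odds = math.log((clinton_poll / obama_poll) / (clinton_vote / obama_vote))
    # Clinton's share of the two-candidate total in the poll.
    p = clinton_poll / (clinton_poll + obama_poll)
    # Delta-method standard error of a log odds (an assumption of this sketch).
    se = math.sqrt(1.0 / (n * p * (1.0 - p)))
    return ln_odds, ln_odds - z * se, ln_odds + z * se


def is_biased(lower, upper):
    """A poll is biased if its confidence interval does not overlap zero."""
    return lower > 0 or upper < 0


# Hypothetical inputs: a poll showing a tie against an illustrative
# 55-45 final result, at two different sample sizes.
for sample_size in (200, 1000):
    a, lo, hi = poll_accuracy(50, 50, 55, 45, sample_size)
    print(sample_size, round(a, 3), round(lo, 3), round(hi, 3), is_biased(lo, hi))
```

With these made-up inputs the point estimate is the same at both sample sizes, but only the larger sample yields an interval narrow enough to exclude zero. A large sample does not protect a poll from being classified as biased; it makes a given miss easier to detect.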
-- Guest Pollster
May 01, 2008 in The 2008 Race
Comments
You say "there is no relationship between sample size, width of the confidence interval, and the likelihood that a survey was biased." This seems exactly as it should be, right? I suppose I'm not totally clear on what "biased" means, but if it just means that the election day result was outside the confidence interval of that poll, then the fact that you're using the same confidence level for each poll suggests that each one should be just as likely to be biased. What you'd expect a relation between is the width of the confidence interval and the distance of the center point from the actual election outcome.
Posted on May 3, 2008 9:14 PM