

South Carolina Poll Errors



The polls had a bad day on Saturday, grossly underestimating the support for Barack Obama, though they nailed the Clinton and Edwards votes quite well. Not one poll came within the ten-ring, and the final poll of the primary understated Obama's vote by nearly 15 points. Several polls flirted just inside the 20-ring, and one hapless example of the consequences of poor question wording, the Clemson University poll, understated Obama's support by nearly 30 points. The Clemson poll allowed 36% of respondents to remain "undecided," hopelessly biasing its estimates of candidate support downward, but especially so for Obama.
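The downward bias from a large undecided share, and the proportional fix for it, can be sketched in a few lines of Python. The Clinton and Edwards figures (20% and 17%) are the Clemson numbers discussed in this post; the Obama raw share is an assumption inferred from the "nearly 30 points" understatement and is used here for illustration only.

```python
# Sketch: how leaving a large "undecided" share in the toplines deflates
# every candidate's estimate. Clinton and Edwards shares are from the post;
# Obama's raw share is an illustrative assumption, not a reported number.
clemson = {"Obama": 27.0, "Clinton": 20.0, "Edwards": 17.0}  # percent
undecided = 36.0  # percent left unallocated

# Proportional allocation: renormalize the decided shares to sum to 100,
# which splits the undecideds among candidates in proportion to their totals.
decided_total = sum(clemson.values())
allocated = {name: share * 100.0 / decided_total for name, share in clemson.items()}

for name, raw in clemson.items():
    print(f"{name}: raw {raw:.0f}% -> allocated {allocated[name]:.1f}%")
```

With 36% held out, every raw topline is mechanically low; proportional reallocation raises all three estimates, though it still cannot capture undecideds who break disproportionately toward one candidate.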

My colleague Mark Blumenthal has posted a nice comparison of these South Carolina results with those "terrible" polls from New Hampshire. Below is the same poll error chart for New Hampshire, but scaled the same as the one for South Carolina above.


The New Hampshire results were mostly inside the 10-ring and all were inside the 15-ring. And a couple even touched the 5-ring. Judged by distance from the bullseye, New Hampshire doesn't look that bad, certainly not compared to South Carolina.

But there is another difference, and this is where New Hampshire was terribly wrong and South Carolina not so bad: All but one of the New Hampshire polls had the wrong leader. None of the South Carolina polls, not even Clemson's, got the leader wrong.

So while the distance from the bullseye was quite a bit worse in South Carolina, the creation of confounded expectations was not. It was the expectations that were created and then confounded that made New Hampshire a polling disaster, while little has been said about the polling errors in South Carolina. (Except here, where we care about such things all the time!)

The other interesting parallel is that the number-two finisher in both South Carolina and New Hampshire was estimated quite well. The SC polls got Clinton within the normal margin of error, and the New Hampshire polls likewise got the second-place finisher there, Obama, within reasonable error.
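The "normal margin of error" invoked above is the usual 95% half-width for a sample proportion. A minimal sketch, using a hypothetical sample size since the post does not report one:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of a 95% confidence interval for a proportion p
    (as a fraction) from a simple random sample of size n,
    returned in percentage points."""
    return z * math.sqrt(p * (1 - p) / n) * 100

# Hypothetical example: a candidate at 27% in a poll of 600 likely voters.
print(round(margin_of_error(0.27, 600), 1))  # about +/- 3.6 points
```

This is the sampling-error floor only; the misses described in this post are far larger than it, which is why the explanations turn to late deciders and likely-voter screens rather than sample size.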

The problem in both cases is the substantial underestimate of the first-place finisher's vote. The final choices of late-deciding voters are a challenge for all polling, and perhaps especially so in primaries, where there is no "party identification" to come home to if you can't make up your mind. In New Hampshire the Clinton win rested on significantly more voters supporting her than expected. In South Carolina it was the magnitude of the victory, rather than first place itself, that confounded the polling.

Increases in voter turnout this cycle may be part of the story (a 75% increase in South Carolina), but here we see those late deciders breaking for different candidates, and yet in both cases for the ultimate winner. Second-place results may on average be slightly low compared to the polls, but the first-place "bonus" seems quite strong, at least for the Democrats. In the Republican South Carolina primary, both the first- and second-place finishers were a bit underestimated, so there was not the same asymmetric error for first place. The New Hampshire Republican polls also understated the votes for first and second place about equally. The relatively lightly polled Michigan Republican race shows a somewhat greater underestimate of both first place (Romney) and second place (McCain). And in Nevada, with only three late polls, Romney was dramatically underestimated, while Ron Paul, who finished second, was only moderately underestimated.

So perhaps these errors reflect pollsters' difficulty in discerning the likely behavior of undecided voters, or perhaps they reflect last-minute decisions to vote by "not-so-likely" voters who are screened out of the sample but who turn out for the ultimate winner in larger-than-expected numbers.

Turning to 2nd and 3rd place, the chart below shows that the polls had a pretty good day predicting the Clinton and Edwards votes. Despite some chatter about a late Edwards surge and a Clinton fall (including some evidence in our sensitive trend estimates that such movement was occurring), most of the late polls were within the five-ring for 2nd and 3rd place, and all got the order of finish right.


Cross-posted at Political Arithmetik.


Daniel T.:

I remain puzzled as to why we see such a surge for the winning candidate; neither of your two explanations is convincing. The issue isn't the timing of the break but its direction and magnitude. There is nothing magical about undecided or "not likely" voters that would make them all break a certain way; if the polling is any good, such people should break in the exact same proportion as the polled voters do. That's what sampling is all about.

We have now seen this effect in NV, NH, and SC. Fundamentally, there is something wrong with the voter screens. Whatever sample is devised is not representative of the group as a whole. This is bad.

One possible cause is the increasing reliance on cell phones. Most modern voter polling is done via landlines. Yet neither I nor most people I know even own a landline phone anymore. And even if you had cell numbers, most owners would not be willing to use up valuable airtime minutes on a survey. I don't know why this would produce the result it does, but it is something to think about.

Another possibility is an inability to take political organizations into account. Perhaps people who say they are not likely to vote make that statement based upon past experience, not on what they plan to do in this election.

In any event, the problem lies in the voter screens somewhere.



About the Clemson U poll:
One cause may be poor wording of the questions.
But the poll was conducted over a much longer period, from Jan. 15 to 24. Given the verbal jousting and increased attention on SC in the week between Nevada and SC, wouldn't that have swayed many voters well after the Clemson poll was in the field?

At the very least, I think the length and timing of the poll could also explain some of the large fraction of "undecideds," particularly in the African-American vote (45% undecided), most of which broke for Obama. Especially voters who liked the Clintons, but swung Obama's way only after seeing him up close and personal, if you will.
Clinton's and Edwards' numbers in the Clemson poll were not that far off (20% and 17%).



Daniel T:
Search "cell phone" on pollster.com, and you will see an article in which Gallup is specifically adding cell-phone-only owners to its surveys.
But the article also quotes Pew Center research that apparently suggests excluding such voters makes only a percent or two of difference to the eventual numbers. That can matter in close races, but it may not cause huge errors.



Your group has some nerve beating up on these pollsters. When you nearly libeled Insider Advantage (one of the first to show Obama with a large lead in S.C.), you picked the poll to death. Don't you realize that most pollsters had it so far in Obama's favor that they even stopped polling the Dems in S.C. by mid-week? Obama won big and no one is shocked, particularly those of us who believed the early polls from over a month ago that showed the race an open-and-shut matter. Mark should try to be more fair and give credit where credit is due.


John (Private Eyes) Oates:

It is clear now that it is a very difficult chore to poll Obama. I think he causes a problem for a number of reasons, all of which are polling Achilles' heels: (1) his support is undoubtedly very high in the cell-phone-only households; (2) he is bringing a previously non-participatory population into the mix; and (3) he is a black candidate, and people (read: respondents) are always very odd discussing race; they become nervous and respond to questions in unpredictable manners based on their severe difficulty in discussing the issue in a straightforward, comfortable, truthful manner. However, the rampant speculation about what is going wrong with the polls is killing me! Wouldn't it be easier and more logical if someone actually COMPARED THE POLLS AND THE VOTE OUTCOMES to see which kinds of groups are misrepresenting their voting intentions or being missed in the pre-election polls? Or would that be too scientific? On a final note: poll with caution... the Democratic race is a minefield!


Mark Lindeman:

John, there is no way to determine confidently based on the vote outcomes how various groups have voted. Ecological inference might work (for instance, looking at wards where almost all registered voters are black as a proxy for how blacks voted), but it might not. You can look at the exit poll results (and people have), but those might not be right either. And we can't even be sure that the vote counts match people's intentions. So some degree of speculation is inevitable, although some information can be brought to bear.

Mary, I have no idea what you're talking about. Folks can look back at the South Carolina poll results over the last month and see if it makes any more sense to them.


John Oates:

Sorry Mark, I was a victim of fast typing. I meant look at exit polls. But I suppose you are right; those might be unreliable too. I guess my central point is that pollsters would be better off noting the reasons why Obama is so difficult to poll, rather than injecting wild speculation into the debate. (Also, it is never good to begin wildly speculating around reporters, as who knows which direction THEY might take the debate; see the Huffington Post's new feature on Pollstrology.)



Can I ask Charles F a dumb question?

Why do final polls include any undecideds?

Shouldn't pollsters (or makers of bull's-eye graphs) first determine the undecided share (and let people know what it is), and then allocate the undecideds to the candidates in proportion to their totals?

Otherwise this bull's-eye graph will always be off by the undecideds and thus understate support in general.

This seems pretty obvious to me. Is this being done by pollsters, or in the data in the bull's-eye graph?

If not, why not?

