

Gallup Daily vs. Super Tuesday

Topics: 2008 , Barack Obama , Gallup , Hillary Clinton , Likely Voters , National Journal

In response to a reader's question about the Gallup Daily survey, I left a comment last night that was not correct. It concerned the screen that Gallup applies to the results on the Democratic and Republican presidential primary contests. I had assumed, wrongly as it turns out, that Gallup reported the results for all adults nationwide who identify as or lean Democratic. An alert reader pointed me to the word "voter" in their methodological blurb. I emailed Gallup's Jeff Jones to check, and he kindly replied with the precise explanation of how they select the primary "voters" whose preferences they report every day:

Republicans or Republican-leaning independents who say they are extremely, very or somewhat likely to vote in their state’s primary or caucus when it is held.

Democrats or Democratic-leaning independents who say they are extremely, very or somewhat likely to vote in their state’s primary or caucus when it is held.

We [also] make provisions for those residing in states that have already held their primary caucus – those who indicate they have already voted are considered extremely likely to vote, and those who did not vote in their state’s primary or caucus would be excluded from the base.

One important note: The screen that Jones describes is similar to what other pollsters use in statewide surveys, but it is not the more rigorous and sometimes controversial Gallup "likely voter model" that they use in general elections and used for their surveys in New Hampshire.

Back to the question from "FlyOnTheWall" that prompted this discussion:

Today's Gallup polling was done yesterday [Tuesday]. It's of likely Democratic primary voters. And it attempts to show for whom they're going to vote. Today's snapshot shows a 13-point lead for Clinton.

Only we ran this experiment on a broader basis yesterday, and found (based on the tallies of the popular vote that I've seen) less than a point separating the two candidates. And it gets worse. Gallup broke out the February 5 states a few days ago, and found that voters there were more - not less - favorably disposed to Clinton than their entire sample. So, presumably, what Gallup is telling us is that voters in February 5 states favor Clinton by some 15 points - when the voters themselves turn out to be evenly divided.

That's as egregious an error as Zogby's, from a pollster who's supposed to be a whole lot more reputable. What gives?

And that's a fair question. First, to clarify, the results released yesterday that showed the 13-point lead were based on interviews conducted from Sunday afternoon (before the Super Bowl started) through Tuesday night. While some Tuesday night respondents on the West Coast may have been aware of the results, most were not. So Fly is right to suggest that the Sunday to Tuesday window was a good time period to compare to the actual results.

Of course, Gallup did not report on the vote preference of voters in Super Tuesday states in their Sunday to Tuesday data. They did that on Monday:

Forty-nine percent of Democrats and (where eligible to participate) Democratic-leaning independents in Super Tuesday states favor Clinton for the nomination, while 44% choose Obama. This analysis is based on tracking data from Jan. 30-Feb. 3, all collected since John Edwards suspended his campaign.

But note the last sentence. They reported a result nationally on Monday from the last three nights of interviewing (Friday to Sunday) showing Clinton 4 points ahead of Obama (47% to 43%). However, the results for the Super Tuesday states were culled from interviews over the prior five nights of calling (Wednesday through Sunday). The different time period might have made a difference, although the national Clinton-Obama margin looks to have been roughly four points over the five-day period as well. Perhaps Gallup can clarify.

In the spirit of my op-ed piece this morning, the disclosure of some additional statistics from the Gallup data would help in comparing the Super Tuesday results to the Gallup Daily results of the last week or so:

  • What percentage of adults, nationally, qualified as "Democratic and Democratic-leaning voters" over the last week or so? What percentage of adults qualified as "Republican and Republican-leaning voters?" How does the combination of the two compare to the 29.1% turnout of eligible adults that Michael McDonald's invaluable primary turnout web page estimates for Super Tuesday?
  • Gallup may have to reach back more than a week to get sufficient sample sizes, but what are the same statistics when we compare Super Tuesday primary states to Super Tuesday caucus states? McDonald's data indicates the obvious -- that caucus turnout was significantly lower than primary turnout -- though I will need to crunch the data more to get an overall comparison of turnout in Democratic primaries to Democratic caucuses.
  • And while we are at it, what was the vote preference over the last five or six nights of interviewing if we looked only at primary states that voted on Super Tuesday? And how does that preference compare to the actual votes cast in just the primary states?
  • [Update - I left one out: Did "extremely likely" Democrats in Super Tuesday states differ from those just somewhat likely to vote?]

Perhaps someone at Gallup could take a crack at this. It would make a terrific Gallup Guru item, don't you think?



Thanks, Mark, for the rundown. I'm an amateur trying to understand this stuff, and learn a great deal every time you sit down to explain.

I'd suggest that we have an answer to your first question, and that it's troubling. Gallup says they survey 1,000 adults each night, but the respondents to the poll typically number between 2,000 and 2,400 (of the 3,000 in the three-day sample). So we're talking about roughly 67-80% of adults, when only 29% are going to the polls. That could be a big part of this.
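
For anyone who wants to check that range, here is a minimal sketch of the arithmetic. It uses only the figures quoted in this thread (2,000 to 2,400 respondents out of a 3,000-adult three-day sample, and the 29.1% turnout estimate cited in the post); the variable names are mine, and it assumes the respondent counts cover both parties combined.

```python
# Share of sampled adults who pass the screen, from the figures quoted above.
sample_adults = 3000                          # three nights x ~1,000 adults per night
respondents_low, respondents_high = 2000, 2400
turnout_estimate = 0.291                      # Super Tuesday turnout estimate cited in the post

share_low = respondents_low / sample_adults
share_high = respondents_high / sample_adults

print(f"Screened-in share: {share_low:.0%} to {share_high:.0%}")
print(
    "Relative to estimated turnout: "
    f"{share_low / turnout_estimate:.1f}x to {share_high / turnout_estimate:.1f}x"
)
```

On these numbers the screen lets through somewhere between two and three times as many adults as actually voted.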

The second question is interesting, and Obama's performance in the caucus states was actually understated by the state-level polling I've seen. But we're talking about so few votes (excluding Minnesota, a caucus that functions like a primary, just a few hundred thousand out of the 14 million) that I can't see it being the controlling effect.

The last two also seem directly on point.

But I'd add one other, in this particular cycle:
What is the racial and gender breakdown of the Gallup samples?

That, after all, seems to have been John Zogby's undoing in CA. It wouldn't surprise me if that's what was going on here, as well.



I think your reliance on the reported popular vote total (to derive preference) may be erroneous, since this total includes the totals from caucus states.

Obama may have won the Minnesota Caucus but he most likely would have lost in a primary.

Clinton's voters are primarily nurses, shift workers, mall salespeople (like my wife) and working-class citizens who cannot afford to take off practically a whole day to caucus but would have voted in a primary.

Thus a national poll will most likely not account for caucus states.


tom veil:

One more possibility: the "Super Bowl bias" that you discussed in a previous post. As you noted, "the demographic profile [of non-responders] looks considerably more like Obama supporters than Clinton supporters." If we could get the daily tracking numbers on a single-day basis instead of a 3-day average basis, we could check for this hypothesis.




Let me take one last crack at explaining my objection to the caucus state explanation (and perhaps someone can correct me if my logic is muddled). There were eight caucuses held on the Democratic side yesterday, and they accounted for just over 800k of the votes used in the tallies of the popular vote. Assuming that those votes split 3-1 for Obama (and I'm fairly certain the margins were significantly closer, but haven't gone back to check) then subtracting those votes out of the tallies reveals that in the states that held primaries alone, the split in the votes actually cast was something like 51-49. (Again, these are rough numbers.)

The only way that caucus states could have distorted the result is if Democratic-leaning voters in those sparsely-populated states overwhelmingly backed Hillary, but only Obama supporters turned out to vote. But state-level polls showed these states evenly divided on the eve of the election, or backing Obama. And the results jibed with the conventional wisdom. And even if they universally backed Hillary, the populations of these states are so low it could only have accounted for a few points worth of the gap.
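
The subtraction described above can be reproduced as a rough sketch. The inputs are the round numbers from this thread (about 14 million total votes, about 800k caucus votes, a 3-1 caucus split); treating the overall two-way vote as an exact tie is my simplifying assumption, since the tallies showed less than a point between the candidates.

```python
# Rough check of the caucus-subtraction argument, using the thread's round numbers.
total_votes = 14_000_000     # approximate Democratic popular vote on Super Tuesday
caucus_votes = 800_000       # approximate votes cast in the eight caucus states

# Assumption: the overall two-way vote was an exact tie.
obama_total = clinton_total = total_votes / 2

# Assumption from the comment: caucus votes split 3-1 for Obama.
obama_caucus = caucus_votes * 3 / 4
clinton_caucus = caucus_votes - obama_caucus

# Remove the caucus votes to leave the primary states only.
obama_primary = obama_total - obama_caucus
clinton_primary = clinton_total - clinton_caucus
primary_votes = obama_primary + clinton_primary

clinton_share = clinton_primary / primary_votes
obama_share = obama_primary / primary_votes
print(f"Primary states only: Clinton {clinton_share:.1%}, Obama {obama_share:.1%}")
```

Even with a caucus split this lopsided, the primary-state margin comes out at about three points, nowhere near the 13 points in the tracking poll.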



Even more confusing than some of the poll results are the delegate results coming in.

CNN posts Obama as trailing Clinton by ~80 delegates, while MSNBC has Obama ahead overall with more delegates in and BBC has (again) Obama behind by ~80 delegates with even more in.

My question is this: Is BBC's or MSNBC's delegate count partially predictive so that they have delegates counted that aren't necessarily in Obama's column yet, or are they simply more up-to-date? I did notice that CNN had left off some state's delegates which may favor Obama such as Georgia, Alabama, Connecticut, and Kansas and a few from Illinois... but it also has missing delegates from NY and California.

Who should I believe and what does the delegate count really look like?



You'll have to look at their methodology and decide who to believe on your own. Some counts include projected delegates; some only count the delegates that have definitely been won. Some counts include superdelegates; some don't. Etc.



Anyone else find it odd that it's 4:30pm, and Gallup has yet to release today's tracking numbers? I'm beginning to wonder if they're taking the time to assemble some answers to all the questions raised here.



Why does no one separate those who have already cast their vote by mail from those who are about to go to the polls? That would seem to be a better indicator of the potential change in the vote than lumping the early voters in with primary day voters who might still be volatile.

Clinton's numbers were much higher two weeks before the February 5th primary, but Obama was favored in a lot of polls right before the election. It seems likely that some number who voted for her in January might have wished they'd voted for Obama in February. People might have voted for Clinton but now say they favor Obama; either way, their already-cast votes wouldn't help read the way the potential voters are leaning.

If this is covered elsewhere, please enlighten me; after all, I'm just an amateur.



I think there are a couple more obvious, and not especially scientific, explanations for the Gallup jump in Tuesday's polling:

1. This polling was Sun-Tues, replacing a previous day's release that was Sat-Mon. We know, based on Gallup's info, that Saturday was Obama's best ever polling day (they say he led outright in that day's sample), and if Tuesday was a somewhat bad day, replacing Saturday's results with Tuesday's would make the Wednesday release look especially bad.

2. On Tuesday, it's certainly possible that Gallup had difficulty reaching one key group of people: those actually participating in the process by voting or volunteering. I mean, polling on a semi-national election day would always run the risk of oversampling apolitical, uninterested people (sort of the opposite of Obama's fervent supporters).

I'd guess that some combination of undersampling Obama's supporters on Tuesday and dropping out the Saturday results is responsible for the big change.
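
To illustrate the first point, here is a small sketch of how a three-day rolling average moves when one unusual day drops out and another comes in. The daily margins are made up purely for illustration; Gallup has not published single-day figures here, and `rolling_margin` is a hypothetical helper of my own.

```python
def rolling_margin(daily_margins, window=3):
    """Return the mean of each consecutive `window`-day slice."""
    return [
        sum(daily_margins[i:i + window]) / window
        for i in range(len(daily_margins) - window + 1)
    ]

# Hypothetical Clinton-minus-Obama margins in points for Sat, Sun, Mon, Tue.
# These are NOT Gallup's numbers.
margins = [-2.0, 6.0, 8.0, 13.0]

sat_mon, sun_tue = rolling_margin(margins)
print(f"Sat-Mon release: Clinton {sat_mon:+.0f}")
print(f"Sun-Tue release: Clinton {sun_tue:+.0f}")
```

The release moves by one third of the gap between the day that drops out and the day that replaces it, so swapping a very good Obama day for a bad one shifts the average by several points even if the middle two days did not change at all.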



FYI, in response to the above post about Delegate counts, the divergent counts are the result of including (or not including) Super Delegates. CNN counts them (and I suspect BBC must too, based on your description), based on announced supporters of Clinton and Obama. This gives Clinton a lead.

The total number of Pledged (ie, not Super) Delegates allocated through Tuesday will be 1792, and the best guess is that Obama leads here 908-884 (that's his campaign's #, and it's backed up by NBC's projected range).

I'd say that counting Super Delegates now is a fool's errand, since more than half of them haven't decided yet and the first half are free to change their minds at any time. It's sort of like counting a state's Electoral Votes for one candidate when only 50% of the returns are in from that state - it's highly presumptive.

I'd be very interested to see Mark weigh in on the media's different Delegate projections, though, since it's become their new go-to all-encompassing number, now that they say they don't trust polls!



Isn't it a bit odd that Gallup hasn't released today's poll (as of 5:20 PM EST)? They don't have any error, do they? Interesting that Rasmussen today has it exactly tied, btw.




I took a stab at it here:

Mark took a crack at it here in the context of Florida, and he's a lot smarter than I am:

The bottom line, I think, is that even VBM voters in California rarely cast their votes much before the final week. Fully 1/4 of the VBM ballots were turned in at the precincts or arrived by mail on election day. So yeah, I'm sure that some voters experienced buyer's remorse. But for the most part, the difference between VBM and precinct voters is a function of their differing demographics, not when they made up their minds.



Did anyone actually read Zogby's response to the California fiasco? I can't find the link right now, but basically he said that they underestimated the Latino vote substantially and somewhat overestimated the black vote. So in essence, there wasn't necessarily anything wrong with the data, only the weighting of that data.

Here's an idea: since the pollsters have been so incredibly off, why don't they just publish their data on their websites, publish their expected result based upon their own weights, but actually give users the ability to adjust the weights for different demographic groups based upon their own expectations of turnout? It'd take about 1 hour worth of programming to do, I think.

... maybe I'm dreaming, but it'd be nice....
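
For what it's worth, the core of that idea is just a weighted average. Below is a minimal sketch, assuming the pollster publishes candidate support within each demographic group; the group names, support levels, and weights are all hypothetical, and `weighted_share` is my own helper rather than anything a pollster provides.

```python
def weighted_share(support_by_group, turnout_weights):
    """Combine per-group support using turnout weights (normalized to sum to 1)."""
    total_weight = sum(turnout_weights.values())
    return sum(
        support_by_group[group] * weight / total_weight
        for group, weight in turnout_weights.items()
    )

# Hypothetical support for one candidate within each group; NOT real poll data.
support = {"white": 0.45, "black": 0.80, "latino": 0.35}

# Hypothetical turnout assumptions: the pollster's, and a reader's alternative.
pollster_weights = {"white": 0.60, "black": 0.15, "latino": 0.25}
user_weights = {"white": 0.55, "black": 0.10, "latino": 0.35}

print(f"Pollster weights: {weighted_share(support, pollster_weights):.1%}")
print(f"User weights:     {weighted_share(support, user_weights):.1%}")
```

With identical interview data, changing only the assumed turnout mix moves the topline by a couple of points in this toy example, which is the kind of effect Zogby described.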



I see Time has a poll out that corroborates Gallup's - taken on the eve of Super Tuesday.
C48 O42



Mark B. --

Have you seen this? http://www.surveyusa.com/wp-content/uploads/2008/02/fixed-active-hi-level-pollster-report-card-through-020608.JPG

Anyone on pollster.com, by any chance, want to do some sort of analysis to verify?



Here's the response from Gallup. The key points:
-They're using a sample that screens out the 20% of adults who are unlikely to vote, leaving 80%, even though just 30% have actually been voting
-If they had used a tighter screen, of self-reported 'extremely likely' voters, they would have had about 50% of the sample left, and in the five days after Edwards dropped out, it would've been Obama 48%, Clinton 45% - a statistical tie.
-Even so, Obama's momentum had been stopped cold by Clinton by Sunday.

Thanks, Mark. I never would've been able to get this out of them without your advocacy.




A final thought, as I mull this over. If Gallup is saying that the sample which includes 80% was wildly off the mark as a predictor of actual voting, but that the sample which included just the 50% of highly likely voters came darn close to predicting how actual voters actually vote - then why the heck don't they use the tighter screen all the time?

If they're trying to find out how all Americans feel, they shouldn't use any screen. But if they're tracking voter sentiment, then they should be screening for voters. And since a loose screen produces results that aren't predictive, and a tight screen produces those that are, I really wish they'd just use the tight screen going forward.



The new one's out, 51/40 Clinton.


Daniel T:

I have read many rather convoluted explanations for polling errors, but they basically boil down to one thing: pollsters are doing a poor job of sampling. Whatever methods they are using aren't matching the actual turnout. Mark's comment that even the best of pollsters gets it wrong on occasion is true but beside the point. Without knowing *why* they were wrong there is no useful way to evaluate the data the next time around. Zogby's comment about racial disparities is a case in point. OK, so he thinks that's where the error was. But why? What problems in his sampling caused him to overestimate blacks and underestimate Latinos? Without knowing that we know nothing.

If I throw darts at the dart board, random chance means I am going to get some results right. But good polling means not just simply saying, "ooops, I did it again"; it is a constant process of refining and reevaluating methods to reflect changing circumstances. Here in New Mexico, the Democratic Party chairman was quoted as saying he was "caught off-guard" by the fact that turnout was 3X what it was in 2004. Has the man been living in a cave? Are Gallup and the rest of the pollsters living in a cave? Some error is to be expected, but there seems to me to be a trend of bad sampling that the profession is overlooking. The problem is much deeper than "They're using a sample that screens out the 20% of adults who are unlikely to vote, leaving 80%, even though just 30% have actually been voting." There is something fundamentally different about 2008 than 2004 and pollsters simply aren't being honest with themselves about it.


Beware of Statistics:

Congrats on your op-ed piece. It seems to me that the exit polls are also having some problems. For instance, in California, they had Clinton winning by 7 points and McCain & Romney virtually neck and neck.

They also had McCain beating Huckabee by 7 points in Missouri, and I believe Obama beating Clinton by 7 in New Mexico.

Am I looking at these right? Is there a problem here too?

I'd appreciate any insights.



Andrew Therriault:

From today's Gallup daily poll posting:

"Those who track voter turnout in various states that voted on Super Tuesday estimate that actual turnout was around 30%, and varied considerably among states. Thus, a broad sample of over 80% of American adults would not be expected to match the actual voting patterns of the much smaller group that turn out to vote in either party's primary.

There is, in fact, strong evidence in the tracking data from the days prior to Super Tuesday that Obama did significantly better when those who reported the highest likelihood of voting are isolated in the sample. Retrospectively, Gallup analysis can isolate just voters who say they are extremely likely to vote -- about 50% of the sample (this still overestimates actual turnout). The vote preferences of Democrats within that smaller slice for the five days prior to Super Tuesday (and after John Edwards left the race) show that Clinton (45%) and Obama (48%) were basically tied."

This difference in turnout may explain why the results of their polls differ so much from the actual results. Any chance of persuading them to also report the results of the "extremely likely" subsample? If 50% of respondents fall into this category, I would expect its results to be a far better estimate of actual voter attitudes than those of the larger sample.


Nick Panagakis:

Full SurveyUSA pollster scorecard here. Scroll down.


