Mark Blumenthal | March 25, 2008
Topics: 2008 , Barack Obama , Bradley/Wilder , Divergent Polls , Gallup , Hillary Clinton , IVR Polls , Mark Lindeman , Pollster , PPP , Rasmussen , Sampling Error , SurveyUSA
In case you missed our update, the most recent Gallup Daily result on the Democratic race shows a near dead-heat, with Barack Obama ahead of Hillary Clinton by a single percentage point (47% to 46%), a margin not nearly large enough to attain statistical significance. That one point lead is somewhat apropos, since it is virtually identical to the average of all of Gallup's Daily releases since February 8 (Obama 46%, Clinton 45%). So the question for the day: How much of the daily variation over the last six weeks has been real and how much is random noise?
Let's start with the chart of the Gallup Daily results since their three-day track completed on February 8 (and released on February 9). That was the first three-day result collected entirely after the results from the Super Tuesday primaries were known.
While the Gallup trend has shown several "figure eights" over the last few weeks (as reader "emcee" put it), most of that variation occurs within the range that we should expect from a survey with a +/- 3 point margin of sampling error.
To illustrate that point, consider the hypothetical possibility that the preferences among Democrats have remained perfectly stable for the last six weeks. Let's assume that the average result since February 8 -- 46% to 45% favoring Obama -- has been the unchanging reality. What sort of random variation should we expect from taking a sample rather than interviewing the entire population?
First, remember that the so-called "margin of error" applies to the individual percentages, not the margin between the candidates. So under our hypothetical "no change" scenario, we would expect the Obama percentages to fall somewhere between 43% and 49% (46% +/- 3) and the Clinton percentages to fall somewhere between 42% and 48% (45% +/- 3).
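For readers who want to see the arithmetic behind that distinction, here is a minimal sketch using the textbook formulas for a simple random sample. The sample size `n = 1260` is an assumption for illustration, and the function names are hypothetical. A published +/- 3 figure need not match the textbook formula exactly, since pollsters may round or adjust for weighting.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level


def moe_share(p, n, z=Z_95):
    """Margin of error for one candidate's share p in a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)


def moe_lead(p1, p2, n, z=Z_95):
    """Margin of error for the lead p1 - p2 when both shares come from the same sample."""
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)


n = 1260  # assumed sample size, for illustration only
obama, clinton = 0.46, 0.45

print(f"Obama share:   +/- {100 * moe_share(obama, n):.1f} points")
print(f"Clinton share: +/- {100 * moe_share(clinton, n):.1f} points")
print(f"Obama lead:    +/- {100 * moe_lead(obama, clinton, n):.1f} points")
```

Under these assumptions the uncertainty on the lead comes out nearly twice as large as the uncertainty on either candidate's share, which is why a one-point lead tells us so little.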
Since February 8, the results of the actual Gallup Daily have fallen outside that range on just three days:
- March 1, when Obama led 50% to 42%
- March 13, when Obama led 50% to 44%
- March 18, when Clinton led 49% to 42%
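The check behind that list is simple enough to sketch in a few lines. The snippet below uses only the three results listed above and the 46%/45% baseline with a +/- 3 point margin; the function name `outside_range` is hypothetical.

```python
MOE = 3  # published margin of sampling error, in points
BASELINE = {"Obama": 46, "Clinton": 45}  # average since February 8

# The three days listed above (percentages as reported)
results = {
    "March 1": {"Obama": 50, "Clinton": 42},
    "March 13": {"Obama": 50, "Clinton": 44},
    "March 18": {"Obama": 42, "Clinton": 49},
}


def outside_range(day_result, baseline=BASELINE, moe=MOE):
    """Return the candidates whose share falls outside baseline +/- moe."""
    return [name for name, pct in day_result.items()
            if abs(pct - baseline[name]) > moe]


flagged = {day: outside_range(r) for day, r in results.items()}
for day, names in flagged.items():
    print(day, "->", ", ".join(names))
```

On March 1 and March 13 only the Obama percentage falls outside its range, while on March 18 both candidates' percentages do.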
But wait. As some of you may remember, most political surveys (including Gallup) calculate the margin of error using a 95% confidence level. That assumption means that we should expect results slightly outside the margin of error for one poll in twenty.
Unfortunately, at this point our story gets a little bit more complicated, because the "one in twenty" assumption applies to statistically independent measurements. Since each Gallup Daily release is based on a three-day rolling average, there is overlap in the sample on successive days. So only the results from every third day are truly "independent." I'll skip over some even more confusing explanation and get to the bottom line: Since February 8, roughly one-in-seven independent samples from the Gallup Daily series has produced a result outside the margin of error from my hypothetical, no-change, 46-45 scenario. That's a little bit more than we would expect by chance alone, but not much more.
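To get a rough feel for "a little bit more than we would expect by chance," here is a back-of-the-envelope binomial sketch. It assumes about 15 independent three-day samples (roughly 45 days of tracking divided by three), which is an illustrative count and not a figure from Gallup. It also treats each poll as a single one-in-twenty event, which is a simplification: each release reports two percentages, so the chance that at least one of them strays is somewhat higher than 5%.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


n_independent = 15  # assumption: about 45 days of tracking / 3-day windows
p_outside = 0.05    # "one in twenty" at a 95% confidence level

expected = n_independent * p_outside
print(f"Expected number outside the margin: {expected:.2f}")
for k in (1, 2, 3):
    prob = prob_at_least(k, n_independent, p_outside)
    print(f"P(at least {k} outside) = {prob:.3f}")
```

Under these assumptions, seeing two stray results out of fifteen (roughly one in seven) is more than the expected count, but still not a rare outcome in a world where nothing changed.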
Having said all that, my explanation still oversimplifies. It ignores the possibility for meaningful change within the standard "margin of error" -- subtle shifts that might not attain statistical significance in a single three-day sampling, but might over the course of a week or more.
A better way to distinguish the meaningful patterns is to compare Gallup's results to those from another pollster or two. Let's start with a chart of the Rasmussen Reports daily tracking poll over the same six-week period. Not surprisingly, the average of the Rasmussen data gathered since February 8 also shows Obama leading by a single percentage point (45% to 44%).
Compare the two charts (or look at the chart below, which plots a Clinton-minus-Obama margin for both polls) and you will see several features in common:
- Both show a shift from Clinton to Obama between Super Tuesday and mid-February
- Both show Obama maintaining a low single-digit lead from mid to late February
- Both show Clinton rising a few days before the March 4 primaries and falling a few days after
And yet, at about the time the news surrounding Jeremiah Wright became a full-blown media obsession (March 14), the results of the two polls appear to diverge. Why is that?
We should keep in mind that Gallup and Rasmussen collect their data differently (and ask slightly different questions -- see the postscript). Gallup uses live interviewers, makes repeated call-backs to unavailable respondents, samples cell phone numbers, and routes calls to Spanish-speaking interviewers when they reach a Spanish-speaking household. Rasmussen uses an automated system and recorded voice to conduct interviews, applies a slightly tighter screen for "likely voters," yet (as I understand it) makes no call-backs, does not call cell phones and makes no provision for bilingual interviewing.
Some, I am sure, will readily conclude that one or more of these characteristics (or perhaps others that I've omitted) provide "obvious" explanations for the discrepancies. I am reluctant to make too much of these differences. The reasons will be clearer after we look at data from a third source. I obtained it earlier today from an anonymous but trusted pollster that I'll call "Polimatic." Here is a chart of Polimatic's tracking data for the last six weeks:
Those who notice the greater stability in the Polimatic data as compared to Gallup and Rasmussen are on to something important. Next consider how the Clinton-minus-Obama margin from the Polimatic data compares to the other pollsters:
See some interesting patterns? Starting to form theories about what type of poll Polimatic is, or how their methodology might influence their results?
Well, before you go too far, I should fess up. I fibbed. "Polimatic" is not a pollster at all. The data are based on a simulation run by our friend Mark Lindeman. Mark created a spreadsheet that generates random results consistent with a three-day rolling average tracking sample of 1,260 interviews and the assumption that the "true" population value remains an unchanging 46% to 45% Obama lead.
The Polimatic line is more stable, suggesting that the consistently highest highs and lowest lows of the blue and red lines probably represent real divergence. However, the purely random variation of the simulated poll trend line is frequently hard to distinguish from the real surveys.
To generate the results above, I closed my eyes and clicked the mouse to let the spreadsheet recalculate. As such, the "Polimatic" line illustrates one potential trend showing nothing but random noise around a 46% to 45% margin. I'll say it one more time to be clear: All of the variation in the Polimatic trend lines is based on purely random chance. Any resemblance to real changes as measured by Gallup or Rasmussen is entirely coincidental.
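Readers without a spreadsheet handy can try a similar exercise themselves. The sketch below is not Mark Lindeman's spreadsheet; it is a minimal Python stand-in built on stated assumptions: a rolling sample of about 1,260 interviews split evenly across three days, a fixed "true" split of 46% Obama and 45% Clinton, and everyone else counted as undecided or other. The function name `simulate_tracker` and its parameters are hypothetical.

```python
import random


def simulate_tracker(days=45, n_rolling=1260, p_obama=0.46, p_clinton=0.45, seed=None):
    """Simulate a three-day rolling tracking poll with fixed true preferences.

    Returns a list of (obama_pct, clinton_pct) tuples, one per rolling release.
    """
    rng = random.Random(seed)
    n_daily = n_rolling // 3  # interviews completed each night

    # Draw each night's interviews independently from the same fixed population.
    daily = []
    for _ in range(days):
        obama = clinton = 0
        for _ in range(n_daily):
            u = rng.random()
            if u < p_obama:
                obama += 1
            elif u < p_obama + p_clinton:
                clinton += 1
            # otherwise: undecided or another candidate
        daily.append((obama, clinton))

    # Pool each night with the two nights before it, as a rolling tracker does.
    releases = []
    total = 3 * n_daily
    for i in range(2, days):
        window = daily[i - 2:i + 1]
        obama_share = sum(d[0] for d in window) / total
        clinton_share = sum(d[1] for d in window) / total
        releases.append((round(100 * obama_share), round(100 * clinton_share)))
    return releases


track = simulate_tracker(seed=2008)
margins = [c - o for o, c in track]  # Clinton-minus-Obama, as in the charts
print(track[:5])
print("largest Clinton lead:", max(margins), "largest Obama lead:", -min(margins))
```

Changing the seed is the equivalent of clicking the mouse to recalculate: each run produces a different trend line, and every wiggle in it is noise by construction.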
So what can we conclude from all this?
First, there has been far more stability than change in the national Obama-Clinton vote preference since Super Tuesday, and that includes the last ten days. To the extent that we have seen real changes, they are barely bigger than what we might expect by chance alone.
Second, if you look closely, you will notice that the seemingly odd divergence between Gallup and Rasmussen since the Wright story broke is really not that unusual. It is comparable to similar separations in the trend lines that occurred around February 13 and February 29. Random variation will do that.
Third, and probably most important, it is far too easy to look at these rolling average tracking surveys and see compelling narratives and spin interesting theories from what is often little more than random noise.
PS: Yes, as a few readers have already suggested in prior comments, some of the stability in national Democratic vote preference may stem from the fact that most states have already held their primaries and caucuses. We had some discussion about a month ago about how Gallup alters its screen slightly to accommodate states that have already voted. However, neither Gallup nor Rasmussen alters their vote question for those who have already voted. Here is the text used by each:
Gallup: Which of these candidates would you be most likely to support for the Democratic nomination for president in 2008, or would you support someone else? [ROTATED: New York Senator, Hillary Clinton; Former Alaska Senator, Mike Gravel; Illinois Senator, Barack Obama]
Rasmussen: If the Democratic Presidential Primary were held in your state today, would you vote for Hillary Clinton or Barack Obama? [options are rotated]
PPS: While I was writing this post, Mickey Kaus blogged a theory for the divergent Gallup and Rasmussen trend lines:
The 'Bradley Effect' is Back? Gallup's national tracking poll has Obama retaking the lead over Hillary after bottoming out on the day of his big race speech. Rasmussen's robo-poll, on the other hand, shows Obama losing ground since last Tuesday. True, even Rasmussen doesn't seem to be putting a lot of emphasis on his survey's 6-point shift. But isn't this week's primary race exactly the sort of environment--i.e.., the issue of race is in the air--when robo-polling is supposed to have an advantage over the conventional human telephone polling used by Gallup? Voters wary of looking like bigots to a live operator--'and why didn't you like Obama's plea for mutual for understanding that all the editorial pages liked?'--might lie about their opinions, a phenomenon known as the Bradley Effect. But they might be more willing to tell the truth to a machine. ...
Or more likely, the apparent differences between the two are about random variation in one or both polls. If you average the results from data collected since March 14 (the day the Wright story exploded) they are not very different:
- Live Interviewer Gallup Daily: Clinton +2 (47% to 45%)
- Automated Rasmussen Reports: Obama +1 (45% to 44%)
Kaus also links to an automated PPP survey in North Carolina that fielded on the evening of March 17, the night before the Obama speech. As such, it is consistent with Gallup's "bottoming out" for Obama, not contradictory. The SurveyUSA results I blogged about on Friday were also collected from March 14 to March 16, just after the Wright story broke but before Obama's speech.