May 4, 2008 - May 10, 2008


POLL: Daily Tracking (Through 5/8)

Gallup Poll

Obama 48, Clinton 46
Obama 46, McCain 45... Clinton 48, McCain 44

Rasmussen Reports

Obama 50, Clinton 42
Obama 47, McCain 44... Clinton 48, McCain 43

Favorable / Unfavorable
McCain: 49 / 48
Clinton: 45 / 53
Obama: 51 / 47

POLL: ARG West Virginia (5/7-8)

American Research Group

West Virginia
Clinton 66, Obama 23

POLL: R2K/DailyKos Texas (5/5-7)

Research 2000/DailyKos.com (D)

McCain 52, Obama 39... McCain 53, Clinton 38
Sen: Cornyn (R-i) 48, Noriega (D) 44

POLL: Rasmussen Missouri (5/6)

Rasmussen Reports

McCain 47, Obama 41... McCain 45, Clinton 43
Gubernatorial Numbers

POLL: Daily Tracking (Through 5/7)

Gallup Daily

Obama 47, Clinton 46
Obama 46, McCain 45... Clinton 47, McCain 45

Rasmussen Reports

Obama 47, Clinton 43
Obama 46, McCain 44... Clinton 48, McCain 43

Favorable / Unfavorable
McCain: 49 / 48
Clinton: 46 / 52
Obama: 51 / 47

McCain 53, Obama 39... McCain 48, Clinton 37
Sen: Chambliss vs

Poblano's Model

Topics: 2008

My latest NationalJournal.com column, about the remarkable success of the non-poll statistical model created by Poblano of FiveThirtyEight.com, is now online.

"Outliers" from the IN/PA and Beyond

Topics: 2008 , Barack Obama , CBS/New York Times , Hillary Clinton , Kathy Frankovic , SurveyUSA

[I fell behind a bit on this feature over the last hectic week, so some of these are a little stale.]

Kathy Frankovic considers the reasons for conflicting national numbers from CBS/NYT and Gallup earlier this week (see also my post on this topic); last week she had a terrific review of how question order can affect survey results.

Carl Bialik revisits the primary math post IN/NC and considers Brian Schaffner's superdelegate projection model.

CJR's Clint Hendler takes an in-depth look at the various popular vote counts.

David Hill says "every superdelegate can find one survey that confirms the outcome he or she intuitively prefers for the Obama-Clinton fight."

Jay Cost crunches the exits from IN and NC.

John Cohen finds that most Republicans voting in IN and NC express "little other than a sincere preference for Clinton over Obama."

Brian Schaffner notes that turnout in IN and NC "exceeded the number of votes Kerry won in the states in the general election."

Tom Schaller thinks Hillary could have done better among African Americans (via Smith).

PPP's Tom Jensen tips his hat to SurveyUSA's Jay Leve.

POLL: Rasmussen Wisconsin (5/5)

Rasmussen Reports

McCain 47, Obama 43... McCain 47, Clinton 43

POLL: Daily Tracking (Through 5/6)

Gallup Poll

Obama 47, Clinton 46
Obama 46, McCain 45... Clinton 46, McCain 45

"Obama's Support Similar to Kerry's in 2004"
"Obama Beats McCain Among Jewish Voters"

Rasmussen Reports

Obama 47, Clinton 43
Obama 45, McCain 44... Clinton 47, McCain 44

Favorable / Unfavorable
McCain: 49 / 47
Clinton: 46 / 51
Obama: 51 / 46

NC-IN Follow-Up Links

Topics: 2008 , Exit Polls

Here are links to exit poll and related analysis from:

Other pollster related news:

  • Brian Schaffner plots pollster performance
  • SurveyUSA report cards for NC and IN
  • PPP takes a victory lap
  • Mickey Kaus (8:45 entry) vents about hidden exit poll estimates

If you've seen other exit poll analysis or pollster related news, email me a link and I'll add it to the list.

Live Blogging North Carolina and Indiana Election Night

Topics: 2008 , Barack Obama , CBS , CNN , Exit Polls , Fox News , Hillary Clinton , MSNBC , Pollster , Zogby

I will be live-blogging here tonight on what we might learn -- and what we might do better to ignore -- from the exit polls in Indiana and North Carolina. More details to follow, but please feel free to use this as an open thread on what is appearing on the net and elsewhere on the exit polls. Here are the links where official exit poll tabulations will appear shortly after the polls close at 8:00 p.m.:

Comments will appear in reverse chronological order -- all times Eastern.

12:30 - And finally...while some important votes are still being counted in Indiana, at least some conclusions about pollster performance are inescapable. First, it does appear that most of the polls significantly understated Barack Obama's percentage of the African American vote, especially in North Carolina, as they have in other states. According to the most current exit poll tabulations (as of this writing), Obama won over 90% of African Americans in both North Carolina and Indiana.

Second, it seems clear that in terms of the overall result, the winner among the pollsters tonight was Zogby International. Their final polls had Obama ahead by 14 in North Carolina and up by 2 in Indiana -- a closer margin in Indiana than any survey reported in the final week. [Update: Not so fast. Another pollster -- PPP -- did as well or better, depending on the yardstick. The SurveyUSA report cards for Indiana and North Carolina, as well as Brian Schaffner's graphic, show that Zogby and PPP both scored well, with PPP doing slightly better on 10 of 16 rankings.]

I have certainly been critical of Zogby over the years, but credit is due. Pollster reader political_junkie was right with this comment earlier tonight: "It took a lot of courage for them to publish their 'outlier' results last night, one night before primary."

Third, the non-polling based statistical model developed by Poblano at FiveThirtyEight.com outperformed most of the polls. His models predicted a narrow (51.0% to 49.0%) margin in Indiana and a 17 point margin in North Carolina (58.6% to 41.4%).

10:16 - Better late than never, answering the question raised at 8:11: I am told that Edison/Mitofsky conducted a telephone poll of early voters in North Carolina to gather data to combine with the interviews completed at polling places. In Indiana, it was polling place interviews only.

10:11 -- Just posted by Poblano, who is doing his own modeling of the vote count:

I'm now showing Clinton winning Indiana by 1.8 percent, or about 23,000 votes. And one thing to remember about Indiana is the provisional ballot issue -- people who were rejected at the polls because they did not meet the state's ID requirements could still cast provisional ballots and prove their identity later. It's possible that we'll still have a hanging chads type of situation.

9:39 - Thatcher asks: "Did I just see MSNBC change their call from 'Too early' [to call] to 'Too close'?" Yes you did.

9:32 - A friend passes along this blog post:

I am watching MSNBC coverage of the Indiana and North Carolina Democratic Primary results and I am struck by how profligate the network is with almost all of the exit poll results for Indiana (which has not yet been called) except the aggregate percentages for each of the candidates! The race is "too close to call," but that doesn't explain this.

Why don't the networks tell us who "won" according to the exit polls? The polls have closed, so they can't affect the results? Are the exit polls so lacking in trustworthiness that they aren't really informative? If that is the case, why are the demographic breakdowns from the exit polls thrown around and talked as if they were gospel? (They don't even mention margins of error when talking about them!)

Here's the bottom line: When they say a race is "too close to call" or that it's "too early to call," they don't "know" who won yet. They have estimates, and they have a good sense of who is likely to win, but not enough statistical confidence in those estimates to project a winner with complete confidence. No one wants to repeat the mistakes made in projecting Florida eight years ago.

On the other hand, I tend to agree with those who wish the networks would make more of their estimate data available via the web on election night. The various estimates that the exit poll operation generates -- including the levels of statistical confidence in each measure and the real-time estimates of the precinct level error -- are truly fascinating.

8:25 - From Nora O'Donnell's report on MSNBC: In Indiana, Clinton carries whites without a college degree 63% to 34%; Obama ahead by two points (51% to 49%) among college educated white voters.

8:22 - Via Ambinder: CBS News has projected Hillary Clinton the winner in Indiana (thanks PHGrl).

8:15 - Reader Thatcher asks: "So does every network use the same exit poll info?" Yep.

8:11 - Reader RS asks a really good question: "Maybe this question has been answered earlier, but: Do the exit polls account for early voters?"

The networks typically do pre-election telephone interviews among those who say they have already voted in states that typically get (or expect) a very large proportion of early voters, so I assume that they did one in North Carolina. But to be honest, I'm not sure.

7:55 - MSNBC's Nora O'Donnell just read results from two subgroups of North Carolina voters we've watched carefully in other states. Among white college educated voters, Clinton is leading by 7 (52% to 45%). Among white voters without a college degree, Clinton leads 68% to 26%.

7:54 - The North Carolina tabulations just updated with an additional 554 exit poll interviews not included in the first batch. My extrapolation of the underlying estimate now gives Obama a 14-point lead, 55% to 41%.

7:50 - Just a note on the mechanics of the exit poll updates and the network projection process. The numbers we have seen so far are -- presumably -- based on the exit poll interviews weighted based on some hard counts of turnout and (probably) weighted to the "composite estimate" which splits the difference between the exit poll tallies and pre-election polls. Right now, however, exit poll interviewers and other results "reporters" are obtaining actual vote counts for the sampled precincts and these are being gradually incorporated into the estimates that the network "decision desks" look at to make their projections.

Unfortunately, if tonight's updates follow the pattern of recent election nights, the cross-tabulations we can see will not be updated until much later in the evening.

7:38 - Clinton again does better among the late than early deciders in North Carolina though not by as much as in Indiana. She won those deciding in the last three days by six points more (48%) than those who decided earlier (42%).

7:36 - The exit poll estimates 33% of North Carolina's Democrats as African American, and Obama is winning them 91% to 6%. Clinton is holding a 59% to 36% margin among white voters.

7:32 - CNN and MSNBC project Obama the winner in North Carolina. My extrapolation of the initial vote estimate used to weight the current exit poll tabulations shows Obama leading 55% to 41%, with 4% choosing "no preference."

7:27 - Polls close in NC shortly. Here's a CBS summary.

7:19 - Once again, Hillary did better in Indiana among those who made up their minds in the last three days (+9) or within the last week (+8) than those who decided earlier (tie). See my Pennsylvania election night post (6:26 update) for comparative data from past primaries.

7:17 - Demographics: African Americans are 15% of the current estimate in Indiana. Obama is winning 92% of black voters, Clinton 60% of white voters. Compare to the pre-election polls here.

7:13 - Interesting that the MSNBC anchors have a new phrase that better fits the past problems with these numbers: "Too early to call."

7:12 - While I was typing, the Indiana table updated slightly but the overall estimate still rounds to 52-48% in Clinton's favor.

7:09 - Time for the usual caveats, with a twist. My usual election night helper Mark Lindeman had a social engagement tonight, so his spiffy extrapolation program is unavailable. I will be doing simpler extrapolations off of the vote by gender cross tab the old-fashioned way (which may mean a tad more rounding error).

Again, these initial tabulations are weighted to an estimate of the result that is usually a mashup of pre-election polls and the exit poll interviews conducted by the networks at polling places and over the phone (with early voters). These estimates improve, becoming more accurate over the course of the night. Click here for more detail on how these numbers are derived and how they improve over the course of the evening.
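The "old-fashioned" extrapolation off the vote by gender cross tab amounts to a weighted average of the rows. Here is a minimal sketch in Python; the function name and the input numbers are hypothetical illustrations, not the actual exit poll tabulations.

```python
def extrapolate_topline(crosstab):
    """Recover an overall vote estimate from a cross-tab.

    crosstab maps a subgroup name to (share_of_voters, {candidate: pct}).
    Shares are percentages that should sum to roughly 100.
    """
    total_share = sum(share for share, _ in crosstab.values())
    topline = {}
    for share, votes in crosstab.values():
        for candidate, pct in votes.items():
            # Each subgroup contributes in proportion to its share of voters.
            topline[candidate] = topline.get(candidate, 0.0) + share * pct / total_share
    return topline


# Hypothetical numbers for illustration only (NOT real exit poll data).
example = {
    "women": (57, {"Clinton": 55, "Obama": 45}),
    "men": (43, {"Clinton": 48, "Obama": 52}),
}
estimate = extrapolate_topline(example)
print({name: round(pct) for name, pct in estimate.items()})
```

Because the published cross-tabs show only whole percentages, an extrapolation like this can be off by a point or so, which is the rounding error mentioned above.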

7:04 - The exit poll tabulations now available on the network sites (links above) show an initial 52% to 48% estimate favoring Hillary Clinton. Please note that these initial estimates are usually a cross between the interviews conducted at polling places and an average of pre-election polls and, more importantly, have often been quite different than the final result.

6:55 - From the ABC summary:

These preliminary results indicate that a third of North Carolina's voters are African-Americans, in line with the norm, e.g. 31 percent in 1992, the last primary there for which exit poll data are available.

6:51 - Here (via The Page) are official early exit poll summaries, focusing mostly on results on questions other than vote preference from CNN, ABC, Fox, CBS, AP and MSNBC.

5:50 - Almost forgot. Halperin has these words of wisdom about "what makes it tough to produce good models for the exit polls in North Carolina and Indiana:"

1. They are not closed contests open only to Democrats.
2. Turnout is going to be huge (probably record breaking).
3. The absence of recent competitive primaries.

So let’s all be patient, shall we?

5:45 - First out of the block in the leaked exit poll derby (at least that I'm aware of) is Huffington Post. Now before you click that link, please read my column on the problems exit polls have had this primary season (paying close attention to the table) and the follow-up this week on Pennsylvania. I'll be relocating to the Pollster.com "home office" over the next 30-40 minutes...see you back at 7:00.

NC-IN Predictions Roundup

Topics: 2008 , Barack Obama , Hillary Clinton , Tom Holbrook

A typically busy election day here, but I want to quickly link to some interesting vote predictions popping up in the blogosphere:

  • Tom Holbrook, a professor of political science at the University of Wisconsin-Milwaukee, has used our final standard trend estimates for Clinton and Obama for the past primary states to create a statistical model that projects forward. His projection of Hillary Clinton's share of the two-party vote is 53% in Indiana and 47% in North Carolina.

Holbrook first tested (and explained) the workings of his model in Pennsylvania on primary day, and nailed the result. He projected that Clinton would get 54.8% of the two-candidate vote. She won 54.6% of the vote. In the craziness of the Pennsylvania election day, I neglected to post a link to Holbrook's projection -- apologies to all for that.

  • Brian Schaffner has updated his delegate projections for tonight's two primary states also based on our polling trend estimates: a 38-34 split in Clinton's favor in Indiana and a 62-53 split for Obama in North Carolina.
  • FiveThirtyEight's Poblano has used a regression model based not on poll results but on census and other "hard" population data at the Congressional District level. The model projected a 53.7% to 46.3% Clinton win in Pennsylvania. In Indiana, his model predicts a narrow (51.0% to 49.0%) Clinton win that translates into a delegate split of 36 to 36. In North Carolina, the model predicts a very large (58.6% to 41.4%) Obama victory, with Obama gaining 66 delegates and Clinton 49.

Using a similar approach, Poblano has also generated "scorecards" for both Indiana and North Carolina that attempt to project results by county to match the final RealClearPolitics statewide polling averages (Clinton +5.0 in Indiana and Obama +8.0 in North Carolina). The idea is to create statistics to use to follow the election returns. If Clinton or Obama is outperforming or underperforming his projections in the various counties, it would suggest a performance better or worse than the statewide assumptions.
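For a sense of scale, the two Pennsylvania projections quoted above can be compared against the result with a few lines of Python. This is only a sketch: it assumes Poblano's 53.7% figure can be read as a share of the two-candidate vote (his numbers sum to 100), and the function name is made up for illustration.

```python
def projection_error(projected, actual):
    """Signed error in percentage points (projected minus actual)."""
    return round(projected - actual, 1)


# Clinton's share of the two-candidate vote in Pennsylvania, as cited above.
ACTUAL_CLINTON_PA = 54.6

# Projected Clinton shares cited above; treating Poblano's figure as a
# two-candidate share is an assumption.
projections = {
    "Holbrook": 54.8,
    "Poblano": 53.7,
}

errors = {name: projection_error(p, ACTUAL_CLINTON_PA) for name, p in projections.items()}
print(errors)
```

Under that assumption, both projections land within a point of Clinton's actual Pennsylvania share.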

I will predict with high confidence that others have posted similar models somewhere on the web. If you know of one worth adding, please post a comment or send us an email (questions at pollster dot com).

NC and IN Final Sensitivity Comparison

Topics: 2008 , Barack Obama , Hillary Clinton


Both standard and sensitive estimators are agreed in North Carolina. In Indiana there is a little bit of room between them, but not enough to affect conclusions about the probable outcome (if the polls are right!)

The gyrations the Indiana sensitive estimator for Clinton goes through, thanks to variability in polls and relatively few polls, is a good warning that the sensitive estimator may just be a bit too ready to chase after noise.


Cross-posted at Political Arithmetik.

Vote by Race in NC and IN

Topics: 2008

I thought it might be helpful, given our focus on the racial composition in the electorates for Indiana and especially North Carolina, to post a summary of results by race in each state. Fortunately, just about all of the pollsters have released results by race on their final polls. A summary of each follows.

Let's start with North Carolina, where the ultimate margin will be extremely sensitive to the racial composition, as these numbers will make clear.


Most of the surveys have had the Clinton percentage of the white vote at roughly 61%, and -- with one notable exception -- at about the average of 10% of the black vote. Much of the variation in the overall margin comes from the percentage of African Americans in the full sample. Most of the pollsters have been reporting an African-American percentage of 32% to 33%. But as FiveThirtyEight's Poblano points out about his very helpful North Carolina prediction spreadsheet,

[A]n increase of 1 percent in the fraction of the electorate that is African-American translates to roughly a 1-point increase in Barack Obama's margin over Hillary Clinton. So -- if the pollsters are assuming 33% black turnout when it will actually be 40%, that would add 7 points to Obama's margin -- putting us in the 13-14 point range.

In the same post, Poblano makes the case that the African American percentage in North Carolina is likely to be higher than the 32% reported by most pollsters.
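The arithmetic behind that rule of thumb is easy to sketch. The Python below is not Poblano's spreadsheet, just a two-group simplification: Clinton's 61% among white voters and the 82%/10% split among black voters are the poll averages cited in this post, while Obama's 33% among white voters is an assumption made for illustration.

```python
def obama_margin(black_share, obama_black, clinton_black, obama_white, clinton_white):
    """Obama-minus-Clinton margin in points for a two-group electorate.

    black_share is a fraction (0 to 1); the other arguments are percentages.
    """
    white_share = 1.0 - black_share
    return (black_share * (obama_black - clinton_black)
            + white_share * (obama_white - clinton_white))


# obama_white=33 is an ASSUMPTION; the other values are averages cited in the post.
ASSUMED = dict(obama_black=82, clinton_black=10, obama_white=33, clinton_white=61)

for share in (0.33, 0.40):
    print(f"{share:.0%} black: Obama +{obama_margin(share, **ASSUMED):.1f}")
```

With these inputs the margin moves from about 5 points at a 33% African-American electorate to about 12 points at 40%, or one point of margin per point of composition. The 13-14 point range in the quote reflects a higher starting margin than this simplified example produces.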

Another huge issue in trying to use these findings as the basis for vote projections is deciding what to do about undecided voters. In South Carolina pre-election polls vastly underestimated Barack Obama's vote, mostly among African-Americans. Polls conducted with an automated methodology in South Carolina showed Obama doing better among black voters than those using live interviewers, and PPP's Tom Jensen and FiveThirtyEight's Poblano have both noted that (as Poblano put it) "polls have significantly underestimated Barack Obama's margin of victory in Southern states with substantial black populations." This pattern in past primaries argues for assuming that Obama's support among African-Americans will be higher tonight than the 82% average noted above.

On the other hand, TNR's Noam Scheiber noticed that Clinton does slightly better among IVR pollsters in North Carolina, and he wonders if the automated surveys are "picking up on a queasiness [about Obama and Wright that] black voters are less comfortable sharing with human interviewers." Perhaps, but notice that much of the difference comes from the InsiderAdvantage poll.

What's your guess? You can plug numbers into Poblano's spreadsheet to test your theories.


The most consistent aspect of the Indiana results is Obama's percentage among white voters, which nearly every pollster pegs at or near the average of 39%. If Clinton gets 60% or more of the white vote tonight, the only question will be the size of her margin given the percentage of African-Americans. To win, Obama will need to hold Clinton a few points under 60 and boost the African American share to roughly 15%.

One striking inconsistency in Indiana is the result among African-Americans. However, we should keep in mind that the sample sizes are usually very small (probably in the range of 50 to 100 interviews), so we should not be surprised to see big variation. Moreover, the pattern is not consistent across pollsters by mode.
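To put a number on "big variation," here is a rough Python sketch of the textbook 95% margin of error for subgroups of that size. It assumes simple random sampling and ignores weighting and design effects, so real-world error would be somewhat larger.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in points, for a simple random sample."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


for n in (50, 100):
    print(f"n={n}: +/- {margin_of_error(n):.1f} points")
```

At 50 interviews the margin of error is roughly 14 points in either direction, and at 100 interviews it is still about 10 points.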

Update: I neglected to include explicit links above, but I did posts yesterday on the composition of all of the North Carolina and Indiana polls by race and other demographics.

NC and IN Final Pollster Comparisons

Topics: 2008


With the last of the preelection polls in, we can now do our "apples to apples" comparison. Follow each pollster in the charts to see who's high, who's low and who has jumped around.

Note this is for the Obama minus Clinton MARGIN (which makes it easier to plot all the polls in one, still jumbled, chart.)

And check back tonight as the votes roll in to see who nailed it and who missed. In North Carolina all agree on the winner; only the margin is in dispute. But Indiana has a little disagreement on who is ahead. Fun!


Cross-posted at Political Arithmetik.

POLL: Rasmussen Kentucky (5/5)

Rasmussen Reports

Clinton 56, Obama 31

POLL: SurveyUSA Kentucky (5/3-5)


Clinton 62, Obama 28

POLL: Daily Tracking (Through 5/5)

Gallup Poll

Obama 48, Clinton 46
McCain 46, Obama 45... Clinton 46, McCain 45

"Most Democrats Not Eager for Either Candidate to Drop Out" - Video
"Clinton Supporters Believe Wright Is Relevant to Campaign"

Rasmussen Reports
Obama 47, Clinton 43
Obama 45, McCain 45... Clinton 47, McCain 43

Favorable / Unfavorable
McCain: 50 / 47
Clinton: 47 / 50
Obama: 50 / 47

"When Compared to McCain, Voters Trust Clinton More than Obama on Five Key Issues"
"66% Say Jeremiah Wright Comments Hurt Obama's Campaign"

Final Polls: About The Undecided?

Topics: 2008 , Barack Obama , Dick Bennett , Divergent Polls , Hillary Clinton , John Edwards , John McCain , Zogby

Well, for the umpteenth time this primary season, we wake up to wide variation in the final polls for the day's primaries. Today we have polls showing Barack Obama leading Hillary Clinton by anywhere from 4 to 14 points in North Carolina. Meanwhile, polls in Indiana show everything from a 2-point Obama edge to a 12-point Clinton blowout. One big question is whether the pattern of variation suggests that a larger undecided share translates into a hidden Clinton vote. The evidence on this question is mixed, however, and relies mostly on the results from just one pollster.

First, rather than listing the polls, let's do a chart of the final results in the last week or so from each pollster. Start with North Carolina. The following chart simply plots each result on a grid with the Obama percentage on the vertical axis and the Clinton percentage on the horizontal axis. All of the points on the North Carolina chart are above the blue diagonal line indicating an Obama lead.

Many -- such as one of Josh Marshall's readers -- think they see a pattern (which yours truly also saw at first in Pennsylvania but that subsequently disappeared) suggesting a coming "break" of undecided voters to Clinton. Such a pattern would imply a horizontal pattern to the dots above, with all of the variation in the Clinton number and little in the Obama number. That pattern holds only with respect to the Zogby poll, the one showing Obama leading by the biggest margin (51% to 37%), but also the poll with the most respondents categorized as either undecided or as choosing "someone else."

The pattern of the other points in North Carolina is mostly circular, about as varied as we would expect given sampling error and centered around a roughly seven point Obama advantage (50% to 43%) with 7% left over as undecided or "other."

Now Indiana:

In Indiana the dots are slightly more dispersed, with Zogby again showing the best result for Obama, in this case a 2-point Obama advantage (45% to 43%), with 12% categorized as either undecided or "other." In this case, however, two polls have shown roughly as many voters choosing an option other than Obama or Clinton, although both were about a week old: One from TeleResearch (showing Clinton leading by 10 points with 14% undecided/other) and the other from Rasmussen Reports (giving Clinton a 5-point lead with 13% undecided/other).

Again, if we set the Zogby result aside, we get most of the polls forming a circular, mostly random pattern around an average advantage of 7 points (50% to 43%) with 7% undecided or "other."
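For readers who want to place those polls on the grid, each candidate's percentage can be backed out from the margin and the undecided/other share. The Python below is a sketch with a made-up function name. Because the published figures are rounded, the reconstructed shares are approximate.

```python
def shares_from_margin(margin, undecided_other):
    """Return (leader_pct, trailer_pct) given the lead and the undecided/other percent."""
    decided = 100 - undecided_other
    leader = (decided + margin) / 2
    trailer = (decided - margin) / 2
    return leader, trailer


# (margin, undecided/other) pairs as described in the text above.
polls = {
    "Zogby (Obama lead)": (2, 12),
    "TeleResearch (Clinton lead)": (10, 14),
    "Rasmussen (Clinton lead)": (5, 13),
}
for name, (margin, undecided) in polls.items():
    print(name, shares_from_margin(margin, undecided))
```

The Zogby line reproduces the 45% to 43% result reported above, which is a useful check on the arithmetic.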

In thinking about what to make of the difference between Zogby and the other polls, it may be useful to think about what it means for a respondent to tell a pollster they are "undecided" in the days leading up to the election. There are at least three possibilities:

  1. They are going to vote but are still uncertain about which candidate to support
  2. They are going to vote, have decided which candidate they support but are not willing to share their preference with the person (or computer) on the other end of the phone line
  3. They are not going to vote but were mistakenly identified as a "likely voter" by the pollster

Pollsters understand that many voters hover somewhere between a final decision and being totally undecided. So most consider it good practice to "push" uncertain voters, especially near election day, as the candidate they lean to supporting is almost always the candidate they ultimately support.

So I tend to agree with Pollster readers who have expressed frustration in comments with pollsters reporting a large undecided preference. What is especially puzzling about the Zogby result, however, is the very large percentage that they have reported as favoring "someone else" -- 4% in North Carolina (down from 8% over the weekend) and 5% in Indiana (down from 7%). What does that mean? Are respondents expressing a preference for John Edwards? John McCain? Are those really non-primary voters?

At any rate, given that the demographics of Zogby's samples are not radically different from the other pollsters in the two states, there is certainly a good possibility that a harder "push" would benefit Clinton. We also have seen in exit polls that late deciders have favored Clinton in most of the primaries since Super Tuesday. More on both states -- and particularly the issue of late deciders favoring Clinton -- later today.

Update: Just noticed this helpful information posted by Dick Bennett on the ARG web site:

The Democratic ballot in Indiana has two lines (Hillary Clinton and Barack Obama), while the Democratic ballot in North Carolina has four lines (Hillary Clinton, Mike Gravel, Barack Obama, and No Preference).

Democratic primary voters in our surveys in Indiana were asked just the two candidate choices. If voters said someone other than Clinton or Obama, our interviewers were instructed to inform the voters that there are only two choices on the ballot. In most cases, voters then selected Clinton or Obama instead of saying they were undecided.

In North Carolina, our surveys gave the four choices on the ballot. Democratic primary voters selecting the "no preference" line also told us that they would never vote for Clinton or Obama. Our results combine the no preference with someone else (even though no preference will get more votes than Mike Gravel).

In watching the results tonight, be aware that "someone else" is not on the ballot in Indiana and some voters in North Carolina will vote the no preference line.

Response to Doug Usher

Topics: Humphrey Taylor , Internet Polls , Pollsters

Humphrey Taylor has served as chairman of The Harris Poll, a service of Harris Interactive, since 1994.

Thanks to Doug Usher for his contribution to the debate on the panel-based online methodology for political polling. I am glad to see that he acknowledges the value and viability of this method for national polls. But I am puzzled as to why he thinks this method will not work in Congressional races (at least I think that is what he is saying). He writes of the "sour spot" based on the fact that political polls need to reach "a narrow population for which pollsters do not have well defined web contact information".

I assume he means by this that we cannot sample a geographic area because we do not know where people live. Of course we can, and we do this easily. We and others who have large panels know the states in which people live, so that takes care of Senate races.

What about congressional districts?

Some panels also have zip code information, and those that do not can screen for it. Insofar as some zip codes straddle the boundary with another district, we can screen for streets or even addresses if necessary. And of course this problem is the same, or possibly worse, for RDD telephone samples, as telephone exchanges may also straddle the boundaries between districts. Furthermore, many people now take their telephone numbers with them when they move from one district to another.

My comment that was quoted by Doug Usher was taken out of context. I certainly believe that online political polling methods are "the wave of the future". My mention of cell phones was specifically in reference to telephone surveys of people aged 18 to 29. I am sorry if that was not clear.

POLL: InsiderAdvantage NC (5/5; Final)


North Carolina
Obama 47, Clinton 43

What's Up With The National Polls?

Topics: 2008 , Associated Press , Barack Obama , CBS/New York Times , Divergent Polls , Gallup , Gary Langer , Hillary Clinton , Mark Mellman , Rasmussen , USAToday Gallup

Last week, Democratic pollster Mark Mellman used his column in The Hill to argue that one could use "same-poll to same-poll comparisons" to argue that the controversy over Barack Obama's "bitter" comments had either no impact or a great deal of impact on the Pennsylvania primary. "In addition to providing jobs for pollsters," he wrote, "proliferating polls now give everyone the evidence to prove their favorite theory." He drew specific examples from the lesser known pollsters active in statewide contests that are typically derided by the polling establishment and the mainstream media.

Over the last 48 hours, however, we have seen Mellman's point proved by the country's most respected and well established polls and media organizations. As Gallup editor in chief Frank Newport noted in his Gallup Guru blog:

Two different headlines today in the New York Times and USA Today about the impact of the Jeremiah Wright controversy came to different conclusions about Jeremiah Wright and Barack Obama. The Times’ headline: “In Poll, Obama Survives Furor; but Fall Is the Test”, while USA Today headlines: “Flap over pastor pulls Obama down, poll finds”.

And for those who missed it, ABC's Gary Langer briefly summarized the conflicting numbers:

Briefly: Times/CBS has Barack Obama +12 vs. Hillary Clinton, with a headline saying Obama “survives furor” over the Rev. Jeremiah Wright. USAT/Gallup has Clinton +7, saying the flap over Wright “pulls Obama down.” Adding to the mix is Gallup’s daily poll, which has Obama +4.

These polls also differ in their general election match-ups: Times/CBS has Obama +11 and Clinton +12 vs. John McCain, while USAT/Gallup has them basically tied. Gallup daily has Clinton-McCain tied, McCain +5 vs. Obama.

And just in overnight: AP/IPSOS weighs in with a national poll showing Clinton +7 over Obama, with Clinton +5 and Obama +4 over McCain, spawning a thousand different headlines, no doubt (as AP stories always do).

Why all the confusion? Langer has a nice summary of the various methodological quirks that may or may not explain the differences between the Times/CBS and Gallup polls, though his bottom line is the most important take-away point. This episode, he writes,

[Is] a reminder that all polls – even good-quality ones – are done differently, and don’t always get the same results or engender the same analysis. And that horse-race results, in the midst of a close and unsettled campaign, may be particularly vulnerable to these kinds of influences.

Another way to make the same point: look at our chart for the national Democratic primary polls, as captured this morning. It plots points for every survey released since January 2007.

Let your eyes focus, for a moment, not on the lines but on the cloud of dots surrounding each line. Each dot represents an individual poll. The dots are a bit more dense lately, a pattern explained mostly by our inclusion of daily tracking polls by Gallup and Rasmussen Reports since January 2008 (note: we plot only every third or fourth day of each survey so that their rolling average samples do not overlap). However, the pattern of variation -- the spread of points around each line -- is considerable but not any wider now than at any other point over the last 17 months.
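To illustrate the parenthetical note about tracking polls, here is a minimal Python sketch of how one might thin a series of daily rolling-average releases so that the interview windows of the plotted points do not overlap. The function name, the integer day index and the sample values are all hypothetical, chosen only for illustration; this is not the code behind the chart.

```python
def non_overlapping(releases, window_days):
    """Keep only releases whose interview windows do not overlap.

    `releases` is a list of (end_day, value) tuples sorted by end_day.
    Each release is assumed to cover the `window_days` days ending on end_day.
    """
    kept = []
    last_end = None
    for end_day, value in releases:
        # Keep a release only if its window starts after the last kept window ended.
        if last_end is None or end_day - last_end >= window_days:
            kept.append((end_day, value))
            last_end = end_day
    return kept


# Hypothetical daily releases of a 3-day rolling average (made-up values).
daily = [(day, 45 + day % 4) for day in range(1, 11)]
independent = non_overlapping(daily, window_days=3)
print([day for day, _ in independent])  # [1, 4, 7, 10]
```

With a three-day window, every third release is kept; a four-day window would keep every fourth.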

What is different right now is that the gap between Obama and Clinton is very small. So some polls show Clinton ahead, some show Obama ahead, and given the state of the race, we are paying much more attention to the small differences in individual polls than we usually do. Random error may explain some of the variation; small differences in methodology (question wording, order, the particular sub-population that answered the question) explain the rest. Either way, the cloud of variation -- bringing with it sometimes odd and conflicting "same-poll to same-poll comparisons" -- is an inherent part of political polling.
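For a sense of how much of that cloud random error alone can produce, the standard textbook margin of sampling error is easy to compute. The sketch below assumes a simple random sample with no design effect, which real polls only approximate, and the function names are mine. The sample sizes are taken from two polls listed elsewhere on this page (SurveyUSA Indiana, n=675, and PPP North Carolina, n=870).

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of sampling error, in points, for a single proportion.

    Assumes a simple random sample of size n (no weighting or design effect).
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)


def lead_margin_of_error(n, p1, p2, z=1.96):
    """Approximate 95% margin of error, in points, on the lead p1 - p2.

    Both shares come from the same sample, so the variance of the
    difference is (p1 + p2 - (p1 - p2)**2) / n.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100 * z * math.sqrt(variance)


print(round(margin_of_error(675), 1))  # 3.8, matching the reported MoSE
print(round(margin_of_error(870), 1))  # 3.3, matching the reported MoSE
print(round(lead_margin_of_error(675, 0.54, 0.42), 1))  # error on the lead
```

The last line shows why small leads are so fragile: the uncertainty on the gap between two candidates is roughly double the margin of error reported for each candidate's share.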

The variation is also the reason why we favor looking at all the polls in this "mashed up" graphical form rather than debating endlessly over which individual poll is closest to "right" at any moment.

POLL: Zogby IN, NC (5/4-5; Final)


Indiana
Obama 45, Clinton 43

North Carolina
Obama 51, Clinton 37

POLL: AP-Ipsos National (4/30-5/4)


755 RV, 514 RV Dems/Dem-Leaners

Clinton 47, Obama 40
Obama 46, McCain 42... Clinton 47, McCain 42

How much does the Pollster matter for Trend?

Topics: 2008 , Pollster , Pollster.com , SurveyUSA , Trend lines , Zogby


One of the things we think about a lot at Pollster.com is the quality of polling. Mark Blumenthal's post on the North Carolina poll demographics here is a great example of how much variability we see among polls, all trying to hit the same target population.

This issue is also raised by those who would like to exclude some polls from our trend estimates. If one "bad apple" spoils the barrel, then this is a serious issue for our efforts to estimate the state of the races here.

We've stuck to our principle that we include all available polls without cherry picking (to shift the fruit metaphor!) but we don't do that out of blind faith. Rather we do it because the empirical evidence shows that the effects of single pollsters are generally small, certainly compared to the other sources of uncertainty about the state of the race.

Here I take a look at this issue for North Carolina and Indiana.

There are four elements that affect how much a pollster influences our trend estimate.

First, the pollster's results must be "different" from the trend we'd estimate without them. If a pollster happened to hit our trend dead on every time, their influence would reinforce our trend estimate, but not change it. So for a poll to affect the trend, it needs to be different from what we'd otherwise estimate.

Second, the pollster needs to produce results that are systematically different from the trend. If a pollster bounces around the trend, some high and some low, then the net effect is small, even if individual polls are rather far off the trend.

Since the trend is determined across all pollsters, these first two points are another way of saying that the pollster must differ from what other pollsters are getting.

Third, volume matters. In some states, a single pollster accounts for a substantial proportion of all polling, while other pollsters contribute only a single poll. The former obviously have more potential influence than the latter. But high volume of polls doesn't matter if they are consistently close to (and scattered around) the trend estimate based on other polling. The problem comes when the prolific pollster is also rather different from others, and especially if there are few other pollsters active in the state.

Fourth, polls late in the game can have more leverage on the "current" trend estimate. So a pollster that does several polls but only in the last week before election day can have more influence on the current estimate than they would if those polls were spread over the entire pre-election period. Again, such an effect is only visible if the late polls are different from other polling.
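The first two elements (being different, and being systematically different) can be made concrete with a small Python sketch. It measures each pollster's average signed gap from a baseline built from everyone else's polls. To keep it short, the baseline here is a plain average of the other pollsters' margins rather than a trend line, and the pollster names and numbers are invented for illustration.

```python
from statistics import mean


def house_effect(polls, pollster):
    """Average signed gap between one pollster and all the others.

    `polls` is a list of (pollster_name, margin) pairs. Returns the mean
    residual and the list of individual residuals for `pollster`.
    """
    own = [m for name, m in polls if name == pollster]
    others = [m for name, m in polls if name != pollster]
    baseline = mean(others)  # stand-in for "the trend without this pollster"
    residuals = [m - baseline for m in own]
    return mean(residuals), residuals


# Hypothetical margins (made-up data).
polls = [
    ("A", 12), ("A", 13), ("A", 11),  # consistently above the others
    ("B", 3), ("B", 15), ("B", 9),    # far off each time, but in both directions
    ("C", 8), ("C", 7), ("C", 9),
    ("D", 8), ("D", 9), ("D", 7),
]

for name in ("A", "B"):
    effect, residuals = house_effect(polls, name)
    print(name, round(effect, 2), [round(r, 1) for r in residuals])
```

In this toy data, pollster B has the largest individual misses but almost no net effect, while pollster A is never far off yet pulls the estimate in one direction every time.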

Having an effect on the trend could be a very good thing if the pollster is right while others are wrong. The problem is: how do you know a priori which pollster will be right THIS TIME? Experience this year demonstrates that a good day can be followed by a bad day, or both on the same day.

It is also important to put these effects in perspective across all polls we see in a race. The individual polls are highly variable. Our data often show polls scattered across plus or minus 5, 6 or even 7 points around our estimated trend for an individual candidate, and double that for the margin between two candidates. There is a lot of noise out there, and the whole point of our trend estimator is to extract the signal from the noise. Our estimator (especially the "standard" estimator I'm using here, as opposed to the "sensitive" estimator we also check) is designed to resist polls that are "way off" (i.e. outliers) but at the same time be able to follow the common trend across polls. (I'm not going to go into the details of our local regression estimator here, which is not a simple rolling average. Let's hold that for another day. The FAQ on this is coming.)
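For readers who want a feel for what "local regression that resists outliers" means in general, here is a generic textbook-style sketch in Python. It is emphatically not the Pollster.com estimator, whose details are not given in this post: the tricube and bisquare weights, the bandwidth, the number of robustness passes and the data are all assumptions chosen for illustration.

```python
from statistics import median


def tricube(u):
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def bisquare(u):
    u = abs(u)
    return (1 - u ** 2) ** 2 if u < 1 else 0.0


def weighted_line(xs, ys, ws):
    """Return (intercept, slope) of the weighted least-squares line."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def local_trend(x0, xs, ys, bandwidth, robust_passes=2):
    """Robust local linear estimate at x0 (simplified, single target point)."""
    near = [tricube((x - x0) / bandwidth) for x in xs]  # closeness in time
    robust = [1.0] * len(xs)                            # outlier down-weights
    for _ in range(robust_passes + 1):
        ws = [n * r for n, r in zip(near, robust)]
        intercept, slope = weighted_line(xs, ys, ws)
        resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        scale = 6 * median(abs(r) for r, n in zip(resid, near) if n > 0)
        if scale == 0:
            break
        # Polls far from the fitted line get little or no weight next pass.
        robust = [bisquare(r / scale) for r in resid]
    return intercept + slope * x0


# Hypothetical flat race at +8 with one wild poll on day 5 (made-up data).
days = list(range(10))
margins = [8.0] * 10
margins[5] = 25.0

plain = local_trend(9, days, margins, bandwidth=10, robust_passes=0)
robust = local_trend(9, days, margins, bandwidth=10, robust_passes=2)
print(round(plain, 2), round(robust, 2))
```

With no robustness passes the single wild poll drags the estimate away from +8; with the passes enabled its weight shrinks toward zero and the estimate settles back on the common trend.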

So let's take a look at the North Carolina plot way up there at the top of this post. The horizontal axis is scaled to show the range of poll results we've seen in the state since April 1. This provides perspective on how much variation you see from poll to poll in the raw results.

The red "whiskers" at the bottom of the plot are the individual polls taken over this time. There is a bit more than a 25 point range in the Obama-Clinton margin during this period. Since the trends in the state have been relatively flat, only a little of this variation is due to "real change".

Our trend estimate based on all polls is the vertical blue line, which as of Monday afternoon is +8.6 points in Obama's favor.

How much do individual pollsters matter for this estimate? PPP has done the most polling in the state. If we take them out, the trend estimate drops to 7.0, a shift of 1.6 points on the difference (or an average of .8 points for each candidate, moving in opposite directions of course).

At the opposite extreme, removing Insider Advantage from our estimator produces a 10.7 point Obama lead, a shift of 2.1 points on the difference, or 1.05 points per candidate.

For most other pollsters, the effect is far smaller, even for relatively frequent pollsters such as SurveyUSA and ARG.

So the maximum effect of removing a single pollster is a shift between a 7.0 and a 10.7 point Obama lead. A shift of 3.7 points on the difference can matter in a close race, but that difference is relatively small compared to the variation we see in individual polls. Indeed, the four polls completed 5/4 show a range of +3 to +10 for the Obama margin. (They average a +7.25, compared to our trend estimate of +8.6.)
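The exercise above amounts to a leave-one-pollster-out sensitivity check. A minimal Python sketch of the bookkeeping follows. As a loud caveat, it swaps in a plain average where the post uses the local regression trend estimator, and the pollster labels and margins are invented, so the numbers it prints are not the ones reported here.

```python
from statistics import mean


def leave_one_out(polls, estimator=mean):
    """Re-estimate the margin with each pollster removed in turn.

    `polls` is a list of (pollster_name, margin) pairs. `estimator` is any
    function from a list of margins to a number (a plain mean by default,
    standing in for a real trend estimator).
    """
    full = estimator([m for _, m in polls])
    without = {}
    for name in sorted({p for p, _ in polls}):
        rest = [m for p, m in polls if p != name]
        without[name] = estimator(rest)
    return full, without


# Hypothetical margins (made-up data).
polls = [
    ("P1", 12), ("P1", 10), ("P1", 11),
    ("P2", 3), ("P2", 5),
    ("P3", 8), ("P3", 9),
    ("P4", 7),
]

full, without = leave_one_out(polls)
for name, estimate in without.items():
    print(f"without {name}: {estimate:+.2f} (shift {estimate - full:+.2f})")

# Width of the band of estimates produced by dropping any single pollster.
spread = max(without.values()) - min(without.values())
print(f"all polls: {full:+.2f}, leave-one-out spread: {spread:.2f}")
```

Comparing that spread with the range of the raw polls is the perspective check made in the text: the band from dropping a pollster should be narrow relative to the scatter of individual results.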

There is less polling in Indiana, so we might expect more influence since there are fewer polls to stabilize the trend estimator.


Here the current estimate using all polls is -6.2, a lead for Clinton. The range of results we get from excluding pollsters is from -4.1 (excluding SurveyUSA) to -8.7 (excluding Zogby). That is a bit larger than North Carolina, as expected. But put this in the perspective of the range of raw poll results for Indiana, which is from -16 to +5 in polls taken since April 1. The six latest polls as of Monday, all ending on 5/4, range from -12 to +2.

To sum up: which polls we include affects our results. That is how it has to be, and how it should be. We WANT the data to matter, and of course it does. What we don't want is for individual polls to make such large differences for our results that inclusion or exclusion decisions become critical. The results we see here show that we SHOULD be somewhat uncertain as to the trend, as it depends upon which individual pollsters are included. What is somewhat different in our approach at Pollster.com is we want to emphasize this uncertainty and put it in perspective, rather than produce a single number and treat that as if it were "certain". That is why we always show the individual polls spread around our trend estimate in the charts. All estimates have uncertainty. We need to understand both the value of the estimate and the uncertainty inherent in it. Pollster effects are part of that story.

However, what is crucial is that these effects on the trend estimate are small compared to the range of variability we see across individual polls. The goal of our trend estimator is to produce a better estimate than what any single poll (or pollster) can provide. By that standard pollster effects on the trend are modest compared to the variability across individual polls.

Evaluating the accuracy of the polls is a different topic, one we'll revisit on Wednesday.

Cross-posted at Political Arithmetik.

POLL: SurveyUSA North Carolina (5/2-4)


North Carolina
Obama 50, Clinton 45

POLL: PPP Indiana (5/3-4)

Public Policy Polling (D)

Clinton 51, Obama 46

NC and IN Sensitivity Update

Topics: 2008 , Barack Obama , Hillary Clinton


As we close in on tomorrow's primaries in North Carolina and Indiana, the "standard" and "sensitive" trend estimates have largely converged.

In North Carolina the standard estimator puts Obama at 50.1% and Clinton at 41.5%. The sensitive estimator has it Obama 49.5% and Clinton 42.2%. Or, a margin in the standard trend of +8.6 for Obama vs +7.3 in the sensitive estimate.


In Indiana, the standard estimator puts Clinton up 49.5% to 43.3% for Obama. Switching to the sensitive estimator makes it Clinton 51.2% to Obama's 43.5%. Or a Clinton advantage of 6.2 points for the standard estimator versus 7.7 points for the sensitive one.

Either way the polls are seeing a split decision tomorrow. Anything else will be a very interesting surprise.

Cross-posted at Political Arithmetik.

POLL: InsiderAdvantage IN (5/4)

n=502, crosstabs

Clinton 48, Obama 44

The Demographics of the Indiana Surveys

Topics: 2008 , Barack Obama , Divergent Polls , LA Times/Bloomberg , Pollster , Pollsters , Zogby

Following up on my post this morning on the demographics for the North Carolina polls, here is the same set of statistics, when available, for the recent surveys of Indiana. Since African-Americans are a much smaller share of the population in Indiana, the Clinton-Obama results are not quite as sensitive to their percentage of the Democratic electorate as in North Carolina. However, should the Indiana result be close, the size of the African-American population will be important. Also, the tables show that the polls vary on age as much as in other states.

The following table shows demographic composition statistics for those pollsters that have released them. Click on the table to display a larger version that also includes the vote preference results for each poll.


The table excludes statistics from pollsters that have not publicly released demographic information for their Indiana surveys (or perhaps more accurately, have not published it anywhere I could find): LA Times/Bloomberg, Indianapolis Star/WTHR/Selzer, Howey-Gauge (and thanks again to Pollster reader jac13 for sharing the demographic profile data that Zogby makes available to paid subscribers).

In Indiana we see the same wide variation in the age distribution among pollsters seen elsewhere: the percentage of 18-to-29-year-olds varies from 8% to 22%, and the percentage of 18-to-44-year-olds varies from 26% to 51%.

With the exception of one pollster, the variation in racial composition is smaller. Most show an African-American percentage of somewhere between 9% and 12%, with Research2000 (13%) and Suffolk University (15%) slightly higher. The most extreme value is the Howey-Gauge survey, which reported a much higher percentage of African-Americans (20%) among likely primary voters.

Brian Schaffner noticed last week that the larger percentage of African-Americans in the Howey-Gauge poll explained how they showed Obama with a two-point advantage while other firms showed Obama trailing by seven or more percentage points. He has some interesting speculation about the composition of the Indiana electorate, but ultimately I have to agree with his bottom line conclusion: Given the lack of an Indiana benchmark for past Democratic presidential primaries, "we don't really know what to expect in terms of African American turnout."

[Updated table to include new surveys from PPP and InsiderAdvantage]

POLL: Rasmussen WV (5/4)

Rasmussen Reports

West Virginia
Clinton 56, Obama 27

POLL: Daily Tracking (Through 5/4)

Gallup Poll

Obama 50, Clinton 45
McCain 47, Obama 43... Clinton 46, McCain 46

Rasmussen Reports

Obama 46, Clinton 45
McCain 47, Obama 43... Clinton 47, McCain 43

Favorable / Unfavorable
McCain: 51 / 46
Clinton: 48 / 49
Obama: 49 / 48

POLL: SurveyUSA Indiana (5/2-4)


n=675, MoSE=3.8
Clinton 54, Obama 42

POLL: PPP North Carolina (5/3-4)

Public Policy Polling (D)

North Carolina
n=870, MoSE=3.3
Obama 53, Clinton 43

Weighting their final poll.

POLL: ARG, IN, NC (5/2-4)

American Research Group

North Carolina
Obama 50, Clinton 42

Indiana
Clinton 53, Obama 45

POLL: USA Today/Gallup National (5/1-3)

USA Today/Gallup
(story, results)

Clinton 51, Obama 44

POLL: Suffolk IN (5/3-4)

Suffolk University

Clinton 49, Obama 43

POLL: InsiderAdvantage NC (5/4)


North Carolina
Obama 48, Clinton 45

POLL: Zogby IN, NC (5/3-4)


Indiana
Obama 44, Clinton 42

North Carolina
Obama 48, Clinton 40

The Demographics of the North Carolina Polls

Topics: 2006 , Barack Obama , Divergent Polls , Hillary Clinton , LA Times/Bloomberg , Mike McDonald , Pollster , Pollsters , SurveyUSA , Zogby

Time for another round-up of available poll demographics, this time from North Carolina. The most important variable in this state is the African American percentage of likely Democratic primary voters. The most recent polls -- at least among those that have disclosed their demographics -- have converged around a black percentage of 32-33%. Needless to say, given the near monolithic support that African Americans have given Barack Obama, that percentage will ultimately be critical to his share of the vote on Tuesday.

The following table shows demographic composition statistics for those pollsters that have released them. Click on the table to display a larger version that also includes the vote preference results for each poll.


The table excludes the pollsters that have, as of yet, not publicly released demographic information for their North Carolina surveys: Mason-Dixon, Rasmussen Reports, and LA Times/Bloomberg (special thanks to readers Paul and jac13 for sharing the demographic profile data that Zogby shares with paid subscribers).

As in previous states, we see considerable variation in the kinds of voters selected as "likely primary voters." Easily the most variant likely voter sample on the list is the one from the Civitas Institute from early April, with a composition of just 28% African American and 17% under the age of 45. However, even if we set that survey aside, we still see considerable variation: from 51% to 58% female, from 39% to 55% age 18-to-44 and from 25% to 37% African American (and those last extremes come from a single pollster -- more below).

A quick review from my post on the demographics of the Pennsylvania surveys:

It is important to remember that pollsters come to these composition statistics through different paths. Some interview samples of adults, weight those demographically to match census estimates of Pennsylvania's adults, then select "likely voters" and let their demographics fall where they may. Others will weight their "likely voter" samples directly to pre-determined demographic targets. Some pollsters will not set weights or quotas for demographics, but will set such weights or quotas for geographic regions (based on past turnout and their assumptions about what might be different this time).

With that in mind, note two very striking changes from two pollsters that set pre-determined demographic targets, Public Policy Polling (PPP) and InsiderAdvantage:

  • The first three surveys released in April by PPP had an African American composition of 36% or 37%. Their most recent survey, fielded last Sunday and Monday evenings, had a black composition of just 33%.
  • The gyrations in the weighting by InsiderAdvantage are even more dizzying. Their first North Carolina survey in late March was 37% African American. Their next two surveys in April were only 25% African American, and their most recent poll last week bumped the black percentage back up to 33%. Notice that none of their percentages for women, 18-29-year-olds, 18-44-year-olds or those 65+ changed by a single digit, despite a 12-point variance in the black percentage.

Both pollsters put out written summaries of their results, but neither made any reference (that I could find) explaining or justifying their changing assumptions about the racial composition of the North Carolina electorate. [Update: On their final poll, PPP upped the black share to 35%, but explained their rationale]. By the way, we know that these two pollsters set predetermined demographic targets, because both have confirmed as much to me in previous communications (here for InsiderAdvantage and here for PPP).

The change in the PPP poll is important -- they should have noted it -- but relatively modest compared to the astonishingly large, significant and unexplained shifts in the African American composition in the InsiderAdvantage polls. InsiderAdvantage's Matt Towery likes to brag of his "significant experience" as a pollster, but after a number of curious episodes over the last few months, it is getting very hard to take those claims seriously.

It's also worth pointing out the relative stability in the racial composition of the SurveyUSA results, given that they do not force their samples to a pre-determined demographic profile (details on their procedures here). The percentage of African Americans in their four surveys since March has remained relatively stable, falling within the range of 30% to 33%.

Finally, one caution about the percentage reported as "unaffiliated" (having no party affiliation). Only PPP includes the full text of their party question, and it is possible other pollsters are asking about party identification (whether respondents "consider themselves" as partisans) rather than party registration.

Update: Almost forgot. Fivethirtyeight's Poblano posted a handy spreadsheet that can help you see just how much small changes in the racial composition of the North Carolina electorate can affect the potential margin between the candidates. It's well worth the click.
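The arithmetic behind that kind of sensitivity check is simple enough to sketch in a few lines of Python. This is my own illustration, not Poblano's spreadsheet: the within-group margins below (Obama +80 among African Americans, Clinton +25 among everyone else) are assumptions picked purely for the example, while the 25%, 33% and 37% composition values are the ones discussed above.

```python
def overall_margin(black_share, margin_black, margin_other):
    """Overall Obama-minus-Clinton margin as a weighted average of two groups.

    `black_share` is the African American share of the electorate (0 to 1);
    the two margins are Obama minus Clinton, in points, within each group.
    """
    return black_share * margin_black + (1 - black_share) * margin_other


# ASSUMED within-group margins, for illustration only.
MARGIN_BLACK = 80.0   # Obama +80 among African American voters
MARGIN_OTHER = -25.0  # Clinton +25 among all other voters

for share in (0.25, 0.33, 0.37):
    margin = overall_margin(share, MARGIN_BLACK, MARGIN_OTHER)
    print(f"{share:.0%} African American -> Obama {margin:+.2f}")
```

Under these assumed margins, each one-point change in the African American share moves the overall margin by about one point, so the 12-point swing in the InsiderAdvantage weighting would by itself move the topline by roughly 12 to 13 points.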

Update II: In posting this last night, I neglected to point out that North Carolina has been releasing reports on the demographics of early voters. As North Carolina is one of nine southern states still required by the Voting Rights Act of 1965 to track voter registration by race, racial tallies among early voters are also available. The demographic composition of early voters has been analyzed by Brian Schaffner, DailyKos diarist dean4ever and noted in comments by many of our readers over the weekend.

Overnight, GMU Professor Michael McDonald, whose academic focus is voter turnout, posted the following comment:

North Carolina is an exceptional state in that it provides near real-time updates of its voter registration file. Indeed, you can download the entire file of absentee and early in-person voters directly from the state's ftp site.

North Carolina is also an interesting state because race and gender are recorded on the voter file (birthdate appears to be suppressed in the absentee file). When I crunch the numbers, out of the 397,850 persons who are listed as returning a Democratic Party ballot as of 5/03:

39.9% are African American
60.8% are women

Note, a small percentage (less than 1%) of records have missing data.

Will these percentages hold for Tuesday? That is hard to say, mostly because people who study early voting (myself included) don't know much about the characteristics of early primary voters. Added is the confounding factor that one-stop registration and voting is permitted for in-person early voters only and not for Election Day voters. Providing little further clues, African-Americans are only slightly more likely to vote early in-person, 40.6%, and women slightly less, 60.7%.

The fact that nearly 400,000 early votes have been cast so far is remarkable given past primary turnout in North Carolina. The state held a caucus in 2004 (due to a redistricting battle that delayed the primary), but 544,922 Democrats voted in the largely uncontested primary in May 2000, and 691,875 voted in May 1992 (statistics I gathered for a column noting that pollster PPP has been sampling from a total universe of 874,222). The record was 961,000 in 1984, according to the Charlotte Observer, which cites "long time N.C. political observers" guessing that "as many as 1.5 million" may vote this year. So this early vote will be a significant portion of the total votes cast, but as McDonald points out, no one knows exactly how big.
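To put "a significant portion" in rough numbers, the short Python sketch below divides the early vote count reported above by each of the turnout figures mentioned in this paragraph. The counts are taken from the text; treating them as alternative scenarios for this year's total is my own framing, and the final total is of course unknown.

```python
EARLY_VOTES = 397_850  # Democratic ballots returned as of 5/03 (from the text)

# Turnout figures mentioned above, used here as what-if scenarios.
scenarios = {
    "2000 primary turnout": 544_922,
    "1992 primary turnout": 691_875,
    "1984 record turnout": 961_000,
    "observers' upper guess": 1_500_000,
}


def early_share(early, total):
    """Early votes as a percentage of a given total turnout."""
    return 100 * early / total


for label, total in scenarios.items():
    print(f"{label}: {early_share(EARLY_VOTES, total):.1f}% already cast")
```

Even under the most expansive guess of 1.5 million voters, the early vote would be more than a quarter of all ballots; at the 1984 record it would be about two in five.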

It is also worth pointing out that the Obama campaign has made early voting drives a focus of their field organizing, so it is certainly possible that the ranks of early voters are disproportionately swollen with Obama voters. Last week's poll from SurveyUSA showed Obama leading by 18 points (57% to 39%) among early voters, but that subgroup was just 2% of their total sample. Thus, one key result to watch in the final poll releases today -- among those farsighted enough to track and report it -- will be the size and preference of the early voters.

POLL: National Daily Tracking (through 5/3)

Gallup Daily

Obama 49, Clinton 45
McCain 47, Obama 42... McCain 46, Clinton 45

Rasmussen Reports

Obama 45, Clinton 44
McCain 47, Obama 44... Clinton 46, McCain 44

Favorable / Unfavorable
McCain: 52 / 46
Clinton: 48 / 50
Obama: 50 / 48

POLL: CBS/Times National (5/1-3)

CBS News/New York Times
(CBS story, Wright/Obama/Campaign results, Economy/Gas Tax results; Times story, results, methodology)
n=671 adults, 601 registered voters, 283 Democratic primary voters

Dem primary voters: Obama 51, Clinton 38

Registered voters:
Obama 51, McCain 40... Clinton 53, McCain 41

POLL: Zogby IN, NC (5/2-3)


North Carolina
Obama 48, Clinton 39

Indiana
Obama 43, Clinton 41