June 13, 2010 - June 19, 2010


Rating Pollster Accuracy: How Useful?

Topics: Accuracy , Brendan Nyhan , Courtney Kennedy , Fivethirtyeight , Nate Silver

I have been posting quite a bit lately on the subject of the transparency of Nate Silver's recently updated pollster ratings, so it was heartening to see his announcement yesterday that FiveThirtyEight has established a new process to allow pollsters to review their own polls in his database. That is a very positive step and we applaud him for it.

I haven't yet expressed much of an opinion on the ratings themselves or their methodology, and have hesitated to do so because I know some will see criticism from this corner as self-serving. Our site competes with FiveThirtyEight in some ways, and in unveiling these new ratings, Nate emphasized that "rating pollsters is at the core of FiveThirtyEight's mission, and forms the backbone of our forecasting models."

Pollster and FiveThirtyEight serve a similar mission, though we approach it differently: Helping those who follow political polls make sense of the sometimes conflicting or surprising results they produce. We are, in a sense, both participating in a similar conversation, a conversation in which, every day, someone asks some variant of the question, "Can I Trust This Poll?"

For Nate Silver and FiveThirtyEight, the answer to that question often flows from their ratings of pollster accuracy. During the 2008 campaign season, Nate leaned heavily on earlier versions of his ratings in posts that urged readers to pay less attention to some polls and more to others, with characterizations running the gamut from "pretty awful" or "distinctly poor" to the kind of pollster "I'd want with me on a desert island." He also built those ratings into his forecasting models, explaining to New York Magazine that other sites that average polls (among them RealClearPolitics and Pollster.com) "have the right idea, but they're not doing it quite the right way." The right way, as the article explained, was to average so that "the polls that were more accurate [would] count for more, while the bad polls would be discounted."

For better or worse, FiveThirtyEight's prominence makes these ratings central to our conversation about how to interpret and aggregate polls, and I have some serious concerns about the way these ratings are calculated and presented. Some commentary from our perspective is in order.

What's Good

Let's start with what's good about the ratings.

First, most pollsters see value in broadly assessing poll accuracy. As the Pew Research Center's Scott Keeter has written (in a soon to be published chapter), "election polls provide a unique and highly visible validation of the accuracy of survey research," a "final exam" for pollsters that "rolls around every two or four years." And, while Keeter has used accuracy measurements to assess methodology, others have used accuracy scores to tout their organizations' successes, even if their claims sometimes depend on cherry-picked methods of scoring, cherry-picked polls or even a single poll. So Silver deserves credit for taking on the unforgiving task of scoring individual pollsters.

Second, by gathering pre-election poll results across many different types of elections over more than ten years, Silver has also created a very useful resource to help understand the strengths and weaknesses of pre-election polling. One of the most powerful examples is the table, reproduced below, that he included in his methodology review. It shows that poll errors are typically smallest for national presidential elections and get bigger (in ascending order) for polls on state-level presidential, senate, governor, and primary elections.


Third, I like the idea of trying to broaden the scoring of poll accuracy beyond the final poll conducted by each organization before an election. He includes all polls with a "median date" (at least halfway completed) within 21 days of the election. As he writes, we have seen some notable examples in recent years of pollsters whose numbers "bounce around a lot before 'magically' falling in line with the broad consensus of other pollsters." If we just score "the last poll," we create incentives for ethically challenged pollsters to try to game the scorecards.

Of course, Silver's solution creates a big new challenge of its own: How to score the accuracy of polls taken as many as three weeks before an election while not penalizing pollsters that are more active in races like primary elections that are more prone to huge late swings in vote preference. A pollster might provide a spot-on measurement of a late breaking trend in a series of tracking polls, but only their final poll would be deemed "accurate."

Fourth, for better or worse, Silver has already done a service by significantly raising the profile of the Transparency Initiative of the American Association for Public Opinion Research (AAPOR). Much more on that subject below.

Finally, you simply have to give Nate credit for the sheer chutzpah necessary to take on the Everest-like challenge of combining polls from so many different types of elections spanning so many years into a single scoring and ranking system. It's a daunting task.

A Reality Check

While the goals are laudable, I want to suggest a number of reasons to take the resulting scores, and especially the rankings of pollsters using those scores, with huge grains of salt.

First, as Silver himself warns, scoring the accuracy of pre-election polls has limited utility. They tell you something about whether pollsters "accurately [forecast] election outcomes, when they release polls into the public domain in the period immediately prior to an election." As such:

The ratings may not tell you very much about how accurate a pollster is when probing non-electoral public policy questions, in which case things like proper question wording and ordering become much more important. The ratings may not tell you very much about how accurate a pollster is far in advance of an election, when definitions of things like "likely voters" are much more ambiguous. And they may not tell you very much about how accurate the pollsters are when acting as internal pollsters on behalf of campaigns.

I would add at least one more: Given the importance of the likely voter models in determining the accuracy of pre-election polls, these ratings also tell you little about a pollster's ability to begin with a truly representative sample of all adults.

Second, even if you take the scores at face value, the final scores that Silver reports vary little from pollster to pollster. They provide little real differentiation among most of the pollsters on the list. What is the range of uncertainty, or if you will, the "margin of error" associated with the various scores? Silver told Markos Moulitsas that "the absolute difference in the pollster ratings is not very great. Most of the time, there is no difference at all."

Also, in response to my question on this subject, he advised that while "estimating the errors on the PIE [pollster-introduced error] terms is not quite as straightforward as it might seem," he assumes a margin of error "on the order of +/- .4" assuming a 95% confidence level. He adds:

We can say with a fair amount of confidence that the pollsters at the top dozen or so positions in the chart are skilled, and the bottom dozen or so are unskilled i.e. "bad". Beyond that, I don't think people should be sweating every detail down to the tenth-of-a-point level.

That information implies, as our commenter jme put it yesterday, that "his model is really only useful for classifying pollsters into three groups: Probably good, probably bad and everyone else." And that assumes that this confidence is based on an actual computation of standard errors for the PIE scores. Commenter Cato has doubts.
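To see how little differentiation a +/-.4 margin of error allows, consider a minimal sketch. The function and the PIE scores below are hypothetical illustrations, not Silver's actual figures or method:

```python
# Sketch: with a +/-0.4 margin of error on each PIE score (95%
# confidence), two pollsters are statistically distinguishable only
# when their confidence intervals do not overlap, i.e. when the
# scores differ by more than 0.8.

def distinguishable(pie_a, pie_b, moe=0.4):
    """True if the two scores' 95% intervals are disjoint."""
    return abs(pie_a - pie_b) > 2 * moe

# Hypothetical PIE scores (not Silver's actual figures):
print(distinguishable(-0.5, 0.4))  # True: nearly a full point apart
print(distinguishable(0.1, 0.4))   # False: well within the noise
```

Since the published PIE scores mostly cluster within a point of each other, most pairwise comparisons fall into the second, indistinguishable case.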

But aside from the mechanics, if all we can conclude is that Pollster A produces polls that are, on average, a point or two less variable than Pollster B, do these accuracy scores help us understand why, to pick a recent example, one poll shows a candidate leading by 21 points and another shows him leading by 8 points?

Third, even if you take the PIE scores at face value, I would quarrel with the notion that they reflect pollster "skill." This complaint has come up repeatedly in my conversations with survey methodologists over the last two weeks. For example, Courtney Kennedy, a senior methodologist for Abt SRBI, tells me via email that she finds the concept of skill "odd" in this context:

Pollsters demonstrate their "skill" through a set of design decisions (e.g., sample design, weighting) that, for the most part, are quantifiable and could theoretically be included in the model. He seems to use "skill" to refer to the net effect of all the variables that he doesn't have easy access to.

Brendan Nyhan, the University of Michigan academic who frequently cross-posts to this site, makes a similar point via email:

It's not necessarily true that the dummy variable for each firm (i.e. the "raw score") actually "reflects the pollster's skill" as Silver states. These estimates instead capture the expected difference in accuracy of that firm's polls controlling for other factors -- a difference that could be the result of a variety of factors other than skill. For instance, if certain pollsters tend to poll in races with well-known incumbents that are easier to poll, this could affect the expected accuracy of their polls even after adjusting for other factors. Without random assignment of pollsters to campaigns, it's important to be cautious in interpreting regression coefficients.

Fourth, there are good reasons to take the scores at something less than face value. They reflect the end product of a whole host of assumptions that Silver has made about how to measure error, and how to level the playing field and control for factors -- like type of election and timing -- that may give some pollsters an advantage. Small changes in those assumptions could alter the scores and rankings. For example, he could have used different measures of error (that make different assumptions about how to treat undecided voters), looked at different time intervals (Why 21 days? Why not 10? Or 30?), gathered polls for a different set of years or made different decisions about the functional form of his regression models and procedures. My point here is not to question the decisions he made, but to underscore that different decisions would likely produce different rankings.
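To make the undecided-voter point concrete, here is a sketch of two plausible ways to score the very same poll against an election result. These formulas are illustrative assumptions, not Silver's actual error measures:

```python
# Sketch: two common ways to score a poll's error against the
# official result, differing only in how undecideds are treated.

def error_on_margin(poll_d, poll_r, actual_d, actual_r):
    """Error in the candidate spread, ignoring undecided voters."""
    return abs((poll_d - poll_r) - (actual_d - actual_r))

def error_proportional(poll_d, poll_r, actual_d, actual_r):
    """Allocate undecideds proportionally, then compare spreads."""
    scale = (actual_d + actual_r) / (poll_d + poll_r)
    return abs((poll_d - poll_r) * scale - (actual_d - actual_r))

# A 48-42 poll (10% undecided) against a 53-47 result:
print(error_on_margin(48, 42, 53, 47))               # 0 -- a "perfect" poll
print(round(error_proportional(48, 42, 53, 47), 2))  # 0.67 -- not so perfect
```

The same poll scores as flawless under one measure and measurably off under the other, which is why the choice of error metric can reshuffle a ranking.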

Fifth, and most important, anyone who relies on Silver's PIE scores needs to understand the implications of his "regressing" the scores to "different means," a complex process that essentially gives bonus points to pollsters that are members of the National Council on Public Polls (NCPP) or that publicly endorsed AAPOR's Transparency Initiative prior to June 1, 2010. These bonus points, as you will see, do not level the playing field among pollsters. They do just the opposite.

In his methodological discussion, Silver explains that he combined NCPP membership and endorsement of the AAPOR initiative into a single variable and found, with "approximately" 95% confidence, "that the [accuracy] scores of polling firms which have made a public commitment to disclosure and transparency hold up better over time." In other words, the pollsters he flagged with an NCPP/AAPOR label appeared to be more accurate than the rest.

His PIE scores include a complex regressing-to-the-mean procedure that aims to minimize raw error scores that are randomly very low or very high for pollsters with relatively few polls in his database. And -- a very important point -- he says that the "principle purpose" of these scores is to weight pollsters higher or lower as part of FiveThirtyEight's electoral forecasting system.

So he has opted to adjust the PIE scores so that NCPP/AAPOR pollsters get more points for accuracy and others get less. The adjustment effectively reduces the PIE error scores by as much as half a point for pollsters in the NCPP/AAPOR category, with the pollsters that have the fewest polls in his database getting the biggest boost. He also applies a similarly sized and analogous penalty to three firms that conduct surveys over the internet. He explains that his rationale is "not to evaluate how accurate a pollster has been in the past -- but rather, to anticipate how accurate it will be going forward."

Read that last sentence again, because it's important. He has adjusted the PIE scores that he uses to rank "pollster performance" not only on their individual performance looking back, but also on his prediction of how they will perform going forward.
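The mechanics of a regression-to-the-mean adjustment of this kind can be sketched as follows. The weighting formula, the constant `k`, and the group means are illustrative assumptions, not Silver's actual procedure:

```python
# Sketch of a "regress to different means" adjustment: each pollster's
# raw error score is pulled toward its group's mean, and the pull is
# stronger the fewer polls the pollster has in the database.

def shrink(raw_score, n_polls, group_mean, k=20):
    """Weight the raw score against the group mean by sample size."""
    w = n_polls / (n_polls + k)  # more polls -> trust the raw score more
    return w * raw_score + (1 - w) * group_mean

# Two pollsters with identical raw scores and poll counts, but
# regressed to different (hypothetical) group means:
ncpp_aapor = shrink(0.6, 5, group_mean=-0.5)  # pulled toward a "bonus" mean
other = shrink(0.6, 5, group_mean=0.5)        # pulled toward a penalty mean
print(round(ncpp_aapor, 2), round(other, 2))  # -0.28 0.52
```

Note that the two pollsters end up nearly a point apart on identical track records, and a pollster with only a handful of polls sits almost entirely on its group mean.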

Regular readers will know that I am an active AAPOR member and strong booster of the initiative and efforts to improve pollster disclosure generally. I believe that transparency may tell us something, indirectly, about survey quality. So I am intrigued by Silver's findings concerning the NCPP/AAPOR pollsters as a group, but I'm not a fan of the bonus/penalty point system he built into the ratings of individual pollsters. Let me show you why.

The following is a screen-shot of the table Silver provides that ranks all 262 pollsters, showing just the top 30. Keep in mind this is what his readers see when they click on the "Pollster Ratings" tab displayed prominently at the top of FiveThirtyEight.com:


The NCPP/AAPOR pollsters are denoted with a blue star. They dominate the top of the list, accounting for 23 of the top 30 pollsters.

But what would have happened had Silver awarded no bonus points? We don't know for certain, because he provided no PIE scores calculated any other way, but we did our best to replicate Silver's scoring method by recalculating the PIE score without any bonus or penalty points (regressing the scores to the single mean of 0.12). That table appears below.**

[I want to be clear that the following chart was not produced or endorsed by Nate Silver or FiveThirtyEight.com. We produced it for demonstration purposes only, although we tried to replicate his calculations as closely as we could. Also note that the "Flat PIE" scores do not reflect Pollster.com's assessment or ranking of pollster accuracy, and no one should cite them as such].


The top 30 look a lot different once we remove the bonus and penalty points. The number of NCPP/AAPOR designated pollsters in the top 30 drops from 23 to 7 (although the 7 that remain all fall within the top 13, something that may help explain the underlying NCPP/AAPOR effect that Silver reports). Those bumped from the top 30 often move far down the list. You can download our spreadsheet to see all the details, but nine pollsters awarded NCPP/AAPOR bonus points drop in the rankings by 100 or more places.

[In a guest post earlier today on Pollster.com, Monmouth University pollster Patrick Murray describes a very similar analysis he did using the same data. Murray regressed to the PIE scores to a different single mean (0.50), yet describes a very similar shift in the rankings].

Now I want to make clear that I do not question Silver's motives in regressing to different means. I am certain he genuinely believes the NCPP/AAPOR adjustment will improve the accuracy of his election forecasts. If the adjustment only affected those forecasts -- his poll averages -- I probably would not comment. But it does more than that. His adjustments appear to dramatically alter rankings prominently promoted as "pollster ratings," ratings that are already having an impact on the reputations and livelihoods of individual pollsters.

That's a problem.

And the adjustment alters those ratings in a way that's not justified by his findings. Joining NCPP or endorsing the AAPOR initiative may be statistically related to other aspects of pollster philosophy or practice that made them more accurate in the past, but no one -- not even Nate Silver -- believes that a mere commitment made a few weeks ago to greater future transparency caused pollsters to be more accurate over the last ten years.

Yet in adjusting his scores as he does, Silver is increasing the accuracy ratings of some firms and penalizing others on those grounds, in a way that is also contrary to AAPOR's intentions. On May 14, when AAPOR's Peter Miller presented the initial list of organizations that had endorsed the transparency initiative, he specifically warned his audience that many organizations would soon be added to the list because "I have not been able to make contact with everyone" while others faced contractual prohibitions Miller believed could be changed over time. As such, he offered this explicit warning: "Don't make any inferences about blanks up here, [about] names you don't see on this list."***

And one more thought: If you look back at both tables above, you will notice that Strategic Vision, LLC -- the firm whose name Silver strikes out and marks with a black "x" because he concludes that its polling "was probably fake" -- cracks the top-30 "most accurate" pollsters (of 262) on both lists.

If a pollster can reach the 80th or 90th percentile for accuracy with made up data, imagine how "accurate" a pollster can be by simply taking other pollsters' results into account when tweaking their likely voters model or weighting real data. As such, how useful are such ratings for assessing whether pollsters are really starting with representative samples of adults?

My bottom line: These sort of pollster ratings and rankings are interesting, but they are of very limited utility in sorting out "good" pollsters from "bad."

**Silver has not, as far as I can tell, published the mean he would regress PIE to had he chosen to regress to a single mean. I arrived at 0.12 based on an explanation he provided to Doug Rivers of YouGov/Polimetrix (who is also the owner of Pollster.com) that Rivers subsequently shared with me: "the [group mean] figures are calibrated very slightly differently than the STATA output in order to ensure that the average adjscore -- weighted by the number of polls each firm has conducted -- is exactly zero." A "flat mean" of 0.12 creates a weighted average adjscore of zero. I emailed Silver this morning asking if he could confirm. As of this writing he has not responded.

***In the interests of truly full transparency, I should disclose that I suggested to Nate that he look at pollster accuracy among pollsters that had endorsed the AAPOR Transparency Initiative before he posted his ratings. He had originally found the apparent effect looking only at members of NCPP, and he sent an email to Jay Leve (of SurveyUSA), Gary Langer (polling director of ABC News) and me on June 1 to share the results and ask some additional questions, including: "Are there any variables similar to NCPP membership that I should consider instead, such as AAPOR membership?" AAPOR membership is problematic, since AAPOR is an organization of individuals and not firms, so I suggested he look at the Transparency Initiative list. In his first email, Silver also mentioned that, "the ratings for NCPP members will be regressed to a different mean than those for non-NCPP members." I will confess that at the time I had no idea what that meant, but in fairness, I certainly could have raised an objection then and did not.

Yoda 'Outliers'

Topics: Outliers Feature

Gallup's job creation index hits a 20 month high.

Tom Jensen cautions on interpreting polls immediately after scandals.

Jennifer Agiesta takes a look at enthusiasm for the World Cup around the world.

John Dickerson channels Yoda on polls.

US: 2012 President (USAToday/Gallup 6/11-13)

Topics: National , poll

USA Today / Gallup
6/11-13/10; 1,014 adults, 4% margin of error
926 registered voters, 4% margin of error
Mode: Live telephone interviews
(Gallup release)


Please tell me whether you think each of the following political office-holders deserves to be re-elected, or not. How about Barack Obama?
Registered voters: 46% Yes, deserves, 51% No, does not
All adults: 48% Yes, deserves, 49% No, does not

NY: 2010 Sen (Rasmussen 6/16)

Topics: New York , poll

6/16/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)

New York

2010 Senate
49% Gillibrand (D), 38% DioGuardi (R)
50% Gillibrand (D), 38% Blakeman (R) (chart)
49% Gillibrand (D), 34% Malpass (R)

Favorable / Unfavorable
Kirsten Gillibrand: 49 / 38 (chart)
Bruce Blakeman: 32 / 27
David Malpass: 30 / 29
Joe DioGuardi: 35 / 27

Job Approval / Disapproval
Pres. Obama: 54 / 44 (chart)
Gov. Paterson: 35 / 61 (chart)

MD: 2010 Gov (Murphy 6/8-10)

Topics: Maryland , poll

The Polling Company (R) for Brian Murphy
6/8-10/10; 508 likely voters
Mode: Live telephone interviews
(Murphy campaign release)


2010 Governor
44% O'Malley (D), 43% Ehrlich (R)
44% O'Malley (D), 25% Murphy (R)

TN: 2010 Gov (Rasmussen 6/15)

Topics: poll , Tennessee

6/15/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Governor
50% Haslam (R), 32% McWherter (D)
44% Ramsey (R), 33% McWherter (D)
44% Wamp (R), 33% McWherter (D)

Favorable / Unfavorable
Bill Haslam: 66 / 17
Ron Ramsey: 46 / 30
Mike McWherter: 45 / 38
Zach Wamp: 51 / 29

Job Approval / Disapproval
Pres. Obama: 42 / 57
Gov. Bredesen: 75 / 25

Murray: Are Nate Silver's Pollster Ratings 'Done Right'?

Topics: AAPOR , AAPOR Transparency Initiative , Fivethirtyeight , Nate Silver , Patrick Murray , Poll Accuracy , Polling Errors , Transparency

Patrick Murray is director of the Monmouth University Polling Institute

The motto of Nate Silver's website, www.fivethirtyeight.com, is "Politics Done Right." Questions have been raised about whether his latest round of pollster ratings lives up to that claim.

After Mark Blumenthal noted errors and omissions in the data used to arrive at Research 2000's rating, I asked to examine Monmouth University's poll data. I found a number of errors in the 17 poll entries he attributes to us - including six polls that were actually conducted by another pollster before our partnership with the Gannett New Jersey newspapers started, one eligible poll that was omitted, one incorrect candidate margin, and even two incorrect election results that affected the error scores of four polls. [Nate emailed that he will correct these errors in his update later this summer.]

In the case of prolific pollsters, like Research 2000, these errors may not have a major impact on the ratings. But just one or two database errors could significantly affect the ratings of pollsters with relatively limited track records - such as the 157 (out of 262) organizations with fewer than 5 polls to their credit. Some observers have called on Nate to demonstrate transparency in his own methods by releasing that database. Nate has refused to do this (with a somewhat dubious justification), but at least he now has a process for pollsters to verify their own data.

Basic errors in the database are certainly a problem, but the issue that has really generated buzz in the polling community is his new "transparency bonus." This is based on the premise that pollsters who were members of the National Council on Public Polls or had committed to the American Association for Public Opinion Research (AAPOR) Transparency Initiative as of June 1, 2010 exhibit superior polling performance. These pollsters are awarded a very sizable "transparency bonus" in the latest ratings.

Others have remarked on the apparent arbitrariness of this "transparency bonus" cutoff date. Many, if not most, pollsters who signed onto the initiative by June 1, 2010 were either involved in the planning or attended the AAPOR national conference in May. A general call to support the initiative did not go out until June 7.

Nate claims that, regardless of how a pollster made it onto the list, these pollsters are simply better at election forecasting, and he provides the results of a regression analysis as evidence. The problem is that the transparency score misses most researchers' threshold for being significant (p<.05). In fact, of the three variables in his equation - transparent, partisan, and Internet polls - only partisan polling shows a significant relationship. Yet, his Pollster Introduced Error (PIE) calculation awards "transparent" polls and penalizes Internet polls, but leaves partisan polls untouched. Moreover, his model explains only 3% of the total variance in pollster raw scores (i.e. polling error).

I decided to run some ANOVA tests on the effect of the transparency variable on pollster raw scores for the full list of pollsters as well as sub-groups at various levels of polling output (e.g. pollsters with more than 10 polls, pollsters with only 1 or 2 polls, etc.). The F values for these tests range from only 1.2 to 3.6 under each condition, and none are significant at p<.05. In other words, there may be more that separates pollsters within the two groups (transparent versus non-transparent) than there is between the two groups.

I also ran a simple means analysis. The average error among all pollsters is +.54 (positive error is bad, negative is good). Among "transparent" pollsters, the average score is -.63 (se=.23), while among other pollsters it is +.68 (se=.28). A potential difference, to be sure.

I then isolated the more prolific pollsters - the 63 organizations with at least 10 polls. Among this group, the 19 "transparent" pollsters have an average error score of -.32 (se=.23) and the other 44 pollsters average +.03 (se=.17). The difference is now less stark.

On the flip side, organizations with fewer than 10 polls to their credit have an average error score of -1.38 (se=.73) if they are "transparent" - all 8 of them - and a mean of +.83 (se=.28) if they are not. That's a much larger difference. Could it be that the real contributing factor to pollster performance is the number of polls conducted over time?
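The group comparisons above boil down to computing a mean and standard error per group. A self-contained sketch of that computation, using made-up error scores rather than the actual data:

```python
# Sketch: average error and standard error by group, the computation
# behind the "transparent" vs. other comparisons above. The scores
# here are hypothetical, for illustration only.
import math

def mean_and_se(scores):
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)                       # SE = sd / sqrt(n)

transparent = [-1.2, -0.4, 0.1, -0.8, -0.2]  # hypothetical error scores
others = [0.9, 0.3, 1.4, 0.2, 0.6]

for label, group in [("transparent", transparent), ("others", others)]:
    m, se = mean_and_se(group)
    print(f"{label}: mean={m:+.2f}, se={se:.2f}")
```

With standard errors in hand, the question is whether the gap between group means is large relative to those errors -- which is exactly what the small samples of "transparent" low-volume pollsters make hard to establish.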

Consider that 70% of "transparent" pollsters on Nate's list have 10 or more polls to their credit, but only 19% of the "non-transparent" organizations have been equally as prolific. In effect, "non-transparent" pollsters are penalized for being affiliated with a large number of colleagues who have only a handful of polls to their name - i.e. pollsters who are prone to greater error.

To assess the tangible effect of the transparency bonus (or non-transparency penalty) on pollster ratings, I re-ran Nate's PIE calculation using a level playing field for all 262 pollsters on the list to rank order them. [I set the group mean error to +.50, which is approximately the mean error among all pollsters.] Comparing the relative pollster ranking between his and my lists produced some intriguing results. The vast majority of pollster ranks (175) did not change by more than 10 spots on the table. On its face, this first finding raises questions about the meaningfulness of the transparency bonus.

Another 67 pollsters moved between 11 to 40 ranks between the two lists, 11 shifted by 41 to 100 spots, and 9 pollsters gained more than 100 spots in the rankings, solely due to the transparency bonus. Of this last group, only 2 of the 9 had more than 15 polls recorded in the database. This raises the question of whether these pollsters are being judged on their own merits or riding others' coattails, as it were.

Nate says that the main purpose of his project is not to rate pollsters' past performance but to determine probable accuracy going forward. The complexity of his approach boggles the mind - his methodology statement contains about 4,800 words including 18 footnotes. It's all a bit dazzling, but in reality it seems like he's making three left turns to go right.

Other poll aggregators use less elaborate methods - including straightforward means - and have been just as, or even more, accurate with their election models (see here and here). I wonder if, with the addition of this transparency score, Nate has taken one left turn too many.

OK: 2010 Gov (SoonerPoll 5/25-6/8)

Topics: Oklahoma , poll

5/25-6/8/10; 503 likely voters, 4.4% margin of error
325 likely Republican primary voters, 5.4% margin of error
318 likely Democratic primary voters, 5.5% margin of error
Mode: Live telephone interviews
(General, Republican primary, Democratic primary)


2010 Governor: Republican Primary
59% Fallin, 10% Brogdon, 2% Jackson, 1% Hubbard

2010 Governor: Democratic Primary
37% Edmondson, 36% Askins

2010 Governor: General Election
50% Fallin (R), 35% Edmondson (D)
49% Fallin (R), 36% Askins (D)
41% Edmondson (D), 40% Brogdon (R)
44% Askins (D), 36% Brogdon (R)

AR: 2010 Gov (Rasmussen 6/15)

Topics: Arkansas , poll

6/15/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Governor
57% Beebe (D), 33% Keet (R)

Favorable / Unfavorable
Jim Keet: 45 / 30
Mike Beebe: 72 / 25

MN: 2010 Gov (SurveyUSA 6/14-16)

Topics: Minnesota , poll

6/14-16/10; 1617 likely voters, 2.5% margin of error
500 likely DFL primary voters, 4.5% margin of error
Mode: Automated phone
(SurveyUSA release)


2010 Governor: DFL Primary
39% Dayton, 26% Kelliher, 22% Entenza

2010 Governor: General Election
35% Emmer (R), 33% Kelliher (D), 12% Horner (I)
38% Dayton (D), 35% Emmer (R), 12% Horner (I)
37% Emmer (R), 33% Entenza (D), 12% Horner (I)

TX: 2010 Gov (Rasmussen 6/16)

Topics: poll , Texas

6/16/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Governor
48% Perry (R), 40% White (D)

Favorable / Unfavorable
Rick Perry: 53 / 45
Bill White: 55 / 36

Job Approval / Disapproval
Pres. Obama: 40 / 60
Gov. Perry: 53 / 45

AAPOR Adds Transparency Initiative Endorsements

Topics: AAPOR , AAPOR Transparency Initiative , Disclosure , Magellan Data & Mapping , Peter Miller , Quinnipiac University Poll

And speaking of AAPOR's Transparency Initiative, the organization announced via email yesterday the names of 11 more survey organizations that recently pledged their support for the evolving program:

  • Elon University Poll
  • The Elway Poll
  • Magellan Data and Mapping Strategies
  • Monmouth University Polling Unit
  • Muhlenberg College Institute of Public Opinion
  • NORC
  • Public Policy Institute of California
  • Quinnipiac University Poll
  • University of Arkansas at Little Rock Survey Research Center
  • University of Wisconsin Survey Center
  • Western New England College Polling Institute

Among the new names, the two most notable for regular readers are probably Quinnipiac University, the pollster active in many important races in 2010, and Magellan Data and Mapping Strategies, a relatively new firm that has released mostly automated pre-election polls in recent months. When newcomers like Magellan choose to endorse the Transparency Initiative, it's apparent that the "carrot" that past AAPOR President Peter Miller hopes to offer participating pollsters as an incentive is beginning to work.

The new names bring the total number of participants to 44. When the initiative launches in about a year, participating pollsters will routinely release essential facts about their methodology and deposit their information in a public data archive. AAPOR has also posted an update on their work-in-progress on the initiative. I wrote about it in more detail here.

More on presidential approval in midterm elections

Topics: Barack Obama , midterm , presidential approval

Two important points of followup on Tuesday's post about how Matt Bai overhyped President Obama's approval rating as "ominous" for Democrats:

1. First, as Emory's Alan Abramowitz correctly pointed out in an email to me, "Seat exposure and the midterm dummy variable predict substantial Democratic losses regardless of what happens to either the generic ballot or Obama approval." See, for instance, Abramowitz's statistical model of House seat swings, which predicts a 38 seat loss for Democrats on the basis of those two factors alone.

2. Rahm Emanuel seems to have bought into the hype about the president's approval rating as the overriding factor in midterms. In Bai's article, he is paraphrased as follows:

For every point that Obama's approval rating dips below 50 percent, Emanuel said, there are probably four or five more House districts that will swing into the Republican column, and vice versa.

These results are not corroborated by the statistics. Controlling for other factors, Abramowitz's model predicts that a one point decrease in Obama's net approval (approval-disapproval) is associated with a .22 seat shift toward Republicans. (He indicates by email that the coefficient for raw approval is less than 0.5.) Even the simple slope in a bivariate plot is far less than 4-5 seats per point of approval. Unless there's some massive non-linearity around 50% approval, Emanuel's estimate is off by an order of magnitude.
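The size of the gap between the two estimates is easy to make concrete. The sketch below uses the 0.22 seats-per-point coefficient and the 4-5 seats-per-point claim quoted above; the 10-point approval drop is a hypothetical scenario chosen only for illustration, not a figure from either source.

```python
# Back-of-the-envelope comparison of the two seat-swing estimates.
# Both per-point figures come from the post; the 10-point drop in net
# approval is a hypothetical scenario used only to show the scale gap.

ABRAMOWITZ_SEATS_PER_POINT = 0.22  # model coefficient (net approval)
EMANUEL_SEATS_PER_POINT = 4.5      # midpoint of Emanuel's 4-5 claim

def implied_seat_shift(approval_drop, seats_per_point):
    """Seats implied to shift toward Republicans for a given approval drop."""
    return approval_drop * seats_per_point

drop = 10  # hypothetical 10-point decline in Obama's net approval
model_estimate = implied_seat_shift(drop, ABRAMOWITZ_SEATS_PER_POINT)
emanuel_estimate = implied_seat_shift(drop, EMANUEL_SEATS_PER_POINT)

print(model_estimate)    # about 2 seats under the Abramowitz coefficient
print(emanuel_estimate)  # about 45 seats under Emanuel's rule of thumb
```

Even under this generous linear reading, Emanuel's rule of thumb implies roughly twenty times the seat swing of the model coefficient, which is what "off by an order of magnitude" means in practice.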

[Cross-posted to brendan-nyhan.com]

Decennial Visitors 'Outliers'

Topics: Outliers Feature

Alan Abramowitz thinks Republicans can take back the House.

Tom Jensen advises Obama that campaigning for candidates may do more harm than good.

Lymari Morales says oil companies and the federal government both have public opinion holes to dig out of.

Rich Lowry feels (sort of) sorry for President Obama.

Jon Cohen notices a change in Mexicans' opinion of the US after passage of Arizona's immigration law.

Junk Charts plugs the scatter-plot matrix.

A town in Maine has a bit of trouble keeping track of when to hold elections.

The Onion dispenses advice on making the most of your census-taking visitors.

MI: 2010 Gov Primaries (IMP-PPC 6/2-6)

Topics: Michigan , poll

Inside Michigan Politics / Practical Political Consulting
6/2, 6/6/10; 303 likely Republican primary voters
188 likely Democratic primary voters
Mode: Automated phone
(Detroit Free Press article)


2010 Governor: Republican Primary
21% Hoekstra, 15% Snyder, 10% Cox, 10% Bouchard, 1% George

2010 Governor: Democratic Primary
14% Dillon, 10% Bernero

US: Immigration (ABC/Post 6/3-6)

Topics: National , poll

ABC News / Washington Post
6/3-6/10; 1,004 adults, 3.5% margin of error
Mode: Live telephone interviews
(ABC: story, results; Post: story, results)


Obama Job Approval on Immigration
39% Approve, 51% Disapprove

On another subject, do you think the United States is or is not doing enough to keep illegal immigrants from coming into this country?
23% Doing enough, 75% Not doing enough

Would you support or oppose a program giving ILLEGAL immigrants now living in the United States the right to live here LEGALLY if they pay a fine and meet other requirements?
57% Support, 40% Oppose

A new law in Arizona would give police the power to ask people they've stopped to verify their residency status. Supporters say this will help crack down on illegal immigration. Opponents say it could violate civil rights and lead to racial profiling. On balance, do you support or oppose this law?
58% Support, 41% Oppose

NC-2: 2010 House (Civitas/SurveyUSA 6/15-16)

Topics: North Carolina , poll

Civitas Institute / SurveyUSA
6/15-16/10; 400 registered voters, 4.8% margin of error
Mode: Automated phone
(Civitas release)

North Carolina 2nd Congressional District

2010 House
39% Ellmers (R), 38% Etheridge (D), 13% Rose (L)

IA: 54% Grassley, 37% Conlin (Rasmussen 6/14)

Topics: Iowa , poll

6/14/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Senate
54% Grassley (R), 37% Conlin (D) (chart)

US: National Survey, Oil Spill (CNN 6/16)

Topics: National , poll

CNN / Opinion Research Corporation
6/16/10; 534 adults, 4% margin of error
Mode: Live telephone interviews
(CNN release)


Obama Job Approval
50% Approve, 48% Disapprove (chart)

Do you approve or disapprove of the way that _____ has handled the oil spill off the coast of Louisiana in the Gulf of Mexico?
Barack Obama: 41/ 59
The oil company called BP: 13 / 87
The federal government in general: 25 / 74

Do you think President Obama has been too tough, about right, or not tough enough in dealing with BP in regards to the oil spill?
5% Too tough, 26% About right, 67% Not tough enough

Who do you trust more to improve the situation in the Gulf of Mexico -- BP, or the federal government?
32% BP, 54% Federal government

Global Attitudes (Pew 4/7-5/8)

Topics: Global , poll

Pew Global Attitudes Project
4/7-5/8/10 (dates varied by country)
700-3,262 adults/country
Modes: Telephone, Face-to-Face
(Pew release)


"As the global economy begins to rebound from the great recession, people around the world remain deeply concerned with the way things are going in their countries. Less than a third of the publics in most nations say they are satisfied with national conditions, as overwhelming numbers say their economies are in bad shape. And just about everywhere, governments are faulted for the way they are dealing with the economy.

Yet in most countries, especially in wealthier nations, President Barack Obama gets an enthusiastic thumbs up for the way he has handled the world economic crisis. The notable exception is the United States itself, where as many disapprove of their president's approach to the global recession as approve.

This pattern is indicative of the broader picture of global opinion in 2010. President Barack Obama remains popular in most parts of the world, although his job approval rating in the U.S. has declined sharply since he first took office. In turn, opinions of the U.S., which improved markedly in 2009 in response to Obama's new presidency, also have remained far more positive than they were for much of George W. Bush's tenure."

US: Generic Ballot (USA Today / Gallup 6/11-13)

Topics: National , poll

USA Today / Gallup
6/11-13/10; 1,014 adults, 4% margin of error
Mode: Live telephone interviews
(USA Today story)


2010 Congress: Generic Ballot
48% Democrat, 43% Republican (chart)

Rasmussen Profile in Washington Post

Topics: AAPOR Transparency Initiative , Automated polls , Disclosure , IVR Polls , Jason Horowitz , Rasmussen , Scott Rasmussen , Washington Post

Today's Washington Post Style Section features a lengthy Jason Horowitz profile of Scott Rasmussen, the pollster whose automated surveys have "become a driving force in American politics." Horowitz visited Rasmussen's New Jersey office -- he leads with the "fun fact" that Rasmussen "works above a paranormal bookstore crowded with Ouija boards and psychics on the Jersey Shore" -- and talked to a wide array of pollsters about Rasmussen including Scott Keeter, Jay Leve, Doug Rivers, Mark Penn, Ed Goeas and yours truly. It's today's must read for polling junkies.

It's also apparent from the piece that Rasmussen won't be joining AAPOR's Transparency Initiative any time soon:

Rasmussen said he didn't take the criticism personally, but he grew visibly annoyed when asked why he didn't make his data -- especially the percentage of people who responded to his firm's calls -- more transparent.

"If I really believed for a moment that if we played by the rules of AAPOR or somebody else they would embrace us as part of the club, we would probably do that," he said, his voice taking on an edge. "But, number one, we don't care about being part of the club."

With due respect, AAPOR's goal in promoting transparency is not to get anyone to join a club (and yes, interests disclosed, I'm an AAPOR member), or even to enforce certain methodological "rules"; it's about whether your work can "stand by the light of day," as ABC's Gary Langer put it recently.

And speaking of methodological rules, I want to add a little context to Horowitz' quote from me:

"The firm manages to violate nearly everything I was taught what a good survey should do," said Mark Blumenthal, a pollster at the National Journal and a founder of Pollster.com. He put Rasmussen in the category of pollsters whose aim, first and foremost, is "to get their results talked about on cable news."

The quotation is consistent with an argument I made last summer in a piece titled "Can I Trust This Poll," which explained how pollsters like Rasmussen are challenging the rules I was taught:

A new breed of pollsters has come to the fore, however, that routinely breaks some or all of these rules. None exemplifies the trend better than Scott Rasmussen and the surveys he publishes at RasmussenReports.com. Now I want to be clear: I single out Rasmussen Reports here not to condemn their methods but to make a point about the current state of "best practices" of the polling profession, especially as perceived by those who follow and depend on survey data.


If you had described Rasmussen's methods to me at the dawn of my career, I probably would have dismissed it the way my friend Michael Traugott, a University of Michigan professor and former AAPOR president, did nine years ago. "Until there is more information about their methods and a longer track record to evaluate their results," he wrote, "we shouldn't confuse the work they do with scientific surveys, and it shouldn't be called polling."

But that was then.

In the piece, I go on to review the findings of Traugott and AAPOR's report on primary polling in 2008, as well as Nate Silver's work in 2008, both of which found automated polling to be at least as accurate as more conventional surveys in predicting the outcome in 2008.

The spirit of "that was then" is also evident in quotations at the end of the Horowitz profile that remind us that automated polling depends on people's willingness to answer landline telephones and is barred by federal law from calling respondents on their cell phones:

"When you were growing up, you screamed, 'I got it, I got it,' and raced your sister to the telephone," said Jay Leve, who runs SurveyUSA, a Rasmussen competitor that uses similar automated technology. "Today, nobody wants to get the phone."

Leve thinks telephone polling, and the whole concept of "barging in" on a voter, is kaput. Instead, polls will soon appear in small windows on computer or television screens, and respondents will reply at their leisure. To Doug Rivers, the U.S. chief executive of YouGov, a U.K.-based online polling company that is building a vast panel of online survey takers, debating the merits of Rasmussen's method seemed "a little odd given we're in 2010."

Again, I'm doing the full profile little justice -- please go read it all.

NJ: Approval Ratings (Rasmussen 6/14)

Topics: New Jersey , poll

6/14/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Menendez, Christie)

New Jersey

Job Approval / Disapproval
Gov. Christie: 51 / 45 (chart)
Sen. Menendez: 50 / 43 (chart)
Pres. Obama: 51 / 49 (chart)

If an election were held today to recall Senator Menendez from office would you vote to recall him or to let him continue serving as a United States senator?
39% Vote to recall
39% Let him continue serving

AR: 61% Boozman, 32% Lincoln (Rasmussen 6/15)

Topics: Arkansas , poll

6/15/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Senate
61% Boozman, 32% Lincoln (chart)

Favorable / Unfavorable
Blanche Lincoln: 32 / 65
John Boozman: 65 / 26

Job Approval / Disapproval
Pres. Obama: 38 / 61
Gov. Beebe: 76 / 21

NJ: Approval Ratings (Quinnipiac 6/10-15)

Topics: New Jersey , poll

6/10-15/10; 1,461 registered voters, 2.6% margin of error
Mode: Live telephone interviews
(Quinnipiac release)

New Jersey

Job Approval / Disapproval
Gov. Christie: 44 / 43 (chart)
Sen. Lautenberg: 40 / 47 (chart)
Sen. Menendez: 38 / 43 (chart)
Pres. Obama: 50 / 46 (chart)

Favorable / Unfavorable
Chris Christie: 42 / 40 (chart)

Obama approval: No oil spill effect

Topics: Barack Obama , Jimmy Carter , presidential approval

In the wake of President Obama's speech to the nation about BP and the Gulf last night, it's worth noting that his approval ratings have not been affected by the spill so far. The speech is unlikely to have a significant effect either.

I'm laying down a marker on these two points because of the likelihood that a post hoc narrative will be created in which the Gulf spill and/or the speech played a major role in Democratic losses in the 2010 midterms or a subsequent Obama defeat in 2012. This is precisely what happened after Jimmy Carter's so-called "malaise" speech, which is frequently used to "explain" Carter's subsequent defeat in 1980. Here, for example, is a random Omaha World Herald editorial from 1995 I found in Nexis:

A few commentators seized on Clinton's rumination and compared it to Jimmy Carter's 1979 "malaise" speech. Carter's call for America to shake itself out of a national malaise and regain its self-confidence was perceived by some people as a whiny effort by a president searching for something to blame for his low ratings in the polls. It's believed that in appearing to rebuke the public, Carter alienated voters, contributing to his election defeat in 1980.

In reality, Carter's approval ratings after the speech, while low, were generally stable until the Iranian hostage crisis. His defeat can easily be explained by the state of the economy, which was terrible:


Remember, dramatizing an event (e.g., Carter's defeat) is not the same thing as explaining it. It's not clear that the malaise speech had a significant effect on Carter's fortunes, and so far there's no evidence that the oil spill or last night's speech have had a significant effect on Obama's.

Update 6/17 3:49 PM: See also Greg Marx's take at CJR.

[Cross-posted to brendan-nyhan.com]

Interesting Email 'Outliers'

Topics: Outliers Feature

Mark Mellman discusses how the American public gets in the way of budget cuts.

Steve Singiser wonders how to interpret widely varying generic ballot numbers; Chris Bowers says not to sweat individual polls.

David Hill expounds on the importance of redistricting.

Jennifer Agiesta notices that fewer new voters supported Obama in the Gulf states than in the rest of the country.

Alex Lundry makes an important point about political technology.

Forbes debuts an interactive map showing where Americans moved (and moved from) in 2008 (via Sullivan).

Tom Jensen gets fascinating email.

IL: 2010 Gov (PPP 6/12-13)

Topics: Illinois , poll

Public Policy Polling (D)
6/12-13/10; 552 likely voters, 4.2% margin of error
Mode: Automated phone
(PPP release)


2010 Governor
34% Brady (R), 30% Quinn (D), 9% Whitney (G) (chart)

Job Approval / Disapproval
Gov. Quinn: 27 / 50

Favorable / Unfavorable
Bill Brady: 22 / 22
Rich Whitney: 5 / 15

MI: 2010 Gov Primary (Magellan 6/8-9)

Topics: Michigan , poll

Magellan Strategies (R)
6/8-9/10; 742 likely Republican primary voters, 3.6% margin of error
Mode: Automated phone
(Magellan release)


2010 Governor: Republican Primary
26% Hoekstra, 20% Snyder, 16% Cox, 11% Bouchard, 2% George

WA: 2010 Sen (Elway 6/9-13)

Topics: poll , Washington

Elway Research
6/9-13/10; 405 registered voters, 5% margin of error
Mode: Live telephone interviews
(Elway release)


2010 Senate
47% Murray (D), 40% Rossi (R) (chart)
47% Murray (D), 33% Akers (R) (chart)
46% Murray (D), 32% Didier (R) (chart)

IA: 2010 Gov (Rasmussen 6/14)

Topics: Iowa , poll

6/14/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Governor
57% Branstad (R), 31% Culver (D)

Favorable / Unfavorable
Chet Culver: 41 / 55
Terry Branstad: 61 / 34

Job Approval / Disapproval
Pres. Obama: 50 / 48
Gov. Culver: 41 / 58

CO: 2010 Gov (Rasmussen 6/14)

Topics: Colorado , poll

6/14/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)


2010 Governor
46% McInnis (R), 41% Hickenlooper (D) (chart)
41% Maes (R), 41% Hickenlooper (D)

Favorable / Unfavorable
Scott McInnis: 54 / 33
John Hickenlooper: 53 / 42
Dan Maes: 45 / 25

Job Approval / Disapproval
Pres. Obama: 48 / 53 (chart)
Gov. Ritter: 41 / 57 (chart)

Matt Bai: Wrong on presidential approval

Topics: Barack Obama , presidential approval

In the New York Times Magazine, Matt Bai suggests it is "an ominous sign, historically speaking, for a majority party" when "the president's own approval ratings fell below 50 percent":

[President Obama] continued to go out and shake his head disbelievingly at "the culture of Washington," which to the Democrats in the House sounded as if he were saying that his own party was the problem, as if somehow the Democratic majorities in Congress hadn't managed to navigate the bulk of his ambitious agenda past a blockade of Republican vessels, their ship shredded by cannon fire. And all this while the president's own approval ratings fell below 50 percent -- an ominous sign, historically speaking, for a majority party...

Just about every strategist of either party in Washington will tell you that the best indicator of whether the voters are growing less skeptical -- and, thus, of whether Democrats can survive the November elections intact -- can be found in the president's approval rating. There is a political theorem that illustrates this, supported by data from past elections and often repeated by Democrats now, and it goes like this: If the president's approval rating is over 50 percent in the fall, then his party will suffer only moderately. If his rating is under 50 percent, however, then the pounding at the polls is likely to be a memorable one.

I'm not sure why Bai thinks Obama's approval numbers are so ominous. Using USA Today's presidential approval tracker, I made this chart showing approval ratings to this point in each of the last seven presidencies:


Obama's approval trajectory (in purple) is tightly clustered with five of the last seven presidents. Only two of those seven -- George H.W. Bush and George W. Bush -- had significantly higher approval ratings at this point, and neither is an especially compelling counter-example: Bush 43's approval ratings were artificially inflated by 9/11, and Bush 41 was not re-elected. It's not clear that there's anything ominous about Obama's standing at this point.

If Bai is instead referring to the fortunes of the president's party in midterm elections under unified government, then there are only three relevant first-term examples in the contemporary era: Carter (1977-1978), Clinton (1993-1994), and Bush 43 (2001-June 2002). Of those, Democrats suffered moderate damage in 1978 with Carter around 50 percent; the Republicans won a landslide victory in 1994 with Clinton in the mid-40s; and Republicans picked up seats in 2002 when Bush's approval ratings were still extremely high.

Finally, if Bai is referring to midterm elections more generally, I'm not sure what makes 50 percent so magical. The president's approval ratings are an important factor, as this Nate Silver graph shows, but it's not clear that it matters whether Obama is slightly below 50 percent or not -- he's likely to lose seats either way (as most presidents do):


In reality, other factors such as slow jobs growth and the generic ballot are far more ominous for Democrats than Obama's approval rating.

[Note: The text of this post was revised.]

[Cross-posted to brendan-nyhan.com]

SD: 2010 Gov, House (Rasmussen 6/10)

Topics: poll , South Dakota

6/10/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen: Governor, House)

South Dakota

2010 Governor
52% Daugaard (R), 36% Heidepriem (D)

2010 House
53% Noem (R), 41% Herseth-Sandlin (D)

Instant Replay 'Outliers'

Topics: Outliers Feature

Glen Bolger writes that Obama's economic message is failing in key congressional districts; Ken Rudin analyzes the NPR congressional districts poll; Jim Geraghty, Chris Good and Nate Silver add more.

Frank Newport gives Obama some public-opinion driven advice on his speech tonight.

Brian Beutler asks whether Republicans have already lost their chance to take the Senate; Stu Rothenberg thinks a smaller Democratic majority is likely.

Rasmussen finds that most Michigan voters want Armando Galarraga awarded a perfect game, and support increased use of instant replay in baseball.

US: National Survey (AP-GfK 6/9-14)

Topics: National+ , poll

6/9-14/10; 1,044 adults, 4.3% margin of error
Mode: Live telephone interviews
(AP-GfK release)


State of the Country
37% Right Direction, 60% Wrong Track (chart)

Obama Job Approval
50% Approve, 49% Disapprove (chart)
Economy: 45 / 50 (chart)
Health care: 49 / 46 (chart)

Overall, please tell me whether you approve, disapprove or neither approve nor disapprove of the way ______ is handling the oil spill that was caused by the rig it was operating in the Gulf of Mexico.

Barack Obama: 45 / 52
BP: 15 / 83

Congressional Job Approval
24% Approve, 73% Disapprove (chart)
Dems in Congress: 38 / 60
Reps in Congress: 32 / 65

Do you want to see the Republicans or Democrats win control of Congress?
39% Republican, 46% Democrats

Would you like to see your own member of Congress get re-elected in November, or would you like to see someone else win the election?

37% Own member, 55% Someone else

Do you favor, oppose, or neither favor nor oppose increasing drilling for oil and gas in coastal areas around the United States?

45% Favor, 41% Oppose, 13% Neither

Which is more important to you as you think about increasing drilling for oil and gas in coastal areas around the United States?

50% The need for the U.S. to provide its own sources of energy
49% The need to protect the environment

Party ID
34% Democrat, 24% Republican, 27% independent, 15% Don't know (chart)

FL: 2010 Gov Rep Primary (McCollum 6/7-8)

Topics: Florida , poll

McLaughlin & Associates (R) for Bill McCollum
6/7-8/10; 600 likely Republican primary voters, 4% margin of error
Mode: Live telephone interviews
(McLaughlin release)


2010 Governor: Republican Primary
40% McCollum, 40% Scott

LA: 2010 Sen, Oil Spill (PPP 6/12-13)

Topics: Louisiana , poll

Public Policy Polling (D)
6/12-13/10; 492 likely voters, 4.4% margin of error
Mode: Automated phone
(CQ article, PPP release)


2010 Senate
46% Vitter, 37% Melancon (chart)

Favorable / Unfavorable
Charlie Melancon: 29 / 34

Job Approval / Disapproval
Pres. Obama: 37 / 57 (chart)
Gov. Jindal: 63 / 31 (chart)
Sen. Landrieu: 39 / 51 (chart)
Sen. Vitter: 45 / 43 (chart)

Do you approve or disapprove of how ______ has handled the aftermath of the oil spill?
Gov. Jindal: 65 / 25
Pres. Obama: 32 / 62

Do you support or oppose drilling for oil off the shore of Louisiana?
77% Support, 12% Oppose

IL: 31% Giannoulias, 30% Kirk (PPP 6/12-13)

Public Policy Polling (D)
6/12-13/10; 552 likely voters, 4.2% margin of error
Mode: Automated phone
(PPP release)


2010 Senate
31% Giannoulias (D), 30% Kirk (R), 14% Jones (G) (chart)

Favorable / Unfavorable
Alexi Giannoulias: 23 / 31
Mark Kirk: 23 / 31
LeAlan Jones: 2 / 13

Job Approval / Disapproval
Pres. Obama: 53 / 41
Sen. Burris: 19 / 54
Sen. Durbin: 49 / 36

US: Battleground Districts (NPR 6/7-10)

Topics: National , poll

NPR / Greenberg Quinlan Rosner (D) / Public Opinion Strategies (R)
6/7-10/10; 1,200 likely voters in the Congressional Battleground, designated as the 60 most competitive Democratic districts (divided into two tiers of 30 districts each) and the 10 most competitive Republican districts
Mode: Live telephone interviews
(NPR story, GQR release)



Bolger says the NPR poll has more evidence of a trend that's been apparent all year: Republican-leaning voters are energized, while the intensity seems to have leached out of the Democratic ranks.

"When you look at the generic ballot for Congress in the Democrat-held seats, the Republican is up by 5 [points]. But among those who rate their interest as 8 to 10, you know, the high-interest voters, the Republican leads in those Democratic seats 53 to 39.

"And what that means is that is in a close election, the Republican enthusiasm will put Republicans over the top, just like in '06 and '08, the Democratic enthusiasm put the Democrats over the top."


The results are a wake-up call for Democrats, whose losses in the House could well exceed 30 seats. In the named-congressional ballot in the 60 Democratic districts, Democrats trail their Republican opponents, 42 to 47 percent, with only a third saying they want to re-elect their member. In the top tier of 30 most competitive seats, the Democratic candidate trails by 9 points (39 to 48 percent), and by 2 points in the next tier of 30 seats (45 to 47 percent). On the other hand, the Republican candidates are running well ahead in their most competitive seats (53 to 37 percent). As we saw in the special election in PA-12, Democrats will have to battle on a seat-by-seat basis; that is what has shifted these kinds of numbers this year.

CA: 2010 Sen, Gov (CrossTarget 6/13)

Topics: California , poll

CrossTarget (R) / Pajamas Media (R)
6/13/10; 600 likely voters, 4% margin of error
Mode: Automated phone
(Pajamas Media)


2010 Governor
46% Brown (D), 43% Whitman (R) (chart)

2010 Senate
47% Fiorina (R), 47% Boxer (D) (chart)

US: Generic Ballot (Gallup, Rasmussen 6/7-13)

Topics: National , poll


6/7-13/10; 1,600 registered voters, 3% margin of error
Mode: Live telephone interviews
(Gallup release)

2010 Congress: Generic Ballot
49% Republican, 44% Democrat (chart)

6/7-13/10; 3,500 likely voters, 2% margin of error
Mode: Automated phone
(Rasmussen release)

2010 Congress: Generic Ballot
46% Republican, 36% Democrat (chart)

US: 2012 Primary (PPP 6/4-7)

Topics: National , poll

Public Policy Polling (D)
6/4-7/10; 401 likely Republican primary voters, 4.9% margin of error
Mode: Automated phone
(PPP release)


2012 President: Republican Primary
25% Romney, 22% Huckabee, 19% Palin, 15% Gingrich, 6% Paul

US: Energy Policy (National Journal/Pew 6/10-13)

Topics: National , poll

National Journal / Pew Research Congressional Connection Poll
6/10-13/10; 1,010 adults, 4% margin of error
Mode: Live telephone interviews
(National Journal release, Pew release)


Right now, which ONE of the following do you think should be the more important priority for U.S. energy policy:
37% Keeping energy prices low
56% Protecting the environment

When it comes to the U.S. policy about offshore oil and gas drilling in U.S. waters, do you think the government should:
31% Expand offshore drilling
35% Continue existing offshore drilling, but ban new drilling
22% Ban all offshore drilling

Favorable / Unfavorable
Barack Obama: 56 / 39 (chart)
Michelle Obama: 69 / 22
Sarah Palin: 39 / 52 (chart)
Nancy Pelosi: 27 / 50
John Boehner: 12 / 22

Party ID
33% Democrat, 26% Republican, 34% independent (chart)

LA: Oil Spill, 2010 Sen (Magellan 6/10,13)

Topics: Louisiana , poll

Magellan Strategies (R)
6/10, 6/13/10; 1,030 likely voters, 3.1% margin of error
Mode: Automated phone
(Magellan release)


2010 Senate
51% Vitter (R), 31% Melancon (D) (chart)

Do you approve or disapprove of the way ______ has handled the Gulf oil spill?

Pres. Obama: 31 / 60
Gov. Jindal: 66 / 21

Do you favor or oppose the expansion of offshore drilling?
72% Favor, 15% Oppose

Do you favor or oppose the 6-month moratorium that the Obama administration has just placed on the expansion of offshore drilling?

21% Favor, 64% Oppose

SC: 2010 Sen (Rasmussen 6/10)

Topics: poll , South Carolina

6/10/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)

South Carolina

2010 Senate
58% DeMint (R), 21% Greene (D)

Favorable / Unfavorable
Jim DeMint: 62 / 25
Alvin Greene: 20 / 51

Yost & Borick: The Silver Standard

Topics: AAPOR Transparency Initiative , Berwood Yost , Chris Borick , Disclosure , Franklin and Marshall College , Muhlenberg College , Nate Silver , Poll Accuracy

This guest pollster contribution comes from Berwood Yost, director of the Floyd Institute for Public Policy at Franklin and Marshall College, and Christopher Borick, director of the Muhlenberg College Polling Institute.

Nate Silver's compilation of performance data for election polling in the United States and his ratings of polling organizations should be applauded for increasing the ability of the public to judge the accuracy of the ever increasing number of pre-election polls. Helping the public determine the relative effectiveness of polls in predicting election outcomes can be compared to Consumer Reports equipping individuals with information about which products meet minimum standards for quality. As with the work of Consumer Reports, Mr. Silver is explicit in his methodology and provides substantial justification for the assumptions he adopts in his calculations. But as is the case in the construction of any measure, there are some reasonable questions that can be raised about what was included in those calculations. One such question has to do with the "affiliation bonus."

Silver's decision to include an "affiliation bonus" for pollsters that are either in the NCPP or have joined AAPOR's Transparency Initiative has significant consequences for his final ratings. Table 1 provides two pollster-introduced error (PIE) estimates for a sub-group of academic polling organizations, one that uses the calculation for all telephone pollsters and the other that uses the calculation for those pollsters who receive the "affiliation bonus." We chose this group because all of the organizations, regardless of their affiliation with NCPP or the AAPOR Transparency initiative, consistently release full descriptions of their methodology and provide detailed breakdowns of their results. The scores highlighted in yellow are those reported for each pollster on Silver's site. As Table 1 shows, the rankings are substantially different depending on whether a firm receives the "affiliation bonus."

[Editor's note: Chris Borick informs us that Muhlenberg College has signed on to the AAPOR Transparency Initiative, but did so after June 1, so it was not classified as a participant in Silver's ratings. Berwood Yost tells us that Franklin and Marshall intends to sign on but has not done so yet.]


As part of his rating methods, Mr. Silver chooses to discount the "raw scores" for polls despite noting that those scores are the most "direct measure of a pollster's performance." His primary justification for discounting the raw scores is that his project aims "not to evaluate how accurate a pollster has been in the past--but rather, to anticipate how accurate it will be going forward" (taken from Silver's methodological discussion). Those who read his rankings should take care to understand the distinction Silver is making between past performance and expected future performance. We are not sure why the scores based on past performance are inferior to PIE, and he does not make a sufficiently strong case for the very heavy discount he applies to those scores in his calculations. It would be valuable to see more evidence about what makes PIE a better indicator of polling performance. The "affiliation bonus" may indeed be correlated with the performance of polls, but is it actually the affiliations that lead to better performance, or is some other unmeasured variable at work? Silver's calculations show that the "affiliation bonus" explains only three percent of the variance in his regression equation and has a p-value greater than .05. One may ask whether that is sufficient evidence to grant some pollsters such a strong advantage.
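To make the mechanism at issue concrete (this is not Silver's actual formula, which is considerably more involved), here is a minimal sketch of how a flat affiliation credit can reorder otherwise similar error scores. The firm names, raw scores, and the 0.5-point bonus are all hypothetical.

```python
# Minimal sketch: a flat "affiliation bonus" applied to hypothetical
# pollster error scores (lower is better). Only the mechanism -- crediting
# affiliated firms with a fixed error reduction -- mirrors the adjustment
# discussed above; every number here is made up for illustration.

HYPOTHETICAL_BONUS = 0.5  # error credit granted to affiliated firms

pollsters = [
    # (name, raw error score, affiliated with NCPP / Transparency Initiative)
    ("Firm A", 1.8, True),
    ("Firm B", 1.6, False),
    ("Firm C", 1.5, False),
]

def adjusted_score(raw, affiliated, bonus=HYPOTHETICAL_BONUS):
    """Subtract the flat bonus from affiliated firms' error scores."""
    return raw - bonus if affiliated else raw

ranked = sorted(pollsters, key=lambda p: adjusted_score(p[1], p[2]))
print([name for name, *_ in ranked])  # Firm A moves from worst to best
```

Firm A has the worst raw score yet ranks first after the credit, which is exactly the kind of reordering the table above shows for the academic polling organizations.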

In closing we would once again like to applaud Mr. Silver for taking on the important task of applying solid methods to the evaluation of pollster accuracy. The public needs such efforts in order to more effectively sift through the avalanche of polls that greet them every election season. Our intention is simply to note that the scores produced by Silver should be evaluated in terms of both their strengths and limitations.

Oil Spill Pie Charts Suck 'Outliers'

Topics: Outliers Feature

John Harwood sees little impact from the oil spill on Obama approval (with an assist from Charles Franklin).

Joel Benenson frames climate change legislation as a political winner (via Smith)

Pete Brodnitz had a good night last week.

Mike Huckabee asks pollsters to include him on 2012 trial heat questions.

Marco Rubio is not worried about polls.

PPP invites you to vote on where they poll this weekend.

Lymari Morales counts the many ways Gallup posts updates of Obama job approval.

Bob Groves reports that the Census non-response follow-up is "about 93% complete ... somewhat ahead of schedule and certainly under-budget."

Junk Charts says the BP oil spill brings out the worst in pie charts.

Transparency and Pollster Ratings: Update

Topics: Clifford Young , Disclosure , Gary Langer , Joel David Bloom , Nate Silver , poll accuracy , Taegan Goddard

[Update: On Friday night, I linked to my column for this week, which appeared earlier than usual. It covers the controversy over Nate Silver's pollster ratings, and an exchange last week between Silver, Political Wire's Taegan Goddard and Research 2000's Del Ali over the transparency in the FiveThirtyEight pollster ratings. In linking to the column I also posted additional details on the polls that Ali claimed Silver had missed and promised more on the subject of transparency that I did not have a chance to include in the column. That discussion follows below.]

Although my column discusses issues of transparency of the database Nate Silver created to rate pollster accuracy, it did not address transparency with regard to the details of the statistical models used to generate the ratings.

When Taegan Goddard challenged the transparency of the ratings, Silver shot back that the transparency is "here in an article that contains 4,807 words and 18 footnotes," and explains "literally every detail of how the pollster ratings are calculated."

Granted, Nate goes into great detail describing how his rating system works, but several pollsters and academics I talked to last week wanted to see more details of the model and the statistical output in order to better evaluate whether the ratings perform as advertised.

For example, Joel David Bloom, a survey researcher at the University at Albany who has done a similar regression analysis of pollster accuracy, said he "would need to see the full regression table" for Silver's initial model that produces the "raw scores," a table that would include the standard error and level of significance for each coefficient (or score). He also says he "would like to see the results of statistical tests showing whether the addition of large blocks of variables (e.g., all the pollster variables, or all the election-specific variables) added significantly to the model's explanatory power."

Similarly, Clifford Young, pollster and senior vice president at IPSOS Public Affairs, said that in order to evaluate Silver's scores, he would "need to see the fit of the model and whether the model violates or respects the underlying assumptions of the model," and more specifically, "what's the equation, what are all the variables, are they significant or aren't they significant."

I should stress that no one quoted above doubts Silver's motives or questions the integrity of his work. They are, however, trying to understand and assess his methods.

I emailed Silver and asked about both estimates of the statistical uncertainty associated with his error scores and about not providing more complete statistical output. On the "margin of error" of the accuracy scores, he wrote:

Estimating the errors on the PIE [pollster-introduced error] terms is not quite as straightforward as it might seem, but the standard errors generally seem to be on the order of +/- .2, so the 95% confidence intervals would be on the order of +/- .4. We can say with a fair amount of confidence that the pollsters at the top dozen or so positions in the chart are skilled, and the bottom dozen or so are unskilled i.e. "bad". Beyond that, I don't think people should be sweating every detail down to the tenth-of-a-point level.
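Silver's arithmetic here follows the standard normal approximation: a 95% confidence interval extends roughly 1.96 standard errors in each direction, so a standard error of about 0.2 yields an interval of about +/- 0.4. A minimal sketch of that calculation:

```python
def ci95_half_width(se):
    """Half-width of a 95% confidence interval under a normal approximation."""
    return 1.96 * se

print(ci95_half_width(0.2))  # 0.392, i.e. roughly the +/- .4 Silver describes
```

One practical consequence: with intervals that wide, only pollsters near the very top or bottom of the ratings can be confidently distinguished from the middle of the pack, which is essentially Silver's own caveat.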

In a future post, I'm hoping to discuss the ratings themselves and whether it is appropriate to interpret differences in the scores as indicative of "skill" (short version: I'm dubious). Today's post, however, is about transparency. Here is what Silver had to say about not providing full statistical output:

Keep in mind that we're a commercial site with a fairly wide audience. I don't know that we're going to be in the habit of publishing our raw regression output. If people really want to pick things apart, I'd be much more inclined to appoint a couple of people to vet or referee the model like a Bob Erikson. I'm sure that there are things that can be improved and we have a history of treating everything that we do as an ongoing work-in-progress. With that said, a lot of the reason that we're able to turn out the volume of academic-quality work that we do is probably because (ironically) we're not in academia, and that allows us to avoid a certain amount of debates over methodological esoterica, in which my view very little value tends to be added.

To be clear, no one I talked to is urging FiveThirtyEight to start regularly publishing raw regression output. Even in this case, I can understand why Silver would not want to clutter up his already lengthy discussion with the output of a model featuring literally hundreds of independent variables. However, a link to an appendix in the form of a PDF file would have added no clutter.

I'm also not sure I understand why this particular scoring system requires a hand-picked referee or vetting committee. We are not talking about issues of national security or executive privilege.

That said, the pollster ratings are not the fodder of a typical blog post. Many in the worlds of journalism and polling are taking these ratings very seriously. They have already played a major role in getting one pollster fired. Soon these ratings will appear under the imprimatur of the New York Times. So, with due respect, these ratings deserve a higher degree of transparency than FiveThirtyEight's typical work.

Perhaps Silver sees his models as proprietary and prefers to shield the details from the prying eyes of potential competitors (like, say, us). Such an urge would be understandable but, as Taegan Goddard pointed out last week, also ironic. Silver's scoring system gives bonus accuracy points to pollsters "that have made a public commitment to disclosure and transparency" through membership in the National Council on Public Polls (NCPP) or through commitment to the Transparency Initiative launched this month by the American Association for Public Opinion Research (AAPOR), because, he says, his data shows that those firms produce more accurate results.

The irony is that Silver's reluctance to share details of his models may stem from some of the same instincts that have made many pollsters, including AAPOR members, reluctant to disclose more about their methods or even to support the Transparency Initiative itself. Those are the very instincts AAPOR's leadership hopes the Initiative will change.

Last month, AAPOR's annual conference included a plenary session that discussed the Initiative (I was one of six speakers on the panel). The very last audience comment came from a pollster who said he conducts surveys for a small midwestern newspaper. "I do not see what the issue is," he said, referring to the reluctance of his colleagues to disclose more about their work "other than the mere fact that maybe we're just so afraid that our work will be scrutinized." He recalled an episode where he had been ready to disclose methodological data to someone who had emailed with a request but was stopped by the newspaper's editors who were fearful "that somebody would find something to be critical of and embarrass the newspaper."

Gary Langer, the director of polling at ABC News, replied to the comment. His response is a good place to conclude this post:

You're either going to be criticized for your disclosure or you're going to be criticized for not disclosing, so you might as well be on the right side of it and be criticized for disclosure. Our work, if we do it with integrity and care, will and can stand the light of day, and we speak well of ourselves, of our own work and of our own efforts by undertaking the disclosure we are discussing tonight.

SC: 2010 Gov (Rasmussen 6/10)

Topics: Governor , poll , South Carolina

6/10/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)

South Carolina

2010 Governor
46% Barrett (R), 38% Sheheen (D)
55% Haley (R), 34% Sheheen (D)

Favorable / Unfavorable
Gresham Barrett: 50 / 28
Nikki Haley: 63 / 26
Vincent Sheheen: 46 / 36
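The margin of error reported in listings like this one follows from the standard formula for a simple random sample: at 95% confidence and maximum variance (p = 0.5), the half-width is 1.96 * sqrt(p(1-p)/n). A sketch for n = 500, the sample size used here:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error, in percentage points, for a simple random sample."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(500), 1))  # 4.4 points, close to the reported 4.5%
```

The small gap between the computed 4.4 points and the reported 4.5% is typical; pollsters' published figures may reflect rounding conventions or design-effect adjustments not captured by the simple-random-sample formula.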

MI: 2010 Gov (Rasmussen 6/9)

Topics: Governor , Michigan , Poll

6/9/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)

Michigan
2010 Governor
Hoekstra (R) 40%, Dillon (D) 35%
Cox (R) 39%, Dillon (D) 37%
Snyder (R) 41%, Dillon (D) 33%
Hoekstra (R) 39%, Bernero (D) 36%
Cox (R) 40%, Bernero (D) 34%
Snyder (R) 42%, Bernero (D) 30%

Favorable / Unfavorable
Mike Cox: 48 / 40
Virg Bernero: 37 / 31
Peter Hoekstra: 45 / 33
Andy Dillon: 37 / 34
Rick Snyder: 47 / 28

Job Approve/Disapprove
Gov. Granholm: 41 / 59
Pres. Obama: 49 / 51

NY: 2010 Gov, Sen (Siena 6/7-9)

Topics: Governor , New York , Poll , Senate

6/7-9/10; 808 registered voters, 3.4% margin of error
Mode: Live telephone interviews
(Siena: crosstabs, toplines)

New York

2010 Governor: General Election
60% Cuomo, 24% Lazio (chart)
65% Cuomo, 23% Paladino

2010 Senate (B): General Election
48% Gillibrand (D), 27% Blakeman (R) (chart)
47% Gillibrand (D), 29% DioGuardi (R)
49% Gillibrand (D), 24% Malpass (R)

2010 Senate (A): General Election
59% Schumer (D), 27% Berntsen (R)
60% Schumer (D), 26% Townsend (R)

2010 Senate (A): Democratic Primary
78% Schumer (D), 11% Credico (D)

2010 Governor: Republican Primary
45% Lazio, 18% Paladino

2010 Senate (B): Republican Primary
21% DioGuardi, 7% Blakeman, 3% Malpass

2010 Senate (A): Republican Primary
20% Townsend, 15% Berntsen

Favorable / Unfavorable
David Paterson: 31 / 56 (chart)
Andrew Cuomo: 59 / 26
Rick Lazio: 31 / 28
Carl Paladino: 16 / 17
Kirsten Gillibrand: 36 / 27 (chart)
Bruce Blakeman: 8 / 11
David Malpass: 8 / 11
Joe DioGuardi: 14 / 11
Charles Schumer: 54 / 32 (chart)
Gary Berntsen: 6 / 12
Jay Townsend: 14 / 12

ME: 2010 Gov (Rasmussen 6/10)

Topics: Governor , maine , poll

6/10/10; 500 likely voters, 4.5% margin of error
Mode: Automated phone
(Rasmussen release)

Maine
2010 Governor
43% LePage (R), 36% Mitchell (D), 7% Cutler (I)

Favorable / Unfavorable
Paul LePage: 59 / 25
Libby Mitchell: 50 / 40
Eliot Cutler: 25 / 29

Job Approval / Disapproval
Pres. Obama: 48 / 51
Gov. Baldacci: 41 / 59

US: Generic Ballot (PPP 6/4-7)

Topics: Generic House Vote , Poll

Public Policy Polling (D)
6/4-7/10; 650 registered voters, 3.9% margin of error
Mode: Automated phone
(PPP release)


2010 Congress: Generic Ballot
43% Democrat, 41% Republican (chart)

Do you approve or disapprove of the job of _____________
Congressional Democrats: 35% Approve, 54% Disapprove
Congressional Republicans: 20% Approve, 62% Disapprove