Earlier today, Fox News released five new polls measuring voter preferences in the Senate races in Florida, Nevada, Pennsylvania, Ohio and California. The Fox News story says the polls are conducted by Pulse Opinion Research. We will tackle the results in another article, but for now political junkies may be wondering, what is Pulse Opinion Research?
The answer (as reported earlier today by Political Wire) is that Pulse is a "field service" spun off of Rasmussen Reports that conducts its well-known automated, recorded-voice surveys. It also conducts polls for other clients including, as of today, Fox News. While the questions asked on specific surveys may differ, the underlying methodology used by Fox/Pulse and Rasmussen is essentially identical.
Earlier this year, Rasmussen launched a new website for Pulse that, as he explained to Tim Mak of the Frum Forum, allows anyone to "go to the [Pulse] website, type in their credit card number, and run any poll that they wanted, with any language that they want... In effect, you will be able to do your own poll, and Rasmussen will provide the platform to ensure that the polling includes a representative national sample." According to the Pulse web site, basic election surveys start at $1,500 for a sample of 500 state or local respondents.
Scott Rasmussen confirms, via email, that surveys conducted by Pulse for Fox News and for Rasmussen Reports are essentially equivalent in terms of their calling, sampling and weighting procedures:
Pulse Opinion Research does all the field work and processing for Rasmussen Reports polling. They do the same for other clients using the system that I developed over many years. So, in practical terms, polling done by Pulse for any client, including Fox News, will be processed in exactly the same manner. In a Rasmussen Reports poll, Rasmussen Reports provides the questions to Pulse. In a Fox News poll, Fox News provides the questions for their own surveys.
Both will use the same targets for weighting, including weights applied for partisan identification:
The process for selecting Likely Voter targets is based upon partisan trends identified nationally (and reported monthly). In an oversimplified example, if the national trends move one point in favor of the Democrats, the targets for state samples will do the same. As Election Day draws near, the targets are also based upon specific results from all polling done in that state. In competitive states, Pulse can draw upon a large number of interviews to help estimate the partisan mix.
For Election 2010, the net impact is that the samples are typically a few points more favorable to the Republicans than they were in Election 2008. Also, most of the time, the number of unaffiliated voters is a bit lower than in 2008. The samples also show a lower share of minority voters and younger voters.
One positive aspect of the new Fox News/Pulse surveys is that Fox is making demographic cross-tabulations freely available (example here) that Rasmussen Reports keeps behind a subscription wall. And Fox is going a step further, adding weighted sample sizes for each subgroup (something Rasmussen does not currently make available even to subscribers). So if you want to see the demographic composition, you can use the weighted counts to calculate the percentages.
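Since the weighted counts are published for each subgroup, recovering the demographic composition really is simple arithmetic. A minimal sketch (the age-group counts below are hypothetical, not taken from any actual Fox/Pulse release):

```python
# Illustrative sketch (hypothetical numbers): recovering each subgroup's
# share of the sample from published weighted counts.
weighted_counts = {"18-34": 95, "35-49": 130, "50-64": 150, "65+": 125}

total = sum(weighted_counts.values())
composition = {group: round(100 * n / total, 1)
               for group, n in weighted_counts.items()}
print(composition)  # each subgroup's percent of the weighted sample
```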
On the other hand, this development may well double the number of polls conducted with the Rasmussen methodology in some races going forward. For example, the Fox/Pulse surveys were conducted on Saturday, September 11 and included samples in Nevada and Ohio. Today, Rasmussen Reports released two additional surveys conducted in Nevada and Ohio on Monday, September 13. Rasmussen, again via email, confirms that his Rasmussen Reports polling schedule is "entirely independent" of anything Fox or Pulse does. He adds:
Our plans were laid out long ago, with the only variable being which races remain the closest as Election Day approaches. For example, we don't expect to poll Connecticut as often as California. But, if the CT race gets closer (as possibly suggested by Quinnipiac), we will poll it more frequently. Same thought process holds true for West Virginia.
As it is, Rasmussen surveys have grown far more numerous and dominant this election cycle than in 2008. Pollster.com has already tracked 237 Rasmussen Reports surveys on the 2010 elections for U.S. Senate, almost double the number at this point for U.S. Senate races in 2008 (120). While the total number of surveys fielded by all pollsters has also increased, Rasmussen's share of these polls has grown significantly, from 35% of all Senate polls in 2008 to 49% so far this cycle.
Rasmussen is the only pollster active in about a half dozen less competitive contests and has fielded three out of four polls in states that have been only marginally competitive, like Indiana and Delaware.
The growing predominance of Rasmussen's surveys so far this cycle has consequences for all who follow and track polling data, including our efforts to track and chart polls at Pollster and Huffington Post. This is another story that we will focus on in the weeks ahead.
Markos Moulitsas, who first fired his former polling partner Research 2000 in June and subsequently filed suit alleging that polling conducted by that firm was fraudulent, announced this afternoon that his DailyKos website will soon resume polling with two new partners: Public Policy Polling (PPP) for "horserace" polling in statewide contests and another pollster to be named later for national surveys. The first survey, to be fielded in Delaware, will be released next week.
Moulitsas was also quick to tweet what will amount to a new standard in polling disclosure:
And while we won't be able to do it next week, both pollsters have agreed to RELEASE ALL RAW DATA. We just have to figure out the logistics.
Access to raw data will mean that anyone with basic statistical software can run their own tabulations or analysis. While many national media pollsters provide such raw data to academic archives like the Roper Center for Public Opinion Research, and the Pew Research Center provides it on its website, those resources typically become available many months after the results are released.
Tom Jensen, PPP's polling director, provided this reaction for The Huffington Post:
We're very excited for the opportunity, especially because Daily Kos has shown such a strong interest in surveying 'under polled' races over the years. We're looking forward to getting data out there in states like Delaware, where we're kicking off, that don't usually see a lot of public polling. We're also glad in a time when a few bad apples have cast a shade over the polling industry to let people see that we're really doing our work. We really appreciate Markos' commitment to transparency and are happy to partner with him on that.
Markos posts more details to Daily Kos, including news that they "hope to be back up to pre-scandal polling frequencies by September" as well as this comment:
I'm so excited about all of this I can barely contain myself. While the R2K mess has been a nightmare, it has opened up new possibilities -- the ability to work with some of the most accomplished pollsters in the biz, and break new ground by providing unparalleled transparency.
His post also notes that PPP was one of only two pollsters from a short list of those he considers most accurate that were available to do "horserace" polling. He explains that automated pollster SurveyUSA "couldn't do horserace polling because of exclusivity contracts with other media organizations." Could that be a clue to the identity of the yet-to-be-announced pollster that will conduct non-horserace "weekly State of the Nation national polling" for DailyKos?
Polling critics are quick to point out when polls get it wrong, so let us point out an instance of the polls getting it right. In Tuesday night's Georgia Republican gubernatorial primary, Karen Handel and Nathan Deal placed first and second, advancing to an August 10th runoff. The Pollster.com aggregate did a good job of catching the rise of both these candidates and the fall of one-time frontrunner John Oxendine, and the individual polls were pretty good too.
In the final seven days before the primary, five polls came out, and all five had Handel finishing in first place. Four of the five correctly predicted that Handel and Deal would end up in the runoff. Three of the five forecasted Handel first and Deal second. The final two polls taken accurately projected the order of finish from first through seventh. Most impressively, the final poll (conducted by Insider Advantage) came within three percentage points of the actual margins between each of the top four candidates. Why is all of this impressive to me?
First, the only poll (Mason-Dixon) that incorrectly put John Oxendine in the runoff was also the only live-interview poll in the field during the final week. Many news organizations still refuse to report results from automated phone polls (dismissively referred to as "robopolls") because of fears that a five-year-old (or even a dog) could theoretically be interviewed. All of the "less reliable" automated polls got the runoff participants correct, but the supposedly "scientific" poll did not.
Second, southern and gubernatorial primaries have been shown to be the toughest statewide contests to poll. The fact that the race was in flux in the final weeks, and that the polls were still able to pick up a clear trend toward Handel and Deal and away from Oxendine, is especially striking with this past performance in mind.
Third, the pollsters that were furthest off the mark were the ones with the best historic track records. Mason-Dixon and Rasmussen have been two of the top ten pollsters over the past twelve years, but both missed the correct order of the top two finishers. It is important to point out, however, that Oxendine's percentage (and Deal's, in Rasmussen's poll) was just outside the margin of error once undecideds are allocated among those who had registered an opinion.
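For readers curious about the arithmetic behind that third point, here is a hedged sketch with invented vote shares (not the actual Georgia numbers): undecideds are allocated proportionally among respondents expressing an opinion, and a 95% margin of error is computed for the adjusted share.

```python
import math

# Hypothetical sketch: allocate undecideds proportionally among decided
# respondents, then compute a 95% margin of error for a sample of n.
n = 500
raw = {"Handel": 0.29, "Deal": 0.24, "Oxendine": 0.22, "Johnson": 0.12}

decided = sum(raw.values())                     # 0.87 expressed an opinion
allocated = {c: p / decided for c, p in raw.items()}

def moe(p, n, z=1.96):
    """95% margin of error for a single proportion."""
    return z * math.sqrt(p * (1 - p) / n)

print({c: round(100 * p, 1) for c, p in allocated.items()})
print(round(100 * moe(allocated["Handel"], n), 1))  # in percentage points
```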
Fourth, the most accurate poll was conducted by Insider Advantage, which surprised me because of their prior lack of transparency and poor performance. Earlier this year, I was involved in a tussle with Insider Advantage. Why? I simply wanted to find out whether their Insider Advantage / Florida Sun-Times poll used live interviewers or automated phone calls, for an article I was working on about the Florida Republican Senatorial primary. Insider Advantage did not clearly state either way on their website. I repeatedly tried to contact Insider Advantage's CEO Matt Towery for the article, but got no response. Florida Sun-Times' political reporter David Hunt, who had written articles based on the poll, believed the poll used live interviewers. Only after Mark Blumenthal sent an inquiry did I find out that Insider Advantage employs an automated phone technique. In addition, Nate Silver found them to be the second-worst pollster with at least ten polls conducted over the past twelve years. Last night, none of that seemed to matter, as Insider Advantage nailed the placement of all seven candidates and the margins between the top four Republican candidates for governor.
Overall, it was a good, if odd, night for polling, where up was down and down was up.
Today's Washington Post Style Section features a lengthy Jason Horowitz profile of Scott Rasmussen, the pollster whose automated surveys have "become a driving force in American politics." Horowitz visited Rasmussen's New Jersey office -- he leads with the "fun fact" that Rasmussen "works above a paranormal bookstore crowded with Ouija boards and psychics on the Jersey Shore" -- and talked to a wide array of pollsters about Rasmussen including Scott Keeter, Jay Leve, Doug Rivers, Mark Penn, Ed Goeas and yours truly. It's today's must read for polling junkies.
Rasmussen said he didn't take the criticism personally, but he grew visibly annoyed when asked why he didn't make his data -- especially the percentage of people who responded to his firm's calls -- more transparent.
"If I really believed for a moment that if we played by the rules of AAPOR or somebody else they would embrace us as part of the club, we would probably do that," he said, his voice taking on an edge. "But, number one, we don't care about being part of the club."
With due respect, AAPOR's goal in promoting transparency is not about getting anyone to join a club (and yes, interests disclosed, I'm an AAPOR member) or even about following certain methodological "rules"; it's about whether your work can "stand by the light of day," as ABC's Gary Langer put it recently.
And speaking of methodological rules, I want to add a little context to Horowitz' quote from me:
"The firm manages to violate nearly everything I was taught [about] what a good survey should do," said Mark Blumenthal, a pollster at the National Journal and a founder of Pollster.com. He put Rasmussen in the category of pollsters whose aim, first and foremost, is "to get their results talked about on cable news."
The quotation is consistent with an argument I made last summer in a piece titled "Can I Trust This Poll," which explained how pollsters like Rasmussen are challenging the rules I was taught:
A new breed of pollsters has come to the fore, however, that routinely breaks some or all of these rules. None exemplifies the trend better than Scott Rasmussen and the surveys he publishes at RasmussenReports.com. Now I want to be clear: I single out Rasmussen Reports here not to condemn their methods but to make a point about the current state of "best practices" of the polling profession, especially as perceived by those who follow and depend on survey data.
If you had described Rasmussen's methods to me at the dawn of my career, I probably would have dismissed them the way my friend Michael Traugott, a University of Michigan professor and former AAPOR president, did nine years ago. "Until there is more information about their methods and a longer track record to evaluate their results," he wrote, "we shouldn't confuse the work they do with scientific surveys, and it shouldn't be called polling."
But that was then.
In the piece, I go on to review the findings of Traugott and AAPOR's report on primary polling in 2008, as well as Nate Silver's work in 2008, both of which found automated polling to be at least as accurate as more conventional surveys in predicting the outcome in 2008.
The spirit of "that was then" is also evident in quotations at the end of the Horowitz profile that remind us that automated polling depends on people's willingness to answer landline telephones and is barred by federal law from calling respondents on their cell phones:
"When you were growing up, you screamed, 'I got it, I got it,' and raced your sister to the telephone," said Jay Leve, who runs SurveyUSA, a Rasmussen competitor that uses similar automated technology. "Today, nobody wants to get the phone."
Leve thinks telephone polling, and the whole concept of "barging in" on a voter, is kaput. Instead, polls will soon appear in small windows on computer or television screens and respondents will reply at their leisure. To Doug Rivers, the U.S. chief executive of YouGov, a U.K.-based online polling company that is building a vast panel of online survey takers, debating the merits of Rasmussen's method seemed "a little odd given we're in 2010."
Again, I'm doing the full profile little justice -- please go read it all.
I am going to state the obvious: Rasmussen has had more favorable horse race numbers for Republicans this cycle than other pollsters. Why? It could have to do with its likely voter screen, its interactive voice response technology, or perhaps, as some have suggested, a more sinister motive to "shape the debate." But something funny has happened in the past four days: Rasmussen's numbers have come back in line in three states.
Indeed, Rasmussen's latest releases seem to show a Democratic surge. In Kentucky, Jack Conway has, in less than two weeks, cut Rand Paul's post-primary lead from 25 points (on the prior Rasmussen poll) to 9. In Connecticut, Dick Blumenthal has seen his lead over Linda McMahon grow from 3 points to 23 in Rasmussen's polling over the same span. Finally, in Missouri, the 5+ point lead Roy Blunt has held in every Rasmussen poll since February has dropped to a statistically insignificant single point.
In the case of the first two contests, the shifts are exaggerated because the previous Rasmussen poll was fielded in one night, less than 24 hours after two very important events: the first New York Times story on Blumenthal's war record on May 17, and Rand Paul's Kentucky primary victory on May 18. However, a Greenberg Quinlan Rosner poll showed a 15-point Blumenthal lead over almost the identical field period as Rasmussen's 3-point Blumenthal advantage. And while the Rasmussen polls in Connecticut and Kentucky taken before these outliers were not as "out there," there was still a Republican house effect in those states and in Missouri.
Interestingly, these friendlier Democratic numbers remind me of something that happened during the 2008 election. My friend David Shor, with whom I am currently working on projects, documented that Rasmussen had a "summer" reversal of its Republican house effect in 2008. Looking at all polling from presidential, senatorial, gubernatorial, and House races, David found that "Rasmussen polls have a statistically significant Pro-Republican house-effect that appears during primary season in the beginning of the year, disappears during the summer, and then very rapidly appears right before the Republican National Convention". His chart catalogs this well, as seen by the higher p-values in the summer months (150 to 70 days out). For the less mathematically inclined, the p-value indicates the probability that a difference as large as the one observed would occur by chance alone. So p-values closer to 0 indicate a very high likelihood that the difference between Rasmussen and other pollsters was real, while a p-value closer to 1 means that any differences are most likely due to random chance.
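To make that p-value intuition concrete, here is a minimal sketch (with invented numbers, not Shor's data) of a two-proportion z-test of the sort that underlies a house-effect comparison: does one pollster's Republican share differ from everyone else's by more than sampling noise would explain?

```python
import math

# Hedged sketch (made-up numbers): two-sided test of whether two sample
# proportions differ beyond what chance alone would produce.
def two_prop_pvalue(p1, n1, p2, n2):
    """Two-sided p-value for the difference between two sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Hypothetical: one pollster shows the Republican at 48% (n=500),
# while other polls pooled show 44% (n=2000).
p = two_prop_pvalue(0.48, 500, 0.44, 2000)
print(round(p, 3))
```

With identical proportions the function returns a p-value of 1, and it shrinks toward 0 as the gap between the pollsters grows relative to sample size.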
It is too early to know whether we will see a similar shift in other Rasmussen polling. These are only three polls in three contests and for all I know the next round of polls will show other pollsters moving towards Rasmussen.
My column for this week examines a minor oddity some have noted elsewhere: Why has automated pollster Rasmussen Reports done less final-weekend pre-election polling so far in 2010 than at this point in 2008? Please click through and read the whole thing.
This morning, Nate Silver flagged a pretty glaring difference between two similarly worded and structured questions asked on surveys conducted by Rasmussen Reports and CBS News at roughly the same time. In so doing, he's highlighted a critical question at the heart of an important debate about not just Rasmussen's automated polls, but about all surveys that compromise aspects of their methods: Are the respondents to these surveys skewed to the most attentive, interested Americans? Do Rasmussen's samples skew, to use Nate's phrase, to "political junkies?"
Here is his chart:
CBS News found just 11% of adults who say they are "very closely" following "news about the appointment of U.S. Solicitor General Elena Kagan to the U.S. Supreme Court" in a survey fielded May 20-24. Rasmussen found 37% who say they are "very closely" following "news stories about President Obama's nominee for the Supreme Court" in an automated survey fielded from May 24-25. The answer categories were identical.
In addition to the minor wording differences, the big potential confounding factor is that Rasmussen screened for "likely voters," while CBS interviewed all adults. Nate does some hypothetical extrapolating and speculates that likely voter model alone cannot account for all the difference.
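The shape of that extrapolation can be sketched quickly. Assume, generously, that every adult following the story "very closely" makes it into the likely voter pool; the likely-voter figure can then exceed the all-adult figure by at most a factor of one over the likely-voter share of adults (the shares tried below are assumptions, not Rasmussen's actual screen):

```python
# Bounding sketch with assumed likely-voter shares: even the most generous
# assumption cannot lift CBS's 11% (all adults) to Rasmussen's 37% (likely voters).
cbs_adults = 0.11      # CBS: following "very closely," all adults
rasmussen_lv = 0.37    # Rasmussen: following "very closely," likely voters

ceilings = {share: cbs_adults / share for share in (0.4, 0.5, 0.6)}
for share, ceiling in ceilings.items():
    print(f"LV share {share:.0%}: ceiling {ceiling:.1%}, "
          f"explains the gap: {ceiling >= rasmussen_lv}")
```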
Whether you find that speculation convincing or not, the theory that more politically interested people "self select" into automated surveys is both logical and important. GWU Political Scientist John Sides put it succinctly in a blog post last year about an automated poll by PPP:
A reasonable question, then, is whether this small self-selected sample is -- even with sample weighting -- skewed towards the kind of politically engaged citizens who are more likely to think and act as partisan[s] or ideologues.
It is difficult to answer that question definitively, especially about Rasmussen's surveys, in a way that is based on hard empirical evidence rather than informed speculation. The reason is that the difference in mode (automated or live interviewer) is typically confounded by equally significant differences in question wording (examples here and here), or by Rasmussen's use of likely voter filtering when other polls sample all adults. The Kagan example is helpful because the question wording is much closer, but the likely voter confound remains.
I have long argued that Rasmussen could help resolve some of this uncertainty by being more transparent about their likely voter samples, which dominate their releases to a far greater degree than almost any other media pollster. What questions do they use to select likely voters? What percentage of adults does Rasmussen's likely voter universe represent? What is the demographic composition of their likely voter sample, by age, gender, race, income? That sort of information is withheld even from Rasmussen's subscribers.
They could also start reporting more results among both likely voters and all adults. They call everyone, and they would incur virtually zero marginal expense in keeping all respondents on the phone for a few additional questions.
Back to Silver's post. He includes some extended discussion of the differences in methodology that might explain why political junkies would be more prone to self-select into Rasmussen's surveys than into those done by CBS. I have to smile a little because I outlined the same issues in a presentation at the Netroots Nation conference last August, on a panel that happened to include Silver. I've embedded the video of that presentation below. It's well worth watching if you want more details on the stark methodological differences between Rasmussen and CBS News (my presentation begins at about the 52:00 mark; I review much of the same material in the first part of my "Can I Trust This Poll" series).
Finally, I want to follow-up on two of Nate's comments. Here's the first:
I've never received a call from Rasmussen, but from anecdotal accounts, they do indeed identify themselves as being from Rasmussen and I'm sure that the respondent catches on very quickly to the fact that it's a political poll. I'd assume that someone who is in fact interested in politics are significantly more likely to complete the poll than someone who isn't.
I emailed Rasmussen's communications director and she confirmed that they do indeed identify Rasmussen Reports as the pollster at the beginning of their surveys.
But here's the catch: So does CBS. I also emailed CBS News polling director Sarah Dutton and she confirms that their scripted introduction introduces their surveys as being conducted by CBS News or (on joint projects) by CBS and the New York Times. According to Dutton, they "also tell respondents they can see the poll's results on the CBS Evening News with Katie Couric, or read them in the NY Times."
Second, he argues that CBS will "call throughout the course of the day, not just between 5 and 9, which happens to be the peak time for news programming." I checked with Dutton, and that's technically true: CBS does call throughout the day, up until 10 p.m. in the time zone of the respondent. However, they schedule most of their interviewers to work in the evenings. As such, most of their calling occurs during evening hours because, as Dutton puts it, "that's when people are at home."
More important, CBS makes at least 4 dials to each selected phone number over a period of 3 to 5 days, and they make sure to call back during both evening and daytime hours. The idea is to improve the odds of catching the full random sample at home. That said, if a pollster does not do callbacks -- and Rasmussen does not -- it's probably better to restrict their calling to the early evening because, again, that's when people are at home.
But I don't want to get bogged down in the minutiae. The question of whether automated surveys have a bias toward interested and informed respondents is big and important, especially when we move beyond horse race polling to surveys on more general topics. I'm sure Nate Silver will have more to say on it. So will we.
Is Joe Sestak closing the gap between Arlen Specter and himself through television ads, or was the race always close?
About a month ago, three polls conducted on the 2010 Pennsylvania Democratic Senatorial primary were released within a week of each other, and they showed vastly different results. Quinnipiac had Specter leading Sestak by 21%, Susquehanna by 14%, and Rasmussen by 2%. This past Saturday I tweeted, "IVR polling was much more accurate in 04 Rep primary with Specter v. Toomey... Does this mean Ras[mussen] is right?" The tweet spoke to something I noticed two weeks ago when looking back at polling on the 2004 Republican Senatorial primary in Pennsylvania. In that contest, then-Republican Arlen Specter narrowly escaped a strong challenge from now-Republican-nominee Pat Toomey. In the final month of that contest, automated phone polls conducted by SurveyUSA at times showed a much closer race than live-interviewer polls done by Quinnipiac and Franklin and Marshall.
Three weeks to a month before the 2004 primary, Quinnipiac had Specter up by 15% and Franklin and Marshall by a nearly identical 13%, but SurveyUSA had him up by only 6%. Then 10 days before the primary, all the pollsters agreed that Toomey was only 5-6% behind. On the eve of the primary, Quinnipiac gave Specter a 6% lead, but SurveyUSA saw a 48-48% tie. On election day, Specter won by only 1.5%.
Today, as we stand two weeks before the 2010 Democratic Senatorial primary, a new live-interviewer poll from Muhlenberg College paints a much closer race than previous live-interviewer polls. As in the live-interviewer polls taken around this point in 2004, Specter holds a 6% lead over his opponent. This result is close to the Rasmussen poll released about three weeks ago.
Why was automated phone polling right in 2004, and why does it look to be onto something in 2010? It could have to do with the tighter likely voter screens automated phone polls usually use. Base voters (who favored the challenge from Toomey in 2004 and favor Sestak in 2010) are the voters most likely to actually vote. That is why both the Keystone and SurveyUSA polls found Toomey performing 5+% better in 2004 when stricter voter screens were applied. It should also be noted that base voters are more firmly committed to their candidate, something even the Quinnipiac poll that gave Specter a 21% lead over Sestak showed.
The bottom line: I think this election is probably going to be a close one, and Specter had better hope for a healthy turnout.
I received the following email from InsiderAdvantage CEO Matt Towery in response to today's column:
Mark, I take criticism now constructively and we will do more to make clear we use IVR. Out of a sense of equal fairness would you share with your readers that PPP, I assume using a phone room, had virtually the same numbers in Crist-Rubio one day before we released ours? Transparency should flow both ways, don't you agree? All my best Matt
And for the record, PPP also uses an automated methodology, not a "phone room," though Towery's observation raises a fair point about PPP: Their blog posts and releases rarely disclose that their surveys use an automated methodology, although their embrace of that technology is no mystery. If nothing else, their web site's mission page makes it crystal clear.
My National Journal column for this week looks at the failure of one pollster, Insider Advantage, to disclose whether it uses live interviewers or an automated method in its reports, and the resulting consequences.
I have more to add on this topic -- please check back later today.
Update: I posted thoughts on how we plan to do better at holding pollsters to minimal standards of disclosure here at Pollster.com.
Update II: A response from InsiderAdvantage CEO Matt Towery.
This certainly seems like a banner week for blogging about pollster Scott Rasmussen, as I count at least three entry-worthy topics on the automated polltaker: (1) the flurry of commentary surrounding the piece by Politico's Alex Isenstadt on attacks from the left on Rasmussen's credibility; (2) Monday's report by the liberal blog Think Progress showing that Rasmussen was paid $140,500 by the 2004 Bush campaign for survey research, and the good question that raises about whether sites like ours should label Rasmussen as "[R]" for Republican; and (3) yesterday's new Rasmussen survey of likely voters in this month's special election in Massachusetts.
As commenting on all three at once exceeds both my time and mental bandwidth, I'm going to start with the third and most timely topic, but I will come back to the other two later this week.
Rasmussen's Massachusetts survey, consisting of 500 automated interviews of Massachusetts likely voters conducted in just one day (Monday), shows Democrat Martha Coakley leading Republican Scott Brown by just nine percentage points (50% to 41%). That's a surprisingly narrow lead in a heavily Democratic state that Barack Obama carried by about 26 percentage points (61% to 36%). Even in 1994, a banner year for Republicans, Ted Kennedy defeated Mitt Romney in an unusually competitive reelection contest by 17 points (58% to 41%).
Nate Silver looked at the Rasmussen results by party and extrapolated that the survey consisted of 52% Democrats, 21% Republicans and 27% in the independent/other category:
Although there are lots of different ways to ask about party identification, typically that's not what we see in elections in the Bay State, as the number of independents is usually much higher (43 percent of Massachusetts voters were independent/other in 2008, and 51 percent are registered as independents). They're also showing an electorate that is 39 percent liberal, 34 percent conservative, and 27 percent moderate; that compares to 2008 exit poll demographics of 31 percent liberal, 19 percent conservative, and 49 percent moderate.
So Rasmussen's theory on this election, basically, is that the people in the middle won't bother to show up; there are many fewer independents and many fewer moderates in their sample than you usually get in Massachusetts. Instead, it will be a race between the bases.
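An extrapolation of the kind Silver did can be recovered from the toplines and crosstabs with straightforward algebra: each candidate's overall support is a weighted average of his or her support within each party group, and the weights must sum to one, which leaves a small linear system. A sketch with assumed crosstab values (these are illustrative numbers, not the actual Rasmussen crosstabs):

```python
# Solve for the party mix (D, R, I shares) implied by overall toplines and
# assumed within-party support. Substituting w_ind = 1 - w_dem - w_rep
# leaves a 2x2 system, solved here by Cramer's rule.
coakley = {"D": 0.79, "R": 0.07, "I": 0.28}   # assumed crosstab values
brown   = {"D": 0.14, "R": 0.89, "I": 0.55}   # assumed crosstab values
coakley_total, brown_total = 0.50, 0.41        # published toplines

a, b = coakley["D"] - coakley["I"], coakley["R"] - coakley["I"]
c, d = brown["D"] - brown["I"],     brown["R"] - brown["I"]
e, f = coakley_total - coakley["I"], brown_total - brown["I"]

det = a * d - b * c
w_dem = (e * d - b * f) / det
w_rep = (a * f - e * c) / det
w_ind = 1 - w_dem - w_rep
print(round(w_dem, 2), round(w_rep, 2), round(w_ind, 2))
```

With these assumed crosstabs the implied mix lands close to the 52/21/27 split Silver extrapolated.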
My first reaction is that while there are indeed different ways to ask about party identification (and even more ways to ask about self-reported ideology), it's a bad idea to compare official party registration statistics (that tally how many voters check the box for Democrat or Republican when they register to vote) with survey questions about party identification (typically: "when it comes to politics, do you think of yourself as a Democrat, a Republican or an independent?"). Depending on the state, the two measures can produce very different sets of numbers.
Back in October, Monmouth University pollster Patrick Murray explained why officially "unaffiliated" voters in New Jersey are very different from the "independents" identified by pollsters. I suspected that Massachusetts, with its very high percentage of non-partisan registrants, might produce similar differences, so I emailed Andrew Smith, director of the University of New Hampshire Survey Center who frequently conducts surveys of Massachusetts for the Boston Globe. Here is his take:
[It's] important to point out that a high percentage of the registered Independents in MA (they're actually called unenrolled) are really either Democrats (36%) or Republicans (34%) when you look at PARTY ID. (We use the Univ of Michigan question and I recode leaners into the partisan buckets). Calling them "Independents" makes it look like there is a large pool of free-thinkers out there up for grabs, which is simply not the case ... not in MA, not in NH (a regular media story during the NH Primary is about the large number of Independents up for grabs, a story which sounds good, but has no basis in fact!), not anywhere!
Those who are registered Unenrolled in MA are less interested in elections and less likely to vote than are registered Republicans or Democrats. This phenomenon is consistent across the US (see "The American Voter Revisited" for the most recent in a long line of studies making this point). It's my sense that the 2010 MA special election will have low turnout and the percentage of voters who are registered as either Democrat or Republican will be higher than the percentage registered as such among all MA adults.
Let me explain that a little more slowly. The Boston Globe/UNH poll routinely asks respondents about both their party registration and their party identification. Early in their interview they ask: "Are you registered to vote as a Democrat, Independent, Republican or something else?" On their September survey, according to data Smith provided, 36% of all registrants said they were registered as Democrats, 14% said they were Republicans and the rest (50%) reported they were unenrolled.
Then at the end of their survey they ask: "GENERALLY SPEAKING, do you usually think of yourself as a Republican, a Democrat, an Independent or what?" To those who initially identify as independent, identify with another party or offer no preference they ask a follow-up: "Do you think of yourself as closer to the Republican or to the Democratic party?"
When Smith combined the initial identifiers with leaners in September, he found 50% were Democrats, 32% were Republicans and only 19% remained independent or without a preference. And a cross-tabulation of the UNH party registration and party identification questions shows that more than two thirds of the unenrolled voters identify or lean to either the Democratic (36%) or Republican (34%) parties.
So unaffiliated/independent was 50% on the party registration question, but only 19% on party identification (with leaners allocated). So again, it is a mistake to expect the party identification results produced by a poll to match the party registration statistics produced by the Secretary of State (and an even bigger mistake to weight results of a poll to match those statistics, but that's a topic for another day).
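The two-step procedure described above (an initial identification question, then a "leaner" follow-up for anyone who does not pick a major party) can be sketched in a few lines. This is a hypothetical illustration of the collapsing logic only; the toy sample below is invented and does not reproduce UNH's actual microdata.

```python
# Sketch of allocating "leaners" into a three-way party ID measure.
# allocate_leaners() is a hypothetical helper, not UNH's code.

def allocate_leaners(initial, leaner):
    """Collapse two answers into a three-way party ID.

    initial -- answer to "do you usually think of yourself as...":
               'Dem', 'Rep', 'Ind', or 'Other'
    leaner  -- follow-up "closer to the Republican or Democratic party?":
               'Dem', 'Rep', or None (no preference)
    """
    if initial in ('Dem', 'Rep'):
        return initial          # initial identifiers keep their party
    if leaner in ('Dem', 'Rep'):
        return leaner           # leaners are folded into the parties
    return 'Ind'                # true independents / no preference

# A toy sample: most initial "independents" lean one way or the other.
sample = [
    ('Dem', None), ('Dem', None), ('Rep', None),
    ('Ind', 'Dem'), ('Ind', 'Rep'), ('Ind', None),
]
counts = {}
for initial, leaner in sample:
    pid = allocate_leaners(initial, leaner)
    counts[pid] = counts.get(pid, 0) + 1

print(counts)  # the 'Ind' share shrinks relative to the initial question
```

The point of the sketch is the same as Smith's: once leaners are allocated, the residual "independent" category is much smaller than the unenrolled registration figure suggests.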
But wait, the party identification results I cited are for adults. Does the Democratic advantage narrow among "likely voters" in Massachusetts?
Yes, although the September Boston Globe/UNH poll that Smith provided results from did not ask about the special election. It did ask, however, about interest and likelihood of voting in the general election for governor in November 2010, and thanks to Andrew Smith, I can provide those tabulations below:
As the table shows, if you narrow the survey based on interest in the gubernatorial election, the Democratic advantage narrows considerably, from 17 percentage points among all adults (49% to 32%) to just three percentage points among the 35% of adults that say they are extremely interested (45% to 42%). On the other hand, when the UNH pollsters asked voters if they were likely to vote in the November 2010 general election, the Democratic advantage actually grows slightly, to 19 points (51% to 32%). Which of these, or what combination, might provide the best "model" of a true likely voter? There's no obvious answer -- welcome to the highly varied "art" of likely voter modeling.
But wait...the point of all of this is how these numbers compare to the extrapolated party numbers produced by Nate Silver for the Rasmussen poll, and at first blush, the Globe/UNH numbers are not that far off. If anything, Rasmussen's party results (52% Democrat to 21% Republican) are more favorable to the Democrats than the Globe/UNH numbers from September.
I had even considered including the Rasmussen numbers in the table above, but decided against it because the comparison is a bit misleading. The problem is that Rasmussen asks a very different partisanship question:
If you are a Republican, press 1. If a Democrat, press 2. If you belong to some other political party, press 3. If you are independent, press 4. If you are not sure, press 5.
Does this question ask about party identification or registration? Given the absence of the "do you consider yourself" clause and the use of "belong to," a respondent might interpret it either way. And I'm assuming that since Rasmussen uses an automated method, the respondent can interrupt the question at any time to choose a selection, something I suspect they tend to do more readily toward the end of the interview (especially if they are feeling impatient to get off the phone).
Also, notice that unlike the Globe/UNH question, Rasmussen does not include a follow-up to press independents on how they lean. Yet if you click on a Rasmussen result on our national party ID chart, you will see that when asked of national adult samples, Rasmussen results tend to produce more partisans (about 12 percentage points more, on average) than other pollsters (roughly 9 points higher on Republicans and 3 points higher on Democrats).
Why is it different? Does the combination of a different question and mode effectively push some independents to say how they lean (with a bigger push toward the GOP)? Or do Rasmussen's sampling and calling procedures yield an adult sample that skews more partisan and Republican, even before they apply their likely voter screen and party weighting? You can make a case for either argument, but I don't have conclusive evidence to resolve this puzzle.
Also, Rasmussen typically weights their statewide pre-election samples by party to targets derived in a somewhat fuzzy process. How did they determine their party weighting targets for the Massachusetts survey? How much did the party weighting alter the results (as compared to weighting on demographics alone)? What percentage of Massachusetts adults passed the screen and qualified as likely voters?
And most important, why aren't answers to these questions disclosed on a routine basis on RasmussenReports.com? Keep in mind that Nate Silver had to extrapolate his estimate of Rasmussen's partisan balance, and even that came from crosstabs available to subscribers only.
I'll have more to say later about the questions of bias, intentional and otherwise, that have been swirling around Rasmussen this week. But until pollsters like Rasmussen start disclosing more about the numbers they produce, it is hard to do much more than speculate about whether polls like the one he did in Massachusetts are as representative as they should be. Is this new poll "horribly, terribly wrong?" With so little information to go on, it's hard to say.
Update: Harry Enten (aka "poughies"**), the Dartmouth student who wrote a guest contribution a month ago on modeling gay marriage referenda, takes issue with my conclusion with an intriguing comment below that argues that the Rasmussen poll "has too many Republicans and not enough independents." He reaches this conclusion by comparing the relationship between results for party ID and actual registration in a previous Massachusetts congressional district race polled by SurveyUSA (which asks a more traditional party identification question). It's worth reading.
**And yes, he gave me permission to reveal his identity.
Hardly a week goes by in which I do not receive at least one email like the following:
Although I really appreciate you continually adding this "outlier" poll for your aggregated data, I do wonder why Rasmussen polling numbers are ALWAYS significantly lower and different than every other poll when measuring the President's job approval rating (with the exception of Zogby's internet poll)? How do Rasmussen pollsters explain this phenomenon and, more importantly, what is your explanation for this statistically significant ongoing discrepancy between Rasmussen and pretty much every other poll out there?
We have addressed variants of this question many times, but since this question is easily the most frequently asked via email, it is probably worth trying to summarize what we've learned in one place.
Let me start with this reader's premise. Are Rasmussen's job approval ratings of President Obama typically lower than "every other poll?" The chart that follows, produced by our colleague Charles Franklin, shows the relative "house effects" for organizations that routinely release national polls based on the approval percentage. Rasmussen's Obama job approval ratings (third from the bottom) do tend to be lower than most other polls, but they are not the lowest.
Before reviewing the reasons for the difference, I want to emphasize something the chart does not tell us. The line that corresponds with the zero value is NOT a measure of "truth" or an indicator of accuracy. The numeric value plotted on the chart represents the average distance from an adjusted version of our standard trend line (it sets the median house effect to zero, producing a line that is usually within a percentage point of our standard trend line). Since that trend line is essentially the average of the results from all pollsters, the numbers represent deviations from average. Calculate house effects using a different set of pollsters, and the zero line would likely shift.
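The computation described above (a pollster's house effect as its average deviation from a common trend, with the effects re-centered so the median pollster sits at zero) can be sketched briefly. All of the poll and trend values below are invented for illustration; this is not Franklin's actual code or data.

```python
# Minimal sketch of median-centered house effects.
# Each tuple: (pollster, poll_value, trend_value_on_that_date).

from statistics import mean, median

polls = [
    ('A', 52, 50), ('A', 51, 50),
    ('B', 49, 50), ('B', 48, 50),
    ('C', 50, 50), ('C', 51, 50),
]

# Average deviation from the common trend, per pollster.
raw = {}
for name, value, trend in polls:
    raw.setdefault(name, []).append(value - trend)
raw_effects = {name: mean(devs) for name, devs in raw.items()}

# Re-center so the median house effect is zero; the zero line is an
# average of the pollsters included, not a measure of "truth."
shift = median(raw_effects.values())
house_effects = {name: eff - shift for name, eff in raw_effects.items()}

print(house_effects)
```

Note how the re-centering step makes the point in the text concrete: swap in a different set of pollsters and the median (and therefore the zero line) shifts, even though no individual poll changed.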
A related point: Readers tend to notice the Rasmussen house effect because their daily tracking polls represent a large percentage of the points plotted on our job approval chart. For the daily tracking polls released by Rasmussen and Gallup Daily, we plot the value of every non-overlapping release (every third day). As of last week, Gallup Daily and Rasmussen represent almost half (49%) of the points plotted on our charts (roughly 24% each). As such, their polls do tend to have greater influence on our trend line than other organizations that poll less often (see more discussion by Charles Franklin, Mike McDonald and me on the consequences of the greater influence of the daily trackers).
So why are the Rasmussen results different? Here are the three possible answers:
1) Likely voters -- Of the twenty or so pollsters that routinely report national presidential job approval ratings, only Rasmussen, Zogby and Democracy Corps routinely report results for a population of "likely voters." Of the pollsters in the chart above, PPP, Quinnipiac University, Fox News/Opinion Dynamics and Diageo/Hotline report results for the population of registered voters. All the rest sample all adults. Not surprisingly, most of the organizations near the bottom of the house effect chart -- those showing lower than average job approval percentages for Obama -- report on either likely or registered voters, not adults.
Why does that matter? As Scott Rasmussen explained two weeks ago, likely voters are less likely to include young adults and minority voters who are more supportive of President Obama.
2) Different Question - Rasmussen also asks a different job approval question. Most pollsters offer just two answer categories: "Do you approve or disapprove of the way Barack Obama is handling his job as president?" Rasmussen's question prompts for four: "How would you rate the job Barack Obama has been doing as President... do you strongly approve, somewhat approve, somewhat disapprove, or strongly disapprove of the job he's been doing?"
Scott Rasmussen has long asserted that the additional "somewhat" approve or disapprove options coax some respondents to provide an answer that might otherwise end up in the "don't know" category. In an experiment conducted last week and released yesterday, Rasmussen provides support for that argument. They administered three separate surveys of 800 "likely voters," each involving a different version of the Obama job approval rating: (1) the traditional two-category, approve or disapprove choice, (2) the standard Rasmussen four-category version and (3) a variant used by Zogby and Harris that asks if the president is doing an excellent, good, fair or poor job. The table below collapses the results into two categories: excellent and good combine to represent "approve," fair and poor combine to represent "disapprove."
The 4-category Rasmussen version shows a smaller "don't know" (1% vs. 4%) and a much bigger disapprove percentage (52% vs 46%) compared to the standard 2-category question. The approve percentage is only three points lower on the Rasmussen version (47%) than the traditional question (50%). As Rasmussen writes, the differences are "consistent with years of observations that Rasmussen Reports polling consistently shows a higher level of disapproval for the President than other polls" (make of this what you will, but three years ago, Rasmussen argued that the four category format explained a bigger "approve" percentage for President Bush).
We can see that Rasmussen does in fact report a consistently higher disapproval percentage for President Obama by examining Charles Franklin's chart of house effects for the disapprove category. Here the distinction between Rasmussen, Harris and Zogby -- the three pollsters that ask something other than the traditional two-category approval question -- is more pronounced.
The Rasmussen experiment shows an even bigger discrepancy between the approve percentage on the two-category questions (50%) and the much lower percentage obtained by combining excellent and good (38%). This result is similar to what Chicago Tribune pollster Nick Panagakis found on a similar experiment conducted many years ago (as described in a post last year).
Variation in the don't know category also helps explain the house effects for many of the other pollsters. The table below shows average job approval ratings for President Obama by each pollster over the course of 2009 (through November 19). It shows that smaller don't know percentages tend to translate into larger disapproval percentages. With live interviewers and similar questions, the differences are usually explained by variations in interviewer procedures and training. Interviewers that push harder for an answer when the respondent is initially uncertain obtain results with smaller percentages in the don't know column.
3) The Automated Methodology - Much of the speculation about the differences involving Rasmussen and other automated pollsters centers on the automated mode itself (often referred to by the acronym IVR, for interactive voice response). Tom Jensen of PPP, a firm that also interviews with an automated method, offered one such theory earlier this year:
[P]eople are just more willing to say they don't like a politician to us than they are to a live interviewer because they don't feel any social pressure to be nice. That's resulted in us, Rasmussen, and Survey USA showing poorer approval numbers than most for a variety of politicians.
Other commentators offer a different theory, neatly summarized recently by John Sides, who speculates that since automated polls "generate lower response rates" than those using live interviewers, automated poll samples may "[skew] towards the kind of politically engaged citizens who are more likely to think and act as partisan[s] or ideologues," even after weighting to correct demographic imbalances.
A lack of data makes evaluating this theory very difficult. Few pollsters routinely release response rate data (and even then, technical differences in how those rates are computed make comparisons across modes challenging). And, as far as I know, no one has attempted a randomized controlled experiment to test Jensen's "social pressure" theory applied to job approval ratings.
But that said, it is intriguing that the bottom five pollsters on Franklin's chart of estimated house effects on the approval rating all collect their data using surveys administered without live interviewers: Rasmussen and PPP use the automated telephone methodology and Harris, Zogby and YouGov/Polimetrix survey over the Internet (using non-probability panel samples). Of course, with the exception of YouGov/Polimetrix, these firms also either interview likely or registered voters, use a different question than other pollsters, or both.
As such, it is next to impossible to disentangle these three competing explanations for why the Rasmussen polls produce a lower than average job approval score for President Obama, although we can make the strongest case for the first two.
P.S.: For further reading, we have posted on the differences between Rasmussen and other pollsters in slightly different contexts here, here and here and on my old MysteryPollster blog here, here and here. Also be sure to read Scott Rasmussen's answer last week to my question about how they select likely voters. Finally, Charles Franklin posted side-by-side charts showing the Obama job approval house effects for each pollster last week; he has posted similar charts of house effects on the 2008 horse race polls here, here, here and here.
My column for this week asks whether the success of automated polling, sometimes known by the acronym IVR (for Interactive Voice Response), in predicting the outcomes of this year's elections extends to polls on other issues, especially health care reform. Please click through and read it all.
The column quotes the pollsters at the three most prominent firms that conduct automated polling, SurveyUSA, Rasmussen Reports and Public Policy Polling (PPP). Since I quoted each only briefly in the article, and since their comments were all far more extensive and on-the-record, I am sharing them here verbatim.
I asked each to respond to this passage of a polling review from former George W. Bush deputy chief of staff Karl Rove:
Automated polling firms like SurveyUSA and Rasmussen have drawn criticism in the health care debate for showing Americans significantly more opposed to reform than traditional pollsters who use live interviewers. Yet on Tuesday, automated polling firms like Rasmussen were significantly more accurate than conventional competitors. Voters who stay on the phone to answer the questions of an automated pollster may more accurately represent the electorate in off-year elections when turnout is lower and only the most enthusiastic voters are likely to turn out. If so, Democrats who face re-election next year should start worrying -- automated pollsters' results showing a majority of Americans opposed to health care reform may be the most prescient look at what lies in store for next year's midterms.
Scott Rasmussen, Rasmussen Reports
First, I am pleased that Karl Rove noted how "automated polling firms like Rasmussen were significantly more accurate than conventional competitors" in polling the New Jersey Governor's race.
Only part of that success can be attributed to the automated methodology. Much of it has to do with the way that we measured the support of nominal supporters of Daggett and undecided voters. Our survey model helped us project Daggett's actual vote total more closely than other polls did.
As a result, I continue to believe that you can do a good automated poll or a good operator-assisted poll. You can also do a bad poll using either method. Automated systems clearly have an advantage when it comes to consistency in tracking polls, but there may be areas where operator-assisted polls have an advantage as well.
As for the health care debate, the methodology issue has little to do with it because all polls show a plurality or majority opposition to the health care plan working its way through Congress. On the Pollster.com site, the average results show 49.6% opposed and 41.8% in favor, a gap of just under 8 points. Our latest polling at Rasmussen Reports shows 45% in favor and 52% opposed, a 7-point gap.
I do believe Democrats should be concerned because the health care debate has become a lose-lose situation for them. But, it's not because automated polls show a different result. It's because all polls send the same message. The health care issue is complex and very challenging to measure. But, the overall messages from polling using both automated systems and operator-assisted approaches are quite similar. Most Americans are at least somewhat happy with their own coverage and quality of care. Anything that would force them to change is going to create political problems. Competition and choice are seen as good things. And, there is a strong desire to reduce the cost of health care along with a skepticism about the ability of our political process to accomplish that goal.
Jay Leve, SurveyUSA
Recorded-voice telephone polls are not inherently superior.
Recorded-voice telephone polls are not inherently inferior.
True: when asked yes/no questions about personal conduct - such as: "Do you have unprotected sex?" or "Do you drink alone?" - respondents who answer by pressing a button or checking a box report higher incidences than respondents who must "confess" to a human.
But: I don't think you can argue, on an issue as complicated as health-care, that mode trumps. I could draft two health-care questions today, and produce conflicting results tomorrow, one that shows support for reform, the other that shows opposition. And I could do that regardless of whether the research was conducted by US mail, mall intercepts, headset operators, professional announcers, or email.
Too many poll watchers are mode-fixated. Often, mode is the least of it.
Tom Jensen, Public Policy Polling (PPP):
IVR polls were more accurate than live interviewers in New Jersey and Virginia at calling the horse race. That does not mean IVR is superior to live interviewers on every kind of question that ever gets polled. It does mean that IVR polls should be taken as seriously as any other polls on most measures of public opinion- they deserve to be a part of the discussion. They should not be ignored on issues like health care and Obama's approval.
That said, I think Rasmussen's Republican friendly numbers on things like Obama's approval and health care are more a result of his polling likely voters, presumably for the midterm elections, than an IVR vs. live interviewer thing. We saw last Tuesday that GOP voters are a lot more fired up right now so it's not surprising they're more likely to pass an off year voter screen. We model our monthly national approval polls on a Presidential year electorate because of the 2012 horse race polling we do and we find Obama with numbers more similar to the live interviewer national pollsters than to Rasmussen's. That's a sampling issue rather than a mode issue.
There are good live interviewer polling outfits and bad ones. There are good IVR polling outfits and bad ones (particularly the sort of fly by night ones that aren't a consistent presence on the polling scene.) What I want to see is not for everyone to think that IVR polls are superior, but for people to judge individual polling companies on their actual merits and not how they conduct their interviews.
I'm not sure if that gets to the heart of what you're looking for and if you have any specific questions I'm happy to answer but those are my overall feelings- no individual poll should be treated as if it's the one and only accurate one but all polls with a track record of accuracy, so long as they're transparent about their methodology, deserve to be taken seriously.
The most recent polling in New Jersey shows an excruciatingly close race between incumbent Democrat Jon Corzine and Republican challenger Chris Christie. As of this writing, our standard trend estimate (below) puts Corzine "ahead" by a negligible 0.8% (41.4% to 40.6%). The more sensitive setting on our smoothing tool makes the Corzine margin slightly narrower (0.6%), the less sensitive setting makes it slightly larger (0.9%). Any way you look at it though, the differences between the estimates -- and more importantly, between Corzine and Christie -- are virtually meaningless. Right now, the current polling snapshot of this race is as close as these things get.
For perspective on the closeness of the margin you might want to stroll down memory lane and revisit my final Election Day update from Tuesday, November 4, 2008. We showed only four states where the Obama-McCain margin on our trend estimates was less than 2 percentage points, and the leader ultimately won in only 2 of those 4 states. So a margin of under two percentage points puts us well within true toss-up territory in terms of predictive accuracy, especially with a weekend of polling still to go.
Understandably, the close nature of the race has political junkies turning these numbers upside down and reading every possible tea leaf in search of the key to the outcome. After doing much of the same (while out with the flu) the last few days, the best answer I can give based on the empirical evidence -- for the moment at least -- is that this race is currently looking very close.
Are things trending toward Corzine? Yes, when compared to early September, our chart indicates a decline of roughly four percentage points for Christie and an increase of roughly three points for Corzine. Over the course of the summer, Christie had been dropping (from a high of roughly 49% in early July), while Corzine remained flat.
What is less clear is whether the closing trend has continued over the last two weeks. As of this writing, only three pollsters have tracked more than once since mid-October, allowing apples-to-apples trend comparisons. Two -- SurveyUSA and Democracy Corps -- show Corzine's margin two percentage points better. One, Rasmussen, shows it one point worse. None of these differences is statistically significant alone, and the patterns are obviously small and inconsistent.
That said, the trend over the next four days may not be as smooth, and the Daggett "wild card" that everyone has focused on for the last few months is the reason. Consider at least three ways that the Daggett effect leaves us even more uncertain about the outcome:
Individual level uncertainty -- The Monmouth University Polling Institute reported yesterday on a focus group they convened earlier this week in Edison, NJ among voters who are still either undecided or just leaning to a candidate. While they explicitly warn against treating the findings as representative of all undecided voters, the most clear finding was a sense of unhappiness with both major candidates: "These voters claim that this is the most difficult election choice they have ever faced. Nearly all said that Jon Corzine has not done a good enough job to deserve reelection. They simply have not heard enough from Chris Christie to cast their lot with him." Their final decision about Daggett, the report says, may come down to whether he has a chance of winning.
Aggregate level uncertainty -- One statistic worth pondering: On the last ten polls, all conducted in the last week, the portion of the electorate that is either undecided or supporting a candidate other than Corzine or Christie averages 16.5% (with a range of 11% to 23%). As a crude measure of voter uncertainty, that's considerably more than the 5% or so we saw at this stage of last year's presidential election.
Measurement artifacts? -- Complicating this issue even further are the measurement challenges that pollsters face when testing lesser known independent candidates, especially when voters are unhappy with the top two choices. Offer just three choices and no explicit undecided category and some undecided voters will choose the independent as their way of expressing uncertainty. On the other hand, fail to prompt for the independent and you may measure a number that's much lower (see, for example, the intriguing experiment embedded in the Fairleigh Dickinson poll). Reality likely falls somewhere in between. And no one can be certain of the effect that the other 9 candidates will have.
And finally, there is the intriguing pattern noted earlier this week by PPP's Tom Jensen and explored last night by Nate Silver. Christie has done consistently better on telephone polls conducted using an automated, recorded voice than on those using live interviewers. Using the filter tool on our chart, as of this writing, Christie runs roughly three points ahead of Corzine on the automated polls, but Corzine runs a little less than three points ahead on live interviewer polls. The chart below, which Charles Franklin kindly prepared this afternoon, shows that the difference has been consistent throughout the race (his margins are likely different than on our interactive chart due to his use of slightly different smoothing levels).
We also see a similar though far less pronounced and consistent effect in Virginia, and then only since Labor Day.
What this effect is about, and what it portends for the outcome in New Jersey, I cannot say. Nate Silver has some plausible speculation about automated surveys being potentially more sensitive to an enthusiasm gap between Republicans and Democrats, although if that is true, I have no explanation for why we saw no such consistent difference between automated and live interviewer surveys in the Obama-McCain polling last year. We should have new surveys over the weekend or on Monday from all three automated pollsters in New Jersey (SurveyUSA, PPP and Rasmussen) and from at least three of the live-interviewer polls. So this phenomenon will be interesting to watch.
Either way, the combination of a very close snapshot and many indicators of potential volatility makes for a very uncertain outcome.
My National Journal column for the week, now posted, defends automated, recorded voice polling from what is becoming a common line of attack: without a live interviewer anyone, regardless of age, might participate in the survey. Please click through for the details.
Since I typically file my NationalJournal.com columns on Friday afternoon to appear on Monday morning, I get a chance to mull them over all weekend before posting these quick updates on Pollster. This weekend, I realized that one conclusion could have used more emphasis: My bottom-line on automated polls is that they have established a strong record in measuring campaign horse-race results in pre-election polls. Over at least the last four election cycles, they have been as accurate as live interviewer polls at the end of the campaign, and their horse race results generally track well with live interviewer surveys. So I think that it is wrong to condemn automated polls simply because they use a recorded voice rather than live interviewers.
That said, we need to keep in mind that the mode of interview is just one aspect of a methodology. If you look at the best known automated surveys, you will see a lot of variation in how they draw their samples, how persistent they are in attempting to call back households where no one answers on the first call, how many interviews they conduct, how they identify likely voters, how they weight the data and, finally, in the questions they ask. All of those factors might make any given automated poll more or less reliable or accurate than any given live interviewer poll.
Also, while automated surveys have proven themselves in one particular application -- measuring campaign horse race numbers late in the campaign -- we need to be careful about overlooking potential shortcomings for other kinds of research. I would certainly not recommend an automated interview for any general population study that wants to ask more than four or five substantive questions or that involves open-ended questions allowing respondents to answer in their own words.
On a slightly different subject, the column also highlights one statistic that Charles Franklin computed:
[T]he national job approval data does not support the assertion that automated polls are more "erratic." My Pollster.com partner Charles Franklin checked and found that despite identically sized three-day samples, the Rasmussen daily tracking poll is less variable than Gallup (showing standard deviations of 1.8 and 2.4, respectively), probably because Rasmussen weights its results by party identification.
Charles also sent along a chart, which is based on deviations from the trend line for Obama's job approval rating since taking office in January.
The tails of the Gallup curve are slightly wider than the Rasmussen curve. The point is not that Rasmussen is better or worse than Gallup, again only that presidential approval is slightly less variable as measured by Rasmussen, probably because they weight by party.
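For readers curious how a comparison like Franklin's works mechanically, here is a minimal sketch in Python. The two series are synthetic (a flat approval level plus random noise at the two standard deviations reported above), not the actual Rasmussen or Gallup numbers, and a simple centered moving average stands in for the loess trend line used in the chart.

```python
import random
import statistics

def residual_sd(series, window=7):
    """Standard deviation of a series' deviations from a simple
    centered moving-average trend line."""
    half = window // 2
    residuals = []
    for i in range(half, len(series) - half):
        trend = statistics.mean(series[i - half:i + half + 1])
        residuals.append(series[i] - trend)
    return statistics.stdev(residuals)

# Synthetic daily tracking series: flat 60% approval plus noise at
# the standard deviations Franklin computed (1.8 and 2.4 points).
random.seed(1)
rasmussen_like = [60 + random.gauss(0, 1.8) for _ in range(300)]
gallup_like = [60 + random.gauss(0, 2.4) for _ in range(300)]

print(round(residual_sd(rasmussen_like), 2))
print(round(residual_sd(gallup_like), 2))
```

With real data the trend is not flat, which is exactly why measuring variability as deviation from the trend line, rather than the raw standard deviation of the series, is the fairer comparison.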
You can certainly make a case that rolling average daily tracking, whether automated or traditional, includes a lot of random variation, and that those seeking a narrative can find whatever story they want in the meaningless daily bumps. On that score, I generally agree with the advice offered by the First Read piece I quoted in the column: Beware -- lots of daily approval polls with widely differing methods "lets some folks cherry-pick what they want."
Finally, one subject that deserves more attention than the two brief paragraphs I gave it is what we lose when a live interviewer does not gather the data. A few weeks ago, a survey researcher named Colleen Porter shared a defense of quality interviewing in the form of an anecdote on the member-only listserv of the American Association for Public Opinion Research (AAPOR). She gave me permission to share the story here, in which she describes monitoring an interview being conducted on behalf of her client:
The interviewer is amazing. Her surname is Hispanic--is she this good in Spanish, too? Of course they put their best interviewers on the first night; I would, too, when I was at a survey lab.
When she asks about the location of an event, the respondent commences a story about the many times it has happened. The interviewer repeats the question exactly as worded, with emphasis on "LAST TIME," but a tone of complete patience as if reading a new question. The respondent focuses, and answers promptly.
That is exactly how it is supposed to work. Score! As the respected client, I am off in a room alone, and there is no one to give a high five. I punch the air. I love to hear good interviewing.
Update: Brendan Nyhan emails to pass along a 2006 journal article by respected political scientist Gary Jacobson (requires a subscription to view the published article, but you can access an earlier conference version of the paper here, in pdf format). Jacobson's paper is based on the 50-state job approval surveys that automated pollster SurveyUSA conducted during 2005 and early 2006. In the article's appendix, he describes how he "examined the data carefully for internal and external consistency as well as intuitive plausibility" and found that "they passed all of the tests very satisfactorily." His conclusion:
In sum, I found no reason to believe that the quality and accuracy of the aggregate data produced by SurveyUSA's automated telephone methodology is in any way inferior to that produced by other telephone surveys, and I thus have no qualms about using the data for scientific research on aggregate state-level political behavior.
My National Journal column for the week, on the surprisingly dire view of the future of polling from SurveyUSA's Jay Leve, is now online.
At a panel at last week's Joint Statistical Meetings in Washington DC, Leve delivered a presentation with this surprising conclusion: "If you look at where we are here in 2009," for phone polling, he said, "it's over... this is the end. Something else has got to come along." Intrigued? Hope so. Click through for the details.
*Correction: The original headline and subheading on both the National Journal column and this entry incorrectly stated that Leve forecasts "doom" for all of polling and the polling profession. Leve sees doom for a particular kind of polling, what he calls "barge-in telephone polling" -- in essence, telephone surveys as we now know them, both live operator and automated. However, as I hope the last paragraph of the column makes clear, he is optimistic about the future of polling: "And for those who might ask, he adds that he 'doesn't look to the future with despair but with wonder' at the opportunities for the..."
About a month ago, I wrote a post about the fairly obvious and consistent differences among pollsters on the Barack Obama job approval question -- what we usually refer to as "house effects." At issue is that two of the national pollsters that have produced consistently lower scores for Obama use an automated, recorded voice to ask questions rather than live interviewers. My argument was that we should not overlook the other factors that might also explain the house effects in evidence on our job approval chart.
One admittedly far-fetched hypothesis I floated to explain the consistently lower approval scores produced by Public Policy Polling (PPP), one of the automated pollsters, is that they ask a slightly different question: Most of the others ask respondents if they "approve or disapprove of the way Barack Obama is handling his job as president." PPP asks if they "approve or disapprove of Barack Obama's job performance" (emphasis mine). I wondered if "some respondents might hear 'job performance' as a question about Obama's performance on the issue of jobs," and suggested that they conduct an experiment to check.
Well, it turns out that the folks at PPP took my advice. They randomly split their most recent North Carolina survey (pdf) in two. The full survey interviewed 686 registered voters, so each half sample had roughly 340 interviews. One random half-sample heard their usual question (rate "Barack Obama's job performance"). The other half heard the more standard question (rate "the way Barack Obama is handling his job as president"). According to PPP's Tom Jensen, the two versions "actually came out completely identical: 51 [percent approve] / 41 [percent disapprove] on each."
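A back-of-the-envelope calculation shows what that null result can and cannot rule out. The sketch below assumes roughly 343 interviews per half-sample (my estimate, splitting the 686 total in two) and computes the sampling error on the difference between the two question forms:

```python
import math

def diff_se(p1, n1, p2, n2):
    """Standard error of the difference between two independent
    sample proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Roughly 343 interviews per half-sample, 51% approval on each form.
se = diff_se(0.51, 343, 0.51, 343)

print(round(100 * se, 1))         # standard error of the difference, in points
print(round(100 * 1.96 * se, 1))  # gap needed for significance at the 95% level
```

In other words, a wording effect of a point or two could easily hide inside half-samples this small; what the identical result does argue against is any large "issue of jobs" misreading.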
So much for my theory. That said, the bottom line from last month's post remains the same:
While tempting, we cannot easily attribute to [the automated methodology] all of the apparent difference in Obama's job rating as measured by Rasmussen and PPP on the one hand, and the rest of the pollsters on the other. There are simply too many variables to single out just one as critical.
To review, let's quickly list a few (I discussed most in the original post).
1) Population. Rasmussen interviews "likely voters;" PPP interviews registered voters. Most of the other national media polls interview and report on all adults, although a handful (most notably Fox/Opinion Dynamics, Quinnipiac, Diageo/Hotline, Cook/RT Strategies and Resurgent Republic) report results from registered voters.
Alert reader Tlaloc suggested that while our charts allow easy filtering by mode (live interviewer, automated, etc.), it would be even more useful to filter by population. We will add that feature to our to-do list. Meanwhile, Charles Franklin prepared the chart below, which shows three solid (loess regression) trend lines for Obama's approval percentage. Black shows the polls of all adults, blue shows the polls of registered voters (including PPP, whose individual releases are designated with blue triangles) and red shows the Rasmussen Reports results.
As the chart shows, the three categories produce consistently different estimates of Obama approval, with Rasmussen lowest, adult surveys highest and registered voter surveys somewhere in the middle. Moreover, the three PPP surveys are closer to the Rasmussen result than the other registered voter surveys (and we omitted the small handful of other pollsters besides Rasmussen that report Obama approval among "likely voters").
2) Question format. If you scan the "undecided" column of our table of recent Obama job approval results (and really that should be "not sure" -- another item for our to-do list), you will see quite a lot of variation. Although Rasmussen rarely reports a specific result, they usually have only a percentage point or so that is neither approve nor disapprove. The unsure percentages for CNN/ORC, ABC/Washington Post, AP/GfK and Ipsos/McClatchy tend to be in the low single digits. PPP has produced an unsure response of 6-8 percent. Meanwhile, pollsters like the Pew Research Center, CBS News and Fox/Opinion Dynamics typically produce unsure responses over 10 percentage points.
The reason for the variation is usually some combination of the format of the question, including the number of answer choices offered, whether the pollster offers an explicit "unsure" category and whether they have an added push of those who are initially reluctant to answer the question. The point is not that any particular method is right or wrong, but that these differences matter.
3) Sample frame. PPP is unlike virtually all of the other national pollsters in that they sample from a list of all registered voters culled from voter rolls. Phone numbers are usually obtained by attempting to match names and addresses to listed telephone directories. As such, a significant number of selected voters are not covered -- PPP does not say how many are missed in their public releases. That difference in coverage may also contribute to the apparent house effect.
4) Live interviewer vs. automated telephone. If we could easily control for the first three factors, we might be able to reach some conclusion about whether the lack of a live interviewer produces an effect of its own. In other words, holding all other factors equal, do some respondents provide a different answer to the job approval question when asked by an automated method rather than a live interviewer? Unfortunately, we have national results on Obama job approval from just three pollsters that use the automated phone mode (Rasmussen, PPP and SurveyUSA -- and just one poll from the latter).
The above is not an exhaustive list of the possible reasons for pollster house effects. So again, it's next to impossible to reach any firm conclusions about the automated mode alone. Also, as I concluded last month (and it bears repeating):
Just because a pollster produces a large house effect in the way they measure something, especially in something relatively abstract like job approval, it does not follow automatically that their result is either "wrong" or "biased" (a conclusion some readers have reached and communicated to me via email), only different. Observing a consistent difference between pollsters is easy. Explaining that difference is, unfortunately, often quite hard.
In a column this past Sunday, Washington Post polling director Jon Cohen explains why the Post has not reported on recent surveys "purporting to show the status of" the upcoming Democratic primary contest for governor in Virginia. Their bottom line:
None of the recent polls in the Virginia governor's race meet our current criteria for reporting polls: Two primary ones were by Interactive Voice Response, commonly known as "robopolls," and the third was a partial release from one of the candidates eager to change the campaign story line.
Cohen's piece starts a conversation worth having about the difficulty of polling in low turnout primaries, about the coverage of "horse race" results and about where journalists should draw the line in reporting on polls conducted by campaigns or of otherwise unknown or questionable quality. For today, I am going to shamelessly gloss over those bigger issues (and shamelessly promote that I'll take up some of them in my about-to-resume NationalJournal.com column next week) and consider instead the narrower issue of the Post's policy against reporting the results of automated polls (also known as interactive voice response, or IVR).
Cohen makes two arguments for not reporting automated surveys:
1) Automated polls take "less care" determining likely voters:
Given the great complexity in determining "likely voters" in the upcoming electoral clash, extra care should be taken to gauge whether people will show up to vote. Unfortunately, polls that use recorded voice prompts typically take less care than polls conducted by live interviewers.
2) Automated polls are impractical for surveys asking more than a half-dozen substantive questions:
People are generally less tolerant of long interviews with computerized voices. One recent Virginia robopoll asked six questions about the governor's race; the other asked four.... Lost in the brevity is much, if any, substance. Neither of the two in Virginia asked about the top issues in the race, what candidate attributes matter most or anything about the economy. Without this essential context, these thin polls offer little more than an uncertain horse race number. In understanding public opinion, "why" voters feel certain ways is crucially important.
Expanding on the second point, Cohen also notes that the requisite brevity of automated polls leads campaign pollsters to use them only rarely. He quotes Joel Benenson and Bill McInturff and cites the poll released by Virginia candidate Brian Moran (conducted by Greenberg Quinlan Rosner).
Let's take these in reverse order. First, he is right that the automated methodology is inappropriate for longer, in-depth surveys and that a single, automated pre-election poll can typically "offer little more than an uncertain horse race number." So we would want to stick to live interviewer surveys if we want to understand the broader currents of public opinion surrounding an election (the goal of the work done by the Post/ABC poll) or if we want to plot campaign strategy or test campaign messages (the goal of campaign pollsters). The inherent brevity of automated polls is the primary reason that campaign pollsters still rely on traditional, live-interviewer methods for their work.
Similarly, the need for a very short questionnaire on automated polls prevents the use of a classic Gallup-style likely voter model (which requires asking seven or more questions about vote likelihood, past voting and attention paid to the campaign). However, I do not agree that the absence of a Gallup style index means that automated polls take inherently "less care" with likely voter selection than other state-level pre-election surveys. Many pollsters, including most of those that work for political candidates, rely on other techniques (such as screening questions, geographic modeling and stratification and the use of vote history garnered from registered voter lists) to sample and select the likely electorate.
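To make the contrast concrete, here is a toy sketch of a Gallup-style cutoff model of the kind described above. Every question, weight and threshold here is invented for illustration; the actual items and cutoffs Gallup uses are not being reproduced.

```python
# Toy Gallup-style likely voter index: score each respondent on
# several self-reported items, then keep only the high scorers.
# All item names and the cutoff are invented for illustration.

def likely_voter_score(resp):
    score = 0
    score += 1 if resp["thought_given_to_election"] == "quite a lot" else 0
    score += 1 if resp["knows_polling_place"] else 0
    score += 1 if resp["voted_in_last_election"] else 0
    score += 1 if resp["plans_to_vote"] else 0
    score += 1 if resp["ten_point_intent"] >= 9 else 0
    return score

respondents = [
    {"thought_given_to_election": "quite a lot", "knows_polling_place": True,
     "voted_in_last_election": True, "plans_to_vote": True, "ten_point_intent": 10},
    {"thought_given_to_election": "some", "knows_polling_place": False,
     "voted_in_last_election": True, "plans_to_vote": True, "ten_point_intent": 7},
    {"thought_given_to_election": "only a little", "knows_polling_place": False,
     "voted_in_last_election": False, "plans_to_vote": False, "ten_point_intent": 2},
]

# Keep respondents scoring 4 or higher as "likely voters".
likely = [r for r in respondents if likely_voter_score(r) >= 4]
print(len(likely))  # 1
```

The point of the sketch is simply that an index like this consumes a half-dozen questions all by itself, which is why it is a poor fit for the very short questionnaires automated polls require; the alternative techniques mentioned above (screens, stratification, vote history from registration lists) select likely voters without spending those questions.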
Do we really think the polls produced by SurveyUSA and PPP in Virginia take "less care" in selecting likely voters than the Mason-Dixon Florida primary poll reported yesterday by the Post's Chris Cillizza or the Quinnipiac New Jersey primary poll reported in Sunday's Post?
And while I will grant that final pre-election poll accuracy is a potentially flawed measure of overall survey quality, it is the best yardstick we have to assess the accuracy of likely voter selection methods. After all, the Gallup-style likely voter models were developed by looking back at how poll estimates compared to election outcomes and tweaking the indexes until they produced the most accurate retrospective results. With each new election, pollsters look back at how their models performed, adjusting them as necessary to improve their future performance. Thus, if a pollster is careless in selecting likely voters, it ought to produce less accurate estimates on the final poll.
On that score, automated "robo" polls have performed well. As PPP's Tom Jensen noted earlier this week, analyses conducted by the National Council on Public Polls (in 2004), AAPOR's Ad Hoc Committee on Presidential Primary Polling (2008), and the Wall Street Journal's Carl Bialik all found that automated polls performed about as well as live interviewer surveys in terms of their final poll accuracy. To that list I can add two papers presented at last week's AAPOR conference (one by Harvard's Chase Harrison and one by Fairleigh Dickinson University's Krista Jenkins and Peter Woolley) and papers presented at prior conferences on polls conducted from 2002 to 2006 (by Joel Bloom and Charles Franklin and yours truly). All of these assessed polls conducted in the final weeks or months of the campaign and saw no significant difference between automated and live interviewer polls in terms of their accuracy. So whatever care automated surveys take in selecting likely voters, the horse race estimates they produce have been no worse.
One reason why is that respondents may provide more accurate reports of their vote intention to a computer than to a live interviewer. We know that live interviewers can introduce an element of "social discomfort" that leads to an underreporting of socially frowned upon behavior (smoking, drinking, unsafe sex, etc.). Is it such a stretch to add non-voting to that list?
So let me suggest that this argument is really about the value of polls that measure the "horse race" preference -- and little more -- a few weeks or months before an election. Is that something worth reporting? Jon Cohen and ABC News polling director Gary Langer, the two principals of the ABC/Washington Post polling partnership, have been consistently outspoken in saying "no," urging us all to "throttle back on the horse race."
I have no doubt about the sincerity of their commitment to that goal or the obstacles they face putting it into practice, but I wonder if urging abstinence is a workable solution. Political journalists and their political junkie readers are intensely and instinctively interested in the basic assessments that "horse race" numbers provide. Poll references have a way of showing up in stories about the Virginia governor's race, even in a newspaper that is supposedly not reporting on Virginia primary polls. Just yesterday, for example, the Post's print edition debate story reported that the Virginia candidates "sought to stamp a final impression in a race where polls show the majority of voters remain undecided" and Chris Cillizza told us in his online blog that "polling suggests [Terry McAuliffe] leads both [Brian] Moran and state Sen. Creigh Deeds."
So the "polls" show something newsworthy enough to report, but the reporters are not allowed to name or cite the polls they looked at to reach that conclusion. Does that make any sense?
I am pondering two somewhat related questions this afternoon, but both have to do with national surveys conducted using an automated ("robo") methodology (or more formally, IVR or interactive-voice-response) to measure Barack Obama's job approval rating. One is the ongoing Rasmussen Reports daily tracking, the other is the just-released-today national survey by Public Policy Polling (PPP).
Both surveys are certainly producing lower job approval scores for President Obama than those from other pollsters. The difference for Rasmussen is painfully obvious when you look at our job approval chart, magnified by the sheer number of data points they contribute to the chart. Look at the chart and you can see two bands of red "disapproval" points with the trend line falling in between. Point to and click on any of the higher scores and you will see that virtually all come from Rasmussen. Similarly point to and click on a Rasmussen "black" approval point and you will see that virtually all of their releases fall somewhere below the line.
The most recent Rasmussen Reports job rating for Obama is 55% approve, 44% disapprove. Use the filter tool to drop Rasmussen from the trend, and the current trend estimate (based on all other polls) is, with rounding, 61% approve, 30% disapprove. Leave Rasmussen in and the estimate splits the difference. The latest PPP survey produces a result very similar to Rasmussen's: 53% approve of Obama's job performance and 41% disapprove.
I know that Charles Franklin is working on a post that will discuss the impact of the Rasmussen numbers on the job approval chart, so I am going to defer to him on that aspect of this discussion. (Update: Franklin's post is up here).
But since some will find it very tempting to jump to the conclusion that the IVR mode explains the difference -- as PPP's Tom Jensen did back in February -- I want to take a step back and consider some of the important ways these surveys differ from other polls (and with each other) that have little or nothing to do with IVR.
First consider the Rasmussen tracking: Like many other national polls, it begins with what amounts to a random digit dial sample -- randomly generated telephone numbers that should theoretically sample from all working landline telephones. However, unlike many of the national surveys, it does not include cell phone numbers, it screens to select "likely voters" rather than adults, and Rasmussen weights by party identification (using a three-month rolling average of their own results weighted demographically, but not by party). Rasmussen also asks a different version of the job approval question. Other pollsters typically ask respondents to say if they "approve" or "disapprove"; Rasmussen asks them to choose from four categories: "strongly approve, somewhat approve, somewhat disapprove or strongly disapprove."
And Rasmussen uses an IVR methodology.
Now consider PPP: Unlike Rasmussen, they draw a random sample from a national list of registered voters compiled by Aristotle International (which gathers registered voter lists from Secretaries of State in each of the 50 states plus the District of Columbia and attempts to match each voter with a listed telephone number in the many states where that information is not provided by the state). As far as I know, Aristotle has not published the percentage of registered voters on that list for which they lack a working telephone number, but it is likely a significant percentage. The critical issue is that the population covered by PPP is going to be different than that covered by other pollsters, including Rasmussen.
So any coverage problems aside, PPP still samples a different population (registered voters) than most other public polls. Like most other pollsters, but unlike Rasmussen, they do not weight by party identification. Finally, they also ask a job approval question that is slightly different from most other pollsters'.
Consider these versions:
Gallup (and most others): "Do you approve or disapprove of the way Barack Obama is handling his job as president?"
Rasmussen: "How would you rate the job Barack Obama has been doing as President... do you strongly approve, somewhat approve, somewhat disapprove, or strongly disapprove of the job he's been doing?"
PPP: "Do you approve or disapprove of Barack Obama's job performance?"
Note the very subtle difference: Others ask about how Obama is "handling his job" or about the job he "has been doing as president." PPP asks about his "job performance." Might some respondents hear "job performance" as a question about Obama's performance on the issue of jobs? That hypothesis may seem far-fetched (and it probably is), but a note to PPP: it would be very easy to test with a split-form experiment.
Oh yes, in addition to all of the above, PPP uses an IVR methodology.
As should be obvious from this discussion, not all IVR methods are created equal. I happened to be at a meeting this morning with Jay Leve of SurveyUSA, one of the original IVR pollsters. As he pointed out, "there is as much variability among the IVR practitioners as there would be among the live telephone operators" on methodology, including some of the other more arcane aspects of methodology that I haven't referenced.
So the main point: While tempting, we cannot easily attribute to IVR all of the apparent difference in Obama's job rating as measured by Rasmussen and PPP on the one hand, and the rest of the pollsters on the other. There are simply too many variables to single out just one as critical. The lack of a live interviewer may well play a role, but the differences in the populations surveyed, the sample frames and the text of the questions asked or some other aspect of methodology may be just as important.
More generally, just because a pollster produces a large house effect in the way they measure something, especially in something relatively abstract like job approval, it does not follow automatically that their result is either "wrong" or "biased" (a conclusion some readers have reached and communicated to me via email), only different. Observing a consistent difference between pollsters is easy. Explaining that difference is, unfortunately, often quite hard.
I must admit, despite the fact that my National Journal colleagues publish The Hotline just one floor down from my office, I missed this brief announcement (subscription required) on Tuesday appended to results from a recent survey from Public Policy Polling (PPP):
Traditionally, the Hotline has only published live-telephone interview surveys while excluding interactive voice response (IVR) polls, despite the increased media coverage of many of these so-called "robo-polls." In our constant effort to remain tuned to industry developments, and to determine if such distinctions are fair and valid, the Hotline will begin running selected numbers from IVR polls during the upcoming cycle. Specifically, head-to-head matchups, favorability ratings and approval ratings from IVR outfits will appear on an interim basis in the Hotline's Latest Edition through the '10 midterms. This data -- from firms such as InsiderAdvantage, Public Policy Polling, Rasmussen Reports and SurveyUSA -- will be published alongside live-telephone data, but will be clearly labeled as IVR results.
For those who are unfamiliar, The Hotline has been a DC institution for more than 20 years, serving up a daily political news summary chock full of polling data since the days when the preferred mode of delivery was the fax machine. They have long refused to publish surveys that used an automated methodology rather than live interviewers, so in our small world, their decision to publish IVR results, even if only on an "interim" basis, is important and, in my view at least, a welcome step.
"Numbers Guy" Carl Bialik devotes his Wall Street Journal column and a companion blog post today to the subject of the automated "interactive voice response" polling that has become such a staple of the current campaign. Both are well worth reading in full.
Bialik managed to interview most of the major players in the political IVR field, and had a reaction from our partner Charles Franklin, summing up our own philosophy regarding the automated polls (that use a recorded voice rather than a live interviewer, and ask respondents to answer questions by pressing keys on their touch-tone phones):
The automated-polling method, says Charles Franklin, professor of political science at the University of Wisconsin and co-developer of the poll-tracking site Pollster.com, "can prove itself through performance or it can fail through poor performance, but we shouldn't rule it out a priori."
The column notes that IVR pollster SurveyUSA ranked second most accurate among all pollsters during the 2008 primaries in the ratings compiled by Nate Silver, and that IVR polling was indistinguishable from live-interviewer polling during the primaries in terms of how the final poll compared to the election result:
Their accuracy record in the primaries -- such as it was -- was roughly equivalent to the live-interviewer surveys. Each missed the final margin by an average of about seven points in these races, according to Nate Silver, the Obama supporter who runs the election-math site fivethirtyeight.com.
Franklin did our own compilation of polls conducted during the final week of the 2006 campaign (for a paper presented at the AAPOR conference last year) and reached essentially the same conclusion.
The article also indicates some cracks may be forming in the intense skepticism that the survey research establishment has long held for IVR surveys. Bialik notes that a polling textbook (The Voter's Guide to Election Polls) authored by Paul J. Lavrakas and Michael Traugott "refers to these surveys as Computerized Response Automated Polls -- insulting acronym intended." But at the end of the column, Lavrakas indicates a willingness to consider the methodology:
Accepting responses by touch tones may have a particular advantage this election, says Mr. Lavrakas, former chief methodologist at Nielsen Media Research, because it may extract more-honest responses from white respondents about their intent to vote for Sen. Obama. "Ultimately the proof is in the pudding, and those firms that use IVR for pre-election polling and do so with an accurate track record should not be dismissed," he says.
My NationalJournal.com column, on those wildly variant automated polls in North Carolina from Public Policy Polling (PPP), is now online.
A few additional pieces of the story: First, I get a lot of email asking about the firms like PPP. Who are they? Who pays for the polls? PPP was founded by a North Carolina businessman named Dean Debnam, and its clients are mostly Democratic candidates holding or seeking local office in North Carolina. According to Tom Jensen, PPP's communications director, Debnam founded the company to help provide "low cost, high quality polling" to candidates for local offices who "could never afford a $12,000 poll."
PPP is a good example of a growing trend that the automated (or interactive voice response - IVR) technology makes possible. It is easier than ever for organizations with little prior experience in survey research to make calls, ask questions, tabulate the results and disseminate them via the Internet. Where my pollster colleagues disagree -- often vehemently -- is whether the new firms like PPP are delivering the "high quality" polling they promise. For example, one campaign pollster friend I talked to this week said he had a "hearty laugh" about the change in the sample selection methodology I describe in the column because, "I doubt seriously that they had one in the first place."
I should say that Jensen has been very responsive on behalf of PPP and as transparent about their methods as any pollster we have dealt with. On the other hand, PPP made no reference to their changed sample selection in their most recent releases (here and here, though they did note the change in a separate blog entry). They also neglected to extract the relevant vote history data from the sample that would have allowed a simple tabulation of the results from the latest survey using the older, narrower universe of past primary voters. Those are the kinds of mistakes that fuel skepticism among experienced pollsters.
Professor Franklin and I share an attitude about these sorts of surveys that sometimes puts us at odds with many of our colleagues in survey research. We believe we ought to judge all surveys by their performance rather than simply dismissing them by their methodology alone. Skepticism is certainly appropriate for newcomers using relatively unproven methods, but we will continue to track and follow the results from companies like PPP in order to evaluate their ultimate success or failure in achieving their stated goals.
In case you missed our update, the most recent Gallup Daily result on the Democratic race shows a near dead-heat, with Barack Obama ahead of Hillary Clinton by a single percentage point margin not nearly large enough to attain statistical significance (47% to 46%). That one point lead is somewhat apropos, since it is virtually identical to the average of all of Gallup's Daily releases since February 8 (Obama 46%, Clinton 45%). So the question for the day: How much of the daily variation over the last six weeks has been real and how much is random noise?
Let's start with the chart of the Gallup Daily results since their three-day track completed on February 8 (and released on February 9). That was the first three-day result collected entirely after the results from the Super Tuesday primaries were known.
While the Gallup trend has shown several "figure eights" over the last few weeks (as reader "emcee" put it), most of that variation occurs within the range that we should expect from a survey with a +/- 3 point margin of sampling error.
To illustrate that point, consider the hypothetical possibility that the preferences among Democrats have remained perfectly stable for the last six weeks. Let's assume that the average result since February 8 -- 46% to 45% favoring Obama -- has been the unchanging reality. What sort of random variation should we expect from taking a sample rather than interviewing the entire population?
First, remember that the so-called "margin of error" applies to the individual percentages, not the margin between the candidates. So under our hypothetical "no change" scenario, we would expect the Obama percentages to fall somewhere between 43% and 49% (46% +/- 3) and the Clinton percentages to fall somewhere between 42% and 48% (45% +/- 3).
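To see where those bands come from, here is a minimal sketch of the standard margin-of-error calculation. The 1,260-interview figure for one three-day rolling sample is an assumption for illustration, not a number either pollster publishes with every release:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a single proportion p estimated from n interviews."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1260  # assumed size of one three-day rolling sample
print(f"Obama 46% +/- {margin_of_error(0.46, n):.1%}")    # about +/- 2.8 points
print(f"Clinton 45% +/- {margin_of_error(0.45, n):.1%}")  # about +/- 2.7 points
```

The conventional "+/- 3 points" is just this figure rounded up.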
Since February 8, the results of the actual Gallup Daily have fallen outside that range on just three days:
March 1, when Obama led 50% to 42%
March 13, when Obama led 50% to 44%
March 18, when Clinton led 49% to 42%
But wait. As some of you may remember, most political surveys (including Gallup) calculate the margin of error using a 95% confidence level. That assumption means that we should expect results slightly outside the margin of error for one poll in twenty.
Unfortunately, at this point our story gets a little bit more complicated, because the "one in twenty" assumption applies to statistically independent measurements. Since each Gallup Daily release is based on a three-day rolling average, there is overlap in the sample on successive days. So only the results from every third day are truly "independent." I'll skip over some even more confusing explanation and get to the bottom line: Since February 8, roughly one-in-seven independent samples from the Gallup Daily series has produced a result outside the margin of error from my hypothetical, no-change, 46-45 scenario. That's a little bit more than we would expect by chance alone, but not much more.
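The one-in-twenty logic is easy to check with a quick simulation of the no-change scenario: draw fully independent samples (as if taking every third day) with a true Obama share of 46%, and count how often the estimate lands outside the 43-49 band. The 1,260-interview sample size is again an assumption for illustration:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

TRUE_OBAMA = 0.46       # hypothetical unchanging "true" preference
N = 1260                # assumed interviews in one independent three-day sample
LOW, HIGH = 0.43, 0.49  # the 46% +/- 3 band from the text

def one_sample():
    """Share of N simulated respondents who pick Obama."""
    return sum(random.random() < TRUE_OBAMA for _ in range(N)) / N

samples = [one_sample() for _ in range(2000)]
outside = sum(1 for s in samples if not (LOW <= s <= HIGH))
print(f"{outside / len(samples):.1%} of independent samples fall outside 43-49")
```

Because a flat 3-point band is slightly wider than the exact 95% interval for a 46% share, the simulated rate comes in a bit under one in twenty, which makes the observed one-in-seven rate mildly, but only mildly, interesting.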
Having said all that, my explanation still oversimplifies. It ignores the possibility for meaningful change within the standard "margin of error" -- subtle shifts that might not attain statistical significance in a single three-day sampling, but might over the course of a week or more.
A better way to distinguish the meaningful patterns is to compare Gallup's results to those from another pollster or two. Let's start with a chart of the Rasmussen Reports daily tracking poll over the same six week period. Not surprisingly, the average of the Rasmussen data gathered since February 8 also shows Obama leading by a single percentage point (45% to 44%).
Compare the two charts (or look at the chart below, which plots a Clinton-minus-Obama margin for both polls) and you will see several features in common:
Both show a shift from Clinton to Obama between Super Tuesday and mid-February
Both show Obama maintaining a low single-digit lead from mid to late February
Both show Clinton rising a few days before the March 4 primaries and falling a few days after
And yet, at about the time the news surrounding Jeremiah Wright became a full-blown media obsession (March 14), the results of the two polls appear to diverge. Why is that?
We should keep in mind that Gallup and Rasmussen collect their data differently (and ask slightly different questions -- see the postscript). Gallup uses live interviewers, makes repeated call-backs to unavailable respondents, samples cell phone numbers, and routes calls to Spanish-speaking interviewers when they reach a Spanish-speaking household. Rasmussen uses an automated system and recorded voice to conduct interviews, applies a slightly tighter screen for "likely voters," yet (as I understand it) makes no call-backs, does not call cell phones and makes no provision for bilingual interviewing.
Some, I am sure, will readily conclude that one or more of these characteristics (or perhaps others that I've omitted) provide "obvious" explanations for the discrepancies. I am reluctant to make too much of these differences. The reasons may become clearer after we look at data from a third source. I obtained it earlier today from an anonymous but trusted pollster that I'll call "Polimatic." Here is a chart of Polimatic's tracking data for the last six weeks:
Those who notice the greater stability in the Polimatic data as compared to Gallup and Rasmussen are on to something important. Next consider how the Clinton-minus-Obama margin from the Polimatic data compares to the other pollsters:
See some interesting patterns? Starting to form theories about what type of poll Polimatic is, or how their methodology might influence their results?
Well, before you go too far, I should fess up. I fibbed. "Polimatic" is not a pollster at all. The data are based on a simulation run by our friend Mark Lindeman. Mark created a spreadsheet that generates random results consistent with a three-day rolling average tracking sample of 1,260 interviews and the assumption that the "true" population value remains an unchanging 46% to 45% Obama lead.
The Polimatic line is more stable, suggesting that the consistently highest highs and lowest lows of the blue and red lines probably represent real divergence. However, the purely random variation of the simulated poll trend line is frequently hard to distinguish from the real surveys.
To generate the results above, I closed my eyes and clicked the mouse to let the spreadsheet recalculate. As such, the "Polimatic" line illustrates one potential trend showing nothing but random noise around a 46% to 45% margin. I'll say it one more time to be clear: All of the variation in the Polimatic trend lines is based on purely random chance. Any resemblance to real changes as measured by Gallup or Rasmussen is entirely coincidental.
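For readers who want to reproduce the flavor of the exercise, here is a sketch of a Lindeman-style simulation. This is not his actual spreadsheet; the 420-interviews-per-day figure (1,260 per three-day roll) and the six-week window are assumptions for illustration:

```python
import random

random.seed(7)  # change or remove the seed to "click the mouse" again

DAILY_N = 420                            # assumed interviews per day
TRUE = {"Obama": 0.46, "Clinton": 0.45}  # the hypothetical unchanging reality

def one_day():
    """Simulate one day's interviews; return (obama, clinton) counts."""
    obama = clinton = 0
    for _ in range(DAILY_N):
        r = random.random()
        if r < TRUE["Obama"]:
            obama += 1
        elif r < TRUE["Obama"] + TRUE["Clinton"]:
            clinton += 1
    return obama, clinton

days = [one_day() for _ in range(42)]  # six weeks of daily interviewing

# Report three-day rolling averages, as a daily tracker would
rolling = []
for i in range(2, len(days)):
    o = sum(d[0] for d in days[i - 2 : i + 1])
    c = sum(d[1] for d in days[i - 2 : i + 1])
    rolling.append((round(100 * o / (3 * DAILY_N)), round(100 * c / (3 * DAILY_N))))

print(rolling[:7])  # a wobbling series of (Obama%, Clinton%) pairs around 46-45
```

Plot the full `rolling` series and you will see "figure eights" of your own, all of them pure noise.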
So what can we conclude from all this?
First, there has been far more stability than change in the national Obama-Clinton vote preference since Super Tuesday, and that includes the last ten days. To the extent that we have seen real changes, they are barely bigger than what we might expect by chance alone.
Second, if you look closely, you will notice that the seemingly odd divergence between Gallup and Rasmussen since the Wright story broke is really not that unusual. It is comparable to similar separations in the trend lines that occurred around February 13 and February 29. Random variation will do that.
Third, and probably most important, it is far too easy to look at these rolling average tracking surveys and see compelling narratives and spin interesting theories from what is often little more than random noise.
PS: Yes, as a few readers have already suggested in prior comments, some of the stability in national Democratic vote preference may stem from the fact that most states have already held their primaries and caucuses. We had some discussion about a month ago about how Gallup alters its screen slightly to accommodate states that have already voted. However, neither Gallup nor Rasmussen alters their vote question for those who have already voted. Here is the text used by each:
Gallup: Which of these candidates would you be most likely to support for the Democratic nomination for president in 2008, or would you support someone else? [ROTATED: New York Senator, Hillary Clinton; Former Alaska Senator, Mike Gravel; Illinois Senator, Barack Obama]
Rasmussen: If the Democratic Presidential Primary were held in your state today, would you vote for Hillary Clinton or Barack Obama? [options are rotated]
PPS: While I was writing this post, Mickey Kaus blogged a theory for the divergent Gallup and Rasmussen trend lines:
The 'Bradley Effect' is Back? Gallup's national tracking poll has Obama retaking the lead over Hillary after bottoming out on the day of his big race speech. Rasmussen's robo-poll, on the other hand, shows Obama losing ground since last Tuesday. True, even Rasmussen doesn't seem to be putting a lot of emphasis on his survey's 6-point shift. But isn't this week's primary race exactly the sort of environment--i.e., the issue of race is in the air--when robo-polling is supposed to have an advantage over the conventional human telephone polling used by Gallup? Voters wary of looking like bigots to a live operator--'and why didn't you like Obama's plea for mutual understanding that all the editorial pages liked?'--might lie about their opinions, a phenomenon known as the Bradley Effect. But they might be more willing to tell the truth to a machine. ...
Or more likely, the apparent differences between the two polls are about random variation in one or both. If you average the results from data collected since March 14 (the day the Wright story exploded), they are not very different:
Live Interviewer Gallup Daily: Clinton +2 (47% to 45%)
Automated Rasmussen Reports: Obama +1 (45% to 44%)
Kaus also links to an automated PPP survey in North Carolina that fielded on the evening of March 17, the night before the Obama speech. As such, it is consistent with Gallup's "bottoming out" for Obama, not contradictory. The SurveyUSA results I blogged about on Friday were also collected from March 14 to March 16, just after the Wright story broke but before Obama's speech.
Yesterday, Clinton chief strategist Mark Penn released a polling memo highlighting "some pretty big changes" in polling numbers that suggest "a strong swing in momentum in the race to Hillary." Later in the afternoon, ABC News correspondent Jake Tapper posted some analysis by Peyton Craighill of the ABC News Polling Unit:
Mark Penn’s note is full of overblown claims based on current polling. He’s cherry picking numbers from recent polls. Much of his claim of a Clinton swing is based on the latest tracking data from Gallup in which Clinton is now ahead by 7 points. If you go back two more days Obama has a 7-point lead in a separate USA Today/Gallup poll. CBS has a new poll out today that shows a close 46-43 percent Obama-Clinton race. The CBS poll also has the match ups with McCain at 48-43 percent for Obama-McCain and 46-44 percent for Clinton-McCain. We see little indication of a shift to Clinton. Of the nine polls cited in his note, five of them are not airworthy.
Tapper adds: "'Airworthy' is a term our Polling Unit uses for polls so poorly done we are discouraged from mentioning them on air." I believe Tapper left out the word "not" in that sentence. Polls considered "not airworthy" are those ABC does not mention on air, and that category includes polls conducted using an automated methodology, such as those by SurveyUSA (ABC details its standards here).
Without reopening the long debate on automated polls (a topic we've written about often), we should note that the latest round of SurveyUSA polls do generally show Obama's support worsening in general election matchups against McCain. Of course, all of those surveys were fielded last weekend (March 14-16) while the Jeremiah Wright sound-bites played endlessly on the cable news networks but before Obama's speech on Tuesday. Probably the wisest advice on how to interpret poll numbers this week comes from some commentary yesterday by NBC News political director Chuck Todd:
Don't use the polls this week to judge where Obama is and what kind of damage...is it long term or is it short term. I'd wait a week and look at the polls in a week and then we'll know how badly this [hurt Obama] because there has certainly been critical mass as far as attention has been concerned on the speech and how he is trying to pivot and move on. So if there is an uptick then we will know that what we are seeing is bottom, what we are seeing today is the worst, and if today is bottom, the Obama campaign probably thinks they can recover.
For those who will be watching results from the Mississippi primary tonight, here is a breakdown of the demographics of recent surveys as well as the tabulations of vote by race. First, the demographic composition and overall results:
Obviously, we have far fewer polls (and pollsters) to consider than for last week's primaries in Texas and Ohio. Only three pollsters have been active in Mississippi -- ARG, Rasmussen and InsiderAdvantage -- and their reported demographic compositions have been reasonably consistent. ARG uses live interviewers, Rasmussen Reports uses an automated (IVR) methodology and InsiderAdvantage has used both in recent months but does not specify their methodology on their last two releases.
The vote results by race are less consistent. All show Clinton with a wide lead among white voters and Obama with a wider lead among African-Americans, but the specific results -- particularly Obama's support among black voters -- have varied. Assuming that the networks conduct an exit poll tonight, we will see in a few hours how the results from that survey compare to those above.
Update: Rasmussen Reports emails with demographic composition numbers (thank you), so I updated the table above.
Update2: As jr886 points out in the comments, the folks at The Page are certainly expecting exit poll results.
There has been a considerable buzz over the last two days about the surveys released yesterday by SurveyUSA that test both McCain-Obama and McCain-Clinton trial-heat questions in all 50-states. Putting aside the concerns some have about SurveyUSA's automated methodology and the other usual caveats about horse race polling at this stage in the campaign, I tend to agree with the critique from Matt Yglesias (via Sullivan):
Each of these polls has a sample size of 600, so the margin of error will come into play. What's more, there are 100 separate polls being aggregated here, so the odds are that several of these are just bad samples.
True on both counts. SurveyUSA colors in states on their maps even if a candidate leads by a point or two, margins that are not close to achieving statistical significance. However, since SurveyUSA says they did 600 interviews in each state, we can take their analysis a step further, applying statistical sampling error to the candidates' margins in each state.
Professor Franklin and I have done just that, classifying each state based on the statistical significance of the candidate's lead. We call a state "strong" for the candidate if they lead by a margin that is statistically significant at a 95% level of confidence, the level typically used to calculate the "margin of error" attached to most surveys. We label as "lean" any state where a candidate leads by more than one standard deviation, which amounts to a 68% confidence level. We label all other states as toss-ups.
Note also that these significance tests assume "simple random sampling," which produces a smaller error margin than we would get if we could take into account that SurveyUSA, like virtually all pollsters, weights its data. We would need access to the raw data and weights in order to do truly correct significance testing.
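The strong/lean/toss-up rule described above can be sketched as follows, under the same simple-random-sampling assumption. The function name and structure are ours for illustration, not SurveyUSA's or Professor Franklin's actual code:

```python
import math

def classify(leader_pct, trailer_pct, n=600):
    """Rate a state by the statistical significance of the leader's margin:
    'strong' above 1.96 standard errors (95% confidence), 'lean' above one
    standard error (68% confidence), otherwise 'toss-up'."""
    p1, p2 = leader_pct / 100, trailer_pct / 100
    lead = p1 - p2
    # standard error of the difference between two shares from one sample
    se = math.sqrt((p1 + p2 - lead ** 2) / n)
    if lead > 1.96 * se:
        return "strong"
    if lead > se:
        return "lean"
    return "toss-up"

print(classify(55, 40))  # strong: a 15-point lead dwarfs the sampling error
print(classify(50, 44))  # lean: 6 points clears one standard error but not two
print(classify(48, 46))  # toss-up: 2 points is within one standard error
```

Note that on 600 interviews even a six-point lead only rates "lean," which is why so many colored-in states on the maps are statistically unsettled.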
The tables and maps appear below, followed by some discussion. First, here are the results and a map showing an Obama vs. McCain match-up (you can click on any of the images for a larger size version):
And here are the results and a map showing a Clinton vs. McCain match-up:
If you would prefer, you can also download the spreadsheet that we used to create the tables.
Now that you have all of the data before you, let's consider the merits of the project and a few caveats about the data. First, this sort of project -- which involved 30,000 interviews completed in 50 states over a three-day period (February 26-28) -- would not have been feasible with live interviewers.
On the other hand, the automated methodology is controversial with traditional survey researchers. I wrote about the arguments for and against IVR (interactive voice response) surveys in Public Opinion Quarterly, and I have blogged often on the subject, both here at Pollster and on its forerunner MysteryPollster. Readers are obviously welcome to share their opinions about the IVR methodology in the comments.
The other caveats noted by SurveyUSA are worth repeating: They surveyed all self-reported registered voters, and did not attempt to screen for "likely voters" (although many national pollsters do the same at this stage, feeling that it is too early in the process to attempt to predict which voters will actually cast ballots). McCain would likely do slightly better in both match-ups under a "likely voter" screen. Also, we are obviously still eight months from the election. Much can and will change in terms of voter perceptions and preferences.
Let us also keep in mind the limitations of random sampling error. It tells us only about the variability that comes from calling a sample of households rather than dialing every working phone number in every state. As with any survey, it tells us nothing about the potential for error based on the wording of the questions, the selection of respondents within the household and the voters missed because they lack land-line phones or do not participate in the survey. Be careful about using the misnomer "statistical tie" to describe states in the toss-up category. One candidate would likely show a "significant" lead if we could increase the sample size -- we just lack the statistical power to know which candidate that would be.
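To put a rough number on that last point about statistical power, still assuming simple random sampling, here is a sketch of how large a sample it would take before a given lead clears the 95% bar (the helper function is ours, for illustration):

```python
import math

def n_needed(p1, p2, z=1.96):
    """Approximate sample size at which a lead of (p1 - p2) becomes
    statistically significant at the confidence level implied by z."""
    lead = p1 - p2
    # solve z * sqrt((p1 + p2 - lead**2) / n) = lead for n
    return math.ceil(z ** 2 * (p1 + p2 - lead ** 2) / lead ** 2)

print(n_needed(0.48, 0.46))  # a 2-point lead needs roughly 9,000 interviews
print(n_needed(0.50, 0.44))  # a 6-point lead needs about 1,000
```

Which is why 600 interviews per state can leave a genuinely close race formally a toss-up: the lead may well be real, but the sample is far too small to prove it.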
Finally, keep in mind that since we are looking at 100 tests (2 each in 50 states), these results probably misclassify five states by chance alone (as opposed to the way we would classify them if SurveyUSA had called every working telephone in the 50 states).
With all the caveats out of the way, what does all this data tell us? Consider this summary of the electoral vote totals**:
These data are less useful in forecasting the ultimate result than they are in gauging the relative strength of both Clinton and Obama as of last week (February 26 to February 28). Those dates are important, since both the Gallup Daily and Rasmussen Reports automated tracking have shown Clinton gaining ground on Obama nationally over the last week.
Nonetheless, as of last week, Hillary Clinton led in states that add up to a slightly greater electoral vote total counting the leaners (250 for Clinton vs. 244 for Obama). Still, Obama appeared to put more states into play (138 electoral votes rated pure toss-up in an Obama-McCain race, compared to a Clinton-McCain race), so Obama's potential electoral vote advantage is greater.
The most interesting aspect of these surveys is the states that explain those differences. Let's consider first the states where Obama does better than Clinton:
Obama moves three states from lean McCain to strong Obama: Colorado, Iowa and Oregon
Obama moves two states from strong McCain to lean Obama: Nevada and North Carolina
Obama leads in two states that are toss-ups in a Clinton-McCain race: New Mexico (lean) and Washington (strong)
Obama moves four states from strong McCain (against Clinton) to toss-up: Nebraska, New Hampshire, North Carolina and Virginia
On the other hand, Clinton does better than Obama in a smaller number of states:
Clinton moves one state from strong McCain to strong Clinton: Arkansas
Clinton moves one state from strong McCain to lean Clinton: West Virginia
Clinton leads in the two states that are toss-ups in an Obama-McCain race: Florida (strong) and New Jersey (lean)
Clinton moves one state from strong McCain to toss-up: Tennessee
Clinton moves one state from lean McCain to toss-up: Pennsylvania
Here is another table that makes it easier to see these comparisons (again, click on the image to see a full size version):
So, Pollster readers, what do you think?
**And yes, after putting these tables together I see that SurveyUSA split the Nebraska electoral votes based on the vote totals, something I did not do.
Update: Nick Beaudrot (via Yglesias) creates thematic maps based on the same data keyed to the size of the candidate margin.
A few comments on our post of the new SurveyUSA Texas poll raised two questions worthy of further discussion.
First, reader s.b. notes:
[W]ith an automated survey, if it's in English, they aren't sampling Spanish-only or mostly-Spanish speakers. I think it skews these results.
Some pollsters (such as Gallup) offer voters the opportunity to complete the survey in Spanish when they encounter Spanish speaking respondents. Most pollsters, however, will simply end the interview in these instances. I asked SurveyUSA's Jay Leve about their procedure in Texas and he notes that while they do have the facility to offer respondents the option to complete a survey in either English or Spanish (and have done so in mayoral elections in New York and Los Angeles and some congressional districts), they did not offer a Spanish interview for their Texas poll.
However, before leaping to conclusions about the SurveyUSA results, keep in mind that only one of the other Texas pollsters reports using bilingual interviewing for any of their surveys [Correction: interviews for the Washington Post/ABC News poll "were conducted in English and Spanish"]. Three of the other pollsters -- Rasmussen Reports, PPP and IVR polls -- also interview with an automated methodology rather than live interviewers.
And before leaping to conclusions about all the Texas polls, we might want to know just how many Latino voters in Texas speak only Spanish. I have not done survey work in Texas, but my memory from conversations with pollsters that do is that the percentage that will actually complete an interview in Spanish when offered is typically in the low single digits.
Second, several commenters have speculated about the small changes in the demographic composition of the last two SurveyUSA Texas polls. For example, "Mike in CA" points out:
Hispanic turnout at 28% sounds just about right. The last SUSA survey had it at 32% which was way too high. It seems SUSA has scaled back their Hispanic estimates, so they must have a reason. Additionally, they boosted AA to 23%, from 18%. Seems reasonable considering the extraordinary increases in early voting turnout from Houston and Dallas [emphasis added].
That's not quite right. Keep in mind that SurveyUSA's approach to likely voter modeling is comparable to that used by Iowa's Ann Selzer, in that they do not make arbitrary assumptions about the demographic composition of the likely electorate. As SurveyUSA's Jay Leve explains, they "weight the overall universe of Texas adults to U.S. census" demographic estimates, then they select "likely voters" based on screen questions and allow their demographics to "fall where they may." So some of the demographic variation from survey to survey is random, but large and statistically significant variation should reflect real changes in the relative enthusiasm of voters. Leve goes into more detail in the email that I have reproduced after the jump, which also includes the full text of the questions they use to select likely voters.
Jay Leve and his crew at SurveyUSA have been busy this week. Following up on our discussion of their pollster report cards, SurveyUSA has a new and improved scorecard chart for individual state primaries (example for Florida Republicans with explanation here, example for Wisconsin Democrats here). The new state-level report card format includes eight different measures of error and a number of additional variables intended to help us "better understand the correlation between the methodological choices an election pollster makes and the results an election pollster produces." Those variables include:
Length of field period
Proximity of poll release to election
The number of undecided voters
The number of respondents interviewed
The sample source (if available)
The interviewing technique (if available)
The method of respondent selection (if available)
They also updated their "high-level" report cards (summarizing one measure of error for all final polls in 2008) to include polls from Wisconsin.
Finally, in another interesting innovation, they are also soliciting reader input on the McCain story:
Washington Post polling director Jon Cohen reported an easily overlooked but important statistic yesterday, especially to anyone thinking about the reliability of the last round of Iowa polls. Using the Iowa tables here at pollster.com, he determined that public polls in Iowa this year have interviewed nearly 80,000 "likely caucus goer" respondents:
As a ratio of voters polled to expected turnout, this must be something of a record. (In 2004 about 120,000 people participated in the Democratic caucuses, and in 2000 about 90,000 in the GOP contest.)
And it's not just the public pollsters calling. Campaigns have been known to set up a phone bank or two to gauge opinion, solicit support and cajole voters to actually show up and spend hours caucusing in the middle of winter.
A month-and-a-half ago, already deep into the "silly season" but well before the final stretch, eight in 10 likely Democratic caucus goers and nearly six in 10 on the GOP side said they'd been called on the telephone by at least one of the campaigns. And Pew reported the pervasive use of robo-calls (though most Iowans who get such automated calls about the campaign said they usually hang up).
I can add two confirming anecdotes. The first comes from a comment left by "Randy Iowa" here at Pollster just last night:
Is there a Do Not Call list that i can get on? I have received a survey call everyday this week and at least one candidate has called everyday as well.
I emailed Randy, and sure enough, he is an Iowa voter. He says that "80%" of the calls he received were automated. Interestingly, he is also a registered independent (not affiliated with a party) who has never participated in a caucus (though he has "voted Republican my entire life"). (By the way, the short answer to Randy is no. Pollsters and political campaigns are exempt from the federal do-not-call restrictions, though at least one group is trying to change that).
I wonder how many calls those identified as past caucus goers are getting? Here is one possible answer in the form of an email I received about an hour ago from a "help desk" operator at a major residential telephone company. He apparently assumed (mistakenly) that Pollster.com conducts surveys:
Subject: Please stop calling this customer
This customer is getting upwards of 20 calls a day from automated poll services, she lives in Iowa and her phone number is 563-[redacted]. Please stop calling her.
Not surprisingly, the recipient of the calls lives near Davenport, Iowa.
Aside from the spectacle of the sheer volume of "poll" calls, we might want to think about what all that calling is doing to the response rates the real pollsters are getting. And if pollsters are having a harder time getting voters to respond this week, are those suddenly reluctant voters skewing the results? We may never know, of course, but if nothing else, I would be very nervous were I using an automated (IVR) methodology to collect survey data in Iowa right now. More important: I wonder how many Iowans have been ignoring their ringing phones altogether the last few days?
A suggestion from alert reader and frequent commenter Andrew:

I write to suggest that you analyze the huge discrepancy between the latest Rasmussen and Washington Post/ABC polls. I'm talking about the Republican nomination. Rasmussen says Thompson is up by 4 over RG, while WP/ABC says Rudy is up by 20 pts over FT, who isn't even in second place here (36 RG to 14 FT). One of these pollsters is obviously very wrong. Two polls cannot both be accurate if their margins of victory do not approximate each other. This is a humongous 24 point difference.
Here, with a little assist from Professor Franklin, is a chart showing the discrepancy that Andrew noticed. The two surveys do seem to show a consistent difference that is clearly about more than random sampling error. The ABC News/Washington Post survey shows Giuliani doing consistently better, and Thompson doing consistently worse, than the automated surveys conducted by Rasmussen Reports, although the discrepancy has been largest in terms of how the most recent ABC/Post poll compares to Rasmussen surveys conducted over the last month or so.

To try to answer Andrew's question, it makes sense to take two issues separately. First, why are the surveys producing different results for the Republican primary?
At the most basic level, these surveys seem to be measuring the same thing: Where does the Republican nomination contest stand nationally? And both surveys begin with a national sample of working telephone numbers drawn using a random digit dial (RDD) methodology. Take a closer look, however, and you will see some pretty significant differences in methodology:
The ABC/Post survey uses live interviewers. Rasmussen uses an automated recorded voice that asks respondents to enter their answers by pushing buttons on a touch-tone keypad. This method is known as Interactive Voice Response (IVR). The response rates -- and more importantly, the kinds of people that respond -- are likely different, although neither pollster has released specific response rates for any of the results plotted above.

The ABC/Post survey attempts to select a random member of each household to be interviewed by asking "to speak to the household member age 18 or over at home who's had the last birthday" (more details here). Rasmussen interviews whatever adult member of the household answers the telephone. Both organizations weight the final data to reflect the demographics of the population.

Rasmussen Reports weights each survey by party identification, using a rolling average of recent survey results as a target (although their party weighting should have little effect on a sub-group of Republican primary voters). The ABC/Post survey does not weight national surveys at this stage in the campaign by party.
[Update -- one I overlooked: The ABC/Post survey includes Newt Gingrich on their list of choices. Gingrich receives 7% on their most recent survey. If the Rasmussen survey prompts Gingrich as a choice, they do not report it. It is also possible that Rasmussen omits other candidates as well, as their report provides results for just Giuliani, Thompson, Romney and McCain. Update II -- Scott Rasmussen informs via email: "We include all announced candidates plus Fred Thompson"].
Perhaps most important for Andrew's question: The ABC/Post survey asks the presidential primary question of all adults that identify with or "lean" to the Republicans. The Rasmussen survey screens to a narrower slice of the population: those they select as "likely Republican primary voters."

Unfortunately, neither pollster tells us the percentage of adults that answered their Republican primary question, but we can take a reasonably educated guess: "Leaned Republicans" have been somewhere between 35% and 42% of the adult population on surveys conducted in recent months by Gallup and the Pew Research Center. If Rasmussen's likely voter selection model for Republicans is analogous to their model for Democrats, their "likely Republican primary" subgroup probably represents 20% to 25% of all adults.
Consider also that, even before screening for "likely voters" and regardless of the response rate, those willing to complete an IVR study may well represent a population that is better informed or more politically interested than those who complete a survey with an interviewer.

Put this all together, and it is clear that the Rasmussen survey is reaching a very different population, something I would wager explains much of the difference in the results charted above.

Now, the second question: which result is more "accurate"? It is tempting to say that this question is impossible to answer, since we will never have a national primary election to check it against. But a better answer may be that "accuracy" in this case depends on what we want to use the data for.
If we were trying to predict the outcome of a national
primary, and if all other aspects of methodology were equal (which they're
not), I would want to look at the narrower slice of "likely voters" rather than
all adult "leaned Republicans." Since the nomination process involves series of
primaries and caucuses starting with Iowa and New Hampshire, and since
the results from those early contests typically influence preferences in the
states that vote later, we really need to focus on early states for a more
"accurate" assessment of where things stand now. While interesting and fun to
follow, these national measurements provide only indirect indicators of the
current status of the race for the White House.
Why would the ABC/Post survey want to look at all
Republicans, rather than likely voters? Here is the way ABC polling director
Gary Langer explained it in his online
column this week:
I like to think there are two things we cover in
an election campaign. One is the election; the other is the campaign.
The campaign is about who wins. It's about tactics
and strategy, fundraising and ad buys, endorsements and get-out-the-vote
drives. It's about the score of the game - the horse race, contest-by-contest,
and nothing else. We cover it, as we should.
The election is the bigger picture: It's about
Americans coming together in their quadrennial exercise of democracy - sizing
up where we're at as a country, where we want to be and what kind of person
we'd like to lead us there. It's a different story than the horse race, with
more texture to it, and plenty of meaning. We cover it, too.
We ask the horse race question in our national
polls for context - not to predict the winner of a made-up national primary,
but to see how views on issues, candidate attributes and the public's personal
characteristics inform their preferences.
Questions like Andrew's are more consequential in the statewide surveys we
are tracking here at Pollster.com, and those surveys have been producing some
discrepancies even bigger than the one charted above. We will all be in a
better position to make sense of those differences if we know more about the
methodologies pollsters use. I'll be turning to that issue in far more detail soon.
The latest automated SurveyUSA poll in the Kentucky
Governor's race provides us with one of those classic conflicting poll stories
that we just love here at Pollster.com, because it illustrates how small differences
in methodology can have profound effects on the results. In this case, SurveyUSA
shows Democrat Steve Beshear leading incumbent Republican Ernie Fletcher by a
23 point margin (59% to 36%) with only 5% undecided. Meanwhile, an InsiderAdvantage
poll conducted a week earlier shows Beshear leading by just three points (41%
to 38%) with a much larger number (21%) in the undecided category.
What explains the difference? Continue after the jump for
more explanation, but my best guess is that the solution can be found in this
conundrum: On a poll, "undecided" means something different than "still trying to decide."
The New York Times gives prominent play to a story
on bills working their way through various state legislatures across the
country to crack down on prerecorded campaign calls:
Nearly two-thirds of registered
voters nationwide received the recorded telephone messages, which as political
calls are exempt from federal do-not-call rules, leading up to the November
elections, according to a survey by the Pew Internet and American Life
Project, an independent research group. The calls, often known as
robocalls, were the second most popular form of political communication,
trailing only direct mail, the group said.
The article did not address the potential impact, if any, on
automated surveys conducted using a pre-recorded script that ask respondents to
answer by pressing keys on their touch tone telephones. As of January, according to
the newsletter of the
Council of American Survey Research Organizations (CASRO), there were already "sixteen
bills in seven different states addressing automated calls." However, as I read
the CASRO report, most of these new bills -- like the federal "do-not-call"
regulations -- do not appear to restrict calls made for the purpose of survey research.
I thought it might be useful to ask the opinion of survey
researchers who conduct automated "interactive voice response" (IVR) surveys
for their reaction to today's Times story.
Jay Leve is the editor of SurveyUSA,
a firm that conducts public polls exclusively with IVR. His comment:
The people who try to deceive voters, using whatever
technology, should be put in prison. Nothing is more repugnant than individuals
or firms who use technology to disenfranchise voters, which is what the calls
being debated do. Many such calls are designed to suppress turnout. They are
the 21st Century Bull Connor, with a fire-hose replaced by Ethernet.
SurveyUSA welcomes carefully drawn legislation that makes it a crime to mislead
voters, by whatever means. SurveyUSA opposes sloppily drawn legislation, in any
jurisdiction, that fails to recognize the vital community interest served by
legitimate, hyper-local public opinion research.
Thomas Riehle is a partner in RT Strategies, a firm that usually conducts
telephone surveys using live interviewers (including the polls conducted for
the Cook Political
Report). One exception was last year's Majority
Watch project, which fielded pre-election polls via IVR in contested U.S.
House races. Riehle's comment:
The research industry, under the
leadership of CMOR [the Council for Marketing and
Opinion Research], has done a good job in helping legislators and
regulators distinguish between a telemarketing program contacting hundreds of
thousands of households with sales or advertising mass-marketing messages,
which 'do not call' lists regulate, and live telephone interviews with a few
hundred or a thousand households who complete a survey research project. I
would hope that regulators or legislators intending to limit any negative impact
they might find caused by hundreds of thousands of political telemarketing
recorded calls will not unintentionally limit the ability to complete a few
hundred survey research calls using recorded-voice interviews.
As they say in radio land, our comment line is open. What do you think?
Mickey Kaus and Chris
Bowers at MyDD noticed that Rasmussen Reports has been showing a much
closer race on their automated national
tracking of the 2008 Democratic presidential primary contest. Both floated
different theories for that difference that imply that Rasmussen's numbers
are a more accurate read. This post takes a closer look at those arguments,
although the bottom line is that hard answers are elusive.
The chart below shows how the recent Rasmussen surveys
compare to the trend for all other conventional polls as tracked by Professor
Franklin here at Pollster. The bolder line represents the average trend across
all conventional surveys, while the shorter narrow lines connect the recent
Rasmussen surveys. Click the image to enlarge it, and you will see that all but
one of the Rasmussen surveys shows Barack Obama running better than the overall
trend. The Rasmussen results for Clinton
show far more variability, especially during the first four weeks of
Rasmussen's tracking program. They show Clinton
running worse than other polls over the last three weeks. Note that a new survey
released overnight by
Gallup (that shows Clinton's lead "tightening") has not altered
the overall trend.
Of course the graphic above includes survey questions that continue
to include Al Gore on the list of candidates. In order to reduce the random
variability and make the numbers as comparable as possible, I created the following
table. It shows Clinton leading by an average of roughly 15 points (38.6%
to 23.8%) on the three most recent conventional telephone surveys, but by just
5 points (33.0% to 28.3%) on the three most recent Rasmussen automated surveys
(surveys that use a recorded voice and ask respondents to answer by pressing
buttons on their touch tone phones). Given the number of interviews involved,
we can assume that these differences are not about random sampling error. Something
is systematically different about the Rasmussen surveys that has been showing a
tighter Democratic race over the last three weeks.
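Whether a gap of this size could plausibly be random sampling error is easy to sanity-check with a standard two-sample calculation. The sketch below uses assumed pooled sample sizes (the post does not report them) and assumes simple random sampling, so it is illustrative only:

```python
import math

def moe_diff(p1, n1, p2, n2, z=1.96):
    """95% margin of error for the difference between two proportions
    drawn from independent simple random samples."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z * se

# Assumed pooled sample sizes for the three conventional surveys and
# the three Rasmussen surveys -- illustrative, not reported figures.
n_conv, n_rasm = 1500, 2000

gap = 0.386 - 0.330          # Clinton's share: conventional vs. Rasmussen
moe = moe_diff(0.386, n_conv, 0.330, n_rasm)
print(f"gap = {gap:.3f}, 95% MoE of difference = {moe:.3f}")
print("exceeds sampling error" if gap > moe else "within sampling error")
```

With pooled samples anywhere near these sizes, a 5.6-point gap in Clinton's share is well outside the sampling error of the difference, consistent with the conclusion that something systematic separates the two sets of surveys.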
But what is that difference? That's a tougher question to answer.
Here are some theories, including those suggested by Bowers and Kaus:
1) The automated methodology yields
more honest answers about vote choice (and thus, a more
accurate result). The theory is that some people will hesitate to reveal
certain opinions to another human being, particularly those that might create
some "social discomfort" for the respondent. Thus, Kaus provides his "Don't Tell Mama"
theory: "men don't like Hillary but are reluctant to say so in public" or to
"tell a human interviewer -- especially, maybe, a female interviewer."
2) The people sampled by
Rasmussen's surveys are more representative of likely Democratic primary
voters because it uses a tighter screen. Chris Bowers makes
that point by arguing that the Rasmussen screen looks slightly tighter than
those used by other pollsters - "38-39% of the likely voter population" rather
than the "40-50% of all registered voters [sampled by] the vast majority of
national Democratic primary polls."
3) The people sampled by automated
surveys are more representative of likely primary voters because
they give more honest answers about whether they will vote. We
know from at least 40 years of validation studies that many respondents will
say they voted when they did not, due to the same sort of "social discomfort"
mentioned above. Voting is something we are supposed to do, and a small portion
of adults is reluctant to admit to non-voting to a stranger on the telephone. In
theory, an automated survey would reduce such false reports.
4) The people sampled by automated
surveys are less representative of likely primary voters because
they capture exceptionally well informed respondents. This theory is one
I hear often from conventional pollsters. They argue that only the most
politically interested are willing to stay on the phone with a computer, and so
automated surveys tend to sample individuals who are much more opinionated and
better informed than the full pool of genuinely likely voters.
Let's take a closer look at the arguments from Kaus and Bowers.
Kaus makes much of the fact that the Rasmussen poll shows a
big gender gap, with Clinton
showing a "solid lead" (according to Rasmussen) among women, but trailing 11
points behind Obama among men. He wonders if other polls show the same gender
gap. While precise comparisons are impossible, all the other polls I found that
reported demographics results also show Clinton doing significantly better
among women than men (Cook/RT
Strategies, CBS News,
Time and the Pew Research
Center). Rasmussen certainly shows Obama doing better among men than the
other surveys, but then, Rasmussen shows Obama doing better generally than the other polls.
Of course (if it turns out the gender gap in the two polls is roughly comparable) it could be
that many men and many women don't like Hillary but are reluctant to say so in public.
His backup theory may be plausible, especially when interviews are
conducted by women, although we obviously have no hard evidence either way.
Bowers' theory feels like a better fit to me, especially if
we also consider the possibility that the absence of an interviewer may
reduce the "measurement error" in the selection of likely voters. The bottom
line, however, is that we really have no way to know for sure. It is certainly possible,
of course, that Rasmussen's sampling is less accurate. All of these
theories are plausible, and without some objective reality to use as a
benchmark, we can only speculate about which set of polls is the most valid.
What strikes me most, as I go through this exercise,
is how little we know about some important methodological details. What are
the response rates? Are Rasmussen's higher or lower than conventional polls? How many respondents answered the primary vote questions on recent surveys conducted by ABC News/Washington Post, NBC/Wall Street Journal and Fox News and the most recent CNN survey? Many
pollsters provide results for subgroups of primary voters, yet virtually none
tell us about the number of interviews behind such findings. We also know
nothing of the demographic composition of their primary voter subgroups, including
gender, age or the percentage that initially identify as independent.
And how exactly do those pollsters that currently report on "likely
voters" select primary voters? How tight are their screens? Very little of information
is in the public domain (and given that these numbers involve primary results,
the voter guide from 2004 is of little help).
I emailed Scott Rasmussen to ask about their likely voter
procedure for primary voters. His response:
with the tightest segment from our pool of Likely Voters... Dems are asked about
how likely they are to vote in Primary... Unaffiliateds are asked if they had the
chance, would they vote in a primary... if so, which one...
I am not completely sure what the "tightest segment" is, but
my guess is that they take those who say they will definitely or certainly vote
in the Democratic primary. He also confirmed that the 774 likely Democratic
primary voters came from a pool of 2,000 likely voters. So last night I asked what portion of adults qualified as likely voters so we might do an apples-to-apples comparison of the relative "tightness"
of survey screens. As of this writing, I have not received an answer.
UPDATE: Via email, Scott Rasmussen tells me that while he did not have numbers for that specific survey readily available, the percentage of adults that qualify as likely general election voters is typically "65% to 70%...for that series." He promised to check and report back if the numbers for this latest survey are any different.
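The arithmetic implied by those figures is worth making explicit. Using only the numbers quoted above (774 likely Democratic primary voters from a pool of 2,000 likely voters, and likely voters at 65% to 70% of adults), the screen works out as follows:

```python
# Share of likely voters who qualify as likely Democratic primary voters
dem_primary_lv = 774 / 2000          # 38.7% of likely voters

# If likely voters are 65% to 70% of all adults (Rasmussen's reported
# range), the Democratic primary pool as a share of ALL adults is:
for lv_share in (0.65, 0.70):
    print(f"LV = {lv_share:.0%} of adults -> "
          f"primary pool = {dem_primary_lv * lv_share:.1%} of adults")
```

That puts Rasmussen's Democratic primary screen at roughly 25% to 27% of all adults, which gives us at least a rough basis for comparing the "tightness" of his screen to other pollsters'.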
But with respect to all pollsters again, and not just Mr. Rasmussen, why
is so little of this sort of information in the public domain? Most media pollsters
pledge to abide by professional codes of conduct that
require disclosure of basic
methodological details on request. Maybe it's time we start asking for that
information for every survey, and not just those that produce quirky results.
Picking up on the post from earlier tonight, the new Majority Watch surveys released today provide another strong indicator of recent trends, in this case regarding the race for the U.S. House. The partnership of RT Strategies and Constituent Dynamics released 41 new automated surveys conducted in the most competitive House districts.
Since they conducted identical surveys roughly two weeks ago in 30 of the 41 districts, we have an opportunity for an apples-to-apples comparison involving roughly 30,000 interviews in each wave. The table below shows the results from both waves from each of those 30 districts. The bottom line average indicates that overall, the Democratic margin in these districts increased slightly, from +1.9 to +2.7 percentage points during October.
Whatever one may think of their automated methodology, the Majority Watch surveys used the same methodology and sampling procedures for both waves. And as with the similar "mashup" of polls in the most competitive Senate races in the previous post, these also show no signs of an abating wave.
Interests disclosed: Constituent Dynamics provided Pollster.com with technical assistance in the creation of our national maps and summary tables.
Andrew Kohut is the President of the Pew Research Center and arguably the
dean of the survey research profession. President of the Gallup
Organization from 1979 to 1989, Kohut recently received the American
Association for Public Opinion Research's highest honor, their
2005 Award for Exceptionally Distinguished Achievement. He spoke with
Pollster.com's Mark Blumenthal last week about how the Pew Research
Center will measure voting intentions for the upcoming elections and
about the future of survey research.
Topic A - for just about everybody right now - is handicapping the races for control of the House and Senate. I'm sure our readers would be interested in your take. But I think perhaps of even greater interest would be what kinds of surveys and measures you are looking at and will be looking at over the coming weeks?
Well, we're going to do what we traditionally do in off-years and that is measure voting intentions for the House. Generally in off-years the pre-election polls do a pretty good job of estimating the popular vote for the House and we know that has a correspondence to the number of seats that each party has. In 1994 we were very fortunate that The Times Mirror Center, the center that preceded Pew, was among the first to say, "We've got a Republican plurality in the popular vote." We didn't have quite enough of a margin in the poll, even though the poll provided a very accurate estimate of the popular vote to flatly predict that Republicans would take over, but we described it as a high likelihood. We could have the same thing happen in this election. What I'm struggling with is that safe-seat redistricting has made the relationship between the popular vote and seats won by each party less than what it once was. And so we're going to have to try to make our estimates, taking into account the traditional relationship between seats and votes and how that relationship may have changed since the '90s Census was used to redistrict.
Will you be looking at any of the statewide surveys or congressional level surveys that are out in the public?
Well, I look at them just for the sake of trying to understand what else is going on out there, but what I learned from Paul Perry at the Gallup Organization was to not use ad-hoc judgments, but to focus on the survey measures that we use to estimate the size of the vote of the party or a candidate. So in the meantime we're concentrating on whether our turnout scale is working well, how the undecideds are likely to break, what the last minute trends are if any, and how stable are people's choices. Those are the things that are really most important to me. I'm not a handicapper, I'm a measurer. There's a difference.
Actually that's a perfect segue to another question I wanted to ask. Just before the 2004 election, as you well know, your final survey gave George Bush a three-point lead in the popular vote. And you did a projection in which you allocated the remaining six percent that were undecided about evenly and predicted a 51 to 48 Bush win, which turned out to be right on the nose exactly the way the popular vote broke. You wrote in that final report, "Pew's final survey suggests the remaining undecided vote may break only slightly in Kerry's favor." And I think you did a three-to-three allocation or something close to that. And I just wondered what you can tell us about the process you used to reach that conclusion then and what does it say about what you will do in the coming weeks?
Well, we do a couple of things. First, we throw out half of the undecideds because validation surveys show that they vote at very low rates. Then, we look at a regression equation that predicts choices based upon all of the other questions we have in the survey among the decideds and apply that model to the undecideds. We also then look at the way the leaners - that is the people who don't give us a choice initially - are breaking and make the assumption that the leaners are closer to the undecideds than to the people who give us an answer right off the top of their heads when we first ask them.
I want your readers to know that we ask several questions, the first one is the flat out question where we ask where you lean, and we look at how the leaners break. We take those two estimates in mind and divide the undecideds. They are based upon measures. They're not based upon "you know I think," "I got this feeling," "history tells us," or any of this other stuff where you can let judgments get in your way.
What I learned from Paul Perry - and I keep going back to him because he taught me everything I know about this - is that what you should be prepared to do is to have a way of measuring all of the things that you're interested in covering and be able to look at those measurements in the current election relative to your experience in previous elections. And we try to do that. The one time I didn't do that was in 2002, because I was pre-occupied with other things. On an ad hoc basis, I kicked one of my traditional questions out of the turn-out scale and it really hurt our projection. It made it too Democratic. I won't do that again. I chalk that mistake up to being pre-occupied with the first Global Survey that we were doing at the same time. In any event having said that, that's my philosophy and that's the way we will pursue it here at the Pew Research Center.
I'd like to take a more forward look at what trends you've seen developing in survey research. If you could try to imagine a world in ten or twenty years, how differently do you think the very best political surveys will be conducted?
I really don't know the answer to that. Hopefully somehow we're going to solve the problem of a sampling frame for online surveys, because I'm a firm believer that unless you have a sampling frame in which you can draw samples of people online, it's hard to do these post-facto weightings of people who opt-in to samples and make that work. I haven't seen it yet to my satisfaction. Obviously means of communication are so much more sophisticated and varied - the old land-line telephone will probably be a relic - so I don't have a good answer for you. I'm confident this is a practice that is pretty nimble and full of people who are survivors and will figure a way to cope with it. What that way is, I'm not sure.
I guess that takes me to one last topic. We've logged in over 1000 statewide polls in our database at Pollster.com, and more than half of the statewide surveys have been either automated recorded voice telephone (IVR) or Internet panel. And of the 200 or so polls that have been released on the House, about half of those have been automated. You spoke about the Internet panel problem and I wonder what sort of reaction you have to the explosion of automated recorded IVR surveys.
Well, I know they did reasonably well in one election. I would have to see them perform over a longer period of time. I'd like to see where they succeed and where they don't succeed. They always remind me a little bit of a New Yorker cartoon of two hounds sitting in front of a computer screen and one turns to the other and says, "On the internet they don't know we're dogs." One of the things that really bothers me about this is that we just don't know who we're talking to. And that goes to the very premise of the practice of sampling: you should know who you're talking to. In any event I will take a wait-and-see - I want to see more evidence before I come to some conclusion about it, other than my true discomfort with completion rates that low and not knowing firmly or clearly who you're dealing with.
With the addition of House race data to Pollster.com, it is a good time to talk about the difficulty of measuring the status of the race to control Congress at the district level. Political polling is always subject to a lot of variation and error (and not all of it the random kind), but Congressional district polls have their own unique challenges.
First, we are tracking something different in terms of voter attitudes and preferences than in other races, particularly contests for President. Two years ago, voters received information about George Bush and John Kerry from nearly every media source for most of the year. Huge numbers of voters tuned in to watch live coverage of nationally televised candidate debates. In races for the Senate and House, news coverage is far less prevalent and voters pay considerably less attention until the very end of the campaign. Even then, voters still get much of their information about House candidates from paid television and direct mail advertising.
Of course, in the top 25 or 30 House races, the candidates (and political parties) have already been airing television advertising. However, if you expand the list to the next 30-40 races that could be in play, the flow of information to voters drops off considerably. Middle-tier campaigns in districts in expensive media markets (like New York or Chicago) will depend on direct mail rather than television to reach voters.
So generally speaking, voter preferences in down ballot races are more tentative and uncertain. The (Democratic affiliated) Democracy Corps survey of Republican swing districts released last week reported 26% of likely voters saying there is at least a "small chance" they may still change their minds about their choice for Congress. When they asked the same question about the presidential race in mid-October 2004, only 14% said they saw a "small chance" or better of changing their mind about voting for Kerry or Bush.
This greater uncertainty means that minor differences in methodology can have a big impact on the results. Specifically, pollsters may vary widely in terms of the size of the undecided they report depending on how hard they push uncertain voters.
Second, the mechanics of House race polling can be very different from statewide methodology. The biggest challenge involves how to limit the sample to voters within a particular House district. In statewide races the selection is easy. Since area code boundaries do not cross state lines, it is easy to sample within individual states. So most of the statewide polls we have been tracking use a random digit dial (RDD) methodology that can theoretically reach every voter with a working land line telephone.
No such luck with Congressional districts, whose gerrymandered borders frequently divide counties, cities, even small towns and suburbs. Since very few voters know their district numbers, pollsters use a variety of strategies to sample House districts. Most of the partisan pollsters, as well as the Majority Watch tracking project, use samples drawn from lists of registered voters (sometimes referred to as "registration based sampling" or RBS). These lists make it easy to select voters within a given district, but the lists frequently omit telephone numbers for large numbers of voters (typically 30% to 50%**). Remember the real fear that RDD surveys are missing cell-phone-only households? Right now the missing cell phone households represent roughly 6-8% of all voters. Lists, obviously, miss many more. If the uncovered households differ systematically from those with working numbers on the lists, a bias will result.
Again, most partisan pollsters (including my firm) are comfortable sampling from lists, because the benefits of sampling actual voters within each District appear to outweigh the risks of coverage bias (see the research posted by list vendor Voter Contact Services for a sampling of arguments in favor of RBS). Media pollsters are generally more wary. SurveyUSA, for example, conducted a parallel test of RDD and RBS in a 2005 experiment that found a large and consistent bias in RBS sampling favoring one candidate. "SurveyUSA rejects RBS as a substitute for RDD," their report read, "because of the potential for an unpredictable coverage bias." So in House polls they often use RDD and screen for voters in the given district based on voters' ability to select their incumbent member of Congress from a list of all members of Congress from their area.
These various challenges have made many media outlets and public pollsters wary of surveys in House races. As of two weeks ago, we had logged more than 1,000 statewide polls for Senate or Governor into our Pollster.com database for 2006. As of yesterday, we had tracked only 173 polls conducted in the most competitive House races, but as the table below shows, only 47 of those came from independent media pollsters using conventional telephone methods.
Nearly half of all the House race polls come from two automated pollsters: SurveyUSA (23) and especially the Majority Watch project of RT-Strategies and Constituent Dynamics (56). Also, more than a quarter of the total (52) are partisan surveys conducted by the campaigns, the party committees or their allies, with far more coming from Democrats (44) than Republicans (8).
The sample sizes for House race surveys are also typically smaller. While national surveys typically involve 800 to 1000 likely voters, and statewide surveys 500 to 600, many of the House polls involve only 400 to 500 interviews (although the Majority Watch surveys have been interviewing at least 1000 voters in each district).
Finally, very few districts have been surveyed by public pollsters more than a few times since Labor Day. Only two of the 25 seats now held by Republicans rated as "toss-ups" by the Cook Political Report have been polled 5 or more times. Most of these critical seats have been polled 2 to 4 times. Put this all together, and the results are likely to be more varied and more subject to all sorts of error than other kinds of political polls. After the 2004 election, SurveyUSA put together a collection of results for every pre-election public opinion poll released in the U.S. from October 1 to November 2, 2004. Their spreadsheets included 64 House race surveys, and their calculations of the error of each survey indicate that those few House races had more than double the error on the margin (5.82) than the polls conducted in the presidential race (3.43).
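For readers unfamiliar with the statistic, "error on the margin" is conventionally the absolute difference between a poll's projected margin and the actual election margin, averaged across polls. A minimal sketch with invented numbers (not SurveyUSA's actual data):

```python
def error_on_margin(poll, result):
    """poll and result are (candidate_a_pct, candidate_b_pct) pairs."""
    return abs((poll[0] - poll[1]) - (result[0] - result[1]))

# Hypothetical final polls in one race, scored against a hypothetical
# 52-48 outcome -- purely illustrative values.
polls = [(50, 45), (47, 49), (55, 41)]
result = (52, 48)

errors = [error_on_margin(p, result) for p in polls]
print("errors on the margin:", errors)
print("average:", round(sum(errors) / len(errors), 2))
```

Here the three hypothetical polls miss the margin by 1, 6 and 10 points, for an average error on the margin of about 5.7 -- the same kind of figure behind the 5.82 versus 3.43 comparison above.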
All of which goes to say that while we too will be watching the House polls more closely over the next three weeks, for all the tables and numbers, we know far less about these races than meets the eye. More on what we do know tomorrow.
**Correction: Colleagues have emailed to point out that quoted match rates for list samples have improved in recent years and now typically range from 60% to 80%. I won't quarrel, although I have had past experiences where the quoted rate exaggerated the actual match once non-working numbers were purged from the sample.
We have devoted much attention recently to the flood of new national surveys showing small declines in the Bush job approval rating and modest Democratic gains on the generic House ballot question since mid-September. Until today, I had not looked closely at levels of party identification reported on those surveys. It turns out those have also trended Democratic recently, a finding that may explain some of the apparent "house effect" differences among statewide pollsters over the last few days.
The debate over weighting surveys by party identification has been a focus of this blog since its inception. My posts on the subject from 2004 and beyond are worth reviewing but the gist is this: Pollsters typically ask respondents some variant of a question asking whether they consider themselves "a Democrat, a Republican or an independent?" The so called "Party ID" question has been asked, examined and studied for more than 50 years, and an ongoing debate exists about whether to weight (or statistically adjust) survey results by party.
The crux of the debate is whether party identification is more like a fixed demographic characteristic (such as gender or race) or more like an attitude that can change with the prevailing political winds. For most adults, party identification does appear to be highly stable, changing rarely if ever. The problem is that some small portion of voters (perhaps 10% or 15%) appear willing to jump back and forth -- usually between one of the parties and the independent category -- depending on the wording of the question, its position in the survey, how hard the interviewer pushes for an answer, or, in some cases, what has been happening in the news.
Those who argue for weighting by party say that the real trends tend to be slow and gradual and that party weights can adjust dynamically over time to accommodate these slow moving trends (see also the party weighting page maintained by Prof. Alan Reifman). Those who argue against party weighting (a class that includes most of the national media pollsters) worry that such an approach will suppress real but short-term changes that sometimes occur in reaction to news events (such as the period just after the 9/11 attacks or the period just after the 2004 Republican convention).
A look at the party identification data from the recent surveys suggests we may be in the midst of another such short term change. The table that follows shows party identification results for six national surveys conducted before and after the resignation of Congressman Mark Foley. Five of six show some Democratic gain in party identification:
This change may also explain the wide divergence in results reported by the two automated pollsters in two nearly simultaneous surveys conducted this week in Missouri and Ohio. In both states, SurveyUSA showed the Democratic candidates with significantly greater leads (+14 in Ohio and +8 in Missouri) than Rasmussen (+6 in Ohio and -1 in Missouri). While both pollsters use the automated "interactive voice response" (IVR) methodology, Rasmussen weights by party and SurveyUSA does not. Moreover, the most recent SurveyUSA samples have grown more Democratic since August.
Does this shift in party identification represent a real shift in attitudes among the population of adults or registered voters, or does it reflect some short-term enthusiasm among Democrats to be interviewed? Is the change a momentary spike or will it persist until Election Day? These are the questions that professional pollsters are mulling over right now, and the answers are not obvious. We will just have to wait and see (no pun intended).
Our Slate Election Scorecard update tonight focuses on two new polls in New Jersey that confirm recent gains by Bob Menendez and move the race to lean Democratic status. The overall scorecard tally now indicates 49 seats leaning or held by Democrats, 49 seats held or leaning Republican. Is this change indicative of a larger Democratic surge?
Two new polls out this evening from SurveyUSA in Ohio and Missouri both show the Democratic candidates in each state leading by much wider margins than on other recent polls. These results and the sometimes improbably wide Democratic margins on the generic House ballot in some recent national surveys leave some wondering whether, as reader Gary Kilbride put it in a comment a few hours ago, "the current poll numbers skew misleadingly toward Democrats due to the Foley scandal." He wonders if the same might be happening to the Majority Watch congressional district results released today.
I will have more to say about all of this tomorrow, but for tonight one quick note about those new Majority Watch congressional surveys. Although they released results from 32 districts today, only nine involved follow-up surveys in districts polled previously using comparable ballot tests. The table below shows the August and October results for those nine districts.
All of these districts are currently represented by Republicans, and all were rated as toss-ups by the Cook Political Report when the polls went into the field (they moved CO-07 to lean Democratic status just yesterday). While Tom Riehle's analysis made much of the apparent Republican improvement in Washington-08, Virginia-02 and Indiana-02, the overall pattern looks more random. Those Republican advances were largely offset by Democratic gains in North Carolina-11 and New Mexico-01. Overall, the average Democratic margin declined by just a single percentage point.
The bigger story may be that the average Republican percentage across these nine districts has not budged from 46% since August or that none of the Republicans in the nine districts holds a statistically significant lead. More on the meaning of these House polls tomorrow.
This "Guest Pollster Corner" contribution comes from Thomas Riehle, a
Partner of RT Strategies
Editor's note: In a 2:30 p.m. press conference, Riehle announced that when
the sum up results of the 63 surveys they have conducted since August and
consider races where a candidate hold a lead beyond the margin of error,
Democrats currently lead or safely hold 217 seats and while Republicans lead
in or hold 198 seats. Democrats will need to win 218 seats to gain majority
control. Full data are now available at www.majoritywatch.com, including a
summary of all top-line results to date. The Pollster.com House Race page
is updated to include all of the new Majority Watch data.
Majority Watch, a project of RT Strategies and Constituent Dynamics, sponsored by Waggener Edstrom Worldwide, is the most comprehensive project ever undertaken to identify and conduct polls in most of the highly contested House races across the country. In August and September, Majority Watch polled in 30 House districts. On October 1, we polled Mark Foley's Florida 16th district with two simultaneous polls, one in which respondents were informed that a vote for Foley would count for the Republican candidate to be named later, and one in which respondents did not get that information.
Today, Majority Watch begins to release results from Round II. In 32 races polled in the current cycle:
Republican incumbents who seemed to be in trouble in late August have held on or even improved their positions. In Washington's 8th C.D., Republican Rep. Dave Reichert has moved from 3 points behind to 3 points ahead of Democrat Darcy Burner, 48%-45%. In Virginia's 2nd C.D., Republican Rep. Thelma Drake has moved from 8 points behind to a marginal 2-point lead over Democrat Phil Kellam, 48%-46%. In Indiana's 2nd C.D., Republican Rep. Chris Chocola has moved from 12 points behind to only 4 points behind Democrat Joe Donnelly, 46% for Chocola to 50% for Donnelly. In Colorado's 7th C.D., Majority Watch polling shows Republican Rick O'Donnell is tied with Democrat Ed Perlmutter in the race to fill the open seat, essentially unchanged since August. All of these were among the first races Democrats targeted, and that early warning may have given Republicans the heads-up they needed to remain competitive and avoid getting swept away.
Most Republican leaders have survived the worst of Foley's Folly, but in localized areas where there was a local media hook for the story (Florida, New York, possibly Arizona), damage may have been severe for many Republicans, at least at this time -- there's still time to recover.
On the positive side for Republicans, neither Speaker Denny Hastert (ahead by 10 points, 52%-42%) nor House Page Board chairman Rep. John Shimkus (ahead by 17 points, 53%-36%) seem to have suffered. The highest profile Republican House incumbent closest to Washington, D.C., Rep. Frank Wolf in Virginia's 10th C.D., remains ahead of Democrat Judy Feder, 47%-42%.
On the other hand, in Ohio, where the Republicans were already beset by the culture of corruption charge, Republican Conference Chairperson Deborah Pryce is behind by double digits, in Ohio's 18th District Republican Joy Padgett trails Democrat Zack Space by 9 points, and even in Ohio's 2nd C.D., Rep. Jean Schmidt is marginally behind Democrat Victoria Wulsin by 3 points, 45%-48%.
In New York, NRCC Chairperson Tom Reynolds has stumbled badly, trailing Democrat Jack Davis by 16 points, 56% for Davis to 40% for Reynolds. In the open seat in New York's 24th C.D., Democrat Michael Arcuri has opened a significant lead, 53%-42% over Republican Raymond Meier. Even Republican Rep. Peter King, never shy about pointing out when the leadership is wrong and vocal in his anger at how House leaders have handled the Foley case, seems to have suffered -- he is only marginally ahead, 48%-46% over Democrat Dave Mejias.
In a surprise, Arizona Republican Rep. Rick Renzi is marginally behind Democrat Ellen Simon, 50%-46%.
The Philadelphia suburbs remain troublesome for Republicans, with Republican Reps. Jim Gerlach and Curt Weldon trailing their Democratic challengers.
Majority Watch takes advantage of new technologies, married to the oldest standards of sampling and vote modeling, to extend the practice of public opinion polling down to the level of House races. Calls are made by IVR recordings ("robo-calling"). The sample is drawn from voter lists of active voters, with Majority Watch controlling in-home selection in those households where more than one voter resides. The calls are kept extremely short in order to keep response rates as high as those for many publicly-released telephone interviewer polls (about 20% response rate using the standard AAPOR definition). And consumers are increasingly comfortable pushing buttons to respond to recorded voices -- can any reader say he or she is unfamiliar with the notion of "press 1" for one thing or "press 9" for another? These "robo-calls" perform not much differently than traditional telephone interviewer calls for very short, "horse-race" polls.
Majority Watch is currently polling in ten more House districts for release next week (GA-08, IL-08, IL-10, NH-01, NH-02, NY-19, NY-20, NY-25, NY-29, and OH-01), at which time we will have solid polls, with about 1,000 voters, in each of 55 House races. Depending on developing political circumstances, we may further expand the list and conduct more polls after next week.
Based on what you know right now, do you think Speaker of the House Dennis Hastert should remain in his position as Speaker of the House? Do you think he should resign as Speaker of the House but remain a member of Congress? Or do you think he should completely resign from Congress?
27% Remain Speaker
20% Resign leadership
43% Resign from Congress
10% Not sure
MP readers may want to note that the results above, from a one-night sampling of 1,000 adults conducted Thursday night, were actually the second of a two-night tracking poll. From the SurveyUSA release:
Though Thursday night's polling data is not good news for Hastert, the data is an improvement from SurveyUSA interviews conducted 24 hours prior, on Wednesday night. Then, 49% of Americans said Hastert should resign from Congress, 17% said he should remain as Speaker, and 23% said he should resign his Leadership post but remain a member of Congress. Though the day-to-day movement is small, and some of it is within the survey's 3.2% margin of sampling error, the movement is consistent across the board and therefore worthy of comment.
There are inherent limitations to surveys with short field periods; however, when a news story is changing hour-by-hour, nightly tracking studies can provide a valuable "freeze-frame" snapshot of what Americans were thinking at a moment in time.
As part of their "interactive crosstabs" for this poll, SurveyUSA provides a time series chart that allows users to plot trends for each of the key subgroups (via the pull-down menu that appears in the upper left corner of the data table).
Now I have no idea whether SurveyUSA intends to continue tracking this question going forward. They are obviously a lot busier now than when they tracked the response to Hurricane Katrina for 24 days in September 2005. But if these results intrigue you, it's probably worth checking the SurveyUSA Breaking News page for further updates.
Update: Rasmussen Reports, the other big automated pollster, also conducted a survey on whether Hastert should resign. Their results offer a lesson on the challenges of writing this sort of question:
Should Dennis Hastert Resign from His Position as House Speaker?
36% Yes
27% No
37% Not sure
Do you have a favorable or unfavorable opinion of Dennis Hastert?
10% Very favorable
14% Somewhat favorable
19% Somewhat unfavorable
16% Very unfavorable
41% Not sure
The number who say Hastert should resign as speaker is much higher on the Rasmussen survey (36%) than on the SurveyUSA poll (20%), but SurveyUSA reports a much higher number (63%) who say Hastert should resign either as speaker or from Congress. Offering three choices rather than two appears to make a big difference. And the fact that 41% on the Rasmussen survey say they do not know Hastert well enough to rate him helps explain why the question format and language make such a big difference.
The timing also differed: The Rasmussen survey was conducted over the last two nights (Thursday and Friday) while SurveyUSA tracked on Wednesday and Thursday nights.
Our Slate Senate Scorecard update for tonight focuses on a new Rasmussen poll in Connecticut that shows Joe Lieberman leading Democratic nominee Ned Lamont by ten points (50% to 40%).
Tracking the Connecticut Senate race is especially challenging because the most active pollsters in the state have shown consistent differences in their results -- at least until today. See the chart below (courtesy Charles Franklin), which shows Lieberman's margin over Lamont (Lieberman's percentage minus Lamont's percentage):
Both the Rasmussen automated surveys and the conventional, live interviewer phone polls conducted by Quinnipiac University showed Lieberman's margins narrowing since July but holding fairly steady over the last month. However, until the survey released today, the Rasmussen surveys have consistently shown a closer margin than the Quinnipiac Polls. This pattern is similar to the one we described yesterday in Tennessee, where Democrat Harold Ford is running stronger on the Rasmussen surveys than on conventional telephone interview polls conducted by Mason Dixon.
In this case it is harder to use the survey mode (live interviewer vs. automation) to explain the differences, because the house effects are inconsistent by mode. Another live interview pollster (American Research Group) has also shown a consistently closer race, while automated pollster SurveyUSA reported Lieberman ahead by 13 points in early September.
Today's result, however, brings the Rasmussen and Quinnipiac polls into agreement, at least for the moment. The last Quinnipiac poll, released last week, also showed Lieberman leading by 10 points. So is the latest turn in the Rasmussen trend line the sign of new Lieberman momentum, a convergence in the poll results or just an outlier? Only time, and more surveys, will tell for sure.
Hmm. Just yesterday we had one with Ford up by 5; not long before that there was one with Corker up by 5. Is it just me, or is this more variation than we usually see? Are voter sentiments that volatile (or superficial)? Or is there something about this race that makes minor differences in polling methodology more important? Or is this normal?
At the moment at least, I agree with the answer he received later from Michael Barone that the poll numbers in Tennessee do not appear unusually volatile. Barone pointed out that the results of nearly all the Tennessee polls this year appear to fall within sampling error of the grand average. That point is worth expanding on, but it is also worth noting that the averages conceal some important differences among the various Tennessee surveys.
First, let's talk about random sampling error. If we assume that all of the polls in Tennessee used the same mode of interview (they did not), that they were based on random samples of potential voters (the Internet polls were not), that they had very high response and coverage rates (none did), that they defined likely voters in exactly the same way (hardly), that they all asked the vote question in an identical way (close, but not quite) and that the preferences of voters have not changed over the course of the campaign (no again), then the results for the various polls should vary randomly like a bell curve.
Do the appropriate math, and if we assume that all had a sample size of roughly 500-650 voters (most did), then we would expect these hypothetically random samples to produce results that fall within +/- 4% of the "true" result 95% of the time. Five percent (or one in twenty) should fall outside that range by chance alone. That is the standard "margin of error" that most polls report (which captures only the variation due to random sampling). But remembering the bell curve, most of the polls should cluster near the average: roughly two-thirds (68%) of those samples should fall within +/- 2% of the "true" value.
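The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope illustration of the standard formula for the margin of error of a proportion, not any pollster's actual code; the sample sizes are the rough figures mentioned in the text.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a confidence interval for a proportion p at sample size n.

    z=1.96 gives the conventional 95% interval; z=1.0 gives one
    standard error (about 68% coverage under the bell curve).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 95% margin of error for typical statewide samples of 500-650 voters:
for n in (500, 650):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} points")

# One standard error for a mid-range sample -- roughly two-thirds of
# random samples should land this close to the true value:
print(f"n=600 (68%): +/- {100 * margin_of_error(600, z=1.0):.1f} points")
```

Running this reproduces the figures in the text: roughly +/- 4 points at 95% confidence, and roughly +/- 2 points for the two-thirds (one standard error) band.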
Now, let's look at all of the polls reported in Tennessee in the last month, including the non-random sample Zogby Internet polls:
As it happens, the average of these seven polls works out to a dead-even 44% tie, which helps simplify the math. In this example, only 1 of the 14 (7%) results falls outside the range of 40% to 48% (that is, 44% +/- 4%), and only 3 of 14 (21%) fall outside the range of 42% to 46% (or 44% +/- 2%). So as Michael Barone noted, the variation is mostly what we would expect from random sampling error alone. Considering all the departures from random sampling implied above, that level of consistency is quite surprising.
These results may seem more varied than in previous years partly because the sample sizes are considerably smaller than the national samples of (typically) 800 to 1,000 likely voters that we obsessed over during the 2004 presidential race.
The confluence of the averages over the last month (or even over the course of the entire campaign, as Barone noted) glosses over both important differences among the pollsters and some real trends that the Tennessee polls have revealed. Charles Franklin helped me prepare the following chart, which shows how the various polls tracked the Ford margin (that is, Ford's percentage minus Corker's percentage). The chart draws a line to connect the dots for each pollster that has conducted more than one survey. The light blue dots are for pollsters that have done just one Tennessee survey to date.
The chart shows a fairly consistent pattern in the trends reported by the various telephone polls, both those done using traditional methods (particularly Mason-Dixon) and the automated pollster (Rasmussen). Franklin plotted a "local trend" line (in grey) that estimates the combined trend picked up by the telephone polls (both traditional and automated). The line "fits" the points well: It indicates that Ford fell slightly behind over the summer, but surged from August to September (as he began airing television advertising).
As Barone noticed, the five automated surveys conducted since July (including one by SurveyUSA) have been slightly and consistently more favorable to Ford than the three conventional surveys (two by Mason-Dixon and one by Middle Tennessee State University). But the differences are not large.
The one partisan pollster - the Democratic firm Benenson Strategy Group - released two surveys that showed the same trend but were a few points more favorable to Democrat Ford than the public polls. This partisan house effect among pollsters of both parties for surveys released into the public domain is not uncommon.
But now consider the green line, the one representing the non-random sample surveys of Zogby Interactive. It tells a completely different story: The first three surveys were far more favorable to Democrat Ford during the summer than the other polls, and Zogby has shown Ford falling behind over the last two months while the other pollsters have shown Ford's margins rising sharply.
This picture has two big lessons. The first is that for all their "random error" and other deviations from random sampling, telephone polls continue to provide a decent and reasonably consistent measure of trends over the course of the campaign. The second is that in Tennessee, as in other states we have examined so far, the Zogby Internet surveys are just not like the others.
UPDATE: Mickey Kaus picks up on Barone's observation that the automated polls have been a bit more favorable to the Democrats in Tennessee and speculates about a potentially hidden Democratic vote:
Maybe a new and different kind of PC error is at work--call it Red State Solidarity Error. Voters in Tennessee don't want to admit in front of their conservative, patriotic fellow citizens that they've lost confidence in Bush and the GOPs in the middle of a war on terror and that they're going to vote for the black Democrat. They're embarrassed to tell it to a human pollster. But talking to a robot--or voting by secret ballot--is a different story. A machine isn't going to call them "weak."
Reynolds updates his original post with a link to Kaus and asks whether the same pattern exists elsewhere.
Another good question, although for now our answer is incomplete. We did a similar "pollster compare" graphic on the Virginia Senate race over the weekend. The pattern of automated surveys showing a slightly more favorable result for the Democrats was similar from July to early September, but the pattern has disappeared over the last few weeks as the surveys have converged. In Virginia, the most recent Mason-Dixon survey has been the most favorable to Democrat Jim Webb.
While we will definitely take a closer look at this question in other states in the coming days and weeks, it is worth remembering that most of the "conventional surveys" in Tennessee and Virginia were done by one firm (Mason-Dixon), while most of the automated surveys to date in Tennessee have been done by Rasmussen. As such, the differences may result from differences in methodology other than the mode of interview among these firms (such as how they sample and select likely voters or whether they weight by party as Rasmussen does).
If one story is more important than all others this year--to those of us who obsess over political polls--it is the proliferation of surveys using non-traditional methodologies, such as surveys conducted over the Internet and automated polls that use a recorded voice rather than a live interviewer. Today's release of the latest round of Zogby Internet polls will no doubt raise these questions yet again. Yet for all the questions being asked about their reliability, discussions using hard evidence are rare to non-existent. Over the next month, we are hoping to change that here on Pollster.com.
Just yesterday in his "Out There" column (subscription only), Roll Call's Louis Jacobson wrote a lengthy examination of the rapid rise of these new polling techniques and their impact on political campaigns. Without "taking sides" in the "heated debate" over their merits, Jacobson provides an impressive array of examples to document this thesis:
[I]t's hard to ignore the developing consensus among political professionals, especially outside the Beltway, that nontraditional polls have gone mainstream this year like never before. In recent months, newspapers and local broadcast outlets have been running poll results by these firms like crazy, typically without defining what makes their methodology different - something that sticks in the craw of traditionalists. And in some cases, these new-generation polls have begun to influence how campaigns are waged.
He's not kidding. Of the 1,031 poll results logged into the Pollster.com database so far in the 2006 cycle from statewide races for Senate and Governor, more than half (55%) have been done by automated pollsters Rasmussen Reports, SurveyUSA or over the Internet by Zogby International. And that does not count the surveys conducted once a month by SurveyUSA in all 50 states (450 so far this year alone). Nor does it count the automated surveys recently conducted in 30 congressional districts by Constituent Dynamics and RT Strategies.
Jacobson is also right to highlight the way these new polls "have made an especially big splash in smaller-population states and media markets, where traditional polls - which are more expensive - are considered uneconomical." He provides specific examples from states like Alaska, Kansas and Nevada. Here is another: Our latest update of the Slate Election Scorecard (which includes the automated polls but not those conducted over the Internet) focuses on the Washington Senate race, where the last five polls released as of yesterday's deadline had all been conducted by Rasmussen and SurveyUSA.
Yet the striking theme in coverage of this emerging trend is the way both technologies are lumped together and dismissed as unreliable and untrustworthy by establishment insiders in both politics and survey research.
Jacobson's piece quotes a "political journalist in Sacramento, Calif," who calls these new surveys "wholly unreliable" (though he does include quotes from a handful of campaign strategists who find the new polls "helpful, within limits").
Consider also the Capital Comment feature in this month's Washingtonian, which summarizes the wisdom of "some of the city's best political minds" (unnamed) on the reliability of these new polls. Singled out for scorn were the Zogby Internet polls - "no hard evidence that the method is valid enough to be interesting" - and the automated pollsters, particularly Rasmussen:
[Rasmussen's] demographic weighting procedure is curious, and we're still not sure how he prevents the young, the confused, or the elderly from taking a survey randomly designated for someone else. Most distressing to virtually every honest person in politics: His polls are covered by the media and touted by campaigns that know better.
The Washingtonian feature was kinder to the other major automated pollster:
SurveyUSA's poll seems to be on the leading edge of autodial innovation. Its numbers generally comport with other surveys and, most important, with actual votes.
[The Washingtonian piece also had praise for the work of traditional pollsters Mason-Dixon and Selzer and Co, and complaints about the Quinnipiac College polls]
Or consider the New York Times' new "Polling Standards," noted earlier this month in a Public Editor column by Jack Rosenthal (and discussed by MP here), and now available online. The Times says both methodologies fall short of their standards. While I share their caution regarding opt-in Internet panels, their treatment of Interactive Voice Response -- the more formal name for automated telephone polls -- is amazingly brusque:
Interactive voice response (IVR) polls (also known as "robo-polls") employ an automated, recorded voice to call respondents who are asked to answer questions by punching telephone keys. Anyone who can answer the phone and hit the buttons can be counted in the survey - regardless of age. Results of this type of poll are not reliable.
Skepticism about IVR polling based on theoretical concerns is certainly widespread in the survey research establishment, but one can look long and hard for hard evidence of the unreliability of IVR, or even Internet polling, without success. Precious little exists, and the few reviews available (such as the work of my friend, Prof. Joel Bloom, or the 2004 Slate review by David Kenner and William Saletan) indicate that the numbers produced by the IVR pollsters comport as well with actual election results as, or better than, those from their traditional competitors.
The issues involving these new technologies are obviously critical to those who follow political polling and require far more discussion than is possible in one blog post. So over the next six weeks, we are making it our goal here at Pollster to focus on the following questions: How reliable are these new technologies? How have their results compared to election results in recent elections? How do the current results differ from the more traditional methodologies?
On Pollster, we are deliberately collecting and reporting polls of every methodology -- traditional, IVR and Internet -- for the express purpose of helping poll consumers make better sense of them. We certainly plan to devote a big chunk of our blog commentary to these new technologies between now and Election Day. And while the tools are not yet in place, we are also hoping to give readers the ability to do their own comparisons through our charts.
More to say on all the above soon, but in the meantime, readers may want to review my article published late last year in Public Opinion Quarterly (html or pdf), which looked at the theoretical issues raised by the new methods.
Interests disclosed: The primary sponsor of Pollster.com is the research firm Polimetrix, Inc. which conducts online panel surveys.
Our daily Slate Scorecard update posted earlier this evening focuses on the new poll from Mason-Dixon that shows a narrowing race in the Virginia Senate pitting incumbent Republican George Allen against Democratic challenger Jim Webb.
We also discuss why the Slate Scorecard does not include the online Zogby Interactive/Wall Street Journal polls and made the following observation:
The latest Zogby results for Virginia-showing Webb ahead 50 percent to 43 percent-help explain our caution. Zogby's Virginia samples have been consistently more favorable to Webb than other pollsters, suggesting a bias in Zogby's online methodology.
With the help of Charles Franklin, here is a chart showing the consistent difference in Virginia. It plots the Allen margin (that is, Allen's percentage of the vote minus Webb's percentage -- click on the graph for a full size image) for each of the four pollsters that have tracked the race. All four show the same sharp drop in Allen's lead since July, but the Zogby result (the green line) has been consistently more favorable to Webb than the three telephone pollsters that use random probability samples of all telephone households rather than samples of Internet volunteers.
The differences are not trivial. The latest polls from Survey-USA, Mason-Dixon and Rasmussen have shown Allen with leads of 3, 4 and 5 percentage points respectively. Zogby's result is very different, showing Webb with a seven percentage point lead.
Incidentally, we do include all of the Zogby Interactive results and other Internet polls in the charts and tables here on Pollster.com. Our aim is to give readers the ability to compare results across pollsters. One minor wrinkle -- at least for now -- is that the 5-poll averages reported here may differ from those on Slate because the averages here include the Zogby numbers while those reported on Slate do not. We are hoping to address that conflict in a future update to our chart pages.
The big news yesterday for true political junkies was the release of separate polls conducted simultaneously in 27 of the most competitive districts nationwide (with surveys in three more districts ongoing) using an automated recorded voice rather than live interviewers. The surveys were conducted for a project dubbed "Majority Watch" by the team of RT-Strategies, a DC based firm that polls for the Cook Political Report, and Constituent Dynamics, a company that specializes in the new automated methodology. While the slick Majority Watch website provides full crosstabs if you click far enough, many readers have asked, are these surveys legitimate? Are they reliable? The best answer, to paraphrase the Magic Eight Ball is, "reply hazy, ask again later."
The formal name for the automated methodology is Interactive Voice Response (IVR). Two companies - SurveyUSA and Rasmussen Reports - have conducted IVR surveys for years. While those companies do many things differently, both typically sample using a random digit dial (RDD) methodology that has the potential to reach every working land-line phone in a particular state. Unlike traditional surveys, the IVR polls use a recorded voice, rather than a live interviewer, and respondents must answer by pressing the keys on their touch-tone telephone. With IVR, the pollster's ability to randomly select a member of each sampled household is also far more limited.
The Majority Watch surveys add a few new twists. My friend Tom Riehle of RT Strategies kindly provided some additional details not included on the Majority Watch methodology page:
1) Majority Watch drew its samples from lists of registered voters rather than through random digit dial sampling. The advantage of this approach is that it solves the problem of how to limit the survey to those living in the correct district (a big challenge with RDD sampling). It also excludes non-registrants and allows the use of individual level vote history to determine who is a "likely voter."
The downside to voter list sampling - sometimes called Registration Based Sampling (RBS) - is that it only covers voters who have either provided their phone number to the registrar of voters or whose numbers are listed in public phone directories. "Match rates" (the percentage of voters on the list with working phone numbers) vary widely from state to state and district to district, but rarely exceed 60%. If the uncovered 40% (or more) differ in their politics, a bias can result.
Pollsters continue to debate the merits of RDD and RBS sampling, and that debate deserves more attention than I will give it today. The short story is that most media pollsters continue to use RDD sampling, especially for national polls. Internal campaign pollsters have been making far greater use of list sampling, especially at the Congressional District level where they use RBS almost exclusively.
2) Majority Watch used individual vote history to select the "likely voters." The lists provided in most states by the registrar of voters typically report vote history. If you voted in the 2004 presidential election, but not in that school board election in 2005, the list will say so. It is a matter of public record. Majority Watch used an approach common to campaign pollsters: They sampled only those who cast votes in at least two of the last four general elections in their precinct (which included "off-year contests" in 2005 and 2003). What this means, in effect, is that most of their respondents voted in both the 2004 presidential election and at least one other general election.
Majority Watch based their "likely voter" model entirely on vote history from the list, and did not ask "screen" questions to select their sample.
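To make that vote-history screen concrete, here is a rough sketch in Python. The field names, the sample records and the encoding of the "two of the last four general elections" rule are my own illustration, not the pollsters' actual code:

```python
# Hypothetical voter-file records: True means the list shows a vote cast
# in that general election. Real voter files vary by state.
GENERAL_ELECTIONS = ["g2002", "g2003", "g2004", "g2005"]  # last four generals

def is_likely_voter(record):
    """Keep a voter who cast ballots in at least two of the last four generals."""
    return sum(1 for e in GENERAL_ELECTIONS if record.get(e)) >= 2

voter_file = [
    {"id": 1, "g2002": True, "g2003": False, "g2004": True, "g2005": False},
    {"id": 2, "g2002": False, "g2003": False, "g2004": True, "g2005": False},
]

# Voter 1 (two generals) makes the frame; voter 2 (one general) does not.
sample_frame = [v for v in voter_file if is_likely_voter(v)]
```

The point of the sketch is simply that the "screen" happens before anyone is dialed, using public records rather than a respondent's answers.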
3) The Majority Watch pollsters used an interesting approach to selecting a random voter in each household and matching the interviewed respondent to the actual voter on the list. They randomly selected one voter to be interviewed within each household, but then used the automated method to interview whoever answered the phone. The interview included questions asking respondents to report their gender and age. After each interview, a computer algorithm checked to see if the reported gender and age matched the data for that individual on the voter file. If the gender and age data did not match, they threw out the interview and did not include it in their tabulations.
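The post-interview check described above can be sketched as a simple comparison against the voter file. This is my own reading of the procedure; the function name, the record layout, and the one-year age tolerance are all assumptions for illustration:

```python
def keep_interview(reported, file_record, age_tolerance=1):
    """Keep an interview only if the respondent's self-reported gender and age
    match the voter-file record for the number dialed (tolerance is assumed)."""
    if reported["gender"] != file_record["gender"]:
        return False
    return abs(reported["age"] - file_record["age"]) <= age_tolerance

# A matching interview is kept; a gender mismatch is discarded.
ok = keep_interview({"gender": "F", "age": 52}, {"gender": "F", "age": 53})
bad = keep_interview({"gender": "M", "age": 52}, {"gender": "F", "age": 52})
```

Whatever the exact matching rule, the effect is the same: interviews with the "wrong" household member never reach the tabulations.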
4) According to the methodology page, the Majority Watch pollsters then weighted their data "to represent the likely electorate by demographic factors such as age, sex, race and geographic location in the CD." But how did they determine the demographics of the likely electorate in each district? The answer is surprisingly complicated.
They obtained data on each district reported by the U.S. Census as part of the Current Population Survey (CPS). Keep in mind, as noted in a post last week, the CPS is also a survey (albeit with a huge sample size and very high response rate), subject to some of the same over-reporting of voting behavior as other surveys.
The Census publishes data for gender and race by Congressional District, but not for age. So the Majority Watch pollsters created their own estimate of the age distribution by applying state level CPS estimates of turnout by age cohort to the district level estimates of age for all adults. If that last sentence was confusing - and I know it was - don't worry. Just note that the estimate of age for "2002 Voters" provided in the lower left portion of each district page on the Majority Watch site is their estimate, as extrapolated from statewide CPS data and not an official Census estimate.
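For readers who found that sentence confusing, here is one way to read the extrapolation, sketched in Python. The cohorts and numbers are invented for illustration, and this is my interpretation of the approach, not the pollsters' actual calculation:

```python
# District-level share of ADULTS in each age cohort (from Census data).
district_adults = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

# STATE-level CPS turnout rate for each cohort.
state_turnout = {"18-34": 0.25, "35-54": 0.50, "55+": 0.65}

# Multiply each cohort's adult share by its statewide turnout rate,
# then renormalize to get an estimated age distribution of VOTERS.
voters = {c: district_adults[c] * state_turnout[c] for c in district_adults}
total = sum(voters.values())
voter_age_dist = {c: share / total for c, share in voters.items()}
```

Because older cohorts turn out at higher rates, the estimated voter distribution skews older than the district's adult population, which is exactly the shift the extrapolation is meant to capture.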
Also note something that has confused many readers who have looked at the Majority Watch web site. All of the demographic data that appears on their district level page is taken (or derived) from U.S. Census data. It is not based on data from their surveys!
Finally, those who have drilled down deep into the Majority Watch crosstabs will notice that the age distribution on the poll is older than their Census-based age estimate. That is because the Majority Watch pollsters also looked at the estimated age distribution obtained directly from the voter lists (based on the birthdays voters provide when they register to vote). This subject is definitely worthy of more discussion, but voter lists consistently show an older electorate than the CPS survey estimates. The Majority Watch pollsters set an age target that was a meld of their CPS estimate/extrapolation and the list estimates, and weighted the data to match. How accurate and reliable is this approach? I have no idea, and am quite sure other pollsters will see shortcomings.
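A minimal sketch of what weighting to a melded target might look like, again in Python. The 50/50 blend ratio, the cohorts, and all the numbers are my own assumptions; the pollsters have not published how they melded the two estimates:

```python
# Two competing age targets (illustrative numbers only).
cps_target = {"18-44": 0.45, "45+": 0.55}   # CPS-derived estimate/extrapolation
list_target = {"18-44": 0.35, "45+": 0.65}  # estimate from the voter list itself

# Meld the two targets (an even 50/50 blend, purely for illustration).
target = {k: 0.5 * cps_target[k] + 0.5 * list_target[k] for k in cps_target}

# Simple cell weighting: each respondent's weight is the target share of
# their cohort divided by its unweighted share in the sample.
sample = {"18-44": 0.30, "45+": 0.70}
weights = {k: target[k] / sample[k] for k in sample}
```

Under these made-up numbers, younger respondents would be weighted up and older respondents weighted down, nudging the sample toward the melded target.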
Here is the bottom line for those wondering how much faith to put in the Majority Watch data: Political polling gets considerably more difficult at the congressional district level. While the Majority Watch approach is innovative, it is also new and untested, and it includes many departures from standard survey practice. And, according to Tom Riehle, the design of the survey may evolve between now and Election Day (and yes, future tracking surveys are planned):
We are privately taking the methodology and results to some tough critics to find out what questions they ask that we may not have thought to ask, in order to keep moving the quality closer to the best quality that can be achieved in telephone interviewing. In that sense, this is a work in progress, because we have made our best effort to develop an excellent methodology, and will continually improve that methodology based on the informed and legitimate questions of methodological critics.
The Majority Watch surveys may turn out to yield reliable results, or they may not. We really will not know until we watch how they compare to other public polls and the ultimate election results. And here at Pollster.com, we are hoping to help you do just that.