Our most recent update to the Slate Election Scorecard focuses on a new poll in the New Jersey Senate race, but I would like to say a few words about Virginia. We updated the race on Thursday, but two new polls have been released there since. Taken together, the new surveys show George Allen maintaining a narrow lead over Democratic challenger Jim Webb despite a swarm of new allegations regarding his past use of racial slurs. Those allegations may be having an effect, however, in blunting the impact of Allen's massive spending advantage in September.
First, a brief timeline: SurveyUSA fielded its first Virginia poll this week from Sunday through Tuesday night and released data on Wednesday showing Allen leading among likely voters 49% to 44%, a slightly (but not significantly) larger lead than in their previous poll conducted two weeks earlier. However, a new story broke in Virginia while the poll was in the field, and the SurveyUSA release speculated about the "volatile" race they saw in "day-to-day data":
On Sunday 9/24, after Allen had been accused of using racial slurs in college, he led by 7 in SurveyUSA Sunday-only data. On Monday 9/25, after Allen strongly denied the accusations, he led by 11 in SurveyUSA Monday-only data. On Tuesday 9/26, after more people corroborated the accusations, Allen trailed Webb by 3 points, in Tuesday-only data. The 5-point Allen advantage shown here, when the 3 days of data are combined and averaged, cannot be considered stable [emphasis in original]
Presumably wanting to check a potential break to Webb, SurveyUSA continued to conduct interviews Wednesday and Thursday nights. They released new data yesterday based on a "rolling average" of surveys conducted Tuesday through Thursday. The result: Allen leads in the second wave by six points (50% to 44%). The four SurveyUSA polls released since mid-August indicate a small but consistent Allen lead (see their chart below). However, the slightly growing Allen lead cannot be considered statistically significant given the reported sampling error of four or more percentage points for each survey.
Several readers wondered why the Slate summary on Thursday made no mention of the day-to-day results mentioned in the fine print of the first SurveyUSA release. Space limitations in the Slate summary were part of the reason, but mostly it was because - as Gerry Daly on Crosstabs correctly noted - "volatility" is a built-in feature of small nightly samples. SurveyUSA interviewed roughly 200 to 210 likely voters each night this week, not enough to make the day-to-day shifts they reported statistically significant. The additional interviews conducted Wednesday and Thursday nights confirm that Allen's lead was undiminished by the end of the week.
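For readers who want to check the arithmetic, here is a minimal sketch of why nightly samples of about 200 produce such noisy leads. It rests on assumptions that are mine, not SurveyUSA's: simple random sampling with no weighting or design effect, a nightly sample of 205 (the midpoint of the 200 to 210 range), and the three-day 49% to 44% result standing in for the true shares on any given night. The function name is hypothetical.

```python
import math


def lead_margin_of_error(p_a, p_b, n, z=1.96):
    """Approximate 95% margin of error on the lead (p_a - p_b).

    Both shares come from the same sample, so the variance of the
    difference is (p_a + p_b - (p_a - p_b)**2) / n under simple
    random sampling (an assumption; real polls are weighted).
    """
    variance = (p_a + p_b - (p_a - p_b) ** 2) / n
    return z * math.sqrt(variance)


# Assumed inputs: 205 interviews per night, shares from the 3-day release.
nightly_n = 205
moe_lead = lead_margin_of_error(0.49, 0.44, nightly_n)

# A night-to-night swing compares two independent samples of similar
# size, so its margin of error is larger by a factor of sqrt(2).
moe_swing = math.sqrt(2) * moe_lead

print(f"Margin of error on one night's lead: +/-{moe_lead * 100:.1f} points")
print(f"Margin of error on a night-to-night swing: +/-{moe_swing * 100:.1f} points")
```

Under these assumptions the margin of error on a single night's lead is roughly 13 points, and on a night-to-night swing roughly 19 points, so even the 14-point move from Allen +11 on Monday to Webb +3 on Tuesday falls inside the noise.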
Let's give SurveyUSA credit for sharing their suspicions of a mid-week trend to Webb and for conducting additional interviews to help sort things out. But let's also remember a rule of thumb I have learned over the years: When the larger sample indicates consistency but smaller subgroups look unstable, the "volatility" is usually in the survey, not the voters.
But wait, there is one more survey in the mix, this one released late yesterday by Mason-Dixon. They show Allen and Webb tied at 43% each among 625 likely voters** interviewed Saturday through Wednesday last week. Three weeks earlier, they had Allen ahead by 4 points (46% to 42%). While the difference between a four point lead and a tie obviously has great psychological significance, the two results are within sampling error of each other.
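The "within sampling error" claim can be checked with a rough calculation. This is a sketch under assumptions of my own: simple random sampling, and a sample size of 625 for the earlier Mason-Dixon poll (only the latest poll's sample size is reported above). The helper name is hypothetical.

```python
import math


def lead_variance(p_a, p_b, n):
    """Variance of the lead (p_a - p_b) from one simple random sample."""
    return (p_a + p_b - (p_a - p_b) ** 2) / n


# Earlier poll: Allen 46%, Webb 42%. The sample size is an assumption.
earlier = {"allen": 0.46, "webb": 0.42, "n": 625}
# Latest poll: tied at 43% among 625 likely voters.
latest = {"allen": 0.43, "webb": 0.43, "n": 625}

# Change in Allen's margin between the two polls.
change = (latest["allen"] - latest["webb"]) - (earlier["allen"] - earlier["webb"])

# The polls are independent samples, so their variances add.
se_change = math.sqrt(
    lead_variance(earlier["allen"], earlier["webb"], earlier["n"])
    + lead_variance(latest["allen"], latest["webb"], latest["n"])
)
z_score = change / se_change
moe_change = 1.96 * se_change

print(f"Change in Allen margin: {change * 100:+.1f} points")
print(f"95% margin of error on that change: +/-{moe_change * 100:.1f} points")
print(f"z = {z_score:.2f} (significant at the 95% level only if |z| > 1.96)")
```

With these inputs the four-point narrowing sits well inside a margin of error of about ten points on the change, consistent with the conclusion that the two results do not differ significantly.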
Are the trends in these polls hopelessly conflicted? Not really. Neither trend is statistically significant. Consider the chart below that Charles Franklin created for us this morning. It plots the Allen margin (Allen's vote minus Webb's vote). While the chart shows the random variation from poll to poll, the grey "local trend" regression line (based on the telephone polls only) fits the points well and shows the narrowing of the race that occurred over the summer mostly flattening out in September.
[Obviously, the Zogby Internet surveys are an exception. While the most recent release snaps back into line with other polls, four out of six Zogby polls have shown much more Democratic results, and the huge Allen rebound they show in September is inconsistent with what the other surveys have shown].
The recent general stability in the telephone surveys is remarkable if you consider how massively Allen has been outspending Webb over the same period. The controversies surrounding Allen may be dominating the press coverage of the race, but the air war is a totally different story. In June, Allen reported $6.6 million in cash-on-hand to Webb's $424,000, and that advantage is now playing out. The Hotline (subscription only) reported on Thursday that Allen has already spent over $3.5 million on electronic advertising and that his campaign has purchased another $650,000 in television time next week, much of it attacking Webb.
Webb's advertising is very light by comparison. The Associated Press reported on Wednesday that Webb started airing his second television ad on Tuesday, but only in "the Norfolk and Roanoke broadcast markets and cable in the Washington suburbs at a total cost of $65,000 through Oct. 2." So in the voter-rich Richmond and Washington, D.C. markets, voters have seen plenty of this advertisement attacking Webb, but have seen next to nothing of the spots introducing Webb.
So the controversies surrounding Allen are having an effect. For now, at least, they appear to be blunting the impact of Allen's attack ads, and that alone is remarkable.
**The language of the Mason-Dixon release is a little confusing, but they are reporting results among "likely voters" and not all registered voters: "A total of 625 registered Virginia voters were interviewed statewide by telephone. All stated they were likely to vote in the November general election" (emphasis added).
I just want to post a quick apology for the odd appearance of certain corners of the site this morning. As regular readers may notice, we have been hard at work on a few new menu choices for the blog pages. Unfortunately, at the moment, not everything is working as smoothly as it should. We are hoping to have everything cleaned up soon, and will post more details about the new features when they are ready. Thank you for your patience.
So where do you turn when Jon Stewart's Daily Show decides to skewer not just opinion polls but poll charts as well? That's right, you turn to Pollster.com (though keep in mind that they may have earned their TV-14 rating with this one):
As the man says, "anything that gets the public fired up about Statistics has got to be a good thing." Speaking of which (for those who may have landed here via link or search engine), we do have quite a bit of material here on how to make sense of those oh-so-confusing poll numbers. Please feel free to look around.
And as for the "poll smoking" line, oh Wonkette Emerita, where are you when we need you?
Overlooked: As William Loeb points out below, this was not the first installment of the "Poll Smoking" segment. See his comment for more details.
If one story is more important than all others this year--to those of us who obsess over political polls--it is the proliferation of surveys using non-traditional methodologies, such as surveys conducted over the Internet and automated polls that use a recorded voice rather than a live interviewer. Today's release of the latest round of Zogby Internet polls will no doubt raise these questions yet again. Yet for all the questions being asked about their reliability, discussions using hard evidence are rare to non-existent. Over the next month, we are hoping to change that here on Pollster.com.
Just yesterday in his "Out There" column (subscription only), Roll Call's Louis Jacobson wrote a lengthy examination of the rapid rise of these new polling techniques and their impact on political campaigns. Without "taking sides" in the "heated debate" over their merits, Jacobson provides an impressive array of examples to document this thesis:
[I]t's hard to ignore the developing consensus among political professionals, especially outside the Beltway, that nontraditional polls have gone mainstream this year like never before. In recent months, newspapers and local broadcast outlets have been running poll results by these firms like crazy, typically without defining what makes their methodology different - something that sticks in the craw of traditionalists. And in some cases, these new-generation polls have begun to influence how campaigns are waged.
He's not kidding. Of the 1,031 poll results logged into the Pollster.com database so far in the 2006 cycle from statewide races for Senate and Governor, more than half (55%) have been done by automated pollsters Rasmussen Reports, SurveyUSA or over the Internet by Zogby International. And that does not count the surveys conducted once a month by SurveyUSA in all 50 states (450 so far this year alone). Nor does it count the automated surveys recently conducted in 30 congressional districts by Constituent Dynamics and RT Strategies.
Jacobson is also right to highlight the way these new polls "have made an especially big splash in smaller-population states and media markets, where traditional polls - which are more expensive - are considered uneconomical." He provides specific examples from states like Alaska, Kansas and Nevada. Here is another: Our latest update of the Slate Election Scorecard (which includes the automated polls but not those conducted over the Internet) focuses on the Washington Senate race, where the last 5 polls released as of yesterday's deadline had all been conducted by Rasmussen and SurveyUSA.
Yet the striking theme in coverage of this emerging trend is the way both technologies are lumped together and dismissed as unreliable and untrustworthy by establishment insiders in both politics and survey research.
Jacobson's piece quotes a "political journalist in Sacramento, Calif," who calls these new surveys "wholly unreliable" (though he does include quotes from a handful of campaign strategists who find the new polls "helpful, within limits").
Consider also the Capital Comment feature in this month's Washingtonian, which summarizes the wisdom of "some of the city's best political minds" (unnamed) on the reliability of these new polls. Singled out for scorn were the Zogby Internet polls - "no hard evidence that the method is valid enough to be interesting" - and the automated pollsters, particularly Rasmussen:
[Rasmussen's] demographic weighting procedure is curious, and we're still not sure how he prevents the young, the confused, or the elderly from taking a survey randomly designated for someone else. Most distressing to virtually every honest person in politics: His polls are covered by the media and touted by campaigns that know better
The Washingtonian feature was kinder to the other major automated pollster:
SurveyUSA's poll seems to be on the leading edge of autodial innovation. Its numbers generally comport with other surveys and, most important, with actual votes.
[The Washingtonian piece also had praise for the work of traditional pollsters Mason-Dixon and Selzer and Co, and complaints about the Quinnipiac College polls]
Or consider the New York Times' new "Polling Standards," noted earlier this month in a Public Editor column by Jack Rosenthal (and discussed by MP here), and now available online. The Times says both methodologies fall short of their standards. While I share their caution regarding opt-in Internet panels, their treatment of Interactive Voice Response -- the more formal name for automated telephone polls -- is amazingly brusque:
Interactive voice response (IVR) polls (also known as "robo-polls") employ an automated, recorded voice to call respondents who are asked to answer questions by punching telephone keys. Anyone who can answer the phone and hit the buttons can be counted in the survey - regardless of age. Results of this type of poll are not reliable.
Skepticism about IVR polling based on theoretical concerns is certainly widespread in the survey research establishment, but one can look long and hard for hard evidence of the lack of reliability of IVR, or even Internet polling, without success. Precious little exists, and the few reviews available (such as the work of my friend, Prof. Joel Bloom, or the 2004 Slate review by David Kenner and William Saletan) indicate that the numbers produced by the IVR pollsters comport as well as or better with actual election results than those from their traditional competitors.
The issues involving these new technologies are obviously critical to those who follow political polling and require far more discussion than is possible in one blog post. So over the next six weeks, we are making it our goal here at Pollster to focus on the following questions: How reliable are these new technologies? How have their results compared to election results in recent elections? How do the current results differ from the more traditional methodologies?
On Pollster, we are deliberately collecting and reporting polls of every methodology -- traditional, IVR and Internet -- for the express purpose of helping poll consumers make better sense of them. We certainly plan to devote a big chunk of our blog commentary to these new technologies between now and Election Day. And while the tools are not yet in place, we are also hoping to give readers the ability to do their own comparisons through our charts.
More to say on all the above soon, but in the meantime, readers may want to review my article published late last year in Public Opinion Quarterly (html or pdf), which looked at the theoretical issues raised by the new methods.
Interests disclosed: The primary sponsor of Pollster.com is the research firm Polimetrix, Inc. which conducts online panel surveys.
Our most recent Slate Scorecard update focuses on some changes to the races for Governor. Specifically, we have seen recent gains by Democrats Ed Rendell in Pennsylvania and Deval Patrick in Massachusetts. We had already classified both states as "strong" Democrat and both candidates appear to be widening their leads.
Our most recent Senate Scorecard update yesterday reviewed new polls in New Jersey. The Garden State polls all confirm that the race is very close, with Republican Tom Kean Jr. running a few non-significant points ahead. Meanwhile, the spate of new surveys in Pennsylvania confirms that Casey's double-digit lead is as commanding as ever.
The most interesting thing about the tracking is the continued trend toward Democrats in many states, despite the apparent slight improvement in the Bush job rating nationally. The Slate feature includes a "momentum meter" for each contest based on an algorithm Charles Franklin created that looks for significant changes in the margin of the last five polls within each state. The Senate Scorecard also features a national "momentum arrow" that looks at changes in the average of averages across all of the 13 states we are tracking.
Right now the overall momentum arrow is pointing in the Democratic direction, and six states (Arizona, Minnesota, Missouri, Rhode Island, and Tennessee) indicate trends in the last five polls toward the Democrat. None of the Republican candidates are showing recent momentum shifts. This trend is definitely something to keep an eye on. If the national political environment improves for Republicans, we would expect to see some impact on these Senate races. So far at least, no such improvement is evident.
Two recent polls provide case studies in how pollsters determine the demographics of "likely voters," especially the gender breakdown. The answer is not as simple as you might imagine, although when it comes to gender, some public pollsters show a surprising reluctance to adjust likely voter samples that produce highly implausible percentages of women.
Most of us understand that polls aim to capture a snapshot of attitudes at the moment they are taken, "as if the election were held today." On that much, most pollsters agree. But my pollster colleagues tend to disagree about how much they are willing to assume about who will vote. Media pollsters generally prefer to use procedures that select "likely voters" based on the respondent reports of their likelihood to vote, past voting behavior and interest in the campaign, while allowing the demographics of the resulting likely voter sub-sample to vary from poll to poll. Internal campaign pollsters are more willing to make estimates of some of the demographics of likely voters and adjust their samples accordingly to keep the demographic composition reasonably consistent.
Given that background, consider our first case study, a recent survey of Michigan voters conducted by EPIC/MRA for the Detroit News and several local television stations. The survey gave Democratic governor Jennifer Granholm an eight-point lead over Republican challenger Dick DeVos (50% to 42%), but also reported a gender composition of 57% female. Some Republicans cried foul, so the National Journal's Hotline (subscription only) contacted EPIC/MRA pollster Ed Sarpolous for comment:
The poll is a great example of the science of weighting polls, says EPIC/MRA's Ed Sarpolous. He explains that, when conducting a poll for a media client, the client has two options: They can take a snapshot of the race as it stands at that moment in time, or they can choose to guess what the electorate will do come Election Day. The difference is all in how the pollster weights the polls.
Here I have to stop. In my experience, pollsters rarely try to "guess what the electorate will do," although we may sometimes make an educated guess about who the electorate will be as described above. It is puzzling that Sarpolous would describe his weighting procedure in these terms, although the specifics of the rest of his explanation have much more to do with the demographic composition of his sample than its vote preference. The Hotline report continues:
The survey Sarpolous conducted, though, was unweighted. The [unweighted] "snapshot," as he calls it, allows his clients to take a look at the state of the race today. In this case, he says, men -- especially Republican men -- weren't making it through his screens of likely voters. That is, they were telling his interviewers that they were unlikely to vote. That made the unweighted sample of likely voters overwhelmingly female.
Sarpolous says his clients had to answer a question when deciding how to weight their poll -- or leave it alone: "Are you looking to write a story about what's happening today, or what's going to happen in 55 days?"
Two things are odd about this explanation. The first is that most media pollsters begin with a sample of all adults, and weight that adult sample to match the highly reliable demographic estimates from the U.S. Census. They then select a pool of registered or likely voters from the larger adult sample, allowing the demographics of the sub-sample to vary.** It would be very unusual if the EPIC/MRA survey did no demographic weighting at all, but it is unclear from the explanation above.
Second, while estimating the composition of voters in terms of age or race can be difficult, we do have reasonably consistent estimates of gender. One such estimate comes from the Current Population Survey (CPS) of the U.S. Census. The CPS is also a survey, of course, and their voter estimates are based on self-reported voting behavior. However, the CPS is based on a very large initial sample (60,000+ households nationally each month) with a very high response rate (90%+). The following table shows the CPS estimates by state for 1998 and 2002 (kindly provided by Professor Michael McDonald of George Mason University, who has been analyzing CPS estimates of voter demographics for an upcoming journal article):
A note of caution: The CPS voter sample sizes in many states are well under 1,000, and as such, the estimates no doubt include much random variation due to sampling error. However, even with the variation, CPS reported very few states with a gender composition of 57% or higher in either year (the District of Columbia and Delaware in 2002, D.C. and Mississippi in 1998). The higher percentage of women in places like DC and Mississippi owes to a greater percentage of African American voters. CPS has shown a large and persistent gender gap in turnout among African Americans (see the report on differences in turnout by race in the 2002 CPS, Table B).
Michigan had a larger sample size in both years (n=1,437 in 2002). The gender percentage there was reasonably consistent -- 53.0% in 1998, 53.4% in 2002. The CPS is not the only source. A pollster might also consider the results of past exit polls and the gender statistics available from the lists of past voters. Except for urban areas and states like Mississippi, the gender composition of voters rarely deviates more than a percentage point or two from 52-53% female. Speaking from personal experience, most campaign pollsters would weight "likely voter" data for that state to 53% female.
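To make the adjustment concrete, here is a minimal sketch of the gender weights a pollster would apply to bring a sample that is 57% female (the EPIC/MRA composition reported above) to a 53% female target. This illustrates the general technique only, not EPIC/MRA's actual procedure, and the function name is hypothetical.

```python
def poststratification_weights(sample_shares, target_shares):
    """Weight for each group = target share / share observed in the sample."""
    return {
        group: target_shares[group] / sample_shares[group]
        for group in sample_shares
    }


# Observed composition of the likely voter sample (from the survey report).
sample = {"female": 0.57, "male": 0.43}
# Target composition (the 53% female figure discussed above).
target = {"female": 0.53, "male": 0.47}

weights = poststratification_weights(sample, target)

# After weighting, each group's share of the sample matches the target.
weighted_female = sample["female"] * weights["female"]
weighted_male = sample["male"] * weights["male"]

for group, weight in weights.items():
    print(f"{group}: weight {weight:.3f}")
print(f"Weighted composition: {weighted_female:.0%} female, {weighted_male:.0%} male")
```

Each woman's interview counts for a bit less than one (about 0.93) and each man's for a bit more (about 1.09). How much this moves the topline depends entirely on how differently men and women are voting.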
The Hotline story went on to say that after "the public outcry," Sarpolous went back and weighted his results by gender to reflect a close balance of men and women.
As it turns out, men were nearly as likely as women to favor DeVos and, when weighted evenly to predict Election Day turnout, the results ended up the same.
That may be, though I am still puzzled why the pollster would not have conducted this sort of analysis before releasing the initial results.
The gender composition of our second case study was probably more consequential. A survey of Indiana's 8th Congressional District conducted recently for the Evansville Courier & Press by Indiana State University showed Democratic challenger Brad Ellsworth leading incumbent Republican John Hostettler by a surprising 15-point margin (47.4% to 31.8%). The Courier & Press reported that 63.5% of the 603 interviews conducted among registered voters were women. Most campaign pollsters would probably agree with the assessment of Republican pollster Bill Cullo, who called the gender mix "unprecedented" yesterday on Crosstabs.org.
Another recent automated survey on Indiana-08 conducted by the Majority Watch project (RT Strategies and Constituent Dynamics) weighted their results to 48% male, 52% female. They also showed a significant gender gap in voter preferences. Their survey had Hostettler leading by 14 points among men (56% to 42%) but trailing by a whopping 24 points among women (36% to 60%).
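To see why the gender mix can matter, here is a hypothetical calculation that applies the Majority Watch vote by gender to two different compositions: the 52% female mix that Majority Watch weighted to, and the 63.5% female mix in the Courier & Press sample. It is a what-if only. It assumes the Majority Watch gender gap holds in both cases, which says nothing about what the Courier & Press interviews actually found.

```python
# Majority Watch vote shares by gender (percent), as reported above.
vote_by_gender = {
    "male": {"hostettler": 56.0, "ellsworth": 42.0},
    "female": {"hostettler": 36.0, "ellsworth": 60.0},
}


def ellsworth_lead(female_share):
    """Ellsworth margin in points for a given share of women in the sample."""
    male_share = 1.0 - female_share
    ellsworth = (male_share * vote_by_gender["male"]["ellsworth"]
                 + female_share * vote_by_gender["female"]["ellsworth"])
    hostettler = (male_share * vote_by_gender["male"]["hostettler"]
                  + female_share * vote_by_gender["female"]["hostettler"])
    return ellsworth - hostettler


lead_at_52 = ellsworth_lead(0.52)    # Majority Watch weighting target
lead_at_635 = ellsworth_lead(0.635)  # Courier & Press sample composition

print(f"Ellsworth lead at 52% female:   {lead_at_52:.1f} points")
print(f"Ellsworth lead at 63.5% female: {lead_at_635:.1f} points")
print(f"Shift due to gender mix alone:  {lead_at_635 - lead_at_52:.1f} points")
```

With a gender gap that large, moving from 52% to 63.5% female adds a bit more than four points to the Ellsworth margin. With no gender gap, the same change in composition would add nothing.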
What were the results of the Courier & Press survey by gender? The initial poll story includes no tabulations by gender. However, given the high proportion of women in their sample, they owe their readers some indication of how this very unusual composition may have affected the results.
UPDATE: A follow-up article from the Courier & Press explains that their survey showed little difference in voter preference by gender. See my subsequent post for details.
**CLARIFICATION: In using the word "pool" above I did not mean to imply that the process of selecting registered or likely voters involves a second round of random sampling. Pollsters simply select the subgroup of interest (self-identified registered voters, or voters that they classify as "likely") from the larger sample. The process is analogous to selecting any other subgroup (women, 18-30 year olds, union members, etc.).
Weighting or adjusting a sample of all adults by demographics like gender and age is not controversial among media and political pollsters, because, as I wrote above, we can base those adjustments on highly reliable U.S. Census estimates of the adult population. The practice of separately weighting the subgroup of registered or likely voters is more controversial, because the demographics of those populations vary slightly from election to election, and estimates are less reliable.
Finally, the Courier & Press survey was conducted by the Sociology Research Lab at Indiana State University. The original version of this post identified it incorrectly as Indiana University.