Mark Blumenthal | April 26, 2008
Among Registered Voters:
Obama 47, McCain 44
Clinton 48, McCain 45
Among Registered/Dem-Dem Leaners:
Obama 48, Clinton 41
The Gallup Daily
American Research Group
Conducted April 23-24, n=600, margin of sampling error +/- 4%
Clinton 50, Obama 45
Thanks to Pollster reader BC
The Gallup Daily
Pennsylvania: Clinton 47%, McCain 42%...McCain 44%, Obama 43%
Indianapolis Star-WTHR, conducted by Selzer and Associates
Obama 41, Clinton 38
Thanks to alert Pollster reader SK
South Bend Tribune, WSBT-TV, WISH-TV, WANE-TV, conducted by Research 2000
Sample of 600 "likely voters who vote regularly in state elections," margin of sampling error +/-4%, conducted April 21-24; Sample of 400 "likely Democratic primary voters," conducted April 23-24.
WSBT story, results
South Bend Tribune story
Obama 48, Clinton 47
McCain 51, Obama 43
McCain 52, Clinton 41
An organization named SaveTheVoters.org today released results of two surveys it commissioned in Florida and Michigan that were fielded by Public Policy Polling (PPP). The surveys were fielded April 18-20 among 1,020 "Democratic voters" divided roughly evenly between the two states. Here are links to the release, Michigan results, Florida results, selected crosstabs.
Just to make the week more challenging (in light of Eric Dienstfrey's well deserved but not so well timed vacation this week), I agreed some months ago to speak tomorrow on a special panel, "Has Polling Killed Democracy", put on by The Miller Center of Public Affairs at the University of Virginia. So posting will be delayed tomorrow. Apologies in advance for that. We will (hopefully) get back to normal on Monday.
If you are not in Charlottesville tomorrow and still have too much free time on your hands, the panel will be webcast live and archived online at the Miller Center website.
The Gallup Daily
My NationalJournal.com column, a follow-up on the ongoing debate over counting the "popular vote" in the Democratic primary contest, is now online.
In the column, I quoted a passage from an article by the late Austin Ranney about the intent of the McGovern-Fraser commission, whose reforms following the 1968 election helped create the current presidential primary system. The quote appeared in the following article: "Changing the Rules of the Presidential Nominating Game: Party Reform in America," in Parties and Elections, ed. Jeff Fishel (Bloomington: Indiana University Press, 1978). I found it in Rhodes Cook's invaluable volume, The Presidential Nominating Process, A Place for Us? (Rowman and Littlefield, 2004).
As the Ranney article is not available online, I thought readers might appreciate seeing the complete quote with a bit more context. In 1968, only 13 states held primaries, and those were dominated by two candidates: Eugene McCarthy and Robert Kennedy, who received 39% and 31%, respectively, of the popular vote cast in those primaries. Hubert Humphrey received only 2% of the primary vote. Yet at the time of Kennedy's death just after the final California primary, Humphrey had 561 delegates to 393 for Kennedy and 258 for McCarthy. Humphrey ultimately won his nomination on the first ballot with the support of 1759 delegates.
According to Rhodes Cook's account, the McGovern-Fraser Commission was the result of efforts to "mollify" the supporters of Kennedy and McCarthy who "vociferously complained of the archaic, anti-democratic state-delegate selection processes that boosted Humphrey, and of ham-handed tactics by party leaders at the convention that maintained Humphrey's delegate majority and their control of the party conventions" (p. 42). The commission was "well stocked with proponents of reform," but was "short on representatives from organized labor and the party's urban machines that could be counted on to defend the status quo" (p. 43).
Here is the full passage from the Ranney article (p. 220):
I well remember that the first thing we members of the Democratic party's McGovern-Fraser commission (1969-72) agreed on -- and about the only matter on which we approached unanimity -- was that we did not want a national presidential primary or any great increase in the number of state primaries. Indeed, we hoped to prevent any such development by reforming the delegate-selection rules so that the party's non-primary processes would be open and fair, participation in them would greatly increase, and consequently the demand for more primaries would fade away. And most of us were confident that our guidelines would accomplish all these ends.
But we got a rude shock. After our guidelines were promulgated in 1969 no fewer than eight states newly adopted presidential primaries, and by 1972 well over two-thirds of all the delegates were chosen or bound by them. Moreover, in 1973 Congress was considering a national presidential primary more seriously than ever before. Of course, it cannot be said that the guidelines were the sole cause for the proliferation of primaries. But we do know that in a majority of the eight cases the state Democratic parties, who controlled the governorships and both houses of the legislature, decided that rather than radically revise their accustomed ways of conducting caucuses and conventions for other party matters, it would be better to split off the process for selecting national convention delegates and let it be conducted by a state-administered primary which the national party would then have to accept.
Ranney went on to consider arguments for and against primaries. After noting that a May 1972 Gallup poll showed 72% of Americans favoring a national primary, he presented a rationale for the minority view (p. 222):
Other Americans, however, believe that a national primary would do more harm than good. It would put an even greater premium than at present on large-scale mass media advertising, polling, public relations expertise and all the other costly features of "the new politics." And this, in turn, would put a premium on big money. Moreover, it would restrict most citizens to just one form of participation in the nominating process, and that would not be healthy for them or for the nation. People of this persuasion therefore agree with the McGovern-Fraser commission's conclusion that "purged of its structural and procedural inadequacies, the National Convention is an institution worth preserving."
I included the shorter reference to Ranney's recollections in the column, and the longer version here, not because they suggest any particular resolution to the debate about the "popular vote" but because they add some interesting and often ironic context. Partisans on both sides will see support for their positions in this history, but it is still history worth knowing.
Siena (College) Research Institute
Clinton 46, McCain 42
Obama 45, McCain 40
n=500 likely y voters, fielded 4/22
Pennsylvania was a pretty good night for most pollsters, certainly compared to some earlier primaries this year. A few made it into the "five-ring" of the target, while almost all were within the "ten-ring". Only two polls, one rather old, got the winner wrong.
Polls finished on or after April 14 are included in the analysis here.
These errors are based on the vote counts at the Pennsylvania Secretary of State web site as of Wednesday afternoon, with 99.44% of precincts reporting and a Clinton vote of 1,237,696 to Obama's 1,029,672, which rounded to 1 decimal point is 54.6% to 45.4%.
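For readers who want to check the arithmetic, here is a minimal sketch of how the two-candidate percentages follow from the raw counts quoted above. The function name is my own, and the calculation assumes the percentages are taken over the Clinton and Obama votes only.

```python
def two_candidate_shares(votes_a, votes_b):
    """Return each candidate's percentage of the combined two-candidate vote."""
    total = votes_a + votes_b
    return 100 * votes_a / total, 100 * votes_b / total


# Vote counts quoted in the text (99.44% of precincts reporting).
clinton_pct, obama_pct = two_candidate_shares(1_237_696, 1_029_672)
margin = clinton_pct - obama_pct

print(round(clinton_pct, 1), round(obama_pct, 1), round(margin, 1))
```

Rounded to one decimal place this reproduces the 54.6% to 45.4% split and the 9.2 point margin used throughout.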
There are a number of different ways to compute accuracy for individual pollsters. SurveyUSA has an excellent assessment and explanation of these as well as measures for all pollsters in all primaries this year. (Their Pollster Report Card is currently masked, awaiting a 100% count from Pennsylvania, so I can't link to it right now. I don't expect the remaining precincts to change the 1 decimal point accuracy here, though I will check and update if necessary.)
The measure of accuracy I use here is being close to the "bullseye" of the target above. I think that is what most people would intuitively think of as accuracy-- getting both candidates right. A "perfect" poll would be exactly on the crosshairs in the middle of the target, which corresponds to getting both candidates' votes exactly right.
Because polls almost always include "undecided" voters, their results tend to be in the lower left quadrant, underestimating the final vote for each candidate. (And in a two candidate race, it is impossible to be in the upper right quadrant, but not so in multi-candidate primaries earlier in the year.)
To summarize a pollster's accuracy, I calculate the distance from their poll to the crosshairs of the bullseye. (The distance is the square root of the sum of squared errors for each candidate, if you recall your math about triangles and the hypotenuse.) This "Total Error" is plotted by pollster below. Smaller errors are to the left.
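As a rough sketch of that calculation (the function and variable names are mine, and the "actual" result is the rounded 54.6% to 45.4% count used in this post), the distance for a single poll could be computed like this:

```python
import math

# Rounded vote shares used in this post: (Clinton, Obama).
ACTUAL = (54.6, 45.4)


def total_error(poll_clinton, poll_obama, actual=ACTUAL):
    """Straight-line distance from a poll to the crosshairs of the bullseye.

    This is the square root of the sum of the squared errors for each
    candidate, i.e. the hypotenuse of the two per-candidate misses.
    """
    return math.hypot(poll_clinton - actual[0], poll_obama - actual[1])


# Quinnipiac's 51%-44% poll, as an example.
quinnipiac_error = total_error(51, 44)
print(round(quinnipiac_error, 2))
```

A poll that hit both candidates exactly would score zero, and every point of undecided left on the table pushes the score up.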
Quinnipiac gets the bragging rights by this measure, with their 51%-44% from polling completed 4/18-20/08. They are followed by Suffolk, ARG and SurveyUSA.
The dots become darker the closer to election day the poll was taken. In this plot, the more recent polls are usually more accurate than are older polls. This is especially clear in the Zogby/Newsmax polling.
A reasonable complaint about this measure is that a poll that finds more "undecided" voters will tend to be further away from the bullseye, and so this measure penalizes pollsters who are more sensitive to potential uncertainty among voters, while possibly rewarding those who push respondents harder for an answer. Deciding how hard to push for a preference is part of the "art" of polling and reasonable pollsters may differ on how hard to push.
An alternative measure focuses on the "margin" between the candidates in the poll compared to the vote. By this approach a poll with a 10 point margin is "right on" if the vote margin is 10 points. But this is true for a poll that has 55-45 as well as for one that has it 45-35 or even 25-15. Despite this drawback, the margin measure doesn't penalize for undecided rate and so it has fans. By that measure the pollsters line up as below.
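To make the drawback concrete, here is a small sketch of the margin measure. The function name is mine, the actual margin is taken from the rounded 54.6% to 45.4% count, and the three polls are the hypothetical ones from the paragraph above, not real surveys.

```python
# Rounded vote shares used in this post: (Clinton, Obama).
ACTUAL = (54.6, 45.4)


def margin_error(poll_clinton, poll_obama, actual=ACTUAL):
    """Poll margin minus vote margin, in points (positive = poll too pro-Clinton)."""
    return (poll_clinton - poll_obama) - (actual[0] - actual[1])


# Three hypothetical polls with very different undecided rates
# but the same 10 point margin.
hypothetical_polls = [(55, 45), (45, 35), (25, 15)]
errors = [margin_error(c, o) for c, o in hypothetical_polls]

for (c, o), err in zip(hypothetical_polls, errors):
    print(f"{c}-{o}: margin error {err:+.1f}")
```

All three polls get the same score of about +0.8, even though the last one leaves 60% of voters unallocated.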
Here two pollsters can each claim victory. Suffolk and Zogby/Newsmax each had a 10 point margin in their final polls, just a bit over the 9.2 point margin in the vote count. The Insider Advantage final poll has a larger error, with a 7 point margin in their 4/21 poll, while the 10 point margin was for their 4/20 poll. Likewise, Rasmussen's 9 point margin came from a very old poll taken 4/14 (50%-41%) while their final poll taken 4/20 had a 5 point margin (49%-44%).
Cross-posted at PoliticalArithmetik.com.
The Gallup Daily
Nevada: McCain 49, Clinton 38...McCain 48, Obama 43
Conducted April 19-21, n=734 likely Democratic primary voters
As of this writing, the Pennsylvania Secretary of State reports that Hillary Clinton defeated Barack Obama yesterday by a 9.2% margin (54.6% to 45.4%) with 99.4% of precincts reported. Given rounding and small discrepancies, some news organizations have been reporting the margin as either nine or ten points, but for the purposes of comparing the pollsters' performance, we are close enough. Our final "standard" estimate based on all the public polls showed Clinton ahead by seven points. Considering that Clinton once again did better among those who made up their minds over the last three days (leading 58% to 42%) than those who decided earlier (52% to 48%), on average at least, the public polls did reasonably well.
Also, via Jon Cohen at the Washington Post, the current exit poll results by race and education: Clinton won college educated white voters by six points (53% to 47%; they were 41% of all voters) and whites without a college degree by 40 points (70% to 30%; 40% of all voters). As such, Clinton did better among both groups than in the Quinnipiac polling I featured here, although college educated voters, both white and of all races, were a significantly bigger portion of the electorate as measured by the exit polls than in most of the pre-election polls.
Obama's performance within these subgroups was only slightly better than in Ohio, where Clinton won white non-college educated voters by 44 (71% to 27%) and white college educated voters by 7 (52% to 45%). However, the Pennsylvania numbers may still be subject to one more round of re-weighting, so stay tuned.
I hope to have more later, but for now, here are some links to poll related post-election coverage.
Exit poll and related analysis:
Complaints about the continuing problems with early leaked exit poll numbers:
Other pollster related news:
If you've seen other exit poll analysis or pollster related news, email me a link and I'll add it to the list.
[Updated with additional links]
I will be live-blogging here starting very soon on what the exit polls will have to tell us about the Pennsylvania primary. More details to follow, but please feel free to use this as an open-thread on what is appearing on the net and elsewhere on the exit polls. Here are the links where official exit poll tabulations will appear shortly after the polls close at 8:00 p.m.:
Comments will appear in reverse chronological order.
11:25 - Folks, I'm calling it a night. With 88% of the vote counted as of this writing, Clinton is holding a lead of exactly 10 percentage points. The pre-election polls did reasonably well, especially given that Clinton did better among those deciding late. The early leaked exit poll estimates were off once again. More tomorrow.
9:50 - Reader "Anon" posted the link to the Pennsylvania Secretary of State's official count. At this point, the official count is probably the best source of information about the ultimate vote count. Though having posted that link, I now see that the percentage of the vote counted in the official network news sites.
9:45 - After nearly 90 minutes, the exit poll tabulations update, and Mark Lindeman reports the underlying estimate extrapolates to 54% Clinton, 46% Obama. The geostrata show Philadelphia at 18% of all voters, Philly suburbs at 16%.
8:57 - Noticed an error in the 2006 turnout numbers I posted below. Now corrected.
8:50 - NBC calls Pennsylvania for Clinton.
8:37 - Here are some of the most important numbers to watch in the exit poll tabulations as they update: The 52% to 48% Clinton lead assumes that Philadelphia County is contributing 16% of the statewide vote and the rest of the Philadelphia suburbs (Berks, Chester, Delaware and Montgomery Counties) is contributing 16%.
These "geostrata" estimates are among the most important parts of the early tabulations to watch, because they update as the evening wears on and the analysts incorporate first the hard turnout counts from the sampled precincts and later the actual vote count.
For comparison's sake, here are some comparable actual numbers. As of January, the Democrats of Philadelphia County were 19% of all registered voters statewide, but in the 2000 Senate primary they contributed 16.7% of the vote. In the 2002 gubernatorial primary between Ed Rendell and Bob Casey, Philadelphia surged to 23.0% of votes cast, and in the 2006 primary for Lieutenant Governor it was still 22.7% of the votes.
The Democrats in the four counties that the exit pollsters define as Philadelphia suburbs were 15.6% of registered Democrats statewide in January. In the three contests cited above, the Philly suburbs contributed 9% in 2000, 16.2% in 2002, and 21.6% in 2006.
PS: Nearly forgot: for those not noticing the fine print, these two regions are, as expected, Obama's best in the state. So if the composition ends up looking more like 2002 and 2006 than 2000, his numbers will improve.
Also, I should note that those numbers come from a spreadsheet shared by a friend. It is always possible I've made a computation error, so if anything looks out of order, please let me know.
8:17 - The tabulations show Clinton doing 10 points better among those who made up their minds in the last 3 days (she leads 60% to 40%) than among those who decided earlier than that (who divide 50-50). That is as good or better in terms of late deciders than in previous contests. See the 6:26 update.
8:01 - The characterization from MSNBC is "too close to call." Their website has the initial cross-tabulations posted, and Mark Lindeman has his extrapolation for us: 52% Clinton, 48% Obama. These are based on interviews received just before the polls close.
See the 6:39 update, but it's worth remembering that this initial update is almost always based on what exit pollsters call the "composite" estimate (an average of the exit poll tallies and pre-election polls). Subsequent updates will add in more exit poll interviews being phoned in now, constantly improving estimates of the turnout by the exit poll "geostrata" and, gradually, the actual vote for the sampled precincts. Also very important to keep in mind: The as-the-polls-close estimate for Ohio also had Clinton leading by four points.
7:48 -- I'm back.
Alert reader Daniel T posts the link to an AP write-up of some of the non-vote-preference results from the preliminary exit poll tabulations. With the same caveats as noted below about late afternoon partial results, here is some initial demographic information:
As expected, Pennsylvania's Democratic voters were overwhelmingly white and — as usual in Democratic contests — there were more women than men. About three in 10 were age 65 or over. Nearly half were from families that earned less than $50,000 last year. A quarter had household income of more than $100,000 and about as many reported having a postgraduate degree.
Three in 10 Pennsylvania Democratic voters were union members or had one in their household. And four in 10 had a gun owner in the household.
Those numbers are more or less in range with what pre-election polls reported, although the $100K number is higher. For what it's worth: telephone polls get higher refusals on income (for reasons that ought to be obvious).
Those clicking back through to my post on demographics will note that SurveyUSA reported a much higher percentage of college educated voters than other polls. That is almost certainly a function (as I learned this afternoon via email) of the very different question they ask about education:
Have you graduated from a 4-year college?
Yes, press 1.
Most telephone surveys use a multi-category question that asks about years of education and typically offers the category of "some college" (meaning coursework short of a degree). As such, the difference for SurveyUSA was almost certainly about the question, not about their sample.
7:00 - This is probably a good time to remember these words of wisdom from TNR's Mike Crowley, written at this time on March 4:
Stop the Madness!
In the last couple of hours I've gotten allegedly reliable Ohio exit poll information showing
a) Narrow Obama lead
b) Narrow Hillary lead
c) Hillary blowout
I think from now on political journalists should turn off their BlackBerries from 5-8pm on election nights and, like, go do ESL tutoring or some other charitable work instead.
In that spirit, I am going to use the next 30 minutes to relocate to the Pollster.com "home office." I'll be back online before the polls close.
6:49 - Josh B comments:
cnn is reporting that according two their exits 58% of those that decided in the last week went for clinton how does that compare to your chart i am not to good at interpreting charts.
The ultimate question is how that number -- if accurate -- compares to those who decided earlier. The table shows that the comparable number for Ohio was 57% (and that was from the final exit poll, weighted to match the result).
6:39 - Shortly after the polls close at 8:00 p.m., our friend Mark Lindeman will report the extrapolated overall vote estimate used to weight the exit poll cross-tabulations. These estimates begin as a mashup of pre-election polls and the exit poll interviews conducted at polling places and over the phone (with early voters) by the networks. These estimates improve, becoming more accurate over the course of the night. Click here for the usual caveats on how these numbers are derived and how they improve over the course of the evening. And see my post-3/4 National Journal column for the evidence that while the "as the polls close" numbers are better than those leaking now, they have still had their problems.
6:26 - I have my television tuned to MSNBC, where Norah O'Donnell occasionally pops up and reads preliminary exit poll results that do not pertain (directly) to vote preference estimates. One result she teased just before 6:00 involved the percentage (17%) that said they made up their minds in the last three days. I spent some time this afternoon gathering results on that question from primaries held since 2/5, and here they are:
The first four columns of numbers in the table above show the percentage Clinton received among each subgroup, those who decided in the last three days or before, and those who decided in the last week or before. The last two columns show the percentage that decided "today" (on Election Day) or over the "last three days" before that. Given the numbers in the table, the 17% number O'Donnell cited is consistent with the previous primaries.
What stands out from the table (as Gary Langer noted in his column yesterday) is the consistent and often large advantage that Hillary Clinton has had with late deciders [as compared to those who decided earlier -- thanks RS] in all of the contests since Louisiana. The key question is how the results we will have at 8:00 will compare.
Caveat emptor: I gathered the data for the table quickly. Errors are possible.
6:05 - Just to put a bit more emphasis on the previous update: The results "leaked" for Ohio at this hour on 3/4 showed Obama leading by 2. Clinton won by 10.
6:00 p.m. - I see that at least one publication has posted leaked exit poll results that most will consider a bit surprising. Please keep in mind that these leaked estimates have typically shown a skew in Obama's favor. See the table in my 3/7 National Journal column. Errors on the margin occurred (at this hour) in Obama's favor in 18 of 20 states I looked at, averaging 7 points in Obama's favor. The numbers leaked previously at this hour hit double digits in OH, RI, VT, NJ, MA, GA and AZ.
Cook Political Report / RT Strategies
Obama 45%, McCain 44%
Clinton 45%, McCain 45%
The Gallup Daily
Also: "86% Say Economy Getting Worse"
[Thanks to alert Pollster readers JB and DW]
Morning Call/Muhlenberg College
Pennsylvania - Survey of Lehigh and Northampton Counties ONLY
Fielded April 10-17, n=332 likely Democratic primary voters
Clinton 47%, Obama 46%
Clinton 49%, Obama 42%, undecided 9%
We have been busy here over the last day or two, including links to 8 new polls that interviewed through Sunday night, so I am going to try to use this post to wrap things up a bit. All but one of the late surveys show Clinton leading by margins of 5 to 13 points, so to no one's surprise, most expect Hillary Clinton to defeat Barack Obama tonight. The suspense seems to be about the size of Clinton's margin. On that score, unfortunately, the polls are not conclusive.
Why not? Here's the short version: (1) The pattern of smaller undecideds correlating with larger Clinton margin has largely disappeared over the last week, (2) tracking polls have been inconsistent about late trends and (3) the ultimate margin will depend on how well these surveys have selected likely voters. The longer version follows:
1) Do undecideds look like Clinton voters?
Maybe, maybe not.
The notion of undecided voters "breaking" to one candidate or another is something of a misnomer to begin with. It makes the implicit assumption that all surveys measure the true electorate and that all voters that express preferences on surveys are truly decided, leaving the final margin in the hands of voters that tell pollsters they are "undecided."
In reality, our "likely voter" models inevitably include some adults that end up not voting and exclude some that do. As such, the "undecided" category on the final round of polling usually includes a disproportionate share of those who are disengaged from the race and end up not voting. Also, some voters tell pollsters they have a preference even though they say they may still change their minds (9% of those with a preference in Pennsylvania, according to the final Mason-Dixon survey).
A week ago, I used my National Journal column to highlight a pattern in the surveys that suggested a hidden vote for Clinton. The Obama percentage appeared relatively stable across polls while the Clinton percentage varied considerably with the size of the undecided category. As the undecided percentage decreased, Clinton's percentage grew.
On the last round of polls, however, the pattern that I highlighted has disappeared. I updated the chart used in the column with the polls fielded over the final weekend highlighted in dark blue. The wide "spread" in the dots is gone. The previous pattern had owed largely to differences in the results from two pollsters (which both use an automated methodology): SurveyUSA showed big Clinton leads and small undecided, while Public Policy Polling (PPP) showed Obama even or slightly ahead and a larger undecided percentage.
On the last round of surveys, two important things changed. SurveyUSA, still finding very few "undecided" voters, showed the Clinton margin narrowing significantly, while PPP added a follow-up question asking undecided voters how they lean. PPP continues to show Obama with a slight lead, only with a much smaller undecided percentage. So the pattern of dots in the chart is now more circular, and the relationship between the size of the undecided category and the Clinton margin has all but disappeared (something Poblano also noticed yesterday).
Of course, the remaining undecided may still conceal a disproportionate share of Clinton voters, but hard evidence of that proposition is weak. Chuck Todd noticed that undecided voters in the MSNBC/Mason-Dixon survey were higher in subgroups where Clinton does better (among gun owners and outside of the Pittsburgh and Philadelphia media markets). Looking back at the Time/SRBI survey conducted in early April, Charles Franklin saw evidence that undecideds seem "somewhat more likely to support Clinton." However, as I look at the pattern of undecideds in the most recent SurveyUSA and Quinnipiac surveys, I see no clear pattern in the undecided either by region or demographic subgroups. On the Quinnipiac survey, for example, the percentage of undecided voters is roughly the same among African Americans (6%) and white voters without a college education (5%).
Gallup's Frank Newport looked at the evidence on this question last Friday (a post worth reading) and "neatly" concluded that "undecideds either will or will not break for Clinton in Pennsylvania." That's about right.
2) Are polls showing a late trend?
Once again, unfortunately, the bottom line is maybe, maybe not.
The Zogby rolling average tracking shows the Clinton margin growing from one percentage point (46% to 45%) to ten (51% to 41%). However, other surveys that have tracked twice over the last week to ten days show no consistent trend. As the table below shows, both SurveyUSA and ARG showed essentially a trend in Obama's favor over the last week, while four other pollsters showed essentially no change. On average, these "apple-to-apple" comparisons show Clinton's percentage increasing by less than a point, Obama's by roughly two. Ignoring statistical significance, four polls showed movement in Obama's direction, two showed movement in Clinton's direction and one showed no change in the Clinton-Obama margin.
Click the thumbnail below to see a larger version with more complete data:
Our trend estimates add another wrinkle. The standard trend lines look parallel, suggesting little or no change in Clinton's 6-7 point margin over the last week. However, as Charles Franklin explained earlier this morning, the more sensitive estimate -- which gives greater weight to more recent polls (including a few that had not been tracking a week ago) -- shows a slightly bigger Clinton margin (8.4 points).
So, again unfortunately, we either have evidence of a late trend, or we do not.
The exit polls tonight will help resolve whether late deciders have favored one candidate. Yesterday, drawing on recent exit polls, ABC's Gary Langer noted:
Late deciders have been turning to Clinton recently, but only recently. She’s done better among late deciders than among other voters in 11 contests, including the eight most recent. Going farther back, though, she’s done the same among late deciders in 10 contests, and worse in 10.
3) "It's the Turnout Stupid"
That's the way FiveThirtyEight's Poblano put it yesterday, and he's right. For all our worry about late shifts and the problems of interpreting the "undecided" category, the collective accuracy of the polls (or lack thereof) in predicting Clinton's margin probably depends even more on how well they have done selecting "likely voters."
Give a pollster 1,000 voters to interview, and our measures do a reasonable job discerning their preferences. But trying to discern the actual primary voters from a random sample of 1,000 adults is not so easy and far less accurate. Different methods of selecting "likely voters" can end up selecting different kinds of people. Since the Obama-Clinton race features large differences in vote preference by race, gender, age, socio-economic status and region, relatively small shifts in the composition of the electorate can alter the vote margin noticeably.
As I reviewed yesterday, if we look at their composition in terms of race, age, gender, and years of education, the Pennsylvania polls show meaningful variation. Given the demographic patterns in the vote, a difference of four points in the African-American contribution on most polls can lead to a three point shift in the Clinton-Obama margin. Differences of five percentage points in terms of the contribution of white voters under 35 or white voters with a college education may translate into two-point shifts in the Clinton-Obama margin. The same is probably true for the share of the vote in the Philadelphia metro area (as Virginia Centrist points out).
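The mechanics behind that kind of shift can be sketched with a two-group toy model. Everything numeric here is hypothetical and chosen only to make the arithmetic easy to follow: the subgroup margins (Clinton minus Obama, in points) and the 14% and 18% shares are illustrative assumptions, not results from any survey discussed in this post.

```python
def overall_margin(group_share, group_margin, rest_margin):
    """Clinton-minus-Obama margin for an electorate split into one group and 'the rest'.

    group_share  -- the group's share of the electorate, from 0 to 1
    group_margin -- Clinton margin (in points) within the group
    rest_margin  -- Clinton margin (in points) among everyone else
    """
    return group_share * group_margin + (1 - group_share) * rest_margin


# Hypothetical margins: Obama +60 among African-American voters,
# Clinton +20 among all other voters.
AA_MARGIN = -60.0
REST_MARGIN = 20.0

low = overall_margin(0.14, AA_MARGIN, REST_MARGIN)   # group is 14% of the sample
high = overall_margin(0.18, AA_MARGIN, REST_MARGIN)  # group is 18% of the sample
shift = low - high

print(round(low, 1), round(high, 1), round(shift, 1))
```

Under these made-up inputs, a four point difference in composition moves the overall margin by about three points. The size of the shift is simply the change in share times the gap between the two subgroup margins, which is why polls that disagree on composition can disagree on the topline even when they agree within each subgroup.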
"I want to know the future," Pollster reader Fourth wrote yesterday. "Is that too much to ask?"
No, it's not. Unfortunately the challenge of selecting likely primary voters is what makes these pre-election polls blunt instruments as predictors. They can give us a general sense of where things stand, which way they are moving (when the movement is large) and guidance about what each candidate needs to do to maximize their support. But the problem involves too many unknown variables to try to predict the outcome with precision.
The future will be here in about 12 hours. We will know soon enough.
[Embarrassing typos repaired. Many thanks to Pollster reader JB for his unsolicited fill-in for Eric].
Clinton has increased her lead in the trend estimates over the course of the last polls to 6.6 points using the standard estimator, and to 8.4 points using the sensitive estimate. Last minute polls have given her bigger margins.
Now the key question is whether undecideds push her over a 10 point win, or whether increases in turnout by new "unlikely" voters raise Obama's total.
Still a good bit of variation and some pollsters see a strong trend, others not so much.
Pollster variation doesn't make a lot of difference in our trend estimates.
But remember, since the polls don't allocate undecided, both they and the trend estimates are leaving some 8 percent of voters on the table. They will go somewhere, and if they break disproportionately for Clinton you have a "huge win", while if they go overwhelmingly for Obama you have a nail biter or a dramatic come-from-behind win. In previous primaries, the "winner" has usually enjoyed a significant increase in support beyond what the last polls showed.
Cross-posted at PoliticalArithmetik.com.
Suffolk University (release)
Allegheny County ONLY
n=402 likely Democratic primary voters, April 20-21
Clinton 52, Obama 40
4/20 through 4/21, n=675 likely Democratic primary voters
Clinton 51, Obama 41
On the eve of the Pennsylvania primary, here is one last update on the results by race, education and gender and measured by the Quinnipiac University surveys (and kindly shared with us courtesy of Quinnipiac polling director Doug Schwartz).
I have followed these results over the last several weeks for the same reason that ABC News polling director Gary Langer lists education at the top of his column today on "groups to watch" in Pennsylvania:
It’s hard to see a single factor more compelling than socioeconomic status, particularly as defined by education. It’s split the Democratic electorate nearly all year, and as with her past victories, it’s what Hillary Clinton will be counting on tomorrow.
Years of education also split the Democratic electorate in past elections, such as 2000, 1992 and 1984, and survey researchers have known for decades that it has been one of the strongest predictors of racial tolerance. Yet amazingly, as per my post earlier today, at least seven of the Pennsylvania pollsters have released surveys that fail to ask or report any measure of income or education. Consider that omission when thinking about which polls to trust.
But I digress. Back to Langer's point about education:
Across primaries to date Obama’s won college graduates by 52-43 percent, while Clinton’s won less-educated voters by a very similar 52-42. The picture sharpens among whites only (there’s no difference by education among blacks): White college graduates have split 47-47 percent, while those with no college degree have gone 2-1 for Clinton, 60-31 percent.
The proportion of college-to-non-college voters isn’t always critical – Obama cruised among both groups in Wisconsin – but it’s mattered more often than not. Last month, in economically stressed Ohio, less-educated voters were in great supply (just 38 percent of white voters were college graduates, compared with an average of 52 percent across all primaries to date) and that helped Clinton immeasurably: She won less-educated whites by 71-27 percent, while her edge among white college graduates was just 52-45 percent.
The numbers that Langer cites above are from exit polls. In Ohio, the final Quinnipiac poll before the primary showed Clinton leading by a six-point margin (50% to 44%) among college educated whites and by more than 30 points (63% to 31%) among non-college educated whites. Compare that to the numbers below, which include results from the latest Quinnipiac Pennsylvania survey released this morning.
Unlike in Ohio, Obama has run ahead of Clinton among college educated whites by 7-9 points over the last three weeks. Clinton's margin among whites without a college education, however, has been roughly the same as in Ohio.
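To make the composition point concrete, here is a minimal sketch using the Ohio exit poll figures Langer cites above (Clinton 71-27 among less-educated whites, 52-45 among white college graduates). It blends the two margins at Ohio's 38 percent college share and at the 52 percent average across primaries. The function name is mine, and the calculation assumes each group would have voted the same way under a different mix, which is a simplification.

```python
def blended_margin(college_share, margin_college, margin_noncollege):
    """Clinton-minus-Obama margin among whites for a given college share."""
    return college_share * margin_college + (1.0 - college_share) * margin_noncollege

# Clinton margins among Ohio whites, from the exit poll figures quoted above.
OHIO_COLLEGE_MARGIN = 52 - 45       # white college graduates
OHIO_NONCOLLEGE_MARGIN = 71 - 27    # whites without a college degree

for share in (0.38, 0.52):  # Ohio's share vs. the average across primaries
    margin = blended_margin(share, OHIO_COLLEGE_MARGIN, OHIO_NONCOLLEGE_MARGIN)
    print(f"college share {share:.0%}: Clinton margin among whites {margin:.1f}")
```

Under these assumptions, the only thing that changes between the two lines is the college share, so the gap between them is a rough measure of how much the Ohio mix helped Clinton among whites.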
Hillary Clinton's favorable rating shows little or no change by subgroups over the last few weeks, with the possible exception of African Americans. While the subgroup is relatively small (no more than 140 interviews), Clinton's unfavorable rating has increased significantly (to 42%) from the levels measured in late March and early April (30 to 34%).
Similarly, Obama's favorable rating has been mostly stable in the Quinnipiac surveys over the last few weeks, although his positive rating among non-college educated white men has been slightly higher on the three surveys in April than the two surveys in March.
Conducted 4/18-20, n=1,016 adults, n=552 Democrats and Democratic leaners
USA Today: "Obama Widens Lead" (via The Page)
USA Today: "Disapproval of Bush breaks record"
Gallup: "Democrats Split on Whether Campaign is Hurting the Party"
Dems: Obama 50, Clinton 40
Registered Voters: Obama 47, McCain 44 - Clinton 50, McCain 44**
Note: These results are not from the Gallup Daily but rather a completely separate sample using a different questionnaire conducted with USA Today.
**Thanks J_from_Germany, I'd missed that.
Clinton 49%, Obama 39%, undecided 12%
Public Policy Polling (D)
Favorable / Unfavorable
McCain: 52 / 44
Clinton: 44 / 54
Obama: 47 / 51
Senator Clinton currently holds a 6 point lead over Senator Obama in Pennsylvania, based on our Pollster Trend Estimate, 49%-43%. But that leaves about 8 percent undecided. What they do will determine whether Clinton expands her lead beyond what the polls show, or whether the undecided narrow, or possibly reverse, that lead.
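The arithmetic behind that statement is easy to sketch. The snippet below takes the 49%-43% trend estimate and the 8 percent undecided from above and splits the undecided at a few different rates. The function name and the particular splits are mine, chosen only for illustration, and the sketch assumes every undecided voter ends up choosing one of the two candidates.

```python
def final_margin(clinton, obama, undecided, clinton_share):
    """Clinton-minus-Obama margin after allocating the undecided.

    clinton_share is the fraction of undecided voters who break for Clinton.
    """
    clinton_total = clinton + undecided * clinton_share
    obama_total = obama + undecided * (1.0 - clinton_share)
    return clinton_total - obama_total

for share in (0.125, 0.25, 0.50, 0.75):
    margin = final_margin(49, 43, 8, share)
    print(f"{share:.1%} of undecided to Clinton: margin {margin:+.1f}")
```

With these inputs an even split leaves the 6 point lead untouched, a three-to-one break for Clinton produces a 10 point win, and Obama would need seven of every eight undecided voters just to draw even.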
In this post I take a look at the individual level, though using data that are three weeks old, so use caution in extrapolating to tomorrow's electorate.
Using data from the Time/SRBI poll of Pennsylvania, conducted 4/2-6/08, I estimate a model of support for Obama compared to Clinton. I use "the usual suspects" as variables predicting vote: partisanship, gender, race, Hispanic ethnicity, region of the state, age, education, religion and income. The data at that time found an eight point Clinton lead, a bit higher than today's trend estimate.
Using the coefficients for "decided" voters, I can estimate the probable vote of the undecided 11% of voters in the poll. This gives us a look at how they would be expected to behave IF they behave like those who have already picked a candidate. (Note the "if" here. As with all models, this assumes stable influence of the variables among the undecided as among the decided.)
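For readers who want to see the mechanics, here is a minimal sketch of that scoring step. It assumes a logistic model, which is one common choice for this kind of analysis and not necessarily the specification used here. The coefficients, the three indicator variables and the two respondents are all hypothetical placeholders, not estimates or records from the Time/SRBI data.

```python
import math

# Hypothetical coefficients on the log-odds scale for supporting Obama.
# Placeholders for illustration only, not estimates from the Time/SRBI poll.
COEFFICIENTS = {"intercept": -0.2, "age_65_plus": -0.6, "college": 0.5, "black": 2.0}

def obama_probability(respondent):
    """Estimated probability of supporting Obama, given 0/1 indicator values."""
    score = COEFFICIENTS["intercept"]
    for name, value in respondent.items():
        score += COEFFICIENTS[name] * value
    return 1.0 / (1.0 + math.exp(-score))

# Two hypothetical undecided respondents.
undecided = [
    {"age_65_plus": 1, "college": 0, "black": 0},
    {"age_65_plus": 0, "college": 1, "black": 0},
]
for person in undecided:
    print(person, round(obama_probability(person), 2))
```

The point of the sketch is only the workflow: the coefficients come from the decided voters, and the undecided are then scored with them as if they behaved the same way.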
The plot above shows the distribution of estimated probability of voting for Obama. Values close to zero are very likely to support Clinton, while values close to 1 are very likely Obama supporters. Those close to .5 are flipping a coin. The shape of the distribution gives a sense of where voters "lump up" in their estimated preferences.
The black line plots the distribution among those who reported a vote preference. The red line plots the distribution of estimated support among those who said they were undecided in early April.
The key point is that the undecided resemble the decided, with a small shift to the left, suggesting they were as a group somewhat more likely to support Clinton. In these data, the primary difference between undecided and decided voters was age, with older voters more likely to say they hadn't decided. As we've seen in virtually every exit poll, older voters are more likely to support Clinton, so the result we find here, that the undecided lean a bit more towards Clinton, is consistent with this result.
Now again for the caveats. These data are three weeks old. The model requires the assumption that undecided voters ultimately behave like those who decided. Different variables as predictors can make a difference. And so on.
The goal here is NOT, NOT, NOT a prediction of tomorrow's vote. Much may have changed since the first week of April.
The point is to illustrate what we can learn about undecided voters beyond the simple fact they say "undecided". In this case, the data suggest they are not wildly different from those who decided, but their older age makes it more likely they ultimately lean more to Clinton.
The Time/SRBI data are archived at the Roper Center for Public Opinion Research. I am solely responsible for the analysis here.
Cross-posted at PoliticalArithmetik.com.
Just before the March 4 primaries, I did posts on the demographic compositions of the polls from both Texas and Ohio. With the ever valuable Eric Dienstfrey away on vacation this week, I am doing this post in a bit of a rush (so apologies in advance for typos). I would strongly recommend reviewing my post on the Texas demographics as a companion to this piece.
I have broken the available results into two tables below. Most come from documents posted on the web. Quinnipiac provided results on request, and the Zogby numbers were shared with my colleagues at the National Journal.
The racial mix of the Pennsylvania polls is not quite as critical to the level of candidate support as in Texas, since the share of black and Latino voters is smaller. Still, since Obama typically does better among African-Americans, men, younger voters and those with college degrees or higher incomes, while Clinton does better with whites, women, older voters and those with lower incomes or without a college degree, the demographic composition of the electorate will play a role in determining the outcome of the race.
The surveys show more variation on some characteristics than others. Most, for example, show the percentage of women as somewhere between 55% and 58%, and most show the African-American percentage as somewhere in the mid-teens. Of course, with Barack Obama expected to receive 80% to 90% of the black vote, the difference between an African American composition of 13% and 18% can alter Obama's vote total by 3 to 4 points.
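A rough sketch of that arithmetic follows. It computes only the points Obama draws directly from black voters at a 13% and an 18% share of the electorate, using the 80% and 90% support levels mentioned above. The function name is mine. The net effect on his total would be somewhat smaller, because a larger black share displaces other voters who also give him some support, and that level of support is not specified here.

```python
def black_vote_contribution(black_share, obama_support):
    """Percentage points of the total vote that Obama draws from black voters."""
    return 100.0 * black_share * obama_support

for support in (0.80, 0.90):
    low = black_vote_contribution(0.13, support)
    high = black_vote_contribution(0.18, support)
    print(f"support {support:.0%}: {low:.1f} to {high:.1f} points "
          f"(difference {high - low:.1f})")
```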
On the other hand, we see quite a bit of difference in age. Unfortunately, the pollsters do not all use the same categories to ask about and report respondent age. Still, we can see quite a bit of difference, particularly in the percentages in the 18-to-29, 18-to-35 and 18-to-44 categories. We see that 18-to-29-year-olds are anywhere from 4% to 16%, and that 18-to-44/45-year-olds are anywhere from 22% to 43%, depending on the pollster. Given that Obama typically does much better among younger voters, and that Clinton does much better among retirees, this variation is obviously critical. [Update: Brian Schaffner also blogged on this issue today].
Socio-economic status is another critical characteristic in the Obama-Clinton race, especially in Pennsylvania (and something that I have written about often). Unfortunately, quite a few pollsters either ask or report nothing about the level of self-reported education or income of their samples. Still, we see considerable variation. The percentage of respondents with college degrees varies from 29% to 44%. I should point out that education and especially income are subject to more measurement error than other demographic items, especially if the text of the question and the number of categories differ.
Finally, since readers asked for it the last time, I have also posted one more table that includes all of the data above, plus the vote preference results. You will need to click on the graphic below to see a larger, readable version.
It is important to remember that pollsters come to these composition statistics through different paths. Some interview samples of adults, weight those demographically to match census estimates of Pennsylvania's adults, then select "likely voters" and let their demographics fall where they may. Others will weight their "likely voter" samples directly to pre-determined demographic targets. Some pollsters will not set weights or quotas for demographics, but will set such weights or quotas for geographic regions (based on past turnout and their assumptions about what might be different this time).
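As an illustration of the second path, weighting a sample directly to demographic targets, here is a minimal sketch of simple cell weighting, where each group's weight is its target share divided by its share of the sample. The age groups, interview counts and targets are hypothetical, and real pollsters often use more elaborate procedures than this.

```python
def cell_weights(sample_counts, target_shares):
    """Weight per group so the weighted sample matches the target shares."""
    total = sum(sample_counts.values())
    return {
        group: target_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# Hypothetical interview counts and hypothetical targets, for illustration only.
sample_counts = {"18-44": 150, "45-64": 250, "65+": 200}
target_shares = {"18-44": 0.35, "45-64": 0.40, "65+": 0.25}

weights = cell_weights(sample_counts, target_shares)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

A pollster who instead weights adults to census figures and lets the likely voter demographics "fall where they may" would apply the same kind of step before the likely voter screen, not after it.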
Trying to discern the differences in these methods is beyond our capacity today. The important thing is to remember that different pollsters conceive of "likely voters" in different ways, and the "likely voters" they are reporting on are not identical.
Update: Poblano at FiveThirtyEight.com blogged some worthy thoughts about differences in likely voter models today.
Please note that, given the crunch of time I have probably not proofed the tables as well as I should have. If you catch a typo, please do not hesitate to send an email so I can correct it.
Judge for yourself.
The Pennsylvania race has turned slightly toward Clinton over the weekend, with her lead now at an even 6 points in our standard trend estimate. If you believe in taking more chances with random noise, the sensitive estimator has a 6.4 point Clinton lead.
In the rush of new polling over the weekend, it is also good to check how much any of them may be affecting our estimates.
Dropping any single pollster makes only a bit of difference to our estimates. The Clinton trend ranges from 48.5% to 49.6%, while Obama ranges from 42.6% to 43.5%. So dropping your least favorite pollster can, at most, account for the difference between a 5 point race and a 7 point one.
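The 5-to-7 point range follows directly from the two trend ranges: the narrowest margin pairs Clinton's low with Obama's high, and the widest pairs her high with his low. A minimal sketch of that calculation, with a function name of my own choosing:

```python
def margin_bounds(leader_range, trailer_range):
    """Smallest and largest possible lead, given (low, high) ranges for each candidate."""
    narrowest = leader_range[0] - trailer_range[1]
    widest = leader_range[1] - trailer_range[0]
    return narrowest, widest

clinton_range = (48.5, 49.6)  # trend estimates across the drop-one-pollster runs
obama_range = (42.6, 43.5)

narrowest, widest = margin_bounds(clinton_range, obama_range)
print(f"Clinton lead between {narrowest:.1f} and {widest:.1f} points")
```

These are outer bounds. No single dropped pollster necessarily produces both extremes at once, so the actual spread of margins across runs may be tighter.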
And note that we still have about 9 percent undecided. I wonder what they will do?
n=722 likely Democratic primary voters, fielded 4/20
Clinton 49%, Obama 44%
SurveyUSA - WCAU-TV Philadelphia, KDKA-TV Pittsburgh, WHP-TV Harrisburg, and WNEP-TV Scranton.
Pennsylvania 4/18 through 4/20, n=1,800 adults, n=710 likely Democratic primary voters
Clinton 50, Obama 44
WCAU-NBC10 story (via alert Pollster reader Joe E).
Public Policy Polling (D)
April 19-20, n=2,338 likely Democratic primary voters
Pennsylvania - 4/18 through 4/20, n=1,027 likely Democratic primary voters
Clinton 51%, Obama 44%
Suffolk University (via The Page)
4/19 through 4/20, n=602 likely Democratic primary voters
Clinton 48, Obama 42
Strategic Vision (R)
Pennsylvania 4/18 through 4/20
Clinton 48, Obama 41
McCain 48, Obama 40... McCain 46, Clinton 42
American Research Group
Clinton 54, Obama 41
The Gallup Daily (National)
Rasmussen Reports (National)
MSNBC/Pittsburgh Post-Gazette/McClatchy Newspapers/Mason Dixon Polling and Research
4/17 through 4/18, n=625 likely Democratic primary voters
Clinton 48, Obama 43
4/18 through 4/19, n=607 likely Democratic primary voters
Clinton 46, Obama 43