Articles and Analysis


Texas What-If

Topics: 2008 , ABC/Washington Post , Barack Obama , Divergent Polls , Hillary Clinton , Insider Advantage , Mason-Dixon , Pollsters , PPP , Rasmussen , SurveyUSA , Zogby

Let's start with the bottom line: The final value of our trend estimate for Texas (at least as of this writing) shows Hillary Clinton running slightly ahead of Barack Obama (47.6% to 45.9%), but I would advise readers against treating that as a solid prediction of the outcome. It may turn out that way, of course, but variation among individual polls and, more importantly, uncertainty at this hour about the racial composition of the Texas electorate mean that the ultimate result is unknowable.

First, let's take a look at the latest version of my table comparing the demographic composition of most of the polls out over the last few weeks, updated with the surveys released since my post on Friday (and let's say thank you to all the pollsters who have released this data -- what a change since Super Tuesday):

03-03 Texas Demos2.png

If you read between the lines a bit, you can see the different approaches that various pollsters take to "modeling" the likely electorate. Some arrive at a set of arbitrary weighting quotas for gender, age and race and apply these consistently on each survey. Notice the way the percentages for both Zogby and InsiderAdvantage are identical on all of their surveys, except one. The exception is the Latino percentage on the most recent InsiderAdvantage, which plunges from 37% to 27% (while all other demographics remain exactly identical). Perhaps someone there had a change of heart about their model?

Some -- such as ABC/Washington Post and SurveyUSA -- take a very different approach. They begin by interviewing a sample of all adults in Texas, weight the demographics of the adult sample to Census estimates for Texas, choose "likely voters" based on their answers to screen questions and allow the demographics of the likely voters to "fall where they may." See this post for more details on SurveyUSA's approach.

An important underlying point here is that some pollsters have more confidence than others in the ability of their measurements to "predict" the likely electorate and its demographics. My own sense (and be advised that other pollsters may not agree) is that pre-election polls are much better at measuring the opinions and preferences of respondents than at precisely predicting who will vote and who will not. I will spare you the details this morning, but the bottom line is that screen questions are at best a crude measure of who will turn out. We can select "likely voters" with a greater probability of voting than those we screen out, but that's as good as it gets.

The best approach in situations like these, when voters demonstrate huge differences by racial and demographic subgroups, is to watch those differences and understand the potential range of outcomes. So let's do just that.

The good news, again, is that most of the pollsters have released both racial composition data and cross-tabulations of the vote by race, which allows for the following table. A few observations: Most surveys have been reasonably consistent (within the vagaries of sampling error) in their results for Latinos and African Americans. Most have shown Clinton with a roughly two-to-one lead among Latinos and most have shown Obama winning 75% to 85% of African Americans. Those results are generally consistent with exit poll results from other states (although Obama has typically done a few points better on the exit polls than in final pre-election polls).

03-03 TX by race

However, pollsters have been less consistent in their measurements of white voters. For example, on polls released in the last 24 hours, SurveyUSA and Rasmussen show Clinton leading by just four points among white voters, while Mason-Dixon, PPP and InsiderAdvantage show Clinton with margins of 15 or more points among white voters.

I have included three sets of averages at the bottom of the table: One for all of the polls listed, one for the final poll by each organization and one for final polls released over the last three days. Keep in mind that cross-tabs by race were not available for all surveys.

Let's use these results to put together a "what if" analysis of the turnout (similar to that found in the Belo/Public Strategies analysis). I have created a spreadsheet in Excel (download here) or Google Documents (edit here) that you can use to try and test your own assumptions (details on how to edit the Google docs version at the end of this post).

I set up the spreadsheet using the following assumptions:

(1) A racial composition of 51% Anglo, 29% Latino and 20% African American -- the average of the assumptions and findings of the pollsters in the first table; (2) A 56% to 44% Clinton margin among white voters -- which takes the average above and proportionately allocates undecideds; (3) A 67% to 33% Clinton margin among Latinos and (4) an 85% to 15% Obama margin among African Americans (which assumes that Obama overperforms in this constituency by about as much as he has elsewhere this year). I am not making predictions here, just grabbing for reasonably defensible assumptions based on the available data as a starting point -- your "mileage" may vary.
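For readers wondering what "proportionately allocates undecideds" means in practice, here is a minimal Python sketch. The function name and the input numbers are hypothetical illustrations, not the actual averages from the table: each candidate's share is simply rescaled so that the two add up to 100%.

```python
def allocate_undecided(clinton, obama):
    """Rescale two candidate percentages so they sum to 100.

    The undecided remainder is split in proportion to each
    candidate's existing support (an illustrative helper, not
    the spreadsheet's own formula).
    """
    decided = clinton + obama
    return 100 * clinton / decided, 100 * obama / decided


# Hypothetical inputs: 50% Clinton, 40% Obama, 10% undecided.
clinton, obama = allocate_undecided(50, 40)
print(f"Clinton {clinton:.1f}%, Obama {obama:.1f}%")
```

Note that this treats undecided voters as breaking the same way as decided voters, which is itself an assumption.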

Google Docs - Texas Vote Scenario.png

As it turns out, these assumptions produce a roughly two-point Clinton lead, the same as our current trend estimate. That's not a great surprise, since they are based on mash-ups of the demographics and results of most of the polls used to generate the overall estimate. But now, play "what if" and see how making very small changes in any of the assumptions can easily alter the outcome.

For example, apply racial composition findings from the ABC/Washington Post survey (39% Latino, 17% African American), leave all other assumptions as is, and you get a 6-point Clinton victory. On the other hand, apply the racial composition used on the Belo/Public Strategies polls (25% Latino, 22% African American), and you get a half-point Obama win. Leave the racial composition as is, but assume that SurveyUSA or Rasmussen has the white vote right (Clinton leading by just four points) and Obama wins by two. But if InsiderAdvantage, PPP and Mason-Dixon have Clinton's margin among white voters right (roughly 15 points), and all other assumptions remain constant, Clinton wins by six. I could go on.
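The what-if arithmetic can also be sketched outside a spreadsheet. This is a rough Python version, not the spreadsheet itself. It assumes a two-way race (Obama gets whatever Clinton does not) and that the white share is whatever remains after the Latino and African American shares; the function and variable names are illustrative.

```python
def clinton_margin(shares, clinton_pct):
    """Return Clinton's lead over Obama in percentage points.

    shares      -- percent of the electorate in each group (sums to 100)
    clinton_pct -- Clinton's percent of the two-way vote in each group
    """
    assert abs(sum(shares.values()) - 100) < 1e-9
    clinton = sum(shares[g] * clinton_pct[g] / 100 for g in shares)
    obama = 100 - clinton  # two-way race assumption
    return clinton - obama


# Starting assumptions from the post.
BASE_SHARES = {"white": 51, "latino": 29, "black": 20}
BASE_CLINTON = {"white": 56, "latino": 67, "black": 15}

scenarios = {
    "baseline": (BASE_SHARES, BASE_CLINTON),
    # White share assumed to be the remainder: 100 - 39 - 17 = 44.
    "ABC/Post composition": ({"white": 44, "latino": 39, "black": 17}, BASE_CLINTON),
    # White share assumed to be the remainder: 100 - 25 - 22 = 53.
    "Belo composition": ({"white": 53, "latino": 25, "black": 22}, BASE_CLINTON),
    # Clinton leads whites by four points: 52% to 48%.
    "white vote Clinton +4": (BASE_SHARES, {**BASE_CLINTON, "white": 52}),
}

for name, (shares, pct) in scenarios.items():
    # Positive numbers are Clinton leads, negative are Obama leads.
    print(f"{name}: {clinton_margin(shares, pct):+.1f}")
```

Under these assumptions the four scenarios come out at about +2.0, +6.6, -0.5 and -2.1, in line with the figures quoted above.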

Again, your assumptions may be different on any of the above, so open up the spreadsheet and have at it. But now hopefully, you have a sense for why races like these give pollsters heartburn.

To edit the Google Docs spreadsheet: You will need a free Google account. Click here to display the published version, then click the "edit this page" link at the bottom right of the screen and you will see a "view only version of the spreadsheet." If you use the "File" pull-down at upper right to copy and rename the spreadsheet, you can edit, change and even publish as you see fit.

Hat-tip to David Ianelli and Elise Hu for providing the final Belo/Public Strategies numbers by race. Typos in the tables have been corrected.

Update: In the comments, Steve P asks some good questions:

Extraordinarily high turnout overall, but I would think that [African-American] turnout is probably outpacing their regular share. But that Hispanic share is probably down.

I understand that this is difficult to gauge since we have never had a primary go this deep into the schedule before on a competitive basis. Wouldn't you have to toss election models out because we have never seen anything like this before?

How well have final polls lined up with exit poll data demographically? Are we seeing the same demographic portions despite the much larger pie?

Should we throw out old demographic models? Certainly. But which model makes the most sense? To answer Steve's question briefly: While we have seen consistently greater proportions of younger and higher income voters participating in Democratic primaries as measured by exit polls, the change in racial composition has been inconsistent. We will know the answer in a few hours, of course, but a critical question is whether Texas will be more like California this year (where the Latino share surged from 16% to 30%, while the African American share dropped a point from 8% to 7%) or more like Arizona (where the Latino share increased one point, from 17% to 18%, but the African American contribution quadrupled, from 2% to 8%).


Michael X.:

Thanks for compiling all these!


Bob Evans:

The spreadsheet stuff shows it is impossible for Billary to win. Better not donate to the Billary trust fund today suckers! It's OVER!


yes we can!


Nothing substantive to add here, but just wanted to add a note of thanks for compiling these. It's a fantastic public service.


Terrific work!

And you even included a spreadsheet so we can play too.

Mark Blumenthal, I think I love you.


Tom W:

Just wanted to agree with Brian.

With the exception of a few rabid fans of both candidates, I really enjoy the discourse here, and all the analysis. I must say I have learned quite a bit about polling and political analysis thanks to you and your more talented readers.

(And I like that there is far less verbal trash cluttering up the information here compared to many other neutral political sites. People need to get a grip.)



Nice job! Thanks! BTW, Constituent Dynamics/KXAN has a more recent poll out, although I can't remember if they provided any cross tab data. IVR just came out with an updated poll too; there's some discussion of racial composition in the article but without cross tabs.



It might be interesting to do this by age, in addition to race. Notice 18-45 ranges from 30-51%.

Basically, I think if youth turnout is relatively high, Obama wins, if low, Obama loses.

And I wonder how much the youth vote is being undercounted due to cell phones.



If you look at the rough averages of the latest pre-election polls in MD, VA, WI, DC, and MO, you'll find that Obama overperforms in actual votes cast in the primary and Clinton underperforms. The percentages vary, but average around a 10% spread in favor of Obama.

Let's assume for the moment that the Clinton "kitchen sink" effort has cut the Obama overperformance in half, to 5% (and there's no information other than intuition to say whether there is no reduction, some reduction, or total reduction of this overperformance). Using the 5% figure, Obama wins Texas handily, is very close in Ohio, is competitive in Rhode Island, and blows out Vermont. As my wife points out (non-specifically), I have been wrong in the past. So don't bet the farm on it, but Obama looks more likely to sweep than Clinton does to achieve some partial victory.


David :

I agree Stonecreek. It bodes well for Obama that Ohio has some pretty nasty weather right now, which favors the candidate with the motivated/excited base.

It is similar to what happened in Wisconsin. Answering on the phone that you support a candidate is different than journeying through 0 degree temps to vote for them.


Adam G:

Those first tables above would be a heck of a lot more useful if in the final column they stated the (Clinton - Obama) difference (or vice versa)...


Mark, in addition to your point about Insider Advantage changing their likely voter model (with Latinos at 37% on 2/25, but 27% on 3/2), I notice that SurveyUSA shows a similar drop in likely women voters (58% on 2/23-25, but 48% on 3/1-2).

Referring back to your post about SurveyUSA's method for determining its likely voter sample, you wrote "large and statistically significant variation should reflect real changes in the relative enthusiasm of voters."

10 points seems pretty significant. I wonder what is going on there. Has the relentlessly media-hyped theme that Hillary is finished / Obama is inevitable dampened female voters' enthusiasm for turning out (rather than shifting those voters to Obama)? Or does it reflect complacency (either choice is good)? Is it possibly due to an intensification of male voter enthusiasm (increasing the share of likely male voters)?

Rasmussen goes from 57% on 2/24 to 51% and 53% on 2/27 and 3/2, respectively. Everyone else's numbers look steady. Was there just a temporary spike in female enthusiasm around the 24th that only SurveyUSA and Rasmussen picked up on?

What's the average turnout for female voters been in the previous primaries? The average for this set - 52% - seems low to me, but what do I know.

Too bad there isn't a chart like the one for candidate preference by race/hispanic origin for likely voters by sex - that would be interesting. Or maybe I'm just missing it here.

Terrific stuff, Mark.


Mark Blumenthal:


The 48% result on SurveyUSA was a typo. The correct numbers (which now appear in the table) are 56% female and 53% female on the last two SurveyUSA polls.

My bad -- apologies to all.



Talk about a Texas 2-step: 2 pts here Obama wins, 2 pts there Clinton wins. Mark does a great job on the Ethnicity Demographic. The subcategories of age may play a factor today in Ohio, as Ciccina pointed out, due to the weather. All in all we may actually have to "count the votes" before declaring a winner.


Steve P:

I appreciate how polling samples can be tied either to census data and electoral history, but it seems illogical to me to discount what exit poll data is telling us about who is voting in this election cycle.

Extraordinarily high turnout overall, but I would think that AA turnout is probably outpacing their regular share. But that Hispanic share is probably down.

I understand that this is difficult to gauge since we have never had a primary go this deep into the schedule before on a competitive basis. Wouldn't you have to toss election models out because we have never seen anything like this before?

How well have final polls lined up with exit poll data demographically? Are we seeing the same demographic portions despite the much larger pie?



Great stuff Mark!

In a similar vein, check out CNN's delegate counter:




Brian, you mentioned new Constituent Dynamics and IVR polls. Do you have links for those? Thanks!




Mark B. - great work!
I do agree with Steve that an age+race-based spreadsheet would be nice.
And taking into account Ron Brownstein's column, an income-based differentiator would also be cool.

Or we can just wait till tonight :-)


Philip :

Wanted to add my appreciation. What a thorough assessment. I find your modeling of population to be most interesting and would allow for some comparison to exit polls (assuming valid exit polling??). Echoing previous comment by Brian W. - excellent public service.



To those saying "Obama over-performs", you need to establish why he over-performs if you are going to be able to make the case that he will in Texas.

In New Hampshire, New Jersey and California, Obama quite decidedly did not outperform the polls. That fact may help eliminate potential explanations as to why he over-performs.

1. Do cell-phones under-count young voters?
Remember that pollsters weight for expected turnout, based on a number of metrics. So, while cell-phone users may not get called, enough young land-line folks will get called. For this argument to be valid, you need to demonstrate that cell-phone users are a distinct population from those young people with land-lines.
PS: I have read a similar debate over the value of internet based polls (these are not online polls, but traditional polls, conducted through the internet).

2. Obama has a better ground organization than Clinton

This argument is much more compelling, and almost certainly true. Moreover, it explains a lot of the variation in terms of where Obama over-performed. Obama did much better in caucus states (where organization is important), and in stormy Wisconsin. Hillary Clinton did as expected in most places, but considerably worse in the post-Feb. 5 states, where she had no ground game.

Unfortunately for the Obama-nauts, efforts have been made by team Clinton since Feb 5th to build up in Texas and Ohio, so I would suspect both candidates have comparable ground games.

3. Clinton's poll leads are based on name recognition

This is likely very true for all of the Super Tuesday states, and most polls that have been taken more than a couple of weeks before a given primary. That said, I would suspect that Obama's name recognition is pretty high by now. This poll (conducted on Feb 21) of favorability to Obama had 8% of people saying they had never heard of Obama. Another only had 1%.

Moreover, unlike most previous contests, the candidates have been campaigning in Texas and Ohio since Super Tuesday - almost a month ago. People know Obama's name by now.



Love your site.

Please note that you have an error in your spreadsheet - D5 - fixed value. Just copy the formula in the cell below onto the 44% value.

It is not the ethnic/gender mix that is important so much as the generational and crossover vote. As history doesn't measure this well, comparing to 2004 models is why everyone is so off this year. The youth / crossover vote is not in ANYONE'S model, as it has never happened before...

And it will get worse in the general. As a forecaster, what is needed is a fresh approach to the segmentation and better questions. PARTY AFFILIATION is not going to be predictive going forward...


Excellent stuff, Mark. I can't thank you enough. You've provided an excellent public service here, if for no reason other than this will keep me bizzy and off the streets for a while. :-)

While I enjoy using the Wisconsin exit polls to tweak my own Ohio predictions, I hesitate to use them in Texas. Texas is a very different state, and may actually still be an independent nation from what I can tell.

However, I think there are two demographics that are more important than race when it comes to looking at how predictive the polls are. These are age and non-Dem ID.

In the case of age, the polls have a good distribution around Wisconsin's 49% or so under age 50, so I don't think that's a big issue. However, the Belo/WFAA/PS polls use a very Cheeseheaded 37-38% non-Dem ID in their polls and nearly everyone else is in the mid 20s.

That seems like a very significant difference, and it explains some of the variation in the White estimates. Sadly, however, Insider Advantage has the highest non-Dem among the ones you cite.

I'm not sure what to make of it, but I'll keep playing. It's a lot of fun when I should be working! Thanks ... I think ... :-)


I'm of two minds on Ohio's weather. On one hand, I agree with the above comments that the weather should affect more those people less excited about their candidate. A point to Obama. On the other hand, habitual older voters are more likely to brave the weather. A point to Clinton.

Whichever way it breaks, I would be cautious interpreting the early evening Ohio exit polling. I suspect in some sample precincts that the response rate will be low and the sample sizes are going to be unusually small due to the weather. The exit poll got lucky in Missouri, but I am not confident Edison Media Research will go two for two with the weather in Ohio. I suspect that even if Clinton wins Ohio by a comfortable margin, the media will wait on calling the results because they will be unsure of the exit poll's reliability given the weather.



Any word on early exit poll stuff?



Oh well, I see they're already up on the Texas poll page now before I had a chance to check back in...




Why I will be better than the pollsters

Obama 52.2%

Hillary 47.8%

After rounding, the final result is:

Obama 52%

Hillary 48%

Zogby can't beat that!!!

Thanks to Mark Blumenthal for this great tool.


Mark B. - "Typo" was also one of my guesses :-)

I'd love to see the candidate preference info by gender. Or is the gender split so constant, or amply captured by the "white vote" category, that it's not relevant?

Michael - I have a hard time believing unequal "name recognition" was a factor as late as March 5.

There was, obviously, considerable earned and paid media in all of the primary and caucus states and by definition likely voters are paying more attention to the election than the average citizen. I find it hard to believe there were significant numbers of Democratic voters who went to the polls having never heard of Obama. If a voter was aware enough to know that Clinton was on the ballot, and which day and where to vote, s/he was also aware of Clinton's main opponent.

Beyond that, I think it's a mistake to assume that a set of voters who (hypothetically) only know of one Dem candidate would all vote FOR the candidate they know. Some of these hypotheticals could have been "anyone but" voters.

I think post-New Hampshire the "name recognition" thing has been a canard, a politely euphemistic way of saying "a lot of those voters had no idea what they were doing." Which isn't an explanation at all.



Unlike you Ciccina, not everyone pays attention to politics.

See: Bush, George W.

Many people I talked to had no idea who this Obama guy was, but even more important, what he stood for.

That is why you see EVERY graph show a huge upspike in poll numbers after he campaigns there. Just look at Ohio and Texas. Hillary had huge leads there two weeks ago. NOW? All but gone. THAT is the power of name recognition. To suggest otherwise is simply ignorant.


I attempted a quick calculation of which counties have seen the greatest percentage of early voters in the Democratic primary relative to turnout in the 2004 primary. You can see those results here:


Bottom line...early voting seems highest in counties with large non-Hispanic white populations and lowest in the counties with the largest Hispanic populations. (Counties with the highest African American populations fall somewhere in between). This may suggest that turnout will be up more among African Americans than Hispanics. Then again, it may simply suggest that Hispanics don't vote early.




A vote before today is worth two on the way!



Thanks - very detailed and substantive analysis!



Brian Schaffner:
Based on your chart, Harris, Dallas and Bexar counties have the largest voter base - 0.9-1.8 million voters, and their Hispanic populations are from 38%-57%. So even though the % increase in early voters may not be as large as in other counties, the absolute number increase is much greater in these almost-Hispanic-dominated communities.

Of course, whether the increase is due to Hispanics or other folks - we don't know, do we?
Also, what are the chances that an early-voter would also participate in the evening caucus, compared to someone who votes late today?

So many intriguing possibilities... :-)



Texas, 32% Hispanic 18% black

Ohio, 20% Black up significantly but 60% female, out of the ball park.





wait.... demographics polled doesn't equate to demographic turned out, right? 32% Hispanic in TX means what in an exit poll?



I find it hard to believe that someone who actually goes to the voting booth in this primary (post-New Hampshire) would do so without any awareness of Obama. Of course, I talk to people who pay no attention to politics too, but they aren't turning out to vote in primaries.

These races close as primary day nears because people are making up their minds. Someone can be undecided right up until they step into the "booth" - it doesn't mean they've never heard of Obama. Remember, both candidates have roughly equal favorables among Dems.

You wrote: "Just look at Ohio and Texas. Hillary had huge leads there two weeks ago. NOW? All but gone. THAT is the power of name recognition. To suggest otherwise is simply ignorant."

I suspect you don't actually know what "name recognition" means in an electoral context. I know it is tempting to think that people who haven't decided to vote for your candidate are simply ignorant, but I doubt very much that that is the case in this cycle.

If you mean that part of Hillary's lead is made up of people who decided early on to support her regardless of who her opponent would be, that's still not "name recognition." You need only look at what's happened on the Republican side to see how quickly mere name recognition can collapse (e.g. Fred Thompson).

However, if you feel strongly about this, go back into some of the pre-Super Tuesday polls and look for name recognition questions. They may be there.

S.B., interesting w/ the 60% female... I wonder if this will wind up resembling New Hampshire.



Mark, those were the exit polls from last night. Interestingly, in Vermont the exit polls were off by 5% in favour of Obama and in RI they were off by about 13%. In Ohio they were off by about 6%, all favouring Obama. So what does that mean? Maybe they continue to do the exit polls until the polls close, and the numbers I am using were from 5 o'clock? That would mean people who went to the polls late heavily favoured Clinton, or people who voted Clinton were less likely to take the exit poll. Interesting.

