More on the Arkansas Surprise

Topics: Arkansas, Bill Halter, Blanche Lincoln, Daily Kos, Likely Voters, Research2000, Turnout

Before moving on to the more important issues raised by both Nate Silver's new pollster accuracy ratings and their apparent role in the parting of ways between DailyKos and pollster Research 2000, I want to consider some possible lessons from last night's Arkansas surprise.

Let's start with the assertion that Del Ali, president of Research 2000, made to me earlier today. He says that the final result -- Blanche Lincoln prevailed by a 52.0% to 48.0% margin -- fell within the +/- 4% margin of error of his final poll, which showed Halter at 49% and Lincoln at 46%. That much appears to be true. However, Research 2000 did three polls on the Lincoln-Halter run-off, including a survey conducted entirely on the evening of the first primary, and all three gave Halter roughly the same margin as the final poll.

[Figure: Arkansas runoff polls (2010-06-09-AR polls.png)]

I'll spare you the math (and the argument about how we might calculate the margin of error for such a pooled sample), but if you treat all three polls as if they were one, the difference between the vote count and the consistent Research 2000 result looks far more statistically meaningful.
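For readers who want a rough sense of that math anyway, here is a minimal sketch. It assumes each poll was a simple random sample of about 600 respondents, a sample size that is not stated above but is consistent with a +/- 4% margin. It also ignores design effects and the fact that the three polls were taken at different times, which is exactly the argument set aside above. The function name is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# ASSUMPTION: 600 interviews per poll. This figure is not given in the
# article; it is chosen because it reproduces the reported +/- 4% margin.
n_per_poll = 600

single = margin_of_error(n_per_poll)        # one poll on its own
pooled = margin_of_error(3 * n_per_poll)    # three polls treated as one sample

print(f"one poll:     +/- {single * 100:.1f} points")
print(f"three pooled: +/- {pooled * 100:.1f} points")
```

Under those assumptions the pooled margin on each candidate's share shrinks from about 4 points to a little over 2, which is why the same gap between poll and vote count looks more meaningful once the three polls are combined.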

One big problem in this case is that Research 2000 was the only pollster releasing results into the public domain. Had other pollsters been active, producing the sort of pollster-to-pollster variation we typically see, those who followed the race might have been less surprised by the outcome.

I am told, however, that the Lincoln campaign and allies of the Halter campaign (presumably organized labor) did conduct internal polling that was not publicly released. I communicated with senior advisors to both campaigns today who say that each side polled immediately after the first primary and that those polls showed Lincoln ahead. Lincoln's internal poll showed her leading by ten points, while two post-primary polls conducted by Halter's allies showed Lincoln leading by six and four points. The advisors also claim that neither campaign fielded a tracking poll in the final week, as all remaining resources were devoted to advertising and efforts to get out the vote.

Now in fairness to Research 2000, all of these claims were made to me today, on background, and I have no way to verify them independently. So take this information with a grain of salt.

Are there lessons to be learned here?

First, let's remember the point I made a week ago, with the help of Nate Silver's data: Whatever the reason, polls show far more error in primaries, especially primary elections in southern states.

Second, consider something largely overlooked: Arkansas has one of the largest cell-phone-only populations in the nation. A year ago, the Centers for Disease Control's National Center for Health Statistics (NCHS) published estimates of wireless-only percentages by state. Arkansas ranked fourth for the percentage of cell-phone-only households (22.6%) and seventh for the percentage of cell-phone-only adults (21.2% -- for rankings, see the charts in our summary). And the national-level NCHS estimates of the cell-phone-only population have risen another 4.5 percentage points over the past year.

Nationally, the cell-phone-only population is largest among younger Americans, those who rent rather than own their homes and among non-whites. Those patterns could have made a difference in Arkansas.

Third is a point I made in my column earlier this week: The results of pre-election surveys are sometimes only as good as the assumptions that pollsters make in "modeling" likely voters.

For example, many pollsters stratify their likely voter samples regionally based on past turnout. In other words, they divide the state up into regions and use past vote returns to determine the size of each region as a percentage of the likely electorate. As should be obvious, these judgements are often subjective and rely heavily on the assumption that past turnout patterns will apply to future elections.

For the Arkansas runoff, however, pollsters could rely on a very proximate turnout model: The first primary on May 18 between Lincoln, Halter and D.C. Morrison. In fact, according to Del Ali, that's exactly what Research 2000 did for their runoff polls. They used the regional distribution of voters on May 18 to set regional quotas. They also conducted a survey of self-identified voters on primary election night, weighted the survey so their self-reported preferences matched the result, and relied on the resulting demographics to guide their demographic weighting on subsequent polls.

But here's the problem: As is typical, total turnout declined between the two elections. Roughly 70,000 voters (21% of those who voted in the first primary) did not vote in the runoff. More important, the fall-off was not consistent throughout the state, and the pattern favored Lincoln: Turnout stayed higher in her base and fell off most where she was weakest.

I took the vote by county as reported by the Associated Press (here and here) and calculated turnout in each county for the runoff as a percentage of the total vote cast in the first primary. As the scatterplot below shows, the fall-off in turnout was typically greatest in counties where Lincoln's percentage of the vote on May 18 was lowest. (I omitted results from Baxter and Newton counties, which showed increases in the total vote, suggesting clerical errors or omissions in AP's vote totals.)
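The county calculation described above can be sketched as follows. The county names and vote totals here are hypothetical placeholders, not the AP returns, and the rule of dropping any county whose total rose is an assumption that stands in for the two specific omissions mentioned above.

```python
# HYPOTHETICAL county totals for illustration only; not the AP returns.
# (county, May 18 total vote, Lincoln's May 18 share, runoff total vote)
counties = [
    ("County A", 10000, 0.52, 9000),
    ("County B", 8000, 0.45, 6400),
    ("County C", 5000, 0.38, 3500),
    ("County D", 4000, 0.41, 4100),  # runoff total exceeds primary total
]

def retention(primary_total, runoff_total):
    """Runoff turnout as a fraction of first-primary turnout."""
    return runoff_total / primary_total

rows = []
for name, primary_total, lincoln_share, runoff_total in counties:
    r = retention(primary_total, runoff_total)
    if r > 1.0:
        # ASSUMPTION: an apparent increase is treated as a likely
        # reporting problem and the county is left out of the plot.
        continue
    rows.append((name, lincoln_share, r))

# Each remaining row is one point on the scatterplot:
# x = Lincoln's May 18 share, y = runoff turnout as a share of May 18 turnout.
for name, share, r in rows:
    print(f"{name}: Lincoln {share:.0%} on May 18, runoff turnout {r:.0%} of primary")
```

Plotting the second value against the third for every county would give a chart of the kind described, with the real returns in place of the made-up rows.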


The pattern is most likely explained by the fact that there were also Congressional run-off elections held yesterday in the 1st and 2nd Districts of Arkansas, which kept turnout higher in areas that are also Lincoln's base of support.

I don't want to make too much of the turnout pattern since, by my calculations, re-weighting the May 18 vote to match yesterday's county-level turnout would add less than a percentage point to Lincoln's lead. But hopefully it gives you some idea of what can happen when assumptions go awry. Region is just one variable. Other assumptions, such as those for race and age, may have been even more consequential. Other pollsters making different assumptions may have produced very different results. When just one public pollster is active in the race, the odds of misreading the horse-race are greater.
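For the curious, the re-weighting mentioned above might look something like the sketch below. It is one plausible reading of that calculation, not necessarily the exact method used: each county's May 18 candidate totals are scaled so the county's total matches its runoff turnout, and the statewide lead is then recomputed. All vote counts are hypothetical.

```python
# HYPOTHETICAL May 18 county returns and runoff totals, for illustration only.
# (county, Lincoln votes, Halter votes, Morrison votes, runoff total vote)
counties = [
    ("County A", 5200, 3800, 1000, 9000),
    ("County B", 3600, 3400, 1000, 6400),
    ("County C", 1900, 2400, 700, 3500),
]

def lincoln_lead(rows):
    """Lincoln minus Halter, in percentage points of the total vote."""
    lincoln = sum(r[0] for r in rows)
    halter = sum(r[1] for r in rows)
    total = sum(r[2] for r in rows)
    return 100.0 * (lincoln - halter) / total

# May 18 as it was actually cast: (Lincoln, Halter, total vote) per county.
original = [(l, h, l + h + m) for _, l, h, m, _ in counties]

# May 18 preferences, with each county scaled to its runoff turnout.
reweighted = []
for _, l, h, m, runoff_total in counties:
    scale = runoff_total / (l + h + m)
    reweighted.append((l * scale, h * scale, runoff_total))

shift = lincoln_lead(reweighted) - lincoln_lead(original)
print(f"Lincoln lead, May 18 turnout:  {lincoln_lead(original):.1f} points")
print(f"Lincoln lead, runoff turnout:  {lincoln_lead(reweighted):.1f} points")
print(f"Change from re-weighting:      {shift:+.1f} points")
```

With these made-up numbers the shift happens to come out under a point in Lincoln's favor; the real figure depends entirely on the actual county returns.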



How much of the discrepancy in the polls can be traced to respondents' trust of a particular pollster's identification? Did Research 2000 identify itself as conducting a poll on behalf of Daily Kos? If so, then there would be many who lean conservative who would instantly hang up and many liberals who would enthusiastically answer the questions.

The same thing has to be true with the others. Would many conservatives be willing to answer a CBS/NY Times poll based on their track record of liberal bias?

Then you have the other side. How many liberals would hang up on Rasmussen? Fox News?

People are more informed than they used to be. These pollsters have developed their reputations and people know which way they lean. Someone is not likely to answer questions for a pollster they assume will diminish their answers anyway.



@ Gary

I know I hang up on pretty much any automated or sales call. If it was an IVR poll I probably wouldn't listen to it no matter what pollster it was.

I think pollsters try to account for that inherent bias.



When I've been surveyed/polled, particularly by the repub research group in our area, I love to give them false information. And I don't think I'm unusual in this - I bet it happens all the time. I can't believe phone polls, particularly land lines, will continue to be used forever.



One big item I think you are missing in the two AR elections: The first was a three-person race and the second was a two-person race.

This was a tough one to handicap. Morrison is actually more conservative than Lincoln. But he probably also got a lot of votes from the anti-incumbent, anti-DCInsider crowd. Morrison got 13% in the first election. In the runoff Lincoln was +7% and Halter was +6% vs the first time around. So neither was able to get the lion's share of Morrison voters. The conservative Dems went for Lincoln and the anti-whatever bunch went for Halter.

Even more important, in my opinion, was the outsider money. It was the SEIU vs the Chamber of Commerce. In this state, the Chamber is going to win that one every time. As much as people like the WaPo column you linked to are saying it was Lincoln's contrite ads, I believe her union bosses ad was more effective.

Your point about the runoffs in the 1st and 2nd districts is well made. And to expand on it just a little, the big runoff in the 3rd district was on the GOP side.



One thing I don't think the pollster ratings take into account is that these are fluid races and particularly in primaries (especially low profile ones) even dedicated voters won't know who they are voting for or anything about the candidates until election day.

Now, Arkansas is probably a different story. With how saturated it was I highly doubt any voter in the state didn't know who Lincoln or Halter was. But even there, I would point you in the direction of Blanche Lincoln's closing ad featuring Bill Clinton. This is one of the most effective ads I have seen this election cycle, and it was her closer - running 100% for the last weekend - after the R2K poll was taken. Now, Lincoln and Halter's campaigns say she was already in the lead so maybe this was a bad poll.

But what about the "blowing" of the Alabama race? The higher profile candidate was polled at 42 and ended up at 38. There is a lot of evidence that Sparks benefited from a late surge of last-minute deciders who all moved to him. The point being, Artur Davis as the "frontrunner" with a higher profile and more money had the whole year to solidify his position with the electorate, and because he was running such a bad campaign he wasn't able to. When voters finally decided, Sparks combined his smaller base of support with the rest of the voters who were probably more anti-Davis than anything else. I don't know any pollster or campaign manager that expects a poll's horserace test to pick up this kind of theoretical support three weeks before the election.

I guess my main criticism is that events can change. They aren't always as obvious as a Wellstone funeral or a last minute DUI, but elections are fluid. It seems Silver tries to control for this by using how close the pollster was to the winner's % as one of his metrics, but I would think it would be better to identify a dominant candidate (usually the winner, but in the case of Alabama, definitely Davis was the known quantity) and see how the pollster pegged them.

One other example is Strategic Vision. They were never polling; what they were doing was just using available information to predict what they thought the outcome would be. And their fake polls scored pretty well on Silver's analysis. So if I'm a pollster who polls a two-person primary three weeks before the election and the candidate I expect to win comes in at 40-20, should I just assume that the end result is going to be 60-40 and re-weight my results so that the winner's percentage will be closer to what I truly believe it will be on election day? Should my poll conform to an arbitrary scoring system instead of truly being a statistically accurate (with a margin of error) reading of where the electorate is really at?



hoosier_gary, Big-Mike, Chris:
Excellent posts.

I agree that Blanche making it anti-union (with the startling assist in that by Clinton) made the difference.

As I have posted elsewhere, Blanche's only chance is to go as far right as she dares. Now that the SEIU and Steelworkers have withdrawn, she has lost the anti-union, anti-outsider message.

The only target left is Obama, but she will have to make it about over-spending or she will turn off more voters than she attracts. She has to lose her liberal taint somehow.

Of course, her strongest effort will have to be throwing mud at her opponent, a time-honored tactic because it often works. I personally have no problem with it as long as the media does not go overboard in assisting/protecting either party.



Mark Blumenthal:
Please don't spare us the math and statistical issues. At least a link to a discussion of those would be welcome.

In my experience with engineering and medical statistics, there seem to almost always be caveats in real-world statistical problems, but it would be interesting to know how much you think each would affect this and similar aggregate analyses.

