Obama's Double-Digit Lead? The Cell Phone Only Difference in the National Trend Estimate

One can get lost in the deluge of polls which, just this week, show anything from a narrow 1% Obama lead (AP-GfK) to a substantial margin of 14% (Pew). One pattern that seems to have become particularly evident this week is that the polls showing the biggest leads for Obama tend to be those that are polling the cell phone only population (such as Pew, CBS/New York Times, and ABC/Washington Post). We know from the recent Pew report that excluding cell phone only respondents from the sampling frame reduces Obama's margin by 2-3%, even when the sample is weighted. But how does this affect the national trend estimate, which takes into account all polling?

One of the great features of the new interactive tracking charts available on this site is the ability to select or remove particular pollsters. I used this feature to create two national trend estimates--one including only pollsters that include cell phone only respondents, and one including all other pollsters.

National Trend Estimate for Pollsters Reaching Cell Phone Only Respondents

National Trend Estimate for Pollsters Not Reaching Cell Phone Only Respondents

The comparison between the two trends is remarkably consistent with what the Pew Report would lead us to expect. While the trend that includes pollsters not calling cell phones shows an Obama advantage in the 6-7% range, the trend for those reaching cell phone only respondents shows an Obama lead greater than 10%. Obama's support increases by almost 3% in the national trend that includes polls reaching cell phone only respondents while McCain's support decreases by about 1%.

The difference between these two trends is hardly trivial since an extra 3-4% in the national vote could very well mean that several additional states tip in Obama's favor, producing a substantial electoral college landslide (keep in mind that most statewide polls are not including cell phone only respondents). If we assume that polls that reach the cell phone only population are more accurate, then Obama's lead may very well be in double-digits. But on November 4th, it will be worth checking back on these two trends to see whether the cell phone only pollsters actually do fare better in predicting the election outcome.

UPDATE: One of the nice things about the dynamic charts is that they will continue to update themselves. Thus, if you want to keep track of the differences between the separate trends for the next 11 days, you can bookmark this post and keep checking in.



There is a much greater margin of error than the pollsters are reporting. The old rule that a sample size of 2,100 equals a 3.5% margin of error no longer holds true. The true margin of error is closer to 6%. This explains the widely divergent results of all the polls.
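For reference, the textbook margin of error for a simple random sample can be checked directly. A minimal sketch follows; the design-effect value used here is purely illustrative (weighting and coverage adjustments inflate variance, but by an amount that depends on the survey):

```python
import math

def moe(n, p=0.5, z=1.96, deff=1.0):
    """95% margin of error for an estimated proportion p from n respondents.
    deff is an illustrative design effect accounting for weighting/clustering."""
    return z * math.sqrt(deff * p * (1 - p) / n)

print(round(moe(2100), 3))            # conventional MOE for n=2100: ~0.021 (2.1%)
print(round(moe(2100, deff=2.0), 3))  # an assumed design effect of 2 inflates it to ~0.030
```

Note that under this standard formula, n=2,100 gives roughly +/-2.1%, not 3.5%; getting to a markedly larger effective margin requires a substantial design effect or non-sampling error.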



Richard F:

Would you care to offer some kind of evidence or elaboration on your statements?




You're cherry-picking data here. This 2-3% effect in Obama's favor among pollsters calling cell phones can be seen only in the past week or so in your graphs.

Just looking at the lines on your graphs, there was no obvious difference between the two from mid-September to the beginning of October. I note that the Pew report you cite was published on Sep 23.

I'm not going to argue that cell phone-only users are or are not more likely to be Obama voters. It may very well be true. However, the margin for one candidate or the other within that demographic is likely to change over time, just as the margin for the general public changes. Ergo, a +2-3% cell-phone-only effect on Sept 23, even if true back then, won't necessarily mean it will be the same on Oct 23 or Nov 4.



Oops. I should have said @Brian, not @Mark.

I might add: look at the graph for the beginning of September too. I think it further demonstrates my point about the transient, if real, effect of including or excluding sub-groups.



@Richard F:

As far as I know, that's not how sampling works, though. You do not need to increase your n in line with N. As long as your n is representative, which the pollsters (try to) make it, increasing n doesn't automatically improve accuracy. I believe that article is flawed.



Margin of Error. I think Richard F is about right. I'm not a statistician, although I deal with stats a lot. For a random sample from the population, you compute the standard error of the mean (the margin of error) by computing the standard deviation and dividing by the square root of N. The standard deviation is not affected by sample size (much), but dividing by the square root of N decreases the SEM for larger samples. This is a remarkable operation: from a single random sample, you get an estimate of what the standard deviation of the mean would be if you made many random samples from the same population. But clearly a better, more accurate method would be to actually take many separate samples of the same population, calculate the mean for each sample, and calculate the SEM by taking the standard deviation of the distribution of means.
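The single-sample shortcut and the many-samples approach described above can be compared directly in a quick simulation (a sketch only; the 50/50 population and the sample size of 2,100 are assumed for illustration):

```python
import random
import statistics

random.seed(0)
N, TRUE_P = 2100, 0.5

# SEM estimated from one random sample: sd / sqrt(N)
one_sample = [float(random.random() < TRUE_P) for _ in range(N)]
sem_single = statistics.stdev(one_sample) / N ** 0.5

# Empirical SEM: standard deviation of the means of many independent samples
means = [sum(random.random() < TRUE_P for _ in range(N)) / N for _ in range(500)]
sem_empirical = statistics.stdev(means)

# Both land near sqrt(0.25/2100) ~ 0.011, i.e. about a 2% margin of error at 95%
print(round(sem_single, 4), round(sem_empirical, 4))
```

For a simple random sample the two agree; the commenter's point stands that for weighted samples the shortcut need not, which is where design-effect corrections come in.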

It's not clear to me that the formula of taking the standard deviation and dividing by the square root of N applies when you take weighted samples. Perhaps, but I'm not convinced.

On the other hand, we do have something like independent means taken from the same population. Over the past few months we have had several polls taken on approximately the same populations on the same days. One could treat these as independent samples of the same population and view their distribution as an estimate of the standard deviation of the distribution of means. Perhaps this is what Richard F. is talking about. Our general impression is that the spread of daily tracking poll scores is greater than the SEM of the individual tracking polls would permit.

A value in this exercise might be to obtain a correction factor (~1.5) for the SEM estimates made in more traditional ways.

Other thoughts?



"Just looking at lines on your graphs, there was no obvious difference between the two from mid-September to beginning of October."

Huh? The cell phone only polls had Obama substantially ahead in mid-September while the others had McCain substantially ahead! That is an obvious difference to me.


Bear bones:

@ Hyh:

There is a substantial difference in the first half of September. The cell-phone-only polls put McCain ahead during that time, while the other polls, which do not reach cell phones, put Obama ahead.

This also seems to go against the hypothesis that cell-phone-only users are more pro-Obama. It could possibly be interpreted to mean that cell phone users are more in tune with the news and more responsive to what is being covered by the media. This could explain why Obama's lead has recently been widening more quickly among the cell phone crowd than among the population at large.

Brian, do you have this information for pre-September?


MancJon: You actually do have to increase your sample size as the population size grows. Cube got it about right. Statistical representation only works when most of the population variables are known, which is not the case in election polling. This is why social science polling tends to be more accurate than political science polling. As the unknowns in the population increase, the only way to account for as many variables as possible is to increase the sample size.



Some people here are confusing error and bias. Larger sample sizes reduce the margin of error, i.e. the so-called 'confidence intervals' (+/- x%) on either side of the central value (the mean). This is also known as increasing the precision of estimates (which are all that surveys can give you). Bias, on the other hand, shifts the central value right or left (and the error distribution around it): error is random, bias systematic.

What Brian Schaffner is saying is that not including cell only users in polls possibly biases the results in a way that underestimates Obama's support (as these people are more likely than average to be his supporters) in ways that cannot be corrected by statistical weighting.

To correct for this you need to increase the sampling universe (sample both types of phone users) not the sample size.
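That distinction can be made concrete with a toy simulation. All the population numbers below are invented for illustration only, not real polling figures: a landline-only frame that entirely misses a pro-Obama cell-only subgroup stays biased no matter how large the sample gets.

```python
import random

random.seed(1)

# Hypothetical population: 25% are cell-only voters supporting Obama at 60%;
# the remaining 75% support him at 50%. (Assumed numbers, for illustration.)
TRUE_SUPPORT = 0.25 * 0.60 + 0.75 * 0.50  # 0.525 in the full population

def landline_poll(n):
    """A poll whose frame excludes the cell-only group entirely."""
    return sum(random.random() < 0.50 for _ in range(n)) / n

for n in (800, 3200, 12800):
    est = landline_poll(n)
    print(n, round(est, 3), round(est - TRUE_SUPPORT, 3))
# Estimates converge toward 0.50, not 0.525: the random error shrinks with n,
# but the ~2.5-point coverage bias does not.
```

Increasing n narrows the spread around the wrong number; only widening the sampling universe (calling cell phones) moves the center.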



How can we make the assumption that the polls that reach the cell phone population are more accurate? Incidentally, I'd say there's probably a fair case that the cell-phone demographic leans toward Obama by virtue of being young and attached to their cell phones at the hip. I'm just saying that this crowd wouldn't be the first group from which to expect accuracy ... although they may be doggedly in support of the liberal illuminati agenda at this moment in time, all might change the next.

