

The Effect of ARG Polling on Iowa Trends

Topics: Iowa


American Research Group (ARG) does a large amount of state primary polling. Because it contributes more polls than most other organizations, it is potentially influential in our estimates of candidate support. This week we saw conflicting results from ARG and Time/SRBI polls of Iowa. (See Mark Blumenthal's analysis here.) The discrepancy between ARG polls and others in Iowa has been an issue here before, as has the question of how much any single poll influences our trend estimates. Today we take another step towards systematically answering that question.

In the Democratic race, ARG has consistently found support for Clinton well above that of other polling organizations. In the chart above, the blue line is the trend estimated with all polls, including ARG, while the red line is the trend estimated without ARG. The light blue points are the non-ARG polls, while the purple points are the ARG polls.

This lets us compare three things: ARG polls to other polls, ARG polls to the trend, and the trend with ARG to the trend without ARG.

In the case of Clinton, ARG polls are consistently far above the results of other polls. This has been widely remarked upon already. And in the Clinton case, the ARG polls have shown some decline in support in Iowa, while other polls have shown an increase in her support. This is also the case in which ARG exerts a significant influence on the trend estimator. The blue trend line (with ARG included) is well above the red trend estimate which excludes ARG. This was especially true early in 2007 when there were few polls and several from ARG, giving them an extra influence due to lack of non-ARG data. As polling frequency has increased the two trend estimates have converged, but the non-ARG estimate remains a couple of points below the overall trend.
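For readers who want to try the with/without comparison themselves, here is a minimal sketch. It is not the trend estimator used for the charts: it swaps in a simple Gaussian-kernel weighted average, and the function names, the `bandwidth` parameter, and the poll numbers are all hypothetical, chosen only to illustrate the idea.

```python
from math import exp

def kernel_trend(polls, at_day, bandwidth=30.0):
    """Gaussian-kernel weighted average of poll readings near at_day.

    polls is a list of (day, value, pollster) tuples. This is a stand-in
    smoother for illustration, not the estimator behind the charts.
    """
    num = 0.0
    den = 0.0
    for day, value, _pollster in polls:
        weight = exp(-0.5 * ((day - at_day) / bandwidth) ** 2)
        num += weight * value
        den += weight
    if den == 0.0:
        raise ValueError("no polls to estimate a trend from")
    return num / den

def trend_with_and_without(polls, at_day, pollster, bandwidth=30.0):
    """Return (trend using all polls, trend excluding one pollster)."""
    others = [p for p in polls if p[2] != pollster]
    return (kernel_trend(polls, at_day, bandwidth),
            kernel_trend(others, at_day, bandwidth))

# Made-up readings: house "A" always reports 30, everyone else reports 24.
sparse = [(0, 30.0, "A"), (15, 24.0, "B"), (30, 30.0, "A")]
dense = sparse + [(5, 24.0, "C"), (10, 24.0, "D"),
                  (20, 24.0, "C"), (25, 24.0, "D")]

for label, polls in (("sparse", sparse), ("dense", dense)):
    with_a, without_a = trend_with_and_without(polls, 15, "A")
    print(label, round(with_a - without_a, 2))
```

In this toy example the gap between the two estimates shrinks as more non-"A" polls are added around the same dates, which is the convergence pattern described above.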

Blumenthal has talked about possible reasons for this, and I encourage you to see his post here.

I'm more concerned here with the magnitude of the differences and their effects, so I will leave it to Mark to explain the "why".

It is clear that ARG's estimates for Clinton have consistently been out of line with others, and that this has had an effect on my trend estimates, making Clinton appear more competitive in the first half of 2007.

But let's also look at the other candidates. ARG is less consistent in over- or under-estimating Edwards' support. Some ARG polls have put Edwards below trend, but others have him above trend. While ARG has disagreed with other pollsters in individual polls, the effect of ARG on the trend estimate for Edwards is negligible.

On the other hand, ARG has consistently had Obama below the support found in other polls, and well below the trend estimate. Despite this, the effect of ARG on the trend estimates has been small for Obama, with the blue and red trend estimates consistently quite close to one another.

Finally, Richardson has been a bit underestimated by ARG, but again with little influence on the trend estimates.

Bottom line: ARG had a substantial effect on the Clinton trend estimate until recently, and even now the substantive effect is not trivial. Estimates including ARG put the trend at 26.2% for Clinton and 24.2% for Edwards, a Clinton lead of 2.0 points. But excluding ARG from the trends we get Clinton at 24.6% and Edwards at 25.9%, a 1.3 point Edwards lead. Of course both estimates say the race is close in Iowa, and perhaps we should stop there. But the consistent ARG overestimate of Clinton has influenced perceptions and estimates for this race.

If we switch to the Republican side, there is a consistent ARG overestimate of McCain support until very recently. ARG is also a bit high on Giuliani and a bit low on Romney. The Thompson numbers are relatively few and jump around.


Unlike the case of Clinton, the trend estimates are not much affected by the ARG data. The blue and red trend estimates lie very close to one another for all four Republican candidates, despite the high ARG readings for McCain.

There are two bottom lines here. Any pollster can experience consistent house effects that lead to over- or under-estimating support for some candidate. These may be due to sampling methods, filtering for likely voters, question wording or order, weighting methods, or perhaps to mysterious gremlins. ARG is an example of house effects, at least for Clinton and McCain and probably Obama. House effects are important because they give us a way of estimating what a poll would be if we adjust for those house effects. That gives better perspective than the raw numbers might. But house effects also allow us to say which polls are more in line and which more out of line with others. A house effect is not in and of itself evidence of bad polling methodology. There may be good reasons for choices that lead to significant house effects, for example deciding to interview likely voters rather than adults, or deciding whether or not to push undecided voters for a preference. So we should be careful here in how we interpret the results. That said, it is crucial to know which organizations are consistently high or low for candidates (or any other variable). The ARG lines in the figures above give a clear reading of that for the Iowa polling.
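One rough way to put a number on a house effect is to average how far a pollster's readings sit from the trend estimated from everyone else's polls, then subtract that average from the raw reading. The sketch below does this under loud assumptions: the smoother is a simple Gaussian-kernel average rather than the estimator behind the charts, the names `others_trend`, `house_effect`, and `adjust` are hypothetical, and the data are invented.

```python
from math import exp

def others_trend(polls, at_day, exclude, bandwidth=30.0):
    """Kernel-weighted average at at_day of polls NOT from `exclude`.

    polls is a list of (day, value, pollster) tuples.
    """
    num = 0.0
    den = 0.0
    for day, value, pollster in polls:
        if pollster == exclude:
            continue
        weight = exp(-0.5 * ((day - at_day) / bandwidth) ** 2)
        num += weight * value
        den += weight
    if den == 0.0:
        raise ValueError("no polls from other organizations")
    return num / den

def house_effect(polls, pollster, bandwidth=30.0):
    """Mean gap between a pollster's readings and the others' trend."""
    gaps = [value - others_trend(polls, day, pollster, bandwidth)
            for day, value, name in polls if name == pollster]
    if not gaps:
        raise ValueError("pollster has no polls")
    return sum(gaps) / len(gaps)

def adjust(value, effect):
    """Reading we would expect if the house effect were removed."""
    return value - effect

# Invented data: other houses read 24 throughout; house "A" runs high.
polls = [(0, 30.0, "A"), (10, 24.0, "B"), (20, 24.0, "C"),
         (30, 29.0, "A"), (40, 24.0, "B"), (60, 31.0, "A")]

effect = house_effect(polls, "A")
print(round(effect, 2), round(adjust(30.0, effect), 2))
```

A positive value means the pollster runs high relative to the others. As noted above, that says nothing by itself about whose methodology is better.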

In the next few days we'll be rolling out a series of posts that look at house effects for all polling organizations across state and national polling. We'll have a systematic look at this, with estimates of the effects for each organization. I hope that will help clarify things.

The second bottom line is that the trend estimates are pretty resistant to the effect of a single polling organization when there are plenty of other polls taken around the same period. But, as in the case of Clinton and ARG, this effect can be quite a bit larger when polling is sparse and a single organization contributes a substantial share of the polls while also exhibiting a significant house effect. In one sense this problem goes away as we approach elections, because the density of polling increases, as does the heterogeneity of polling organizations. But as Iowa illustrates (and we'll see again in other primary states with limited polling), it is not always possible to be sure which polls are misleading us when the evidence is limited.

Stay tuned next week for the next step in examining the house effects in primary polling.

Cross-posted at Political Arithmetik.



Great post.

It seems unfortunate that the effects of mass consumption of political reporting based on polling averages skewed by a single organization cannot also be tracked, because that variable may be influencing the entire equation in ways we cannot even imagine. Although I suppose there's no way to definitively prove that this has happened, the evidence above does support what I'm suggesting. It is not unreasonable to postulate that ARG's effect on the early polling averages saved Clinton from negative press regarding her campaign in Iowa, instead resulting in stories of upward momentum for her that helped enable such a reality, and raised questions only about Edwards' ability to hold on to his campaign there (Edwards ironically being the candidate who actually was, and still would be, leading all those polling averages in Iowa without ARG's skew).



I noticed the same thing about ARG's Republican polls and wondered if something about their method tends to favor the national front-runner in general. I do note that Zogby and U of Iowa have each turned in a poll showing substantial leads for Clinton too, though, and pretty much everyone who's polled the race has shown her and Edwards about tied at least once.

I just tried my own experiment in attempting to filter for house effects though. No idea whether this approach has any real validity or not but it did happen to yield pretty believable results in this case. What I did was instead of looking at any pollster's absolute numbers, I pulled out polls from people who had polled the race more than once and looked just at their relative gain or loss for each candidate from one poll to the next.

As a starting point for each candidate, I used a straight average of four polls taken between mid-December and mid-January, then added the average gain/loss month by month to each candidate's total. Where polls were more than a month apart I used the average monthly gain or loss from their last one. This resulted in a ranking for August of Edwards 27%, Clinton 25%, Obama 23%, Richardson 8%.
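The change-based approach described here could be sketched roughly as follows. This is one reading of the description, not the commenter's spreadsheet: months are numbered as integers, a gap of several months is spread evenly across those months, and the function names and sample numbers are hypothetical.

```python
from collections import defaultdict

def monthly_changes(polls):
    """Average month-to-month change across pollsters.

    polls is a list of (month_index, value, pollster) tuples. Only each
    pollster's change between its own consecutive polls is used, so a
    constant house effect cancels out. Returns {month: mean change}.
    """
    by_pollster = defaultdict(list)
    for month, value, pollster in polls:
        by_pollster[pollster].append((month, value))

    changes = defaultdict(list)
    for series in by_pollster.values():
        series.sort()
        for (m0, v0), (m1, v1) in zip(series, series[1:]):
            if m1 == m0:
                continue  # two polls in the same month: no monthly change
            step = (v1 - v0) / (m1 - m0)  # spread a long gap evenly
            for month in range(m0 + 1, m1 + 1):
                changes[month].append(step)
    return {m: sum(v) / len(v) for m, v in changes.items()}

def chained_estimate(baseline, polls, last_month):
    """Baseline plus the accumulated average changes through last_month."""
    average = monthly_changes(polls)
    estimate = baseline
    for month in range(1, last_month + 1):
        estimate += average.get(month, 0.0)  # no data: assume no change
    return estimate

# Invented example: house "A" reads high, house "B" reads low, but only
# their changes over time feed the estimate.
sample = [(0, 30.0, "A"), (2, 34.0, "A"), (0, 22.0, "B"), (1, 23.0, "B")]
print(chained_estimate(25.0, sample, 2))
```

Because only within-pollster changes are used, the gap in level between the two made-up houses has no effect on the result; the choice of baseline and the handling of months with no polls are the main judgment calls.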

The odd thing is that in this case, if I exclude ARG, Edwards' trend line shifts down and I end up with results more in line with your regular running average (Clinton 26%, Edwards 22%, Obama 20%, Richardson 10%). So I guess they have not been so bad to him in that respect. I did have to throw out their April poll, however, because what the heck was that? Is there a reason no one else polled Iowa in April?

Anyway, it was an interesting experiment. I've never heard of anyone doing this before, but I'd been rolling it around in my mind for a couple of weeks and finally decided it was worth a try. You would be more than welcome to my spreadsheet if you have any interest in it. Apologies for the long comment.

