# Pollster.com

## Articles and Analysis

### POLL: ARG Pennsylvania Dems 4/17-19

American Research Group

Pennsylvania
Clinton 54, Obama 41

Uri:

Mark, can you post a reaction to the commentary following the poll that discusses "internet bloggers and their bias in picking outliers"?

____________________

Alan:

"There are a number of ways to test for outliers. The easiest rule-of-thumb is to divide the results into quartiles and then to set limits one and one-half times beyond the medians of the upper and lower quartiles.

Using the last ten surveys in Pennsylvania (between 4/6-10/08 and 4/14-15/08), the median of the upper quartile is 52%. 52% times 1.5 equals 78%. Any survey with Hillary Clinton with over 78% would be an outlier using this method. The lower limit is 44% and any survey with Clinton lower than 22% would also be an outlier. There are no outliers in the Pennsylvania results using this method."

This sounds like a weird version of using the interquartile range times 1.5. In his example, though, he simply takes the upper limit (I'm not entirely clear on what he's using for this, since I got 50 for the third quartile) and multiplies it by 1.5, ignoring the range of the data. Try applying this method as Bennett describes it to a hypothetical data set where every poll shows Clinton at 52%. You get the same 78% cutoff point for outliers, which is nonsensical.

In any case, using specific criteria to decide whether or not something is an outlier can be useful, but it seems a little silly to accuse bloggers of misusing a term which doesn't have a set technical definition to begin with.
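To make the comparison concrete, here is a minimal Python sketch contrasting the usual interquartile-range fences with the arithmetic in the quoted passage. The function names `tukey_fences` and `quoted_rule_limits` are made up for illustration, and `quoted_rule_limits` is only one reading of the quote (upper value times 1.5, lower value times 0.5, which reproduces the quoted 78% and 22%). The all-52% data set is the hypothetical one described above, not real polling data.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Conventional fences: k times the interquartile range beyond Q1 and Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def quoted_rule_limits(lower, upper):
    """One reading of the quoted arithmetic (an assumption, not a confirmed method):
    scale the upper value by 1.5 and the lower value by 0.5."""
    return lower * 0.5, upper * 1.5


# Figures given in the quote: lower value 44, upper value 52.
print(quoted_rule_limits(44, 52))   # (22.0, 78.0)

# Hypothetical data set: ten polls that all show Clinton at 52%.
identical_polls = [52] * 10
print(tukey_fences(identical_polls))   # (52.0, 52.0): zero spread, so the fences collapse
print(quoted_rule_limits(52, 52))      # (26.0, 78.0): same upper cutoff regardless of spread
```

With identical polls the interquartile range is zero, so the conventional fences sit right at 52%, while the quoted arithmetic still allows anything up to 78%.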

____________________

richard pollara:

I can't pretend to understand their definition of an outlier, but I do know that every time a poll comes out with a result that one group doesn't like, they trash the poll, the methodology, the pollster's record or the alleged bias of the pollster. This may be sacrilege on a polling blog, but what I have seen over the last few months (as a layman) is that the predictive power of a poll, particularly well in advance of the primary, is almost nil. At their best these early polls seem to say, "at this moment in time, based on this set of questions and based on our method for determining who to interview, we think XYZ. And oh yes, factor in an additional 4% margin of error". Surprisingly, it wasn't NH that caused me to lose faith. I think it was Zogby in Ca. It is one thing not to cover the spread, but to miss it entirely was another matter. I prefer to rely on history (if you can call a four-month primary season history), and the best analysis of the historical trends came from Mark a few days ago. In the final days the undecideds come back to the safe choice. That is what seems to have happened in California and Ohio. I think we will see the same thing in Pa. That would put Hillary slightly ahead of Ohio, and net her perhaps an 11-12 point win. Of course, I thought George McGovern was a shoo-in over Nixon, so my predictive powers aren't the greatest either. We'll just have to wait and see on Tuesday.

____________________

Chris G:

so according to Bennett's definition nothing between 22-78% for Clinton would be considered unusual??

in estimating standard error you usually divide out the number of sample points (or the square root of), and he doesn't do that in his formula. it essentially suggests that no matter how many polling results we have, the error is the same as long as they're drawn from the same distribution. doesn't seem right

I wish he'd at least provide references of some kind
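The point about sample size can be illustrated with a short Python sketch of the standard error of a mean, which divides the sample standard deviation by the square root of the number of points. The helper name `standard_error` and the poll values are hypothetical, chosen only to show the arithmetic. Note that this measures uncertainty in the average of the polls, which is a different question from whether a single poll lies unusually far from the rest.

```python
from math import sqrt
from statistics import stdev


def standard_error(values):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical Clinton percentages, not real polling data.
few_polls = [48, 50, 52, 54]
# The same values repeated four times: similar spread, four times as many points.
many_polls = few_polls * 4

print(round(standard_error(few_polls), 3))
print(round(standard_error(many_polls), 3))
```

The second figure is smaller because the divisor grows with the number of polls, whereas the limits in the quoted rule do not depend on how many polls there are.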

____________________
