Pollster.com


Even Polls About Baseball...

Topics: Sampling

Let's take a break from political polling controversies and focus on something of, hopefully, broader interest: a controversy involving a poll on baseball.

Larry Brown, who blogs MLB Clubhouse on AOL, posted scathing criticism a few days ago of a poll on Barry Bonds released over the weekend by ESPN and ABC News (hat tip to alert reader David Pinto of Baseball Musings). Brown sees evidence of a cooked poll:

If anyone actually bothered to read the poll, it says an oversample of 203 African-Americans were questioned out of 799 baseball fans. Considering only 12.8% of the population is African-American (according to the U.S. Census Bureau), and potentially a far lower percentage of baseball fans are African-American, I would say that the 25% mark of African-Americans used to conduct the study is designed to mislead the public and generate racially charged results.

Not exactly.

Pollsters sometimes "oversample" a survey sub-population in order to increase the reliability of the results for that group. More interviews mean less potential random sampling error. Before tabulating the data for the full sample, however, they "weight" the oversample back down to its correct proportion within the larger sample.

I checked with Gary Langer, the director of polling at ABC News, and he provided a few additional details. The ABC Polling Unit started with a nationally representative sample of 1,803 randomly selected adults interviewed between March 29 and April 4. Of these, 660 described themselves as baseball fans (on the survey's first question). Of these, 64 were African-American.

The pollsters wanted a bigger and more reliable sampling of African-Americans. So they continued calling from April 5 to April 22 and interviewed another 476 randomly sampled African Americans, of whom 139 were self-described baseball fans.
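To get a rough feel for what the extra interviews buy, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It assumes simple random sampling, a 95% confidence level and the worst case of a 50/50 split, and it ignores any design effect from weighting, so treat the results as ballpark figures rather than the margins ABC would report. The function name is mine, not anything from the ABC Polling Unit.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from
    n interviews, assuming simple random sampling (no design effect)."""
    return z * math.sqrt(p * (1 - p) / n)

# African-American baseball fans: 64 in the original sample,
# 64 + 139 = 203 once the oversample is added.
for n in (64, 64 + 139):
    print(f"n = {n}: about +/- {margin_of_error(n) * 100:.0f} points")
```

Under those assumptions the margin drops from roughly 12 points to roughly 7, which is the whole point of the extra calling.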

Thus (adding everything up), the ESPN/ABC survey interviewed 799 baseball fans (660 plus 139), including 203 African-Americans (64 plus 139). Before tabulating the data, however, they weighted the combined sample of 2,279 (the original 1,803 plus the oversample of 476 blacks) in a way that reduced the proportion of African-Americans to its correct value as determined by the U.S. Census.**
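For readers who want to see the mechanics, here is a simplified sketch of that kind of weighting. It adjusts on race alone, using the 11% adult share from the Census noted in the footnote, whereas real weighting to Census norms typically balances several demographics at once. One number is purely hypothetical: the count of African-Americans in the original 1,803 interviews was not reported, so I assume 198 (about 11%) for illustration.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group = population share / share of the sample."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

# HYPOTHETICAL: not reported in the poll; assumed for illustration only.
black_in_original = 198

sample_counts = {
    "black": black_in_original + 476,   # original sample plus oversample
    "other": 1803 - black_in_original,
}
population_shares = {"black": 0.11, "other": 0.89}  # adults, per the footnote

weights = poststratification_weights(sample_counts, population_shares)

# After weighting, African-Americans count for their population share.
weighted_black = weights["black"] * sample_counts["black"]
weighted_total = sum(weights[g] * sample_counts[g] for g in sample_counts)
weighted_share = weighted_black / weighted_total

print(weights)
print(f"weighted African-American share: {weighted_share:.2%}")
```

Each oversampled interview ends up with a weight below 1 and every other interview with a weight above 1, so the full-sample results reflect the correct racial mix while the results among African-Americans still rest on all 203 fan interviews.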

This practice is not at all unusual. The intent is to generate more statistically reliable results by race, not -- as Brown puts it -- to "generate racially charged results."

One thing I will say in Larry Brown's defense: the two-sentence methodology blurb at the end of the ESPN release was not entirely clear. For one thing, it said that the total sample of 799 baseball fans included "an oversample of 203 African-Americans." Technically, it included an oversample that increased the sample of African-Americans to 203 interviews.

More important - and here is a message for all who write poll releases - it included the term "oversample" without any description of the weighting procedure. A sentence like this one (quoting from Gary Langer's email reply to me) would help reduce the confusion:

The combined sample (1,803 gen pop, oversample of 476 blacks) was weighted to Census norms, reducing the proportion of African-Americans to its correct population value.

**The US Census reports African-Americans as 11% of all adults. Brown's 12.8% statistic is the percentage of African-Americans among the full population, including children.

P.S.: And speaking of ABC Polling Director Gary Langer, the site Freakonomics posted some intriguing comments from Langer yesterday on the subject of whether polls have historically overstated support for minority candidates. It includes this accurate and well-deserved compliment from Freakonomics author Stephen Dubner:

Gary is a force of nature. He not only runs ABC's polling but has become the network's top cop for keeping bad data off the air, vetting many of the surveys, studies, and polls that producers and reporters plan to use in their stories. I don't know of any other news organization that has such a resource. I am sure he is occasionally a thorn in the side of a reporter who's dying to cite some sensationalistic study from some biased organization ... but as consumers of news, we are all the better for it.

 

Comments

I think part of the confusion stems from that goofy term "oversample." If you're doing stratified sampling, which it sounds like the pollsters came around to doing, the sample sizes should be related to the within-group variation, not the proportions in the overall population. For opinions on Barry Bonds, there's not likely to be much prior information on variation, so who's to say whether the sample of Afro-Americans was "over" or "under?"

"Oversample" is a stupid and misleading description. Let's chuck it.

____________________

X Stryker:

Perhaps it would be better to say, "Overpolling was done to ensure a statistically significant sample of all groups, and then proportionally weighted based on US Census data."

Or something like that.

____________________


