Mark Blumenthal | August 7, 2009
Topics: Can I Trust This Poll, IVR, Michael Traugott, Nate Silver, Netroots Nation, Rasmussen, Sampling
What makes a public opinion poll "scientific"? If you had asked that question of a random sample of pollsters when I started my first job at a polling firm twenty-three years ago, you would have heard far more agreement than today. Now, many more pollsters are asking fundamental questions about the "best practices" of our profession, and their growing uncertainty makes it ever harder to answer the question I hear most often from readers of Pollster.com: "Can I trust this poll?"
Let's take a step back and consider the elements that most pollsters deem essential to obtaining a high-quality, representative survey. The fundamental principle behind the term "scientific" is the random probability sample. The idea is to draw a sample in a way that every member of the population of interest has an equal probability of being selected (or at least, to be a little technical, a probability that is both known and greater than zero). As long as the process of selection and response is truly random and unbiased, a sample of a thousand or a few hundred will be representative within a predictable range of variation, popularly known as the "margin of error."
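For readers who want to see where that "predictable range" comes from, here is a minimal sketch of the textbook margin-of-error calculation for a simple random sample. It assumes the usual 95% confidence level (z = 1.96) and the worst-case proportion of 50%; the function name `margin_of_error` is my own, and real pollsters often adjust the figure for weighting and sample design, which this sketch ignores.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a simple random sample.

    n: number of completed interviews
    p: assumed proportion (0.5 is the worst case, giving the widest margin)
    z: critical value (1.96 corresponds to roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)

# A sample of "a thousand or a few hundred" interviews
for n in (400, 1000):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} points")
```

Note that the formula depends on the sample size, not on the size of the population, which is why a thousand interviews can represent a country. It holds only if selection and response really are random.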
Pollsters disagreed with each other, even twenty or thirty years ago, about the practical steps necessary to conduct a random sample of Americans. However, at the dawn of my career, at a time when at least 93% of American households had landline telephone service, pollsters were much closer to consensus on the steps necessary to draw a random, representative sample by telephone. Those included:
- A true random sample of known working telephone numbers produced by a method known as "random digit dial" (RDD) that randomly generates the final digits in order to reach both listed and unlisted phones.
- Coverage of the population in excess of 90%, possible by telephone only with RDD sampling (in the pre-cell-phone era) but almost never (decades ago) through official lists of registered voters.
- Persistence in efforts to reach selected households. Pollsters would call at least 3 or 4 different times on at least 3 or 4 successive evenings in order to get those who might be out of the house on the first or second call.
- A "reasonable" response rate (although, then as now, pollsters differed over what constitutes "reasonable").
- Random selection of an individual within each selected household, or at least a method closer to random than just interviewing the first person to answer the phone, something that usually skews the sample toward older women.
- The use of live interviewers -- preferred for a variety of reasons, but among the most important was the presumed need for a human touch to gain respondent cooperation.
- Weighting (or statistically adjusting) to correct any small bias in the demographic representation (gender, age, race, etc.) as compared to estimates produced by the U.S. Census, but never weighting by theoretically changeable attitudes like party identification.
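Two of the steps above, RDD sampling and demographic weighting, are mechanical enough to sketch in a few lines. What follows is a simplified illustration, not any pollster's actual procedure: the function names, the fictional "555" telephone prefixes, and the gender shares are all invented for the example, and a real RDD design draws from banks of numbers known to contain working households rather than from a short list of prefixes.

```python
import random
from collections import Counter

def rdd_sample(prefixes, k, rng):
    """Draw k phone numbers by attaching four random final digits
    to a randomly chosen area-code-plus-exchange prefix, so that
    listed and unlisted numbers are equally likely to come up."""
    return [f"{rng.choice(prefixes)}-{rng.randrange(10000):04d}" for _ in range(k)]

def poststratification_weights(respondent_groups, population_shares):
    """Weight each demographic group by (population share / sample share)."""
    counts = Counter(respondent_groups)
    n = len(respondent_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}

# Hypothetical prefixes; the seed only makes the draw repeatable
numbers = rdd_sample(["212-555", "313-555"], k=5, rng=random.Random(0))
print(numbers)

# Hypothetical sample that over-represents women: 60 women, 40 men
respondents = ["F"] * 60 + ["M"] * 40
# Hypothetical population benchmarks, standing in for Census estimates
weights = poststratification_weights(respondents, {"F": 0.52, "M": 0.48})
print(weights)
```

In this made-up sample, each woman counts for a little less than one interview and each man for a little more, so the weighted sample matches the benchmark. The same arithmetic applied to party identification is what traditionalists object to, because the "benchmark" there is itself a changeable attitude.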
I am probably guilty of oversimplifying. Pollsters have always disagreed about the specifics of some of these practices, and they have always adopted different standards. Still, from my perspective, these characteristics are the hallmarks of quality research for many of my colleagues -- especially those I see every year at the conferences of the American Association for Public Opinion Research (AAPOR; for more detail see their FAQ on random sampling).
The application of these principles has shifted slightly in recent years, even among traditionalists, in two ways: First, pollsters are no longer convinced that a low response rate means a skewed sample. As described in my column last week, pollsters have learned that some efforts to boost response rates can actually make results less accurate. Second, to combat the rapid declines in coverage posed by "cell phone only" households, many national media pollsters now also interview Americans via mobile phone, using supplemental samples of cell phone numbers to boost sample coverage back above 90%. But by and large, traditional pollsters still use the same standards to define "scientific" surveys as they did 20 or 30 years ago.
A new breed of pollsters has come to the fore, however, that routinely breaks some or all of these rules. None exemplifies the trend better than Scott Rasmussen and the surveys he publishes at RasmussenReports.com. Now I want to be clear: I single out Rasmussen Reports here not to condemn their methods but to make a point about the current state of "best practices" of the polling profession, especially as perceived by those who follow and depend on survey data.
When it comes to sampling and calling procedures, Rasmussen is consistent with the framework I describe in only one respect: They use a form of random-digit-dial sampling to select telephone numbers (although Rasmussen's methodology page says only that "calls are placed to randomly-selected phone numbers through a process that ensures appropriate geographic representation"). But in other ways, Rasmussen's methods differ: They use an automated, recorded voice methodology rather than live interviewers. They conduct most surveys in a single evening and never dial a selected number more than once. They routinely weight samples by party identification. They cannot interview respondents on their mobile phones (something not allowed via automated methods) and thus achieve a coverage rate well below 90%.
If you had described Rasmussen's methods to me at the dawn of my career, I probably would have dismissed them the way my friend Michael Traugott, a University of Michigan professor and former AAPOR president, did nine years ago. "Until there is more information about their methods and a longer track record to evaluate their results," he wrote, "we shouldn't confuse the work they do with scientific surveys, and it shouldn't be called polling."
But that was then. This year Traugott chaired an AAPOR committee that looked into the pre-election polling problems in New Hampshire and other presidential primary states in 2008. Their report concluded that use of "interactive voice response (IVR) techniques made no difference to the accuracy of estimates" in the primary polls. In other words, automated surveys, including Rasmussen's, were "about equally accurate" in the states they examined.
Consider also the analysis of Nate Silver. On his website Fivethirtyeight.com last year, he approached the issue of survey quality from the perspective of the accuracy of the results rather than their underlying methodology. He gathered past polling data from 171 contests for President, Governor and Senate fielded since 2000 and calculated accuracy scores for each pollster. His study rated Rasmussen as the third most accurate of 32 pollsters, just behind SurveyUSA, another automated pollster. When he compared eight daily tracking polls last fall, Rasmussen ranked first in Silver's accuracy ratings. He concluded that Rasmussen, "with its large sample size and high pollster rating -- would probably be the one I'd want with me on a desert island."
The point here is not to praise or condemn any particular approach to polling but to highlight the serious issues now confronting the profession. Put simply, at a time when pollsters are finding it harder to reach and interview a representative sample, the consumers of polling data do not perceive "quality" the same way that pollsters do. Moreover, the success of automated surveys in estimating election "horse race" results, and the ongoing transition in communications technology and the way Americans use it, have left many pollsters struggling to agree on best practices and questioning some of the orthodoxies of the profession.
The question for the rest of us, in this period of transition, remains the same: How do we know which polls to trust? I have two suggestions and will take those up in subsequent posts.
[Note: I will be participating in a panel at next week's Netroots Nation conference on "How to Get the Most Out of Polling." This post, and hopefully two to follow, are a preview of some of the thoughts I am hoping to share.]