Articles and Analysis


Daily Kos & Research 2000: A Troubling Story

Topics: Daily Kos, Del Ali, Fraud, Markos Moulitsas, Research 2000

The battle between Daily Kos and pollster Research 2000 went from ugly to surreal last week, as the website and its founder, Markos ("Kos") Moulitsas, filed suit and pollster Del Ali fired back with a lengthy, frequently rambling reply to TPM's Justin Elliott. The Washington Post's Greg Sargent points out that the coming lawsuit could provide an "unprecedented look at the inside of a professional polling operation." I would argue that it already has, although "professional" is not necessarily the adjective I would choose.

Consider what we have learned in just the past few days. From Elliott's reporting, for example, we know that the Olney, Maryland address listed on the Research 2000 website is a post office box and the company "does not appear as incorporated on the state business records database." Ali told Elliott "he incorporated with 'self-proprietorship' in 2000."

That profile fits the description of his business that Ali gave the Baltimore Daily Record in 2006 (h/t Harry Enten). The article described Research 2000 as having just "three part time employees" and being "quite a bit smaller" than Mason-Dixon Polling & Research Inc., where Ali had worked until starting his own business in late 1998. "The actual legwork" of his business, the article said, "is farmed out to professional call banks." Ali also claimed that his firm did considerable non-media work:

Ali said diversification - working with interest groups as well as media - is important for business, and that there is a misconception surrounding polling contracts with large news agencies. "We don't make a great deal of money," Ali said. "If I were to depend on making ends meet with media polls, then I'd be broke."

Running a mom-and-pop polling business is not incriminating, in and of itself. Many research companies, including my former business, are small shops that depend entirely on third-party call centers to conduct live telephone interviews. But it's important to consider the Research 2000 profile in terms of the sheer volume of work it claims it did for Daily Kos, especially given that Ali told me, as recently as four weeks ago, that his work for Kos and progressive PACs was "less than 15% of our overall business." If that is true, the volume of surveys that Research 2000 farmed out to call centers over the last two years was extraordinary. Several very large call centers would have been involved. Why have we heard nothing from them?

From Yahoo Politics' John Cook we learn that court records show Ali "has been sued numerous times in his home state of Maryland for nonpayment of debt and has been hit with several tax liens," including a $2,360 lien just two months ago. Cook also notes that Ali and his company were sued eight years ago for $5,692 for non-payment to "polling and research company" RT Nielson Company (now known as NSON). NSON's website confirms that the firm "specializes in telephone data collection" and provides these services "to many market and opinion research consulting companies."

Maryland Court records also show a judgment against Del Ali for $5,714.09 from a suit filed by Zogby International in 2001 (document obtained via search here). So we do have documentation to show that Research 2000 was doing business with research companies and survey call centers, albeit eight to ten years ago.

The Daily Kos complaint, published by Greg Sargent, provides new information on the financial side of the polling partnership from the Daily Kos perspective. Shortly after the 2008 elections, for example, Daily Kos entered into an agreement "reached orally" to conduct 150 polls for the website over the following year, including a weekly national survey and various statewide polls to be conducted "as requested." Kos agreed to make an initial payment in late 2008 and two "lump sum" payments in 2009. The complaint implies (though does not state explicitly) that the parties agreed to either a total amount or a set cost per poll (or both).

The complaint goes on to explain that Kos agreed to advance the second lump sum payment to May 2009 -- right around tax time -- in exchange for an additional 59 polls "to be performed free of charge." Ali requested the advance and offered the free polls in exchange, according to the filing, "claiming it would provide 'immense' help for cash flow reasons."

What the Daily Kos complaint omits is any discussion of the dollar amounts involved. Just how much did they pay for hundreds of thousands of interviews conducted over the last two years? What was the typical cost per interview (especially when we include those 59 free surveys)? The answers to those questions alone will tell us whether Research 2000 could have plausibly conducted live telephone interviewing on such a large scale. As both Patrick Ruffini and Nate Silver have speculated, Ali appears to have been charging absurdly low prices given the likely budget of a site like Daily Kos and the realities of the costs of farming out live interviewing.

Moreover, the financial arrangement described in the complaint -- pre-negotiated lump sums for hundreds of surveys with no written contract -- is also extraordinary. Telephone interviewing costs vary considerably depending on the number of interviews, the length of the questionnaire, the incidence of the target population (how many non-registrants or non-voters need to be screened out) and several other factors. I know of no call center that would agree to field a survey without an advance bid based on precise specification of all of these variables. Given this potential variability and the relatively low profit margins typically involved, pollsters, call centers and their clients are usually careful about nailing down the specifications in advance. The idea that a pollster would propose conducting 59 free polls as a means of obtaining, as Nate Silver puts it, a short-term loan with an "alarmingly high interest rate," is simply unheard of.

While the story told by the Kos filing was strange, the controversy grew even more surreal after Ali "lashed out" at Moulitsas and others in a rambling 1,100-word statement sent via email to TPM's Justin Elliott on Thursday. Ali claims in his statement that the Daily Kos complaint contains "many lies and fabrications," that "every charge against my company and myself are pure lies, plain and simple," and that Kos still owes him a "six figure payment."

Ali promises to "expose" the alleged mistruths "in litigation, not in the media" and says calls by the National Council of Public Polls (NCPP) and others to "just release the data and explain your methodology" indicate a bias toward Kos "and a disregard for the legal process."

Hardly. I am not a lawyer, but I find it difficult to believe that the release of exculpatory evidence now would in any way prejudice Ali's ultimate defense in court. If the surveys were genuine, then raw data files exist somewhere, at least for the most recent surveys. If the cross-tabulations published on Daily Kos are genuine, then statistical software exists somewhere that can replicate the tabulations published on Daily Kos -- including the strange matching odd-or-even pattern observed by Grebner, Weissman and Weissman. This is not the stuff of advanced statistical analysis: Either the data and processes exist and can be replicated, or they do not and cannot.

Moreover, if Research 2000 actually conducted the literally hundreds of thousands of live interviews behind the results published on Daily Kos since January 2009 (I count well over 200,000 reported for their national surveys and U.S. Senate surveys alone), a wealth of documentation and eyewitness testimony should be readily available that would be easily understood by mere statistical mortals: call center invoices, testimony from interviewers, supervisors and the employees who prepared cross-tabulations. That sort of evidence helped send a call center owner to jail in an unrelated Connecticut case in 2006. That sort of evidence could also help vindicate Ali and Research 2000 right now -- but only if it exists.

By far the most troubling part of Ali's response comes in these two sentences (left in their original form including typographical errors):

Regardless though. to you so-called polling experts, each sub grouping, gender, race, party ID, etc must equal the top line number or come pretty darn close. Yes we weight heavily and I will, using te margin of error adjust the top line and when adjusted under my discretion as both a pollster and social scientist, therefore all sub groups must be adjusted as well.

"Top line" in this context means the results for the full sample rather than a subgroup, but it still unclear exactly which "top line numbers" Ali is referring to. If he means the results of attitude questions -- vote preference horse-race numbers, favorable ratings, issue questions or possibly even the party identification question -- he comes close to admitting a practice that every pollster I know would consider deceptive and unethical. "Scientific" political surveys are supposed to provide objective measurements of attitudes and preferences. As such pollsters and social scientists never have the "discretion" to simply "adjust" the substantive results of their surveys, within the margin of error or otherwise. As a pollster friend put it in an email he sent me a few minutes after reading Ali's statement: "That's not polling. It's Jeanne Dixon polling."

Pollsters and social scientists do often adjust their top line demographic results, and some will weight on attitude measurements like party identification, to correct for non-response bias (though party weighting continues to be the subject of considerable debate in the industry). In either case, however, the adjustment needs to be grounded in prior empirical evidence -- U.S. census demographic estimates or, perhaps, previous surveys of the same population -- and not merely the whim of the researcher.
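To make that distinction concrete, here is a minimal post-stratification sketch in Python. All figures are invented for illustration; the point is that the weights come from an external benchmark (census gender shares, in this hypothetical) and rebalance the demographic mix of the sample without editing anyone's substantive answer.

```python
# Hypothetical post-stratification: the sample came back 60% women,
# but a census benchmark says the population is 51% women.
sample = {"women": 0.60, "men": 0.40}
census = {"women": 0.51, "men": 0.49}
weights = {g: census[g] / sample[g] for g in sample}

# Made-up approval figures by gender; weighting rebalances the mix,
# it never touches the answers themselves.
approval = {"women": 55.0, "men": 45.0}
unweighted = sum(sample[g] * approval[g] for g in sample)             # 51.0
weighted = sum(weights[g] * sample[g] * approval[g] for g in sample)  # 50.1

assert abs(unweighted - 51.0) < 1e-9
assert abs(weighted - 50.1) < 1e-9
```

Note that the weighted topline moves only because the demographic mix moved toward the benchmark; nothing here gives the researcher "discretion" to nudge the approval numbers directly.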

Because of the apparent lack of a written contract,** the Daily Kos complaint relies in part on the concept of an "implied warranty," the idea grounded in common law that transactions involve certain inherent understandings between a buyer and seller. Most reasonable people would agree that a political poll should be an objective measurement based on survey data that has been "adjusted" only as necessary to correct statistical bias. If Del Ali believes a pollster has the discretion to "adjust" results arbitrarily within the margin of error, he has been selling something very different than the rest of us have been (figuratively) buying.

Greg Sargent was right. The legal process of discovery, if this case gets that far, will provide truly full disclosure. But what we have learned so far is already very troubling.

[Typos corrected]

**Update (7/6): Markos Moulitsas emails to say that while there was no formal "boilerplate" contract, "we hashed out our agreement via email." To be clear, a legally binding contract between two parties does not require a written document.


Mark Grebner:

The argument the Weissmans and I made - that the published results could not have arisen from the proper analysis of properly conducted polling - seems established now by Del Ali's own words. If he "adjusts" his numbers "within the margin of error" based solely upon the limits of his "discretion", there is no surprise that we see patterns which could not arise stochastically.

The controversy may continue, but our work is done.



Assuming this pollster made things up out of whole cloth, he would be the second (perhaps there are more?) such fraud.

Just how prevalent is this? Who can we trust?

At the very least, can you provide a filter on your polls that plots only those pollsters that adhere to the full disclosure rules?


richard pollara:

Pollsters complaining about other pollsters not using scientific methods is like astrologers wringing their hands over the slipshod practices of palm readers. In primary after primary (New Hampshire being the most famous) polling operations failed to accurately predict outcomes. Whether it was their inability to predict turnout or to come up with a likely voter screen, the underlying assumptions were so flawed that a simple demographic overlay proved much more predictive. Polls can only be relied upon when the race is a gimme (two party general elections where party ID dominates). So if Research 2000 was making up the data, how different is it from the "swags" other firms made about likely voters and turnout? Maybe the one good thing that will come of this is a healthy skepticism of ALL polling. And maybe then NBC Nightly News won't lead with a story the night before the California Primary that Obama was ahead by 13 points based upon a poll by Zogby/Reuters (Zogby, February 3-4, 2008). Pollsters may employ the Scientific Method but it is a huge stretch to consider it a science.


Mark W:

richard pollara:

I'm confused, if you don't think polling has innate value, or it's not "scientific" enough for you, why are you trolling around on a site called "pollster.com"? Cherry-picking individual times when someone got a race wrong doesn't prove anything about anything.

That said, I think even you would have to agree that CALLING a random group of people and averaging their responses to a set of questions probably has a better chance of representing "public opinion" than just MAKING UP numbers out of thin air.



When Nate Silver came out recently with his new pollster ratings, which "pollster" screamed the loudest about Silver having missed a few polls or gotten some of the numbers a bit wrong in his database? None other than Del Ali.

Just as there is such a thing as scientific sample surveys, there is a role not only for standards and policing in the industry (maybe even licensing?) but also the kind of systematic assessment that Pollster.com and FiveThirtyEight.com try to do.

R2000 looks like the second sketchy polling operation, along with Strategic Vision, Inc., that has been exposed in the last year as a product of efforts to promote accountability and performance ratings of pollsters. Strategic Vision threatened Silver with a lawsuit. Nothing came of it. Del Ali's threat of a counter-suit against Daily Kos rings just as hollow.

I am guessing that Pollster, FiveThirtyEight, and all other "poll aggregators" are going to have to -- if they haven't done so already -- completely eliminate all polls conducted by these two dubious organizations from their historical databases.

But what's the next big step to take to protect the industry itself and its customers from shysters? I think the major media companies must take an active role in that process.

Public polls are, of course, just a fraction of all the professional polling that is done. I wonder what sort of standards and practices are operating to assure the quality of polling that is conducted for private industry (for marketing and advertising, as well as campaigning) and governmental organizations.


richard pollara:

Mark: I posted multiple times on this site in 2008 that there were systemic problems in polling primary voters and that pollsters needed to take a hard look at their underlying assumptions to see whether what they were doing was any better than just MAKING UP numbers out of thin air. ALL of the polling outfits seemed to be clueless when it came to identifying likely voters and turnout. Zogby was only the most egregious example (he missed California by 23 points!) and I singled him out because NBC breathlessly reported the death of the Clinton Campaign based upon the Zogby/Reuters election eve poll.

It seems disingenuous for the polling industry to be wringing their hands about Research 2000 when essentially they are doing the same thing in identifying likely voters and predicting turnout. Polling as a "science" seems to have come to a dead end. Maybe it is time for the public and the media to reevaluate the predictive value of all polling.



@Richard: while I favor a thoroughgoing review of polling, I seriously doubt polling as a science has or will ever come to a dead end.

The biggest poll ever taken in the United States is happening right now: the census. In most countries there is no alternative to a periodic census. (In some, the use of population registries has all but made censuses superfluous. That's never going to happen in the U.S. -- well, never in my lifetime.) A census is not the same as a survey sample, for a variety of reasons. (And the use of sampling in the data collection process was "forbidden" by a U.S. Supreme Court decision in 1999.) But between censuses, our own Census Bureau conducts numerous sample surveys (in particular, the monthly Current Population Survey) for good public purposes. This isn't going to stop.

If your comment is aimed only at the "predictive" aspect of election polling, as opposed to the use of sample surveys for other purposes, then you may have a point. But even then, polls aren't going to stop being used for that purpose. So I would focus on standards and practices, and some kind of licensing and policing mechanism, to protect consumers against fraudulent and shoddy work.


Matt Sheldon:

@Mark Grebner -

You are moving the goalposts. Two of your three "anomalies" are demonstrably false.

The even/odd pattern is easily explained had you thought it through.

1. Round the topline to the nearest integer
2. Since the gender split is essentially 50/50, you ensure that the average of Male and Female equals the topline
3. The even/odd pattern will ALWAYS appear

Another of your anomalies is explained by simple demographic weighting. You assumed random sampling, when ALMOST NOBODY does truly random sampling.

A combination of stratified sampling and weighting is par for the course.

You apparently did not know that.

Your final "anomaly" rests upon Del Ali nudging the topline by a point on a dozen occasions.

What you fail to consider is that rounding off to integers can erase underlying movement in a tracking poll.

It may be more accurate to nudge one of the numbers to show a 1 point move if the unrounded data shows a .8% move.

This is not a crime, especially in a media tracking poll.

Numerous pollsters take far greater liberties in choosing the sampling and weighting frames than a simple nudge of a topline that actually enhances poll accuracy.

You claimed fraud in your initial analysis. You claimed this was resounding proof that the interviews were not completed.

There is zero credible evidence of that.

The only evidence we do have is that R2K was a cut-rate pollster who probably employed panels instead of fresh lists.

Using panels is not a crime either, and it dramatically reduces the cost of research.

It will lead to a more partisan database over time, as the act of polling someone continuously can change their behavior going forward.

Nate Silver is moving the goalposts too. His analysis was obliterated on this site. Completely obliterated....

It revealed his general lack of experience in conducting polls.

R2K is a fly-by-night pollster, we know that. That is EXACTLY why DailyKos hired him.

This is a big surprise? Markos is just as sleazy, if not more.


Mark Grebner:

@Matt Sheldon: I will only respond to a few points.

You propose a mechanism by which the parity (odd-even status) of reported percentages for male and female voters might have become linked. We considered exactly such a mechanism in the course of writing our paper, but unlike you, we regard such a method as improper. If you believe the correct way of determining the percentage of women who are favorable to the Democratic Party (for example) is to first calculate the percentage for all voters, then round it, then project that percentage back to women, there's nothing anybody can say. That's not what Ali claimed he did. It would generally misreport the percentage found in the actual survey. It makes no procedural sense whatever. And, as we say in our paper, it would not represent a proper tabulation of actual polling. But I don't see any point in arguing about it.

Your further claim that the other anomalies are "explained by simple demographic weightings" is incomprehensible to me. Why would the percentage of people who support (say) Nancy Pelosi usually rise by exactly one percentage point from one week to the next, or decline by exactly one percentage point, but almost never remain unchanged from week to week? You say that's the result of "weighting"?

Our third test, which involved constructing a hybrid variable that should have shown a huge amount of statistical "noise", found virtually NO noise. Our constructed variable should have shown a variance of roughly 160 squared-percent from one week to the next, but instead we found only about 1 squared-percent. In other words, the comparison we looked at should have jumped by an average of over 12% from one week to the next, based purely on fluctuation caused by sampling, but instead moved only 1%, generally in smooth steps. If you believe that damping was the result of "weighting", you have a very different understanding of the term than I do.
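The scale of sampling noise Grebner describes is easy to simulate. The sketch below is my own illustration, not the authors' code, and uses a smaller hypothetical sample (n = 400 per week) than their constructed variable: fresh weekly samples are drawn from a fixed population, and the week-to-week changes show the variance that sampling theory predicts. Genuine repeated polling cannot show materially less variance than this without some post hoc smoothing.

```python
import random

def weekly_estimates(true_p=0.5, n=400, weeks=500, seed=2):
    """Simulate a percentage measured in a fresh sample of n respondents each week."""
    rng = random.Random(seed)
    return [100.0 * sum(rng.random() < true_p for _ in range(n)) / n
            for _ in range(weeks)]

ests = weekly_estimates()
diffs = [b - a for a, b in zip(ests, ests[1:])]
var = sum(d * d for d in diffs) / len(diffs)

# Theory: Var(week-to-week change) = 2 * p * (1 - p) * 100**2 / n
# = 2 * 0.25 * 10000 / 400 = 12.5 squared-percent for these settings.
assert 6.0 < var < 20.0
```

With Grebner's smaller effective subsamples the theoretical variance is far larger still, which is what makes an observed value near 1 squared-percent so implausible as raw survey output.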

Del Ali's comment on TPM that he felt free to "adjust" the published data in a post hoc manner would explain everything very neatly.

I hope you won't be offended if I fail to respond to subsequent postings.



A note on weighting. As far as I know, weighting is always done on the basis of respondent characteristics (age, education, region, gender, sometimes party ID, etc.), to bring the distributions of those characteristics among the survey respondents into line with some kind of external standard, typically census proportions.

Weighting is not done directly on toplines, nor does it touch the main "dependent" variables of interest (e.g., vote intention). But the toplines on such variables will change when weighting shifts the proportions of respondents who fall into different combinations of demographic characteristics (age, gender, region, etc.).

What Mr. Ali says he did was to change the toplines directly ("within the margin of error"), while presumably not changing the underlying weights (e.g., always keeping the male-female division at 50-50). This required him to change the response proportions for all subcategories of the respondents. As he said, things need to add up. But he apparently went about things in a completely bassackwards way.

What he said he did is wholly unconventional, and creates suspicion about both the toplines and the reported responses for subcategories of the respondents. The fact that someone can "explain" the even-odd number observation by referring to an unconventional manipulation of the data by the survey organization doesn't make it right! In fact, that even-odd numbers problem proves the toplines were manipulated.

The question remains, "How much manipulation was there?" From my own experience conducting and analyzing dozens of surveys, it's a shock to think that the data were violated at all in this way. If a researcher wants to reweight the sample after a survey is completed and initial weights were calculated, that's sometimes done (e.g., when fresh population data come from the census bureau, or when a survey is part of panel study and the researcher reweights to adjust for attrition in the panel). But to change the responses or summaries based on them directly? That is way beyond the pale of accepted practice.


Mark Grebner:

@hobetoo: Your understanding of the word "weighting" is exactly in accord with mine. It's not a general excuse for tampering with data, but an algebraic step taken to correct specific problems.

I recently conducted a small poll of Republican primary voters in Michigan, asking who they intended to support for governor. After the data was collected, I discovered that either due to sampling fluctuation, or differential response rate, I had more respondents from western Michigan ("Grand Rapids") than seemed reasonable, based on the tallies from recent statewide primary elections. Since one of the candidates is Dutch and represents the Lake Michigan shoreline in Congress, this excessive representation of western Michigan had the effect of overstating his support in my poll. I calculated a set of weights, applied them to all the data I had collected, and reported the results both before and after re-weighting.

The client could see immediately what I had done, why I had done it, and what effect it had. That - and not arbitrary tampering - is what the polling industry generally means when they refer to "weighting".


It's funny that neither Markos nor Silver has dared to call into question the gross weighting anomalies in the 2004 and 2008 National Exit Polls.

In order to match the 2008 recorded vote, the NEP indicated that returning Bush voters comprised 46% of the electorate, compared to 37% for returning Kerry voters. In other words, there had to be 12 million more returning Bush voters than Kerry voters - a mathematically impossible result considering voter mortality and turnout. Bush won the "official" recorded vote by just 3.0 million.

The NEP indicated that 4% (5.3 million) were returning 2004 third-party voters. But only 1.2 million were recorded.

In 2004, the Final NEP Bush/Gore returning voter mix was 43/37%, another mathematically impossible result (it indicated that there were 52.6 million returning Bush voters). But Bush only had 50.5 million recorded votes in 2000. Approximately 2.5 million died and 2 million did not vote. Therefore there could have been at most 46 million returning Bush 2000 voters in 2004. Where did the 6.6 million (52.6 - 46) phantom returning Bush voters come from? Gore won the recorded vote by 540,000 in 2000.



Mark Grebner:

@Richard: You're making an unwarranted assumption: that almost everyone correctly remembers and reports their previous votes. Within the polling industry, NOBODY believes that. Because of the secret ballot, it's impossible to point to specific errors in recall; we can only say that the aggregate is way off.

But we have really good statistics on a related question. When the government researchers ask about voting behavior in the Current Population Survey, about two weeks after each even-year November election, roughly 10% of the subjects falsely report having just voted.

Given the short period of time and the objective, confirmable nature of the question (we ask, and then we go and check), we get a lot of bad answers.

Asking who somebody voted for 2 or 4 years earlier - which clearly can't be checked - is bound to produce even worse accuracy. And it does.


Matt Sheldon:

@Mark Grebner:

You completely misconstrued my point on the gender pattern--which ought to give pause to anyone considering your analysis.

Let me restate it with a simple example so that you will not become confused.

Let's assume it is Obama's Approval Rating.

1. We round off the topline to the nearest integer. Let's say we round 52.6% to 53%

2. Now, round the number for males to the nearest integer as well. Let's say it is 46.6% rounded to 47%.

3. Now, what is the number that would be required for women, such that when averaged with 47% is equal to the topline number of 53%?

Well? It is 59%. (47%+59%)/2 =53%

Strange thing, they are both odd numbers!

Given that the weights are approximately 50/50 in the dataset, then this number cannot be terribly off, and will be rounded under any circumstance.

Your desire for such precision in the data is ludicrous considering we are rounding off to integers in the first place!

To further illustrate, what if male approval had rounded to 46% instead of 47%?

To get the numbers to foot, the rounded female number would be...60%!

Again, given the rough 50/50 weighting, this number is not made up. It is mathematically derived. IT MUST BE VERY CLOSE TO 60%.

So...53%= (46%+60%)/2

Strange thing, they are both even numbers!
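Sheldon's mechanism can be checked in code. In the sketch below (illustrative figures only, not R2K data), the female number is derived as 2T - M rather than tabulated independently; since 2T is always even, the derived female figure is forced to share the male figure's parity on every single draw.

```python
import random

def derived_pair(topline, male):
    """Round topline T and male M, then force female F = 2T - M (50/50 split)."""
    t, m = round(topline), round(male)
    return m, 2 * t - m  # (m + f) / 2 reproduces t exactly

random.seed(0)
parity_matches = 0
for _ in range(1000):
    t = random.uniform(40.0, 60.0)          # hypothetical unrounded topline
    m = t + random.uniform(-8.0, 8.0)       # hypothetical unrounded male figure
    male, female = derived_pair(t, m)
    parity_matches += (male % 2 == female % 2)

# f = 2t - m, and 2t is even, so f and m share parity on all 1000 draws.
assert parity_matches == 1000
```

Whether this derivation is a legitimate presentation choice or a misreporting of the actual tabulated subgroups is exactly the point in dispute between Sheldon and Grebner below.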

Remember, this is a rounding procedure and there is no "right" answer when rounding crosstabs.

Any rounding method produces its own errors, so the producer of the crosstabs has the discretion to determine which types of errors they are most comfortable with.

This is the point that you completely do not seem to understand. Every time this point is made in the comments on various blogs, you still do not get it.

Rounding creates its own errors.

There is no set rule that says how you have to round the numbers. There are numerous rounding rules that can be applied to a set of crosstabs.

I may choose that I want all crosstabs to total to 100%. That is a choice.

I may prioritize preserving the "movement" or change in the data ahead of the absolute percentages themselves.

You claim it is more accurate to ERASE movements of up to 1% with standard rounding procedures, rather than reflect them some other way.

You are proposing a set of crosstabs that claim ZERO movement, when in fact there is movement.

Your way is no better.

That is why there is some discretion in these cases. It does not destroy the data because we are talking about changes of a point or less.

This is, of course, well within any margin of error.

You guys are now grasping at straws because your original analysis comes nowhere near proving fraud...

...not even close.

It was amateurish to ignore the possibility of weights. Likewise, it is amateurish to mischaracterize how the rounding could be done.

Your original report does not even consider the fact that there is a third option ("Don't Know"), nor does it point out the fact that all crosstabs sum to 100%.

This simple fact ENSURES that a custom rounding algorithm was being used, and there is nothing wrong with it. It does not heavily distort the data in any way and adheres to several simple rules in expressing the data.

Your analysis represents the trouble that can happen when statisticians look at data without understanding the context in which it is created. You expect randomness ALL THE TIME, yet your data is rounded to the nearest integer and always sums to 100%.

You should have known at that point that it is no longer random, yet you missed it entirely.



The more you find out, the more you know that either Markos ("Kos") Moulitsas was a total moron or he was complicit in this scandal. Everyone told him those polls were garbage but he stayed with them because they presented the propaganda he wanted to spread.

It sounds like the Del Ali guy might be an unemployed college drop-out working in the basement of his parents' house, but Moulitsas wasn't smart enough to know that?

Moulitsas was delighted with the polls until they were openly questioned. Now he is playing the poor victim. No one is buying that.


Mark W:

@Matt Sheldon:

"1. We round off the topline to the nearest integer. Let's say we round 52.6% to 53%

2. Now, round the number for males to the nearest integer as well. Let's say it is 46.6% rounded to 47%.

3. Now, what is the number that would be required for women, such that when averaged with 47% is equal to the topline number of 53%?"

But I'm confused, why do you need to figure out what number is "required" to make your topline accurate? Why not just take the actual numbers which you already have? Here's another formulation (where gender split is 50/50):

1. Let's say Obama's overall approval rating is 51.5%. Round it up to 52%.

2. Approval rating in males is 46.4%, rounds to 46%.

3. Approval rating in females is 56.6%, which rounds to 57%

(46.4 + 56.6)/2 = 103/2 = 51.5 → 52%
(46 + 57)/2 = 103/2 = 51.5 → 52%

There's your odd-even pair, with no magic required. I'm genuinely not understanding your point.



Why do they keep targeting a liberal pollster who probably polled all Americans? At least Research 2000 showed that they overpolled Democrats. Rasmussen comes out with these ridiculous polls and doesn't even account for its information. Fair is fair. Polls are supposed to be scientific and account for their information. Rasmussen has really gone off the deep end. Here is the poll I don't believe: the GOP is trusted more on education? That was never true.


Mark W:

@Matt Sheldon:

As an addendum to my last post, just to be clear, I'm not trying to fight about this; I just don't think you can say that rounding the topline to an integer demands that you go back and round the numbers in your gender breakdowns to integers. Even if you DID feel it was necessary, rounding consistently would allow for several odd-even pairs. I've already shown one situation where applying round-half-up across the board gives an odd-even pair, and even if you rounded half-to-even, the same numbers can be used to get one (51.5 still goes to 52). If you round ALL decimals down to the lower bound, consider the following:
Topline approval rating: 51.5% → 51%
Male approval rating: 57.1% → 57%
Female approval rating: 46.4% → 46%
(46 + 57)/2 = 103/2 = 51.5 → 51%

The only way I see to create an odd-odd or even-even pair here, given the starting values, would be to arbitrarily change one of the male/female approval ratings to some other integer value (say 56 for females), just to have an even division for the topline number, which would A) be inconsistent with the rest of the rounding, and B) be totally unnecessary (as far as I can tell).
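The round-everything-down case sketches the same way (same hypothetical values as in the comment above; `math.floor` plays the role of rounding to the lower bound):

```python
import math

# Hypothetical values from the comment above.
male, female, topline = 57.1, 46.4, 51.5

print(math.floor(topline))   # 51
print(math.floor(male))      # 57 (odd)
print(math.floor(female))    # 46 (even)
```

Again the subgroup pair comes out odd-even with a single, consistent rounding rule.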



What seems to get lost here is that polling is an art form and not a science. All polls require the statistician to weight the results by choosing an outside standard to compare against the actual poll results.

In the example posted above by Grebner, he notes that he believed the poll over-represented West Michigan and under-represented the other parts of the state, so he made a conscious decision to adjust the numbers closer to where he thought they should be, based upon an outside standard. But there is no guarantee that the outside standard is appropriate, or that it will have predictive value over future actions such as voting behavior. The turnout for the last election by demographic won't match the turnout for the present election.

When you purchase a poll, you are purchasing an estimate of the values for the questions you ask, generally within a range of plus or minus 3%. It would be hard to demonstrate 'fraud' unless you can prove that the methodology places the values consistently outside the margin of error, or prove definitively that the numbers were made up. That is the reason for the margin of error: to account for polling errors. It is also the reason that sites like this one and RealClearPolitics use poll aggregates for their predictions, because no pollster is likely to hit the mark on accuracy and precision in each and every election. The pollsters who are closest to the target in this election may not be closest the next time. The same is true for simultaneous elections: the pollster closest to the mark in one state may be the furthest away in the next, because all polling is subjective. Your disagreement with R2K's polling and tabulation methodology doesn't mean the polling is fraudulent.

I think what you've really proven beyond a shadow of a doubt with this analysis is that there is a pretty good reason why most polling firms don't release their crosstabs as Kos does.

Polls are ballpark figures, and there seems to be some confusion in the matter between the notions of accuracy and precision. Just because R2K's polling methodology may not be as precise as you'd like doesn't mean that it isn't accurate within the margin of error.

It's also, I think, illustrative of how statisticians take themselves and their science a little too seriously. If you sample 1,200 M&Ms out of a vat of 16 million, odds are you'll come up with a good estimate, within a set margin of error, of the color distribution of the M&Ms in that vat. But sampling humans is much more complex: how you ask a question, or even the vocal inflection of the person asking, can change the result. The tight tolerances of statistical sampling don't really apply, but we pretend they do because it makes following politics fun. Polls can have predictive value for trends and general results, but there is a reason that most races within 4% in the polls are called toss-ups; if you're trying to use a poll to get any closer to your result than that, you're a fool.

We've all been confronted with two different polls measuring the same race which come up 8 points apart when both polls have a margin of error of 3%. They can't both be right, because their margins of error are mutually exclusive. Does that mean that one pollster committed fraud, or simply made a mistake?




I think the argument against R2000 is not that they were 'wrong' or inaccurate (although going by the 538 ratings they were relatively poor); it was that their samples were not random, even after allowing for weighting and rounding. This could occur either by not doing the polling in the first place or by arbitrarily adjusting the results afterwards (this second possibility seems to be confirmed by Ali's e-mail).



Well, I was looking last night at a good post by Mark Blumenthal, the author of this thread, basically chastising 538 for their rankings being arbitrary and subjective too.

I understand the analysis the statisticians provided, that in one section of the crosstabs they found a non-random pattern which couldn't result from random polling, but my concern is that from that they extrapolated out a 'fraud' claim without from what I've seen seeking out explanations for the pattern which didn't involve fraud.

I'm not the only one who sees an explanation for the pattern in simple rounding error as opposed to fraud.

It could be that R2K is indeed fraudulent, but it's just as possible that they aren't, and I hate to see people on the left engaging in these types of witch hunts, even going to the extent of finding out how many 'tax liens' the guy has and publishing such information in blog posts to embarrass him. Perhaps we'll soon find out whether Mr. Ali has granite countertops or not.


Mark Grebner:

@kwAwk: "... my concern is that from that they extrapolated out a 'fraud' claim without from what I've seen seeking out explanations for the pattern which didn't involve fraud."

Trust me, we sought and we sought. Both directly and through Markos, we repeatedly asked Del to provide any explanation he could. We basically delayed publication by a week, first in response to Del's promise to provide raw data, and then in order to give him time to respond to our calculations. He was furnished the complete text 36 hours before we published, with yet another request to respond. If an innocent explanation exists for the patterns we found, either Del doesn't know it, or he's not willing to tell us.



MG -- I appreciate that you guys want to get to the bottom of this, and I appreciate that if Markos feels like he got ripped off or didn't get his money's worth, he wants to take it to the guy and try to recoup his losses. But I just kind of feel like this might have been something better handled in private, with a little more respect for the notion that this man is a small business owner who could lose the livelihood he's spent ten years building based upon these accusations, even if they're not true.

If the guy is guilty of fraud, then prove it first but otherwise it should be addressed as 'concerns with the methodology' or 'questions about anomalies in the results'.

But that is just my opinion.


An Open Letter to Nate Silver

Richard Charnin (TruthIsAll)

July 10, 2010

Nate, since your recent hiring by the NY Times, the R2K flap and your exchanges with Zogby, you have been getting lots of publicity from blogs such as Vanity Fair and motherjones.com. Your characterization of Zogby's expertise says more about you than it does about him. Zogby correctly projected the True Vote in the 2000 (yes, Gore won Florida, despite what the NY Times said), 2004 and 2008 elections, yet you fail to give him credit. In fact, you rank him at the bottom – because you believe the recorded vote is sacrosanct. More on the True Vote vs. the recorded vote later.

As an Internet blogger who has been posting pre-election and exit poll analyses to prove election fraud since 2004, I have occasionally looked at your postings on fivethirtyeight.com. I will say right here that unlike the bloggers and mainstream media (MSNBC, the NY Times, etc.) who extol your forecasting “expertise”, I do not believe you are quite the polling guru that they claim you are.

I say this as one who has been building quantitative models since 1965 for defense/aerospace manufacturers and Wall Street investment banks, and who has consulted for many financial and corporate enterprises. I have three degrees in Mathematics, including an MS in Applied Mathematics and an MS in Operations Research.

Your 2008 simulation model win probabilities did not sync with the projected vote shares. The major flaw in your model was to conflate it with your pollster rankings, an ill-conceived methodology. The first rule of model building is KISS (keep it simple stupid). You not only introduced an extraneous variable into your model, but the rankings were incorrect – a double whammy. Now, what do I mean by this, you ask?

You fail to distinguish the True Vote from the Recorded vote by ignoring vote miscounts. The premise on which your models are based (that fraud does not exist) is incorrect from the get-go. In your ranking system, pollsters who come close to the recorded vote (i.e. Rasmussen in 2004) are ranked high, but pollsters who come close to the True Vote (i.e. Zogby) are ranked low. The fact that Zogby is ranked at the bottom is a clear indictment of your approach. Ranking pollsters based on their performance against the recorded vote is a waste of time. Fortunately for you, your fans are unaware of the distinction between the recorded vote and the True Vote. In fact, most are unaware of the extent to which their votes have been compromised by fraud. In your models, election fraud is never a factor.
This is the simple, yet fundamental equation that you seem to be blissfully unaware of: Recorded Vote = True Vote + Fraud.

Since you rank pollsters based on how close their polls match the recorded vote, I assume that exit pollsters Edison-Mitofsky are ranked at the top, since their final state and national exit polls always seem to match the recorded vote. So why don’t they release the unadjusted exit polls as well? These may actually reflect the True Vote. As a Polling Quant, you should be interested in the statistical rationale for the matching.

Check with your new employer, the Grey Lady. The NYT is an important part of the National Exit Pool, the consortium that sponsors the exit polls. The NEP also includes the Washington Post, ABC, CNN, AP and Fox News. That’s plenty of MSM polling power. Ask why they expect transparency from R2K but won’t release the raw, unadjusted precinct exit polls from 2000, 2004 or 2008. That information would be very useful. It might indicate which exit poll precincts show discrepancies to the recorded vote that are virtually impossible mathematically.

What are your thoughts about the 2010 primaries in MA, AR, SC and AL? Does the fact that Coakley won the hand-counts in MA indicate something to you? Does the fact that 40 of 42 SC precincts that favored Halter were closed down indicate something? Or how about the unknown, non-campaigner Greene winning in SC by 59-41% but losing the absentees by 84-16%? The DINOS on the state election commission refused to consider the recommendations of computer scientists to investigate the voting machines that were obviously rigged. In AL on June 8, the attorney general issued an opinion that an automatic recount does not apply in a primary election. Knowing all of this, will you be factoring fraud into your 2010 projections – along with turnout and final polling?

Do you want further confirmation that Kerry won in a landslide? As an "expert" analyst, you should have taken a close look at the 2004 National Exit Poll. If you had, you would have seen that the Final NEP, as is always the case, was forced to match the recorded vote by adjusting the number of returning 2000 voters to an impossible level – as well as the vote shares. According to the NEP, 43% (52.6 million) of 2004 voters were returning Bush 2000 voters. But this was impossible. Bush only had 50.46 million recorded votes. Based on voter mortality tables, 2.5 million Bush 2000 voters died prior to the 2004 election. Therefore at most only 48 million returning Bush voters could have voted in 2004. But if an estimated 98% turned out, 47 million voted. Therefore, the number of returning Bush voters was inflated by at least 5 million. Kerry won the election by 10 million votes. You are welcome to try and refute the True Vote Model.

Do you want to see proof that Obama won by nearly 22 million votes and not by the recorded 9.5 million? As an “expert” analyst, you should have taken a close look at the 2008 National Exit Poll. If you had, you would have seen that the Final NEP, as is always the case, was forced to match the recorded vote by adjusting the number of returning 2004 voters to an impossible level. According to the NEP, 46% (60 million) of 2008 voters were returning Bush 2004 voters and 37% were returning Kerry voters. That means there were 12 million more returning Bush voters than Kerry voters – and that’s assuming the myth perpetuated by the mainstream media (who you are now going to work for) that Bush won by 3 million votes in 2004. Do you believe it? How could that be?

But it’s much worse than that. If Kerry won by 10 million votes as the True Vote Model indicates (you are welcome to try and refute it) then there were approximately 10 million more returning Kerry voters than Bush voters. Assuming the same NEP vote shares that were used to match the recorded vote, Obama wins by 22 million votes, not by the 9.5 million recorded.

The 2008 NEP indicated that 4% (5 million) of the electorate consisted of returning third-party voters. That was clearly impossible; only 1.2 million third-party votes were recorded in 2004. In their zeal to match the recorded vote, the exit pollsters had to create millions of phantom Bush and third-party voters.

In the eleven presidential elections from 1968 to 2008, the Republicans won the popular vote by 49-45%, (6% went to third parties). But the Democrats won the True Vote by 49-45%.

It’s all in my book: Proving Election Fraud: Phantom Voters, Uncounted Votes, and the National Exit Poll.

I was the first election analyst to use Monte Carlo simulation, in the 2004 Election Model and again in the 2008 Election Model. I also applied extensive exit poll analysis in developing the corresponding post-election True Vote Model. It proved that not only were the 2000 and 2004 elections stolen; it is very likely that 1968 and 1988 were also. There were at least 6 million uncounted votes in 1968 and 11 million in 1988 – and the majority were Democratic (minority) votes.

The Edison-Mitofsky 2004 Evaluation Report provides the results (WPE) of 238 state presidential election exit polls from 1988-2004. Out of the 66 that exceeded the 3% margin of error, 65 favored the Republican. Was it due to reluctant Bush responders? No, it was due to uncounted Democratic votes and phantom Bush voters.

The Final 2004 Election Model Projection (Monte Carlo simulation) projected a Kerry win: a 51.3% share and 337 electoral votes. This closely matched the unadjusted aggregate state exit polls (52%) and the 12:22am National Exit Poll (51.2%). The True Vote Model indicated that Kerry had a 53.2% share. Of course Bush won by a bogus 50.7-48.3% recorded vote margin. How did your projections pan out?

In the 2006 midterms, the pre-election Trend Model (based on 120 Generic polls) projected a 56.43% share for the Democrats. The unadjusted National Exit Poll indicated a nearly identical 56.37%. The Final National Exit Poll was forced to match the 52% recorded vote. Nate, which one do you believe was correct? You are aware of documented miscounts in 15-20 congressional elections, virtually all favoring the GOP (see FL-13, FL-24, OH-1, etc.). How did your projections pan out?

The Final 2008 Election Model Projection (Monte Carlo simulation) exactly matched Obama's 365 electoral votes and was within 0.2% (53.1%) of his 52.9% share. But it was wrong. Obama did much better than that. The final state pre-election likely voter (LV) polls did not fully capture the late shift to Obama. Had they been registered voter (RV) polls, adjusted for undecided voters, Obama would have had a 57% share. He had 57% and 420 EV in the True Vote Model. As shown below, the final Gallup RV tracking poll gave Obama a 53-40% margin. After allocating undecided voters, he had 57% - matching the True Vote Model. How did your projections pan out?

So what does it all mean?

It means that any and all polling analysis that fails to consider voter mortality, uncounted votes and a feasible voter turnout is doomed to produce the wrong result. The correct result is the True Vote based on total votes cast. The wrong result is the recorded vote that ignores uncounted votes but includes phantom voters.

It means that the recorded vote, the basis for your rankings, never reflects the True Vote!

It exposes your ranking system, which places John Zogby (the only pollster to predict the True Vote in the last three presidential elections) at the bottom of a list of scores of obscure pollsters, as being fatally flawed.

It means that your comments disparaging exit polls, along with your failure to do post-election True Vote analyses, indicate that you are in sync with a moribund mainstream media that perpetuates endemic Election Fraud by withholding raw exit poll data. They accept the recorded vote as Gospel - just as you do in your rankings. You will fit in very well at the NY Times.

When will you incorporate the True Vote into your analysis? Why do you ignore the fact that the mainstream media (i.e. the National Election Pool, which includes the NY Times) is responsible for the impossible adjustments (made by the exit pollsters they employ) to the final 2004, 2006, 2008 state and national exit polls? They had to match the polls to corrupted recorded vote counts, come hell or high water - and will surely do so again in 2010.

You have questioned the R2K Democratic share of the 18-29 age group exceeding the 30-44 group in 20 of 20 races.

Table 1 shows the probabilities for all the age groups.
There was a 33% probability that the Dems would do better in the 18-29 group than the 30-44 group in all 20 races given the average two-party shares. The comparable probabilities were 77% for 45-59 and nearly 100% for 60+.

You have also questioned the apparent lack of volatility in the 2008 R2K tracking polls.

Table 2 displays R2K daily statistics.
The margin of error is 1.96 times the standard deviation (a measure of volatility) at the 95% confidence level.
The standard deviation of Obama’s daily poll shares was 1.83%. It was 1.59% for the 3-day moving average.

Table 3 is a comparison of Gallup vs. R2K.
Gallup was a registered voter (RV) poll. R2K was a likely voter (LV) poll.
The average shares and volatilities (standard deviation) closely match.

There was a strong 0.70 correlation between Obama’s Gallup and R2K shares.
There was a good 0.50 correlation between McCain’s Gallup and R2K shares.

         Gallup          Gallup Change    R2K             R2K Change
         Obama  McCain   Obama   McCain   Obama  McCain   Obama   McCain
Avg      49.65  42.90    0.15    -0.15    50.29  42.21    0.06    -0.02
Stdev     2.02   1.74    0.94     0.89     1.59   1.86    0.70     0.73

Table 4 compares the R2K tracking poll to other polls (including standard, non-tracking polls).
Projections are based on the allocation of undecided voters (UVA).

1) 75% of the undecided vote is allocated to Obama, the de-facto challenger.
2) third parties have 1.5% (the actual recorded share).

The final Gallup projection (57.1%) for Obama is a close match to the True Vote Model (57.5%).
Obama's projected shares:
Gallup: 53 + 0.75 * 5.5 = 53 + 4.13 = 57.1%
R2K: 51 + 0.75 * 3.5 = 51 + 2.63 = 53.6%
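The undecided-voter allocation is simple arithmetic; a minimal sketch (the 75% allocation rule and the final Gallup and R2K shares are the ones quoted in the letter):

```python
def project(share, undecided, alloc=0.75):
    """Add a fixed fraction of the undecided vote to a candidate's share."""
    return share + alloc * undecided

print(round(project(53, 5.5), 1))   # 57.1  (Gallup projection for Obama)
print(round(project(51, 3.5), 1))   # 53.6  (R2K projection for Obama)
```

The two projections differ only because of the polls' final shares and undecided pools; the allocation rule itself is identical.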



To Mark Grebner:

Your "false recall" argument is bogus. There is no evidence that voters forget. On the contrary, based on average ANES srrvey responses from 1968 using the True Vote as a baaeline, they recalled quite clearly. You fall into the same old trap in assuming that the recorded vote is sacrosanct. The True Vote Model indicates that the Dems won the True Vote from 1968 by 49-45% based on total votes cast. The GOP won the recorded vote by 49-45%.
There were over 80 million uncounted votes since 1968. Look it up.

Click the three links in which false recall is debunked - and there is much else at this site.

I have been doing this stuff since 2004 and have two MS degrees in Applied Math.

