Do we expect too much from exit polls? And can we rely on the final weighted data to fulfill their most important mission: to help us better understand who voted, how they voted and why? Those are questions I find myself asking while pondering the numbers I watched as the returns came in Tuesday night.
About a month ago I received an e-mail from a political pundit whose name you would probably recognize. He wrote to vent a bit about the way some of us obsess over election night exit poll numbers:
"I grew to dislike and be dismissive of exit polls after the 1992 election and find little value in predicting outcomes (which as you have written was not their original intent). They have become crack cocaine for political junkies looking to score on Election Day. Maybe it's a sign of getting older but I am content to wait for actual returns.... Exit polls should go back to their original purpose, explaining who did what and why, rather than trying to forecast what will be widely known anyway in just a few hours."
He suggested an exercise: Compile a list of the leaked mid- to late-afternoon exit polls and compare those to the actual result. Using numbers that leaked out this week and on Super Tuesday, I did just that.
The table below shows exit poll numbers and election results for 20 states that held primaries or caucuses on Feb. 5 or March 4. The first three columns show the Clinton margin (Clinton's percentage minus Obama's percentage). The first column shows the earliest leaked results, including late-afternoon numbers from East Coast states and "first call" morning interviews from Western states. The second column shows the results extrapolated from official tabulations posted on network Web sites as the polls closed. The third column shows the actual vote results.
The last two columns in the table show the error in the exit poll estimates -- first, those from the late afternoon, and second, those from official tabulations posted online as the polls closed. A positive number is an error that overstates Obama's support (or, put another way, denotes a shift to Clinton between the estimate and the final result); a negative number is an error that overstates Clinton's support (or marks a shift in Obama's direction between the poll and the final result).
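For readers who want to check the arithmetic, the sign convention can be sketched in a few lines of Python. This is only an illustration: the function names and the percentages in the example are hypothetical, not figures from the table.

```python
def clinton_margin(clinton_pct, obama_pct):
    """Clinton margin as defined above: Clinton's percentage minus Obama's."""
    return clinton_pct - obama_pct


def margin_error(estimated_margin, actual_margin):
    """Signed error on the margin.

    Positive: the estimate overstated Obama's support (the final result
    shifted toward Clinton). Negative: the estimate overstated Clinton's.
    """
    return actual_margin - estimated_margin


# Hypothetical state (illustrative numbers only, not from the table):
# the exit poll estimate has Obama ahead 50-48, the final count has
# Clinton ahead 55-45.
estimate = clinton_margin(48, 50)   # -2
actual = clinton_margin(55, 45)     # +10
print(margin_error(estimate, actual))  # 12
```

In this made-up case the estimate overstated Obama's support by 12 points on the margin, so the error is recorded as +12.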
The errors show a clear if not perfectly consistent pattern: The early leaked results overestimated Obama's strength in 18 of 20 states, for an average error of 7 percentage points on the margin.
The picture improves for the numbers we compiled at poll closing time. Eleven states had errors of 6 points or less that showed no consistent pattern and amounted to an average error of zero. In other words, when the errors were small, they tended to be random and cancel out.
However, the larger errors consistently overstated Obama's support. Seven states showed errors in Obama's favor ranging from 7 to 17 points and averaging 11 points. In four -- Arizona (+11), Massachusetts (+17), New Jersey (+10) and Rhode Island (+15) -- the error on the vote margin reached double digits.
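The split between small and large errors described above is easy to reproduce with a spreadsheet or a short script. The sketch below assumes the signed errors are stored by state in a dictionary; the helper name `summarize_errors`, the 6-point cutoff parameter and the placeholder values are illustrative assumptions, not the table's actual data.

```python
from statistics import mean


def summarize_errors(errors, small_cutoff=6):
    """Split signed margin errors by size and average each group.

    errors maps a state name to its signed error on the margin
    (positive = estimate overstated Obama's support).
    """
    small = [e for e in errors.values() if abs(e) <= small_cutoff]
    large = [e for e in errors.values() if abs(e) > small_cutoff]
    return {
        "small_count": len(small),
        "small_mean": mean(small) if small else None,
        "large_count": len(large),
        "large_mean": mean(large) if large else None,
    }


# Placeholder values for illustration only (not the real table).
example = {"State A": 3, "State B": -3, "State C": 12, "State D": 10}
print(summarize_errors(example))
```

With these placeholder values the small errors cancel out to an average of zero while the large ones all point the same way, which is the kind of pattern described above.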
The people who work hard to produce this data would want you to know a few things. First, the early leaked numbers are incomplete (and sometimes inconsistent), based on interviews conducted through midafternoon and often weighted by turnout assumptions from prior elections. The leaked numbers from Western states (California, New Mexico and Utah) were especially preliminary, based only on morning interviews.
Second, the at-poll-closing numbers used to weight the preliminary cross-tabulations posted online are not based on "pure" exit polls, but rather a composite of the exit poll tallies and pre-election poll averages (not unlike those we post at Pollster.com). This blended estimate is intended to reduce errors, and based on the table above, appears to serve that purpose. In 16 of the 20 states above, the at-poll-closing estimates were more accurate than the earlier leaks.
Third, whatever flaws are evident in these numbers, they did not lead to missed calls (the mistaken projection by AP in Missouri was apparently based on the actual vote count; the exit poll showed the race too close to call). The reason is that exit pollsters are quick to identify statistical bias on election night, using a system that detects precinct-level error and corrects it in "real time" as actual vote counts become available.
Finally, they will remind us that none of these numbers are intended for public consumption. The early estimates are leaks, and the overall estimates not included in the public cross-tabulations are considered not "air worthy," although pollsters will use these numbers to characterize the race (and anyone with a spreadsheet can easily extrapolate the overall numbers as we did).
Of course, intended for the public or not, the errors led to some on-air characterization that was flat wrong. An hour after the polls had closed in Massachusetts and New Jersey, for example, Tim Russert was telling MSNBC viewers that the lack of a projected winner meant that "no matter who wins, it's going to be a close race" in those states. Clinton won New Jersey by 10 points and Massachusetts by 15 points.
So what do we make of all this? First, some will inevitably see evidence of vote fraud conspiracies. Such theories arose to explain similar apparent errors overstating Obama's support in New Hampshire, but a hand recount of paper ballots yielded no evidence of foul play. The errors are consistent with a known exit poll problem: The typically younger exit poll interviewers have trouble gaining cooperation from older voters, and Obama does best with younger voters.
Second, as these errors are inconsistent and hard to predict, we should probably follow the advice offered to journalists by the New Republic's Michael Crowley to "turn off their BlackBerries from 5 to 8 p.m. on election nights and, like, go do ESL tutoring or some other charitable work instead." It would not kill us to wait a few hours for better data.
Finally, this pattern raises a deeper and more complicated issue: Do these big initial errors, and the weighting necessary to correct them, undermine the quality of the "final" weighted exit poll that helps us understand who voted and why? That is a topic I will take up in a future column.
-- Mark Blumenthal is editor and publisher of Pollster.com. His e-mail address is firstname.lastname@example.org.