It's a tough time to be a pollster.
The art and science of predicting how people will vote is getting harder every day. A year ago, pre-election surveys fairly dramatically underestimated how Republicans would fare in the midterms. Since then, polls of voters in both Israel and the United Kingdom were revealed to have been way off. And last night in Ohio and Kentucky, pollsters missed again.
In the Buckeye State, voters were presented with a chance to weigh in on whether or not to follow in Colorado and Washington's footsteps by legalizing marijuana use. As of early October, polls were finding a majority supporting such a move.
Admittedly, two weeks before the voting booths opened, a Bowling Green State University survey found the margin had narrowed. "The November 3rd vote on legalizing marijuana in Ohio is too close to call," reported NBC24.com. "The poll…showed 44 percent supporting Issue Three and 43 percent opposing it. Thirteen percent are undecided. Since the margin of error is plus or minus three and a half percent, it's a dead heat."
But the outcome didn't end up being close. Instead, voters roundly defeated the measure, with more than 64 percent rejecting the plan to create 10 facilities with exclusive commercial rights to grow pot plants. You can see my colleague Jacob Sullum's take on the results here.
Meanwhile, just to the south, voters in the Bluegrass State were charged with selecting their next governor. As my colleague Peter Suderman explained last night, the polling got it wrong there too:
Surveys showed that Republican candidate Matt Bevin, a venture capitalist and Tea Party favorite who had unsuccessfully challenged Senate Majority Leader Mitch McConnell in a GOP primary less than two years ago, had partially closed the gap with Democratic candidate Jack Conway, but was still expected to lose by about two points.
Instead, Bevin won in a landslide, besting Conway by nine points, with 53 percent of the vote to Conway's 44 percent.
Conducting survey research during off-year elections is notoriously difficult, mostly because far fewer people generally turn out—and guessing what the electorate will look like demographically is one of the hardest things survey practitioners have to do. As I explained back in May:
Because it's impossible in most cases to get a truly representative sample, pollsters are forced to use complicated statistical modeling to adjust their data. If you know, for instance, that your poll reached too few young people, you might "weight up" the millennial results you did get. But doing this involves making some crucial assumptions—namely, what is the correct number of millennials?
We don't know before Election Day what percentage of the electorate will actually be under 30, or Hispanic, or even Democrats vs. Republicans. So pollsters have to guess. This creates a lot of room for error, and means that part of the reason recent polls underestimated the actual conservative vote share could be that pollsters are systematically assuming left-of-center voters will turn out in larger numbers than they actually do.
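To make the weighting problem concrete, here is a minimal sketch in Python. Every number in it is made up for illustration, and the group names and the helper functions `weighted_support` and `group_weights` are hypothetical, not taken from any pollster's actual method. It shows how the same raw responses yield different toplines depending on what share of the electorate the pollster assumes will be under 30.

```python
# Hypothetical raw poll results by age group (illustrative numbers only).
sample = {
    "under_30": {"respondents": 100, "support": 0.60},
    "30_plus": {"respondents": 900, "support": 0.40},
}


def group_weights(sample, assumed_shares):
    """Weight per respondent: assumed electorate share / share of the sample."""
    total = sum(group["respondents"] for group in sample.values())
    return {
        name: assumed_shares[name] / (group["respondents"] / total)
        for name, group in sample.items()
    }


def weighted_support(sample, assumed_shares):
    """Topline estimate: each group's support times its assumed electorate share."""
    return sum(assumed_shares[name] * sample[name]["support"] for name in sample)


# Scenario A: the pollster guesses young voters will be 20% of the electorate.
guess_a = {"under_30": 0.20, "30_plus": 0.80}
# Scenario B: the pollster guesses they will be only 12%.
guess_b = {"under_30": 0.12, "30_plus": 0.88}

topline_a = weighted_support(sample, guess_a)
topline_b = weighted_support(sample, guess_b)
weights_a = group_weights(sample, guess_a)

print(f"Weights under guess A: {weights_a}")
print(f"Topline under guess A: {topline_a:.3f}")
print(f"Topline under guess B: {topline_b:.3f}")
```

The responses never change between the two scenarios; only the turnout guess does. If the guess leans the wrong way, the topline leans with it, which is the kind of systematic miss described above.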
But off-year polls are also less trustworthy for another, often-overlooked reason: there are simply far fewer of them.
Ultimately, in a world in which polling is hard and getting harder, the smartest thing an observer can do is to look at the average across a large number of independently conducted surveys. One of the reasons Nate Silver at FiveThirtyEight has had so much success over the last few cycles is that he uses a model that incorporates lots of different polls (conducted by lots of different outfits and using lots of different methodologies) in order to reduce the likelihood that the read he's getting is skewed.
The more polls a model has to work with, the more accurate it's likely to be. As Silver wrote in an October 2012 blog post titled "State Poll Averages Usually Call Election Right":
It is important to emphasize that this analysis covers cases in which there were at least three distinct polling firms active in a state; you will find more frequent misses in cases where there were just one or two polls. … There are no precedents in the database for a candidate losing with a two- or three-point lead in a state when the polling volume was that rich.
But off-year elections, like local elections (and races that are expected to be a runaway for one candidate or the other), tend to attract fewer polls, which in turn increases the chances that the handful of surveys that do get conducted will paint a misleading picture.
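The arithmetic behind the case for averaging can be sketched in a few lines of Python. This is a simplified illustration, not FiveThirtyEight's model: it assumes simple random samples, equally sized polls, and a plain unweighted average, and the function names and the sample size of 800 are hypothetical. It also separates pure sampling noise, which shrinks as polls are added, from an error shared by every poll (such as a common wrong guess about turnout), which does not.

```python
import math


def sampling_moe(p, n, z=1.96):
    """95% margin of error from random sampling alone, for share p and n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def average_moe(p, n, k, z=1.96):
    """Margin of error of a simple average of k independent polls of n respondents each."""
    return sampling_moe(p, n, z) / math.sqrt(k)


def typical_miss(shared_bias, p, n, k):
    """Root-mean-square error of the average when every poll shares the same bias."""
    sampling_variance = p * (1 - p) / (n * k)
    return math.sqrt(shared_bias ** 2 + sampling_variance)


# Hypothetical polls of 800 respondents each, with a race near 50-50.
for k in (1, 3, 10):
    print(
        f"{k:>2} polls: sampling margin of error {average_moe(0.5, 800, k):.4f}, "
        f"typical miss with a 3-point shared bias {typical_miss(0.03, 0.5, 800, k):.4f}"
    )
```

Under these assumptions, going from one poll to several cuts the sampling error substantially, which is why a race with only one or two surveys is so much shakier. The shared-bias term puts a floor under the total error, though, so even a rich average can miss if every pollster makes the same wrong call about who will show up.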