Public opinion polls have come under greater scrutiny this election season, with claims that several prominent polls have oversampled Democrats, putting Obama in the lead. Frankly, I am skeptical of these claims. Rather than the share of Democratic respondents in each poll, the primary issue is whether pollsters' likely voter models are correct. If pollsters are relying too heavily on 2008 turnout and overestimate Democratic turnout, they may overestimate Obama's lead by about one and a half points.
NumbersMuncher, an anonymous poll analyst, tweeted the following graph, which has garnered attention online. Some conclude this is further evidence that prominent public opinion polls have oversampled Democrats with the intent to bolster the bandwagon effect for the president.
When looking at this graph, one should not assume that each survey has oversampled Democrats, because typically more Americans self-identify as Democratic than Republican, just as more Americans self-identify as conservative than liberal. Moreover, these numbers do not sort Independents who lean Republican or Democratic. Yet this graph's horizontal axis presumes each sample should have an equal number of Democrats and Republicans, since it's titled "% of Dem Sample Advantage." [emphasis added]
Moreover, this graph combines partisan breakdowns among both likely voters and all survey respondents without identifying each. If it did, one would quickly recognize that the polls with higher Democratic margins are also polls that reported partisan breakdowns among registered voters but reported election match-ups based on likely voters (Ipsos-Reuters, ABC/Washington Post).
I have updated the above graph to reflect Real Clear Politics' most recent poll compilation. This demonstrates how sometimes partisan breakdowns are based on registered voters when election match-ups are based on likely voters.
Source: Real Clear Politics, and author's calculations.
Typically, there is a wider partisan gap among registered voters than likely voters. Historical data from national exit polls reveals that actual voters tend to reflect more equal numbers of Democrats and Republicans, with a slight Democratic advantage before sorting Independents. Likely voter models should take this into account.
The following chart compiles actual voter data from national exit polls:
Source: Roper Center for Public Opinion Research
If pollsters rely too heavily on 2008's partisan turnout, in which Democrats had a significant turnout advantage, this could skew the likely voter model to include more Democrats than would be historically advisable. There is little evidence thus far to conclude that 2012 will mobilize voters the way 2008 did. Consequently, likely voter models should consider a broader historical trend.
It is hard to say whether pollsters are in fact relying too heavily on 2008 partisan turnout, because it is extraordinarily difficult to track down how these pollsters define likely voters.
According to Chris Jackson at Ipsos-Reuters, "most research organizations use a combination of prior voting behavior, interest in the election and self-report likelihood to vote to categorize likely voters. …Some pollsters also use 'voter lists' or commercial lists of people who voted in the last election instead of screening these individuals from the population."
Rasmussen gives a vague explanation: "The questions involve voting history, interest in the current campaign, and likely voting intentions. Rasmussen Reports determines its partisan weighting targets through a dynamic weighting system that takes into account the state's voting history, national trends, and recent polling in a particular state or geographic area."
ABC News explains that they "develop a range of 'likely voter' models, employing elements such as self-reported voter registration, intention to vote, attention to the race, past voting, age, respondents' knowledge of their polling places, and political party identification."
As Huffington Post's Mark Blumenthal reports, "CNN has published no explanation of how they select likely voters."
Even without seeing the demographic composition of the likely voters in each poll, it appears that several polls may be extrapolating 2008 turnout beyond what will actually occur in 2012. Consequently, these polls may overestimate Democratic turnout and thus Obama's lead. Excluding Rasmussen, the average Democratic advantage in partisan share among these polls is approximately 5 points, whereas historically it has been 3 points among actual voters (excluding partisan Independents). If likely voter models have in fact overestimated Democratic turnout, Obama's lead could shrink by about one and a half points.
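To see how a two-point difference in partisan composition could translate into a shift of roughly this size, here is a minimal sketch that reweights a poll by party identification. It is not the calculation used above, which is not spelled out, and every input is a hypothetical placeholder. The party shares are chosen only to produce a 5-point and a 3-point Democratic advantage, and the within-party margins (Democrats +80 for Obama, Republicans -80, Independents even) are illustrative assumptions rather than figures from any poll.

```python
def overall_margin(party_shares, margin_by_party):
    """Return the overall candidate margin, in points, as the
    share-weighted average of the margins within each party group."""
    total = sum(party_shares.values())
    return sum(
        party_shares[party] / total * margin_by_party[party]
        for party in party_shares
    )


# ASSUMPTION: hypothetical Obama-minus-Romney margins within each group,
# in points. Illustrative values only, not taken from any poll.
MARGIN_BY_PARTY = {"Dem": 80.0, "Rep": -80.0, "Ind": 0.0}

# ASSUMPTION: hypothetical electorates. The first has a 5-point Democratic
# advantage (like the poll average discussed above), the second a 3-point
# advantage (like the historical exit-poll figure).
POLL_MIX = {"Dem": 37.0, "Rep": 32.0, "Ind": 31.0}
HISTORICAL_MIX = {"Dem": 36.0, "Rep": 33.0, "Ind": 31.0}

poll_margin = overall_margin(POLL_MIX, MARGIN_BY_PARTY)
historical_margin = overall_margin(HISTORICAL_MIX, MARGIN_BY_PARTY)
shrink = poll_margin - historical_margin

print(f"Margin with D+5 electorate: {poll_margin:.1f}")
print(f"Margin with D+3 electorate: {historical_margin:.1f}")
print(f"Difference: {shrink:.1f}")
```

Under these placeholder inputs the lead drops from 4.0 to 2.4 points, a difference of about 1.6, which is in the neighborhood of one and a half points. The result depends heavily on the assumed within-party margins: less lopsided partisan loyalty would produce a smaller shift.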
Rather than protest the share of Democrats sampled, poll consumers should redirect their energies toward determining whether the likely voter models employed are in fact predictive of actual voter turnout.