by: Grebner
Mon Feb 25, 2013 at 9:48:56 PM EST
I’ve just run into one of those simple ideas that make me wonder why I didn’t see it myself, and makes me regret all the effort I wasted before I learned about it.
In a horserace poll, we want information that will predict who will win the election, or some closely related measurement like the size of the lead, whether the lead is growing or shrinking, and so on. We’ve always assumed the best question to ask our respondents is “Who will you vote for?” We use statistical methods to combine all the individual answers we receive into a prediction of what will happen on election day when a much larger number of people cast their ballots. The fundamental idea is the extrapolation from a collection of YES/NO votes to the total the election Clerk will announce a few hours after the polls close.
But what if we ask a completely different question? What if we ask, “Who do you expect will win?”
If we set aside our common-sense notion of how a poll "should" be conducted, which arises simply from our experience of doing it the same way every time, we realize we have no way to guess whether "who will win?" is a better or worse predictor than "who will you vote for?".
We can think of arguments that point in either direction. “Who will you vote for?” at least asks a question to which the voter should KNOW the answer, while “who will win?” requires them to guess about something they can’t really know. And the answer to “who will win?” is bound to be tainted by superficialities like media exposure or irrelevant results of previous elections.
But “who will win?” may free shy voters to whisper their secret opinions, since they’re supposedly only talking about other people’s opinions. And it might possibly serve to increase the effective sample size of the poll, by asking each respondent to tell us not how ONE person will vote (the respondent) but how TEN or TWENTY people will vote (their neighbors, relatives, co-workers, and friends).
One way to settle this argument is to pile up opinions and war stories, and then divide into conflicting schools of thought, which each raise doubts about the other’s competence and integrity. If we employ this approach to deciding between them, we’ll either see the dead hand of history triumph, or at least battle the new idea to an inconclusive draw.
Instead, we might TEST the two, by asking both questions of all the respondents in a large series of polls, to see which successfully predicts the election result more often. In a wide range of high-visibility elections, “who will win?” turns out to be considerably more accurate.
From the paper by Rothschild and Wolfers, it appears "who will win?" works best within, say, 90 days of an election. And most of their comparisons involved high-level offices (president, US Senate, British Parliament), so we don't have strong evidence about its use in local races. I suspect the superiority of "who will win?" depends on having an election contest salient enough that the poll respondent might actually have talked to other people about it. So it may not work if applied to an obscure election, or one that is far in the future.
I’ve only started testing the new method, but it seemed to work reasonably well in predicting Lon Johnson’s win over Brewer – keeping in mind that wasn’t a genuine poll. For the foreseeable future, I intend to ask the horserace question BOTH ways, in order to develop a better feel for the method. According to the paper cited above, combining the two versions gives a slightly more accurate result than using “who will win?” alone.
“Who will win?” has two major advantages over “who will you vote for?”. First, it makes the most of a small sample. Take the Johnson/Brewer race as an example, and assume it had been an actual poll of a randomly selected sample. Imagine I had asked “who do you support?” and gotten a result of 26-to-14, which would be non-significant. But imagine that four of Brewer’s supporters in the poll had realized, from what they’d been hearing, that Brewer was in trouble, and had switched their answer when asked “who will win?”, yielding the 30-to-10 result we actually saw – a statistically significant margin.
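To see why 26-to-14 falls short of significance while 30-to-10 clears it, we can run an exact binomial test against an even 20–20 split. This is a sketch of the arithmetic, not anything from the Rothschild and Wolfers paper; it assumes a simple two-sided test with a null of 50/50.

```python
from math import comb

def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes no more likely than the observed count k out of n."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return sum(q for q in probs if q <= observed + 1e-12)

# "Who do you support?" split: 26 vs 14 out of 40 respondents
p_support = two_sided_binomial_p(26, 40)  # above the usual 0.05 cutoff

# "Who will win?" split: 30 vs 10 out of the same 40 respondents
p_winner = two_sided_binomial_p(30, 40)   # well below 0.05

print(f"26-14 p-value: {p_support:.3f}")
print(f"30-10 p-value: {p_winner:.4f}")
```

With identical sample sizes, shifting just four answers moves the result from "can't rule out a tie" to a clear statistical signal, which is the point of the example.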
Second, although the answers to the two questions are strongly correlated (meaning that the answers to “who will win?” are biased by the voter’s personal preference) it turns out to be easier to correct for the bias. As a result, even from a small sample in which one political party is over-represented, it is possible to get a fairly accurate reading of the overall state of the horse-race.