by: Grebner
Tue Feb 18, 2014 at 5:31:21 PM EST
I’ve been thinking about building a statistical model of recounts, to make it possible to estimate the probability that the outcome of an election would be reversed under various conditions. It’s impossible to do a fully rigorous test, since it’s not clear what statistical model might be appropriate, and we don’t have adequate data for one anyway.
Now, an organization called the Michigan Election Reform Alliance has painstakingly conducted an unofficial recount of the ballots cast in two elections in Allegan County, as part of their larger examination of voting in Michigan. Using Michigan’s FOIA, they obtained access to the ballots voted in the November 2008 and August 2012 elections, and they tallied some 135,000 individual votes, by hand, and compared their totals to the official tallies.
MERA is interested in the big picture: reforming the entire election process, which is a wonderful idea. But as a small-picture guy, I borrowed their data to try to estimate the probability that under various conditions a recount would reverse the initially announced results of a very close race.
Here’s the bottom line: when the ballots are re-checked, there are about five random changes per thousand ballots. Since some of the errors being corrected actually cancelled each other, the likely net change can be estimated to be: SQRT(ballots/200).
In addition, as large numbers of ballots are counted, the Democratic candidate in any given two-party contest tends to gain about 0.2 votes for each 1000. In small recounts (say, fewer than 100,000 ballots) the random effect is the only thing that counts. In larger recounts (congressional or statewide) the tendency for a recount to benefit the Democrat becomes more important while the random effects tend to cancel out.
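To see how the two effects trade off as the recount grows, here’s a quick sketch in Python, using the rules of thumb above – one random change per 200 ballots (so a standard error of SQRT(ballots/200)), and a Democratic drift of 0.2 votes per 1,000. The function names are mine, not anything official:

```python
import math

def random_swing_95(ballots):
    """Largest likely net random change: twice the standard error,
    where the standard error is sqrt(ballots / 200)."""
    return 2 * math.sqrt(ballots / 200)

def democratic_drift(ballots):
    """Systematic Democratic gain: about 0.2 votes per 1,000 ballots."""
    return ballots * 0.0002

# Small recount: randomness dominates.  Statewide: the drift takes over.
for n in (10_000, 100_000, 1_000_000):
    print(n, round(random_swing_95(n)), round(democratic_drift(n)))
```

At 10,000 ballots the random band (about 14 votes) dwarfs the drift (2 votes); at a million ballots the drift (200 votes) outweighs the random band (about 141 votes), which is the crossover described above.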
Before the recount begins in earnest, the election officials look for errors that involved mishandling of groups of ballots – ballots that might have been counted twice, or not counted at all – and for numerical errors, such as mistakes in copying numbers or adding them together. I don’t know of any data set that allows such arithmetic errors to be modelled; they might be large or small, and in small districts they probably won’t appear at all. These “bulk errors” need to be corrected and included in the tallies before we talk about the effects of a recount.
First, once we have solid totals for the two candidates, we estimate the random effect – the net number of ballots likely to be shifted from one candidate to the other – by dividing the number of ballots by 200 and taking the square root. (This corresponds to the standard error.) If we double that number, we get a reasonable idea of the largest likely change, i.e., a 95% confidence interval.
Second, in a partisan general election, a careful recount is likely to increase the Democratic candidate’s share by about 20 votes per 100,000 cast, mainly because the scanning machines overlook ballots marked by voters who failed to follow instructions very well – perhaps using the wrong pencil, or failing to fill in the area completely. In other cases, ballots are “rehabilitated” that had been disqualified because they apparently showed too many votes for a given office, where the extra “vote” turned out to be a smudge or a crease mark.
Let’s use a specific example: the 2000 Congressional race in CD8, where the initial results showed Mike Rogers getting 145179 votes, to Dianne Byrum’s 145019, a difference of 160 votes. If the ballots had been cast using the optical-scan system currently in use in Michigan (the election was actually conducted using punchcards), this model says that the largest Byrum gain to be expected would have been:
Random effect: 2 * SQRT(145179+145019) * 0.05 = 54 votes
Specific Democratic effect: (145179+145019) * 0.00020 = 58 votes.
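Reproducing that arithmetic in Python (a sketch; the 0.05 and 0.00020 coefficients are taken straight from the two lines above, and the variable names are mine):

```python
import math

rogers, byrum = 145_179, 145_019
total = rogers + byrum                          # 290,198 votes cast in the race

random_effect = 2 * math.sqrt(total) * 0.05     # largest likely random swing
democratic_effect = total * 0.0002              # systematic Democratic gain

print(round(random_effect), round(democratic_effect))  # 54 58
```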
In the actual recount, Byrum’s position improved by roughly 50 votes, leaving her 111 short.
Because that election was held using punch-cards, this is an illustration, rather than a literal analysis. But if the same election occurred again, my analysis suggests that gap is probably too large to overcome merely by correcting random individual-ballot errors – the result would probably not be overturned unless the margin were narrowed by the discovery of a bulk error.
Now, let’s dig deeper into MERA’s data. They studied the ballots cast in a collection of precincts in two Allegan County elections. First, they hand-tallied the ballots cast in November 2008 in 17 precincts, looking at votes cast for the 36 candidates running for the four statewide education boards (State Bd. of Ed, UM, MSU, Wayne State). Second, they looked at fifteen Republican candidates, running for seven offices, in twelve precincts, in the August 2012 primary. Altogether, they tallied some 135,000 votes.
Obviously, Allegan County isn’t perfectly typical of Michigan, and we can’t say for sure that we’d see exactly the same patterns elsewhere. But as Donald Rumsfeld would have said, you have to conduct your analysis with the data you have, not the data you want. And this data seems to match pretty well what I’ve seen elsewhere.
The statistical analysis of these results is far too long to include here; I will only touch on major results.
I broke the 2008 general election results into three groups – Democrats, Republicans, and third-party. Partly that was so I could derive multiple estimates of the error rates. Partly it was to allow me to estimate any general bias against Democratic candidates. And partly it was because there is good reason to believe that some of the errors are not statistically independent, and keeping separate tallies might protect me from certain kinds of mistakes.
The first two columns (“Official” and “Hand count”) simply reflect the election-night machine tally and the later count conducted by MERA. The column labeled “Sum(errors)” shows the sum of the absolute values of the discrepancies found by MERA; whether the official tally was +2 or -2, it counts as 2 for this purpose. The column labeled “Variance” shows the sum of the squared discrepancies. The reason to use the square, rather than the actual discrepancy, is to allow for the likelihood that in precincts large enough to include multiple errors, some of them cancel one another; using the squared error means that large precincts and small precincts are treated equivalently. Finally, “Net error” shows the overall effect of the tallying errors made, with positive errors offsetting the negative ones.
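In code, the three error columns are computed from per-precinct discrepancies (official tally minus hand count) like this – a sketch using made-up numbers, not MERA’s actual precinct-level figures:

```python
# Hypothetical per-precinct discrepancies (official tally minus hand count)
discrepancies = [+2, -2, 0, +1, -1, +1]

sum_errors = sum(abs(d) for d in discrepancies)  # |+2| and |-2| each count as 2
variance   = sum(d * d for d in discrepancies)   # squares, so errors can't cancel
net_error  = sum(discrepancies)                  # positives offset negatives

print(sum_errors, variance, net_error)  # 7 11 1
```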
First, notice that the variance is remarkably even – and high – amounting to roughly one incorrectly tallied vote per candidate for each 200 ballots counted. That’s amazingly bad: on a ballot containing 100 candidates or proposal choices, that works out to an expected half an error per ballot – one tallying mistake for every two ballots. (It may be that the errors are concentrated on a relative handful of ballots, while the great majority are counted perfectly; our data doesn’t permit us to be sure, but that does not appear to be the case.) That error rate is much worse than properly supervised hand-counting, and also worse than the much-maligned punchcard systems.
Second, notice that the net error is much smaller than the variance. (For professional statisticians: recall that a Poisson variable’s variance and mean are equal.) This tells us that the great majority of errors are random; we don’t appear to have found an attempt to steer votes from one party to the other. This randomness showed up in each of the analyses I performed. The problem is sloppiness, not dishonesty.
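As a sanity check on that error rate: with one error per 200 tallied votes and 100 choices per ballot, the expected number of errors per ballot is 0.5, which puts the chance that any given ballot contains at least one error at roughly 40 percent, whether you compute it exactly or with the Poisson approximation:

```python
import math

rate = 1 / 200     # one tallying error per 200 votes counted
choices = 100      # candidates and proposal choices on the ballot

expected_errors = rate * choices             # 0.5 errors per ballot
p_exact = 1 - (1 - rate) ** choices          # exact: ~0.394
p_poisson = 1 - math.exp(-expected_errors)   # Poisson approximation: ~0.393

print(round(p_exact, 3), round(p_poisson, 3))
```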
I calculated the apparent pro-Republican bias uncovered by MERA’s tally, but it falls well below statistical significance. I base my crude “guesstimate” of 20 lost Democratic votes per 100,000 tallied on a mishmash of evidence, including Gore–Bush in Florida (before the Supremes put the kibosh on actually counting anything), the Byrum–Rogers race from 2000, Franken–Coleman (Minnesota, 2008), and Gregoire–Rossi (Washington State, 2004).
It appears that if you count 1,000,000 votes carefully, you generally discover about 200 additional net Democratic votes. Why should that be? I don’t think it’s either deliberate or unconscious bias. The effect is NOT concentrated in areas with Republican election clerks – it seems to be found equally in heavily Democratic areas with Democratic officials. The real cause appears to be that various Dem-leaning demographic groups are disproportionately likely to mark their ballots in ways the scanners don’t read correctly. Think about first-time voters, the visually handicapped, people who are marginally literate – each of those groups is predominantly Democratic. It’s probably not just ballots; I bet if we could get good statistics, we’d find Democrats are slightly more likely to renew their auto registrations late, or to send checks with transposed digits to the Secretary of State. In any event, the effect is very small. It only matters if you’re, say, 600 votes short in a state with six million votes cast. (That would be Florida.)
November 2008 – 17 precincts – Allegan County
DEMOCRATIC CANDIDATES
Official Hand Count Sum(errors) Variance Net Error
49590 49595 119 257 -5
average variance: 5.2/1000 votes
REPUBLICAN CANDIDATES
Official Hand Count Sum(errors) Variance Net Error
53851 53842 125 251 9
average variance: 4.7/1000 votes
THIRD PARTY CANDIDATES
Official Hand Count Sum(errors) Variance Net Error
10486 10487 51 53 -1
average variance: 5.1/1000 votes
Net Republican bias: 0.13 votes/1000 votes
August 2012 – 12 precincts – Allegan County
Republican primaries
Official Hand Count Sum(errors) Variance Net Error
15287 15308 61 89 -21
average variance: 5.8/1000 votes
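The “average variance” figures can be checked directly from the tables above (variance per 1,000 official votes; the group labels are mine):

```python
# (official votes, variance) for each group, taken from the tables above
groups = {
    "Dem 2008":   (49_590, 257),
    "Rep 2008":   (53_851, 251),
    "Third 2008": (10_486, 53),
    "Rep 2012":   (15_287, 89),
}

for name, (votes, var) in groups.items():
    print(name, round(var / votes * 1000, 1))  # 5.2, 4.7, 5.1, 5.8
```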