“Margin of Error” in polls

by: Grebner

Sat Dec 22, 2007 at 01:04:56 AM EST

An introduction to some elementary statistical concepts, specifically aimed at explaining how to compare percentages within and between polls.  I cover some of the stuff you barely understood when you were struggling through Methods courses, and which you have long forgotten.  Fortunately, there’s no homework, and no grading.

 UPDATE:  Margin of error when looking at changes from one poll to another.

I’m following up a discussion in another thread: 

Everybody already knows what “margin of error” means for a single percentage in a given poll:  it tells you the range in which the “true” value which the poll is attempting to estimate will lie at least 95% of the time. 

You can easily calculate such a “95% confidence interval” for a simple percentage reported in a poll – it’s almost exactly equal to the reciprocal of the square root of the sample size.  Taking a few shortcuts, if a poll of 400 Democratic primary voters shows Hillary Clinton with 40%, take the square root of 400, which is 20.  Take the reciprocal, which is 0.05 or 5%.  (The precise number is more like 4.8% – the rough methods shown here are slightly conservative.)
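For anyone who wants to check the arithmetic, here is a minimal sketch comparing the shortcut with the usual textbook normal-approximation formula. The function names are my own, and the 1.96 multiplier is the standard critical value for a two-sided 95% interval; nothing here is specific to any pollster's method.

```python
import math

Z_95 = 1.96  # normal critical value for a two-sided 95% interval


def moe_rule_of_thumb(n):
    """The shortcut: reciprocal of the square root of the sample size."""
    return 1 / math.sqrt(n)


def moe_normal_approx(p, n, z=Z_95):
    """Textbook normal-approximation MOE for a single proportion p."""
    return z * math.sqrt(p * (1 - p) / n)


n = 400
print(f"shortcut, any p: {moe_rule_of_thumb(n):.3f}")
print(f"formula, p=0.50: {moe_normal_approx(0.50, n):.3f}")
print(f"formula, p=0.40: {moe_normal_approx(0.40, n):.3f}")
```

The shortcut amounts to rounding 1.96 up to 2 and assuming the candidate sits at 50%, which is why it always errs slightly on the large side.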

But knowing Hillary has 40% isn’t as important as knowing whether she’s really ahead of Barack Obama (who polled 30%) or whether the difference may be statistical noise.  As the previous discussion asked, if Hillary’s percentage could be 5% too high, couldn’t Barack’s be 5% too low?  This seems especially likely, since her “extra” votes must have been taken from somebody else’s total.

To understand the comparison of percentages between choices posed in a single question, we have to change our thinking from “deviation” to “variance” which is defined as the average squared deviation.  This is necessary because statistical mathematics deals almost exclusively with squared errors, not actual errors, for reasons which lie buried deep in theory.

The “variance” of the difference between two simple (“binomial”) variables like the candidates’ percentages is equal to the sum of their variances, minus twice their covariance.  (When the covariance is negative, as it is here, subtracting it makes the variance of the difference larger than the simple sum.)  “Covariance” is closely related to “correlation”, which you may dimly remember is a measure of the degree to which two variables tend to move together (in which case they are positively correlated) or move in opposite directions (in which case they are negatively correlated).

 When you look at the answer to a poll question, it’s obvious that each candidate’s support is negatively correlated with each of the others.  If one candidate goes up, somebody else is likely to go down.  If a statistical fluke gives too many votes to Hillary, it’s likely to have taken them from Barack.

 Unfortunately, I don’t know an authoritative method for estimating the covariance in this circumstance, but perhaps my critics will provide one.  Until we hear from them, I assert the correct formula (using H for Hillary’s support, and B for Barack’s) is to multiply the MOE for a single candidate by:

 Sqrt(2 + B/(1-H) + H/(1-B))

 which in this example would be 

 Sqrt(2+ 30/60 + 40/70)  =  1.75

 Don’t worry if you didn’t follow that calculation – I didn’t follow it either, and I bet I left out at least one factor somewhere.  The only claim I make is that it produces reasonable results for every pair of numbers I try.

 Back to the original question:  what’s the MOE for comparing Hillary to Barack when a poll with 400 respondents shows her winning 40% to 30%?

Since the MOE for Hillary (or Barack) considered alone is +/- 5%, the MOE for the difference between them is about +/- 8.75%.
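A rough sketch of that calculation, set beside the standard textbook result for two shares drawn from the same multinomial question. The textbook formula is something I am swapping in for comparison, not the method used in this post; it assumes simple random sampling, where the covariance of the two shares is -H*B/n. Function names are hypothetical.

```python
import math


def rule_of_thumb_multiplier(h, b):
    """The post's factor, applied to the single-candidate MOE."""
    return math.sqrt(2 + b / (1 - h) + h / (1 - b))


def lead_moe_multinomial(h, b, n, z=1.96):
    """MOE for the lead (h - b) under simple random sampling.

    Var(h - b) = [h(1-h) + b(1-b) + 2hb] / n, because the covariance
    between two shares from the same question is -hb/n.
    """
    variance = (h * (1 - h) + b * (1 - b) + 2 * h * b) / n
    return z * math.sqrt(variance)


h, b, n = 0.40, 0.30, 400
single_moe = 1 / math.sqrt(n)  # the 5% shortcut figure
rule = rule_of_thumb_multiplier(h, b) * single_moe
textbook = lead_moe_multinomial(h, b, n)

print(f"multiplier:          {rule_of_thumb_multiplier(h, b):.2f}")
print(f"rule-of-thumb MOE:   {rule:.4f}")
print(f"multinomial formula: {textbook:.4f}")
```

On these numbers the rule of thumb comes out a bit larger than the textbook figure, which fits the "slightly conservative" character of the shortcuts used throughout.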

In general, the MOE for comparing two candidates is close to twice the MOE stated by the pollster for a single candidate, unless both of the candidates are well below 50%.  For complete non-entities (if you were concerned, for example, whether Dodd will finish ahead of Kucinich) the MOE is only about 1.45 times the individual figure.

So, if we want to know whether a candidate’s lead over another in a poll is real,  the margin of error is between 1.5 and 2.0 times as large as the typically stated number.
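If you would rather not trust anybody's algebra, a small simulation can estimate the MOE of a lead directly. This is only a sketch under simplifying assumptions: simple random sampling, three answer categories (Hillary, Barack, everybody else), and a fixed seed so the run is repeatable. The function name and defaults are mine.

```python
import random
import statistics


def simulate_lead_moe(h, b, n, polls=2000, seed=1):
    """Estimate the 95% MOE of the lead (h - b) by simulating many polls."""
    rng = random.Random(seed)
    answers = ["H", "B", "other"]
    weights = [h, b, 1 - h - b]
    leads = []
    for _ in range(polls):
        sample = rng.choices(answers, weights=weights, k=n)
        leads.append((sample.count("H") - sample.count("B")) / n)
    # 1.96 standard deviations of the simulated leads ~ 95% MOE
    return 1.96 * statistics.pstdev(leads)


print(f"simulated MOE of the lead: {simulate_lead_moe(0.40, 0.30, 400):.3f}")
```

Raising `polls` tightens the estimate at the cost of run time; a couple of thousand simulated polls is enough to see which side of 5% the answer falls on.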

 What about comparing two different polls, to see if there has been a change?

As before, the answer hinges on the variance of the polls’ estimates.  In the simplest case, where the two polls have equal sample sizes – and therefore equal stated MOEs – the variance of the difference from one poll to the next is simply the sum of the variances of the two polls, which is to say: the variance is exactly doubled.  Since the MOE is proportional to the square root of the variance, it is simply multiplied by about 1.41.

 To expand our previous example, if in the first poll Hillary is ahead of Barack 40% to 30%, and in a second poll, they’re tied at 35% to 35%, should we believe the breathless pundits, when they tell us things have dramatically changed?  Well, no.

 As we saw before, assuming both polls have samples of 400, the MOE for the size of the lead in either poll is almost 9%, so the MOE for the change in the difference would be slightly larger than 12%.

But let’s be realistic.  The standard MOE is calculated as a “two-tailed 95% confidence interval”, meaning that if nothing really changes, random fluctuations in the sample should only reach the MOE in a given direction once out of forty trials.  A change in the lead of ten points from one poll to another, with an MOE of 12%, should occur less than one time in ten.   If I were working for Barack, and I heard we had caught up to a tie, after trailing by 10 points, I’d be happy, because it would probably mean we were doing better – but I wouldn’t be sure it was a real change.
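Here is a hedged sketch of that reasoning in code. It takes the 8.75% figure for the MOE of the lead in each poll as given, treats the two polls as independent, and assumes the sampling error is roughly normal. The helper names are hypothetical.

```python
import math
from statistics import NormalDist


def moe_of_change(moe_first, moe_second):
    """Combine the MOEs of two independent polls (their variances add)."""
    return math.sqrt(moe_first ** 2 + moe_second ** 2)


def one_sided_tail(shift, moe, z=1.96):
    """Chance of a shift at least this big in one direction if nothing changed."""
    standard_error = moe / z
    return 1 - NormalDist(0, standard_error).cdf(shift)


lead_moe = 0.0875                                # MOE of the lead in each poll
change_moe = moe_of_change(lead_moe, lead_moe)   # about 1.41 times larger

print(f"MOE for the change in the lead: {change_moe:.4f}")
print(f"chance of a shift equal to the MOE: {one_sided_tail(change_moe, change_moe):.3f}")
print(f"chance of a 10-point shift:         {one_sided_tail(0.10, change_moe):.3f}")
```

The second line of output is the "once out of forty" figure; the third is the chance of a ten-point swing in one candidate's favor from sampling noise alone.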

 Where the sample sizes are different, you wouldn’t go wrong by multiplying the larger MOE (from the poll with the smaller sample) by 1.4.

The published “Margin of Error” is generally slightly overestimated.  The standard methods used for calculating MOE are very conservative, which I guess serves as a sloppy compensation for its inappropriate use in comparing differences among candidates or differences from one poll to another.

Still, it seems worthwhile stating the causes of the overestimation.  First, and generally least important, is the fact that to some extent every poll is also sort of a “census”.  I sometimes have a client who wants to conduct a poll in a small area – say the City of Perry.  Let’s say there are only 500 likely voters, and the polling firm ends up interviewing 200 of them.  Because they were chosen to be representative of all likely voters, we use their opinions to estimate the remaining 60% – but we aren’t “estimating” the opinions of the 40% we talked to – we actually know what they think.  In this case, we say the poll had a “sampling fraction of 40%”, meaning the variance of our estimate is only 60% as large as it would have been if the universe from which we drew our sample was infinitely large.  In this example, the improved accuracy from conducting the partial census allows us to reduce our MOE by 23%.  (The square root of .60 is about .77.)
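A minimal sketch of the Perry arithmetic. The first function is the simplified version used above; the second is the usual textbook "finite population correction", which I am adding for comparison and which differs only by using N - 1 in the denominator. Both names are my own.

```python
import math


def fpc_simple(sample_size, universe_size):
    """Simplified correction: square root of the fraction NOT interviewed."""
    return math.sqrt(1 - sample_size / universe_size)


def fpc_textbook(sample_size, universe_size):
    """Standard finite population correction, sqrt((N - n) / (N - 1))."""
    return math.sqrt((universe_size - sample_size) / (universe_size - 1))


interviews, likely_voters = 200, 500
print(f"simplified factor: {fpc_simple(interviews, likely_voters):.3f}")
print(f"textbook factor:   {fpc_textbook(interviews, likely_voters):.3f}")
```

Multiply the ordinary MOE by either factor to get the corrected figure; for a universe this size the two versions agree to within a fraction of a percent.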

Non-statisticians tend to overestimate the effect of sampling fraction on MOE.  For a typical poll, if the universe being sampled is over 2000, the increase in accuracy is too small to notice.

A second source of improved accuracy is due to the use of stratified samples.  For example, let’s imagine a poll conducted in a City of Flint mayoral election between a black candidate and a white candidate, each of whom received overwhelming support from voters of their own race.  If a 400-interview poll were conducted using a simple random sample, the black percentage interviewed might range anywhere from 52% to 62%, which would of course be likely to be reflected in the level of support for each of the candidates.  But because Flint is so sharply segregated, if we assign a proportionate number of interviews to each precinct, in effect we can force the entire poll to accurately reflect the black percentage of the likely vote:  57%.  (This is for illustration only – I haven’t looked at the actual percentages in Flint in the last year or two.)  By reducing the variability of the racial distribution, we also reduce the variability of the candidate preference estimate.  Other situations which benefit from stratification include contests where quotas can be assigned to geographic regions, gender, political party, or age, if those divisions are strongly correlated to candidate preference.  Because the benefit is difficult to explain or calculate, it’s generally simply ignored when calculating the MOE.

Finally, the standard method of calculating MOE assumes each candidate will receive precisely 50% of the vote, which is the value for which a simple (“binomial”) variable has the maximum variance.  At values either higher or lower than 50%, the variance (and MOE) decline.  This effect is especially strong for values close to 0% or 100%.  For a sample with 400 interviews, if our MOE is +/- 5% for a candidate at 50%, the MOE is 4.9% for a candidate with either 40% or 60% support, 4.6% at 30% or 70%, 4.0% at 20% or 80%, 3% at 10% or 90%, and approximately 1% at 1% or 99%.  This makes sense if you think about it:  If exactly zero of the people interviewed say they’re supporting Mike Gravel for president, it’s not likely that another poll would have found 20 such people (5%).
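The scaling is easy to reproduce. This sketch takes the stated MOE (computed at 50%) and shrinks it in proportion to the square root of p*(1-p); the function name and the 5% default are my own choices for illustration.

```python
import math


def scaled_moe(p, moe_at_half=0.05):
    """Scale a stated MOE (computed at 50%) to a candidate at share p."""
    return moe_at_half * math.sqrt(p * (1 - p)) / 0.5


for p in (0.50, 0.40, 0.30, 0.20, 0.10, 0.01):
    print(f"support {p:.0%}: MOE {scaled_moe(p):.1%}")
```

The formula is symmetric, so 60% gives the same answer as 40%, and so on. Very close to 0% or 100% the normal approximation behind it gets shaky, so treat the last row as a rough guide only.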


Comments

12 responses to ““Margin of Error” in polls”

  1. “Margin exceeded” on Barack
    It’s BARACK Obama, not Barak. Cripes. “The acceptable margin of error has been exceeded…”
    What use are statistics if you can’t even get candidate names right??
    If you spent more time trying to destroy the voter list graft, and less on “Mich Lib” diaries, I might not complain much…

    by: David Boyle @ Sat Dec 22, 2007 at 00:55:45 AM CST

    1. Thanks.
      I’ve fixed the spelling.
      You may be pleased to hear that I’ve done everything I can to move legal matters along – nothing is hung up waiting for me.

      For my part, I’d suggest that accurate statistics are more useful than winning legal arguments, both for campaigns and for my business.

      by: Grebner @ Sat Dec 22, 2007 at 01:16:59 AM CST

      1. Mebbe
        Then again, if you, say, work for a leading Michigan politician and tell people her name is “Jellifer Grabhome” instead of a more orthodox spelling, that might not impress folx, and you might get less work. . .
        (And as for legal arguments: ask that “Al Gore guy” about “Bush v. Gore”………..)

        by: David Boyle @ Sat Dec 22, 2007 at 01:25:56 AM CST

  2. Useful Diary
    This is one of the most useful diaries I have come across on Michigan Liberal. Thanks for the college refresher applied to a real life application we all take seriously – perhaps too seriously in the case of certain bloggers. It would be great to see more of these types of diaries in the future. Thank you for your decades of contributions to the Democratic Party and progressives throughout the state.
    by: northernlib @ Sat Dec 22, 2007 at 07:17:33 AM CST

  3. Thank you Mr. Grebner.
    I think I actually understand that (sort of).

    The end of the human race will be that it will eventually die of civilization.

    – Ralph Waldo Emerson
    by: michmark @ Sat Dec 22, 2007 at 07:37:00 AM CST

  4. It makes intuitive sense that the combined standard error
    will be larger than the SE for one estimate, but less than twice as much. You have two estimates varying, the potential error is greater, yet some of the time they will be varying (from their underlying true values) in the same direction, which will tend to cancel some of the error in the difference.
    This is most interesting, but I would like to know where this:

    Sqrt(2 + B/(1-H) + H/(1-B))

    comes from?

    I use statistics extensively in my research, but have not dealt with this particular situation. I’ll see what I can find on point…

    by: memiller @ Sat Dec 22, 2007 at 14:14:10 PM CST

    1. Where did the formula come from? I made it up.
      Recall that I simplified the problem by looking for a factor to multiply by the “margin of error”, rather than actually estimating the variances and covariance. Even though it’s not exactly right, we’re treating the variances of the two candidates’ marginals as equal, which is the reason for the first “2”. (If we were trying to be precise, we would note that the variance of a binomial variable is p*(1-p), which is to say .21 for Barack and .24 for Hillary, rather than .25 as everybody – including the media – always assume.)
      The next term (“B/(1-H)”) suggests that any votes Hillary might gain or lose would affect Barack in proportion to the vote he already has, compared to the total vote Hillary doesn’t already have. That is, with Hillary at 40% and Barack at 30%, if she were to gain one vote, there’s a 30/60 chance that it would affect Barack’s total rather than the residue. The final term (“H/(1-B)”) represents the world from Barack’s point of view – any vote he picks up has a 40/70 chance of coming from Hillary.

      Multiplying those two terms together seems right – dividing it by the square root of the product of the two candidate variances should yield an ordinary correlation coefficient, which I think it would. The method also yields extremely reasonable results for every combination of marginals I’ve tried.

      The best authority I can offer is that I struggled through a lot of PhD-level Stat, relying on similar methods. Generally, my exams received grades of B+, with the notation that my approach was “brilliant but not quite correct”.

      by: Grebner @ Sat Dec 22, 2007 at 16:45:42 PM CST

      1. Yet another correction
        “r” – the correlation coefficient – would be equal to the square root of the covariance divided by the square root of the product of the variances of the two variables.
        by: Grebner @ Sat Dec 22, 2007 at 18:36:01 PM CST

        1. Accuracy of my proposed formula
          I just spent (“wasted”) four hours running “Monte Carlo” simulations, generating something like 5 billion sets of hypothetical poll results and comparing my results to the observed value for Pearson’s r.
          The good news is that my formula performs very well under almost all circumstances – it appears to converge with the “true” value when the probabilities for each candidate lie between 0.1 and 0.9, and are roughly equal.

          The bad news (other than the loss of half a day’s work) was that for very high or low percentages, especially where the candidates receive very different votes, it underestimates r by as much as 0.05.

          In other words, for the example we’ve been using (40% for Hillary, 30% for Barack) it’s essentially perfect. If you compare Hillary to Dennis Kucinich’s 4%, using my formula would cause you to underestimate the MOE by almost half a percent.

          This tells me there’s a factor missing somewhere, of the n/(n-1) variety. Close enough for government work, I think.

          by: Grebner @ Thu Dec 27, 2007 at 23:30:14 PM CST

  5. What would Epic Ed Sarpoulis say?
    I believe Mark’s formulas for predicting outcome. That ballot issue last year regarding affirmative action displayed Grebner’s theory.
    I think polls run themselves to keep people interested in polls. They are just junk. Cell phones, call screening and less land lines make polls garbage.

    Ed Sarpoulos is always running on about what his survey finds. Well, I have been called more than once in one month regarding his surveys. They call people who will cooperate. Pure and simple. My feeling is that they are useless and only serve to drive business to themselves.

    Follow Grebner’s formula and you will better predict the winner.

    by: treehugger52 @ Sat Dec 22, 2007 at 19:11:10 PM CST

  6. Overthinking MOE
    I have never understood the over-analysis of margin of error. The assumption is that at the poll’s worst the data could be 5 percent off.
    Seems to me that creating formulas to calculate a weighted margin of error is just a way to create false hope. The only way that I can see to evaluate a poll’s accuracy is to compare it to other polls.

    by: spontoni @ Sun Dec 23, 2007 at 16:33:51 PM CST

    1. The main reason to think about MOE is when it’s very large
      Looking at the simple marginal from a typical poll, you’re exactly right – it’s a handful of points, and not worth obsessing about. But when the question involves something else – a change in the size of one candidate’s lead from one poll to the next – the MOE can easily be double-digit, and what looks like a substantial shift can be simply noise.
      But that’s a subsequent posting. . .

      by: Grebner @ Sun Dec 23, 2007 at 19:16:14 PM CST
