Statistics and contested elections

Documents recently filed in the US Supreme Court by those contesting the US election results suggest that a late run of votes for Joe Biden in some counties or States was, statistically, a very low ‘chance’ event, and so should be taken as proof that something illegal was definitely going on. See page 8 of this Motion.

The problem here, like many desperate resorts to ‘low likelihood’ statistics, is that mathematical facts, which are not remarkable in themselves, are given a legal significance that they do not deserve.

The main problem in putting forward statistics of this kind at all is that ‘low-likelihood’ statistical exercises are examples of mathematical forecasting of future events, and Courts deciding historical facts are not interested in forecasting. They still require proof of what actually happened.

This resort to statistics is similar to the prosecutor’s fallacy which I have written about before, in the context of injustice. Those ideas are still relevant to the way in which these arguments about statistics, repeated in these US Court cases, are received.

It is always the plaintiffs who introduce the idea that one now-certain, historical sequence of events (e.g. voting sequences) is mathematically similar to every other low-likelihood sequence of unpredictable events (thereby depriving it of any legal significance for proof of historical facts, let alone general illegality). It is a false comparison, because both are unproven theories. If the sequence of events were actually tainted by general illegality, then it would not be appropriate to say that, in the past, there was ever an infinitesimally small, unpredictable likelihood. They are two competing hypotheses, and both require proper proof.

What assumptions are needed to make low-likelihood statements?

In the US documents, it is alleged that the likelihood of any late run of votes all for Biden in some States was extremely low (infinitesimally small). Plaintiffs then argue (erroneously) that this low-likelihood statistic is proof of illegality. It’s not. It’s just an example of doing mathematics. And forecasts that assume everything in the world is done by chance, where individual votes are not affected by any general factor or illegality, are thereby irrelevant to any competing legal case that suggests there was such a general determinative factor. This includes a situation where the same plaintiff is running both of these competing arguments.

The actual reasoning in some of the legal documents asserts that voting patterns in the past (e.g. 4 years ago) still help with forecasting votes now. That’s silly. But even if we just assumed that you could try and calculate how low a chance there was of actually forecasting any specific series of votes in an election, it still wouldn’t help build a case for illegality. At best, you might say that the preferences of voters (or the mid-range of those voters) shifted over time. In any case, much of the complaint in this election has been about late runs of votes in favour of Biden, which has more to do with sequences, than just general distributions.

The kind of mathematics used for forecasting a random sequence of votes, and how likely any particular sequence is, amounts to multiplying together many numbers that are each less than 1. For example, if I flip a coin 50 times, and the likelihood of heads and of tails is each 50% (and that’s all I know), then to calculate how unlikely a given sequence of coin tosses is, from that point onwards into the future, I need to multiply 0.5 by itself once for every toss (that’s forecasting). So it’s 0.5 x 0.5 x 0.5 … and so on, 50 times. It turns out that all of the sequences (the ones that look like they have patterns in them and the ones that look more random) are, taken individually, equally and extremely unlikely. The probability of any specific sequence of 50 coin tosses, in a particular order, is very low. This forecast has nothing to do with whether a coin tosser is engaged in criminal activity or not. It’s based on the number of combinations that exist, and the maths used for forecasting.
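The coin-toss arithmetic can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the example (a fair coin, every toss independent); the function name and the two sample sequences are made up for the purpose. It shows that a sequence of 50 heads in a row and an irregular-looking sequence of 50 tosses get exactly the same forecast probability, 0.5 multiplied by itself 50 times, or roughly 1 in 1.1 quadrillion.

```python
from fractions import Fraction


def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Forecast probability of one exact, ordered sequence of tosses.

    Assumes every toss is independent and heads has probability p_heads.
    """
    prob = Fraction(1)
    for toss in sequence:
        # Multiply by the chance of this single outcome.
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob


# A sequence that looks like it has a pattern: 50 heads in a row.
patterned = "H" * 50

# A sequence that looks more random (hypothetical, 50 tosses long).
irregular = (
    "HTTHHTHTTT"
    "THHTHHHTTH"
    "HTHTTHHHTH"
    "TTHTHHTTHT"
    "HHTHTTTHHT"
)

p_patterned = sequence_probability(patterned)
p_irregular = sequence_probability(irregular)

print(p_patterned)                 # exact fraction: 1 over 2 to the power 50
print(float(p_patterned))          # the same value as a decimal
print(p_patterned == p_irregular)  # both sequences are equally unlikely
```

Nothing in the calculation knows or cares how the tosses actually came about. It only counts combinations.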

To compare electoral voting to coin tossing is quite a stretch, but it doesn’t change the nature of the exercise if someone wants to do some simple forecasting. Doing maths about forecasting the sequences of random and independent votes tells us nothing about how a particular sequence actually came about.

Here’s an illustrative example (B=vote for Biden, T=vote for Trump). Let’s say BBBTTTBTBTB is a series of votes. This has nothing to do with party preference, and everything to do with the starting assumptions.

When a Court looks at the sequence BBBTTTBTBTB, its usual task is not to start on the left side and assess how likely that run of voting events was, assuming each was independent. It starts with the knowledge that the votes BBBTTTBTBTB have actually occurred. It merely has to confirm that the run of voting events, as recorded (BBBTTTBTBTB), was historically accurate. The odds that this sequence occurred, historically? Probably very high, if not 100%.

A Court wouldn’t be concerned to do mathematical forecasting at all, certainly not the kind that involves multiplication of small numbers thousands of times. Mathematicians doing this would first need to assume that we should:

  • go back in time (back before any voting happened);
  • ignore any relevant information about the reasons for individual voting outcomes being dependent on individuals, manner, timing or location; and
  • forecast the likelihood of particular voting sequences (many thousands of votes long) into the future.

Voting in the US elections for 2020 has concluded. A Court does not need to go back in time and pretend it is engaged in forecasting. And if it did that, why wouldn’t it go back to different times, and compare different situations?

There are two sobering observations for those who become excited by the prospect that certain runs of events that occurred might be low-likelihood.

The first is that you have to apply the same logic to all other sequences of events in your sample (whenever they occurred). If you say that a run such as ‘BBBBBBB’ is low-likelihood, then the same logic leads to the conclusion that ‘TTTTTBT’ is also low-likelihood. This could be an early run or a late run of votes. Or any other sequence we care to choose. Depending on where we wanted to go back in time, we could say that any future sequence of voting is low-likelihood (for either candidate).
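To see this with the 11-vote illustration used earlier, here is a rough Python sketch. It assumes, purely for the sake of the forecasting exercise, that each vote is an independent 50/50 event (an assumption this article regards as a stretch for real elections). It lists every possible sequence of 11 votes and gives each one the same forecast probability.

```python
from fractions import Fraction
from itertools import product

# The illustrative sequence from earlier in the article.
OBSERVED = "BBBTTTBTBTB"
n = len(OBSERVED)

# Every possible ordered sequence of n votes for B or T.
all_sequences = ["".join(votes) for votes in product("BT", repeat=n)]

# Under the independent 50/50 assumption, each exact sequence has
# the same forecast probability: one half multiplied n times.
prob_each = Fraction(1, 2) ** n

print(len(all_sequences))               # how many sequences exist
print(prob_each)                        # forecast probability of any one of them
print(OBSERVED in all_sequences)        # the observed sequence is just one of many
print(prob_each * len(all_sequences))   # the probabilities sum to 1
```

An all-B run, an all-T run and the sequence that was actually recorded all receive the same small number. The low figure comes from the number of possible sequences, not from anything special about the one that occurred.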

The second is that if you are going to go back in time and say that you knew that, in fact, there was actual fraud or deliberate intervention, then you are asserting the outcome was predictable, and not a ‘chance’ event at all. Even at that time, your forecast for this sequence of votes should be closer to 100%, not infinitesimal. For this reason, it is usually only a person who lacks the relevant evidence who resorts to these statistical ‘low-likelihood’ tricks, when they really have a different narrative: that it was inevitable.

In a purely ‘chance’ situation, we can only speculate about a run of events into the future. For an election, that would mean assuming the outcome of each vote is independent of any other factors, so that the time and sequence in which it occurs is not going to influence the likelihood of any other voting outcome. No Court should do this (and if they do, they are falling into error).

What evidence is needed in the Courts?

Courts examine what has historically occurred, and take into account any useful information they have that might explain the dependence of events on other events, or the facts that occurred during a particular period of time.

Mathematical forecasting is immediately at odds with what Courts normally do when making findings about historical facts. A resort to statistics inherently reverses the normal focus of the Courts, which is on confirming events that we can usually say, with close to 100% certainty, have already occurred in the past.

The problem for those who introduce statistical arguments is that they usually have to have a new hypothesis that asserts some historical wrongdoing (which is a non-mathematical conclusion). Once they do this, they are no longer engaging in mathematical forecasting. The mathematical forecasting does not provide any useful proof of what actually happened, and it will not support any specific explanations of wrongdoing. So the plaintiffs now need to resume the normal legal practice of providing independent evidence to support their claims.

The appropriate evidence for reviewing electoral fraud claims needs to be in the form of actual evidence of how voting occurred. The historical events, as they occurred, might demonstrate that each voter cast their vote, unexceptionally. At different times there were voters voting for one candidate or another. This actual voting had nothing to do with chance. The manner of votes going for particular candidates may be explained by specific events in the real world. Without evidence that anything other than normal voting occurred, the sequence of ‘BBBTTTBTBTB’ would be confirmed by a Court to be historically accurate.