Bullshit Baffles Brains
My dad always said to me as a kid “bullshit baffles brains”, and that saying came back to mind yesterday. There has been a whole lot of misinformation around the internet recently regarding the US election. This is not a new thing: the general trend started in the run-up to Brexit four years ago and has accelerated immensely. Of course there has been misinformation for as long as there have been humans, but the current scale is at a level that, I feel, makes it one of the main threats to society now.
I believe that “bullshit baffles brains” is a military term, which stands to reason as my father was a Red Beret (Paratrooper) in the British Army.
Yesterday I saw some tweets going around mentioning something called “Benford's Law” and showing two charts, one of which supposedly showed “fraud” in the Biden votes in the recent US election.
I'd not come across Benford's law and so took a bit of time to look it up. I am not a data scientist, but I am very interested in data science, and as a job I help explain data science and machine learning to software developers.
Immediately my alarm bells were going off. Why would the number distribution follow such a pattern, and under what circumstances?
Many natural statistics follow what is called a “normal distribution”. Take a person's height for example. The average height for a male in the UK is 5ft 9in (175.3cm). But you know people shorter than that and taller than that, and the further from the average you get, the fewer people you know. If you were to plot the number of people at each height, the result is what is known as a “bell curve”.
Benford's law states that for certain sets of numbers the first digit is most likely to be 1, then 2, then 3 and so on. This seemingly odd result tends to appear when the numbers span several orders of magnitude, for example when they come from combining several different distributions. It was first noticed in the 19th century and has since been used to detect fraud in accounting data.
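To make that pattern concrete: Benford's law is usually written as saying that a leading digit d turns up with probability log10(1 + 1/d). Here is a minimal Python sketch of that formula. The function name is my own, purely for illustration.

```python
import math


def benford_probability(digit: int) -> float:
    """Expected share of numbers whose first digit is `digit` under Benford's law."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

Running it shows roughly 30% of numbers expected to start with a 1, falling to under 5% for a 9.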
But what about elections? In the tweets above they plot the frequency of the first digit of the number of votes recorded in each voting precinct. The main problem with this is that those numbers are actually quite small and fall within a narrow range, and as a result don't comply with Benford's law (which only works when you have numbers that span many orders of magnitude). Most voting districts have similarly sized catchment areas by design.
Taking this to an extreme, imagine that all districts had between 30 and 90 people in them. Then a chart plotting the frequency of the first digit of the number of people in each district would have no results for 1 or 2, as there are no districts with a number of people that starts with a 1 or 2.
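That thought experiment is easy to try out. The sketch below is only an illustration with made-up random numbers, not real election data: it counts first digits for imaginary districts of 30 to 90 people, and for comparison does the same for numbers spread across six orders of magnitude. The district sizes, sample size and function name are all my own assumptions.

```python
import random
from collections import Counter


def first_digit_counts(numbers):
    """Count how often each digit 1-9 appears as the leading digit."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    return {d: counts.get(d, 0) for d in range(1, 10)}


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical districts, every one with between 30 and 90 people.
narrow = [rng.randint(30, 90) for _ in range(10_000)]

# Made-up numbers spread evenly (on a log scale) from 1 to 1,000,000.
wide = [int(10 ** rng.uniform(0, 6)) for _ in range(10_000)]

print("30 to 90 only: ", first_digit_counts(narrow))
print("1 to 1,000,000:", first_digit_counts(wide))
```

The narrow set can never produce a leading 1 or 2, however honest the numbers are, while the wide set leans heavily towards 1 in the way Benford's law describes.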
Here is a fantastic video explaining it in a far better way than I can:
The key takeaway is that whilst Benford's law can show some evidence of numbers being “not normal” it does not necessarily point out fraud. You then need to look at why those numbers are what they are. And in the case of the US election it appears to mainly come down to a fluke of the distribution of the number of people in each district.
In fact there have been papers published showing that Benford's law is not very applicable to votes and elections. For example, this one from Cambridge University in 2017, whose abstract states that it is “...problematical at best as a forensic tool and wholly misleading at worst.”
And so on to the second blatant attempt at “bullshit baffles brains”:
I've come across this chap before. He tries to trade on some false credibility about being the “inventor of email”. I've already covered Benford's Law above, which is how he relates to this mess. So I've dealt with the substance of his claims, but now I'd like to move on to the man himself.
The “inventor of email”. He has an entire web site dedicated to trying to uphold this charade:
Of course, this is a complete lie to begin with. He claims that as a 14 year old boy in 1978 he wrote a program called “EMAIL”. However, engineer Ray Tomlinson created the first networked email program in 1971. Tomlinson was the first to use the @ sign in an address as a means to separate the user from the machine they were on.
But as with most successful conmen, Ayyadurai takes a kernel of truth and then adds layers of misdirection to it. The most obvious of these is the claim that he has a “copyright on email”. To many people that sounds pretty convincing. And it is actually true. He has a copyright issued by the US Copyright Office. But what for? First let's look at various common intellectual property (IP) protections:
- Copyright: an intellectual property protection on a creative work (story, music, etc.)
- Patent: an intellectual property protection on an invention, method, device, mechanism, etc.
- Trademark: an intellectual property protection on a term, logo, name, etc.
He has a copyright. But you can't copyright a concept, or a word. So what has he got? Read more closely and you find he has a copyright on a program he wrote, called “email”:
In most jurisdictions copyright is automatically granted to the author of a piece of work, e.g. this blog post is automatically my copyright. Do you know what else I have copyright on?
An email program.
Granted, it is not a very good email program (or VERY good depending on your view of email). But it is a program. And it is called “email” and I have automatic copyright on that code as I wrote it.
So what does having copyright on “email” mean? Nothing really. It is a completely nonsensical claim. It would be like Charles Dickens saying he has a copyright on Oliver Twist, and hence invented the “novel”. Except that Dickens is dead, and copyright automatically ends after a set period of time, usually a number of years (e.g. 70) after the author of the work dies, at which point the works enter the public domain.
If he had a patent on email as a method/process then he could prevent others from implementing something that does the same thing. If he had a trademark on “email” he could prevent others from calling something “email”. But he has a copyright, which just means others can't copy the specific code he wrote 42 years ago. Code for a concept that someone else invented 7 years earlier.
But the point is, he tries to use this nonsensical, but true, claim of having a “copyright on email” to give weight to the lie that he is the inventor of email. This in turn has no doubt led to a number of other deceptions... ending up with him claiming some authority in his lie that Benford's Law indicates Biden's votes were manipulated.
And are people fooled by this charade? Yes. They are. The inventor of email has mathematical proof that Biden cheated. Sounds very convincing.
But none of it is true.