League of extraordinary gentlemen
Everyone seems to know the score, they've seen it all before
I posted a graph on LinkedIn yesterday which did good business there. It shows a very strong correlation between Premier League teams’ rank in alphabetical order, and their current position in the league table. For people who like p-values, I can tell you that p=0.0131, which means that if there were really no correlation between these things, then you’d only expect to see as good a fit as this about once every 76 years.
I mean, that’s obviously absurd - we don’t believe that the alphabet affects football results! And of course, I posted it as a joke. But in a world where getting a p-value lower than 0.05 can be the threshold for publication in a decent scientific journal, it’s at least worth thinking about what it all means.
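One way to see what a p-value like this is measuring is a permutation test: shuffle the league positions many times and count how often a random table lines up with the alphabet at least as well as the real one. The sketch below is a minimal illustration, not necessarily the method behind the figure quoted above. It assumes a Spearman rank correlation with no ties, and the function names and the perfectly ordered example table are hypothetical.

```python
import random


def spearman_rho(xs, ys):
    """Spearman rank correlation for two tie-free rankings of equal length."""
    n = len(xs)
    d2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n * n - 1))


def permutation_p_value(league_positions, n_shuffles=20_000, seed=0):
    """Estimate how often a random table correlates at least as strongly.

    league_positions[i] is the league position (1..n) of the team that is
    (i + 1)-th in alphabetical order. Two-sided: compares |rho|.
    """
    rng = random.Random(seed)
    n = len(league_positions)
    alpha_ranks = list(range(1, n + 1))
    observed = abs(spearman_rho(alpha_ranks, league_positions))
    shuffled = list(league_positions)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in exact ties.
        if abs(spearman_rho(alpha_ranks, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_shuffles


# Hypothetical sanity check, not the real table:
# a league finishing in perfect alphabetical order.
perfect = list(range(1, 21))
print(spearman_rho(perfect, perfect))  # 1.0
print(permutation_p_value(perfect))
```

Feeding in the real table would give an estimate to compare with the quoted p-value. The "once every 76 years" reading is then just 1/0.0131, which is about 76.3, treating each season as one independent draw.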
Firstly, maybe I’ve made a mistake with the data? It’s not a bad place to start. But if you check the table you’ll see that first-in-the-alphabet Arsenal are top, last-in-the-alphabet Wolves are bottom, and there’s even a lovely little alphabetical run of Aston Villa-Bournemouth-Brentford-Brighton-Chelsea-Everton-Fulham from 5th to 11th.
And, even if you don’t know any fancy statistics, you can see there’s something a bit rare going on. If you put twenty tiles from A to T in random order, the chance that the first one would be A and the last one would be T is 1 in 380 (that’s 20 times 19), so even Arsenal being first and Wolves being last seems unusual.
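The 1 in 380 figure is easy to check. Here is a small sketch that computes it exactly and then confirms it by shuffling tiles, with the tile labels replaced by the numbers 0 to 19 (the function names are mine, purely for illustration).

```python
import random
from fractions import Fraction


def exact_chance(n_tiles=20):
    """Chance that one named tile is first and another named tile is last."""
    # 1/n for the first slot, then 1/(n - 1) for the last slot.
    return Fraction(1, n_tiles) * Fraction(1, n_tiles - 1)


def simulated_chance(n_tiles=20, n_trials=200_000, seed=1):
    """Estimate the same chance by shuffling the tiles repeatedly."""
    rng = random.Random(seed)
    tiles = list(range(n_tiles))  # 0 stands for "A", n_tiles - 1 for "T"
    hits = 0
    for _ in range(n_trials):
        rng.shuffle(tiles)
        if tiles[0] == 0 and tiles[-1] == n_tiles - 1:
            hits += 1
    return hits / n_trials


print(exact_chance())      # 1/380
print(simulated_chance())  # should land close to 1/380, i.e. about 0.0026
```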
But is that right? On some level the answer to the graph conundrum is that “correlation doesn’t imply causation”, and that’s kind of true. Except that causation can often imply correlation, and newspapers can be quite happy to mix the two up. Notice here how the warning in this article that:
observational studies like this cannot prove cause and effect. People who eat eggs regularly may differ from those who rarely do. Their overall diet and lifestyle may be different in important ways.
has become
Eating eggs five times a week could cut Alzheimer’s risk
under the influence of the headline writer. It’s definitely an elision worth keeping an eye out for. But I’ve written about correlation and causation before, whereas I think that here something simpler is going on.
That is, what possessed me to make the graph in the first place? Well, as an Aston Villa fan, I’ve been obsessing over that alphabetical run of teams lately. As it stands after the weekend, the only teams which can stop us qualifying for the Champions League are the aforementioned Bournemouth, Brentford and Brighton, and it’s natural to spot what they have in common. Once you notice that, it’s hard not to notice the nice little run below that, to join that in your mind with the Arsenal-Wolves coincidence, and wonder about plotting the data.
And if I’d done that, and the correlation hadn’t been so strong, then I probably wouldn’t have bothered putting the graph on LinkedIn. Similarly if the p-value had been higher than 0.05, I wouldn’t have told you about it. And if the effect hadn’t been so strong, then nobody would have liked the post. (Pedants will also spot that I cheated by putting AFC Bournemouth under B not A - of course this was a deliberate decision to help the pattern work).
In other words, what we are looking at is an example of HARKing - “Hypothesising After the Results are Known” - coupled with a small dose of cherry picking and the file-drawer effect. If I’d made the prediction explicitly at the start of the season, or if the same pattern had shown up in a league I’d never looked at, that would have been much more impressive. Instead, I’m putting together the a posteriori outcome of a series of unlikely events (Liverpool getting worse despite spending a fortune, World Club Champions Chelsea dropping down the table, Newcastle and Nottingham Forest being much worse than last season, Spurs somehow not being any better) and telling a cute story to explain it.
Of course this kind of thing goes down well on LinkedIn. And I like to think that most of the people liking the post, many with fancy job titles involving analytics and data, were in on the joke.
But equally, it’s worth thinking about the way that this data was collected, and how the human mind looks for coincidences and isn’t always great at considering the chance of them happening. The surnames of my first three PhD students began with the letters A, B and C in that order - which is another neat fact, except I wouldn’t be telling you if they’d been S, J and L. David Hand’s excellent book The Improbability Principle gives many examples of this kind of effect, and argues that at least in part coincidences seem to happen so often simply because we have many opportunities for them to occur, and we only focus on the times that they do.
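To put a rough number on "many opportunities", here is a minimal sketch of how quickly the chance of at least one coincidence grows with the number of places you look. It assumes the opportunities are independent and equally likely, which real-world coincidences rarely are, so treat it as an illustration rather than a model. The function name is hypothetical.

```python
def chance_of_at_least_one(p_single, n_opportunities):
    """Chance that an event with probability p_single happens at least once
    in n_opportunities independent tries."""
    return 1 - (1 - p_single) ** n_opportunities


# Twenty unrelated comparisons, each with a 5% chance of a false alarm.
print(round(chance_of_at_least_one(0.05, 20), 3))  # 0.642

# A 1-in-380 coincidence, given 380 independent chances to show up.
print(chance_of_at_least_one(1 / 380, 380))
```

So with twenty independent looks at pure noise, seeing at least one result below the 0.05 threshold is more likely than not.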
As a result, it’s important to bear in mind these kinds of effects when data is being reported. If there’s an apparent strong side-effect of some drug which is only seen in 30-34 year old men, is it possible that someone is just reporting the effect of random variation showing up in one group? Is it biologically plausible you wouldn’t see some effect in 25-29s and 35-39s? As is often the case, maybe we can leave the last word to Richard Feynman:
You know, the most amazing thing happened to me tonight... I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!



The important message here is the one you mention. But ... we do have some algorithmic ways to avoid finding spurious correlations! For example, I would expect a QQ plot to reveal that the residuals are very far from the normal distribution usually implicit in a linear regression.
In the interests of science and in light of the ongoing replication crisis, I repeated your analysis for the Premier League and then for each of the English Championship, League One, and League Two.
Although your finding for the Prem seems robust, I regret to inform you that I did not find any significant relationship when I repeated the analysis for each of the other leagues. When I looked across all 92 Prem and EFL teams together (ranked alphabetically and by position from 1 to 92) there was, again, no significant relationship.
I think your working theory may require some post hoc modification to explain why we see this correlation only in the top tier and not in any of the others.