Discussion about this post

Thomas House

The important message here is the one you mention. But we do have some algorithmic ways to avoid finding spurious correlations! For example, I would expect a QQ plot to reveal that the residuals are very far from the normal distribution that a linear regression usually implicitly assumes.
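
As a rough illustration of the QQ check described above, here is a minimal sketch using only the Python standard library. The `alphabetical_rank` and `final_position` values are invented purely for illustration and are not the data from the post; `fit_line` and `qq_points` are hypothetical helper names. The idea is to fit a straight line, standardise the residuals, and pair them with the quantiles a normal distribution would give.

```python
from statistics import NormalDist, mean, pstdev


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def qq_points(residuals):
    """Pair each sorted, standardised residual with a normal quantile."""
    n = len(residuals)
    m, s = mean(residuals), pstdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    # Plotting positions (i + 0.5) / n keep the probabilities strictly in (0, 1).
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# ASSUMPTION: made-up ranks for 20 teams, not the real league table.
alphabetical_rank = list(range(1, 21))
final_position = [3, 7, 1, 12, 5, 9, 2, 15, 4, 18,
                  6, 11, 8, 20, 10, 14, 13, 17, 16, 19]

a, b = fit_line(alphabetical_rank, final_position)
residuals = [y - (a + b * x) for x, y in zip(alphabetical_rank, final_position)]

# Each row: theoretical normal quantile, observed standardised residual.
for q, r in qq_points(residuals):
    print(f"{q:+.2f}  {r:+.2f}")
```

If the residuals were roughly normal, the two columns would track each other closely; large, systematic gaps at the ends are the kind of departure a QQ plot makes visible.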

I Lang

In the interests of science and in light of the ongoing replication crisis, I repeated your analysis for the Premier League and then on each of the English Championship, League One, and League Two.

Although your finding for the Prem seems robust, I regret to inform you that I did not find any significant relationship when I repeated the analysis for each of the other leagues. When I looked across all 92 teams (every Prem and EFL team, ranked alphabetically and by position from 1 to 92), there was, again, no significant relationship.
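
For anyone wanting to repeat this kind of check, here is a minimal sketch of one possible approach, not necessarily the method used above: a permutation test on the Spearman rank correlation between alphabetical rank and league position. The function names, the number of permutations, and the seed are all assumptions made for illustration, and the code assumes both inputs are tie-free rankings from 1 to n.

```python
import random


def spearman(x, y):
    """Spearman correlation for two tie-free rankings of 1..n."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))


def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided p-value: how often a shuffled ranking is at least as correlated."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# ASSUMPTION: toy rankings for a 10-team league, not real data.
alphabetical_rank = list(range(1, 11))
final_position = [4, 1, 7, 2, 9, 3, 10, 5, 8, 6]

print(spearman(alphabetical_rank, final_position))
print(permutation_p_value(alphabetical_rank, final_position))
```

Running the same function once per league, and once on the combined table, gives one p-value per test; with several tests in play, a single small p-value is weaker evidence than it looks in isolation.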

I think your working theory may require some post hoc modification to explain why we see this correlation only in the top tier and not in any of the others.
