Update: Predicting Leeds Utd's Progress Through the 2025/26 EPL Season

Back in July I wrote about how Particle Filters can be used to provide a probabilistic estimate of the points Leeds Utd could accumulate through the new season. Based purely on bookmakers' odds, the model predicted a modal outcome of 32 points, well below the 40 points widely regarded as the safe total for avoiding relegation.

Well, we're 17 games into the season (almost halfway), so it's a good time to re-run the model and see whether Leeds' fortunes have improved. To make it interesting, I've modified the model so that the team's form in past matches contributes to future outcomes, rather than relying solely on bookmakers' odds. The key idea is to introduce a latent variable into the model. A latent variable is not directly observable; it represents intangible qualities like team quality, current form or managerial effectiveness. So, instead of pretending probabilities are fixed, we say, "There is an underlying hidden state (the latent variable) that generates match results."

My original model assumed that each match Leeds played had a fixed set of outcome probabilities (pW, pD, pL) for a win, draw and loss. In that version, past results only added points; they didn't change future probabilities. In the new model, a latent variable represents Leeds' "strength" or "form."

How Latent Strength Affects Match Odds

The strength, s, of the Leeds team is a continuous variable. When s = 0, Leeds is exactly as good as preseason expectations. If s > 0, Leeds is stronger than expected, and if s < 0, the team is weaker than expected.

We think of the preseason odds (pW, pD, pL) as the prior belief about a match outcome. The latent strength nudges these beliefs: positive values push probability from loss towards win, and negative values push probability from win towards loss.

Without going into detail, we manipulate the bookmakers' odds in log-odds (logit) space, then apply a softmax transform, which ensures the final probabilities are valid and sum to 1. This mirrors how bookmakers adjust odds.
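For anyone who wants to see the shape of that adjustment, here is a minimal sketch. The exact form is my illustration rather than the model's actual code: it assumes s is added to the win log-probability, subtracted from the loss log-probability, and the draw is left alone before the softmax. The function name adjust_probs is hypothetical.

```python
import math

def adjust_probs(p_win, p_draw, p_loss, s):
    """Shift baseline (win, draw, loss) probabilities by latent strength s.

    Assumed form (illustrative only): win logit + s, loss logit - s,
    draw logit unchanged, then softmax to renormalise.
    """
    logits = [math.log(p_win) + s, math.log(p_draw), math.log(p_loss) - s]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return tuple(e / total for e in exps)

# Made-up baseline odds for a single match, not real bookmaker prices.
for s in (-0.5, 0.0, 0.5):
    print(s, [round(p, 3) for p in adjust_probs(0.30, 0.30, 0.40, s)])
```

With s = 0 the baseline probabilities come back unchanged, which matches the idea that s = 0 means "exactly as good as preseason expectations". Because the softmax renormalises, the inputs do not need to sum to exactly 1.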

How the Model Learns from Results

When a match is played:

  • Every particle gets its own s
  • That s implies a probability of the observed result
  • Particles that made the result more likely get more weight
  • We resample, concentrating probability mass around plausible s values

This is effectively Bayesian updating. After 17 matches played, we now have a posterior distribution over s that feeds into all remaining matches.
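The four steps above can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the model's actual code: the prior on s is taken to be Normal(0, 0.5), s is kept fixed between matches, resampling is plain multinomial, and the names shift_probs and update_particles are hypothetical.

```python
import math
import random

def shift_probs(base, s):
    """Assumed form: win logit + s, loss logit - s, then softmax."""
    p_win, p_draw, p_loss = base
    logits = [math.log(p_win) + s, math.log(p_draw), math.log(p_loss) - s]
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_particles(particles, base, outcome, rng):
    """One filter step for a played match.

    particles: list of s values, one per particle
    base:      baseline (win, draw, loss) probabilities for the match
    outcome:   index of the observed result (0 = win, 1 = draw, 2 = loss)
    """
    # Weight each particle by the probability its s gave the observed result.
    weights = [shift_probs(base, s)[outcome] for s in particles]
    # Multinomial resampling: draw with replacement in proportion to weight.
    return rng.choices(particles, weights=weights, k=len(particles))

rng = random.Random(0)
# Hypothetical prior: s ~ Normal(0, 0.5) before any match is played.
particles = [rng.gauss(0.0, 0.5) for _ in range(5000)]
# Made-up results for illustration, not Leeds' actual matches.
played = [((0.30, 0.30, 0.40), 0), ((0.45, 0.28, 0.27), 1)]
for base, outcome in played:
    particles = update_particles(particles, base, outcome, rng)
print(sum(particles) / len(particles))  # posterior mean of s
```

After a win, particles with higher s are copied more often, so the cloud drifts upward; after a loss it drifts downward.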

So when we simulate future fixtures, each particle represents a plausible level of Leeds performance: strong particles win more often, weak particles lose more often, and our baseline odds anchor the difficulty of each future match.
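As a rough sketch of that forward simulation: each particle plays out the remaining fixtures once, using its own s to shift the baseline odds, and adds 3 points for a win and 1 for a draw. The logit shift is the same assumed form as before, and the function names, starting points total and fixture odds below are all made up for illustration.

```python
import math
import random

def shift_probs(base, s):
    """Assumed form: win logit + s, loss logit - s, then softmax."""
    p_win, p_draw, p_loss = base
    logits = [math.log(p_win) + s, math.log(p_draw), math.log(p_loss) - s]
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def simulate_totals(particles, current_points, fixtures, rng):
    """Simulate one rest-of-season per particle; return final points totals."""
    totals = []
    for s in particles:
        points = current_points
        for base in fixtures:
            probs = shift_probs(base, s)
            # Sample win / draw / loss and add 3 / 1 / 0 points.
            points += rng.choices((3, 1, 0), weights=probs, k=1)[0]
        totals.append(points)
    return totals

rng = random.Random(0)
# Stand-in posterior cloud; in practice this comes from the filter updates.
particles = [rng.gauss(0.2, 0.3) for _ in range(5000)]
# Hypothetical baseline odds for three remaining fixtures.
fixtures = [(0.25, 0.27, 0.48), (0.40, 0.30, 0.30), (0.33, 0.30, 0.37)]
totals = simulate_totals(particles, current_points=20, fixtures=fixtures, rng=rng)
print(min(totals), max(totals))
```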

So What's Leeds Utd's Predicted Points Total at the End of the Season?

The image at the top of the article shows predicted percentile bands, illustrating how the distribution of points evolves over the season. The lines are quantiles of the distribution of simulated seasons. The green line is the median: at each matchday, half the simulated seasons lie above it and half below.

The most frequent points total (the mode) provides a useful estimate of the final total in this case. The model predicts a modal value of 40 points! This is good news, and is driven by recent improvements in form interacting with the baseline difficulty of the remaining fixtures.
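For reference, here is one way the quantiles and the mode could be pulled out of a list of simulated final totals using only the Python standard library. The choice of the 5th and 95th percentiles for the outer band is my assumption, and the function name summarise and the sample totals are invented for illustration. The same summary could be applied at each matchday to build up the bands.

```python
from collections import Counter
import statistics

def summarise(totals):
    """Summarise simulated points totals: outer percentiles, median, mode."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(totals, n=20, method="inclusive")
    return {
        "p5": cuts[0],
        "median": statistics.median(totals),
        "p95": cuts[-1],
        # Most frequent total; ties go to the value seen first.
        "mode": Counter(totals).most_common(1)[0][0],
    }

# Invented totals, standing in for the output of a real simulation run.
example_totals = [36, 38, 39, 40, 40, 40, 41, 42, 44]
print(summarise(example_totals))
```

Points totals are integers, so the mode is well defined, which is part of why it makes a handy headline number here.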

At present the remaining fixtures still use the preseason bookmaker probabilities as the baseline; updating these to current market odds will further refine the forecast.

If this current run of form continues, then Leeds are on target to reach the "safe" total of 40 points! Time will tell, of course.

Hope you found this interesting. I'll provide an update at the end of the season, to see how close we got to the model prediction.

Until then, Merry Christmas and MOT!



As long as they don't get 3 points today. 😉 😆

And people still doubt the real-world impact of AI....
