Lab
experiments behind the glass · what we are testing · the method
Behind the glass
The work in progress, not just the finished number.
The live call is built from football alone. These are the rival models and extra signals we are testing to read a match more deeply than a single rating can: who actually plays, and whether the game even matters to a side. Nothing here is the published call. Each one has to earn that, in the open, on the record.
How an experiment earns the live call
- 01 Shadow: A new idea runs quietly beside the live model on real games, locked before kickoff. It is never shown as the call.
- 02 Score: After full time it is graded against the result, the same way the live model is, over many games. One match never decides anything.
- 03 Promote: It takes over the live call only when it genuinely forecasts better across a long run. Never because it is new, never on a hot streak. A human opens that door.
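
As a rough illustration of the promote step, the sketch below flags a challenger for human review only when it has a long run behind it and its edge is clear of noise. The function name `review_gate`, the 200-game minimum and the two-standard-error margin are all assumptions made for this example, not thresholds the site states. It takes per-game misses as given and never promotes anything by itself.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass
class GateResult:
    games: int
    mean_gap: float        # live miss minus challenger miss; positive favours the challenger
    flag_for_review: bool  # only a flag: a human still makes the decision


def review_gate(live_misses, rival_misses, min_games=200, z=2.0):
    """Flag a challenger for review. min_games and z are illustrative values."""
    if len(live_misses) != len(rival_misses):
        raise ValueError("both models must be graded on the same games")
    gaps = [live - rival for live, rival in zip(live_misses, rival_misses)]
    n = len(gaps)
    if n < 2:
        # Too few games to estimate any spread at all.
        return GateResult(n, mean(gaps) if gaps else 0.0, False)
    gap = mean(gaps)
    stderr = stdev(gaps) / sqrt(n)
    long_run = n >= min_games           # guards against a hot streak
    clear_edge = gap - z * stderr > 0   # edge must survive the noise margin
    return GateResult(n, gap, long_run and clear_edge)
```

Comparing the two models game by game, rather than comparing two separate averages, keeps the check focused on the games both were locked on.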
On the record so far
Each rival is scored against the live model on games locked before kickoff. Lower average miss is better, and the live model keeps the call until a challenger genuinely forecasts better over a long run.
- Goals model: ahead, and has earned a closer look
74 graded · average miss: live 0.907 vs this 0.891 (lower is better)
- Pi-ratings: behind so far
74 graded · average miss: live 0.907 vs this 0.915 (lower is better)
- Motivation
no graded games on the bench yet
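
The page does not say how a single game's miss is measured. Purely as an assumption, the sketch below uses the negative log of the probability a forecast gave to the result that actually happened, then averages it over graded games. The names `miss` and `average_miss` and the dictionary layout are made up for this example.

```python
from math import log


def miss(forecast, result):
    """Assumed scoring rule: negative log of the probability given to the actual result.

    forecast: dict with 'home', 'draw', 'away' probabilities locked before kickoff.
    result:   one of 'home', 'draw', 'away'.
    """
    p = forecast[result]
    return -log(max(p, 1e-12))  # floor avoids log(0) on a badly wrong forecast


def average_miss(graded):
    """graded: list of (forecast, result) pairs. Lower is better."""
    return sum(miss(f, r) for f, r in graded) / len(graded)


# Two invented games, scored for a live model and a rival on the same results.
results = ["home", "draw"]
live_calls = [{"home": 0.50, "draw": 0.30, "away": 0.20},
              {"home": 0.50, "draw": 0.25, "away": 0.25}]
rival_calls = [{"home": 0.55, "draw": 0.25, "away": 0.20},
               {"home": 0.45, "draw": 0.30, "away": 0.25}]

live_avg = average_miss(list(zip(live_calls, results)))
rival_avg = average_miss(list(zip(rival_calls, results)))
```

Whatever the real rule is, the comparison has the same shape: both models are scored on identical games and the lower average wins.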
On the bench
- Goals model: shadow · running
A rival that forecasts the score, not just the result. It turns the same team strength into each side's expected goals and builds the win, draw and loss from every plausible scoreline, so the draw emerges from the goals rather than a fixed dial. It runs on every locked call and is scored against the live model.
needs: the data we already hold
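
One common way to build win, draw and loss from scorelines is to treat each side's goals as an independent Poisson count and add up the scoreline grid. That distribution is an assumption here, since the description above does not name one, and `outcome_probs` is a hypothetical helper. How team strength becomes expected goals is left out, so the expected goals are simply passed in.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return exp(-lam) * lam ** k / factorial(k)


def outcome_probs(home_xg, away_xg, max_goals=10):
    """Sum every scoreline up to max_goals each into home win, draw, away win."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away  # renormalise the tiny mass cut off above max_goals
    return home / total, draw / total, away / total


home_p, draw_p, away_p = outcome_probs(1.6, 1.1)
```

The draw probability is never set directly. It is whatever the level scorelines add up to, which is the point of forecasting goals first.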
- Pi-ratings: shadow · running
A different way to rate a team. It keeps a separate home rating and away rating for each side, because a team can be a fortress at home and soft on the road, and it learns from the winning margin with diminishing returns, so a first goal teaches it more than a fifth. Now running in the background and scored against the live model on every graded game.
needs: results and fixtures, which we already hold
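
The two ideas above, separate home and away ratings and a margin that teaches less with every extra goal, can be sketched in the spirit of the published pi-rating scheme. The constants `C`, `LEARN` and `CROSS` and the exact update below are illustrative assumptions, not the values or code running on the bench.

```python
from dataclasses import dataclass
from math import copysign, log10

C = 3.0        # scale of the log transform (assumed)
LEARN = 0.035  # learning rate (assumed)
CROSS = 0.7    # share of an update carried over to the other venue's rating (assumed)


@dataclass
class Team:
    home: float = 0.0  # rating when playing at home
    away: float = 0.0  # rating when playing away


def expected_margin(rating):
    """Turn a rating into an expected goal margin against an average side."""
    return copysign(10 ** (abs(rating) / C) - 1, rating)


def damped(error):
    """Diminishing returns: each extra goal of surprise counts for less."""
    return C * log10(1 + abs(error))


def update(home_team, away_team, home_goals, away_goals):
    """Adjust both sides' ratings after one result."""
    predicted = expected_margin(home_team.home) - expected_margin(away_team.away)
    error = (home_goals - away_goals) - predicted
    step = copysign(damped(error), error) * LEARN
    home_team.home += step          # the venue actually played moves most
    home_team.away += step * CROSS  # the other venue follows partway
    away_team.away -= step
    away_team.home -= step * CROSS
```

Because `damped` grows with the logarithm of the surprise, moving from a 0-0 to a 1-0 shifts the ratings more than moving from 4-0 to 5-0.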
- Motivation: shadow · running
What a side actually has to play for. A team already through to the next round may rest its best players in a dead final group game, and a plain rating cannot see that. It rebuilds the group table from results before kickoff, marks a side safe or already out only when it is mathematically certain, and leans the call toward whoever has more to play for. Now running in the background, scored against the live model on group-stage games.
needs: the group table, reconstructed from results
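
A minimal sketch of the "mathematically certain" check, assuming a points-only table and two qualifiers per group: it plays out every win, draw and loss combination of the remaining fixtures and calls a side through or out only if that holds in all of them. Ties on points are treated as unresolved, so the check errs toward "live". The function name `status` and the data layout are hypothetical, and how far the call then leans is not modelled here.

```python
from itertools import product

# Points for (home side, away side) under each result of a fixture.
POINTS = {"H": (3, 0), "D": (1, 1), "A": (0, 3)}


def status(team, points, remaining, qualify=2):
    """Return 'through', 'out' or 'live' for team.

    points:    dict of team -> points from results before kickoff.
    remaining: list of (home, away) fixtures still to play, including today's.
    """
    always_through = True
    always_out = True
    for outcome in product("HDA", repeat=len(remaining)):
        table = dict(points)
        for (home, away), result in zip(remaining, outcome):
            home_pts, away_pts = POINTS[result]
            table[home] += home_pts
            table[away] += away_pts
        mine = table[team]
        level_or_above = sum(1 for t, p in table.items() if t != team and p >= mine)
        strictly_above = sum(1 for t, p in table.items() if t != team and p > mine)
        if level_or_above >= qualify:
            always_through = False  # some scenario leaves the place in doubt
        if strictly_above < qualify:
            always_out = False      # some scenario keeps the side alive
    if always_through:
        return "through"
    if always_out:
        return "out"
    return "live"


# Invented group before the final round: A has won twice, D has lost twice.
table = {"A": 6, "B": 3, "C": 3, "D": 0}
fixtures = [("A", "D"), ("B", "C")]
labels = {t: status(t, table, fixtures) for t in table}
```

In this invented group A is through and D is out whatever happens, so A against D is a dead game for both, while B against C decides the second place.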
- Availability: planned · needs data
Who is fit to start. A rating rates the badge, so it is slow to notice when a key player is missing. With a sourced, dated team-news feed we can lean a call when a genuinely important player is out, and stay silent when we do not know.
needs: a sourced injuries and team-news feed
- Player ratings: planned · needs data
A rating for every player, so the eleven who walk out set the team's number, not the crest. In football this is only honest with care: with so few substitutions, teammates are on the pitch together for most of a match and are hard to tell apart, so the player view is anchored to the team and only ever redistributes its strength. Valuation first, forecasting later.
needs: historical lineups and minutes
- Tactical reads: opinion, not the number
Shape and style: who presses whom, who exploits a high line. The evidence says formation matchups are mostly swamped by which side is simply better, so this stays in the written read, labelled as judgement, and never moves the locked probability.
needs: none, it stays opinion
The same rules apply here
Everything on this bench obeys the front-page rules: built from football alone, shown only as a plain frequency, locked before kickoff and judged forward-only over many games. None of it is fitted to results it has already seen. When one of these genuinely reads matches better than the live model, it graduates, and you will see why on the record.