How the Deadlock win-prediction models work, what they were trained on, and how accurate they actually are.
No black boxes. No "trust us". Real numbers from the current production models.
The split is chronological, not random: the model trains on older matches and is evaluated on the newest ones, so the reported accuracy reflects predicting matches from after the training window — the honest setting, not a leak-prone random shuffle. Models are retrained as the meta moves and hot-swapped into production from cloud storage with no redeploy.
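To make the split concrete, here is a minimal sketch of a chronological holdout. The `start_time` field name and the 10% holdout fraction are assumptions for illustration; the production pipeline's actual schema and cut-off are not stated here.

```python
def chronological_split(matches, holdout_fraction=0.1,
                        key=lambda m: m["start_time"]):
    """Split matches so the newest ones form the holdout set.

    `matches` is any sequence of records; `key` extracts a sortable
    timestamp (the "start_time" field is a hypothetical name).
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    ordered = sorted(matches, key=key)            # oldest first
    n_holdout = int(round(len(ordered) * holdout_fraction))
    cut = len(ordered) - n_holdout
    return ordered[:cut], ordered[cut:]           # (train, holdout)


if __name__ == "__main__":
    demo = [{"start_time": t} for t in (5, 1, 4, 2, 3, 9, 7, 8, 6, 10)]
    train, holdout = chronological_split(demo, holdout_fraction=0.2)
    print(len(train), "train matches,", len(holdout), "holdout matches")
```

Every holdout match is at least as recent as every training match, which is what keeps future information out of training.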
A Transformer encoder over the 12 hero slots of a 6v6 lineup. Each slot is an embedding of (hero_id + team_side); the encoder output is pooled into a single vector and passed through one sigmoid unit to give the win probability. Architecture: 256-d embeddings, 8 attention heads, 4 layers, 1024-d MLP.
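The input and output ends of that design can be sketched in plain Python. This is a toy illustration, not the production model: it assumes "(hero_id + team_side)" means a hero embedding summed with a team-side embedding, uses an 8-d embedding instead of 256-d, invents a hero vocabulary size, uses mean pooling as one possible choice, and leaves out the attention layers entirely.

```python
import math
import random

EMBED_DIM = 8      # toy size; the real model uses 256
NUM_HEROES = 64    # assumption: size of the hero vocabulary
TEAM_SIZE = 6      # 6v6, so 12 slots in total

rng = random.Random(0)


def random_table(rows, dim):
    """Small random embedding table, standing in for learned weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(rows)]


hero_table = random_table(NUM_HEROES, EMBED_DIM)
side_table = random_table(2, EMBED_DIM)
head_weights = [rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]


def embed_lineup(team_a, team_b):
    """Return 12 slot vectors, each hero embedding + team-side embedding."""
    if len(team_a) != TEAM_SIZE or len(team_b) != TEAM_SIZE:
        raise ValueError("each team needs exactly 6 heroes")
    slots = [(h, 0) for h in team_a] + [(h, 1) for h in team_b]
    return [[hv + sv for hv, sv in zip(hero_table[h], side_table[s])]
            for h, s in slots]


def win_probability(slot_vectors):
    """Mean-pool the slots and apply a single sigmoid unit.

    In the real model the 4 attention layers would transform the slot
    vectors before this step; this sketch skips them.
    """
    pooled = [sum(col) / len(slot_vectors) for col in zip(*slot_vectors)]
    logit = sum(w * x for w, x in zip(head_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))
```

Because the team side is part of each slot's embedding, the same hero is represented differently on each team, which is what lets one model score the matchup from one team's perspective.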
Both models are exported to ONNX and run on CPU. New checkpoints are published to cloud storage; the live site polls for them and hot-swaps them in, so hero data and predictions update without a redeploy.
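A poll-and-swap loop of this kind can be sketched as follows. The class and its two callables (`fetch_version`, `load`) are hypothetical names for illustration; in a real deployment `load` would download the checkpoint and build an ONNX inference session, which is omitted here so the sketch stays self-contained.

```python
import threading


class HotSwapModel:
    """Holds the active model and swaps in newer checkpoints when polled.

    fetch_version: returns an identifier for the newest checkpoint
                   (for example an object version or ETag).
    load:          turns that identifier into a callable model.
    Both are assumptions, not the site's actual interfaces.
    """

    def __init__(self, fetch_version, load):
        self._fetch_version = fetch_version
        self._load = load
        self._lock = threading.Lock()
        self._version = None
        self._model = None

    def refresh(self):
        """Poll once; return True if a new checkpoint was swapped in.

        Intended to be called from a single background poller.
        """
        latest = self._fetch_version()
        if latest == self._version:
            return False
        new_model = self._load(latest)   # load fully before swapping
        with self._lock:
            self._model, self._version = new_model, latest
        return True

    def predict(self, features):
        with self._lock:
            model = self._model          # grab a stable reference
        if model is None:
            raise RuntimeError("no model loaded yet")
        return model(features)
```

Loading the new checkpoint before taking the lock means requests keep being served by the old model until the new one is ready, so the swap itself is only a reference assignment.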
| Model | Accuracy | BCE (lower = better) | Holdout size | Input |
|---|---|---|---|---|
| Baseline | 56.94% | 0.678 | 206,993 | Draft only (12 heroes) |
| Live | 70.58% | 0.545 | 206,993 | Draft + match state |
How to read these: a classifier that always guesses 50/50 gets 50.00% accuracy and BCE 0.693. The Baseline sits ~7 pp above chance on the draft alone; the Live model reaches 70.6% by also reading the game in progress. Both are measured on the most-recent held-out matches, never seen during training.
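The chance-level figures are easy to check. The sketch below defines accuracy and binary cross-entropy (BCE) in the usual way and confirms that always predicting 0.5 gives a BCE of ln 2 ≈ 0.693; the helper names and the 0.5 decision threshold are illustrative choices, not taken from the production evaluation code.

```python
import math


def bce(probs, labels, eps=1e-12):
    """Mean binary cross-entropy; probs are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def accuracy(probs, labels, threshold=0.5):
    """Fraction of matches where the thresholded prediction is correct."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    coin_flip = [0.5] * len(labels)
    print(round(bce(coin_flip, labels), 3))   # ln 2, the chance-level BCE
```

BCE rewards calibrated confidence as well as correct calls, which is why it is reported alongside accuracy: two models with the same accuracy can differ in BCE if one is overconfident when it is wrong.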
The Live model's accuracy rises as a match develops and more state is known — measured on the held-out set, bucketed by game time:
Note the honest tail: very long games (the rare matches that run deep into the late game) regress back toward ~70%, because games that stay close that long genuinely are close. We don't hide that.
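Accuracy by game time can be computed with a simple bucketing pass over the held-out predictions. This is a sketch under assumptions: the row layout `(game_time_seconds, predicted_probability, label)` and the 5-minute bucket width are illustrative, not the production configuration.

```python
from collections import defaultdict


def accuracy_by_game_time(rows, bucket_minutes=5):
    """Group held-out predictions by game time and score each bucket.

    rows: iterable of (game_time_seconds, predicted_prob, label) tuples.
    Returns {bucket_start_minute: accuracy}, ordered by bucket.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    bucket_seconds = bucket_minutes * 60
    for game_time, prob, label in rows:
        bucket = int(game_time // bucket_seconds) * bucket_minutes
        totals[bucket] += 1
        hits[bucket] += int((prob >= 0.5) == bool(label))
    return {b: hits[b] / totals[b] for b in sorted(totals)}
```

Late-game buckets hold few matches, so it is worth reporting each bucket's sample count next to its accuracy when reading the tail.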
The counters, synergy, and tier list pages are direct aggregates over recent public matches, not model output. Each carries a generated_at timestamp and refreshes as new matches land.
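As an illustration of what "direct aggregate" means, the sketch below computes head-to-head win rates for a counters table by counting outcomes, with no model involved. The match schema (`team_a`, `team_b`, `a_won`), the `min_games` filter, and the output layout are assumptions; only the `generated_at` field name comes from the pages themselves.

```python
from collections import defaultdict
from datetime import datetime, timezone


def counter_table(matches, min_games=1):
    """Win rate of each hero against each enemy hero, by plain counting.

    matches: iterable of dicts with hypothetical keys
             "team_a", "team_b" (lists of hero ids) and "a_won" (bool).
    """
    games = defaultdict(int)
    wins = defaultdict(int)
    for m in matches:
        sides = ((m["team_a"], m["team_b"], m["a_won"]),
                 (m["team_b"], m["team_a"], not m["a_won"]))
        for own, enemy, won in sides:
            for hero in own:
                for foe in enemy:
                    games[(hero, foe)] += 1
                    wins[(hero, foe)] += int(won)
    rates = {pair: wins[pair] / n
             for pair, n in games.items() if n >= min_games}
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "win_rate_vs": rates,
    }
```

A `min_games` threshold above 1 is a common guard against rare pairings showing extreme win rates from a handful of matches.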