How the models work, what they were trained on, and how accurate they actually are.
No black boxes. No "trust us". Real numbers from the most recent production evaluation.
Data refreshes daily from the Steam Web API, OpenDota, and Stratz. Each new model checkpoint is evaluated against the held-out set and promoted to production only if it beats the current champion on binary cross-entropy (BCE) without regressing on accuracy. The promotion log is public via our internal eval history.
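The promotion rule can be sketched as a small comparison function. This is a minimal illustration, not the production gate: `EvalResult`, `should_promote`, and the `accuracy_tolerance` parameter are hypothetical names, and the candidate numbers are made up for the example (only the champion values come from the Standard row of the results table).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Held-out metrics for one checkpoint (hypothetical container)."""
    bce: float       # binary cross-entropy, lower is better
    accuracy: float  # fraction of matches called correctly, higher is better


def should_promote(candidate: EvalResult, champion: EvalResult,
                   accuracy_tolerance: float = 0.0) -> bool:
    """Promote only if BCE strictly improves and accuracy does not regress.

    accuracy_tolerance is an assumed knob: 0.0 means any drop in
    accuracy blocks the promotion.
    """
    beats_on_bce = candidate.bce < champion.bce
    no_accuracy_regression = (
        candidate.accuracy >= champion.accuracy - accuracy_tolerance
    )
    return beats_on_bce and no_accuracy_regression


# Champion values from the Standard row; candidates are illustrative only.
champion = EvalResult(bce=0.6850, accuracy=0.5519)
better = EvalResult(bce=0.6845, accuracy=0.5521)
worse_accuracy = EvalResult(bce=0.6845, accuracy=0.5500)

print(should_promote(better, champion))          # True
print(should_promote(worse_accuracy, champion))  # False
```

Making BCE the primary criterion and accuracy a guard rail means a checkpoint cannot be promoted just by flipping a few near-50% calls while its probabilities get worse.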
A small Transformer encoder over the 10 hero slots. Each slot is an embedding of (hero_id + team_side); the encoder produces a CLS token that goes through an MLP into a single sigmoid for Radiant win probability.
All three models are exported to ONNX and run on CPU. Inference for a full ban/pick (BP) lookahead tree (~100 candidate evaluations) takes under 200 ms.
| Model | Accuracy | BCE (lower = better) | Brier | ECE (calibration) | Holdout size |
|---|---|---|---|---|---|
| Standard | 55.19% | 0.685 | 0.246 | 0.0037 | 561,855 |
| Plus | 55.28% | 0.685 | 0.246 | 0.0134 | 3,345 pro |
| Pro | 54.52% | 0.685 | 0.246 | 0.0255 | 3,345 pro |
How to read these: A binary classifier that always predicts 50/50 gets 50.00% accuracy and a BCE of 0.693 (ln 2). Our models sit roughly 4.5 to 5.3 pp above chance on accuracy and at 0.685 on BCE, below the naive 0.693: modest but meaningful. The standout number is the ECE of 0.0037 on the Standard model: when the model says 60%, Radiant actually wins about 60% of the time (averaged over many such predictions). That's the property serious users care about.
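For readers who want to check the definitions, the four metrics in the table can be computed as in the sketch below. It uses only the Python standard library. The function names are ours, and the ECE binning (10 equal-width bins) is an assumption; the binning used in the production evaluation is not stated here. `probs` are predicted Radiant win probabilities and `labels` are 1 for a Radiant win, 0 otherwise.

```python
import math


def accuracy(probs, labels):
    # A call is correct when the side given at least 50% actually won.
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def bce(probs, labels, eps=1e-12):
    # Mean binary cross-entropy; probabilities are clipped to avoid log(0).
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def brier(probs, labels):
    # Mean squared error between predicted probability and outcome.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def ece(probs, labels, n_bins=10):
    # Expected calibration error with equal-width bins (assumed scheme):
    # weighted average gap between mean prediction and observed win rate.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    error = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        win_rate = sum(y for _, y in bucket) / len(bucket)
        error += len(bucket) / len(labels) * abs(mean_p - win_rate)
    return error


# The naive baseline from the text: always predict 50/50 on a balanced set.
labels = [1, 0, 1, 0]
coin_flip = [0.5] * len(labels)
print(round(bce(coin_flip, labels), 3))  # 0.693, i.e. ln 2
print(brier(coin_flip, labels))          # 0.25
print(accuracy(coin_flip, labels))       # 0.5
```

The baseline shows why the Brier column sits just under 0.25 and the BCE column just under 0.693: both are measured against what a coin flip would score.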
Want to see the model called against actual matches you can verify? The /live page always shows the model's predictions for the most recent finished pro matches, with the actual result alongside. Predictions are computed pre-game from the draft only; the model has not seen the result.
No cherry-picking. The list is the raw OpenDota proMatches feed in reverse-chronological order. If we're wrong, you'll see it.