6 benchmark games
Casual surfaces, hidden ML structure.
A condensed read on where ML-style validation and modeling outperform generic agent heuristics.
Modeling beat heuristics in every game.
Validation, planning, graph learning, causality, backtesting.
The decisive edge is knowing what to trust.
Benchmark summary
Final outcomes, condensed.
| Game | Metric | Generic | ML | Winner | ML advantage |
|---|---|---|---|---|---|
| Across all six games, the ML agent wins by identifying hidden structure before acting on it. | | | | | |
Detailed benchmark
Each entry shows the hidden problem, both approaches, and the replay.