What a Statistician Found When They Analyzed a World Champion's Games (in Blitz Chess)

The chess world was shaken in late 2023 when Hikaru Nakamura, one of the greatest blitz players in history, achieved a staggering 46-game winning streak on Chess.com. The streak prompted widespread discussion, with some prominent figures suggesting the performance was statistically suspicious. However, a comprehensive 2025 investigation by statistics professor Jeffrey Rosenthal [1] concluded that the accusations were mathematically unfounded.

But what does "statistically suspicious" performance actually look like? How do statisticians differentiate between a player having the tournament of their life and a player receiving illicit assistance? To answer these questions, we analyzed over 425,000 Lichess Blitz games using the Grandmaster Guide engine analytics, alongside a targeted sample of 377 games across various rating bands. By mapping these performances to Chess.com equivalents, we can build a roadmap of how move quality, blunder rates, and streak probabilities evolve from beginner to elite levels.

The Anatomy of Move Quality

The primary metric used by anti-cheat systems is Average Centipawn Loss (ACPL). A centipawn is one-hundredth of a pawn. If the engine's best move evaluates a position at +1.00 (White is ahead by exactly one pawn), and a player makes a move that drops the evaluation to +0.80, they have incurred a centipawn loss of 20.
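The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for this article: the function names are hypothetical, evaluations are assumed to be in centipawns from the mover's point of view, and negative losses are clamped to zero (an assumption; engines can occasionally score the played move above their earlier "best" move).

```python
def centipawn_loss(best_eval_cp: int, played_eval_cp: int) -> int:
    """Loss for one move, from the mover's point of view; never negative."""
    return max(0, best_eval_cp - played_eval_cp)


def average_centipawn_loss(moves: list[tuple[int, int]]) -> float:
    """Mean loss over (best_eval_cp, played_eval_cp) pairs for one player."""
    if not moves:
        return 0.0
    return sum(centipawn_loss(best, played) for best, played in moves) / len(moves)


# The example from the text: +1.00 drops to +0.80, a loss of 20 centipawns.
example_loss = centipawn_loss(100, 80)

# Three hypothetical moves: a 20 cp slip, a best move, and a 310 cp blunder.
sample_acpl = average_centipawn_loss([(100, 80), (80, 80), (80, -230)])

print(example_loss, sample_acpl)
```

Note how a single large error dominates the average in the three-move sample, which is why ACPL is only meaningful over a reasonable number of moves.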

As players climb the rating ladder, their ACPL steadily decreases. Our analysis of the Lichess database reveals a clear progression in move quality.

Centipawn Loss by Rating

For a beginner rated around 500–800 on Chess.com (roughly 700–900 on Lichess), the average centipawn loss hovers around 182. This means that, on average, every single move they make gives away nearly two pawns of value compared to perfect engine play. By the time a player reaches the 1600–1900 Chess.com range, their ACPL drops to 150. While this is a significant improvement, it remains far from the engine-caliber performance of super-grandmasters, who routinely maintain an ACPL of 25 or lower in classical time controls.

Actionable Advice for Climbing the Ranks

For the 500–800 Chess.com Player: Your primary focus should be on board vision and basic safety. An ACPL of 182 indicates that pieces are frequently being left undefended. Before making any move, perform a strict "blunder check" to ensure you are not hanging material.

For the 850–1100 Chess.com Player: As your ACPL drops toward 169, you are likely making fewer outright blunders but still struggling with tactical sequences. Focus on solving basic tactical puzzles (pins, forks, skewers) to sharpen your calculation skills.

The Reality of Blunders

A common misconception among amateur players is that higher-rated opponents simply stop blundering. The data tells a different story. When we break down move quality into blunders (a drop of 300+ centipawns), mistakes (100–299 centipawns), and inaccuracies (50–99 centipawns), we see that blunders remain a persistent feature of blitz chess across all amateur rating bands.
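The three thresholds above translate directly into a small classifier. The sketch below uses hypothetical function names and made-up centipawn losses; only the cut-offs (300+, 100–299, 50–99) come from the text.

```python
from collections import Counter


def classify_move(cp_loss: int) -> str:
    """Label a move by its centipawn loss, using the article's thresholds."""
    if cp_loss >= 300:
        return "blunder"
    if cp_loss >= 100:
        return "mistake"
    if cp_loss >= 50:
        return "inaccuracy"
    return "ok"


def move_quality_mix(cp_losses: list[int]) -> Counter:
    """Count how many moves fall into each quality bucket."""
    return Counter(classify_move(loss) for loss in cp_losses)


# Hypothetical losses chosen to sit on both sides of each threshold.
sample_losses = [0, 12, 55, 99, 100, 299, 300, 429]
mix = move_quality_mix(sample_losses)

print(dict(mix))
```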

Move-Quality Mix

Even at the top of the amateur range (1600–1900 Chess.com), players still average around 18 blunders per game per side. The improvement in rating comes not from eliminating blunders entirely, but from reducing their severity and capitalizing on the opponent's mistakes more efficiently.

Interestingly, when we look at the distribution of blunders within individual games, we find a surprising dichotomy.

Blunder Distribution

Across all rating bands, a blunder-free game (zero moves losing 300+ centipawns) occurs in roughly 32% to 36% of game-sides. Conversely, game-sides with five or more blunders account for roughly half of all performances. This variance highlights the chaotic nature of blitz chess, where a single mistake can cascade into a series of blunders as the position becomes increasingly complex or time pressure mounts.

Visualizing the Blunder

To understand what these blunders look like in practice, we extracted illustrative examples from our game sample.

Blunder Example 1: In this 500–800 Chess.com game, Black plays Qxd4, a 429-centipawn blunder that ignores the engine's recommendation to simply castle (O-O).

Blunder Example 2: In this 850–1100 Chess.com game, Black plays Qxd7, a 464-centipawn blunder. The engine prefers a different continuation that maintains the tension.

Actionable Advice for Climbing the Ranks

For the 1050–1300 Chess.com Player: You are entering a phase where your opponents will punish obvious blunders. Your goal is to reduce the frequency of 5+ blunder games. Practice time management to avoid the panic-induced cascades of errors that characterize the end of many blitz games.

For the 1300–1600 Chess.com Player: Your blunder rate is stabilizing, but mistakes (100–299cp drops) are becoming the differentiating factor. Focus on positional understanding and prophylactic thinking to limit your opponent's counterplay.

The Illusion of the "Perfect" Game

When a player suspects an opponent of cheating, they often point to a game where the opponent played with near-perfect accuracy. However, our analysis shows that ordinary players frequently produce games that mimic engine-level precision.

Perfect Game Rates

In our random sample of 377 evaluated games, we found that players across all rating bands occasionally produce "near-perfect" game-sides (ACPL ≤ 15) or "elite-grade" game-sides (ACPL ≤ 25). For instance, players in the 1600–1900 Chess.com range achieved an elite-grade performance in over 19% of their games.
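As a rough illustration of how such rates are tallied, the sketch below counts game-sides at or under each ACPL cut-off. The cut-offs (15 and 25) are the ones defined above; the function name and the sample ACPL values are hypothetical and are not drawn from the 377-game sample. Because every near-perfect game-side is also elite-grade, the two rates are nested rather than disjoint.

```python
NEAR_PERFECT_MAX_ACPL = 15.0  # "near-perfect" cut-off from the text
ELITE_GRADE_MAX_ACPL = 25.0   # "elite-grade" cut-off from the text


def tier_rates(acpls: list[float]) -> dict[str, float]:
    """Share of game-sides at or under each ACPL cut-off."""
    if not acpls:
        raise ValueError("need at least one game-side")
    total = len(acpls)
    near_perfect = sum(1 for a in acpls if a <= NEAR_PERFECT_MAX_ACPL)
    elite_grade = sum(1 for a in acpls if a <= ELITE_GRADE_MAX_ACPL)
    return {
        "near_perfect": near_perfect / total,
        "elite_grade": elite_grade / total,
    }


# Hypothetical per-game-side ACPL values for one rating band.
sample_acpls = [8.0, 14.9, 22.0, 25.0, 40.0, 95.0, 150.0, 210.0]
rates = tier_rates(sample_acpls)

print(rates)
```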

This occurs because chess is a game of forcing variations. If an opponent blunders early and the winning path is straightforward, a 1200-rated player can easily find the same moves as Stockfish. Therefore, a single perfect game is never sufficient evidence of cheating. Anti-cheat systems require a sustained pattern of anomalous performance over multiple games to trigger a flag.

The Mathematics of Winning Streaks

The most contentious aspect of the Nakamura investigation was his 46-game winning streak. To the layperson, winning 46 consecutive games seems mathematically impossible. However, streak probabilities are heavily dependent on the rating gap between the player and their opponents.

Streak Probabilities

When two equally matched players face off, the probability of a long winning streak is astronomically low. For a 1500-rated player facing a pool of 1500-rated opponents, the odds of winning just 10 games in a row are roughly 1 in 3,666. That figure implies a per-game win probability of about 44% rather than 50%, presumably because some games are drawn. Winning 20 in a row is a 1-in-13-million event.
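These figures follow from compounding a fixed per-game win probability, and they can be checked against each other. The sketch below assumes games are independent with a constant win probability (a simplification) and uses hypothetical function names; it backs out the per-game probability implied by the 1-in-3,666 figure and then extends it to 20 games.

```python
def streak_probability(win_prob: float, length: int) -> float:
    """Chance of winning `length` games in a row at a fixed win probability."""
    return win_prob ** length


def implied_win_probability(one_in: float, length: int) -> float:
    """Per-game win probability implied by '1 in `one_in`' odds for a streak."""
    return (1.0 / one_in) ** (1.0 / length)


# Back out the per-game win probability from the 10-game figure in the text.
p_equal = implied_win_probability(3666, 10)  # about 0.44

# Extend the same probability to a 20-game streak.
odds_20 = 1.0 / streak_probability(p_equal, 20)  # about 13.4 million

print(f"per-game win probability: {p_equal:.3f}")
print(f"20-game streak: 1 in {odds_20:,.0f}")
```

Doubling the streak length squares the odds, which is why 10 games at 1 in 3,666 becomes 20 games at roughly 1 in 13 million.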

However, the math changes dramatically when a super-grandmaster enters the arena.

Hikaru Streak Context

During his 46-game streak, Nakamura (whose blitz rating hovers around 3300 on Chess.com, or roughly 3017 in our Lichess-calibrated model) was facing opponents with an average rating of approximately 2400. Against a 2400-rated opponent, Nakamura's per-game win probability is an overwhelming 96.96%.
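For orientation, the standard logistic Elo formula gives a figure in the same neighborhood. This is a sketch under an assumption: the article's 96.96% comes from its own calibrated model, and the textbook formula below is not claimed to be that model. It also returns an expected score (draws count as half a point), so it is an upper bound on pure win probability.

```python
def elo_expected_score(rating: float, opponent_rating: float) -> float:
    """Standard logistic Elo expected score (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


# Ratings quoted in the text: 3017 (model rating) against a 2400 average.
score_vs_2400 = elo_expected_score(3017, 2400)

# Equal ratings give an expected score of exactly one half.
score_equal = elo_expected_score(1500, 1500)

print(f"{score_vs_2400:.4f}", score_equal)
```

With a gap of about 600 points the expected score is roughly 97%, so a per-game figure near 96.96% is the right order of magnitude.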

When you compound a 96.96% win probability over 46 games, the odds of completing the streak in a single attempt are roughly 24%. Given that Nakamura plays thousands of blitz games every year, a 46-game streak against this level of opposition is not just possible; it is statistically expected to happen every few months.
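The compounding step, and the effect of playing many games, can be sketched as follows. The 96.96% figure is taken from the text; everything else is an assumption made for illustration: games are treated as independent, every opponent is assumed to be at the same 2400 level, and the function names and the 1,000-game horizon are hypothetical. The second function uses a small dynamic program over the current run length to get the chance of at least one full streak somewhere in a longer sequence of games.

```python
def single_attempt_probability(win_prob: float, streak_len: int) -> float:
    """Chance that one fixed block of `streak_len` games is won outright."""
    return win_prob ** streak_len


def prob_streak_within(win_prob: float, streak_len: int, n_games: int) -> float:
    """Chance of at least one run of `streak_len` wins within `n_games` games."""
    # state[k]: probability the current run of wins is k, with no full streak yet.
    state = [0.0] * streak_len
    state[0] = 1.0
    done = 0.0  # probability mass that has already completed a streak
    for _ in range(n_games):
        nxt = [0.0] * streak_len
        for k, mass in enumerate(state):
            if mass == 0.0:
                continue
            nxt[0] += mass * (1.0 - win_prob)  # any non-win resets the run
            if k + 1 == streak_len:
                done += mass * win_prob        # this win completes the streak
            else:
                nxt[k + 1] += mass * win_prob  # this win extends the run
        state = nxt
    return done


p_win = 0.9696  # per-game win probability quoted in the text
single = single_attempt_probability(p_win, 46)  # about 0.24
within_1000 = prob_streak_within(p_win, 46, 1000)

print(f"single attempt: {single:.3f}")
print(f"at least once in 1,000 games: {within_1000:.3f}")
```

The constant-opposition assumption is the weak point of this sketch: in practice the opponent pool varies, so the long-run figure should be read as an illustration of how quickly repeated attempts add up, not as an estimate for any real player.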

Actionable Advice for Climbing the Ranks

For the 1600–1900 Chess.com Player: Do not be discouraged by losing streaks, and do not let winning streaks inflate your ego. Streaks are a natural mathematical consequence of playing many games. Focus on your average performance and long-term improvement rather than short-term variance.

Reading the Z-Score

When statisticians like Ken Regan [2] analyze chess performances for cheating, they do not look at single games or raw winning streaks. Instead, they calculate a Z-score, which expresses how far a player's performance deviates from their expected level in standard deviations (sigmas).

Sigma Thresholds

A Z-score of 2.0 (roughly 1-in-44 odds) is considered within normal human variation. A Z-score of 3.0 (1-in-740) might be suggestive but requires follow-up. It is only when a Z-score reaches 4.0 (1-in-32,000) or 4.75 (1-in-1,000,000) that statisticians consider the evidence strong enough to warrant sanctions.
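The odds attached to each threshold can be approximately reproduced from the upper tail of a standard normal distribution. The sketch below assumes a one-sided tail (only unusually good performance counts as suspicious) and uses hypothetical function names; it relies only on `math.erfc` from the standard library.

```python
import math


def z_score(observed: float, expected: float, std_dev: float) -> float:
    """Deviation of an observed value from its expectation, in sigmas."""
    return (observed - expected) / std_dev


def one_sided_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def one_in_odds(z: float) -> float:
    """Express the one-sided tail as '1 in N'."""
    return 1.0 / one_sided_tail(z)


for threshold in (2.0, 3.0, 4.0, 4.75):
    print(f"z = {threshold}: about 1 in {one_in_odds(threshold):,.0f}")
```

Each additional sigma shrinks the tail far faster than linearly, which is why the gap between "suggestive" at 3.0 and "sanctionable" at 4.0 or higher is so large.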

In the case of Nakamura's streak, the Z-score did not approach these thresholds because his performance was entirely consistent with his established rating and the strength of his opponents.

Conclusion

Statistically suspicious performance in chess is not defined by a single perfect game or a long winning streak against weaker opposition. It is defined by a sustained, mathematically improbable deviation from a player's established baseline, measured across dozens of games and rigorously tested against the null hypothesis of fair play.

For the improving amateur, the data offers a clear roadmap: accept that blunders are part of the game, focus on incremental improvements in move quality, and understand that variance—both good and bad—is an inherent feature of blitz chess.

Chess Coach
April 17, 2026


Data and Methodology

This analysis was conducted using data from the Lichess API, processed via the Grandmaster Guide MCP analytics endpoints. The dataset includes over 425,000 evaluated Lichess Blitz games, supplemented by a targeted sample of 377 games for specific blunder and perfect-game rate extraction.

Rating bands were converted from Lichess to Chess.com equivalents using an interpolated mapping model to ensure relevance for the target audience. Streak probabilities were calculated using the expected score model detailed in Rosenthal's 2025 paper.

References

[1] Rosenthal, J. (2025). Statistical Analysis of Winning Streaks in Online Chess. Harvard Data Science Review.
[2] Regan, K. (2024). Cheating Detection and Cognitive Modeling At Chess. University at Buffalo.

Frequently Asked Questions

What did the statistician conclude about Hikaru Nakamura's blitz streak?

Jeffrey Rosenthal's 2025 investigation [1] concluded that the streak was not statistically suspicious and that the accusations were mathematically unfounded.

What is Average Centipawn Loss in chess analysis?

Average Centipawn Loss (ACPL) measures how far a player's move deviates from the engine's best move, in hundredths of a pawn. Lower ACPL generally means more accurate play.

How do statisticians detect suspicious chess performance?

They compare move quality, blunder rates, and streak probabilities against expected patterns for a player's rating level. Unusual consistency alone is not enough to prove cheating.

How many games were analyzed in the study?

The analysis used over 425,000 Lichess Blitz games, plus a targeted sample of 377 games across different rating bands.

Why do rating bands matter in chess performance analysis?

Rating bands help show how move quality and blunder rates change from beginner to elite levels. This makes it easier to compare a player's performance against realistic expectations.

What is the difference between a great streak and statistically suspicious play?

A great streak can happen when a strong player performs at a very high level for a short period. Statistically suspicious play usually requires patterns that are unlikely even after accounting for rating, game volume, and normal variance.

Can engine analysis alone prove cheating in blitz chess?

No. Engine-like accuracy can raise questions, but it must be interpreted alongside statistical context, rating level, and game patterns before drawing conclusions.