Research Report: Average Engine Evaluation Drop When Voluntarily Accepting Pawn Islands

2026-04-21 · Chess Research

Status: INCOMPLETE (Grandmaster-Guide MCP Server Outage)

This research article was intended to analyze the average engine evaluation drop when chess players voluntarily accept pawn islands in Rapid games, across Chess.com rating bands 800-1500.

Per the project instructions: "at any time if any of the tools in the grandmaster-guide mcp goes down, error out don't attempt to finish analysis without it."

During the data collection and augmentation phase, the grandmaster-guide MCP server began returning HTTP 502 (Bad Gateway) errors for both the `/api/lichess-pgn/augment` and `/api/python-analyzer-status` endpoints. The outage persisted for over 25 minutes, blocking the retrieval of Stockfish 12 evaluations required to compute the evaluation drops.
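The "error out" rule can be illustrated with a small fail-fast guard. This is only a sketch of the policy, not the project's actual code: the names `McpOutageError` and `require_healthy` are hypothetical, and the status codes are passed in directly rather than fetched over the network.

```python
class McpOutageError(RuntimeError):
    """Raised when a required grandmaster-guide endpoint is unavailable."""


def require_healthy(endpoint: str, status_code: int) -> None:
    """Abort the pipeline if the endpoint answered with a server error (5xx)."""
    if 500 <= status_code <= 599:
        raise McpOutageError(
            f"{endpoint} returned HTTP {status_code}; halting analysis"
        )


# Statuses as described in the outage: both endpoints answered 502.
observed = {
    "/api/lichess-pgn/augment": 502,
    "/api/python-analyzer-status": 502,
}

try:
    for endpoint, status in observed.items():
        require_healthy(endpoint, status)
except McpOutageError as exc:
    # Stop here instead of continuing with partial engine data.
    halt_reason = str(exc)
    print(halt_reason)
```

Raising on the first failing endpoint keeps the pipeline from producing averages based on an incomplete set of evaluations.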

Therefore, the analysis has been halted. Below is a summary of the data collected prior to the outage.

Data Collection Progress

Raw Game Data: Successfully downloaded 6,000 Lichess Rapid games (1,500 games per 200-point Chess.com rating band). The Lichess ratings were calibrated to match Chess.com ratings (e.g., Chess.com 800-1000 ≈ Lichess 1400-1615).
Event Detection: Successfully parsed all 6,000 games to identify 10,381 candidate "pawn island events." These are plies where a player voluntarily made a pawn move or capture that increased their number of pawn islands, doubled pawns, or isolated pawns.
Engine Evaluation Augmentation: Augmentation of the games containing these events with Stockfish 12 evaluations ([%eval]) had begun. Approximately 316 games from the 800-1000 band were augmented before the MCP server went down.
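To make the event definition concrete, the sketch below counts pawn islands, doubled files, and isolated files for one side and flags a move when any of the three counts rises. It is an illustration under stated assumptions: the study replayed games with python-chess, whereas this version reads the board field of a FEN string directly so that it has no dependencies. The names `PawnStructure`, `pawn_structure`, and `is_structural_concession` are hypothetical, the example position is constructed for illustration, and the sketch does not attempt to decide whether a move was "voluntary".

```python
from typing import NamedTuple


class PawnStructure(NamedTuple):
    islands: int
    doubled_files: int
    isolated_files: int


def pawn_structure(fen: str, white: bool) -> PawnStructure:
    """Summarise one side's pawn structure from the board field of a FEN."""
    pawn = "P" if white else "p"
    counts = [0] * 8  # pawns per file, a..h
    for rank in fen.split()[0].split("/"):
        file_idx = 0
        for ch in rank:
            if ch.isdigit():
                file_idx += int(ch)  # run of empty squares
            else:
                if ch == pawn:
                    counts[file_idx] += 1
                file_idx += 1
    occupied = [c > 0 for c in counts]
    # An island starts on every occupied file whose left neighbour is empty.
    islands = sum(
        1 for f in range(8) if occupied[f] and (f == 0 or not occupied[f - 1])
    )
    doubled = sum(1 for c in counts if c >= 2)
    # A file is isolated when it has pawns and both neighbouring files have none.
    isolated = sum(
        1
        for f in range(8)
        if occupied[f]
        and (f == 0 or not occupied[f - 1])
        and (f == 7 or not occupied[f + 1])
    )
    return PawnStructure(islands, doubled, isolated)


def is_structural_concession(before: PawnStructure, after: PawnStructure) -> bool:
    """True if any of the three counts went up for the mover."""
    return (
        after.islands > before.islands
        or after.doubled_files > before.doubled_files
        or after.isolated_files > before.isolated_files
    )


# Illustrative position: 1.Nf3 d5 2.e3 Bg4 3.Be2 Bxf3, then White plays 4.gxf3.
before = pawn_structure(
    "rn1qkbnr/ppp1pppp/8/3p4/8/4Pb2/PPPPBPPP/RNBQK2R w KQkq - 0 4", white=True
)
after = pawn_structure(
    "rn1qkbnr/ppp1pppp/8/3p4/8/4PP2/PPPPBP1P/RNBQK2R b KQkq - 0 4", white=True
)
print(before, after, is_structural_concession(before, after))
```

In the example, recapturing with the g-pawn splits White's pawns into two islands, doubles the f-file, and isolates the h-pawn, so the move is flagged as a candidate event.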

Data and Methodology

The methodology involved:

Fetching games from the api_lichess_games_by_rating endpoint.
Replaying the games using python-chess to track pawn structures (islands, doubled files, isolated files) and identifying voluntary structural concessions.
Augmenting the PGNs via api_lichess_pgn_augment to get engine evaluations before and after the structural change.
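Once a game carries [%eval] comments, the evaluation drop for a flagged ply is the difference between the evaluation before and after the move, seen from the mover's side. The sketch below assumes the augmented PGN follows the Lichess convention: one `[%eval x]` comment per ply, in pawns, from White's perspective, with `#n` marking a forced mate. The report does not confirm the exact output format of api_lichess_pgn_augment, so treat that as an assumption. The helper names and the mate clamp value are illustrative choices, and the evaluations in the example are made up.

```python
import re
from typing import List, Optional

EVAL_RE = re.compile(r"\[%eval\s+(#?-?\d+(?:\.\d+)?)\]")
MATE_PAWNS = 100.0  # assumption: mate scores are clamped to +/-100 pawns


def parse_evals(pgn_movetext: str) -> List[float]:
    """Return White-perspective evaluations in pawns, one per annotated ply."""
    evals = []
    for token in EVAL_RE.findall(pgn_movetext):
        if token.startswith("#"):
            # "#-3" means Black mates, "#3" means White mates.
            evals.append(-MATE_PAWNS if token[1:].startswith("-") else MATE_PAWNS)
        else:
            evals.append(float(token))
    return evals


def eval_drop(evals: List[float], ply: int) -> Optional[float]:
    """Evaluation lost by the mover at 1-based `ply`, in pawns (positive = worse).

    evals[i] is the evaluation after ply i+1, from White's point of view.
    Returns None when there is no evaluation on both sides of the move.
    """
    if ply < 2 or ply > len(evals):
        return None
    before, after = evals[ply - 2], evals[ply - 1]
    white_moved = ply % 2 == 1
    return (before - after) if white_moved else (after - before)


# Made-up evaluations attached to an illustrative move sequence.
movetext = (
    "1. Nf3 { [%eval 0.25] } d5 { [%eval 0.25] } "
    "2. e3 { [%eval 0.25] } Bg4 { [%eval 0.5] } "
    "3. Be2 { [%eval 0.25] } Bxf3 { [%eval 0.75] } "
    "4. gxf3 { [%eval 0.25] }"
)
evals = parse_evals(movetext)
print(eval_drop(evals, 7))  # drop for White's 4.gxf3 (ply 7)
```

Averaging `eval_drop` over all flagged plies in a rating band would give the per-band figure the study set out to report.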

The partial datasets (raw games, detected events, and the small batch of augmented games) are attached for reference.

Frequently Asked Questions

What is this chess research report about?

It was designed to measure the average engine evaluation drop when players voluntarily accept pawn islands in rapid games.

Why is the analysis marked incomplete?

The analysis was halted because the grandmaster-guide MCP server returned HTTP 502 errors during data collection and engine evaluation retrieval.

How many games were collected for the study?

The project successfully downloaded 6,000 Lichess rapid games, with 1,500 games in each 200-point Chess.com rating band.

How many pawn island events were detected?

The parser identified 10,381 candidate pawn island events across the 6,000 games.

What rating range was analyzed in the report?

The study focused on Chess.com rating bands from 800 to 1500 in rapid games.

What caused the engine evaluation step to fail?

The required Stockfish 12 evaluations could not be retrieved because the `/api/lichess-pgn/augment` and `/api/python-analyzer-status` endpoints were returning 502 Bad Gateway errors.

Were the Lichess ratings adjusted to Chess.com ratings?

Yes. Lichess ratings were calibrated to match Chess.com ratings; for example, Chess.com 800-1000 corresponds to roughly Lichess 1400-1615.