With internet usage rising during the COVID-19 pandemic, it is no surprise that online chess also grew in popularity. In this study, we investigated low to moderately rated online chess players and the games they played. We used data from Chess.com covering individual players, clubs, tournaments, teams, countries, daily puzzles, streamers, and leaderboards, and completed our analysis in JMP and Python. We took a random sample of low to moderately rated players from October 4, 2020 to March 4, 2021, and recorded the portable game notation (PGN) and the specific moves made by each player. When beginner-level players use certain moves consistently, they are more likely to win consistently, and so to progress from beginner to moderate ratings. The impact of individual moves on success was unclear; however, when move combinations were examined, success could be predicted much more accurately. Our analysis identified a series of moves that most moderately rated players employ leading up to a game-losing move. Because the data was collected during the COVID-19 pandemic, it may be skewed, and external factors such as the pandemic may have made our results and findings less accurate. This research aims to help chess trainers and coaches formulate better strategies and training exercises so that beginner to moderately rated players can improve their skills.

Introduction: Chess is one of the oldest and most widespread sports in the world. With new technology and increasing internet accessibility, people can now play chess from virtually anywhere. As the popularity of and access to chess continue to grow, it is important for players to understand the best way to improve their game. In this study, we investigate chess players with a low or moderate rating. We examine their games in depth to better understand the blunders and mistakes that determine game results and, in turn, change player rankings. To grasp the reach of our study, we first set out to understand external factors that have affected the chess community, such as advancing technology and the COVID-19 pandemic. This research will help players and readers better understand which openings and tactics are most beneficial to low and moderately ranked players in online chess. Here, low rated players are defined as players rated between 800 and 1000, while moderately rated players are defined as players rated between 1000 and 1300. Our research is limited to six countries: Canada, Australia, United Kingdom, United States, India, and Bangladesh. The overall purpose of this study is to pinpoint consistent blunder and mistake patterns in moderately ranked players and use them to devise strategies that increase competitiveness and wins. This study points toward a direction in which an optimal winning strategy can be determined, ultimately helping online chess players improve their player ranking.

Data Overview: The data for this research was provided by Chess.com, one of the top online chess communities, which offers players online chess games for free. We accessed the website's API, where we gathered data on individual players, including their profile, titled players, stats, and online gamer status. We also had access to specific games, including current daily chess, concise to-move daily chess, available archives, monthly archives, and multi-game PGN download. To complete a more accurate and in-depth analysis, we also downloaded country data, including the country profile, the list of players in each country, and the list of clubs within the country.

Access to data: https://www.chess.com/news/view/published-data-api / https://lichess.org/api
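As a rough illustration of how the monthly archives and country player lists could be addressed, the sketch below only builds request URLs for the study window and fetches nothing. The URL patterns follow the Chess.com Published-Data API as we understand it and should be checked against the documentation linked above; the username `example_user` and both helper names are hypothetical.

```python
from datetime import date

# Assumed base path of the Chess.com Published-Data API.
BASE_URL = "https://api.chess.com/pub"


def monthly_archive_urls(username, start, end):
    """Return one monthly-archive URL per calendar month from start to end, inclusive."""
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        urls.append(f"{BASE_URL}/player/{username.lower()}/games/{year}/{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return urls


def country_players_url(iso_code):
    """Return the URL listing players for a two-letter ISO country code."""
    return f"{BASE_URL}/country/{iso_code.upper()}/players"


# Study window from the paper: October 4, 2020 to March 4, 2021.
urls = monthly_archive_urls("example_user", date(2020, 10, 4), date(2021, 3, 4))
for url in urls:
    print(url)
print(country_players_url("bd"))
```

Building the URLs separately from the download step keeps the date-window logic easy to check before any requests are made.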

Mined data

| Name of the Variable | Description |
| --- | --- |
| Username | Username of both players |
| Elo | Elo rating of both players |
| Result | Result of the match |
| ECO Code | Unique code indicating the opening employed in the game |
| PGN (Portable Game Notation) | The entire series of moves in the game in a text format |

Generated Data

| Name of the Variable | Description |
| --- | --- |
| Blunder PGN | PGN of moves leading up to a blunder |
| Mistake PGN | PGN of moves leading up to a mistake |



Method: We began by cleaning and mining the accessed data. The data was mostly clean when received; however, minor edits were needed before the analysis could continue. After the data was cleaned, reviewed, and processed, we took a random sample of low to moderately rated players in the United States, United Kingdom, Canada, Australia, India, and Bangladesh from October 4th, 2020 to March 4th, 2021. For each randomly selected player, we investigated five rapid games that the player participated in. After our initial assessment of each rapid game, we used Python to write code that merged, joined, and compared the datasets compiled for each country and its selected players.
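The per-player sampling step might look roughly like the following. This is a minimal sketch, not the code used in the study: the field names `time_class`, `white`, `black`, and `rating` are assumed to match the game records returned by the API, requiring both players to be inside the rating range is our assumption, and the fixed seed is only there to make the example repeatable.

```python
import random


def in_rating_range(game, low, high):
    """True when both players' ratings fall within [low, high] (an assumed rule)."""
    return all(low <= game[side]["rating"] <= high for side in ("white", "black"))


def sample_rapid_games(games_by_player, low=1000, high=1400, per_player=5, seed=0):
    """Pick up to per_player random rapid games for each player."""
    rng = random.Random(seed)
    sample = {}
    for player, games in games_by_player.items():
        eligible = [
            g for g in games
            if g.get("time_class") == "rapid" and in_rating_range(g, low, high)
        ]
        # A player with fewer eligible games simply contributes all of them.
        sample[player] = rng.sample(eligible, min(per_player, len(eligible)))
    return sample


# Made-up records purely for illustration.
demo_games = [
    {"time_class": "rapid", "white": {"rating": 1100}, "black": {"rating": 1250}}
    for _ in range(8)
] + [
    {"time_class": "blitz", "white": {"rating": 1100}, "black": {"rating": 1250}},
    {"time_class": "rapid", "white": {"rating": 1800}, "black": {"rating": 1250}},
]
picked = sample_rapid_games({"player_a": demo_games, "player_b": demo_games[:3]})
```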

A random sample of 1000 games was selected from the pool of users in our target rating range of 1000-1400. These games were then analyzed using Stockfish at a depth of 20. Using the Stockfish evaluation of the position at each move, we obtain a score that indicates quantitatively which player has the better position. The score is measured in centipawns: a score of +100 centipawns signifies that White is ahead of Black by the equivalent of one pawn. After each move, a new score was calculated along with the change in score from the previous move. We define two classes of moves, a blunder and a mistake. A blunder is a move that costs the player at least 500 centipawns, while a mistake has a threshold of 300 centipawns. Blunders create a worse position for the blundering player and so raise that player's chances of losing.
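The thresholds above can be turned into a small classifier. The sketch below assumes the engine scores have already been collected as a list of centipawn values from White's point of view, one per half-move, and it assumes a starting evaluation of 0; the function name and sample numbers are illustrative only. In practice the scores would come from Stockfish, for example through the python-chess library listed in the references.

```python
BLUNDER_CP = 500  # loss threshold for a blunder, from the text
MISTAKE_CP = 300  # loss threshold for a mistake, from the text


def classify_moves(evals_cp, start_cp=0):
    """Label each half-move as 'blunder', 'mistake', or 'ok'.

    evals_cp[i] is the evaluation in centipawns, from White's point of view,
    after half-move i (half-move 0 is White's first move).
    """
    labels = []
    previous = start_cp
    for ply, current in enumerate(evals_cp):
        change = current - previous
        # A falling score hurts White; a rising score hurts Black.
        loss = -change if ply % 2 == 0 else change
        if loss >= BLUNDER_CP:
            labels.append("blunder")
        elif loss >= MISTAKE_CP:
            labels.append("mistake")
        else:
            labels.append("ok")
        previous = current
    return labels


# Made-up evaluations: White drops 320 on move 2, Black drops 540 on move 3.
labels = classify_moves([30, 20, -300, -280, -290, 250])
print(labels)  # ['ok', 'ok', 'mistake', 'ok', 'ok', 'blunder']
```

Measuring the loss from the mover's side matters because the raw score is always reported from White's perspective, so the same numerical drop is bad for White but good for Black.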

By identifying blunders and mistakes, we generate the variable Blunder_pgn, a PGN string containing the series of moves leading up to the blunder. The Mistake PGN variable is generated in the same way for mistakes.
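One way to build such a string is sketched below. It assumes the moves are available in standard algebraic notation and that each half-move already carries a label such as `"blunder"` or `"mistake"`; whether the flagged move itself belongs in the string is not stated in the paper, so including it here is an assumption. The helper names and labels are hypothetical.

```python
def moves_to_movetext(san_moves):
    """Join SAN half-moves into numbered PGN movetext, e.g. '1. e4 d5 2. exd5'."""
    parts = []
    for ply, san in enumerate(san_moves):
        if ply % 2 == 0:
            parts.append(f"{ply // 2 + 1}.")
        parts.append(san)
    return " ".join(parts)


def moves_up_to(san_moves, labels, target="blunder"):
    """Return movetext up to and including the first move labelled target, else None."""
    for ply, label in enumerate(labels):
        if label == target:
            return moves_to_movetext(san_moves[: ply + 1])
    return None


# Made-up labels for illustration; they are not engine output.
moves = ["e4", "d5", "exd5", "Qxd5", "Nc3", "Qe5+"]
labels = ["ok", "ok", "mistake", "ok", "ok", "blunder"]
blunder_pgn = moves_up_to(moves, labels, "blunder")
mistake_pgn = moves_up_to(moves, labels, "mistake")
```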

Results:

Using Blunder PGN and Mistake PGN, we were able to identify series of moves that most moderately rated players employ leading up to a game-losing move. We identified 3 Blunder PGNs and 4 Mistake PGNs that players at our target rating level struggle with the most among all combinations.
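Finding the most frequent sequences amounts to a frequency count over the generated strings. The sketch below assumes one Blunder PGN string (or None) per game and uses made-up data; it illustrates the counting step only and does not reproduce the JMP analysis.

```python
from collections import Counter


def top_sequences(pgn_strings, n):
    """Return the n most frequent non-empty move sequences with their counts."""
    counts = Counter(p for p in pgn_strings if p)
    return counts.most_common(n)


# Made-up Blunder PGN values, one per game; None means no blunder was found.
blunder_pgns = [
    "1. e4 d5 2. exd5 Qxd5 3. Nc3 Qe5+",
    "1. e4 e5 2. d4 exd4 3. Qxd4",
    "1. e4 d5 2. exd5 Qxd5 3. Nc3 Qe5+",
    None,
    "1. e4 d5 2. exd5 Qxd5 3. Nc3 Qe5+",
    "1. e4 e5 2. d4 exd4 3. Qxd4",
    "1. d4 d5 2. e4 dxe4",
]
top = top_sequences(blunder_pgns, 2)
```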

Pic 1: Scandinavian defense and its success rate

Pic 2: Blackmar Gambit and its success rate

Pic 3: Center Game and its success rate

Mistake prone openings

Implications:

Black players should refrain from the Blackmar Gambit and the Scandinavian Defense.

White players generally have an advantage but tend to struggle with Center Game openings.
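A success rate per opening, like those shown in the pictures above, can be computed as the share of games in each opening won by a given colour. The sketch below assumes each game record has an `opening` name and a PGN-style `result` of `"1-0"`, `"0-1"`, or `"1/2-1/2"`; the records are made up and the figures they produce are not the study's results.

```python
from collections import defaultdict


def black_win_rate_by_opening(games):
    """Return {opening: fraction of games won by Black}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for game in games:
        totals[game["opening"]] += 1
        if game["result"] == "0-1":  # PGN result code for a Black win
            wins[game["opening"]] += 1
    return {opening: wins[opening] / totals[opening] for opening in totals}


# Made-up games for illustration only.
demo = [
    {"opening": "Scandinavian Defense", "result": "1-0"},
    {"opening": "Scandinavian Defense", "result": "0-1"},
    {"opening": "Scandinavian Defense", "result": "1-0"},
    {"opening": "Scandinavian Defense", "result": "1/2-1/2"},
    {"opening": "Center Game", "result": "0-1"},
    {"opening": "Center Game", "result": "0-1"},
]
rates = black_win_rate_by_opening(demo)
```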

While different openings pose different problems, blundering players show a general trend of weak opening principles, specifically:

  1. Pawn sacrifices without compensation
  2. Queen safety
  3. Development of pieces

Conclusion:

Moderately rated players play most accurately when they employ standard openings such as the London System and the Giuoco Piano, and should therefore be trained on these fundamentals before moving on to complicated openings.

References:

https://www.chess.com/analysis

https://python-chess.readthedocs.io/en/latest/pgn.html

https://stockfishchess.org/

All right, good afternoon, and today I'm going to be talking

about the predictive analysis of online chess outcomes and success.

My name is Allison Clift and I had the opportunity

to work on this project with another student

in my business analytics program, Calbe Abbas Agaria,

however, he is not with us here today.

To begin, we analyzed low and moderately-rated online chess players.

Since the COVID-19 pandemic, there was an increase in Internet usage

as well as with the advancement of technology,

people have switched over to playing online chess

as it is more readily available to users.

We wanted to look at the effectiveness of different game strategies,

specific moves, and individual techniques, and their impact on potential wins

or potential losses in the game of chess.

Player data was pulled from chess.com, which is where we were able to view

profile of the player, titled players, their statistics,

and the online gamer status.

We utilized JMP and Python to be able to complete the study.

We noted the Portable Game Notation, also known as the PGN.

This was used to determine the openings, blunders, and mistakes

that were occurring during the competition.

We learnt that looking at individual moves on their own was not as predictive

as looking at move combinations as a whole.

It was found that the prediction of chess was much more accurate

when we looked at different move combinations.

We were able to identify moves that moderately-rated players

employ leading up to game-losing moves such as blunders

or different opening moves that led to more success.

The analysis aims to help chess trainers and coaches in finding weak points

in beginner to moderately-rated players to help them increase their player rating.

They will also be able to formulate better strategies

and training exercises to help these players improve their skills.

Like I said, the increasing popularity of virtual chess

really encouraged us to complete this study.

We wanted to investigate and understand the differing game strategies

employed by beginner and moderately-rated players.

We wanted to determine the optimal winning strategy

for these players to help them

increase their rating on the online platform.

We wanted to learn how to help these players

be able to determine a specific strategy to utilize moving forward.

To begin with our methods, we started by sampling the data

we received from chess.com.

After cleaning and mining the data, we were able to collect

a random sample of players from the United States,

the United Kingdom, Canada, Australia, India, and Bangladesh.

Looking through our own research, we found that this is where chess

was most popular in the past few years.

So we really wanted to look at that data in specific.

Specifically, we looked at the data from October 4th, 2020 to March 4th, 2021.

We did this in order to avoid potential implications

from looking at data that occurred during the COVID-19 pandemic

when internet usage was at its highest.

We also were able to do some feature generation.

We generated two features which allowed for the users

to determine move combinations that led up to blunders or mistakes.

Here, we created the Blunder PGN and the Mistake PGN.

The Blunder PGN was just the record of moves that were made

by a player leading up to a blunder in chess.

The Mistake PGN was just a collection of moves that a player made

leading up to a mistake.

This is what allowed us to complete our analysis.

Next, we utilized a Python code to merge, join,

and compare all of the data that we collected.

This data was compiled of five games per player

from about a 1,000 to 1,400 player rating.

We selected 1000 games randomly from this selection of data.

While we were looking at this data, we wanted to do...

We measured it using a Stockfish depth of 20.

To describe these measures a little bit more,

it was measured in what we call a centipawn in chess.

A plus 100 centipawn signifies that there is an advantage

of one pawn of the white player over the black player.

During a blunder, this means that a move made

by one player has cost them a negative 500 centipawn disadvantage.

A mistake is equivalent to a negative 300 centipawn disadvantage.

A blunder is normally what occurs in a game-losing mistake.

Down to the bottom you can see some analysis that we conducted via JMP.

In this graph right here, it is the top ten

most used openings in blunders.

As you can see, the number one used opening that leads to blunders

is the Queen's Pawn Opening London system.

Secondly, we look at the Scandinavian Defence that is often used

and this can lead to blunders as well.

I will mention these again later in the results

and the conclusions of our presentation.

At the bottom you can just see two graphs.

These graphs just show the number of wins that are occurring per level of player.

We can look at the lowest-rated players

up to the highest-rated players.

These show just the average number of losses in comparison.

Over to the right you can see the blunder flag

which this is just the white player versus the black player.

At the bottom is the list of frequencies that occur

during these moves that are made to the left.

For example, you can see when we look at the London System Opening,

it is about half and half for white players

and black players in the wins and loss ratio.

However, when we look at the Scandinavian Defence, we can see that the white players

make blunders more often compared to the black players.

When we look at our results using the Blunder PGN and the Mistake PGN

features that we developed, we were able to identify a series of moves

that most moderately-rated players employ leading up to a losing move.

We identified three Blunder PGNs and four Mistake PGNs,

which players struggle with the most among all combinations.

For one, black players should refrain from the Blackmar Gambit

and the Scandinavian Defence.

The Blackmar Gambit only results in about 29.3% of wins

for black chess players.

Secondly, the Scandinavian Defence only equates to about 27.7%

for players that are using the black pawn.

White pawn players generally have an advantage here.

They do struggle with center openings though.

When we look at what moves and openings the white pawn players utilize

when they move strictly forward in the center,

they tend to lose games more often.

Lastly, weak openings and blundering players.

There were a few openings that we were able to identify

that consistently led to blunders in both players.

These were pawn sacrifices without compensation,

queen safety, and the development of pieces.

While we look at all of this data together and all of our results,

we were able to come up with a conclusion.

Moderately-rated players are most accurate and successful

when they employ standard openings.

They should be trained on the fundamentals of chess before learning

how to move on to complicated openings.

Some of the openings that we suggest that beginner players start off with

are the London System, and the Giuoco Piano game.

At this time, I would just like to thank you guys

and I will be accepting any questions that you have over the report.

Published on ‎05-20-2024 07:52 AM by | Updated on ‎07-23-2025 11:14 AM



