Predictive Analysis of Online Chess Outcomes and Success (2022-US-EPO-1169)

Ally Clifft, Student, Oklahoma State University
Kalbe Abbas Agharia, Student, Oklahoma State University

 

With the rise in internet usage during the COVID-19 pandemic, it is no surprise that online chess also grew in popularity. In this study, we investigated and analyzed low to moderately rated online chess players and the games they participated in. We utilized data sets from Chess.com, which provided data on individual players, clubs, tournaments, teams, countries, daily puzzles, streamers, and leaderboards. We utilized JMP and Python to complete our analysis. We took a random sample of low to moderately rated players from October 4, 2020 to March 4, 2021. We noted the portable game notation and specific moves completed by each player. When beginner-level chess players utilize certain moves constantly, they are more likely to see consistent wins, therefore increasing their status from beginner to moderate player. When these moves were examined individually, their impact on success was unclear; however, when move combinations were examined, the prediction of success was much more accurate. The results of our analysis allowed us to identify a series of moves most moderately rated players employ leading up to a game-losing move. Because the COVID-19 pandemic occurred during the data collection, the data may be skewed. External environmental factors such as the pandemic may lead to inaccurate results and findings. This research and analysis aims to help chess trainers and coaches formulate better strategies and training exercises to help beginner to moderately rated players improve their skills.

 

Introduction: Chess is one of the oldest and most widespread sports in the world. With the introduction of new technology and increasing internet accessibility, people have been given the opportunity to play chess in virtually any area of the world. As popularity and access to chess continue to increase, it is important for players to understand the best way to improve their game skills. In this study, we investigate chess players with a low or moderate rating. Games played by these players are examined in depth to allow us to better understand the blunders and mistakes that determine the results of games and, in turn, change player ranking. To first grasp the reach of our study, we set out to understand external factors that have affected the population of the chess community, such as advancing technology and the COVID-19 pandemic. The research provided will help players and readers reach a better understanding of the openings and tactics that are most beneficial to low and moderately ranked players when navigating online chess. In this case, low rated players are defined as players rated between 800 and 1000, while moderately rated players are defined as players rated between 1000 and 1300. Our research is restricted to six countries: Canada, Australia, United Kingdom, United States, India, and Bangladesh. The overall purpose of this study is to pinpoint consistent blunders and mistake patterns in moderately ranked players and utilize them to devise strategies that will increase competition and wins. This study points in a direction in which an optimal winning strategy can be determined, and can ultimately help online chess players change their player ranking.

 

Data Overview: The data collected to complete this research was provided by Chess.com, one of the top online chess communities, which offers players online chess games for free. We accessed the website's API, where we gathered data on individual players such as their profile, titled players, stats, and online status. In addition, we had access to specific games, including current daily chess, concise to-move daily chess, available archives, monthly archives, and multi-game PGN download. In order to complete a more accurate and in-depth analysis, we also downloaded and utilized specific country data, including the country profile, the list of players in each country, and the list of clubs within the country.

 

Access to data: https://www.chess.com/news/view/published-data-api / https://lichess.org/api
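As a minimal sketch of how the monthly archives mentioned above could be addressed, the snippet below builds one archive URL per calendar month in the study window. The URL pattern follows the monthly-archive endpoint of the Chess.com published-data API as we understand it; the username `example_player` is hypothetical, and no network request is made here. Each archive covers a whole month, so games before October 4 or after March 4 would still need to be filtered out afterwards.

```python
from datetime import date

# Monthly-archive URL pattern of the Chess.com published-data API
# (assumed here; check the API documentation before relying on it).
ARCHIVE_URL = "https://api.chess.com/pub/player/{username}/games/{year}/{month:02d}"


def monthly_archive_urls(username, start, end):
    """Return one archive URL per calendar month from start to end, inclusive."""
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        urls.append(
            ARCHIVE_URL.format(username=username.lower(), year=year, month=month)
        )
        # Advance to the next calendar month, rolling over the year in December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return urls


# "example_player" is a hypothetical username used only for illustration.
urls = monthly_archive_urls("example_player", date(2020, 10, 4), date(2021, 3, 4))
for url in urls:
    print(url)
```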

 

Mined data

| Name of the Variable | Description |
| --- | --- |
| Username | Username of both players |
| Elo | Elo rating of both players |
| Result | Result of the match |
| ECO Code | Unique code indicating the opening employed in the game |
| PGN (Portable Game Notation) | The entire series of moves in the game in a text format |
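The mined variables correspond to tag pairs in a game's PGN header, so they can be read from the PGN text itself. The sketch below is one possible way to do this with the standard library only; the sample game, the player names, and the output field names are invented for illustration, and a dedicated parser such as python-chess (listed in the references) would be more robust on real data.

```python
import re

# A PGN tag pair looks like: [WhiteElo "1180"]
TAG_PAIR = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

# Invented sample game; the player names and ratings are hypothetical.
SAMPLE_PGN = """[Event "Live Chess"]
[White "player_one"]
[Black "player_two"]
[Result "1-0"]
[ECO "B01"]
[WhiteElo "1180"]
[BlackElo "1215"]

1. e4 d5 2. exd5 Qxd5 3. Nc3 Qa5 1-0
"""


def parse_pgn_headers(pgn_text):
    """Collect the PGN tag pairs into a dict; movetext lines are skipped."""
    headers = {}
    for line in pgn_text.splitlines():
        match = TAG_PAIR.match(line.strip())
        if match:
            headers[match.group(1)] = match.group(2)
    return headers


def to_int(value):
    """Convert a rating string to int, or None if missing or non-numeric."""
    return int(value) if value is not None and value.isdigit() else None


def mined_variables(pgn_text):
    """Map the PGN header onto the variables listed in the mined-data table."""
    h = parse_pgn_headers(pgn_text)
    return {
        "white_username": h.get("White"),
        "black_username": h.get("Black"),
        "white_elo": to_int(h.get("WhiteElo")),
        "black_elo": to_int(h.get("BlackElo")),
        "result": h.get("Result"),
        "eco": h.get("ECO"),
    }


row = mined_variables(SAMPLE_PGN)
print(row)
```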

 

Generated Data

| Name of the Variable | Description |
| --- | --- |
| Blunder PGN | PGN of moves leading up to a blunder |
| Mistake PGN | PGN of moves leading up to a mistake |



Method: Our approach began with the cleaning and mining of the accessed data. The data was mostly clean when it was received; however, minor edits and changes were needed before continuing with the analysis. After the data was cleaned, reviewed, and processed, we took a random sample of low to moderately rated players in the United States, United Kingdom, Canada, Australia, India, and Bangladesh from October 4th, 2020 to March 4th, 2021. For each randomly selected player, we investigated five rapid games that the player participated in. As each rapid game was assessed, we noted the portable game notation and the specific moves made by each player. After our initial assessment and investigation, we utilized Python to write code that would allow us to merge, join, and compare the datasets compiled for each country and its selected players.
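The sampling step described above (rapid games only, a rating window, up to five games per player) might be sketched as follows. The flat record layout with `player`, `rating`, and `time_class` keys is a simplifying assumption made for this example, as are the fixed seed and the rating bounds passed in the usage line.

```python
import random


def sample_rapid_games(games, low, high, per_player=5, seed=0):
    """Pick up to `per_player` random rapid games for each player in [low, high]."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    eligible = {}
    for game in games:
        if game["time_class"] == "rapid" and low <= game["rating"] <= high:
            eligible.setdefault(game["player"], []).append(game)

    sample = []
    for player in sorted(eligible):
        pool = eligible[player]
        # Players with fewer than `per_player` eligible games contribute all of them.
        sample.extend(rng.sample(pool, min(per_player, len(pool))))
    return sample


# Invented records: the field names are hypothetical, not the raw API layout.
games = (
    [{"player": "a", "rating": 1100, "time_class": "rapid", "id": i} for i in range(7)]
    + [{"player": "b", "rating": 1200, "time_class": "rapid", "id": i} for i in range(3)]
    + [{"player": "c", "rating": 1150, "time_class": "blitz", "id": i} for i in range(2)]
    + [{"player": "d", "rating": 1800, "time_class": "rapid", "id": 0}]
)
sample = sample_rapid_games(games, low=1000, high=1400)
print(len(sample))
```

Per-country datasets sampled this way can then be concatenated into one list (or one data frame) before the comparison step.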

A random sample of 1000 games was selected from the pool of users in our target rating range of 1000-1400. These games were then analyzed using Stockfish at a depth of 20. Using the Stockfish evaluation of the position at each move, we compute a score indicating quantitatively which player has the better position. The unit of this score is the centipawn: a score of +100 centipawns signifies an advantage of one pawn for the white player over the black player. After each move, a new score was calculated along with the change in score from the previous move. We define two classes of moves, a blunder and a mistake. A blunder is a move that costs the player at least 500 centipawns, while a mistake has a threshold of 300 centipawns. A blunder leaves the blundering player in a worse position and raises that player's chances of losing.
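The thresholding rule can be illustrated with a short sketch. It assumes the engine evaluations are already available as a list of centipawn scores from White's point of view, with the first entry being the starting position; the evaluation values below are invented, and treating the thresholds as inclusive is our assumption. Obtaining the scores from Stockfish and handling forced-mate scores are outside this example.

```python
BLUNDER_CP = 500  # centipawn loss that marks a blunder
MISTAKE_CP = 300  # centipawn loss that marks a mistake


def classify_moves(evals_cp):
    """Label each half-move from evaluations given from White's point of view.

    evals_cp[0] is the evaluation before any move; evals_cp[i] is the
    evaluation after half-move i. White makes the odd-numbered half-moves.
    """
    labels = []
    for ply in range(1, len(evals_cp)):
        change = evals_cp[ply] - evals_cp[ply - 1]
        white_moved = ply % 2 == 1
        # A falling score hurts White; a rising score hurts Black.
        loss = -change if white_moved else change
        if loss >= BLUNDER_CP:
            label = "blunder"
        elif loss >= MISTAKE_CP:
            label = "mistake"
        else:
            label = "ok"
        labels.append((ply, "white" if white_moved else "black", loss, label))
    return labels


# Invented evaluations for a six half-move fragment.
evals = [20, 30, 25, -300, -280, -290, 250]
labels = classify_moves(evals)
for entry in labels:
    print(entry)
```

In this invented fragment, White's second move drops the evaluation by 325 centipawns and is labeled a mistake, while Black's third move swings it by 540 centipawns in White's favor and is labeled a blunder.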

By identifying blunders and mistakes, we generate the variable Blunder_pgn, a PGN string containing the series of moves leading up to the blunder. The Mistake PGN variable is generated in the same way for mistakes.
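A possible way to build such a string is sketched below, assuming the game's moves are available as a list in standard algebraic notation and the half-move index of the blunder is already known. Whether the blundering move itself belongs in the string is not stated, so the `include_blunder` flag is our own addition, and the function name is hypothetical.

```python
def blunder_pgn(san_moves, blunder_ply, include_blunder=True):
    """Format the moves leading up to half-move `blunder_ply` as PGN movetext.

    `blunder_ply` is 1-based: 1 is White's first move, 2 is Black's reply.
    """
    last = blunder_ply if include_blunder else blunder_ply - 1
    parts = []
    for index, san in enumerate(san_moves[:last]):
        if index % 2 == 0:
            # Write the move number before each White move.
            parts.append(f"{index // 2 + 1}.")
        parts.append(san)
    return " ".join(parts)


# Invented example: suppose half-move 4 (Black's second move) was flagged.
moves = ["e4", "d5", "exd5", "Qxd5", "Nc3", "Qa5"]
print(blunder_pgn(moves, 4))
print(blunder_pgn(moves, 4, include_blunder=False))
```

The same function can serve for the Mistake PGN by passing the half-move index of a mistake instead.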

 

Results:

Using Blunder PGN and Mistake PGN, we were able to identify a series of moves most moderately rated players employ leading up to a game-losing move. We identified 3 Blunder PGNs and 4 Mistake PGNs that players struggle with the most among all combinations at our target rating level.
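One simple way to surface the most troublesome sequences is to count how often each Blunder PGN string occurs across the sample. The sketch below assumes exact string matching and uses invented input; the actual selection criterion used in the study is not specified, so this is only an illustration.

```python
from collections import Counter


def most_common_sequences(pgn_strings, top_n):
    """Return the `top_n` most frequent move sequences with their counts."""
    return Counter(pgn_strings).most_common(top_n)


# Invented Blunder PGN values for illustration only.
blunder_pgns = [
    "1. e4 d5 2. exd5 Qxd5",
    "1. e4 d5 2. exd5 Qxd5",
    "1. e4 d5 2. exd5 Qxd5",
    "1. e4 e5 2. d4 exd4",
    "1. e4 e5 2. d4 exd4",
    "1. d4 d5 2. e4 dxe4",
]
top = most_common_sequences(blunder_pgns, top_n=2)
print(top)
```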

Pic 1: Scandinavian Defense and its success rate

Pic 2: Blackmar Gambit and its success rate

Pic 3: Center Game and its success rate

[Figure: Mistake-prone openings]


Implications:

Black players should refrain from the Blackmar Gambit and the Scandinavian Defense.

White players generally have an advantage but tend to struggle with the Center Game openings.

While different openings present different problems, blundering players show a general trend of weak opening principles, specifically:

  1. Pawn sacrifices without compensation
  2. Queen safety
  3. Development of pieces
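The per-opening success rates behind these implications can be computed by grouping games by opening and taking the share won by one side. The sketch below does this for Black using the PGN result string `0-1`; the opening labels and results are invented, and counting draws as non-wins is our assumption.

```python
from collections import defaultdict


def black_win_rate_by_opening(games):
    """Return the share of games won by Black for each opening.

    `games` is an iterable of (opening_name, result) pairs, where result is
    a PGN result string: "1-0", "0-1", or "1/2-1/2".
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for opening, result in games:
        totals[opening] += 1
        if result == "0-1":  # Black won; draws count as non-wins here
            wins[opening] += 1
    return {opening: wins[opening] / totals[opening] for opening in totals}


# Invented results for illustration only.
games = [
    ("Scandinavian Defense", "0-1"),
    ("Scandinavian Defense", "1-0"),
    ("Scandinavian Defense", "1-0"),
    ("Scandinavian Defense", "1/2-1/2"),
    ("Center Game", "0-1"),
    ("Center Game", "1-0"),
]
rates = black_win_rate_by_opening(games)
print(rates)
```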

 

Conclusion:

Moderately rated players play most accurately when they employ standard openings such as the London System and the Giuoco Piano, and hence should be trained on these fundamentals first before moving on to complicated openings.

 

References:

 

https://www.chess.com/analysis

https://python-chess.readthedocs.io/en/latest/pgn.html

https://stockfishchess.org/

 

 

All right, good afternoon. Today I'm going to be talking about the predictive analysis of online chess outcomes and success. My name is Allison Clifft, and I had the opportunity to work on this project with another student in my business analytics program, Kalbe Abbas Agharia; however, he is not with us here today.

To begin, we analyzed low- and moderately-rated online chess players. With the increase in internet usage since the COVID-19 pandemic, as well as the advancement of technology, people have switched over to playing online chess, as it is more readily available to users. We wanted to look at the effectiveness of different game strategies, specific moves, and individual techniques, and their impact on potential wins or potential losses in the game of chess.

Player data was pulled from Chess.com, where we were able to view the profile of the player, titled players, their statistics, and the online gamer status. We utilized JMP and Python to complete the study. We noted the Portable Game Notation, also known as the PGN. This was used to determine the openings, blunders, and mistakes that were occurring during the competition.

We learnt that looking at individual moves on their own was not as predictive as looking at move combinations as a whole. It was found that the prediction of chess outcomes was much more accurate when we looked at different move combinations. We were able to identify moves that moderately-rated players employ leading up to game-losing moves such as blunders, or different opening moves that led to more success.

The analysis aims to help chess trainers and coaches find weak points in beginner to moderately-rated players to help them increase their player rating. They will also be able to formulate better strategies and training exercises to help these players improve their skills.

Like I said, the increasing popularity of virtual chess really encouraged us to complete this study. We wanted to investigate and understand the differing game strategies employed by beginner and moderately-rated players. We wanted to determine the optimal winning strategy for these players to help them increase their rating on the online platform. We wanted to learn how to help these players determine a specific strategy to utilize moving forward.

To begin with our methods, we started by sampling the data we received from Chess.com. After cleaning and mining the data, we were able to collect a random sample of players from the United States, the United Kingdom, Canada, Australia, India, and Bangladesh. Looking through our own research, we found that this is where chess was most popular in the past few years, so we really wanted to look at that data in particular. Specifically, we looked at the data from October 4th, 2020 to March 4th, 2021. We did this in order to avoid potential implications from looking at data that occurred during the COVID-19 pandemic when internet usage was at its highest.

We also were able to do some feature generation. We generated two features which allowed the users to determine move combinations that led up to blunders or mistakes. Here, we created the Blunder PGN and the Mistake PGN. The Blunder PGN was just the record of moves that were made by a player leading up to a blunder in chess. The Mistake PGN was just a collection of moves that a player made leading up to a mistake. This is what allowed us to complete our analysis.

Next, we utilized Python code to merge, join, and compare all of the data that we collected. This data was compiled from five games per player, for players rated from about 1,000 to 1,400. We selected 1,000 games randomly from this selection of data. We measured it using a Stockfish depth of 20. To describe these measures a little bit more, it was measured in what we call a centipawn in chess. A plus 100 centipawn score signifies that there is an advantage of one pawn for the white player over the black player. A blunder means that a move made by one player has cost them a 500-centipawn disadvantage. A mistake is equivalent to a 300-centipawn disadvantage. A blunder is normally what occurs in a game-losing mistake.

Down at the bottom you can see some analysis that we conducted via JMP. This graph right here shows the top ten most used openings in blunders. As you can see, the number one opening that leads to blunders is the Queen's Pawn Opening, London System. Secondly, we look at the Scandinavian Defence, which is often used, and this can lead to blunders as well. I will mention these again later in the results and the conclusions of our presentation.

At the bottom you can see two graphs. These graphs show the number of wins that are occurring per level of player. We can look at the lowest-rated players up to the highest-rated players. These show the average number of losses in comparison. Over to the right you can see the blunder flag, which is just the white player versus the black player. At the bottom is the list of frequencies that occur during these moves that are made, to the left.

For example, you can see when we look at the London System opening, it is about half and half for white players and black players in the win and loss ratio. However, when we look at the Scandinavian Defence, we can see that the white players make blunders more often compared to the black players.

When we look at our results using the Blunder PGN and the Mistake PGN features that we developed, we were able to identify a series of moves that most moderately-rated players employ leading up to a losing move. We identified three Blunder PGNs and four Mistake PGNs which players struggle with the most among all combinations.

For one, black players should refrain from the Blackmar Gambit and the Scandinavian Defence. The Blackmar Gambit only results in about 29.3% of wins for black chess players. Secondly, the Scandinavian Defence only results in about 27.7% of wins for players that are using the black pawn.

White pawn players generally have an advantage here. They do struggle with center openings, though. When we look at what moves and openings the white pawn players utilize, when they move strictly forward in the center, they tend to lose games more often.

Lastly, weak openings and blundering players. There were a few openings that we were able to identify that consistently led to blunders in both players. These were pawn sacrifices without compensation, queen safety, and the development of pieces.

When we look at all of this data together and all of our results, we were able to come up with a conclusion. Moderately-rated players are most accurate and successful when they employ standard openings. They should be trained on the fundamentals of chess before learning how to move on to complicated openings. Some of the openings that we suggest beginner players start off with are the London System and the Giuoco Piano game.

At this time, I would just like to thank you guys, and I will be accepting any questions that you have over the report.