The Elo Rating system is a method to rate players in chess and other competitive games. A new player starts with a rating of 1000. This rating will go up if they win games, and go down if they lose games. Over time a player’s rating becomes a true reflection of their ability – relative to the population.
My video was mostly based on A Comprehensive Guide to Chess Ratings by Prof Mark E Glickman
Below are some of the things I wanted to talk about, but cut so the video wasn’t too long!
Some explanations of the Elo rating system say it is based on the normal distribution, which is not quite true. Elo’s original idea did model each player’s ability as a normal distribution. The difference between the two players’ strengths would then also be a normal distribution. However, the formula for a normal distribution is a bit messy, so today it is preferred to model each player using an extreme value distribution. The difference between the two players’ strengths is then a logistic distribution. This has the property that if a player has a rating 400 points more than another player, they are 10 times more likely to win than their opponent, which makes the formula nicer to use. Practically, the difference between a logistic distribution and the normal distribution is small.
Logistic distribution on Wikipedia
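In the cdf we replace e with base 10 and set s = 400, mu = R_A - R_B and x = 0. This gives 1 / (1 + 10^((R_A - R_B)/400)), the probability that the difference falls below zero, which is player B's expected score; player A's expected score is one minus this.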
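To make the 400-point property concrete, here is a minimal Python sketch of player A's expected score under the base-10 logistic model, 1 / (1 + 10^((R_B - R_A)/400)), next to a normal-model counterpart. The function names are my own, and the standard deviation of 200·√2 for the rating difference is an assumption (the figure usually quoted for Elo's original model), not something stated above.

```python
import math


def expected_score_logistic(r_a: float, r_b: float) -> float:
    """Expected score of A against B: base-10 logistic model with scale 400."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def expected_score_normal(r_a: float, r_b: float,
                          sd: float = 200.0 * math.sqrt(2.0)) -> float:
    """Normal-model counterpart.

    ASSUMPTION: sd = 200 * sqrt(2) for the rating difference; this value
    is illustrative and does not come from the text above.
    """
    z = (r_a - r_b) / sd
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Print both models side by side for a few rating gaps.
for diff in (0, 100, 200, 400, 800):
    print(diff,
          round(expected_score_logistic(diff, 0), 4),
          round(expected_score_normal(diff, 0), 4))
```

With a 400-point gap the logistic model gives an expected score of 10/11, i.e. odds of 10 to 1, and the two models stay close for small gaps and drift apart in the tails.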
For the update formula I said that your rating can increase or decrease by a maximum of 32 points, and that there was no special reason for that number. This value is called the K-factor: the higher the K-factor, the more weight you give to the player’s tournament performance (and so the less weight to their pre-tournament rating). High-level chess tournaments use a K-factor of 16, as it is believed the players’ pre-tournament ratings are about right, so their ratings will not fluctuate as much. Some tournaments use different K-factors.
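As a rough illustration of how the K-factor caps the change, here is a small Python sketch of the standard Elo update, new rating = old rating + K × (actual score − expected score). The function names are hypothetical, and scoring a draw as 0.5 is the usual convention rather than anything specific to one federation.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score against an opponent (base-10 logistic, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update_rating(rating: float, opponent: float, score: float,
                  k: float = 32.0) -> float:
    """Return the new rating after one game.

    score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor; since (score - expected) lies between -1 and 1,
    the rating can never move by more than k points in a single game.
    """
    return rating + k * (score - expected_score(rating, opponent))


# Two equally rated players: the winner gains half of K.
print(update_rating(1500, 1500, 1.0))          # K = 32
print(update_rating(1500, 1500, 1.0, k=16.0))  # K = 16
```

Halving K from 32 to 16 halves every rating change, which is why a lower K-factor makes established ratings fluctuate less.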
In the original Elo system, draws are not modelled separately; instead they are considered to be equivalent to half a win and half a loss. The paper by Mark Glickman above contains a formula that includes draws. Similarly, the paper contains a formula that includes the advantage to white.
Another criticism of Elo is the reliability of the rating. The rating of an infrequent player is a less reliable measure of that player’s strength, so to address this problem Mark Glickman devised Glicko and Glicko2. See descriptions of these methods at
On the plus side, the Elo system was leagues ahead of what it replaced, known as the Harkness system. I originally intended to explain the Harkness system as well, so here are the paragraphs I cut:
“In the Harkness system an average was taken of everyone’s rating, then at the end of the tournament if the percentage of games you won was 50% then your new rating was the average rating.
If you did better or worse than 50% then 10 points were added to or subtracted from the average rating for every percentage point above or below 50.
This system was not the best and could produce some strange results. For example, it was possible for a player to lose every game and still gain points.”
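The strange result mentioned in the cut paragraphs can be reproduced with a short Python sketch of the rule exactly as it is described there. The function name and the example ratings are hypothetical, and this follows only the simplified description above, not necessarily every detail of the historical Harkness system.

```python
def harkness_as_described(average_rating: float, wins: int, games: int) -> float:
    """New rating under the rule as described in the cut paragraphs.

    Start from the tournament's average rating and add or subtract
    10 points for every percentage point above or below 50%.
    """
    win_percentage = 100.0 * wins / games
    return average_rating + 10.0 * (win_percentage - 50.0)


# Hypothetical example: a 1000-rated player enters a tournament whose
# average rating is 1600 and loses all five games.
old_rating = 1000.0
new_rating = harkness_as_described(1600.0, wins=0, games=5)
print(old_rating, new_rating)
```

In this made-up case the player scores 0% and still ends up with a higher rating than they started with, because the new rating depends only on the tournament average and the score, not on the player's previous rating.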
This video was suggested by Outray Chess. The maths is a bit harder, but I liked the idea so I made an in-front-of-a-wall video.
3 October 2021
FARMING WORKS TROLOLOL
Look this up:
> Farming chess960 on lichess: I am on a 30 win streak, having gained 74 points (1553 to 1627) in the past 4 days. I just challenged a bunch of 1399 standard blitz and lower who haven't played 9LX much so their rating is treated as 1500. When I win/lose, it's +3/-8. I think this is a good deal.
Or look me up on Reddit or lichess. Farming works. Lol.
Is this similar to the algorithm used in "the social network" movie to rank the girl's hotness?
Nice explanation. Can you show the derivation of the formula, in the comments or something?
2:32 Is this really a logistic curve? I thought those are S-shaped, with the limit of −∞ being at 0 and the limit of +∞ being at 1 (or some transformation applied to it). The sum (and therefore the difference) is also a normal distribution, with the mean being the difference of the prior means, and the variance being the sum of the prior variances.
Isn't this the guy from numberphile?
These comments are way too distracting. I can’t pay attention to this guy because I’m laughing at most of the comments.
I was hoping to have an explanation of the complete chess rating system (which is very confusing). It changes according to the type of game one plays (blitz, bullet, rapid, classical) and I think also from one website to another and from one chess organization to the other. Very complex.
Isn't Árpád Elo hungarian?
4:44 I would guess it's 32 because there are 64 squares on a chess board. 32 squares are your 'territory', 32 are your opponent's. so when you win you take your opponent's territory.
I don't get it. I played cribbage online and skunked a person, then played the same person again. I got fewer points than they did when they won the second game.
Can you prove (mathematically) that given any initial rating, after many matches you converge to your "real" rating? I am thinking about Markov chains and limiting probabilities
can we have a video about how this whole concept would work in a team game? what would the formula be of a player's elo change winning or losing against a team?
Lighting in this video is really low. I thought my brightness was wrong.
Competitive NES tetris is the first place I've seen ELO
I had a blitz rating approaching 1400 a few years ago. I struggle holding my 1200 today
really makes you realize how insanely crazy good super GMs are compared to GMs. It's not even close
Hey, isn’t this the numberphile guy?
this guy is describing the opposite of the algorithm for quick money
where have I seen this person before, either ive seen him, or he has the face of a thousand people
In table tennis in the Netherlands we use a similar system, but the max points gained/lost in a single match isn't 32, but 63. But i also don't feel like a 400 point difference is a 10× offset in that system. I'm pretty sure I could beat people rated 400 points lower than me closer to 99/100 times.
Average Dead by daylight MMR system enjoyer: "You're wrong."
I comment to help statistics
It's + -32 because that's a 64 point spread, as in 64 squares in a chess board
I have a couple questions. Firstly, is the variance for a normal player and a GM necessarily the same? Personally I'm rated 1100 and my skill varies greatly while I doubt Magnus varies this much. So surely the shape of the distributions is different for different levels?
great video 🙂
I don't play chess but I do play CS:GO and they use a modified version of Elo (GLICKO-2) so getting this recommended is very interesting.
Yes I’m happy 😃
Well I guess I understood nothing
I learned that the probability of winning was based on a Normal Distribution with a mean of 0 and a standard deviation of 283.165. That is, p(A wins) = N(RA-RB, 0, 283.165). Both formulas give the same value (24%) when player B is rated 200 points higher than player A. However, the Normal Distribution formula tapers off more sharply for higher differences in the ELO ratings.
recently reached 1601, Class B. taking a break so I don't go on a huge loss cycle.
I will forever be in Low Elo 🙁
Élő Árpád was Hungarian 😡
What a great explanation many thanks for it!
Just one note: Elo Arpad was actually Hungarian
Like Wigner Jeno, Szilard Leo, Neumann Janos, Teller Ede, Rubik Erno, etc…
This means that if I play a million games with Magnus Carlsen, I am expected to win about 10.
Magic the gathering USED to use ELO. and in place of the 32… was a number which varied according to the importance of the event.. (8 for local games at social clubs, 16 for medium tournaments, 32 for larger events, and 64 for world championships) (approx, it was a long time ago)
"400 points difference means that player a is 10 times more likely to win". Well is there an explanation for that or is it an arbitrary decision made by elo to try and forecast a result of a game?
@singingbanana Was there ever a follow-up on how the elo of new players is calculated? After 20 matches (at least in Belgium) an estimation of the elo is made. I know it is based on the average elo of those 20 adversaries and the winning chance against that average (supposedly 1150)?
This is also one of the reasons that there is some kind of elo inflation.
"Forget Chess for the moment" Thinks about Chess
This fails to take into account players like myself who have an 80% chance of losing regardless of my rating.
Dr. Elo was a very smart man but it's not a flawless system, in part bc FIDE has 2 or 3 ratings total now. I think a better idea is an all-around class rank for anything below IM. USCF is even worse. 1500s constantly beat 1900s IME. One of the funniest things ever was being an unrated player and shellacking an expert in a tournament. Wow, what a whiner. The whining does go down as your rating goes up. Also, winning 10-1 being 400 points, poor choice. That should be around 500-550, especially online. I love losing a blitz game bc of a mouse slip and then stomping someone 11 times in a row.
i started chess around 2 weeks ago and am at 550 elo is that good or bad or average?
Match making opponents based on elo score is a great way to take the fun out of any competition. You are playing against someone with identical skills as yourself so you win and lose pretty much 50% of the time. Forever. You never get to play someone much better than you so can see where they excel and try to learn from them. You never get to play against someone weaker than you and share your skills or tips or tricks or knowledge with them. Instead you just essentially play against yourself over and over and over until you rage quit in frustration. Having a rating system is good. Always matching opponents based on that rating is absolutely terrible. The phrase "elo hell" exists because of shitty match making systems that always put opponents of similar strength against each other.
Isn’t this the numberphile guy?
Dizzy? Is dat chiu?
anyone else having an urge to log into lichess and lose some elo?
This is using the exponential approximation of the normal-distribution density, right?