
Elo is a stochastic gradient descent approximation of logistic regression.

You can do much better just by actually running the logistic regression over the games. In this framework, any per-game bias, such as the characters chosen, is a trivial variable to add to the model and fit jointly.

Our ranking systems are holdovers from a time when the calculations had to be done by hand. If the whole set of games fits in RAM, there's no need to use ancient optimization methods.
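A minimal sketch of what this looks like: a Bradley-Terry-style logistic regression over the full game history, with a per-game character feature fit jointly with player skill. The game data, character names, learning rate, and regularization strength are all made-up toy values, not anything from the thread.

```python
import numpy as np

# Toy data (hypothetical): each game is (winner, loser, winner_char, loser_char).
games = [(0, 1, "mage", "mage"), (0, 1, "mage", "mage"),
         (1, 2, "mage", "mage"), (1, 2, "rogue", "mage"),
         (0, 2, "mage", "rogue")]
n_players = 3
chars = {"mage": 0, "rogue": 1}

# Design matrix: one column per player (+1 for the winner, -1 for the loser)
# and one per character (+1 if the winner played it, -1 if the loser did),
# so character advantage is estimated jointly with player skill.
X = np.zeros((len(games), n_players + len(chars)))
for row, (w, l, wc, lc) in enumerate(games):
    X[row, w], X[row, l] = 1.0, -1.0
    X[row, n_players + chars[wc]] += 1.0
    X[row, n_players + chars[lc]] -= 1.0
y = np.ones(len(games))  # every row is written winner-first

# Full-batch gradient ascent on the L2-regularized log-likelihood,
# instead of Elo's single stochastic pass over the games.
beta = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ beta))          # predicted P(winner wins)
    beta += 0.1 * (X.T @ (y - p) - 0.01 * beta)  # gradient step + ridge

ratings, char_effect = beta[:n_players], beta[n_players:]
```

Skill is only identified up to an additive constant, which is why the light ridge penalty is there: it anchors the ratings around zero. With the toy data above, player 0 (3-0) ends up rated above player 1 (2-2), who ends up above player 2 (0-3).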




Even that still assumes you can only update parameters once per game, and only for the players in that game. If I've played a large number of games against someone and the win rate is 50/50, and then that player plays in a tournament, my skill should move up or down in accordance with their performance in that tournament.


Not necessarily. I don't know how this works in Smash, but in competitive fencing I'd see people go 50-50 consistently locally, yet one would always do drastically better at nationals, year after year after year.

Right, like there are A-rank fencers, and then there are A-rank fencers who actually have a shot at placing on the points table.

I'm not sure why.


If you told me these facts about a random video game I'd guess the following:

- A high rank player can consistently execute a strategy that wins against the majority of players most of the time ("beats the meta")

- The above has a counter strategy, but this strategy often fails against the majority of the players ("loses to the meta")

When these two players meet, they go 50-50, but they have very different results in tournaments. Alternatively, one player is generally bad but exploits a particularly hard-to-observe weakness in the first.
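With toy numbers (all assumed for illustration), the arithmetic behind this is easy to see: an even head-to-head record coexists with very different expected tournament results.

```python
# Hypothetical win probabilities: A "beats the meta" (80% vs a typical
# opponent), B counters A but "loses to the meta" (40% vs typical).
p_a_vs_b = 0.5       # they go 50-50 head to head
p_a_vs_field = 0.8
p_b_vs_field = 0.4

# Expected wins in a 10-player round robin: one game against each other,
# eight against generic opponents from the field.
exp_wins_a = p_a_vs_b + 8 * p_a_vs_field
exp_wins_b = (1 - p_a_vs_b) + 8 * p_b_vs_field
print(exp_wins_a, exp_wins_b)  # roughly 6.9 vs 3.7
```

So a pairwise 50-50 record tells you almost nothing about relative tournament strength once results are intransitive.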

I know nothing about fencing, but I suspect something similar is going on here.


Yeah, I suspect you may be right. The ones I saw who did better in tournaments tended to have a more controlled, standard style. Nothing too fancy.


I agree in principle, and having new data affect the interpretation of old results was one of the goals for the rating system for a game I run [0]. But while I believe it's the right thing to do if the goal is to predict results more accurately, there are downsides.

Basically, players want rating systems to be reward loops: they hate systems where their rating can change randomly, and they want the system to be very volatile in response to their own results. If they go on a statistically insignificant winning streak, they want their rating to shoot up, not to hear the system say "meh, it's probably just random chance."

[0] https://www.snellman.net/blog/archive/2015-11-18-rating-syst...


I think if the system provides reliable results, people will come around. There are a lot of preferences that players have, but I think they ultimately come to respect systems that work.




