Boost in elo

Can someone explain this spike?
133 elo for one match.
https://bcverdict.github.io/?id=44927

Interesting. I’m not sure that chart’s accurate, unless that was its initial climb for the timescale. That algo is over a week old at this point, I think. It peaked around 2370 elo and then dipped sharply against some of the newer ones.

That match history is right, but the elo graph is way off. It’s been 2300+ for about a week and should have a lot more matches between 2200 and current elo.

I see, I think there is definitely something off with the graph.
Another thing I noticed: the initial elo value of an algo is always set to 1500 even when the algo is older than 100 matches, e.g. this algo: https://bcverdict.github.io/?id=40061
If I am not wrong, @bcverdict, you implemented your own elo calculation, which might not always match C1’s implementation?

Edit:
Looks like it - https://github.com/bcverdict/bcverdict.github.io/blob/master/main.js
chartConfig.data.datasets[0].data.push(index == matches.length - 1 ? elo : Math.round((previousElo += K * ((won ? 1 : 0) - 1 / (1 + Math.pow(10, (opponent.elo - previousElo) / 400))))))
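Unpacked, that one-liner appears to be the standard Elo update formula. A readable sketch of the same calculation (the K-factor value is an assumption here; the thread doesn’t show what K the site or C1 actually uses):

```javascript
// Standard Elo update, equivalent to the one-liner above.
// K = 32 is an assumed K-factor for illustration.
const K = 32;

function expectedScore(playerElo, opponentElo) {
  // Probability of the player winning, per the logistic Elo model.
  return 1 / (1 + Math.pow(10, (opponentElo - playerElo) / 400));
}

function updateElo(playerElo, opponentElo, won) {
  const score = won ? 1 : 0;
  return playerElo + K * (score - expectedScore(playerElo, opponentElo));
}

// Example: a 1500 player beats an equal 1500 opponent and gains K/2 = 16 points.
console.log(updateElo(1500, 1500, true)); // 1516
```

The important detail is that the update depends on the *opponent’s elo at match time*, which is exactly the number the API doesn’t provide, as discussed below.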

Yep, you spotted that correctly. This has been under discussion with @bcverdict.

You see, there are multiple problems here, because the match-history data retrievable from Terminal is very limited.

Limited history

This is the first obvious issue. As you correctly identified, @bcverdict’s current calculation assumes the algo was at 1500 elo when its first recorded match was played. A workaround would be back-tracing the initial elo in those cases with a numerical method, e.g. the Bisection method. This would, however, still not account for the second problem:
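The bisection idea could look roughly like this: find an initial elo such that replaying the known matches forward lands on the algo’s current elo. This is only a sketch under assumptions (K = 32, and opponent elos would be the *current* API values, so the result is still approximate):

```javascript
const K = 32;

// Replay a match history forward from a candidate starting elo.
// matches: [{ opponentElo, won }] in chronological order.
function replay(initialElo, matches) {
  let elo = initialElo;
  for (const m of matches) {
    const expected = 1 / (1 + Math.pow(10, (m.opponentElo - elo) / 400));
    elo += K * ((m.won ? 1 : 0) - expected);
  }
  return elo;
}

// replay() is monotonically increasing in initialElo (each update step
// shrinks but never flips differences in elo), so bisection converges.
function backtraceInitialElo(currentElo, matches, lo = 0, hi = 4000) {
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (replay(mid, matches) < currentElo) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

// Demo: simulate a history from a known start, then recover that start.
const demo = [
  { opponentElo: 1550, won: true },
  { opponentElo: 1600, won: false },
  { opponentElo: 1450, won: true },
];
const finalElo = replay(1500, demo);
console.log(backtraceInitialElo(finalElo, demo)); // ≈ 1500
```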

Insufficient information

Without a third-party application like Terminal Tools, which stays accurate because it stores the data independently, there is no way in Terminal to see what the opponent’s elo score was at the time of the match.

The only score you can get from C1’s API is the current elo of both algos, and working with these will lead to deviations from the real elo graph. There is simply no way to get accurate data.
On a larger scale these deviations will not be too bad (the graph is still usable and gives you an idea of elo over matches), because some opponents had a lower elo at the time of the match than they have now, and some had a higher one. There is, however, definitely a bias in one direction due to the elo distribution and how matchmaking works.

Having said this, if we wanted to accurately graph the elo over matches without error, we would need to:

  • either store all matches together with the elo scores at that time (simpler: store every algo’s elo continuously, which you could theoretically do for the Top 10 algos by visiting Terminal Tools all the time, or if I implemented a Cron job, which is too arbitrary for my taste and less personal :slight_smile:)

  • or use recursion to find the opponents’ elo scores (eventually there would be a first match, where we know the algo had 1500 elo, and then we can work forward from there)

Both of these options are hypothetical, although the bandwidth needed to track single algos is not that high.
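The recursive option can be sketched as follows. This is entirely hypothetical: the `histories` data (including an `opponentMatchIndex` field saying which of the opponent’s matches this was) is fabricated for the demo, the API does not actually expose it, and K = 32 is assumed:

```javascript
const K = 32;

// Fabricated demo data: each algo's matches in chronological order.
// opponentMatchIndex = how many matches the opponent had played before this one.
const histories = {
  A: [
    { opponentId: 'B', opponentMatchIndex: 0, won: true },
    { opponentId: 'B', opponentMatchIndex: 1, won: true },
  ],
  B: [
    { opponentId: 'A', opponentMatchIndex: 0, won: false },
    { opponentId: 'A', opponentMatchIndex: 1, won: false },
  ],
};

const cache = new Map();

// Elo of algoId just before its n-th match (0-indexed).
// Base case: every algo starts at 1500, so eloBeforeMatch(id, 0) === 1500.
function eloBeforeMatch(algoId, n) {
  const key = `${algoId}:${n}`;
  if (cache.has(key)) return cache.get(key);
  let elo = 1500;
  for (let i = 0; i < n; i++) {
    const m = histories[algoId][i];
    // Recurse: the opponent's elo at the time of this match. Terminates
    // because opponentMatchIndex always points at strictly earlier matches.
    const oppElo = eloBeforeMatch(m.opponentId, m.opponentMatchIndex);
    const expected = 1 / (1 + Math.pow(10, (oppElo - elo) / 400));
    elo += K * ((m.won ? 1 : 0) - expected);
  }
  cache.set(key, elo);
  return elo;
}

console.log(eloBeforeMatch('A', 2)); // A's elo just before its third match
```

The memoization cache matters here: without it, the recursion would recompute the same opponent histories exponentially often.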

Conclusion

You should treat the graphs over at bcverdict.github.io as visualizations of wins and losses rather than reading off the absolute values. The tool is great for getting an overview of the matches and watching them.

Also, the real question is what the point of per-match-accurate historical data really is. To me it seems like it would not be worth the effort.

@876584635678890 covered this pretty well, so I won’t add too much, but yes, the elo visualizer isn’t completely accurate due to the factors mentioned: changing and limited data. The jump at the end is there because the only number it can confirm with the API is the algo’s current elo.

The reason you might not see the dip in the middle is that the algos you played against might have still been climbing in elo: your actual elo was calculated when your opponent’s elo was, say, 1700, while ours is calculated with their new elo of 1780 (what the API provides). That decreases the apparent elo loss and makes the graph less defined.

We have a workaround for the limited data. It won’t be 100% accurate due to the changing data, but it should be pretty close after I apply some fixes. Still, it should only be used as a cool little visual of your algo’s elo growth. The other features, like watching the games and seeing who your rival might be, are something you can always rely on.

I’m going to discuss adding ‘elo at time of match’ to our stat tracking for matches with the team

I like seeing current elo, personally, as it helps give a unique identifier to algos. If I were to, say, only upload under the name “CantSeeMeNow” for the rest of the competition, people would have no idea which version they were dealing with if they only saw the elo at the time of the match.

I think there are a lot of advantages to both views; maybe there could be an option to select which you’d like to see? I’m not sure how hard it would be to add, since I haven’t looked at the source code for that project much, but I might give it a look.

Nothing has to change visually. The information only has to be accessible from the API.

I actually just realized we already do this. If you check out the ‘game_matchalgostat’ table, you will see ‘elo_before’ and ‘elo_after’. Each match has two corresponding matchalgostat entries, one for each algo. There is also a lot of other useful data in that table.

Oh ya, I don’t use the API directly and did not consider that you all might not have access to this table (which is the case). I’ll see what I can do about that.
