Updating algorithms

CarpetScratcher · October 28, 2018, 8:58pm

I have an algorithm with a lot of ELO and I was wondering if I can update the algorithm without removing it, so that it keeps its ELO. And if that is possible, how can I do it?

kkroep · October 29, 2018, 8:23am

You can’t right now, but I think the problem you are trying to address is how slowly algorithms converge to the ELO rating they should have. I think C1Ryan is working on that right now.

I recently uploaded an algorithm that won all of its approximately 20 matches, including four against +1800 opponents and one against a +2000 opponent, and its ELO rating is stuck down at 1650 at the moment of writing.

A properly functioning ELO system would have algorithms converge more quickly by sparring them against opponents of similar rank. A win streak would then indicate that an algorithm is in the wrong ELO range, which cannot be detected if every match-up is completely random.

[edit] Actually that example might not be the best one. As it already sparred against some high algos, it can at least rise decently fast. If an algo is only sparred against sub-1400 algos, it takes forever. So the first priority would be a matchmaking system where the outcome of the match is not obvious to the matchmaker in advance, and where it tries to find the ELO at which every algorithm wins and loses about half of the time.

If this worked as intended, there would be no need to borrow ELO from previous algos, and I think such functionality would introduce new problems. (Imagine a +2000 ELO algorithm being replaced by a crashing algo; that would shake up the ladder a lot for the few lucky opponents.)

876584635678890 · October 29, 2018, 2:55pm

@kkroep I believe win streaks do not indicate that an algo is placed wrongly. It is pure coincidence when the wins occur, and they are wins you would eventually get anyway if the algo is good enough.

RegularRyan · October 29, 2018, 3:16pm

To answer CarpetScratcher’s question, we consider those two algos to be different algos, and you will have to submit your algo in a separate slot.

On kkroep’s point, we will be improving matchmaking dramatically this week. I have already made a few of the changes. The main one still to do is fixing a bug where many more matches are being scheduled than there should be, leading to many algos being unavailable for matches.

I’ll push for the 400 Elo gap limit to be hotfixed in today, since that’s done, so that you don’t get those 0 Elo games.
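For readers wondering where "0 Elo games" come from, here is a minimal sketch of the textbook Elo update. The K-factor of 32, the rounding to whole points, and the helper names are assumptions for illustration; Terminal's actual parameters are not stated in this thread.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score (win probability) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_delta(rating: float, opponent: float, score: float, k: float = 32.0) -> int:
    """Rating change for one game; score is 1.0 for a win, 0.0 for a loss.

    k=32 and rounding to whole points are assumed, not Terminal's settings.
    """
    return round(k * (score - expected_score(rating, opponent)))


def within_gap(a: float, b: float, max_gap: float = 400.0) -> bool:
    """Hypothetical matchmaking filter: only pair algos within max_gap."""
    return abs(a - b) <= max_gap


# Even match-up: the winner gains half of K.
print(elo_delta(1600, 1600, 1.0))
# 800-point gap: the favourite's win rounds to no gain at all.
print(elo_delta(2200, 1400, 1.0), within_gap(2200, 1400))
```

Under these assumptions, a gap limit keeps every scheduled match worth at least a few points to the favourite.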

kkroep · October 29, 2018, 3:20pm

If you post a new algorithm, and it is sparred against the 1400-1600 range, and it has a large win streak, that would indicate the assigned base ELO is not correct. I am an online chess player, and there they use ELO this way. When you haven’t played the game for a while, or are completely new to the game, the ELO scales more rapidly until the win percentage of the last couple of games falls below a certain threshold. The same method is applied when all matches are lost. This saves grandmasters the frustration of grinding through a sea of low-level players, and it saves beginner chess players from losing their bottoms blue. Without it, the ELO of players matched against these wrongly rated players is also affected the wrong way, with far too much penalty or boost, causing the leaderboard to be partially luck based.
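A rough sketch of that idea, assuming a larger K-factor while an algo is new or while its recent results are lopsided. The class name, window size, threshold, and K values are all made up for illustration and are not taken from any chess site or from Terminal.

```python
from collections import deque


def expected_score(rating: float, opponent: float) -> float:
    """Textbook Elo expected score for `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


class StreakAwareRating:
    """Elo rating whose K-factor is boosted for new or streaking algos.

    All default values are illustrative assumptions.
    """

    def __init__(self, rating=1500.0, k_base=16.0, k_boost=64.0,
                 window=6, threshold=0.8):
        self.rating = rating
        self.k_base = k_base
        self.k_boost = k_boost
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # scores of the last `window` games

    def k_factor(self) -> float:
        # Too few games yet: treat the rating as unreliable and move fast.
        if len(self.recent) < self.recent.maxlen:
            return self.k_boost
        win_rate = sum(self.recent) / len(self.recent)
        lopsided = win_rate >= self.threshold or win_rate <= 1 - self.threshold
        return self.k_boost if lopsided else self.k_base

    def record(self, opponent: float, score: float) -> None:
        """Apply one result; score is 1.0 for a win, 0.0 for a loss."""
        k = self.k_factor()
        self.rating += k * (score - expected_score(self.rating, opponent))
        self.recent.append(score)


# Compare against a plain fixed-K rating over a 20-game win streak.
boosted, fixed = StreakAwareRating(), 1500.0
for _ in range(20):
    boosted.record(1700.0, 1.0)
    fixed += 16.0 * (1.0 - expected_score(fixed, 1700.0))
print(round(boosted.rating), round(fixed))
```

Once the recent win rate drops back toward 50%, `k_factor()` returns the base value and the rating settles.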

I think similar arguments can be raised for this application. I think most programmers are mainly interested in quickly knowing how good their algo is.

876584635678890 · October 29, 2018, 4:02pm

I am aware of this, but there is a dramatic difference you are not taking into account.
In chess, you will never (and you should know this is no exaggeration) lose a game against an opponent who is a few hundred Elo below you (granted you are not playing ultrabullet or something crazy like that).

As occasions like this and matches in the CodeBullet challenge (notably the defeat of Voice_of_Cthaeh and EMP^ERROR_v1.0) have proven, this is not even remotely the case for Terminal.

Your algo could easily beat the top ten, but lose to some strange algos that are ranked fairly low.

In chess the theory that higher Elo beats lower Elo works fairly well, but here it just does not, which opens the possibility of a lucky win streak.
One could easily, out of pure coincidence, face many algos it counters in a row and get a false push because of that. The same applies to losing streaks.
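To put a number on "pure coincidence", the sketch below computes the chance of seeing at least one win streak of a given length in a fixed number of games. It assumes every game is won independently with the same probability, which is a simplification; the function name and the example figures are illustrative only.

```python
def prob_streak(games: int, streak: int, p_win: float) -> float:
    """Probability of at least one run of `streak` wins in `games` games.

    Assumes independent games with a constant win probability p_win.
    """
    # dist[j] = probability that the current run of wins has length j
    # and no run of length `streak` has happened so far.
    dist = [1.0] + [0.0] * (streak - 1)
    for _ in range(games):
        nxt = [0.0] * streak
        for run, prob in enumerate(dist):
            nxt[0] += prob * (1.0 - p_win)      # a loss resets the run
            if run + 1 < streak:
                nxt[run + 1] += prob * p_win    # a win extends the run
            # otherwise the streak is completed and leaves `nxt`
        dist = nxt
    return 1.0 - sum(dist)


# Chance that a correctly rated algo (50% win rate) still shows
# a 6-game win streak somewhere in its first 50 matches.
print(f"{prob_streak(50, 6, 0.5):.3f}")
```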

Win and loss streaks are mainly used because of human feelings and not because they are proven to be very accurate.

RegularRyan · October 29, 2018, 4:12pm

The system being discussed is more similar to a Glicko system, which takes uncertainty into account to let ratings change faster or slower, and which is very popular in online games, far more popular than a traditional Elo system. It’s something we have considered switching to eventually, but it is more algorithmically complex and would involve adjusting database models and possibly a leaderboard reset, so it doesn’t seem too appealing at the moment.

We believe that the matchmaking changes currently in the works will create a tremendous improvement in how quickly you climb, and we will implement deeper solutions if the problem persists.

kkroep · October 29, 2018, 5:09pm

I think we are talking past each other. I agree with your post, but I don’t think it addresses the same thing.

There are two different concepts here. What you are talking about is that a (much) higher ELO doesn’t guarantee a win, and I totally agree. I’ve witnessed some painful losses so far, especially in the CodeBullet challenge. This idea would affect the ELO gap that you allow for in matchmaking, so that these match-ups can still occur. Actually this is the same in chess. Especially with different opening knowledge, a good player can be at an early disadvantage against a less skilled player.

What I am talking about is having a new algorithm’s ELO affected more strongly than that of established algorithms. C1Ryan formulated it much better than I did.

The idea is that the ELO rating of a freshly posted algorithm is like a fleeting cloud: it means nothing. One can use this knowledge to make the ELO converge more quickly, decreasing the convergence time, until accumulated matches give some certainty about its rating. Likewise, a competing algorithm is punished or rewarded less for losing to or beating an algorithm with an uncertain ELO.
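Both effects fall out of a Glicko-style update. Below is a minimal sketch of the published Glicko-1 formulas applied to a single game, leaving out the step that grows uncertainty over time. The starting deviations of 350 (new) and 50 (established) are illustrative, and nothing here reflects how Terminal computes ratings.

```python
import math

Q = math.log(10) / 400.0  # Glicko-1 scaling constant


def g(rd: float) -> float:
    """Damping factor: the less certain the opponent, the smaller g."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q ** 2 * rd ** 2 / math.pi ** 2)


def glicko_update(r, rd, opp_r, opp_rd, score):
    """One-game Glicko-1 update; returns (new rating, new deviation).

    r, rd          rating and rating deviation of the algo being updated
    opp_r, opp_rd  rating and rating deviation of its opponent
    score          1.0 for a win, 0.0 for a loss
    """
    g_opp = g(opp_rd)
    expected = 1.0 / (1.0 + 10 ** (-g_opp * (r - opp_r) / 400.0))
    d2_inv = Q ** 2 * g_opp ** 2 * expected * (1.0 - expected)
    denom = 1.0 / rd ** 2 + d2_inv
    new_r = r + (Q / denom) * g_opp * (score - expected)
    new_rd = math.sqrt(1.0 / denom)
    return new_r, new_rd


# A fresh algo (RD 350) and an established one (RD 50) each beat
# the same established 1500-rated opponent.
fresh, _ = glicko_update(1500, 350, 1500, 50, 1.0)
settled, _ = glicko_update(1500, 50, 1500, 50, 1.0)
# An established algo beats a fresh, uncertain opponent instead.
vs_fresh, _ = glicko_update(1500, 50, 1500, 350, 1.0)
print(round(fresh), round(settled), round(vs_fresh))
```

The fresh algo moves by far the most, and the established algo gains less from beating an uncertain opponent than from beating a settled one.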

Now I do agree that reacting to a win streak is less desirable for an established algorithm. That is because the performance of the algorithm won’t change. If we stick to the chess analogy, a player can suddenly become a lot better, because they took some lessons or something, causing them to win a lot. Some systems can catch this change in strength and adjust the ELO more flexibly. This ain’t gonna happen with these algos of course.

Regardless of that discussion, I think that C1Ryan has a point with this one. I’m looking forward to the changes.