Player ratings, near Season End

The root of the issue is that if you run out of players near the top, you will constantly play these games where you have nothing to gain.

I think I am going to have the maximum Elo range increase the longer an algo goes without matches, so that game playing will still slow significantly but will not stop completely if you end up on a ‘rating island’. These players won’t be spammed with games where they have nothing to gain, but will still have to fend off challengers occasionally to hold onto their position.
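
To make the idea concrete, here is a minimal sketch of a widening match range. The function names, the linear growth, and the numbers (`base_gap`, `growth_per_hour`, `cap`) are all assumptions for illustration, not the actual matchmaking code or its settings.

```python
def max_rating_gap(hours_idle, base_gap=200.0, growth_per_hour=25.0, cap=1000.0):
    """Allowed rating difference; widens linearly with idle time, up to a cap.

    All defaults are hypothetical values chosen for the example.
    """
    return min(cap, base_gap + growth_per_hour * max(0.0, hours_idle))


def eligible_opponents(my_rating, hours_idle, candidate_ratings):
    """Return the candidate ratings that fall inside the current allowed gap."""
    gap = max_rating_gap(hours_idle)
    return [r for r in candidate_ratings if abs(r - my_rating) <= gap]


leaderboard = [2050, 2100, 2300, 2450]
# Freshly matched algo on a 'rating island': nobody is within 200 points.
print(eligible_opponents(2700, 0, leaderboard))
# After 12 idle hours the gap has grown to 500, so two opponents qualify.
print(eligible_opponents(2700, 12, leaderboard))
```

Because the gap shrinks back once a match is played (idle time resets), an isolated algo would get occasional matches rather than a constant stream of them.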

That sounds about right.
Maybe also reduce the rating won or lost proportionally when making these special matches.
The goal is to get SOME movement, but not to crash them into the ground.

I’m also putting this in; these changes will be out sometime next week.

I would be for allowing rematches if an algo has exhausted all its possible opponents. The way it works now, when uploading a new algo you want to play the high-ranking algos ASAP to climb faster (which also takes them lower), and if you lose it won’t hurt as much as if you play them later when ranked higher. Also, some algos are random and have different results, although I agree a rematch is less useful than playing new algos, which is why I would suggest it only after exhausting new opponents.

This was a big problem before @RegularRyan introduced the Glicko rating system. He would need to confirm, but I believe the Glicko system makes the timing of the match matter much less due to a built-in “uncertainty” about a new algo’s true rating.

Even as someone that ran an algo with some amount of random behavior, I’m not seeing the clear benefit to rematches.

OK, let’s reflect on this change: has anyone noticed a positive change?

I seem to be a bit lucky that my strategy works well in Season 6, and we can observe a drastic change on the leaderboard.

I have 6 algos uploaded that are technically identical.
The first two, Demorf_v6-0-1 and Demorf_v6-0-2, have a disproportionate rating: about 200 above my other algos, and about 300 above the other players at this moment.
image

I can see there are enough low-level matches … but they are often easy wins … and I’m not sure if they give any positive rating at all, but surely they don’t bring any balance:
image

So … to reiterate the problem:
By luck and timing, I have an algo that is 200+ rating above the others.
And the current matchmaking mechanics push it even further apart.
I cannot achieve this rating with a new upload of the same algo.
But if I just keep it … I get an unfair positioning advantage.

I think it’s great that there are low-level matches when no higher matches are possible anymore. And I don’t see any issue with the fact that not all of your algos have the same rating. It is certainly related to the fact that your older algos have played more matches and therefore have a higher rating.
For comparing my own algos I prefer your tool Cross Check over the rating anyway.
But if you want your Elo to drop again, I could upload an Anti-Demorf-Algo :slight_smile:

Looking at this after a few days, these extra matches may be making the situation worse:
They give some minimal rating, maybe less than five points … but it is above zero, and it adds up over time.
So in fact the “fix” may be making the situation worse.

@Felix, let’s make a real test for this:
If you manage to drop my rating to 2500 with fewer than 20 uploads,
I will happily remove my algos and re-upload them, clearing the board a bit.

If you don’t manage to do it … this will confirm the problem.

Challenge accepted!
But I might not have time for this before Thursday this week.

I don’t see any issue with a large gap in ratings so long as a high-rated algo continues to find opponents. To my understanding, the only reason Demorf’s algo isn’t dropping is that it isn’t losing. If it starts losing it should drop, whereas in the old system it would stop getting matches and be unable to drop.

It is not so black and white. Demorf_v6-1-2 is losing … just 5% of the time.
But most of the “extra matches” are vs. really random players … some of them really low, so even when losing to them the penalty is probably reduced.

The end result is that my algo keeps climbing in rating, and this is even worse than before, when you just kept your high rating.

I would love to see someone’s rating on the leaderboard be reproducible by uploading the same algo.

I know this is probably not a simple thing to balance, but we have a good setup now for testing it.

That 234 rating jump from beating a 2057 rated algo at the end seems odd.

The change that was made doesn’t address the problem of rapidly iterated algos pushing the algos that beat them up and the ones that lose down. This will still occur, and perhaps more so, as the algos on either end are no longer “out of reach” of this phenomenon. The change addressed the lack of available matches, not large gaps in the ratings. Yes, you will continue to rise until a challenger appears. I think if Felix does upload an algo to beat you, your rating will fall quickly. That last match where you jumped 234 rating does seem like a bug though.

So it turns out that finally updating my season 5 algo’s sim to use season 6 Destructor stats gave me an algo that’s beaten the 3 Demorfs it’s played so far. Right now just a single one is running around in the wild.

Are you wondering what happens if you rack up several losses in a row?
(Are you asking for someone to spam you with losses to see how many it takes to drop you below 2500?)

Agreed that facing distant players is not the root issue. With each match played, Glicko moves you closer to some hypothetical ‘True’ rating, and gets closer the more data it has access to. With more matches played, you get a few more points because you earn and deserve them: proving yourself against weaker algos still shows the system that your ‘True’ rating is higher than your current one.

The definition of the ‘True’ rating is the rating a player would converge on if all players played all other players an infinite number of times.
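
As a rough illustration of why wins over much weaker algos still add a few points, here is a simplified Elo-style sketch. It is not the site’s Glicko implementation: Glicko uses a similar logistic expected score but also scales updates by rating uncertainty, and the `k=32` factor below is an assumed value for the example.

```python
def expected_score(rating, opponent_rating):
    """Logistic expected score of `rating` against `opponent_rating` (Elo form)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def gain_for_win(rating, opponent_rating, k=32.0):
    """Points gained for a win; k is a hypothetical update factor."""
    return k * (1.0 - expected_score(rating, opponent_rating))


# The gain shrinks as the opponent gets weaker, but never reaches zero,
# because the expected score is always strictly below 1.
for gap in (0, 200, 500):
    print(gap, round(gain_for_win(2700, 2700 - gap), 2))
```

Since the expected score never reaches exactly 1, every win counts as slightly better than predicted, which is why an algo that almost never loses keeps creeping upward.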

This is definitely a problem that still exists and is the main reason we give the leaderboard a few days to stabilize before running a season finale. Hopefully having it be ‘off’ during the season is tolerable. I’ll add a low-priority ticket to investigate improvements to this system, but it probably won’t be addressed for a while.

As always, everyone should feel free to voice their thoughts on the matter and we can adjust priorities if needed.

Pretty late today, but I’m going to investigate this later this week.

https://bcverdict.github.io/?id=121114
I think this bug might also be in the tool from @bcverdict, since all the rating points from the chart are calculated by him (since this information is not available in the match history) and this tool was not updated for quite some time.

You’re right, the large jump is now shown on the most recent match again (which is a different match than last time). It could still possibly be an indication that the rating is higher than it should be though.

image

@Demorf so far I have won 5 times against your top algo, and it’s now about 100 points lower than before. I don’t think more uploads are necessary to prove the point.

Good job. I removed my older algos.
Now let’s see if YOU get to escape velocity and separate from the rest …

I noticed something else … just before you pinned me down. Some of the other top algos also had wins against me, but their rating was 300 less than mine … so the penalty (and probably their reward) seems to be greatly reduced, less than 30 points. So this effect helps accelerate the gap.

A larger rating difference leads to a higher change in rating after a match is resolved. A more likely explanation is that as your algo plays more matches, the algorithm becomes more and more certain that your rating is ‘correct’ or close to correct, and reduces the amount that it fluctuates.

This should be clear through example. If an algo has played 1 game where it beat a 2100 algo, it’s probably pretty good, so it gets a lot of points. If an algo has played 10,000 games and is still at 1500 rating, a single win against a 2100 algo is probably an outlier, and the algorithm is more stingy about giving out points.
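
The same example can be sketched with the published Glicko-1 single-game update. This assumes the Glicko-1 formulas; the thread does not say which Glicko variant or parameters the site actually uses, and the rating deviations below (350 for a brand-new algo, 30 for a long-running one, 50 for the opponent) are assumed values.

```python
import math

Q = math.log(10) / 400.0  # Glicko-1 scaling constant


def g(rd):
    """Glicko-1 attenuation factor for an opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko1_update(rating, rd, opp_rating, opp_rd, score):
    """Return (new_rating, new_rd) after one game.

    score is 1 for a win, 0.5 for a draw, 0 for a loss.
    The between-period growth of rd is deliberately left out of this sketch.
    """
    g_opp = g(opp_rd)
    expected = 1.0 / (1.0 + 10 ** (-g_opp * (rating - opp_rating) / 400.0))
    d_sq_inv = Q**2 * g_opp**2 * expected * (1.0 - expected)
    denom = 1.0 / rd**2 + d_sq_inv
    new_rating = rating + (Q / denom) * g_opp * (score - expected)
    new_rd = math.sqrt(1.0 / denom)
    return new_rating, new_rd


# Both algos sit at 1500 and beat the same 2100-rated opponent.
fresh, fresh_rd = glicko1_update(1500, 350, 2100, 50, 1)     # almost no history
veteran, veteran_rd = glicko1_update(1500, 30, 2100, 50, 1)  # long history
print(round(fresh - 1500), round(veteran - 1500))
```

Under these assumed deviations the fresh algo gains hundreds of points from the win, while the veteran gains only a handful, which matches the behaviour described above.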
