Player ratings, near Season End

#1

I know this was discussed before, and it probably does not have an easy solution, but I wanted to mention
a few specific problems; hopefully we can find a workaround for them:

Bigger gaps when players rapidly submit new algos
At some point we had gaps of about 150 points between the top 3 players (2700 | 2550 | 2400).
This in turn has several side effects:

  • it gets hard to get a match vs the top players even if you have a better algo
  • and if you are between 2 gaps … you don’t get ANY matches
  • I personally had a situation where my newly submitted algo stayed at 15:0 matches for several hours.

Then when things start to normalize and you get more mixed matches, you may keep an algo with a ridiculously high rating that rarely gets a new match, as it is 100-200 points above the rest,
but if you resubmit it … it ends up in the middle tier.

Again, this is not a big problem on normal days, as things normalize after a few days,
but near the Season end … it can be abused. Tricking the rating system is not the goal of this game IMO, but it is currently a thing.


#2

Another factor is algos being added and deleted during rapid prototyping. I had 13 algos submitted on 5/10 alone (and only 4 stuck around past that day). They would rise quickly to the 2.1k - 2.2k range and then get pulled after I saw their behavior in a few key match-ups.

That means some algos had the chance to get up to nine wins against ~2.1k rated algos that others would never play. This may help explain how a resubmitted algo could get stuck in the middle tier when there are “less” ~2.1k wins available for it.


#3

Yep, good example. And the reverse is also true: if an algo reached 2400 3 days ago …
it will never have to face the blood bath at 2100 with the new challengers there.

I tend to keep my highest rated algo as a benchmark even if it is a few days old.
But it is highly misleading for me and especially for the Leaderboard.
Often if I resubmit it, it gets smashed.

I think something like a decay mechanic could solve some of this issue:
If the algo did not find a match in the past hour, remove 10 points or 1% from its score.
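
To make the proposal concrete, here is a rough Python sketch of that decay rule. The function name and parameters are made up for illustration, and applying one decay step per full idle hour (so the penalty stacks) is an assumption, since the rule above does not say whether it repeats.

```python
def decay_rating(rating, hours_idle, flat=10.0, pct=0.01, use_percent=False):
    """Apply one decay step for each full hour without a match.

    Hypothetical helper: either subtract a flat 10 points per idle hour,
    or subtract 1% of the current rating per idle hour.
    """
    for _ in range(int(hours_idle)):
        if use_percent:
            rating -= rating * pct
        else:
            rating -= flat
    return rating


# A 2700 algo that sits idle for 3 hours under the flat rule:
print(decay_rating(2700.0, 3))
# The same algo after 1 idle hour under the 1% rule:
print(decay_rating(2700.0, 1, use_percent=True))
```

With the flat rule, three idle hours cost 30 points. The percentage rule costs 27 points in the first hour at 2700, so it hits high-rated algos harder than low-rated ones.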


#4

That’s a very clever idea. I’m not sure how I feel about it though. I know C1 has said in the past they want to incentivize algos being around for a while, rather than creating a system that encourages procrastination until near the end of the season (part of why they have the auto-competitions now). And being penalized because their matchmaking system couldn’t find a match for my algo to play sounds a bit discouraging.

I felt like this season had pretty good fluidity regarding the top algo spots. And I think @acshikh’s au2 spent the whole season out in the wild and maintained a solid rating (but they’d have to verify if it was ever taken down and re-uploaded).

There are definitely times when algos can rise quite high if their strat works exceptionally well versus the current meta at the time. @amit21min’s ‘Good Night Reliable’ is an example of this. They held the top spot for a while, and ended up at #12 at season’s end.


#5

Another option would be to broaden the range of Elo that matchmaking considers. If an algo is at the top it can persist up there quite a while because it is not getting matches with the sub 2k algos that happen to counter it.

However, I’m not sure that any of these tweaks will necessarily do all that much better than the current system, because there is the underlying problem that Terminal has a sort of rock-paper-scissors quality to it. This means that an algo’s strength can’t really be described by just one number such as its Elo rating.


#6

Overall I’m pretty happy with how rating has been working since we switched over to Glicko in season two, but I agree that algos being stranded at the top of the leaderboard could cause issues.

What are the thoughts on some of these ideas:

  • Limit algo uploads to one or two per day, stacking up to something like 8 if you have not used your uploads. This will reduce shocks to the system when top players upload a lot all at once. It also forces a more gradual iteration process throughout a season, rather than trying to rapidly iterate on the last day before anyone has a chance to see your work. The obvious downside of this idea is how cumbersome the constraint might feel.
  • Increase the maximum matchmaking rating difference between opponents. The concern here is that a high ranking player loses a lot of rating if they lose a match against someone very far away from them in rating, and gains very little if they win. So it can feel bad if players are matched like this often. Potentially, I could set a cap on rating losses such that you can’t lose more rating than you would playing against someone 400 points away from you…
  • Allow algos to rematch if they haven’t played in over a certain time period, maybe 2 days. This will at least keep rating more liquid at high ratings, but I’m not a fan of this idea since repeated matches aren’t very useful for iteration, and will almost always have the same result.
  • Perform a blanket ‘deflation’ pulling everyone back towards 1500 once per week. Say, reducing everyone’s rating by 0.2 * (your_rating - 1500). This will get stranded algos more matches and reduce the relative badness of playing against much lower rated algos. It also pulls outliers back towards where they maybe are actually supposed to be. But I’m not 100% sure how nicely this math plays with the base Glicko formula. It’s also pretty confusing for anyone just looking at the leaderboard every day.
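
As a quick illustration of the deflation idea in the last bullet, here is a minimal Python sketch applied to the 2700 | 2550 | 2400 example from the first post. The function name is made up, and it only touches the rating number; how the Glicko rating deviation should be handled is left open, as noted above.

```python
BASELINE = 1500.0  # rating everyone is pulled towards


def deflate(rating, factor=0.2, baseline=BASELINE):
    """Weekly deflation: reduce the rating by factor * (rating - baseline).

    Ratings below the baseline are pulled up by the same rule.
    """
    return rating - factor * (rating - baseline)


top_three = [2700.0, 2550.0, 2400.0]
print([round(deflate(r)) for r in top_three])
```

The three ratings land at about 2460, 2340 and 2220, so each 150 point gap shrinks to 120. The ordering never changes, only the spread.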

#7

Personally, I have loved having an algo in the top reaches of the ratings for some period! It is fun and rewarding to achieve that, and it would feel less rewarding to have that rating drained away by some sort of deflation.

Also, while it is certainly sad to lose a lot of rating to a low ranked player, it also feels just to me! Hey, if a 1950 ranked player beats a 2600 ranked one, I think that is real cool for the lower ranked player and well, clearly it shows that my fancy code has a hole in its armor!
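
To put rough numbers on that 1950 vs 2600 case, here is a sketch using the plain Elo expected-score formula. This is only an approximation: the site uses Glicko, the K-factor of 32 is an assumption, and the `loss_cap_gap` parameter is a hypothetical way to model the "400 points away" loss cap floated in #6.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def rating_change(r_a, r_b, score, k=32.0, loss_cap_gap=None):
    """Elo rating change for A (score: 1.0 = win, 0.0 = loss).

    If loss_cap_gap is set, a loss is computed as if the opponent
    were at most that many points below A (assumed cap semantics).
    """
    if loss_cap_gap is not None and score == 0.0:
        r_b = max(r_b, r_a - loss_cap_gap)
    return k * (score - expected_score(r_a, r_b))


print(expected_score(2600, 1950))                         # favorite's expected score
print(rating_change(2600, 1950, 1.0))                     # gain for the expected win
print(rating_change(2600, 1950, 0.0))                     # loss for the upset
print(rating_change(2600, 1950, 0.0, loss_cap_gap=400))   # loss with the cap
```

Under these assumptions the 2600 algo is expected to score about 0.98, so a win is worth less than a point while an upset costs roughly 31. With the cap, the loss drops to roughly 29, which shows that a 400 point cap only softens the blow slightly in plain Elo.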

So my vote here is for increasing the maximum matchmaking rating difference. Maybe we just have that maximum difference increase arbitrarily whenever an algo can no longer find anyone to play within the current difference, guaranteeing, say, at least one more match every hour or so, regardless of how far away that other algo is in ranking.


#8

The root of the issue is that if you run out of players near the top, you will constantly play these games where you have nothing to gain.

I think I am going to have the maximum Elo range increase the longer an algo goes with no matches, so that game playing will still slow significantly but will not stop completely if you end up on a ‘rating island’. These players won’t be spammed with games where they have nothing to gain, but will still have to fend off challengers occasionally to hold onto their position.
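
A minimal sketch of how a widening range like this could behave, in Python. The base range of 100, the linear growth of 25 points per idle hour, and the function names are all invented for illustration; the actual matchmaking values are not stated in this thread.

```python
def max_rating_gap(hours_idle, base_gap=100.0, growth_per_hour=25.0):
    """Allowed rating difference, growing linearly with idle time (assumed shape)."""
    return base_gap + growth_per_hour * max(0.0, hours_idle)


def eligible_opponents(rating, hours_idle, pool):
    """Ratings in the pool that fall inside the current allowed gap."""
    gap = max_rating_gap(hours_idle)
    return [r for r in pool if abs(r - rating) <= gap]


# A 2700 algo stranded on a 'rating island' above everyone else:
pool = [2550.0, 2400.0, 2100.0]
for hours in (0, 2, 8):
    print(hours, eligible_opponents(2700.0, hours, pool))
```

With these made-up numbers the 2700 algo has no opponents at first, picks up the 2550 algo after 2 idle hours, and reaches the 2400 algo after 8. A real matchmaker would also need to skip opponents that have already been played.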


#9

That sounds about right.
Maybe also reduce the rating won/lost proportionally when making these special matches.
The goal is to get SOME movement, but not to crash them into the ground.
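
One possible reading of "proportionally", sketched in Python: shrink the rating change by the ratio of the normal range to the actual gap. The function name, the normal range of 100, and this particular scaling rule are all assumptions for illustration.

```python
def scaled_change(raw_change, gap, normal_gap=100.0):
    """Scale a rating change down when the match was outside the normal range.

    Matches within normal_gap are untouched; wider matches are scaled
    by normal_gap / gap, so the further apart, the smaller the swing.
    """
    if gap <= normal_gap:
        return raw_change
    return raw_change * (normal_gap / gap)


print(scaled_change(-30.0, 50.0))    # normal match, unchanged
print(scaled_change(-30.0, 300.0))   # special match at 3x the normal range
```

Here a 30 point loss in a match 300 points apart becomes a loss of about 10, so there is still movement without a crash.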


#10

Also putting this in: these changes will be out sometime next week.


#11

I would be for allowing rematches if an algo has exhausted all its possible opponents. The way it works now when uploading a new algo, you want to play the high ranking algos asap to climb faster (which also takes them lower), or if you lose it won’t hurt as much as if you play them later when ranked higher. Also some algos are random and have different results, although I agree it is less useful than playing new algos which is why I would suggest it only after exhausting new opponents.


#12

This was a big problem before @C1Ryan introduced the Glicko rating system. He would need to confirm, but I believe the Glicko system makes the timing of the match matter much less due to a built-in “uncertainty” about a new algo’s true rating.

Even as someone that ran an algo with some amount of random behavior, I’m not seeing the clear benefit to rematches.