Season 3 Finale Postponed

RegularRyan · September 9, 2019, 6:12pm

Hey everyone,

Due to an error last week with server stability, many games crashed out during the leaderboard stabilization period. There were also concerns that not enough time was given for the leaderboard to stabilize properly. Because of this, we are going to allow Season 3 games to continue playing for the next week. We will have games run until Monday morning next week, and use the leaderboard as it is after games are stopped at that point. We will be clearing the match histories for active algos so that they can play against each other again.

The stream will be Tuesday Sept 17th at 6pm EST

Also, I was on vacation last week, so my apologies about the lack of communication.

Thanks for your patience,
Ryan

IWannaWin · September 9, 2019, 6:38pm

Will we be able to see the match history?

KauffK · September 9, 2019, 6:45pm

Will this continue to use the locked in algos that were developed during server instability? And will the scores be reset or just new matches building on the current scores?
Thanks

RegularRyan · September 9, 2019, 6:52pm

Unfortunately since season 4 has started I don’t think there is an elegant way for me to set this up on the site…
If you have the algo’s ID, you can use https://terminal.c1games.com/api/game/algo/<algo_id>/matches. (Currently this filters by current season, but I’m going to remove this restriction later today so this solution works.)
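For anyone who would rather script the lookup, here is a minimal Python sketch built around the URL above. It assumes the endpoint answers an unauthenticated GET with a JSON body; the thread does not describe the response format, and the helper names `matches_url` and `fetch_matches` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# URL pattern quoted in the post above; <algo_id> is filled in per algo.
BASE_URL = "https://terminal.c1games.com/api/game/algo/{algo_id}/matches"


def matches_url(algo_id):
    """Build the match-history URL for one algo ID (hypothetical helper)."""
    return BASE_URL.format(algo_id=urllib.parse.quote(str(algo_id), safe=""))


def fetch_matches(algo_id, timeout=10):
    """Download and decode the match history.

    Assumption: the endpoint returns JSON and needs no login.
    """
    with urllib.request.urlopen(matches_url(algo_id), timeout=timeout) as resp:
        return json.load(resp)


# Example (12345 is a placeholder, not a real algo ID):
# history = fetch_matches(12345)
```

The network call is left commented out so the snippet can be imported without hitting the server.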

All algos that were active when the S3 transition started are eligible

We think the current scores are fairly close to the final scores and think they make for a better starting point than resetting them to 1500.

KauffK · September 9, 2019, 7:08pm

Ok, so whatever algos (up to the usual 6) that we had uploaded when matches stopped are still around somewhere and will simply resume playing with the scores they left off at?

RegularRyan · September 9, 2019, 7:11pm

Yes, that sounds accurate. They will keep the scores they had but will be able to match each other again

arnby · September 9, 2019, 7:39pm

Great to hear that you are back! Hope you had a great time on vacation.
Just one question on my mind: has the server issue been dealt with or at least identified? Or is it likely to happen again? (especially as season 3 and 4 are running at the same time)

If anyone needs their IDs, I saved some of them, so I can look them up for you.

RegularRyan · September 9, 2019, 8:00pm

Junaid has been looking into it. We have seen it return after leaving the servers running for close to a week, implying some form of resource leak. We also have a way to bring up a list of games that have failed due to this error.

We will be running maintenance before starting games back up, which should keep the problem at bay for a few days, and then running maintenance again on Friday. We can also pull up a list of games that failed due to this issue now, so if there are a small number we can make some minor re-adjustments by hand. In the worst case, we may postpone the finale again, though this seems unlikely for now.

IWannaWin · September 9, 2019, 8:09pm

Do you have my ids by any chance?

RegularRyan · September 9, 2019, 8:15pm

Messaged it to you. If anyone else wants theirs, I’ll message them as well.

KauffK · September 9, 2019, 9:20pm

I would like my IDs as well, thanks

Although at this point maybe it’s less stressful if I don’t look since my programs will be running in different conditions than they were tuned in and we can’t make changes

At the very least, this will be the least-predictable season finale of all time

Blonded · September 10, 2019, 1:49am

I’d love it if I could get my algos!

Also, glad you’re back. Hope your vacation was stress free and fun!

C1Junaid · September 11, 2019, 3:37pm

Hey guys, update here:

We finally figured out what the issue was with the turn 0 crashes, after a lot of detective work and trying a lot of stuff. It was related to settings on servers for Google that caused unfortunate allocation of workers. This should probably also help with the extreme compute inconsistencies that KauffK and others were having; there will still be some variance, but it should be better.

For Season 4 we plan to roll out a fix that should make compute and memory very stable, but it will affect total resources, so many algos that are calibrated to season 3 workers will have less compute than expected. We are holding off on releasing it until after the season 3 finale.

The issue with s3 and s4 algos matching each other is fixed as well.

We are going to reset match history for season 3 algos again so they can rematch each other. Third try is the charm.

IWannaWin · September 11, 2019, 4:05pm

So when is season 3 ending?

KauffK · September 11, 2019, 9:44pm

Is there any concern that the multiple consecutive resetting of match histories will boost or damage any rank scores disproportionately, potentially beyond what could be recovered in a ‘single round’ of settling?

For example, if player A beats player B in a deterministic fashion, and during each of the match history restarts player A immediately matched player B several times, as they were already close in score, and the scores always persist, player A would potentially end up with a maximum of 12 (but realistically probably around 3 to 6) additional wins from the restarts, which would then sit on top of one full playthrough of their peers, with the usual distribution of wins and losses. This could leave player A with a sort of score bias that, given the leaderboard has historically ended with only one or two wins worth separating most of the ranks, could potentially boost player A’s position in the “settled” leaderboard by two or more ranks compared to a single full round of settling matches.
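To put rough numbers on this scenario, here is a minimal sketch. Every figure in it is hypothetical: the names, the ratings, and the flat points-per-win gain are made up for illustration. A flat gain is also a simplification, since Glicko-2 gains depend on both players' ratings and deviations.

```python
def rank_after_bonus(ratings, player, bonus_wins, points_per_win):
    """Return (old_rank, new_rank), 1-based, after crediting extra wins.

    Simplification: each bonus win adds a flat `points_per_win`.
    """
    def rank_of(table):
        ordered = sorted(table, key=lambda name: table[name], reverse=True)
        return ordered.index(player) + 1

    boosted = dict(ratings)  # copy so the original table is untouched
    boosted[player] += bonus_wins * points_per_win
    return rank_of(ratings), rank_of(boosted)


# Hypothetical leaderboard with only a few points between neighbours.
ratings = {"A": 2000.0, "B": 1996.0, "C": 2008.0, "D": 2015.0, "E": 1990.0}

# Three extra wins against B at an assumed 6 points each.
print(rank_after_bonus(ratings, "A", bonus_wins=3, points_per_win=6.0))
# prints (3, 1)
```

With gaps this small, a handful of repeated wins is enough to move player A past two neighbours, which is the bias being asked about.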

Thoughts?

IWannaWin · September 12, 2019, 5:26am

I was thinking exactly the same. A solution could have been to reset the leaderboard to before the first postponement and then reset the matches.

KauffK · September 12, 2019, 4:04pm

Also, hold on a second, one of my programs has 9 copies of Oracles in its match history, seemingly since the last reset. Some of the oldest ones are from season 4. Would these just be hanging around from before the fix, or should they have been cleared?

n-sanders · September 12, 2019, 6:40pm

I’ve really regretted not taking my own snap-shots of the leaderboard at various stages, but using @IWannaWin’s list from the other thread, there has currently been this movement in the leaderboard:

paprikadobi (0)
n-sanders (+2)
Arnby (0)
kauffk (-2)
tambourineman42 (0)
Team13 (0)
acshi h (+2)
luckyR (+2)
quincy3 (-2)
Michigan_difference (+something)

Iwannawin (-3)

This snapshot was prior to the first time they stopped matches, and there were some adjustments up to that period. I believe Michigan_difference had claimed #10 by the end of the day that day. Unfortunately I don’t recall where Iwannawin was at that time.

Speaking for a minute about my own algos, I’m sure the multiple matchmaking resets have helped me, but I don’t think my case has been due to @KauffK’s scenario. I mentioned at another point that there is one glaring weakness that can crush my algo, but it’s not necessarily an effective strategy in general, so the algos that employ it have ratings in the 1800s. Given “enough” matchmaking time, I’ll end up paired with some of those algos and drop 20+ rating points from it. I’d guess that in the two resets they’ve done so far, I wasn’t yet matched against those.

I think @KauffK’s scenario depends quite a bit on duplicate algos, which I don’t think the top 10 is full of. I have 4 different algos between my 6 slots and their behavior is different enough they don’t all win or lose the same matchups against others in the top 10.

Given the prize pool distribution plan, final rank yields a fairly minor benefit ($20 difference per position) when compared to the $100 per win in the group stage. In the end I think the primary goal is to have the “right” top 10. I can’t imagine the roller coaster folks like @IWannaWin have been on where they are right on the edge of the cutoff.

IWannaWin · September 12, 2019, 7:31pm

RegularRyan · September 13, 2019, 4:33am

Short answer: No

We deliberately chose not to reset the ratings to 1500, and think this will give better results. The glicko2 system is extremely good at rating players, and problems are introduced only when we deviate from the ‘standard’ system glicko is expected to function on.

The main concern we have in general with placing people is the fact that algos cannot rematch each other. With this consideration in mind, we can predict the outcomes of resetting/not resetting the leaderboard:

Not Resetting algo rating:
Algos should start reasonably close to their ‘true’ rating. They will match against other algos, oscillating around their ‘true’ rating and get closer, on average, after each match.

Resetting to 1500:
Algos start at 1500. They will move towards their ‘true’ rating, then oscillate around it, getting closer, on average, after each match.

Where does the problem come in? What I am concerned might happen is the top algos will take an early lead and move up to around 1700, for example. Then they will match against each other more often than not as they continue to climb up on average. By the time they are all close to their proper placement, many of them will no longer be eligible to face each other. For a majority of algos, this is not a problem, but some algos will ‘waste’ their best matchups or get through their worst matchups while climbing. With 100s of algos, the luckiest of them would be likely to be bumped up a few places.

I haven’t crunched the numbers, but the alternative seems much better in my opinion: Starting value matters very little in glicko compared to the potential to enter into a ‘favorable pool’ at the top after getting lucky (bad) matchups earlier on.

To address KauffK’s point more specifically: As long as they play a solid sample of algos from the pool of top algos, the ratings should sort everyone pretty well within that group.

Monday next week at an unspecified time (We need to do a deploy to undo the hack we did that allowed S3 algos to match for a bit) games are going to stop being played and we will run the competition. The stream will be Tuesday, September 17th at 6pm

PS
I’m sure there will be a few people interested in looking into glicko2; this is the best resource to start with.
http://www.glicko.net/glicko/glicko2.pdf
If you are interested in rating systems in general, Microsoft’s TrueSkill was interesting to read about as well.
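
For readers who want to experiment with the rating math, below is a minimal Python sketch of a single Glicko-2 rating-period update, following the steps in the paper linked above. It is not the site's implementation: how the servers batch games into rating periods, and which system constant they use, is not stated in this thread, so `tau=0.5` and the function name `glicko2_update` are assumptions.

```python
import math

SCALE = 173.7178  # converts between the Glicko scale and the Glicko-2 scale


def _g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi * phi / math.pi ** 2)


def glicko2_update(rating, rd, vol, results, tau=0.5, eps=1e-6):
    """One rating-period update.

    results: list of (opp_rating, opp_rd, score), score being 1, 0.5 or 0.
    Returns (new_rating, new_rd, new_volatility).
    """
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    if not results:  # no games: only the deviation grows
        return rating, SCALE * math.sqrt(phi * phi + vol * vol), vol

    v_inv = 0.0   # inverse of the estimated variance v
    total = 0.0   # sum of g_j * (score_j - expected_j)
    for opp_rating, opp_rd, score in results:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        g_j = _g(phi_j)
        e_j = 1.0 / (1.0 + math.exp(-g_j * (mu - mu_j)))
        v_inv += g_j * g_j * e_j * (1.0 - e_j)
        total += g_j * (score - e_j)
    v = 1.0 / v_inv
    delta = v * total

    # Solve for the new volatility with the paper's iterative procedure.
    a = math.log(vol * vol)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta * delta - phi * phi - v - ex)
        den = 2.0 * (phi * phi + v + ex) ** 2
        return num / den - (x - a) / (tau * tau)

    lo = a
    if delta * delta > phi * phi + v:
        hi = math.log(delta * delta - phi * phi - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        hi = a - k * tau
    f_lo, f_hi = f(lo), f(hi)
    while abs(hi - lo) > eps:
        mid = lo + (lo - hi) * f_lo / (f_hi - f_lo)
        f_mid = f(mid)
        if f_mid * f_hi <= 0:
            lo, f_lo = hi, f_hi
        else:
            f_lo /= 2.0
        hi, f_hi = mid, f_mid
    new_vol = math.exp(lo / 2.0)

    phi_star = math.sqrt(phi * phi + new_vol * new_vol)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * total
    return SCALE * new_mu + 1500.0, SCALE * new_phi, new_vol


# Worked example from the paper: a 1500 player (RD 200, volatility 0.06)
# beats a 1400 player and loses to a 1550 and a 1700 player.
games = [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
print(glicko2_update(1500, 200, 0.06, games))
```

The paper reports roughly 1464.06, 151.52 and 0.05999 for this example, which makes a handy sanity check when changing the code.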