Hi all, good luck with the coming competition!
I am the author of Forceo and will be keeping my fingers crossed as well!
I agree with @Janis: the leaderboard seems very unsettled. For example, my top algo went from 1920 elo to 2018 and back to 1940 over the past 12 hours (4th rank, 1st rank and 5th rank).
Something else crossed my mind when I saw that my last drop was partly due to losing twice to the same algo: if we aim for a settled leaderboard, won’t we reach a point where a lot of repeat matches are played, as there are fewer and fewer new matchups left?
I think this is an inevitable frustration of this sort of “pick a place to stop” method of scoring. We all have those programs that we inexplicably lose to, often opponents at frustratingly low rank that take a big bite out of our elo. The name of the game now is just who gets matched into the fewest such losses.
Unless we allow things to run until everyone has played literally almost everyone on the leaderboard, there will always be the possibility of running into sudden losses close to the deadline.
I’m really not sure what would constitute a ‘stable’ leaderboard at this point, considering how close the scores are and the fact that a single loss could be -30 elo and a drop of two or more ranks. Given that it is highly impractical to wait until everyone has played enough people on the leaderboard that they don’t really lose anymore, I think that at some point we’re going to have to bite the bullet and just declare it done. What that point should be, I’m not sure.
I agree with you, but at the same time I think it would be better to let more matches be played until the Sunday finals.
Initially it was said to be a week, and my attitude was that 5 days (or 7) should be plenty. If the doubled match count had been introduced from the deadline, I think we would have a better picture. Currently, it only makes the picture less clear, especially for the reason you named: -30 elo per match is a big hit.
I understand this is the pilot finals and I enjoy it very much but to give some feedback for the future, it would be best to:
- reset the leaderboard
- have more days e.g. 7
- match count as it is now
- less ‘punishment’ for top algos losing to low elos.
Overall it has been great to see @C1Ryan taking all the feedback into account so rapidly, e.g., doubling the match count and adjusting the elo ratio.
Ironically, I think this ‘punishment’ aspect was re-introduced just recently to fix an unexpected side-effect that left two algos ‘stranded’ without opponents at the top of a huge elo gap. In response, the elo matching window was widened, which fixed that problem but allowed an old problem to resurface: top programs now have matches where a victory grants +2 and a loss is -30.
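For anyone wondering where a +2 / -30 split comes from, here is a rough sketch using the textbook Elo update. The K-factor of 32 and the two ratings are assumptions picked for illustration; the actual values used on the leaderboard are not stated in this thread.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score (roughly, win probability) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_delta(rating, opponent_rating, score, k=32):
    """Rating change for one game; score is 1.0 for a win, 0.0 for a loss.

    k=32 is an assumed K-factor, not necessarily the one the leaderboard uses.
    """
    return k * (score - expected_score(rating, opponent_rating))


# Hypothetical ratings, 470 points apart.
top, low = 2100, 1630
win_delta = elo_delta(top, low, 1.0)
loss_delta = elo_delta(top, low, 0.0)
print(f"win: {win_delta:+.1f}, loss: {loss_delta:+.1f}")
```

Under those assumptions the top algo gains about 2 points for a win and loses about 30 for a loss, so a wider matching window makes this kind of lopsided match more likely.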
I actually like the matchmaking speed etc. right now, even though I agree that it shuffles the leaderboard a lot. Yesterday I was 10th, then went down to 22nd in a few hours, and now I’m back at 14th.
We will be further increasing matches played for a few hours today, but I believe we are still on track for the 5pm cutoff. Our main concern was that algos uploaded the night before the deadline would be unfairly unable to reach the top 10 in the time provided, even with very high winrates. With top players’ ‘last minute’ algos reaching fairly close to their pre-existing algos, and with last-minute algos like some by @zigzagninja and @RuberCuber reaching the top 10, that concern seems to have been addressed.
Ultimately, from my perspective, our system seems to be working pretty well given the constraints of the Elo system, and the top 20 look fairly close to what we expect them to be. In my view there seem to be two problems causing the issues addressed here:
We will be announcing it officially during the stream, but we will be switching to the Glicko-2 system next season. You can read more about it online, but it should fix nearly all of the issues Elo causes and increase the overall quality of the rating system significantly. It should prevent large jumps in score for single-game losses, ‘tighten in’ algos with consistent performance, and dramatically reduce the time it takes to reach a score close to what your algo should have.
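To give a feel for why a deviation-based system damps single-game swings, here is a small sketch of a one-game update under Glicko-1. Note this is the simpler predecessor, not Glicko-2 itself (Glicko-2 adds a volatility term and a different scale), and the ratings and rating deviations (RD) below are made-up values, not anything from the actual leaderboard.

```python
import math

Q = math.log(10) / 400.0


def g(rd):
    """Glicko-1 attenuation factor for an opponent's rating deviation."""
    return 1.0 / math.sqrt(1.0 + 3.0 * Q**2 * rd**2 / math.pi**2)


def glicko1_delta(r, rd, opp_r, opp_rd, score):
    """Rating change after a single game under Glicko-1.

    score is 1.0 for a win and 0.0 for a loss.
    """
    g_opp = g(opp_rd)
    e = 1.0 / (1.0 + 10 ** (-g_opp * (r - opp_r) / 400.0))
    d2_inv = Q**2 * g_opp**2 * e * (1.0 - e)  # this is 1 / d^2
    return Q / (1.0 / rd**2 + d2_inv) * g_opp * (score - e)


# Hypothetical: a 2100 algo loses to a 1630 algo (opponent RD assumed to be 30).
settled_loss = glicko1_delta(2100, 30, 1630, 30, 0.0)     # well-established rating
unsettled_loss = glicko1_delta(2100, 100, 1630, 30, 0.0)  # rating still uncertain
print(f"settled: {settled_loss:+.1f}, unsettled: {unsettled_loss:+.1f}")
```

In this sketch the same upset costs a well-established algo only a few points, while an algo whose rating is still uncertain moves much further, which matches the "smaller jumps once settled, faster convergence when new" behaviour described above.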
A more fundamental problem is that we have a dozen extremely talented engineers competing for the ‘cutoff’ spots who have invested a significant amount of time into Terminal and have created some great algos, and only a small handful of them will be in the competition. I think hard cutoffs like this where #10 is ‘in’ and #11 is ‘out’ are fundamentally problematic, but smoothing the curve between ‘ins’ and ‘outs’ is a very difficult problem. Many of you know we have slowly but surely been building out our platform to support players connecting with our partners online, similarly to how they might at a live competition, so this might help at some point.
In any case, I consider our team’s willingness to experiment and learn from problems that come up one of our strengths, so we will at least have some discussions about what we can do to improve our large-scale competitions.
That was a bit longer than expected and still doesn’t actually cover everything I’m thinking about, but feel free to ask questions or continue the discussion below. Just to clarify, we are currently expecting to go forward with the 4pm cutoff time.
Ok, interesting, thanks for noticing these problems and trying to fix them at such short notice. I’m a little confused now though: is the cutoff at 5pm or 4pm?
I’m sorry, it’s whatever it says on the ‘global competition details’ page. I’ll check and update anything incorrect.
Edit: It looks like things are consistent. Some people may have mixed up the 5pm stream time and 4pm leaderboard cutoff.
OK 4pm then, thanks
If that’s the elo system I’m thinking of, the one Halite uses, I’m a huge fan. They announced some details about their rating system for Halite III, and I’ve always felt it was a fantastic system for this type of competition. I’m not sure how 1 upload vs 6 uploads changes things, but I’m eager to see this implemented for Terminal.
Huge fan of your algo. It’s one of the best designed ones I’ve seen and seems to give mine the (consistently) hardest time.
Goodness. I’m looking at my algos right now and it seems whoever’s on top has just had the most recent win streak vs 1700 elo algos lol. Two peaked at 2110 doing that it seems. You can see a very visible jump in elo across the board from the time more matches/bigger pool of elos got turned on. I’m curious if I briefly held the top spot in the last 24 hours; I’ve been checking sporadically and have been floating around 2nd-4th but it’s always a different algo when I check.
How is the ranking determined for the algos that ended with the same elo?
Well, it’s over, right?
Well done kauffk for winning. Unfortunately I was only able to make it to 15th, but I think that’s still decent xD
Big shout out to the winners and, most importantly, to those who did not make it even though they did their best!
It was fun being part of the top 10 for some of the time, but I ended up 14th.
As @C1Ryan mentioned, the ‘cut off’ is rather harsh, but looking at other examples, e.g. sports, there are only 3 sets of medals. This is much more democratic.
Wait, my algo just played another game. Isn’t it over?
Oh ok, you’re right.
Greetings. I’m representing 16daystocode (more on that below).
Congrats to all of the other top ten finishers as well as everyone else! Will we be getting any feedback on which bots will be in the finals before the matches occur? One of our algos was tied for 9th and 10th place at the end, but shortly after (roughly 8 minutes later) another one of our algos overtook it. Our highest algo at the time of the 4pm ‘snapshot’ is the algo that we would like to use. We just want to make sure that is correct.
As a side note, we can’t seem to activate a forum account for 16daystocode. The link provided to verify our email address doesn’t work, even when we copy and paste it.