Hi all, good luck with the coming competition!
I am the author of Forceo and will be keeping my fingers crossed as well!
I agree with @Janis: The leaderboard seems very unsettled. For example, my top algo went from 1920 elo to 2018 and back to 1940 over the past 12 hours (4th rank, 1st rank, and 5th rank).
Something else crossed my mind when I saw that my last drop was in part due to losing twice to the same algo: if we aim for a settled leaderboard, won't we reach a point where a lot of repeat matches are played, as there are fewer and fewer new matchups left?
I think this is an inevitable frustration of this sort of "pick a place to stop" method of scoring. We all have those programs that we inexplicably lose to, often opponents at frustratingly low rank that take a big bite out of our elo. The name of the game now is just who gets matched into the fewest such losses.
Unless we allow things to run until everyone has played literally almost everyone on the leaderboard, there will always be the possibility of running into sudden losses close to the deadline.
I'm really not sure what would constitute a "stable" leaderboard at this point, considering how close the scores are and the fact that a single loss could be -30 elo and a drop of two or more ranks. Given that it is highly impractical to wait until everyone has played enough people on the leaderboard that they don't really lose anymore, I think that at some point we're going to have to bite the bullet and just declare it done. What that point should be, I'm not sure.
I agree with you, but at the same time I think it would be better to let more matches be played until the Sunday finals.
Initially it was said to be a week, and my attitude was that 5 days (or 7) should be plenty. If the doubled match count had been introduced right from the deadline, I think we would have a better picture. Currently, it only makes things less clear, especially for the reason you named: -30 elo per match is a big hit.
I understand this is the pilot finals and I enjoy it very much, but to give some feedback for the future, it would be best to:
- reset the leaderboard
- have more days e.g. 7
- match count as it is now
- less "punishment" for top algos losing to low-elo algos.
Overall it has been great to see @RegularRyan taking all the feedback into account so rapidly, e.g., doubling the match count and adjusting the elo ratio.
Ironically, I think this "punishment" aspect was re-introduced just recently to fix an unexpected side-effect of leaving two algos "stranded" without opponents at the top of a huge elo gap. In response, the elo matching window was widened, which fixed that problem but allowed an old problem to resurface: top programs now have matches where a victory grants +2 and a loss is -30.
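To make the +2 / -30 asymmetry concrete, here is a minimal sketch of the standard Elo update. The K-factor of 32, the 400-point scale, and the example ratings are assumptions chosen for illustration; the thread does not state the parameters Terminal actually uses.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expected score (win probability) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_delta(rating: float, opponent_rating: float, score: float, k: float = 32.0) -> float:
    """Rating change for one match; score is 1.0 for a win, 0.0 for a loss.

    k=32 is an assumed K-factor, not a confirmed Terminal setting.
    """
    return k * (score - expected_score(rating, opponent_rating))


# Hypothetical ratings: a top algo matched against one about 470 points lower.
top_rating, low_rating = 2000.0, 1530.0
win_delta = elo_delta(top_rating, low_rating, 1.0)
loss_delta = elo_delta(top_rating, low_rating, 0.0)
print(f"win: {win_delta:+.1f}, loss: {loss_delta:+.1f}")
```

Under these assumed numbers, a gap of roughly 470 points gives about +2 for a win and about -30 for a loss, so a wider matching window directly increases how lopsided a single upset can be.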
I actually like the matchmaking speed etc. right now, even though I agree that it shuffles the leaderboard a lot. Yesterday I was 10th, then went down to 22nd within a few hours, and now I'm back at 14th.
We will be further increasing matches played for a few hours today, but I believe we are still on track for the 5pm cutoff. Our main concern was that algos uploaded the night before the deadline would be unfairly unable to reach the top 10 in the time provided, even with very high winrates. With top players' "last minute" algos reaching fairly close to their pre-existing algos, and with last minute algos like some by @zigzagninja and @RuberCuber reaching the top 10, that concern seems to have been addressed.
Ultimately, from my perspective, our system seems to be working pretty well given the constraints of the Elo system, and the top 20 look fairly close to what we expect them to be. In my view, there seem to be two problems causing the issues raised here:
- Elo System: We will be announcing it officially during the stream, but we will be switching to the Glicko-2 system next season. You can read more about it online, but it should fix nearly all of the issues Elo causes and increase the overall quality of the rating system significantly. This will prevent large jumps in score for single game losses, "tighten in" algos with consistent performance, and dramatically reduce the time it takes to reach a score close to what your algo should have.
- A more fundamental problem is that we have a dozen extremely talented engineers competing for the "cutoff" spots who have invested a significant amount of time into Terminal and have created some great algos, and only a small handful of them will be in the competition. I think hard cutoffs like this, where #10 is "in" and #11 is "out", are fundamentally problematic, but smoothing the curve between "ins" and "outs" is a very difficult problem. Many of you know we have slowly but surely been building out our platform to support players connecting with our partners online, similarly to how they might at a live competition, so this might help at some point.
In any case, I consider our team's willingness to experiment and learn from problems that come up to be one of our strengths, so we will at least have some discussions about what we can do to improve our large-scale competitions.
That was a bit longer than expected and still doesn't actually cover everything I'm thinking about, but feel free to ask questions or continue the discussion below. Just to clarify, we are currently expecting to go forward with the 4pm cutoff time.
OK, interesting, thanks for noticing these problems and trying to fix them at such short notice. I'm a little confused now though: is the cutoff at 5pm or 4pm?
I'm sorry, it's whatever it says on the "global competition details" page. I'll check and update anything incorrect.
Edit: It looks like things are consistent. Some people may have mixed up the 5pm stream time and 4pm leaderboard cutoff.
OK 4pm then, thanks
If that's the elo system I'm thinking of, the one Halite uses, I'm a huge fan. They announced some details about their rating system for Halite III, and I've always felt it was a fantastic system for this type of competition. I'm not sure how 1 upload vs 6 uploads changes things, but I'm eager to see this implemented for Terminal.
Huge fan of your algo. It's one of the best designed ones I've seen and seems to give mine the (consistently) hardest time.
Goodness. I'm looking at my algos right now and it seems whoever's on top has just had the most recent win streak vs 1700 elo algos lol. Two peaked at 2110 doing that, it seems. You can see a very visible jump in elo across the board from the time more matches/bigger pool of elos got turned on. I'm curious if I briefly held the top spot in the last 24 hours; I've been checking sporadically and have been floating around 2nd-4th, but it's always a different algo when I check.
How is the ranking determined for the algos that ended with the same elo?
Well, it's over, right?
Well done kauffk for winning. Unfortunately I was only able to make it to 15th, but I think that's still decent xD
Big shout out to the winners and, most importantly, to those who did not make it even though they did their best!
It was fun being part of the top 10 for some of the time, but I ended up 14th.
As @RegularRyan mentioned, the "cutoff" is rather harsh, but compare it with other examples, e.g., sports, where there are only 3 sets of medals. This is much more democratic.
Wait, my algo just played another game. Isn't it over?
Oh OK, you're right.
Greetings. I'm representing 16daystocode (more on that below).
Congrats to all of the other top ten finishers as well as everyone else! Will we be getting any feedback on which bots will be in the finals before the matches occur? One of our algos was tied for 9th and 10th place at the end, but shortly after (roughly 8 minutes later) another one of our algos overtook it. Our highest algo at the time of the 4pm "snapshot" is the algo that we would like to use. We just want to make sure that is correct.
As a side note, we can't seem to activate a forum account for 16daystocode. The link provided to verify our email address doesn't work, even when we copy and paste it.