Yeah, I beat every Aelgoo up until that version. Forceo is the first algo where I looked at the match and said “I have no idea why I lost,” but the Aelgoo match was just unfortunate.
Hey everyone,
We are not planning on penalizing KauffK or changing any aspect of the results of the global competition as they do have a very strong algo and the ‘cheese’ strategy used was not against any established rules. It was a clever use of the resources available.
The comparison I would make is to the ‘meta’ techniques many players used, like switching algos to try to match up better against their pool, or keeping their algos secret until the last day. The problem in this case is that instead of increasing your win rate by a small percentage, it gave KauffK a tremendous advantage in that specific matchup.
I would also like to give credit to Rodger, RyanD’s partner and the creator of their team’s predictor, for the three fantastic scrambler interceptions we saw in the final match.
We will be discussing this more internally with long term considerations in mind.
I’ll be staying on top of this discussion.
Hello everyone. First, thank you all for a great season one, you are all excellent competitors and I greatly admire all of your programs’ unique strengths.
My technique in the final match seems to be the subject of a minor controversy, so I wanted to clear a few things up.
Ryan’s assessment in the opening post is quite accurate; my program passed turn 0 and identified the opponent by the signature opening turn, then attempted to replicate a parsed replay, except for turn 0, which had already passed.
There is a little more to this approach than meets the eye, and ultimately it is a strategic analysis and application of the publicly available game data, although I do admit it was a bit of a ‘cheese’. In the interest of transparency I will discuss my thoughts and experiences with this technique thus far.
Using an approach such as this carries several serious downsides in terms of a match strategy. For one thing, many programs share matching turn 0’s, resulting in the need for further methods of identification. In my development I experimented with a variety of identifying factors such as computing time and secondary turn placement to attempt an ID as fast as possible.
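To give a rough idea of what that identification step could look like, here is a simplified Python sketch with made-up names and data shapes (this is for illustration only, not my actual code): hash the opponent’s turn-0 placements, plus an optional coarse timing bucket, and look the result up in a table built offline from downloaded replays.

```python
# Simplified sketch of the identification step with made-up names;
# for illustration only, not my actual code.

def turn0_signature(enemy_placements, elapsed_ms=None):
    """Build a hashable signature from the opponent's opening placements.

    enemy_placements: iterable of (unit_type, x, y) tuples seen on turn 0.
    elapsed_ms: optional computation time, used as a secondary identifier.
    """
    placements = tuple(sorted(enemy_placements))
    # Bucket the timing coarsely so small noise doesn't change the signature.
    time_bucket = None if elapsed_ms is None else int(round(elapsed_ms, -2))
    return (placements, time_bucket)

def identify_opponent(signature, signature_table):
    """signature_table: dict mapping signatures -> list of algo names,
    built offline from downloaded replays."""
    exact = signature_table.get(signature, [])
    if exact:
        return exact
    # Many algos share identical openings, so fall back to matching on
    # placements alone and return every candidate; later turns (or timing)
    # have to narrow it down.
    placements, _ = signature
    return [name
            for (placed, _), names in signature_table.items()
            if placed == placements
            for name in names]
```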
However, this itself creates another problem: once the first turn has passed, the game is now “off script”, so to speak, and committing to following a parsed replay could be sending yourself to a misguided doom. Additionally, simply by passing on turn 0 to get an ID, the butterfly effect of that missed turn might similarly invalidate any replays.
To attempt to mitigate this effect, I created an elaborate data analysis program that cross-referenced all the winning replays against all programs with similar starts, and identified candidates both with minimal action in the first few turns of the game, and with similar opening moves against the maximum number of opponents with identical signatures. This was critical, as it allowed for a potential recovery by switching scripts if it was later revealed that the initial ID was not correct. Simply following the first winning replay that popped out of the API typically had a very low success rate, and was not at all flexible. Compounding all of this is that the opponent is ultimately unknown, and a large amount of preparation went into preparing for a variety of opponents and contingencies.
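To illustrate the kind of cross-referencing I mean, here is another simplified sketch (again, made-up names and data shapes, not the actual analyzer): score each winning replay by how many of the candidate opponents it beats and how quiet its opening is, then follow the highest-scoring script.

```python
# Simplified sketch of the replay-selection idea with made-up data shapes;
# not the actual analyzer. A replay here is assumed to look like
# {"id": ..., "turns": [{"placements": [...]}, ...]}.

def opening_actions(replay, turns=3):
    """Total number of placements made in the first few turns of a replay."""
    return sum(len(turn["placements"]) for turn in replay["turns"][:turns])

def score_replay(replay, candidate_opponents, winning_replay_ids_by_opponent):
    """Higher is better: the script beats more of the candidate IDs and does
    less in the opening, so skipping turn 0 hurts it less."""
    beaten = sum(1 for opp in candidate_opponents
                 if replay["id"] in winning_replay_ids_by_opponent.get(opp, set()))
    return beaten - 0.1 * opening_actions(replay)

def pick_script(replays, candidate_opponents, winning_replay_ids_by_opponent):
    """Choose the most flexible winning replay to follow from turn 1 onward."""
    return max(replays, key=lambda r: score_replay(
        r, candidate_opponents, winning_replay_ids_by_opponent))
```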
I wanted to reveal all of this to make it clear that I was not just out to sabotage kkroep, and neither did I hand-pick and hard-code Ryan’s response. Ryan’s replay was selected by my analytic program, from a large pool of candidates, as the most flexible in terms of opening moves against the Demux series and as the most likely to succeed even with a blank board on turn 0. I would like to acknowledge and appreciate Ryan’s team’s program; your predictor is fantastic, and your performance against the fierce competitors in the Demux series is something to behold.
I imagine that the possibility of using opening moves as an identifying signature has occurred to a few other players, and I would be curious as to what thoughts or results, if any, they experienced.
Maybe the problem is that, with the tools available, people can see all the matches of every algo. I feel this is too much information, and that people with their own creative ideas are hardly protected. Maybe an option would be to change the API so people have access to fewer replays of other players. KauffK used the resources to find a specific match I lost; other people have used them to try to copy the algo (and got to the top 10 with it). Maybe the Terminal developers could decide which replays to share instead of making literally everything available?
I even tried to prevent people from using the tool to view my best algo by keeping demux_1.9 up until the last moment. However, KauffK was clearly able to circumvent that, and I congratulate him on pulling it off. I guess I’ll just add a random number generator somewhere. I am more concerned with the ease of copying and gathering info than with his specific solution, though. I understand that eventually new strategies will get noticed and understood, but Demux_1.25 was literally uploaded a day before the deadline… Having every match it plays visible the moment it is uploaded feels out of touch with the spirit of the competition…
Thanks for sharing how you pulled off your strategy! It gives me a higher appreciation for the effort involved in pulling this off, even though I don’t want this type of strategy to be viable in the long run.
Thanks for sharing, KauffK. It’s definitely an impressive piece of work, and what you are describing sounds much less ‘cheesy’ than hard-coding something to beat a single player on the leaderboard you expected to struggle against. I’m actually much more confident about our decision after reading it, and hopefully everyone else feels better as well.
I’ll discuss this with the team.
We will be looking into ways to handle this; I have some ideas.
I agree that this behavior should be limited in the future, but I think that removing access to replays is the wrong approach. Yes, it is one of the easiest ways to shut this tactic down, but it also has downsides. A big one, I think, is a huge decrease in the viability of neural network algos. We haven’t really seen many (or any) successful ones, but I think they may become more common in the future. Another downside is that people couldn’t watch replays either. The threads about leaderboard matches were very interesting for me; being able to see the meta and what strategies the best algos used allowed the meta to keep developing instead of keeping us stuck on maze algos, for example.
Well, that is why I proposed that replays be limited instead of every match being available. For example, the Terminal team could host such a dump of replays themselves. That would still be much better than being able to watch every single replay that is played. By watching all my replays, someone could easily make a carbon copy of my algos, without any game sense or knowledge required to design them. There is no clear advantage for the creator of a strategy over its copies.
You are saying that less replay access hurts neural networks, but having access to all of them hurts handcrafted strategies. I am pretty sure access to all replays was not intentional on the Terminal team’s part.
I don’t see why neural networks need this much info. They can just watch their own matches, right? And you always have access to your own matches.
Anyway I’ll be interested to see what the development team comes up with.
I strongly disagree with @kkroep here. I think this kind of strategy should not be restricted at all, and that we should continue to have access to the replays of every ranked match. My reasons are:
- I think that if a strategy has to be secret to work, then it can only work in the short term, so it should be considered a bad strategy.
- As explained by @KauffK, this approach is actually very complex to implement, and it also needs a backup strategy.
- KauffK’s approach mostly restricts the most static algos, and that is a good thing, since Terminal is mostly supposed to be a coding contest and you don’t need a great understanding of algorithms to build a static algo.
- There is an easy way to counter this approach: implementing randomness in your algo will force the game to go “off script”.
- A harder (and funnier) way to counter it: with a database of replays, you may be able to identify the one your opponent is trying to follow and thus make very precise predictions of their future moves, allowing you to counter them; or, even better, you can choose the turn at which to make a decisive change that Cthaeh will probably not foresee. (You could also use that database to detect and counter other static algos.)
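To sketch that last idea (purely illustrative, with invented names; this is nobody’s actual implementation): match the opponent’s moves so far against a database of scripted move sequences, and once only one candidate remains, their future turns can be read straight from the script.

```python
# Purely illustrative sketch with invented names: match the opponent's moves
# so far against a database of scripted move sequences; once only one
# candidate is left, their future turns are known in advance.

def matching_scripts(opponent_moves_so_far, replay_database):
    """replay_database: dict mapping replay_id -> list of per-turn moves."""
    n = len(opponent_moves_so_far)
    return [replay_id for replay_id, moves in replay_database.items()
            if moves[:n] == opponent_moves_so_far]

def predicted_next_move(opponent_moves_so_far, replay_database):
    matches = matching_scripts(opponent_moves_so_far, replay_database)
    if len(matches) == 1:
        moves = replay_database[matches[0]]
        turn = len(opponent_moves_so_far)
        if turn < len(moves):
            return moves[turn]  # the opponent's next turn, read from the script
    return None                 # not (yet) identifiable, play normally
```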
I totally agree with @Aeldrexan, well said!
Usually in ML, the more data the better, especially here where the state-action space is crazy big.
You can also consider that the overall level is rising, so you need up-to-date replays for your training. Replays from mid-December could be completely useless today (for certain types of ML applications).
Right, but it depends on what your training goals are and how you implement your training. If you want to make a very general AI, having access to all replays is crucial. Before beating the world champion, and before playing a lot against itself, AlphaGo first watched thousands and thousands of expert Go matches to learn.
Hahah, this would be hilarious! Imagine two algos having a battle of replays, each switching as soon as they recognize the opponent’s replay. (Way too complicated in practice, but still funny.)
I see a disagreement here, especially from the players who are least negatively affected by this mechanic. @Aeldrexan and @arnby have the two least copyable algorithms on the ladder. @Aeldrexan here:
You don’t need a lot of programming knowledge to design a static algorithm, this is true. But in order to design an effective semi-static algo that can compete with dynamic ones, you need a very solid understanding of how the game works. We have both had very different approaches to tackling this game (polar opposites). Your algorithm makes all the decisions itself, while my algorithm is the result of calculating structures with pen and paper and a lot of by-hand play to figure out new strategies, for example the gap-jump idea implemented in my transistor and demux algos.
Again, I am not convinced that the current amount of access to information is intended behavior. I did find a solution that doesn’t need changes from Terminal, though: I will remove all my algorithms from the ladder, as their presence mainly hurts my own algos. I can simply upload at the last moment before a competition, or in short bursts to monitor performance. That kind of gets the same effect I am looking for.
you have a point here
I think it is intended: the aim is global improvement, to ‘force’ players to always improve. Each player needs to know what the others are doing in order to improve.
I understand your issue, but if everyone does that, we will only get true feedback when competitions occur, because everyone will release their hidden algos only at that time. And that behavior is definitely not what C1 wants.
Anyway, I definitely understand your concerns and we will see what C1 says about this.
You would probably hurt yourself doing that, since you would not be able to see how your algos do against the other players. But it is still an interesting idea, and it could actually work, since your algos are usually masterpieces of handcrafted defense.
I completely agree with this statement. This is why I would like there to be less info available: to incentivise people to keep competing. My proposed solution would only help me and would be detrimental to the community. I am trying to argue that having access to all replays is not only a good thing, and this is one example of that. If I had lost to algos running my own strategy, one I had designed over weeks and uploaded a week before the competition deadline, I would have been way more salty about this.
Thanks for the compliment!
Actually, I already tried a similar strategy with the Terminal series. I designed that strategy on paper and then implemented a barebones version of it. I put it on the ladder, and it shot up to the no. 1 spot in one day. I then copied the links to all its replays and removed it again. That gave me enough resources to develop demux. C1Ryan contacted me for the highlight video, and I shared with him the replays of the algo I had removed, showing a strategy that was not on the ladder yet. This was what C1Ryan was referring to on the stream and why he was confident that I came up with it.
So I am guessing this strategy would be effective in the future too… but I would rather it not be that effective/necessary, as it is detrimental to the community. I much prefer open discussions on the forum about strategies (very much like this one, actually) and sharing ideas, instead of actively hiding them. For me that is half of the fun.
It’s a bit funny how my 1700 elo algo beat KauffK’s algo: https://terminal.c1games.com/watch/1863571
And I guess a big part of KauffK’s success is the lack of randomness in most games, and maybe this is a big downside for his strategy. My algo is only slightly random (only when there are two paths with an equal number of units scoring is the chosen path random), but due to the butterfly effect, it doesn’t take much of a change to make big differences in the long term and change the course of a game.
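For reference, the tie-break I described is roughly this (a simplified sketch, not my exact code):

```python
import random

# Simplified sketch of the tie-break described above (not the exact code):
# when several attack paths are equally good, pick one at random, so an
# opponent following an old replay quickly ends up "off script".

def choose_attack_path(paths, expected_score):
    """paths: candidate attack paths; expected_score: path -> units scoring."""
    best = max(expected_score(p) for p in paths)
    best_paths = [p for p in paths if expected_score(p) == best]
    return random.choice(best_paths)
```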
Great example of the problems with this technique. From the looks of things, my analyzer went with something resembling the track series for this match, but between you having multiple programs that look the same on turn 0 and your random attack paths, clearly something went wrong.
I considered adding a mechanism to detect when games go off script and then switching to a recently built dynamic defense, but I ran out of time and it never worked quite the way I wanted. Fixing my dynamic mode is one of my first agenda items for season two.
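The off-script check I had in mind was roughly the following (a simplified sketch with made-up names, not the code I actually ran): compare the enemy units the replay expects against what is actually on the board, and hand over to the dynamic defense once they diverge.

```python
# Rough sketch of an "off script" check (simplified, made-up names): compare
# the enemy units the replay expects with what is actually on the board, and
# hand over to a dynamic strategy once they diverge too much.

def off_script(expected_enemy_units, observed_enemy_units, tolerance=3):
    """Both arguments are sets of (unit_type, x, y)."""
    return len(expected_enemy_units ^ observed_enemy_units) > tolerance

def plan_turn(scripted_placements, expected_enemy_units, observed_enemy_units,
              dynamic_strategy):
    if off_script(expected_enemy_units, observed_enemy_units):
        # The real game has diverged from the recorded one: stop following
        # the replay and let the live defense decide.
        return dynamic_strategy(observed_enemy_units)
    return scripted_placements
```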
Ohh, so this explains your computation time in this match:
https://terminal.c1games.com/watch/1864837
I was kind of confused when I saw my algo lose to yours, since I usually won against voice_of_cthae.
Very interesting. I actually used to use something like this back with my old algo (mode_switch); mine was way worse, though.
I think this is a little more widespread as a strategy than previously thought, and that’s something I’m not as ok with.
I remember an issue when we developed this (for the purpose of replaying matches against ourselves and getting debug messages from our algo). When a parsed game like that ends, there aren’t any more turns to play, so your algo crashes. To fix this, when games go “off script,” we made the last turn repeat itself if the game hasn’t ended. That way, the algo doesn’t crash and get removed if we decide to upload it to get the debug messages vs ourself.
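Roughly, the fix amounts to clamping the script index (a simplified sketch, not our exact code):

```python
# Simplified sketch of the fix described above (not the exact code): clamp the
# script index so the last recorded turn keeps repeating once the parsed game
# runs out of turns, instead of indexing past the end and crashing.

def scripted_placements(script, turn_number):
    """script: list of per-turn placement lists taken from a parsed replay."""
    if not script:
        return []
    index = min(turn_number, len(script) - 1)
    return script[index]
```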
So did You’re_Allowed_A_Wand in their round robin match against Demux. It looks like the last round of that match is round 22, as they repeat the same turn until they lose, having gone off script. Looking at their computation times, it’s a very similar system. I’d love to hear from @16daystocode on their system for this.
UPDATE: All of You’re_Allowed’s matches did this. Looking at the matches of one of their previous versions, I am unable to find a single instance of them using their own algo. I’m not sure they have one.
On a separate note, it turns out my match against KauffK faced a replay of kkroep’s algorithm as well. It looks like the same series of kkroep’s algos that You’re_Allowed selected, though I can’t place it as a Transistor or a Demux. I figured I’d share my thoughts on “countering” this cheese strategy, since my algo seems to have taken it off script almost immediately, and I think I know why. Hopefully this helps the people focused on pioneering algo designs, like mazes and transistors, and makes them less susceptible to copy-cats and parsed replays of their defeats.
Firstly, I selected a different algo from what was on the leaderboards. This isn’t incredibly useful for countering this strategy, but it is worth noting that simply having a new version can send these games “off script” and tilt the game in your favor.
Second, my algo does care about what the enemy placed on turn 0. Skipping a turn and “catching up” on placements by round 1 will alter my algo’s behavior. I would recommend doing something with the enemy’s turn-0 placements (or the lack thereof) if you want to ensure your algo takes things “off script” with minimal disruption to you.
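For illustration, something as small as this is enough (made-up layouts and names, not my actual algo):

```python
# Illustrative sketch with made-up layouts (not the actual algo): branching on
# the enemy's turn-0 placements means an opponent who passed turn 0 to
# identify you sees a defense that never appears in your recorded replays.

LEFT_HEAVY_LAYOUT = [("FILTER", 3, 12), ("DESTRUCTOR", 4, 11)]     # placeholder
RIGHT_HEAVY_LAYOUT = [("FILTER", 24, 12), ("DESTRUCTOR", 23, 11)]  # placeholder
ANTI_PASS_LAYOUT = [("DESTRUCTOR", 13, 11), ("DESTRUCTOR", 14, 11)]

def opening_defense(enemy_turn0_placements):
    """enemy_turn0_placements: list of (unit_type, x, y) seen on turn 0."""
    if not enemy_turn0_placements:
        # The opponent did nothing on turn 0 (e.g. passing to identify us):
        # answer with a layout that is not in any of our old replays.
        return ANTI_PASS_LAYOUT
    left = sum(1 for (_, x, _) in enemy_turn0_placements if x < 14)
    right = len(enemy_turn0_placements) - left
    return LEFT_HEAVY_LAYOUT if left >= right else RIGHT_HEAVY_LAYOUT
```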
Third, a little bit of RNG may help. I haven’t incorporated this yet, but I really liked @codegame’s idea of using RNG when two equally good attacks could be selected.
We will share our experience shortly.