Kauffk's Final Game

Ryan_Draves · January 7, 2019, 12:01am

Just a quick write up since the Twitch Chat got messy. I don’t think this is cheating, and find it genuinely funny, but here’s how Voice beat Demux in the final match.

He downloaded a replay of me (older version of mine) beating Demux. From there he recognized Demux’s turn 0 placements and switched to a hardcoded strategy of parsing the replay and using my exact placements to replicate the game and win. You can see this from the 13-millisecond computation times.

Edit: Just my take. Probably close to reality, we’ll let Kauffk clear things up if there’s something incorrect there.

Thorn0906 · January 7, 2019, 12:08am

It’s this game, Zephyr vs. Demux-1.25: https://terminal.c1games.com/watch/1787320. Identical to this one, Demux_1.25 vs. Voice_of_cthaeh_2.8: https://terminal.c1games.com/watch/1804075.

Edit: Health isn’t the same because Zephyr went overtime once.

KauffK · January 7, 2019, 12:08am

Sorry I wasn’t active in the Twitch chat; I am attending dinner with my family and should be able to discuss in this thread in about 20 minutes.

kkroep · January 7, 2019, 12:15am

Nice find! I’m a bit salty that my algo was beaten this way for a whopping 1000,- but I’ll get over it in a few days…

Ryan_Draves · January 7, 2019, 12:18am

If it helps, if I had won my round robin, SIXTH ITEM is the algo before Zephyr, so it probably would’ve turned out similar. (In fact, here’s the win for me)

kkroep · January 7, 2019, 12:21am

I know, I was already celebrating that I dodged you in the pool ;). I would have been more comfortable if it had been you taking home the prize though…

Ryan_Draves · January 7, 2019, 12:24am

Yeah, I beat every Aelgoo up until that version. Forceo is the first algo I looked at and said “I have no idea why I lost,” but the Aelgoo match was unfortunate.

RegularRyan · January 7, 2019, 12:53am

Hey everyone,

We are not planning on penalizing KauffK or changing any aspect of the results of the global competition as they do have a very strong algo and the ‘cheese’ strategy used was not against any established rules. It was a clever use of the resources available.

The comparison I would make would be to ‘meta’ techniques that many players used, like switching algos to try to better match against their pool or keeping their algos secret until the last day. The problem in this case is that instead of increasing your win-rates by a small percentage, it gave KauffK a tremendous advantage in that specific matchup.

I would also like to give Rodger, RyanD’s partner and creator of their team’s predictor, credit for the 3 fantastic scrambler interceptions we saw in the final match.

We will be discussing this more internally with long term considerations in mind.
I’ll be staying on top of this discussion.

KauffK · January 7, 2019, 1:07am

Hello everyone. First, thank you all for a great season one, you are all excellent competitors and I greatly admire all of your programs’ unique strengths.

My technique in the final match seems to be the subject of a minor controversy, so I wanted to clear a few things up.

Ryan’s assessment in the opening post is quite accurate; my program passed turn 0 and identified the opponent by the signature opening turn, then attempted to replicate a parsed replay, except for turn 0 which had already passed.

There is a little more to this approach than meets the eye, and ultimately it is a strategic analysis and application of the publicly available game data, although I do admit it was a bit of a ‘cheese’. In the interest of transparency I will discuss my thoughts and experiences with this technique thus far.

Using an approach such as this carries several serious downsides in terms of a match strategy. For one thing, many programs share matching turn 0’s, resulting in the need for further methods of identification. In my development I experimented with a variety of identifying factors such as computing time and secondary turn placement to attempt an ID as fast as possible.
However, this itself creates another problem: once the first turn has passed, the game is now “off script”, so to speak, and committing to following a parsed replay could be sending yourself to a misguided doom. Additionally, simply by passing on turn 0 to get an ID, the butterfly effect of that missed turn might similarly invalidate any replays.
To attempt to mitigate this effect, I created an elaborate data analysis program that cross-referenced all the winning replays against all programs with similar starts, and identified candidates both with minimal action in the first few turns of the game, and with similar opening moves against the maximum number of opponents with identical signatures. This was critical, as it allowed for a potential recovery by switching scripts if it was later revealed that the initial ID was not correct. Simply following the first winning replay that popped out of the API typically had a very low success rate, and was not at all flexible. Compounding all of this is that the opponent is ultimately unknown, and a large amount of preparation went into handling a variety of opponents and contingencies.
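For readers who want to picture the approach described above, here is a minimal sketch of signature-based identification with script switching. It is not KauffK's actual program. The replay layout (`opponent_turns`, `winner_turns`), the `ReplayBook` class, the `early_activity` ranking, and the unit labels are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


def signature(placements):
    """Order-independent fingerprint of one turn's placements."""
    return frozenset(placements)


def early_activity(replay, turns=3):
    """How much the winner did in the first few turns (lower is more flexible)."""
    return sum(len(t) for t in replay["winner_turns"][:turns])


class ReplayBook:
    """Index of winning replays, keyed by the opponent's turn 0 signature.

    Assumed layout: each replay is a dict with "opponent_turns" and
    "winner_turns", both lists (indexed by turn) of placement lists, where a
    placement is a hashable tuple such as (unit, x, y).
    """

    def __init__(self, replays):
        self.by_opening = defaultdict(list)
        for replay in replays:
            key = signature(replay["opponent_turns"][0])
            self.by_opening[key].append(replay)
        # Prefer scripts that do little early, since turn 0 is spent observing.
        for pool in self.by_opening.values():
            pool.sort(key=early_activity)

    def candidates(self, observed):
        """Replays still consistent with every opponent turn seen so far."""
        if not observed:
            return []
        pool = self.by_opening.get(signature(observed[0]), [])
        return [r for r in pool if self._consistent(r, observed)]

    @staticmethod
    def _consistent(replay, observed):
        recorded = replay["opponent_turns"]
        return len(recorded) >= len(observed) and all(
            signature(a) == signature(b) for a, b in zip(recorded, observed)
        )

    def next_moves(self, observed, turn):
        """Scripted placements for `turn`, or None to fall back to normal play."""
        for replay in self.candidates(observed):
            if turn < len(replay["winner_turns"]):
                return replay["winner_turns"][turn]
        return None
```

Because `candidates` is recomputed from everything observed so far, the script switches automatically when a later opponent turn rules out the first guess, and `None` signals that the algo should fall back to its regular strategy.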

I wanted to reveal all of this to make it clear that I was not just out to sabotage kkroep, and neither did I hand-pick and hard code Ryan’s response. Ryan’s replay was selected by my analytic program from a large pool of candidates as the most flexible in terms of opening moves against the Demux series, and as the most likely to succeed even with a blank board on turn 0. I would like to acknowledge and appreciate Ryan’s team’s program, your predictor is fantastic and your performance against the fierce competitors in the Demux series is something to behold.

I imagine that the possibility of using opening moves as an identifying signature has occurred to a few other players, and I would be curious as to what thoughts or results, if any, they experienced.

kkroep · January 7, 2019, 1:35am

Maybe a problem is that with the tools available, people can see all the matches of all algos. I feel like this is too much information so that people with their own creative ideas are hardly protected. Maybe an option would be to change the API so people have access to fewer replays of other players. KauffK here used the resources to find a specific match I lost, other people have used it to try and copy the algo (and got to top 10 with it). Maybe the Terminal developers can decide which replays to share instead of making literally everything available?

I even tried to prevent people from using the tool to view my best algo by keeping up demux_1.9 till the last moment. However, clearly KauffK was able to circumvent that. I congratulate KauffK on pulling it off. I guess I’ll just add a random number generator somewhere. I am more concerned with the ease of copying and gathering info than his specific solution though. I understand that eventually new strategies will get noticed and understood, but Demux_1.25 was literally uploaded a day before the deadline… To see every match it played directly when it is uploaded feels out of touch with the spirit of the competition…

kkroep · January 7, 2019, 1:41am

Thanks for sharing how you pulled off your strategy! It gives me a higher appreciation for the effort in pulling this off, even though I don’t want this type of strategy to be viable in the long run.

RegularRyan · January 7, 2019, 2:08am

Thanks for sharing, KauffK. It’s definitely an impressive piece of work, and what you are describing sounds much less ‘cheesy’ than hard-coding something to beat a single player on the leaderboard you expected to struggle against. I’m actually much more confident about our decision after reading it, and hopefully everyone else feels better as well.

I’ll discuss this with the team.

We will be looking into ways to handle this; I have some ideas.

Tim · January 7, 2019, 2:10am

I agree that this behavior should be limited in the future, but I think that removing access to replays is the wrong approach. Yes, it is one of the easiest methods for removing access to this functionality, but it also has some downsides. I think that a big one is a huge decrease in the viability of neural network algos. We haven’t really seen many / any successful ones, but I think that they may become more common in the future. Another downside of removing this is that people can’t watch replays either. The threads about leaderboard matches were very interesting for me, to be able to see the meta and what strategies the best algos used. This allowed the meta to keep developing, without keeping us stuck on maze algos, for example.

kkroep · January 7, 2019, 8:25am

Well, that is why I proposed them to be limited instead of every match being available. For example the terminal team could host such a dump of replays themselves. Still much better than being able to watch every single replay that is played. By watching all my replays one could easily make a carbon copy of my algos if they want. Without any game sense or knowledge required to design it. There is no clear advantage between the creator of a strategy and its copies.

You are saying that less replay access hurts neural networks, but having access to all hurts handcrafted strategies. I am pretty sure access to all replays was not intentional by the terminal team.

I don’t see why neural networks need this much info? They can just watch their own matches right? And you have access to your own matches always.

Anyway I’ll be interested to see what the development team comes up with.

Aeldrexan · January 7, 2019, 9:37am

I strongly disagree with @kkroep here, I think this kind of strategy should not be restricted at all, and that we should continue to have access to the replays of every ranked match. My reasons are:

I think that if a strategy has to be secret to work, then it can only work for a short term, so it should be considered as a bad strategy.
As explained by @KauffK, this approach is actually very complex to implement, and also needs a backup strategy.
KauffK’s approach is restricting the most static algos, and it is a good thing, since terminal is mostly supposed to be a coding contest and you don’t need a great understanding of algorithms to build a static algo.
There is an easy way to counter this approach: implementing randomness in your algo will force the game to be “off script”.
A harder (and funnier) way to counter this: having a database of replays, you may be able to identify the one your opponent tries to follow and thus make very precise predictions of their future moves, allowing you to counter them, or even better, you can choose at which turn you will make a decisive change that Cthaeh will probably not foresee. (You could as well use that database to detect and counter other static algos).
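Both counters in the list above can be sketched briefly. This is an illustration only, under stated assumptions: `BOARD_WIDTH` of 28 and a left-right symmetric arena are assumed, replays are represented as plain lists of per-turn placement lists, and the function names are hypothetical.

```python
import random

BOARD_WIDTH = 28  # assumed arena width; adjust to the real game config


def mirror(placements, width=BOARD_WIDTH):
    """Reflect (unit, x, y) placements across the board's vertical axis."""
    return [(unit, width - 1 - x, y) for unit, x, y in placements]


def randomized_opening(placements, rng=random):
    """Easy counter: play the opening as written or mirrored, at random."""
    return mirror(placements) if rng.random() < 0.5 else list(placements)


def matching_replays(replays, observed):
    """Replays whose recorded turns agree with every opponent turn seen so far."""
    n = len(observed)
    return [
        r for r in replays
        if len(r) >= n and all(set(r[i]) == set(observed[i]) for i in range(n))
    ]


def predict_next(replays, observed):
    """Harder counter: predict the opponent's next turn from a replay database.

    Returns the predicted placements as a set when all matching replays agree
    on the next turn, otherwise None.
    """
    n = len(observed)
    nexts = {frozenset(r[n]) for r in matching_replays(replays, observed) if len(r) > n}
    return set(next(iter(nexts))) if len(nexts) == 1 else None
```

A mirrored opening changes the turn 0 signature, so a replay-following opponent would have to recognize both variants. `predict_next` only commits to a prediction once the observed turns narrow the database down to a single continuation.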

arnby · January 7, 2019, 9:52am

I totally agree with @Aeldrexan, well said!

Usually in ML, the more the better, especially here where the state-action space is crazy big.
You can also consider that the overall level is rising so you need up-to-date replays for your training. Replays from mid-December could be completely useless today (for certain types of ML applications)

Right, but it depends on what your training goals are and how you implement your training. If you want to make a very general AI, having access to all replays is crucial. Before beating the world champion and before playing a lot against itself, AlphaGo first watched thousands and thousands of expert Go matches to learn.

Hahah this would be hilarious! Imagine two algos having a battle of replays, changing as soon as they recognize the opponent replay. (Way too complicated in practice but still funny)

kkroep · January 7, 2019, 10:48am

I see a disagreement here. Especially with the players who are least negatively affected by this mechanic. @Aeldrexan and @arnby have the two least copyable algorithms on the ladder. @Aeldrexan here:

You don’t need a lot of programming knowledge to design a static algorithm, this is true. But in order to design an effective semi-static algo that can compete with dynamic ones you need a very solid understanding of how the game works. We both have had very different approaches to tackling this game (polar opposites). Your algorithm makes all the decisions itself, while my algorithm is the result of calculating structures with pen and paper and a lot of by-hand play to figure out new strategies. For example the gap-jump idea implemented in my transistor and demux algos.

Again, I am not convinced that the current amount of access to information is intended behavior. Though, I found a solution that doesn’t need changes from terminal: I will remove all my algorithms from the ladder, as their presence mainly negatively affects my algos. I can simply upload at the last moment before a competition, or in short bursts to monitor performance. I guess that kind of gets the same effect that I am looking for.

arnby · January 7, 2019, 11:15am

you have a point here

I think it is intended: the aim is to have a global improvement, to ‘force’ players to always improve. Each player needs to know what the others are doing to improve.

I understand your issue, but if everyone does that, we will receive true feedback only when competitions occur because everyone will release their hidden algos only at that time. And this behavior is definitely not wanted by C1.

Anyway, I definitely understand your concerns and we will see what C1 says about this.

Aeldrexan · January 7, 2019, 11:26am

You would probably hurt yourself doing that, since you would not be able to see how your algos do against the other players. But that is still an interesting idea, and it could actually work since your algos are usually masterpieces of handcrafted defenses.

kkroep · January 7, 2019, 11:27am

I completely agree with this statement. This is why I would like there to be less info available. To incentivise people to keep competing. My proposed solution would only help me, but be detrimental to the community. I am trying to argue a point that having access to all replays is not only a good thing. This would be one example of that. If I had lost to the algos with my own strategy that I designed for weeks and uploaded a week before the competition deadline, I would have been way more salty about this.