Regarding the Protection of Ideas


#1

In recent Terminal history, many significant events have taken place (such as the Season 1 competition) that have sparked conversations about various aspects of the competition, but a central topic has been the privacy and protection of a user’s strategy. Many ideas were presented and addressed, but personally I find that many of the points brought up remain unresolved (C1 is probably still figuring this out, so the intention of this post is not to get people riled up or anything like that :wink: ).

My goal for this topic is to summarize some of the issues and points brought up in a few other topics that became a little convoluted. I realize that this may seem a little belated, but I intentionally did not comment in the previous topics because I wanted some time to think about this and let the dust settle, so to speak.

The protection of ideas/strategies wasn’t something I had thought about much since my algo builds dynamically and can’t really be copied, something which is definitely a bias as noted by @kkroep.

Both @Aeldrexan and @kkroep brought up valid and pertinent points in the discussion. Prior to the Season 1 competition, I, like @Aeldrexan (see here), believed that restricting replays was fundamentally a bad thing to do. Specifically, I believed (and still do) that copying other people’s designs is simply part of the Terminal game, whether intended or not. It is impossible to completely hide one’s design, as highlighted by the final game of Season 1. It is one of the reasons I have never once uploaded a static, copyable design. Yet it doesn’t seem quite right when you have users like @kkroep whose algos are

This is simply a different type of strategy. @Aeldrexan argued that if a strategy has to be secret to work, then it can only work in the short term, so it should be considered a bad strategy. However, I don’t think this is always true. A new design may be a good one, but once other people use it, the creator no longer benefits from having discovered an innovative design. In fact, after a certain amount of time, pretty much everyone benefits from a good idea except the person who created it. Furthermore, there really isn’t a way to test a strategy in full without uploading it. And once it is uploaded, even if you take it down very quickly, you cannot change the fact that people will see the design and likely use it.

My main concern is that there is little motivation to create a new, impressive design, since it will simply be used by everyone against you. Again, this does not personally affect me, so I may be way off here, but the very fact that I (and others) have completely avoided creating designs shows there is at least some dissuasion from creating new ones. Certainly, at this point, I would not spend the time to create a completely new design; I would use the transistor as a template, since it is such a great design.


Something I view as a separate issue is the accessibility of replay files. While copying ideas and replays may appear to be connected (particularly after the final game of Season 1), I think @KauffK’s post explaining his win does a good job of showing how access to replays alone isn’t really an issue. Simply having access to a replay file does not suddenly enable you to copy/paste and win games. Thus I strongly support leaving replays fully accessible and available to all. I would argue that the problem that

isn’t related to replay files being downloaded; it’s the nature of other people being able to watch the games they play against you. One could argue that being able to watch any game makes this worse, and it may. But the problem would stay the same. Even if just one person watched your design, they would then have the capability to copy and use it, then others would see theirs, and so on until the same result happens. The meta may shift at a slower rate, but I believe it would still shift even if replays were restricted. I would point back to my section above and claim that the issue is with being able to see any games at all; restricting replays will not protect people’s original ideas.


The conclusion I have isn’t a very satisfying one: I’m not sure there is a perfect solution. Any creative, good idea that can be replicated will be replicated, because people can see it. At the same time, being able to watch games (even if it is just your own) is absolutely necessary when it comes to improving your own algo and iterating on designs. It is impossible to improve an algo significantly without being able to watch it play. I have also ignored many other points that were brought up regarding ML and the validity of statically designed algos.

I would like to end with the notes from C1:

and

So I’m sure we will get to see what C1 plans to do regarding this.

Lastly, I’d just like to make it clear that there were many more ideas discussed at the time and apologize for any comments I may have missed or left out since they were similar to ideas I summarized above. Thanks for all of your awesome thoughts, happy coding :).


#2

Awesome post. The Kauffk’s Final Game thread became quite long, so having someone who is not directly affected take a step back and summarize the whole thing is very valuable. I think the problems I tried to raise are well represented in this post. In fact, you did a much better job than I did, as I was quite salty back then… :roll_eyes:

This is the part where I have a slightly different opinion. Seeing an algorithm play once, or maybe up to 10 times, is a whole different ballpark from being able to look up every match-up that algorithm has ever had. From only a few replays, it is still quite difficult to understand the purpose behind specific placements. They might be put there to counter a possible strategy that wasn’t encountered in this replay, but might occur in another. However, if you start implementing a copy and you lose against a certain algo, you can just go to the replay discount shopping basket of the “OG” algo and see how he counters it. Without that information, it is actually quite difficult to beat the original creator with his own strategy. So to summarize my point:

and that is the nature of the game, but they should not have access to the complete database of matches ever performed by the algo they are trying to imitate.


#3

Maybe more frequent contests with smaller prizes could incentivize innovation. The crux of the problem, as I see it, is that if you have a great new idea and implement it two months before the end of the season, it will be part of the standard meta by the time it matters, or at the very least other users will have had time to adjust to beat it. More routine payouts should help ease this problem. Thoughts?


#4

Thanks for this great summary post!

A lot of people are talking about this issue, and about “information warfare” (as I like to call it) in general.

There are a lot of different opinions and options people are talking about.

Don’t take this as anything official (I have my own biases as well) but some things we are considering are:

  • More frequent competitions with more spread out rewards instead of a big reward for the final competition. This way you can get rewarded for your innovations before people have time to copy them.
  • “Anonymizing” replays. Public replays don’t show which algos are playing.
  • Delaying public replay release by some amount (week or two?). But you can still see your own replays.
  • Implementing gameplay changes that have some symmetrical randomization. These will be symmetrical so no one player has a random advantage but there could still be issues with certain configurations benefiting a certain algo more than another. For example:
    • Randomly give out more bits to both players on random turns.
    • Give free towers on turn 0 that you can remove or dynamically utilize.
    • Have certain spots of the board that randomly give some bonus for being interacted with.
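To illustrate how the symmetrical-randomization idea could stay fair, here is a minimal sketch (all function names, probabilities, and bit amounts are my own invention, not real game values): both engines derive the per-turn randomness from a shared match seed, so the two sides always receive identical bonuses.

```python
import random

def symmetric_bonuses(match_seed, num_turns, bonus_chance=0.25, bonus_bits=3):
    """Derive a per-turn bonus-bits schedule from a shared match seed.

    Both players' engines would call this with the same seed, so the
    extra bits are always identical for the two sides. The numbers
    here are illustrative only.
    """
    rng = random.Random(match_seed)
    schedule = []
    for _ in range(num_turns):
        if rng.random() < bonus_chance:
            schedule.append(bonus_bits)  # both players get the same extra bits
        else:
            schedule.append(0)
    return schedule

# The same seed always yields the same schedule for both players:
p1 = symmetric_bonuses(match_seed=42, num_turns=30)
p2 = symmetric_bonuses(match_seed=42, num_turns=30)
assert p1 == p2
```

Since both sides draw from the same seed, any luck is shared exactly; the open question discussed below is how strongly the shared randomness should influence the game.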

I’m curious which solutions you think will work best, especially from people who develop their strategies “by hand” (@kkroep). Every solution has some kind of flaw; some might not be enough, and others I personally think hurt more than they help.

This is an unofficial post, though, so I can’t guarantee anything; these are just some ideas I’m curious about.


#5

I would personally rather see the last option, as I think it would add an exciting twist to the game and push for more dynamic approaches. The downside is that adding even more depth may push machine-learning experimentation out the window, the development of which I’m sure both players and C1 would like to see. Also, I don’t know what room there would be for static-strategy development as a primary focus (except maybe for the first idea, with bits), but then again, in the grand scheme of trying to develop a powerful algorithm, I don’t see how much longer static-structure-based innovation can persist…

In the case of hard-coded matches, like Kauffk’s algo and @16daystocode’s (who I would still like to hear from about their system), some of these ideas don’t seem to help the matter. In particular, the first two ideas don’t stop the behavior, as “signatures” are identified by round 0/1 placements, not names, and the third only weakens it (particularly against new algos uploaded within the delay period, assuming the copier doesn’t get to play against the new algo and add it to the “library” of matches).


#6

The hope is that anonymizing would at least make it more difficult to find a replay in which the algo you lost to gets beaten. You would have to scan all the replays and try to figure out which ones belong to the algo you lost to. Though it’s possible @KauffK’s setup can already handle analyzing that many replays; I’m not sure.

You think a one-to-two-week delay still wouldn’t be enough? I was thinking that one week before a competition, algos would have changed enough to make copying obsolete, but maybe not (assuming you don’t have a match with them)?

As for the last option, the hope is that static algos can mostly ignore the randomized elements, because they only give a slight edge that only top players would absolutely need to focus on. But it’s still all theoretical. It could be too weak an impact to affect even top players, or so strong an impact that it affects new players too much.


#7

If the replays could no longer be queried by user as with

 https://terminal.c1games.com/api/game/algo/<algo_id>/matches

and instead could only be accessed by whacking a ton of sequential replay IDs into the download link, it would definitely brick my current system, although the system already operates with huge numbers of replays, so with enough work it could probably be recovered. Regardless, I’ve pretty much taken that system apart to use it for machine learning anyway.

That being said, making replays non-downloadable on a per-match or per-user basis would make machine learning problematic as well, since I personally (and I imagine many others) would not want to train ML on just a random collection of replays, and would naturally want training data from only the most successful algos.


#8

I suppose you’re right; I guess it would depend on how “unique” someone’s placement signatures are, and whether or not they’ve already been copied by others.

The reason I said one to two weeks might not help was in part due to either having a match against that person anyway, or a case where someone’s past one to two weeks of development weren’t focused on areas like placements that could “throw off” a scripted replay. I don’t know which version of Demux Kauffk’s algo attempted to script against me in the finals, or how old that replay was, for it to be obsolete in playing against me, and I wouldn’t know for sure which version of Demux Kauffk’s algo thought it was emulating Zephyr against (perhaps it was the same version). But I do know my development that week was solely on placement strategy.

I’m going to bring up @16daystocode again here, as I’m pretty convinced this “scripted replays” thing is their sole strategy, so they would be a more valuable resource in terms of having a giant pool of replays to evaluate how long it takes before the replays are obsolete and their algo starts losing… (though I’m sure cross referencing each match to replay would be painstakingly difficult).


#9

The main thing that went wrong against you in the finals didn’t have to do with replay availability so much as the fundamental weakness of the “script” technique; you had 6 programs with very similar moves in the first 1 or 2 turns, and my analyzer just had to take its best guess at which of your programs I was actually facing. It went with Demux, since a few different Demuxes beat a few different algos of yours while still keeping the first 1 or 2 turns similar, but in the end it either guessed wrong or took too long to figure it out, and things went off-script. This sort of problem gave me trouble the entire time and is one of the reasons I’m repurposing my analytical engine; too many people have 6 copies with similar openings, and it’s just not consistent enough for my liking.


#10

I would like to discuss some of @C1Junaid’s ideas that seem really interesting to me:

This could be a very good idea, but as @KauffK said, it would be very detrimental to ML (as well as many other ways of using replays) if matches between the best algos are scattered among many low-rated algos’ replays.
A way to solve this problem would be the ability to query replays above a minimum rating, for instance:

 https://terminal.c1games.com/api/game/replays?minrating=2000&quantity=100

would list the last 100 replays where both algos have an Elo above 2000.
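The endpoint above is only a proposal, but its semantics are easy to pin down. Here is a sketch of the server-side filtering it implies (the field names are my own invention, not the real API schema):

```python
def filter_replays(replays, min_rating, quantity):
    """Return the `quantity` most recent replays in which BOTH
    participants are rated at or above `min_rating`.

    `replays` is assumed to be a list of dicts with hypothetical
    keys 'id', 'timestamp', 'rating_a', and 'rating_b'.
    """
    eligible = [r for r in replays
                if r["rating_a"] >= min_rating and r["rating_b"] >= min_rating]
    eligible.sort(key=lambda r: r["timestamp"], reverse=True)  # newest first
    return eligible[:quantity]
```

The key design point is requiring both ratings to clear the bar, so a top algo stomping a beginner doesn’t pollute the high-level result set.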

At first I was loving this idea, but now that I think about it, it has several problems:

  • It would mess up current static algos, including most of the bosses.
  • Making a beginner algo would be harder.
  • Overall it would help pure adaptive algos, and hurt almost everything else.

But you could encourage such opening diversity by having a firewall cost reduction on some spots, either only at turn 0
or for the whole game. That would be quite similar to:


#11

I would also like to give some thought on the proposed ideas by @C1Junaid.

Rather than discussing how to implement changes, we should discuss how we want the replay system to work. For example, the developers could disable any external interaction with the API and introduce an option where you can click on any algo on the leaderboard and directly view its 5-10 matches against the highest-ranked opponents.

Now for the rest of this post I want to talk a bit about machine learning because I have the impression that it is used as an argument for change in the wrong way. This is probably going to be lengthy.

It seems to be the case that you all have a very specific type of machine learning in mind. This concept is called “supervised learning”: a well-known approach where you feed an algorithm a set of training data (for example, from high-performing algorithms), and the machine learning algorithm then adjusts to mimic the behavior presented to it in the replays.

This type of machine learning is by far the least interesting way to handle this game. At best you get an algorithm behaving close to the top algorithm it is mimicking, but there is no structure in place that will help it generate novel ideas. AlphaGo, one of the most famous ML examples, uses deep learning. Deep learning is a type of algorithm you can loosely divide into a few facets:

  1. Multiple layers of pre-processing units that extract features from the raw input data. The goal here is that the neural network can work with a set of properties instead of, for example, a million pixels.
  2. A supervised or unsupervised learning method to train a neural network. The key difference is that for supervised learning you show examples of desired behavior, whereas with unsupervised you simply provide a metric for good performance that the network wants to maximize. (AlphaGo uses the latter)
  3. Learn how to make additional abstraction layers. This is actually similar to the first step, where hardcoded pre-processing is used, but here the neural network itself figures out abstraction steps to improve decision making.
  4. a translation step that picks the decisions made by the neural network (e.g. attack from the left with EMPs) and makes it happen in the best way on the board.
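To make the first facet concrete, here is a toy sketch of hand-coded pre-processing that collapses a raw board grid into a handful of properties; the features chosen are purely illustrative, not taken from any real Terminal ML pipeline:

```python
def extract_features(board):
    """Collapse a raw board grid into a few summary properties that a
    neural network can digest instead of raw cells.

    `board` is a 2D list of 0/1 values; 1 marks a firewall. The three
    features below are illustrative examples only.
    """
    rows = len(board)
    cols = len(board[0])
    total = sum(sum(row) for row in board)
    left = sum(cell for row in board for cell in row[: cols // 2])
    front = sum(board[0])  # firewalls on the front row
    return {
        "density": total / (rows * cols),
        "left_bias": left / total if total else 0.5,
        "front_coverage": front / cols,
    }
```

The point is exactly the one in facet 1: three numbers like these are far easier for a network to work with than hundreds of raw board cells.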

If we want to make some kind of AlphaTerminal, you would need:

  • a really fast engine that plays games against itself until the end of time,
  • a really good pre-processing step to reduce the required neural network complexity
  • a really good set of neural network parameters that fit the application
  • a lot more items on this list :rofl: deep learning is really difficult

You DON’T need:

  • replays

So now, to expand on the ideas of machine learning, there naturally are a lot more options to consider that might be more interesting to pursue than the aforementioned ones. One specific class that might be interesting is evolutionary algorithms; the most used example is genetic algorithms. Here you try to tune the decision making to optimize for a performance metric. Again you need a fast engine, where different families of genetic iterations play against each other and try to survive (for example, one family of EMP line, one of ping cannon, etc.), and where you iterate with mutations and parent-child mechanisms. Again, this would need:

  • a preprocessing stage
  • a genetic representation of decision making
  • a fast engine
  • an initialization with enough genetic diversity
  • a lot of matches within the gene pool
  • a good survival function to allow for convergence and prevent local optima
  • probably a lot more :rofl: still very difficult

Again, you DON’T need:

  • replays
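A minimal sketch of such a genetic loop, with survival selection, crossover, and mutation; the fitness function here is a toy stand-in for "matches won inside a fast self-play engine", and note that nothing in it touches a replay:

```python
import random

def evolve(fitness, genome_len=20, pop_size=30, generations=60, seed=0):
    """Minimal genetic algorithm: needs only a performance metric
    (`fitness`), standing in for win rate measured by self-play in a
    fast engine. No replays required. All parameters are toy values.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # survival function
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)      # parent-child mechanism
            cut = rng.randrange(genome_len)
            child = a[:cut] + b[cut:]            # crossover
            child[rng.randrange(genome_len)] ^= 1  # mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

# Toy metric: count of 1-bits, standing in for "games won in the gene pool".
best = evolve(fitness=sum)
```

Most of the real difficulty hides inside the fitness function: in Terminal it would mean simulating whole games, which is why the fast engine tops the list of requirements above.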

So now the question for the others considering (aspects of) machine learning is: why do you need access to all the high-level replays to pull it off? I’m the first to admit there are large gaps in my machine learning knowledge, so if someone can give a link or roughly describe the type of algorithm they want to implement where replays would help, I would love to read about it!

[edit] It is only when you hit send that you realize what a behemoth of a post this is. I hope you found the read interesting. If there are errors in my piece, please correct me.


#12

While my knowledge of machine learning is as incomplete as yours, I think you lack imagination on how we could use ML. You seem to imagine only a pure ML algo, but there are many ways of using ML as a helper feature of an algo.

You could also use ML to detect common strategies, if you are making an algo that reacts with counters to those strategies.

You could also use ML in a prediction phase: you can train a neural network on replays to predict the next moves of other algos. Even without ML, there are many ways of analyzing replays to produce data that you can use in your prediction phase.
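As an example of the non-ML end of that spectrum, even a crude frequency count over replays gives you a usable predictor. A sketch (the move encoding is made up for illustration):

```python
from collections import Counter, defaultdict

class MovePredictor:
    """Predict an opponent's next move from past replays by simple
    frequency counting: no neural network needed. Move names are
    hypothetical string labels."""

    def __init__(self):
        # For each observed move, count which moves followed it.
        self.history = defaultdict(Counter)

    def train(self, replays):
        """Each replay is one algo's moves, in order."""
        for moves in replays:
            for prev, nxt in zip(moves, moves[1:]):
                self.history[prev][nxt] += 1

    def predict(self, last_move):
        """Most frequently observed follow-up, or None if unseen."""
        counts = self.history.get(last_move)
        if not counts:
            return None
        return counts.most_common(1)[0][0]
```

A trained network would generalize to unseen states far better than a lookup table, but either way it is replays that feed the prediction phase.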

To me, the problem that makes your algos lose to @KauffK’s strategy is clearly not that the replays are accessible, but that your algo is extremely predictable. In the sense that both players choose their moves at the same time, Terminal is similar to rock-paper-scissors, and in both of those games, predictability is a huge strategic flaw.

This replay of Tiny_Oracle (in blue) vs Aelgoo (in red) is another example showing that predictability is a mistake: after turn 20, while blue had a huge core advantage, Aelgoo was able to predict Tiny_Oracle’s moves and thus was able to hold on for many turns.


#13

There are several assumptions in this statement that are false. I think you should take a step back and look at the point I am trying to make a case for here, not at how much knowledge I have about ML, as that is not the purpose of my post. I was trying to show that ML doesn’t need replays to work, by providing examples of approaches that don’t need replays, and by challenging others to provide examples that do need replays and justify making them available. These examples have to be strong enough to justify opening up the entire logged replay database despite the previously discussed downsides it brings.

I wouldn’t think that these examples would be sufficient to justify opening up the replay database. I certainly wouldn’t want to share the replays of my algos just for this reason, but that is just my opinion.

In a private discussion I had just now, I gave this example: if you came to me and said, “Hey, I am trying to program this neural network, and I would like to have the replays of your algos so that I can make a version that looks similar but is slightly better. I understand that I don’t need those replays to make machine learning work, but it would make my life a lot easier,” I’d be removing my algos from the ladder faster than you can say “Terminal”.

Actually, if you just want a large number of replays, we could opt for the anonymous replay option. That might be a solution that keeps the promise of this type of ML intact while removing the more questionable aspects of full replay-database access.


#14

Well, the debate here is about whether or not to close the replay database, not about opening it :stuck_out_tongue: (when something is established, the burden of argument falls on those who want to change things). But joking aside:

This here seems good to me. It avoids targeting ML (where you focus on one particular opponent).
I would still add an indicator for the quality of the match (one could want to avoid scanning through matches below a certain rank).
For example, it could be two letters, one for each algo’s rank: A -> top 5%, B -> top 10%, and so on…
This way you could still work with loads of replays, while sorting them by interest if you need to (in my example, AA matches would be matches only between top-5% algos).
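The two-letter grade is straightforward to compute. A sketch, where only the top-5% and top-10% cutoffs come from the suggestion in this post; the bands below those are my own guesses:

```python
def rank_letter(percentile):
    """Map an algo's leaderboard percentile (0 = best) to a letter.

    A = top 5%, B = top 10%, as suggested; the C/D/E cutoffs are
    invented for illustration.
    """
    bands = [(5, "A"), (10, "B"), (25, "C"), (50, "D")]
    for cutoff, letter in bands:
        if percentile <= cutoff:
            return letter
    return "E"

def match_grade(percentile_a, percentile_b):
    """Two-letter quality tag for a match, e.g. 'AA' = both top 5%."""
    return rank_letter(percentile_a) + rank_letter(percentile_b)
```

With a tag like this in the replay listing, filtering to "AA" matches would give exactly the high-interest subset described above.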


#15

I was planning on waiting until we had some official plans before discussing this more, but I’ll share my ideas on the topic since there is some decent discussion going on.

There are two problems, which I will address separately.

1. Replay copying
This strategy, used by KauffK in the global final, involves storing or otherwise accessing large amounts of replay data, identifying an enemy algo by its turn 0 move, and then trying to find a replay that beats that algo and copying it move for move. This strategy is an auto-win against certain algos, but has some issues.

To work, the enemy algo must have the following properties:

  • Identifiable turn 0 or 1
  • Very little or no randomness
  • Does not significantly alter its decisions based on the enemy’s turn 0 build.

It is also non-trivial to implement, and can confuse algos within the same series or algos with similar openers. Because of the many issues with this strategy, it is more a ‘tool in a toolbox’ that requires a lot of work to gain a large benefit from, especially after everyone is aware of its weakness. For these reasons, I am personally not incredibly worried about this strategy at this point, but I expect we are going to be doing something about it in coming weeks. We are still discussing it internally.
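The lookup at the heart of this strategy is mechanically simple; what makes it fragile is exactly the property list above. A sketch (replay fields and unit encodings are hypothetical, not KauffK's actual implementation):

```python
def turn0_signature(placements):
    """Canonical, order-independent key for a turn 0 build.
    Placements are (unit_type, x, y) tuples; the encoding is illustrative."""
    return tuple(sorted(placements))

def find_script(replay_library, enemy_turn0):
    """Find a stored replay in which the identified enemy lost, and
    return the winning side's moves to repeat verbatim.

    `replay_library` maps a turn 0 signature to a list of replay dicts
    with hypothetical keys 'enemy_won' and 'winning_moves'. Returns
    None when the opener is unknown or no winning replay exists,
    i.e. exactly when the enemy is unidentifiable, random, or adaptive.
    """
    sig = turn0_signature(enemy_turn0)
    for replay in replay_library.get(sig, []):
        if not replay["enemy_won"]:
            return replay["winning_moves"]
    return None
```

Any randomness or unrecognized opening makes the lookup come up empty, which is why the three properties listed above are all required for the auto-win.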

2. Protection of Ideas
This I see as the bigger problem. Releasing an algo that uses a new or unique and very effective structure causes it to be immediately viewable by many players. The core issue is that this incentivizes players who design such structures to keep their algos ‘hidden’ until immediately before a competition. The core issue is NOT that these ideas eventually make their way around, as this is inevitable. In any game where people come up with unique ideas, they become part of the ‘core gameplay’ after some amount of time, and it is beneficial to that creator for this process to take longer.

Thoughts on Proposed solutions
1. Anonymized replays.
This doesn’t seem to help either problem, in my mind. You can still ID algos based on their turn 0-1 build, even though it would take more work, and you can still copy algos without seeing their names. One thing I like about Terminal is the ‘identity’ of algos and players, and it’s cool to bump into a rival or top player on the leaderboard. Losing this sense of identity is something I wouldn’t like.

2. Delaying replays releasing by one week or so
This breaks the replay-copy strategy, as most algos update at least once a week. It’s also a minor band-aid for issue #2, but it doesn’t solve the root issue. I don’t see many downsides to it, so I do consider it a decent option.

3. Reducing availability of replays overall
I don’t like this solution that much. The problem with ‘protection of ideas’ is that the algo has to reveal itself eventually: as long as they are playing on the leaderboard, players are going to see their algo play. Ultimately, these sorts of unique ideas are going to make their way through Terminal, and I don’t think that limiting replays to slow that process down would be a net benefit to the platform.

4. Symmetrical randomization
Benefits:

  • Nearly impossible to identify enemy algo in most games
  • Keeps every match fresh and unique.
  • Encourages dynamic strategies that can handle a wide variety of unique game states
  • Rematches against the same algo actually make sense, which allows us to do things like double elimination tournaments or rematches in matchmaking, if we ever felt the need

Downsides:

  • It’s cool that many algos have their own ‘identity’, and this may take away from that if people move away from identifiable ‘core strategies’ that they build towards.
  • Dramatically changes the game. People would need to put work into their strategies to make them ideal in the new environment. A change like this should only go out during a ‘quiet period’ in Terminal, like a season transition, or maybe ‘halfway’ through, so people have time to prepare for the Global Comp.

5. Incentivizing consistent performance on the leaderboard
Providing some incentive for consistent leaderboard performance encourages people to upload their algo rather than wait until the last minute. This solves the main issue I see with idea protection, where users are encouraged to keep their algos off the leaderboard.

Note that these are my general thoughts on the current situation, and are not necessarily reflective of changes we plan to make. I am just sharing my current mindset here to encourage further discussion and give everyone a chance to contribute their ideas on these points.


#16

I also, personally, like the idea of Terminal being a many-dimensional game, with its currently held emphasis on data analysis and the use of data in general (which makes sense given its parent company :stuck_out_tongue: ), and I would be a little sad if the dimension of data analysis was removed entirely.
It opens the possibility for a lot of interesting meta plays, beyond just the direct/naive approach of choosing the best replay to copy, and even that relatively ham-fisted use of data has already sparked some interesting discussions on the pros, cons, and counterplays for such a technique.

And I feel that could be just the beginning; the replays are a treasure trove of data with lots of interesting potential, not just for straightforward supervised machine learning but, as mentioned, for several more interesting applications such as classification or prediction. And I suspect there are still further applications waiting to be discovered.

This might be my personal bias towards a certain flavor of programming, but I’m not great at writing lightning-fast paths or being the Van Gogh of base designs, but I am quite fond of data analysis and informed decision making. Being able to instill an algo with information from outside the current game is a nice third dimension in my opinion, and provides another avenue for programmers to show off.

I have no good rebuttal for the protection of ideas, other than my own opinion that the strength of an algo has to come from more than the structure alone, and that the processing and reasoning behind in-game decisions is not visible in the replays. Given the somewhat inevitable visibility of structures, it seems like the true strength is going to be in how each program uses the structure, as the base layout is revealed just by playing. Yes, the replays may reveal that a base has small adaptations to various situations, but re-implementing these from scratch and getting good results still requires a competent programmer.

However the game eventually changes, I feel it is important to preserve the multiple dimensions of design wherever possible. It would be a shame if the game collapsed into an “adaptive or bust” mode, with lots of unpredictable elements and where the fastest simulator is king.