Benchmark for "better" performance


Imagine you have 5-10 local testing algos.
Then you implement a significant change to your main strategy and want to test it.
You run it against all local algos, and it wins every match …
So, without watching every single replay, what information can you use to determine whether the change was significantly better (or worse) than the baseline?
Number of turns and final health seem logical, but they may also just reflect a riskier game, and they need some parsing to extract.
I often use the replay file size as a first factor.
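
For anyone who wants to try the file-size check, here is a minimal Python sketch. It rests on assumptions that are mine, not part of the game tooling: the baseline and changed runs are saved in two separate folders, the replay for a given opponent has the same file name in both, and the files end in `.replay`. The function names are made up for illustration.

```python
from pathlib import Path


def replay_sizes(folder, pattern="*.replay"):
    """Map each replay file name in `folder` to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(folder).glob(pattern)}


def compare_sizes(baseline_dir, candidate_dir, pattern="*.replay"):
    """Compare replay sizes for files present in both folders.

    Returns a list of (name, baseline_bytes, candidate_bytes, relative_change),
    sorted by file name. Files found in only one folder are skipped.
    """
    base = replay_sizes(baseline_dir, pattern)
    cand = replay_sizes(candidate_dir, pattern)
    rows = []
    for name in sorted(base.keys() & cand.keys()):
        b, c = base[name], cand[name]
        # Relative change versus baseline; undefined for an empty baseline file.
        change = (c - b) / b if b else float("nan")
        rows.append((name, b, c, change))
    return rows


if __name__ == "__main__":
    # Hypothetical folder names; point these at your own replay folders.
    for name, b, c, change in compare_sizes("replays_baseline", "replays_changed"):
        print(f"{name}: {b} -> {c} bytes ({change:+.1%})")
```

This only flags which matches changed the most, so you know which replays are worth opening first. Whether a bigger or smaller file is "better" still depends on your strategy.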