Terminal plays millions of matches each season, and we currently store every one of these replays. That adds up to many terabytes of data that very few people need.
We are planning on deleting all replays that are not part of a competition match very soon. If anyone wants to save a replay before then, it can be downloaded from the ‘watch’ link.
We are aware that using a diff-based replay format would dramatically reduce the size of replays. If we ever convert to such a system in the future, we will reconsider holding onto old replays.
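For anyone curious what "diff-based" could mean in practice, here is a minimal sketch. It is not the actual replay format: it assumes each frame can be represented as a flat dictionary, and the function names (`diff_frames`, `apply_diff`, `encode_replay`, `decode_replay`) and the `"set"`/`"del"` keys are made up for illustration. The idea is to store the first frame in full and, for every later frame, only the keys that changed.

```python
def diff_frames(prev, curr):
    """Describe how to turn `prev` into `curr` (shallow, key-by-key comparison)."""
    changed = {k: v for k, v in curr.items() if k not in prev or prev[k] != v}
    removed = [k for k in prev if k not in curr]
    return {"set": changed, "del": removed}


def apply_diff(prev, diff):
    """Rebuild the next frame from the previous frame and a diff."""
    frame = dict(prev)
    frame.update(diff["set"])
    for key in diff["del"]:
        frame.pop(key, None)
    return frame


def encode_replay(frames):
    """Encode a list of frames as a list of diffs, starting from an empty frame."""
    encoded = []
    prev = {}
    for frame in frames:
        encoded.append(diff_frames(prev, frame))
        prev = frame
    return encoded


def decode_replay(encoded):
    """Reverse encode_replay by applying each diff in order."""
    frames = []
    state = {}
    for diff in encoded:
        state = apply_diff(state, diff)
        frames.append(state)
    return frames
```

The comparison is shallow, so a nested value that changes slightly is stored again in full. How much this would save depends entirely on how much of each frame stays the same from one turn to the next.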
I don’t really care about these old replays, but you might consider zipping the files to save space. I did this with 1000 replays and it reduced the total size to about 3% of the original size. The ratio is about the same with just a single replay file.
gzip, bzip2, or xz, at least in benchmarks, have a better compression ratio than zip, with xz and 7zip being the best to my knowledge. Granted, bzip2, gzip and xz don’t have archive capabilities, so for multiple files you need to wrap them in a tar. Honestly, I’ve very rarely seen a straight .bz2, .gz, or .xz file; usually they’re wrapped in a tar. It would be interesting to compare.
I don’t know how useful the older season replays would be, but for the current season, wouldn’t it be possible to train an ML algo on them? ML isn’t something I’ve really looked at for my algo, but there was a lot of talk about it in previous seasons.
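On the compression comparison, a quick way to check this against real replay files is a small script using only the Python standard library. This is a rough sketch, not a benchmark: `compression_ratios` and `pack_tar_xz` are names made up here, and the results will depend on the actual replay contents. Note that zip and gzip both normally use DEFLATE, so `zlib` stands in for both.

```python
import bz2
import io
import lzma
import tarfile
import zlib


def compression_ratios(data):
    """Return compressed size / original size for each codec (lower is better)."""
    original = len(data)
    return {
        "deflate": len(zlib.compress(data, 9)) / original,
        "bzip2": len(bz2.compress(data, 9)) / original,
        "xz": len(lzma.compress(data, preset=9)) / original,
    }


def pack_tar_xz(files):
    """Bundle a {name: bytes} mapping into one in-memory .tar.xz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()
```

To try it, read a downloaded replay with `open(path, "rb").read()` and pass the bytes to `compression_ratios`. Wrapping many replays in one tar before compressing lets the compressor reuse patterns across files, which is one reason tar.xz often does better than compressing each file separately.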
Replays are stored zipped, though we are not getting close to that level of compression. I will look into this and see if it can be optimized for the future.
Will look into this