Hi folks, We recently had a Jira and PR created around allowing sha256 to be used instead of (or as well as) md5 for script caching/naming purposes. The relevant issue is:
https://issues.apache.org/jira/browse/GROOVY-11459 By way of background, often md5 is not a good choice when a hack threat is possible. That's not really the scenario for our usage, we only use hashing to generate a unique name for scripts without a filename or as a caching index. The sha256 algorithm does offer slightly better collision avoidance which would be a good thing for the caching scenario. Having said that, we'd not want to have a big performance hit to gain a little collision avoidance. I'll also point out that this caching isn't applicable to normally compiled sources, just when using GroovyClassLoader directly (or indirectly via GroovyScriptEngine), so most users wouldn't be impacted anyway. My initial thinking was that we'd make the algorithm configurable and allow sha256 to be swapped in for folks that needed it. But in my tests, sha256 is very close to md5, so having the default be md5 possibly comes into play. I also tested murmur3 and xxHash algorithms but at least for the all-Java implementations that I tried, they didn't offer any useful improvement for us. An option for Groovy 5 is to make sha256 the default with the ability to swap back to md5 with a system property. And we could back port the configurability part to Groovy 4 but keep the default as is (md5). But for us to consider this, I think we'd need to do some more testing. My current testing is based around just the 12 sample files that the performance testing uses. Does anyone else have thoughts or have the ability to help out with further testing? (See more information in the Jira). Thanks, Paul.