* xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random.
Just a correction: xxh128 does try to be random, and it tries hard: a significant amount of development effort is spent on ensuring this property. It’s even tested with PractRand, and it could be used as a good random number generator.

Being non-cryptographic means that what it doesn’t try to do is guarantee that no one can intentionally forge a hash collision between 2 different files (other than by brute force, which is impractical). But that’s a different property, and I wouldn’t call it “randomness”, even though randomness is a prerequisite for collision resistance (though not sufficient in itself).

From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sunday, February 25, 2024 at 10:25 PM
To: Pádraig Brady <p...@draigbrady.com>, Bruno Haible <br...@clisp.org>, bug-gnulib@gnu.org <bug-gnulib@gnu.org>, Coreutils <coreut...@gnu.org>
Cc: Yann Collet <c...@meta.com>
Subject: Re: sort dynamic linking overhead

On 2023-10-09 06:48, Pádraig Brady wrote:
> An incremental patch attached to use xxhash128 (0.8.2)
> shows a good improvement (note avx2 being used on this cpu):

xxhash128 is not a cryptographic hash function, so it doesn't attempt to be random. Of course most people won't care - it's random "enough" - but it would be a functionality change. blake2 is cryptographic and would be random, but would bloat the 'sort' executable with code that's hardly ever used.

To attack the problem in a more conservative way, I installed the attached patch into coreutils. With it, 'sort -R' continues to use MD5 but on GNUish platforms 'sort' links libcrypto dynamically only if -R is used (Bruno's suggestion). This doesn't significantly affect 'sort -R' performance, and reduces the startup overhead of plain 'sort' to be what it was before we started passing -lcrypto to gcc by default (in coreutils 8.32).

I also toyed with changing MD5 to SHA512, but that hurt performance.

For what it's worth, although I tested with an Intel Xeon W-1350, which supports SHA-NI as well as various AVX-512 options, I didn't see where libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes advantage of these special-purpose instructions.
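
For readers unfamiliar with the "link libcrypto dynamically only if -R is used" idea, here is a minimal sketch of lazy loading with dlopen/dlsym. It is not the patch that was installed in coreutils. The helper names lazy_symbol and get_md5 are hypothetical, and the soname "libcrypto.so.3" is an assumption that varies by platform and OpenSSL version.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Signature of OpenSSL's one-shot MD5 function, declared locally so that
   no OpenSSL header and no link-time -lcrypto dependency is needed. */
typedef unsigned char *(*md5_fn)(const unsigned char *data, size_t len,
                                 unsigned char *digest);

/* Resolve SYMBOL from the shared library SONAME at run time.
   Return NULL if the library or the symbol is unavailable.
   (Hypothetical helper, for illustration only.) */
void *
lazy_symbol(const char *soname, const char *symbol)
{
    assert(soname != NULL && symbol != NULL);

    void *handle = dlopen(soname, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    void *sym = dlsym(handle, symbol);
    if (sym == NULL)
        dlclose(handle);   /* nothing useful found; drop the library */
    return sym;
}

/* Look up MD5 on first use only, e.g. when -R has been requested.
   The soname below is an assumption, not taken from the patch. */
md5_fn
get_md5(void)
{
    static md5_fn cached;
    if (cached == NULL)
        cached = (md5_fn) lazy_symbol("libcrypto.so.3", "MD5");
    return cached;
}
```

With this pattern, a program that never calls get_md5 pays no startup cost for resolving libcrypto, and a caller has to handle a NULL return when the library is missing. On older glibc versions the program may need to be linked with -ldl.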