On 13/04/2025 19:47, to...@tuxteam.de wrote:
On Sun, Apr 13, 2025 at 07:03:56PM +0200, Maxime Devos wrote:
[...]
Typical hashing of the non-cryptographic kind aren't designed to
virtually eliminate hash collisions [...]
Nit: given a "reasonable" hash, the collision probability should be
the same for crypto or non-crypto hash (for the same width, for "random"
input). [...]
Many hashes aren't reasonable, then.When looking at documents describing
the quality of (non-crypto) string hash functions, you can sometimes see
discussion about how certain hash values are unreachable and how to
choose parameters such that hashes are more 'spread out' (and hence,
making collisions less common), trade-offs between speed of hashing and
risk of collision ...
For a simple (non-string) example, consider e.g. a situation where a
hash in [0, M] is available, but you need a hash in range [0,N] where
'N<M'. Doing 'old hash % N' brings it in the correct range, but can
often seriously increase the risk of collisions in a way that cannot
solely be accounted for by the decrease in range (specifics depend on
the distribution of the initial hash function, and relation between N
and M).
Best regards, Maxime Devos