On Wed, May 23, 2012 at 12:44 AM, Branko Čibej <br...@apache.org> wrote:
>...
> I'd really like to see you explain why this change of yours (33 -> 33^4)
> is relevant in practice. It's not at all clear that this multiplier
> gives a better key distribution than the time-honoured 33.

Actually, there are some reasoned/studied arguments for 33 ("it works
well, but nobody knows why"). And 33^4 is likely a poor replacement
:-P

For PoCore's hash table[1], I did a survey of the research around
hashing functions. I selected the FNV-1 hash function:
  http://www.isthe.com/chongo/tech/comp/fnv/

Comparisons of functions are here:
  http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx

The multiply-by-33 variety is listed there as the "Bernstein hash".
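
For reference, here is a rough sketch of the two functions being compared:
the multiply-by-33 (Bernstein) hash and 32-bit FNV-1. The function names
are made up for illustration; this is not the PoCore or APR code. The FNV
constants are the published 32-bit offset basis and prime. The Bernstein
seed of 0 is an assumption (some variants start from 5381 instead).

```c
#include <stddef.h>
#include <stdint.h>

/* Bernstein-style hash: h = h * 33 + byte.
 * Hypothetical name; the seed of 0 is an assumption. */
uint32_t bernstein_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 0;
    size_t i;

    for (i = 0; i < len; i++)
        h = h * 33u + p[i];

    return h;
}

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the byte.
 * Hypothetical name; constants are the standard 32-bit FNV values. */
uint32_t fnv1_hash32(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;      /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h *= 16777619u;            /* FNV prime */
        h ^= p[i];
    }

    return h;
}
```

Both take a pointer and a length, so they work on keys that are not
NUL-terminated. Note that FNV-1 multiplies first and XORs second; the
FNV-1a variant swaps those two steps.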

> It's my considered opinion that this fiddling around with hash function
> implementations is way overboard. Just use apr_hashfunc_default already.
> Unless you can prove that using your "optimized" version results in
> significant savings in space and/or time, anything else is just piling
> on more lines of code that need to be maintained for no good reason.

I'm assuming Stefan ran some tests, and (iirc) saw an improvement of a
few percent. For that, maybe a new hash function is okay. (It isn't like
he built a whole new type; just a new func.)

Cheers,
-g

[1] http://pocore.googlecode.com/svn/trunk/src/hash.c
