On 19/03/2010 21:13, Robert Dewar wrote:
> Jae Hyuk Kwak wrote:
> 
>> For the issue of Modulus operation, it does not need to be divide or
>> hardware modulus operation.

> Yes, of course, we all know that, and of course the compiler does
> these simple optimizations. However, for computing hashes you
> typically want modulus with OTHER than a power of 2 to involve
> all bits.

  Please, nobody bring up the old saw that prime numbers are a good choice for
hashtable sizes.  They aren't, they're a crude workaround for a poor hash
function.

>> For example, we have 257 cases on a switch statement, then we need a
>> hash table whose bucket size is 512.
>> It will have 50% filled hash table, but 50% filled hash table is usual.
> 
> Again, yes, of course, but typically power of two size for a hash table
> followed by an and that takes only the low order bits is likely to be
> a very poor hash choice.

  Only if there's something wrong with your hash function in the first place.
 If your hash has decent diffusion, all the output bits should be equally
affected by all the input bits, and it should make no difference which ones
you use.

> Again, you can assume we all know the elementary algorithms for
> managing hash tables.

  I think you need to refresh your memory about some of the more advanced
points though.

> One thing to realize is that hashing only has good average performance.
> So your O(N) is talking about average case performance, whereas the
> O(NlogN) for a tree search is worst case. That's comparing apples and
> oranges. It is worrisome to have the possibility of really bad
> performance for a particular bad luck case.

  That's not how minimal perfect hashing works, which was one of the main
suggestions.

    cheers,
      DaveK
-- 
(*) - http://burtleburtle.net/bob/hash/evahash.html
    - http://www.burtleburtle.net/bob/hash/doobs.html
    - http://burtleburtle.net/bob/hash/index.html

Reply via email to