Here are two useful references:

    http://bretm.home.comcast.net/hash/
    http://burtleburtle.net/bob/hash/
RE computation cost: Generalized integer multiplication and integer division both used to be expensive. On modern processors a multiplication is fully pipelined, with latency comparable to an L1 data cache hit (~2 to 4 cycles). Most modern implementations of division run the divide circuit at a multiple of the CPU clock (typically 2x) and implement early out. Further, division is nearly always an autonomous, non-pipelined logic block that, once initiated, runs asynchronously until it is ready to deliver its result. Bottom line: on any contemporary out-of-order x86 chip from either Intel or AMD the cost of either of these instructions is negligible.

RE general hashing concepts: I find it helpful to think of a hash table lookup as proceeding in steps.

The first step is to generate a fixed-size digest of the (possibly variable-length) key. On a 32-bit machine this digest is most commonly a 32-bit (unsigned) integer. The quality of a digest is related to how much of the entropy in the key it preserves.

The second step is to map the digest to a bucket selector. Nearly universally this is a modulo operation: taking the remainder of the digest divided by the number of buckets. A particularly poor approach is to use a power-of-2 number of buckets so as to reduce the modulo operation to simple masking; the reason is that this discards all of the entropy in the masked-out bits. A _much_ better approach is to pick the number of buckets to be a prime number -- the remainder of division by a prime is influenced by _every_ bit in the digest. (When I need to implement a dynamic hash table I always include a statically initialized array of primes chosen to give me the growth pattern I desire.) Minimal sketches of both steps follow.
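As a concrete illustration of the first step, here is a minimal sketch of a digest function. FNV-1a is used purely as one well-known example; the point is only that every byte of the key should influence the 32-bit result:

    #include <stdint.h>
    #include <stddef.h>

    /* 32-bit FNV-1a: folds every byte of the key into the digest, so
       entropy from the whole key is preserved in the result. */
    uint32_t digest32(const void *key, size_t len)
    {
        const unsigned char *p = key;
        uint32_t h = 2166136261u;          /* FNV offset basis */

        while (len--) {
            h ^= *p++;                     /* mix in one key byte */
            h *= 16777619u;                /* FNV prime */
        }
        return h;
    }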
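And a sketch of the second step, mapping the digest to a bucket. The growth table here is purely illustrative (primes near successive powers of two, giving roughly 2x growth per rehash); pick whatever schedule suits your table:

    #include <stdint.h>
    #include <stddef.h>

    /* Illustrative growth schedule: primes near successive powers of
       two, so each rehash roughly doubles the bucket count. */
    static const uint32_t bucket_primes[] = {
        53, 97, 193, 389, 769, 1543, 3079, 6151,
        12289, 24593, 49157, 98317
    };

    /* Remainder modulo a prime bucket count: every bit of the digest
       influences which bucket is selected. */
    static size_t bucket_index(uint32_t digest, size_t n_buckets)
    {
        return digest % n_buckets;
    }

    /* Smallest prime capacity that keeps the load factor <= 1. */
    static size_t pick_capacity(size_t n_items)
    {
        size_t n = sizeof bucket_primes / sizeof bucket_primes[0];
        for (size_t i = 0; i < n; i++)
            if (bucket_primes[i] >= n_items)
                return bucket_primes[i];
        return bucket_primes[n - 1];       /* table exhausted: clamp */
    }

/john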