On Tue, Aug 03, 2010 at 09:53:27PM -0400, Jacob Todd wrote:
In K&R, chapter 6, section 6, there is a funtion called hash that hashes a
string, which will be stored in a linked list. The function in question is on
page 144, but here it is for those of you who don't have K&R handy.

        /* hash:  form hash value for string s */
        unsigned
        hash(char *s)
        {
                unsigned hashval;
                
                for(hashval = 0; *s != '\0'; s++)
                        hashval = *s + 31 * hashval;
                return hashval % HASHSIZE;
        }

So what is the purpose of the magic 31? The only thing that I think might be a
reference to what it may be for is the paragraph prior, which states

        The hashing function, ..., adds each character value in the string to a
        scrambled combination of the previous ones and returns the remainder
        modulo the array size.

Does the magic 31 have to do with scrambling?

It's just an arbitrary number that happens to perform well in that particular hash function. There isn't much science to hash functions, but some of them prove to have good performance and good distribution, others, not so. I tend to use Dan Bernstein's hash function, which uses 33. It's not magic, it's just been shown to work well. Although, in this case, I suspect that it was chosen because it optimizes to a particularly fast addition/binary shift, but so do a lot of other choices.

static ulong
hash(const char *str) {
        ulong h;

        h = 5381;
        while (*str != '\0') {
                h += h << 5; /* h *= 33 */
                h ^= *str++;
        }
        return h;
}

--
Kris Maglione

You talk when you cease to be at peace with your thoughts.
        --Khalil Gibran


Reply via email to