Thomas Gleixner wrote: > I'm not a hashing wizard and I completely failed to understand why > hash_long/ptr are so horrible for the various test cases I ran.
It's very simple: the constants chosen are bit-sparse, *particularly* in the least significant bits, and only 32/64 bits of the product are kept. Using the high-word of a double-width multiply is even better, but some machines (*cough* SPARCv9 *cough*) don't have hardware support for that. So what you get is: (0x9e370001 * (x << 12)) & 0xffffffff = (0x9e370001 * x & 0xfffff) << 12 = (0x70001 * x & 0xfffff) << 12 *Now* does it make sense? 64 bits is just as bad... 0x9e37fffffffc0001 becomes 0x7fffffffc0001, which is 2^51 - 2^18 + 1. The challenge is the !CONFIG_ARCH_HAS_FAST_MULTIPLIER case, when it has to be done with shifts and adds/subtracts. Now, what's odd is that it's only relevant for 64-bit platforms, and currently only x86 and POWER7+ have it. SPARCv9, MIPS64, ARM64, SH64, PPC64, and IA64 all have it turned off. Is this a bug that should be fixed? In fact, do *any* 64-bit platforms need multiply emulation? How many 32-bit platforms nead a multiplier that's easy for GCC to evaluate via shifts and adds? Generlly, by the time you've got a machine grunty enough to need 64 bits, a multiplier is quite affordable. Anyway, assuming there exists at least one platform that needs the shift-and-add sequence, it's quite easy to get a higher hamming weight, you just have to use a few more registers to save some intermediate results. E.g. u64 x = val, t = val, u; x <<= 2; u = x += t; /* val * 5 */ x <<= 4; /* val * 80 */ x -= u; /* val * 75 = 0b1001011 */ Shall I try to come up with something? Footnote: useful web pages on shift-and-add/subtract mutliplciation http://www.vinc17.org/research/mulbyconst/index.en.html http://www.spiral.net/hardware/multless.html