On Wed, Apr 06, 2016 at 20:23:42 +0200, Paolo Bonzini wrote:
> On 06/04/2016 19:44, Emilio G. Cota wrote:
> > I like this idea, because the ugliness of the sizeof checks is significant.
> > However, the quality of the resulting hash is not as good when always using
> > func5.
> > For instance, when we'd otherwise use func3, two fifths of every input
> > contain exactly the same bits: all 0's. This inevitably leads to more
> > collisions.

I take this back. I don't know anymore what I measured earlier today--it's been
a long day and I was juggling quite a few things.

I essentially see the same chain lengths (within 0.2%) for either function, i.e.
func3 or func5 with the padded 0's, when running arm-softmmu. So this is good
news :>

> Perhaps better is to always use a three-word xxhash, but pick the 64-bit
> version if any of phys_pc and pc are 64-bits. The unrolling would be
> very effective, and the performance penalty not too important (64-bit on
> 32-bit is very slow anyway).

By "the 64-bit version" you mean what I called func5? That is:

	if (sizeof(phys_pc) == sizeof(uint64_t) || sizeof(pc) == sizeof(uint64_t))
		return tb_hash_func5();
	return tb_hash_func3();

or do you mean xxhash64 (which I did not include in my patchset)?

My tests with xxhash64 suggest that the quality of the results does not improve
over xxhash32, and the computation takes longer (it's more instructions); not
much, but measurable.

So we should probably just go with func5 always, as you suggested initially.
If so, I'm ready to send a v2.

Thanks,

		Emilio
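
For illustration, the two options discussed above (always hashing five 32-bit
words with zero padding, versus selecting a three-word variant via sizeof
checks) could be sketched roughly as below. This is only a sketch under stated
assumptions, not the code from the patchset: the sketch_* names, the typedefs
and the argument order are hypothetical, and mix_words() is a toy mixer that
borrows xxhash32's prime constants but is not xxhash itself.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the target-dependent address types. */
typedef uint64_t sketch_phys_addr_t;   /* assumption: 64-bit physical PC */
typedef uint32_t sketch_vaddr_t;       /* assumption: 32-bit guest PC */

static inline uint32_t sketch_rol32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32 - r));
}

/* Toy mixer over n 32-bit words; NOT the real xxhash32. */
static uint32_t mix_words(const uint32_t *w, int n)
{
    uint32_t h = 374761393u;

    for (int i = 0; i < n; i++) {
        h += w[i] * 3266489917u;
        h = sketch_rol32(h, 17) * 668265263u;
    }
    /* final avalanche */
    h ^= h >> 15;
    h *= 2246822519u;
    h ^= h >> 13;
    h *= 3266489917u;
    h ^= h >> 16;
    return h;
}

/* Five-word variant: both addresses are widened to 64 bits, so a 32-bit
 * value contributes an all-zero upper word (the "padded 0's"). */
static uint32_t sketch_hash5(uint64_t phys_pc, uint64_t pc, uint32_t flags)
{
    uint32_t w[5] = {
        (uint32_t)phys_pc, (uint32_t)(phys_pc >> 32),
        (uint32_t)pc,      (uint32_t)(pc >> 32),
        flags,
    };
    return mix_words(w, 5);
}

/* Three-word variant: only valid when both addresses fit in 32 bits. */
static uint32_t sketch_hash3(uint32_t phys_pc, uint32_t pc, uint32_t flags)
{
    uint32_t w[3] = { phys_pc, pc, flags };
    return mix_words(w, 3);
}

/* The sizeof-based selection; the compiler resolves the condition at
 * build time, so only one branch survives in the generated code. */
static uint32_t sketch_hash(sketch_phys_addr_t phys_pc, sketch_vaddr_t pc,
                            uint32_t flags)
{
    if (sizeof(phys_pc) == sizeof(uint64_t) || sizeof(pc) == sizeof(uint64_t)) {
        return sketch_hash5(phys_pc, pc, flags);
    }
    return sketch_hash3((uint32_t)phys_pc, (uint32_t)pc, flags);
}
```

Going with the five-word variant unconditionally amounts to dropping
sketch_hash() and calling sketch_hash5() directly, which removes the sizeof
checks at the cost of hashing up to two constant-zero words.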