On Wed, Apr 06, 2016 at 13:52:21 +0200, Paolo Bonzini wrote:
>
>
> On 06/04/2016 02:52, Emilio G. Cota wrote:
> > +static inline uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e,
> >                                       int seed)
>
> I would keep just this version and unconditionally zero-extend to
> 64-bits.  The compiler is able to detect the high 32 bits are zero, drop
> the more expensive multiplications and constant fold everything.
>
> For example if you write
>
>    unsigned tb_hash_func(uint32_t phys_pc, uint32_t pc, int flags)
>    {
>        return tb_hash_func5(phys_pc, pc, flags, 1);
>    }
>
> and check the optimized code with -fdump-tree-optimized you'll see that
> the rotated v1, the rotated v3 and the 20 merge into a single constant
> 1733907856.

I like this idea, because the ugliness of the sizeof checks is significant.

However, the quality of the resulting hash is not as good when always
using func5. For instance, when we'd otherwise use func3, two fifths of
every input contain exactly the same bits: all 0's. This inevitably leads
to more collisions.

Performance (for the Debian ARM bootup test) gets up to 15% worse.

		Emilio
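
To make the "two fifths" point concrete, here is a minimal sketch. It is not part of the patch: split_words(), count_zero_words() and zero_words_for_32bit_inputs() are hypothetical helpers, and the sketch assumes func5 consumes its inputs as five 32-bit words (low and high halves of a0 and b0, then e). When three 32-bit values are zero-extended into that layout, the two high words are zero for every input.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: view the func5-style inputs (two 64-bit values
 * plus one 32-bit value) as five 32-bit words. */
static void split_words(uint64_t a0, uint64_t b0, uint32_t e, uint32_t w[5])
{
    w[0] = (uint32_t)a0;          /* low half of a0 */
    w[1] = (uint32_t)(a0 >> 32);  /* high half of a0 */
    w[2] = (uint32_t)b0;          /* low half of b0 */
    w[3] = (uint32_t)(b0 >> 32);  /* high half of b0 */
    w[4] = e;
}

/* Hypothetical helper: count how many of the five words are zero. */
static int count_zero_words(const uint32_t w[5])
{
    int n = 0;
    for (int i = 0; i < 5; i++) {
        if (w[i] == 0) {
            n++;
        }
    }
    return n;
}

/* Feed three 32-bit inputs through the 64-bit layout. The uint32_t
 * arguments are implicitly zero-extended to uint64_t on the call. */
int zero_words_for_32bit_inputs(uint32_t phys_pc, uint32_t pc, uint32_t flags)
{
    uint32_t w[5];

    split_words(phys_pc, pc, flags, w);
    return count_zero_words(w);
}

/* Small self-check: with all three inputs non-zero, exactly the two
 * high words are zero. */
void zero_words_demo(void)
{
    assert(zero_words_for_32bit_inputs(0x1000, 0x2000, 1) == 2);
}
```

With non-zero 32-bit inputs the count is always 2, so two of the five words fed to the hash carry no information. This only illustrates the input layout; it says nothing about how much the hash quality degrades in practice.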