I appreciate you spending the time to look into this, Alex. Thanks. This approach sounds reasonable to me. My only concern is that if we remove the test, then we will only be testing quite large (256-byte) inputs, and nothing smaller. I wonder if it would be useful to have a test on a smaller input that is still larger than 32 bits (perhaps 128 or 256 bits).
On 26 February 2015 at 10:44, Alex Wang <al...@nicira.com> wrote:
> Hey All,
>
> I'm helping Joe to look at this issue.
>
> Here is an excerpt of Joe's comment on his original patch:
> """
> My thoughts are converging towards this: The test is very strict,
> particularly for inputs that have little variance (e.g., a single bit
> set in the whole input). The hash function itself has been thoroughly
> tested by the original developers at Google, and I shouldn't have done
> anything to decrease the hash quality. I'm inclined to push the
> testsuite change that modifies the seed to something which passes on
> both big- and little-endian systems.
> """
>
> It seems to me that the hash function is meant to compute hashes for
> very long words (longer than 128 bits, e.g. struct flow), and we
> already have a test checking the 1-bit-set case for the hash of a
> 16*128-bit word. So, could we just remove this failing test that uses
> uint32_t as input and add a comment warning users that the function
> should only be called with >128-bit words?
>
> The patch is sent out here:
> http://openvswitch.org/pipermail/dev/2015-February/051727.html
>
> Thanks,
> Alex Wang
>
> On Tue, Dec 30, 2014 at 10:28 AM, Andrew Mann <and...@divvycloud.com> wrote:
>>
>> For what it's worth, Bob Jenkins has some good discussion around this,
>> and some code that can help measure such properties of hashes:
>>
>> http://burtleburtle.net/bob/hash/index.html
>>
>> Specifically, the "tests" section may offer a good start for measuring
>> the characteristics of the hash output with quantitative results.
>>
>> On Mon, Dec 29, 2014 at 3:08 PM, Ben Pfaff <b...@nicira.com> wrote:
>>>
>>> On Mon, Dec 22, 2014 at 03:35:22PM -0800, Joe Stringer wrote:
>>> > Previously, when using the 128-bit hash in conjunction with the
>>> > 32-bit hash tables, we would ignore the upper 96 bits rather than
>>> > attempting to redistribute the hash across the 32-bit output space.
>>> > This patch adds a new function to translate the hash down from 128
>>> > bits to 32 bits for this use case.
>>>
>>> Suppose that we had 128-bit random numbers instead of 128-bit hashes.
>>> Then, if combining the 32-bit pieces of that random number gave us a
>>> higher-quality random 32-bit number than just taking any one 32-bit
>>> piece, it would mean that the random numbers weren't very random.
>>>
>>> By analogy, I think that this patch (without reading it) should only
>>> make a difference if the 128-bit hash isn't very high-quality. If so,
>>> it might be better to consider improving our 128-bit hash function
>>> instead of the approach taken here.
>>> _______________________________________________
>>> discuss mailing list
>>> discuss@openvswitch.org
>>> http://openvswitch.org/mailman/listinfo/discuss
>>
>> --
>> Andrew Mann
>> DivvyCloud Inc.
>> www.divvycloud.com