On 3 December 2014 at 01:58, Mijo Safradin <m...@linux.vnet.ibm.com> wrote:
>
>
>
> On Thu, 27 Nov 2014, Ben Pfaff wrote:
>>
>>
>> I'd suggest doing a "git bisect" run to find the point at which tests
>> started failing due to endian problems.
>>
>
> Hi Ben,
>
> there's a new failing test which recently passed.
>
>     23: test hash functions
>
> [root@ppc64 ovs-test]# git rev-parse HEAD
> 1cea007c92e6b33f194fd7acbba9795e03b10a19
>
> 'git bisect' result
>
> [root@ppc64 ovs-test]# git bisect bad
> 468cdd91c3bc21acde1d98250def7d9330204b8e is the first bad commit
> commit 468cdd91c3bc21acde1d98250def7d9330204b8e
> Author: Joe Stringer <joestrin...@nicira.com>
> Date:   Tue Aug 12 11:12:12 2014 +1200
>
>     hash: Add 128-bit murmurhash.
>
>     Add the 128-bit murmurhash by Austin Appleby, r150 from:
>     http://code.google.com/p/smhasher/source/browse/trunk/MurmurHash3.cpp
>
>     Signed-off-by: Joe Stringer <joestrin...@nicira.com>
>     Acked-by: Ben Pfaff <b...@nicira.com>
>
> :040000 040000 42bf320cfdb046d26a05afa0c024bc0e3268cb4b 
> 47f922348a915fdcf0f5100438e7d064664d5152 M      include
> :040000 040000 848bc72a466002cb9eaa7b927cd590c46b6816d6 
> 875f8d5945fb539fce1b4db8a5fd6bf8077b3be9 M      lib
> :040000 040000 7f8b92f26c8fa472a5590d668136e0cfe65a5803 
> 3f6578898266df4e8b06b570a5fb7335a86440c5 M      tests
> [root@lnxpdcn1 ovs-test]# vim tests/testsuite.log
> [root@lnxpdcn1 ovs-test]# git bisect reset
> Previous HEAD position was 468cdd9... hash: Add 128-bit murmurhash.
> Switched to branch 'master'


Apologies for the delayed reply, thanks for bisecting this.

I think there's a few problems here:

Firstly, there are two tests that are performed on this hash function.
The first will exit(1) immediately if there is a collision; this
prevents us from seeing how many collisions there are in the first
test, and prevents the second test from being performed. This reduces
the amount of useful information we can get in the logs.

Next, the first test takes 32-bit inputs with a single bit set, hashes
them to 128 bits, then discards 96 bits to provide a 32-bit output.
Discarding these bits means we are more reliant on the remaining 32
bits than if we had a proper hash finish function to redistribute the
'randomness' down to 32 bits. I've briefly experimented with some
alternatives for a 128bit->32bit finish function and ended up with the
version in the patch below, which should improve the situation. This
is by no means a comprehensive attempt at finding an optimal hash
finish function.

I think that the test is fairly aggressive, in that any 12-bit
collision across any combination of inputs with a single bit set will
cause a failure; this combined with the short input makes it prone to
fail. It seems like most basis values will cause a few 12-bit
collisions, generally around 1 or 2 but perhaps as many as 5. From
briefly looking into these, it seems that as many as 15 or 16 bits may
collide out of the 32. The below patch selects a magic number that
seems to pass in most conditions I could test, but I wonder if we
should document the 128-bit hash function to discourage using <32 bits
of the output value when input is small (less than the size of the
hash).

Would you be able to test the patch below and tell me if it fixes the
issue on your platform and provide a log if it doesn't?


--8<--------------------------cut here-------------------------->8--

From: Joe Stringer <joestrin...@nicira.com>
Date: Mon, 22 Dec 2014 13:02:56 -0800
Subject: [PATCH] hash: Improve 128-bit hash to 32-bit output.

Previously, when using the 128-bit hash in conjunction with the 32-bit
hash tables, we would ignore the upper 96 bits rather than attempting to
redistribute the hash across the 32-bit output space. This patch adds a
new function to translate the hash down from 128 bits to 32 bits for
this use case.

XXX: Also modify the u64 accessors for ovs_u128 on big_endian systems.
XXX: Also don't exit immediately on hash test failure.

Signed-off-by: Joe Stringer <joestrin...@nicira.com>
---
 include/openvswitch/types.h   |    4 ++++
 lib/dpif-netdev.c             |    2 +-
 lib/hash.h                    |   11 +++++++++++
 ofproto/ofproto-dpif-upcall.c |    2 +-
 tests/test-hash.c             |    5 ++---
 5 files changed, 19 insertions(+), 5 deletions(-)


diff --git a/include/openvswitch/types.h b/include/openvswitch/types.h
index 2afb7b7..efd44fc 100644
--- a/include/openvswitch/types.h
+++ b/include/openvswitch/types.h
@@ -84,7 +84,11 @@ typedef struct {
 typedef union {
     uint32_t u32[4];
     struct {
+#ifdef WORDS_BIGENDIAN
+        uint64_t hi, lo;
+#else
         uint64_t lo, hi;
+#endif
     } u64;
 } ovs_u128;

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 890870c..2cf41dd 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1162,7 +1162,7 @@ static void dp_netdev_flow_unref(struct
dp_netdev_flow *flow)
 static uint32_t
 dp_netdev_flow_hash(const ovs_u128 *ufid)
 {
-    return ufid->u32[0];
+    return hash128_to_32(ufid);
 }

 static void
diff --git a/lib/hash.h b/lib/hash.h
index c2820dd..818fcca 100644
--- a/lib/hash.h
+++ b/lib/hash.h
@@ -35,6 +35,7 @@ hash_rot(uint32_t x, int k)
 uint32_t hash_bytes(const void *, size_t n_bytes, uint32_t basis);
 void hash_bytes128(const void *_, size_t n_bytes, uint32_t basis,
                    ovs_u128 *out);
+uint32_t hash_128_to_32(const ovs_u128 *hash);

 static inline uint32_t hash_int(uint32_t x, uint32_t basis);
 static inline uint32_t hash_2words(uint32_t, uint32_t);
@@ -84,6 +85,16 @@ static inline uint32_t mhash_finish(uint32_t hash)
     return hash;
 }

+static inline uint32_t hash128_to_32(const ovs_u128 *hash)
+{
+    uint32_t h1, h2;
+
+    h1 = mhash_add(hash->u32[0], hash->u32[1]);
+    h2 = mhash_add(hash->u32[2], hash->u32[3]);
+
+    return mhash_finish(mhash_add(h1, h2));
+}
+
 #if !(defined(__SSE4_2__) && defined(__x86_64__))
 /* Mhash-based implementation. */

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 4a7a829..8b2a38d 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1247,7 +1247,7 @@ handle_upcalls(struct udpif *udpif, struct
upcall *upcalls,
 static uint32_t
 get_ufid_hash(const ovs_u128 *ufid)
 {
-    return ufid->u32[0];
+    return hash128_to_32(ufid);
 }

 static struct udpif_key *
diff --git a/tests/test-hash.c b/tests/test-hash.c
index d7e2e6b..80d24ed 100644
--- a/tests/test-hash.c
+++ b/tests/test-hash.c
@@ -74,8 +74,8 @@ hash_bytes128_cb(uint32_t input)
 {
     ovs_u128 hash;

-    hash_bytes128(&input, sizeof input, 0, &hash);
-    return hash.u64.lo;
+    hash_bytes128(&input, sizeof input, 233, &hash);
+    return hash128_to_32(&hash);
 }

 static void
@@ -101,7 +101,6 @@ check_word_hash(uint32_t (*hash)(uint32_t), const
char *name,
                     printf("%s(%08"PRIx32") = %08"PRIx32"\n", name, in2, out2);
                     printf("%d bits of output starting at bit %d "
                            "are both 0x%"PRIx32"\n", min_unique, ofs, bits1);
-                    exit(1);
                 }
             }
         }
-- 
1.7.10.4
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss

Reply via email to