Hi Thomas,

-----Original Message-----
From: Thomas Monjalon <tho...@monjalon.net>
Sent: Tuesday, March 31, 2020 8:56 PM
To: Medvedkin, Vladimir <vladimir.medved...@intel.com>
Cc: Wang, Yipeng1 <yipeng1.w...@intel.com>; Stephen Hemminger <step...@networkplumber.org>; dev@dpdk.org; Morten Brørup <m...@smartsharesystems.com>; dev@dpdk.org; Ananyev, Konstantin <konstantin.anan...@intel.com>; Gobriel, Sameh <sameh.gobr...@intel.com>; Richardson, Bruce <bruce.richard...@intel.com>; Suanming Mou <suanmi...@mellanox.com>; Olivier Matz <olivier.m...@6wind.com>; Xueming(Steven) Li <xuemi...@mellanox.com>; Andrew Rybchenko <arybche...@solarflare.com>; Asaf Penso <as...@mellanox.com>; Ori Kam <or...@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table

26/03/2020 18:28, Medvedkin, Vladimir:
> Hi Yipeng, Stephen, all,
>
> On 17/03/2020 19:52, Wang, Yipeng1 wrote:
> > From: Stephen Hemminger <step...@networkplumber.org>
> >> On Mon, 16 Mar 2020 18:27:40 +0000
> >> "Medvedkin, Vladimir" <vladimir.medved...@intel.com> wrote:
> >>
> >>> Hi Morten,
> >>>
> >>> On 16/03/2020 14:39, Morten Brørup wrote:
> >>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Vladimir
> >>>>> Medvedkin
> >>>>> Sent: Monday, March 16, 2020 2:38 PM
> >>>>>
> >>>>> Currently DPDK has a special implementation of a hash table for
> >>>>> 4 byte keys which is called FBK hash. Unfortunately its main
> >>>>> drawback is that it only supports 2 byte values.
> >>>>> The new implementation called DWK (double word key) hash
> >>>>> supports 8 byte values, which is enough to store a pointer.
> >>>>>
> >>>>> It would also be nice to get feedback on whether to leave the
> >>>>> old FBK and new DWK implementations, or whether to deprecate the
> >>>>> old one?
> >>>> <rant on>
> >>>>
> >>>> Who comes up with these names?!?
> >>>>
> >>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
> >>>> the same. Could you use 32 somewhere in the name instead, like in
> >>>> int32_t, instead of using a growing list of creative synonyms for
> >>>> the same thing?
> >>>> Pretty please, with a cherry on top!
> >>>
> >>> That's true, at first I named it as fbk2, but then it was decided
> >>> to rename it "dwk", so that there was no confusion with the
> >>> existing FBK library. Naming suggestions are welcome!
> >>>
> >>>> And if the value size is fixed too, perhaps the name should also
> >>>> indicate the value size.
> >>>> <rant off>
> >>>>
> >>>> It's a shame we don't have C++ class templates available in DPDK...
> >>>>
> >>>> In other news, Mellanox has sent an RFC for an "indexed memory pool"
> >>>> library [1] to conserve memory by using uintXX_t instead of
> >>>> pointers, so perhaps a variant of a 32 bit key hash library with 32
> >>>> bit values (in addition to 16 bit values in FBK and 64 bit in DWK)
> >>>> would be a nice combination with that library.
> >>>>
> >>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html

Yes, some work is in progress to propose a new memory allocator
for small objects of fixed size with small memory overhead.

> >> Why is this different (or better) than existing rte_hash.
> >> Having more flavors is not necessarily a good thing (except in
> >> Gelato)
> > [Wang, Yipeng]
> > Hi, Vladimir,
> > As Stephen mentioned, I think it is a good idea to explain the benefit
> > of this new type of hash table more explicitly, such as specific use
> > cases, differences with current rte_hash, and performance numbers, etc.
>
> The main reason for this new hash library is performance. As I
> mentioned earlier, the current rte_fbk implementation is pretty fast
> but it has a number of drawbacks such as 2 byte values and limited
> collision resolving capabilities. On the other hand, rte_hash (cuckoo
> hash) doesn't have these drawbacks, but at the cost of lower
> performance compared to rte_fbk.
>
> If I understand correctly, the performance penalty in rte_hash is
> due to:
>
> 1. Loading two buckets.
>
> 2. Comparing signatures first.
>
> 3. If the signature comparison hits, getting a key index, finding the
> memory location of the key itself and loading the key.
>
> 4. Using an indirect call to memcmp() to compare two uint32_t.
>
> The newly proposed 4 byte key hash table doesn't have the rte_fbk
> drawbacks while offering the same performance as rte_fbk.
>
> Regarding use cases, in rte_ipsec_sad we are using rte_hash with a 4
> byte key size. Replacing it with the new implementation gives about a
> 30% performance gain.
>
> The main disadvantage compared to rte_hash is some performance
> degradation at high average table utilization, due to chain resolving
> for the 5th and subsequent collisions.

Thanks for explaining.
Please, such information should be added in the documentation:
	doc/guides/prog_guide/hash_lib.rst

I'm going to submit v2 this week and will add a documentation update.
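
For illustration, the trade-off discussed in this thread (comparing 4 byte keys directly inside one bucket, and falling back to a chain once the bucket is full) can be sketched in C roughly as follows. This is not the code from the patch set. The demo_* names, the bucket size of four entries (inferred only from the "5th and subsequent collision" remark), the bitmask layout and the placeholder hash function are all assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 4 entries per bucket, inferred from the thread, not from the patch. */
#define DEMO_BUCKET_ENTRIES 4

struct demo_bucket {
	uint32_t keys[DEMO_BUCKET_ENTRIES]; /* keys stored inline, no key index */
	uint64_t vals[DEMO_BUCKET_ENTRIES]; /* 8 byte values, enough for a pointer */
	uint8_t used;                       /* bit i set when slot i is occupied */
	struct demo_bucket *next;           /* overflow chain, hypothetical layout */
};

struct demo_table {
	uint32_t mask;                      /* number of buckets minus one */
	struct demo_bucket *buckets;
};

/* Placeholder integer mixer, a stand-in for whatever hash the library uses. */
static uint32_t demo_hash(uint32_t key)
{
	key ^= key >> 16;
	key *= 0x85ebca6bU;
	key ^= key >> 13;
	key *= 0xc2b2ae35U;
	key ^= key >> 16;
	return key;
}

/* nb_buckets must be a power of two. */
static struct demo_table *demo_create(uint32_t nb_buckets)
{
	struct demo_table *t = malloc(sizeof(*t));

	if (t == NULL)
		return NULL;
	t->buckets = calloc(nb_buckets, sizeof(*t->buckets));
	if (t->buckets == NULL) {
		free(t);
		return NULL;
	}
	t->mask = nb_buckets - 1;
	return t;
}

static int demo_add(struct demo_table *t, uint32_t key, uint64_t val)
{
	struct demo_bucket *b = &t->buckets[demo_hash(key) & t->mask];
	struct demo_bucket *last = b, *free_b = NULL;
	int free_i = 0;

	for (; b != NULL; last = b, b = b->next) {
		for (int i = 0; i < DEMO_BUCKET_ENTRIES; i++) {
			if (!(b->used & (1u << i))) {
				if (free_b == NULL) { /* remember first free slot */
					free_b = b;
					free_i = i;
				}
			} else if (b->keys[i] == key) {
				b->vals[i] = val;     /* key already present: update */
				return 0;
			}
		}
	}
	if (free_b == NULL) {
		/* Every slot is taken: chain a new bucket (the slow path). */
		free_b = calloc(1, sizeof(*free_b));
		if (free_b == NULL)
			return -1;
		last->next = free_b;
	}
	free_b->keys[free_i] = key;
	free_b->vals[free_i] = val;
	free_b->used |= (uint8_t)(1u << free_i);
	return 0;
}

/* One bucket load and plain integer compares; the chain is walked only on overflow. */
static int demo_lookup(const struct demo_table *t, uint32_t key, uint64_t *val)
{
	const struct demo_bucket *b = &t->buckets[demo_hash(key) & t->mask];

	for (; b != NULL; b = b->next) {
		for (int i = 0; i < DEMO_BUCKET_ENTRIES; i++) {
			if ((b->used & (1u << i)) && b->keys[i] == key) {
				*val = b->vals[i];
				return 0;
			}
		}
	}
	return -1;
}

static void demo_free(struct demo_table *t)
{
	if (t == NULL)
		return;
	for (size_t i = 0, nb = (size_t)t->mask + 1; i < nb; i++) {
		struct demo_bucket *p = t->buckets[i].next;

		while (p != NULL) {
			struct demo_bucket *n = p->next;

			free(p);
			p = n;
		}
	}
	free(t->buckets);
	free(t);
}
```

With a single bucket, e.g. demo_create(1), the first four keys fit in the primary bucket and the fifth forces a chained bucket. That extra pointer chase is the kind of cost the "high average table utilization" remark refers to, while the common case avoids the signature check, key index and memcmp() steps listed for rte_hash.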