26/03/2020 18:28, Medvedkin, Vladimir: > Hi Yipeng, Stephen, all, > > On 17/03/2020 19:52, Wang, Yipeng1 wrote: > > From: Stephen Hemminger <step...@networkplumber.org> > >> On Mon, 16 Mar 2020 18:27:40 +0000 > >> "Medvedkin, Vladimir" <vladimir.medved...@intel.com> wrote: > >> > >>> Hi Morten, > >>> > >>> > >>> On 16/03/2020 14:39, Morten Brørup wrote: > >>>>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Vladimir > >>>>> Medvedkin > >>>>> Sent: Monday, March 16, 2020 2:38 PM > >>>>> > >>>>> Currently DPDK has a special implementation of a hash table for > >>>>> 4 byte keys which is called FBK hash. Unfortunately its main > >>>>> drawback is that it only supports 2 byte values. > >>>>> The new implementation called DWK (double word key) hash supports 8 > >>>>> byte values, which is enough to store a pointer. > >>>>> > >>>>> It would also be nice to get feedback on whether to leave the old > >>>>> FBK and new DWK implementations, or whether to deprecate the old > >> one? > >>>> <rant on> > >>>> > >>>> Who comes up with these names?!? > >>>> > >>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean > >> the same. Could you use 32 somewhere in the name instead, like in int32_t, > >> instead of using a growing list of creative synonyms for the same thing? > >> Pretty please, with a cherry on top! > >>> > >>> That's true, at first I named it as fbk2, but then it was decided to > >>> rename it "dwk", so that there was no confusion with the existing FBK > >>> library. Naming suggestions are welcome! > >>> > >>>> And if the value size is fixed too, perhaps the name should also indicate > >> the value size. > >>>> <rant off> > >>>> > >>>> It's a shame we don't have C++ class templates available in DPDK... > >>>> > >>>> In other news, Mellanox has sent an RFC for an "indexed memory pool" > >> library [1] to conserve memory by using uintXX_t instead of pointers, so > >> perhaps a variant of a 32 bit key hash library with 32 bit values (in > >> addition to > >> 16 bit values in FBK and 64 bit in DWK) would be nice combination with that > >> library. > >>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
Yes some work is in progress to propose a new memory allocator for small objects of fixed size with small memory overhead. > >> Why is this different (or better) than existing rte_hash. > >> Having more flavors is not necessarily a good thing (except in Gelato) > > [Wang, Yipeng] > > Hi, Vladimir, > > As Stephen mentioned, I think it is good idea to explain the benefit of > > this new type of hash table more explicitly such as > > Specific use cases, differences with current rte_hash, and performance > > numbers, etc. > > The main reason for this new hash library is performance. As I mentioned > earlier, the current rte_fbk implementation is pretty fast but it has a > number of drawbacks such as 2 byte values and limited collision > resolving capabilities. On the other hand, rte_hash (cuckoo hash) > doesn't have this drawbacks but at the cost of lower performance > comparing to rte_fbk. > > If I understand correctly, performance penalty are due to : > > 1. Load two buckets > > 2. First compare signatures > > 3. If signature comparison hits get a key index and find memory location > with a key itself and get the key > > 4. Using indirect call to memcmp() to compare two uint32_t. > > The new proposed 4 byte key hash table doesn't have rte_fbk drawbacks > while offers the same performance as rte_fbk. > > Regarding use cases, in rte_ipsec_sad we are using rte_hash with 4 byte > key size. Replacing it with a new implementation gives about 30% in > performance. > > The main disadvantage comparing to rte_hash is some performance > degradation with high average table utilization due to chain resolving > for 5th and subsequent collision. Thanks for explaining. Please, such information should added in the documentation: doc/guides/prog_guide/hash_lib.rst