>-----Original Message----- >From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com] >Sent: Tuesday, July 2, 2019 10:23 AM >To: Wang, Yipeng1 <yipeng1.w...@intel.com>; Gobriel, Sameh ><sameh.gobr...@intel.com>; Richardson, Bruce ><bruce.richard...@intel.com>; De Lara Guarch, Pablo ><pablo.de.lara.gua...@intel.com> >Cc: Gavin Hu (Arm Technology China) <gavin...@arm.com>; Ruifeng Wang (Arm >Technology China) <ruifeng.w...@arm.com>; >dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; nd ><n...@arm.com>; sta...@dpdk.org; nd <n...@arm.com> >Subject: RE: [PATCH 3/3] lib/hash: adjust tbl_chng_cnt position > ><snip> > >> > >> >tbl_chng_cnt is one of the first elements of the structure used in the >> >lookup. Move it to the beginning of the cache line to gain performance. >> > >> >Fixes: e605a1d36 ("hash: add lock-free r/w concurrency") >> >Cc: sta...@dpdk.org >> > >> >Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> >> >Reviewed-by: Gavin Hu <gavin...@arm.com> >> >Tested-by: Ruifeng Wang <ruifeng.w...@arm.com> >> >--- >> > lib/librte_hash/rte_cuckoo_hash.h | 6 +++--- >> > 1 file changed, 3 insertions(+), 3 deletions(-) >> > >> >diff --git a/lib/librte_hash/rte_cuckoo_hash.h >> >b/lib/librte_hash/rte_cuckoo_hash.h >> >index fb19bb27d..af6451b5c 100644 >> >--- a/lib/librte_hash/rte_cuckoo_hash.h >> >+++ b/lib/librte_hash/rte_cuckoo_hash.h >> >@@ -170,7 +170,9 @@ struct rte_hash { >> > >> > /* Fields used in lookup */ >> > >> >- uint32_t key_len __rte_cache_aligned; >> >+ uint32_t *tbl_chng_cnt __rte_cache_aligned; >> >+ /**< Indicates if the hash table changed from last read. */ >> >+ uint32_t key_len; >> > /**< Length of hash key. */ >> > uint8_t hw_trans_mem_support; >> > /**< If hardware transactional memory is used. */ @@ -218,8 +220,6 >> @@ >> >struct rte_hash { >> > * is piggy-backed to freeing of the key index. >> > */ >> > uint32_t *ext_bkt_to_free; >> >- uint32_t *tbl_chng_cnt; >> >- /**< Indicates if the hash table changed from last read. */ >> > } __rte_cache_aligned; >> > >> > struct queue_node { >> >-- >> >2.17.1 >> >> [Wang, Yipeng] >> I am not sure about this change. By moving counter to front, I think you >> seems push key_store out of the cache line. And key_store Is also used in >> lookup (and more commonly). >> My tests also show perf drop in many cases. >I ran hash_readwrite_lf tests and L3 fwd application. Both of them showed >improvements for both lock-free and using locks for Arm >platforms (L3 fwd was not run on x86). Which tests are resulting in >performance drops for you? > >But, I do agree that this work is not complete. We can drop this patch and >take this up separately. [Wang, Yipeng] I run the LF test on x86. For 1 reader case it seems a clear improvement for lookup without key-shifts. Other results are a mixed bag even for lock-free. For most HTM it is a drop.
My opinion is to delay similar fine tuning effort tied to architecture specs.