<snip>
> >
> >tbl_chng_cnt is one of the first elements of the structure used in the
> >lookup. Move it to the beginning of the cache line to gain performance.
> >
> >Fixes: e605a1d36 ("hash: add lock-free r/w concurrency")
> >Cc: sta...@dpdk.org
> >
> >Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> >Reviewed-by: Gavin Hu <gavin...@arm.com>
> >Tested-by: Ruifeng Wang <ruifeng.w...@arm.com>
> >---
> > lib/librte_hash/rte_cuckoo_hash.h | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> >diff --git a/lib/librte_hash/rte_cuckoo_hash.h b/lib/librte_hash/rte_cuckoo_hash.h
> >index fb19bb27d..af6451b5c 100644
> >--- a/lib/librte_hash/rte_cuckoo_hash.h
> >+++ b/lib/librte_hash/rte_cuckoo_hash.h
> >@@ -170,7 +170,9 @@ struct rte_hash {
> >
> > 	/* Fields used in lookup */
> >
> >-	uint32_t key_len __rte_cache_aligned;
> >+	uint32_t *tbl_chng_cnt __rte_cache_aligned;
> >+	/**< Indicates if the hash table changed from last read. */
> >+	uint32_t key_len;
> > 	/**< Length of hash key. */
> > 	uint8_t hw_trans_mem_support;
> > 	/**< If hardware transactional memory is used. */
> >@@ -218,8 +220,6 @@ struct rte_hash {
> > 	 * is piggy-backed to freeing of the key index.
> > 	 */
> > 	uint32_t *ext_bkt_to_free;
> >-	uint32_t *tbl_chng_cnt;
> >-	/**< Indicates if the hash table changed from last read. */
> > } __rte_cache_aligned;
> >
> > struct queue_node {
> >--
> >2.17.1
>
> [Wang, Yipeng]
> I am not sure about this change. By moving the counter to the front, I
> think you push key_store out of the cache line, and key_store is also
> used in lookup (and more commonly).
> My tests also show a perf drop in many cases.

I ran the hash_readwrite_lf tests and the L3 fwd application. Both of them showed improvements, for both the lock-free and lock-based configurations, on Arm platforms (L3 fwd was not run on x86). Which tests are showing performance drops for you?
But I do agree that this work is not complete. We can drop this patch and take it up separately.
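For reference, the layout effect Yipeng describes can be checked with offsetof(). Below is a minimal, hypothetical sketch, not the actual rte_hash definition: the field names mirror the patch, but the 51-byte filler, 8-byte pointers, and 64-byte cache line are assumptions chosen so the boundary effect is visible. Moving an 8-byte pointer to the front of the cache-aligned section shifts every later field, which can push key_store onto the next cache line:

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64

/* Sketch of the layout before the patch: key_len starts the
 * cache-aligned lookup section. The filler stands in for the
 * fields between key_len and key_store in the real structure. */
struct lookup_before {
	uint32_t key_len;
	uint8_t hw_trans_mem_support;
	uint8_t other_fields[51];	/* hypothetical filler */
	void *key_store;		/* also read on every lookup */
} __attribute__((aligned(CACHE_LINE)));

/* Sketch of the layout after the patch: tbl_chng_cnt moved to
 * the front, shifting everything after it by 8 bytes. */
struct lookup_after {
	uint32_t *tbl_chng_cnt;
	uint32_t key_len;
	uint8_t hw_trans_mem_support;
	uint8_t other_fields[51];
	void *key_store;
} __attribute__((aligned(CACHE_LINE)));

int main(void)
{
	/* Fields whose offset / CACHE_LINE values differ live on
	 * different cache lines. */
	printf("before: key_store on cache line %zu\n",
	       offsetof(struct lookup_before, key_store) / CACHE_LINE);
	printf("after:  key_store on cache line %zu\n",
	       offsetof(struct lookup_after, key_store) / CACHE_LINE);
	return 0;
}

With these assumed sizes the program prints cache line 0 before the change and cache line 1 after it. Whether the real structure crosses the boundary depends on the actual fields between key_len and key_store, which is what the differing perf results on Arm and x86 are probing.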