On Thu, Jan 13, 2022 at 11:06 AM Dharmik Thakkar <dharmik.thak...@arm.com> wrote:
>
> Current mempool per core cache implementation stores pointers to mbufs
> On 64b architectures, each pointer consumes 8B
> This patch replaces it with index-based implementation,
> where in each buffer is addressed by (pool base address + index)
> It reduces the amount of memory/cache required for per core cache
>
> L3Fwd performance testing reveals minor improvements in the cache
> performance (L1 and L2 misses reduced by 0.60%)
> with no change in throughput
>
> Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Signed-off-by: Dharmik Thakkar <dharmik.thak...@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> ---
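
For readers following the thread, here is a minimal sketch of the addressing scheme the commit message describes. It is an illustration only, not the patch's code: obj_to_index() and index_to_obj() are hypothetical names, and treating the index as a 32-bit byte offset from the pool base is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not DPDK API.
 * Assumption: the index is the byte offset of the object from the pool
 * base, so the whole pool must span less than 4 GiB. */
static inline uint32_t
obj_to_index(const void *base, const void *obj)
{
	uintptr_t diff = (uintptr_t)obj - (uintptr_t)base;

	assert(diff <= UINT32_MAX);
	return (uint32_t)diff;
}

/* Recover the object address as (pool base address + index). */
static inline void *
index_to_obj(void *base, uint32_t index)
{
	return (char *)base + index;
}
```

With this layout a per-core cache slot is a 4-byte uint32_t rather than an 8-byte pointer on 64-bit targets, which is where the reduced cache footprint comes from.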
>
>         /* Now fill in the response ... */
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE

Instead of cluttering the code with this #ifdef everywhere for the pair,
I think we can check RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE once and provide
a separate implementation for each case, i.e.:

#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
void x()
{

}
void y()
{

}
#else
void x()
{

}
void y()
{

}
#endif

and then call

x();
y();

in the main code.

> diff --git a/lib/mempool/rte_mempool_ops_default.c
> b/lib/mempool/rte_mempool_ops_default.c
> index 22fccf9d7619..3543cad9d4ce 100644
> --- a/lib/mempool/rte_mempool_ops_default.c
> +++ b/lib/mempool/rte_mempool_ops_default.c
> @@ -127,6 +127,13 @@ rte_mempool_op_populate_helper(struct rte_mempool *mp,
> unsigned int flags,
>                 obj = va + off;
>                 obj_cb(mp, obj_cb_arg, obj,
>                        (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off));
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE

This is the only place where the macro is used in a C source file. Since
we are taking a compile-time approach, can we make this unconditional?
That would let an application use this model without recompiling DPDK.
All the application needs to do is:

#define RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE 1
#include <rte_mempool.h>

I believe structuring it this way avoids having to recompile DPDK.

> +               /* Store pool base value to calculate indices for index-based
> +                * lcore cache implementation
> +                */
> +               if (i == 0)
> +                       mp->pool_base_value = obj;
> +#endif
>                 rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
>                 off += mp->elt_size + mp->trailer_size;
>         }
> --
> 2.17.1
>
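
To make the two suggestions concrete, here is a rough sketch of how they could fit together. It is not taken from the patch: DEMO_INDEX_BASED_CACHE stands in for RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE, and struct demo_pool, demo_slot_t, demo_encode(), demo_decode() and demo_roundtrip() are hypothetical names. The pool base is stored unconditionally, and a single #ifdef selects the whole pair of helpers, so the calling code carries no conditional compilation.

```c
#include <assert.h>
#include <stdint.h>

/* The application would define this before including the header.
 * DEMO_INDEX_BASED_CACHE is a hypothetical stand-in for the real macro. */
#define DEMO_INDEX_BASED_CACHE 1

struct demo_pool {
	void *base; /* recorded unconditionally, so the library needs no #ifdef */
};

/* One conditional selects the whole pair of helpers. */
#ifdef DEMO_INDEX_BASED_CACHE
typedef uint32_t demo_slot_t; /* byte offset from the pool base */

static inline demo_slot_t
demo_encode(const struct demo_pool *p, void *obj)
{
	uintptr_t diff = (uintptr_t)obj - (uintptr_t)p->base;

	assert(diff <= UINT32_MAX); /* assumption: pool spans less than 4 GiB */
	return (demo_slot_t)diff;
}

static inline void *
demo_decode(const struct demo_pool *p, demo_slot_t slot)
{
	return (char *)p->base + slot;
}
#else
typedef void *demo_slot_t; /* plain pointer */

static inline demo_slot_t
demo_encode(const struct demo_pool *p, void *obj)
{
	(void)p;
	return obj;
}

static inline void *
demo_decode(const struct demo_pool *p, demo_slot_t slot)
{
	(void)p;
	return slot;
}
#endif

/* The main code calls the pair with no conditional compilation. */
static inline void *
demo_roundtrip(const struct demo_pool *p, void *obj)
{
	return demo_decode(p, demo_encode(p, obj));
}
```

Because the helpers are static inline and live in the header, the choice is made when the application is compiled, which is what would allow the library binary to stay the same for both models.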