+CC Beilei as i40e maintainer

> From: Dharmik Thakkar [mailto:dharmik.thak...@arm.com]
> Sent: Thursday, 13 January 2022 06.37
> 
> Current mempool per core cache implementation stores pointers to mbufs
> On 64b architectures, each pointer consumes 8B
> This patch replaces it with index-based implementation,
> where in each buffer is addressed by (pool base address + index)
> It reduces the amount of memory/cache required for per core cache
> 
> L3Fwd performance testing reveals minor improvements in the cache
> performance (L1 and L2 misses reduced by 0.60%)
> with no change in throughput
> 
> Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Signed-off-by: Dharmik Thakkar <dharmik.thak...@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> ---
>  lib/mempool/rte_mempool.h             | 150 +++++++++++++++++++++++++-
>  lib/mempool/rte_mempool_ops_default.c |   7 ++
>  2 files changed, 156 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 1e7a3c15273c..f2403fbc97a7 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -50,6 +50,10 @@
>  #include <rte_memcpy.h>
>  #include <rte_common.h>
> 
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> +#include <rte_vect.h>
> +#endif
> +
>  #include "rte_mempool_trace_fp.h"
> 
>  #ifdef __cplusplus
> @@ -239,6 +243,9 @@ struct rte_mempool {
>       int32_t ops_index;
> 
>       struct rte_mempool_cache *local_cache; /**< Per-lcore local cache
> */
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> +     void *pool_base_value; /**< Base value to calculate indices */
> +#endif
> 
>       uint32_t populated_size;         /**< Number of populated
> objects. */
>       struct rte_mempool_objhdr_list elt_list; /**< List of objects in
> pool */
> @@ -1314,7 +1321,22 @@ rte_mempool_cache_flush(struct rte_mempool_cache
> *cache,
>       if (cache == NULL || cache->len == 0)
>               return;
>       rte_mempool_trace_cache_flush(cache, mp);
> +
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> +     unsigned int i;
> +     unsigned int cache_len = cache->len;
> +     void *obj_table[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> +     void *base_value = mp->pool_base_value;
> +     uint32_t *cache_objs = (uint32_t *) cache->objs;

Hi Dharmik and Honnappa,

The essence of this patch is based on recasting the type of the objs field in 
the rte_mempool_cache structure from an array of pointers to an array of 
uint32_t.

However, this effectively breaks the ABI, because the rte_mempool_cache 
structure is public and part of the API.

Some drivers [1] even bypass the mempool API and access the rte_mempool_cache 
structure directly, assuming that the objs array in the cache is an array of 
pointers. So you cannot recast the fields in the rte_mempool_cache structure 
the way this patch requires.

Although I do consider bypassing an API's accessor functions "spaghetti code", 
this driver's behavior is formally acceptable as long as the rte_mempool_cache 
structure is not marked as internal.

I really liked your idea of using indexes instead of pointers, so I'm very 
sorry to shoot it down. :-(

[1]: E.g. the Intel i40e PMD, 
http://code.dpdk.org/dpdk/latest/source/drivers/net/i40e/i40e_rxtx_vec_avx512.c#L25

-Morten

Reply via email to