On 5/21/19 12:03 PM, Xiao Wang wrote:
Use rte_memcpy to improve the pointer array copy. This optimization has
already been applied to __mempool_generic_put() [1]; this patch applies it
to __mempool_generic_get(). A slight performance gain can be observed in
the testpmd txonly test.

[1] 863bfb47449 ("mempool: optimize copy in cache")

Signed-off-by: Xiao Wang <xiao.w.w...@intel.com>
---
  lib/librte_mempool/rte_mempool.h | 7 +------
  1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index 8053f7a04..975da8d22 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -1344,15 +1344,11 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
                       unsigned int n, struct rte_mempool_cache *cache)
 {
        int ret;
-       uint32_t index, len;
-       void **cache_objs;
 
        /* No cache provided or cannot be satisfied from cache */
        if (unlikely(cache == NULL || n >= cache->size))
                goto ring_dequeue;
 
-       cache_objs = cache->objs;
-
        /* Can this be satisfied from the cache? */
        if (cache->len < n) {
                /* No. Backfill the cache first, and then fill from it */
@@ -1375,8 +1371,7 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
        }
 
        /* Now fill in the response ... */
-       for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
-               *obj_table = cache_objs[len];
+       rte_memcpy(obj_table, &cache->objs[cache->len - n], sizeof(void *) * n);
 
        cache->len -= n;

I think the idea of the loop above is to get objects in reverse order in
order to reuse the cache-top objects (put last) first. That should improve
the data cache hit rate, etc. So the performance effect of the patch could
be very different on various CPUs (with different cache sizes) and under
various workloads.
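
To make the ordering point concrete, here is a minimal standalone sketch of
the two variants (hypothetical helper names, plain memcpy instead of
rte_memcpy, and the cache fields passed as parameters, purely for
illustration):

#include <stdint.h>
#include <string.h>

/* Original loop: obj_table[0] receives the most recently put (hottest)
 * object from the top of the cache stack, so the caller touches warm
 * memory first. */
static void
get_in_reverse_order(void **obj_table, void * const *cache_objs,
                     uint32_t cache_len, uint32_t n)
{
        uint32_t index, len;

        for (index = 0, len = cache_len - 1; index < n;
             ++index, len--, obj_table++)
                *obj_table = cache_objs[len];
}

/* Patched version: the same n pointers are copied, but in forward order,
 * so obj_table[0] now receives the oldest (coldest) of the n objects. */
static void
get_with_memcpy(void **obj_table, void * const *cache_objs,
                uint32_t cache_len, uint32_t n)
{
        memcpy(obj_table, &cache_objs[cache_len - n], sizeof(void *) * n);
}

Both drain the same n pointers from the top of the stack; only the order in
which the caller receives them changes, and that ordering is what affects
data cache warmth.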

So, I doubt that it is a step in the right direction.

Andrew.
