While going through the mempool code for potential optimizations, I found two 
details in rte_mempool_do_generic_get() that could easily be improved.

Any comments or alternative suggestions?


1. The objects are returned in reverse order, copied one pointer at a time. 
This is unnecessary, and should be replaced by a single block copy.

rte_mempool_do_generic_get() line 1493:

        /* Now fill in the response ... */
-       for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
-               *obj_table = cache_objs[len];
+       rte_memcpy(obj_table, &cache_objs[cache->len - n], sizeof(void *) * n);
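
To make the difference concrete, here is a minimal standalone sketch of both 
variants. It is only an illustration: struct toy_cache and both function names 
are hypothetical stand-ins, not the real struct rte_mempool_cache or any DPDK 
API, and plain memcpy() is used instead of rte_memcpy() so it builds without 
DPDK.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the per-lcore cache, not the real DPDK struct. */
struct toy_cache {
	unsigned int len;   /* number of objects currently in objs[] */
	void *objs[8];      /* small fixed array, for illustration only */
};

/* Current behaviour: walk down from the top of the cache, one pointer at a time. */
static void
get_reverse_loop(const struct toy_cache *cache, void **obj_table, unsigned int n)
{
	unsigned int index, len;

	assert(n <= cache->len);
	for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
		*obj_table = cache->objs[len];
}

/* Proposed behaviour: one block copy of the top n entries. */
static void
get_block_copy(const struct toy_cache *cache, void **obj_table, unsigned int n)
{
	assert(n <= cache->len);
	memcpy(obj_table, &cache->objs[cache->len - n], sizeof(void *) * n);
}
```

Both variants hand out the same n objects from the top of the cache. Only the 
order in obj_table differs: the loop returns them top-down, the block copy 
returns them in the order they are stored.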


2. The initial screening in rte_mempool_do_generic_get() differs from the 
initial screening in rte_mempool_do_generic_put().

For reference, rte_mempool_do_generic_put() line 1343:

        /* No cache provided or if put would overflow mem allocated for cache */
        if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
                goto ring_enqueue;

Notice how this uses RTE_MEMPOOL_CACHE_MAX_SIZE to determine the maximum burst 
size into the cache.

Now, rte_mempool_do_generic_get() line 1466:

        /* No cache provided or cannot be satisfied from cache */
        if (unlikely(cache == NULL || n >= cache->size))
                goto ring_dequeue;

        cache_objs = cache->objs;

        /* Can this be satisfied from the cache? */
        if (cache->len < n) {
                /* No. Backfill the cache first, and then fill from it */
                uint32_t req = n + (cache->size - cache->len);

First, the cache may already hold up to cache->flushthresh - 1 objects, which 
is about 50 % more than cache->size, so screening for n >= cache->size sends 
such requests to the ring even when the cache could serve them!

Second, the very next step already checks whether the cache holds enough 
objects, so the initial screening should not duplicate that check. Therefore, 
I propose changing the initial screening to also use 
RTE_MEMPOOL_CACHE_MAX_SIZE to determine the maximum burst size from the cache, 
like in rte_mempool_do_generic_put().
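
As a rough illustration of the case the current screening misses, the two 
conditions can be modelled in isolation. Everything here is a sketch under 
assumptions: struct screen_cache, the three helper functions and 
TOY_CACHE_MAX_SIZE are hypothetical names, and the value 512 merely stands in 
for RTE_MEMPOOL_CACHE_MAX_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE. */
#define TOY_CACHE_MAX_SIZE 512u

/* Hypothetical model of the cache fields the screening looks at. */
struct screen_cache {
	unsigned int size;        /* nominal cache size */
	unsigned int flushthresh; /* flush threshold, 50 % above size */
	unsigned int len;         /* objects currently held */
};

/* Current screening in the get path: true means "go to the ring". */
static bool
bypass_current(const struct screen_cache *cache, unsigned int n)
{
	return cache == NULL || n >= cache->size;
}

/* Proposed screening, mirroring the put path. */
static bool
bypass_proposed(const struct screen_cache *cache, unsigned int n)
{
	return cache == NULL || n > TOY_CACHE_MAX_SIZE;
}

/* The follow-up check: can the cache serve n objects right now? */
static bool
cache_holds(const struct screen_cache *cache, unsigned int n)
{
	return cache != NULL && cache->len >= n;
}
```

With size = 32, flushthresh = 48 and len = 40, a request for n = 32 objects 
could be served from the cache, yet bypass_current() sends it to the ring. 
bypass_proposed() lets it through to the follow-up check instead.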

rte_mempool_do_generic_get() line 1466:

-       /* No cache provided or cannot be satisfied from cache */
-       if (unlikely(cache == NULL || n >= cache->size))
+       /* No cache provided or if get would overflow mem allocated for cache */
+       if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
                goto ring_dequeue;


Kind regards,
-Morten Brørup
