While going through the mempool code for potential optimizations, I found two details in rte_mempool_do_generic_get() that are easily improved.
Any comments or alternative suggestions?


1. The objects are returned in reverse order. This is silly, and should be optimized.

rte_mempool_do_generic_get() line 1493:

     /* Now fill in the response ... */
-    for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
-        *obj_table = cache_objs[len];
+    rte_memcpy(obj_table, &cache_objs[cache->len - n], sizeof(void *) * n);


2. The initial screening in rte_mempool_do_generic_get() differs from the initial screening in rte_mempool_do_generic_put().

For reference, rte_mempool_do_generic_put() line 1343:

    /* No cache provided or if put would overflow mem allocated for cache */
    if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
        goto ring_enqueue;

Notice how this uses RTE_MEMPOOL_CACHE_MAX_SIZE to determine the maximum burst size into the cache.

Now, rte_mempool_do_generic_get() line 1466:

    /* No cache provided or cannot be satisfied from cache */
    if (unlikely(cache == NULL || n >= cache->size))
        goto ring_dequeue;

    cache_objs = cache->objs;

    /* Can this be satisfied from the cache? */
    if (cache->len < n) {
        /* No. Backfill the cache first, and then fill from it */
        uint32_t req = n + (cache->size - cache->len);

First of all, there might already be up to cache->flushthresh - 1 objects in the cache, which is 50% more than cache->size, so screening for n >= cache->size would not serve those from the cache!

Second of all, the next step is to check if the cache holds sufficient objects. So the initial screening should only do initial screening.

Therefore, I propose changing the initial screening to also use RTE_MEMPOOL_CACHE_MAX_SIZE to determine the maximum burst size from the cache, like in rte_mempool_do_generic_put().

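To make both points above concrete, here is a small stand-alone sketch. It is a toy model, not the DPDK code: struct toy_cache, TOY_CACHE_MAX_SIZE (and its value of 512), and all four helper functions are hypothetical names made up for illustration. It only mirrors the two screening conditions and the two copy variants quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE; the value is an
 * assumption for illustration only. */
#define TOY_CACHE_MAX_SIZE 512u

/* Hypothetical, stripped-down stand-in for the per-lcore cache. */
struct toy_cache {
    unsigned int size; /* nominal cache size */
    unsigned int len;  /* objects currently held; may exceed size */
};

/* Screening as currently done in the get path: n >= cache->size bypasses. */
static bool bypass_current(const struct toy_cache *cache, unsigned int n)
{
    return cache == NULL || n >= cache->size;
}

/* Screening as proposed: same bound as the put path. */
static bool bypass_proposed(const struct toy_cache *cache, unsigned int n)
{
    return cache == NULL || n > TOY_CACHE_MAX_SIZE;
}

/* Copy variant 1: the existing loop, walking the cache from the top down. */
static void copy_reversed(void **obj_table, void *const *cache_objs,
                          unsigned int cache_len, unsigned int n)
{
    unsigned int index, len;

    assert(n <= cache_len);
    for (index = 0, len = cache_len - 1; index < n; ++index, len--)
        obj_table[index] = cache_objs[len];
}

/* Copy variant 2: one block copy of the top n entries (plain memcpy is
 * used here in place of rte_memcpy to keep the sketch self-contained). */
static void copy_block(void **obj_table, void *const *cache_objs,
                       unsigned int cache_len, unsigned int n)
{
    assert(n <= cache_len);
    memcpy(obj_table, &cache_objs[cache_len - n], sizeof(void *) * n);
}
```

With size = 32 and len = 47 (one below a flush threshold of 48), a request for n = 40 is sent to the ring by bypass_current() although the cache holds enough objects, while bypass_proposed() lets it continue to the cache->len check. Both copy variants hand out the same top n objects; the block copy simply delivers them in cache order instead of reversed.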
rte_mempool_do_generic_get() line 1466:

-    /* No cache provided or cannot be satisfied from cache */
-    if (unlikely(cache == NULL || n >= cache->size))
+    /* No cache provided or if get would overflow mem allocated for cache */
+    if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
         goto ring_dequeue;


Med venlig hilsen / Kind regards,
-Morten Brørup