> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Wednesday, 16 November 2022 16.51
>
> <snip>
>
> >
> > Micro-optimization:
> > Reduced the most likely code path in the generic put function by
> > moving an unlikely check out of it and further down.
> >
> > Also updated the comments in the function.
> >
> > v2 (feedback from Andrew Rybchenko):
> > * Modified the comparison to prevent overflow if n is really huge and
> >   len is non-zero.
> > * Added an assertion about the invariant preventing overflow in the
> >   comparison.
> > * Crossing the threshold is not extremely unlikely, so removed
> >   likely() from that comparison. The compiler will generate code with
> >   optimal static branch prediction here anyway.
> >
> > Signed-off-by: Morten Brørup <m...@smartsharesystems.com>
> > ---
> >  lib/mempool/rte_mempool.h | 36 ++++++++++++++++++++----------------
> >  1 file changed, 20 insertions(+), 16 deletions(-)
> >
> > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > index 9f530db24b..dd1a3177d6 100644
> > --- a/lib/mempool/rte_mempool.h
> > +++ b/lib/mempool/rte_mempool.h
> > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
> >  {
> >  	void **cache_objs;
> >
> > -	/* No cache provided */
> > +	/* No cache provided? */
> >  	if (unlikely(cache == NULL))
> >  		goto driver_enqueue;
> >
> > -	/* increment stat now, adding in mempool always success */
> > +	/* Increment stats now, adding in mempool always succeeds. */
> >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> >
> > -	/* The request itself is too big for the cache */
> > -	if (unlikely(n > cache->flushthresh))
> > -		goto driver_enqueue_stats_incremented;
> > -
> > -	/*
> > -	 * The cache follows the following algorithm:
> > -	 *   1. If the objects cannot be added to the cache without crossing
> > -	 *      the flush threshold, flush the cache to the backend.
> > -	 *   2. Add the objects to the cache.
> > -	 */
> > +	/* Assert the invariant preventing overflow in the comparison below. */
> > +	RTE_ASSERT(cache->len <= cache->flushthresh);
> >
> > -	if (cache->len + n <= cache->flushthresh) {
> > +	if (n <= cache->flushthresh - cache->len) {
> > +		/*
> > +		 * The objects can be added to the cache without crossing the
> > +		 * flush threshold.
> > +		 */
> >  		cache_objs = &cache->objs[cache->len];
> >  		cache->len += n;
> > -	} else {
> > +	} else if (likely(n <= cache->flushthresh)) {
>
> IMO, this is a misconfiguration on the application part. In the PMDs I
> have looked at, the max value of 'n' is controlled by compile time
> constants. The application could do a compile time check on the cache
> threshold, or we could have another RTE_ASSERT on this.
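(For illustration, a compile-time check along the lines suggested above
might look like the sketch below. APP_CACHE_SIZE and APP_MAX_BURST are
hypothetical application-side names, and the 3/2 factor mirrors the
internal CALC_CACHE_FLUSHTHRESH() macro in rte_mempool.c, where the
flush threshold is 1.5 * cache_size at the time of writing, so that
factor is an assumption about an implementation detail.)

    #include <rte_common.h>	/* RTE_BUILD_BUG_ON() */

    /* Hypothetical application-side constants. */
    #define APP_CACHE_SIZE 256	/* cache_size passed to rte_mempool_create() */
    #define APP_MAX_BURST   32	/* largest bulk size the application ever puts */

    static inline void
    app_mempool_config_check(void)
    {
    	/*
    	 * Fail the build if a maximum-size bulk put could exceed the
    	 * cache flush threshold (assumed to be cache_size * 3 / 2).
    	 */
    	RTE_BUILD_BUG_ON(APP_MAX_BURST > APP_CACHE_SIZE * 3 / 2);
    }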
There could be applications using a mempool for something other than
mbufs. In that case, the application should be allowed to get/put many
objects in one transaction.

> > +		/*
> > +		 * The request itself fits into the cache.
> > +		 * But first, the cache must be flushed to the backend, so
> > +		 * adding the objects does not cross the flush threshold.
> > +		 */
> >  		cache_objs = &cache->objs[0];
> >  		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> >  		cache->len = n;
> > +	} else {
> > +		/* The request itself is too big for the cache. */
> > +		goto driver_enqueue_stats_incremented;
> >  	}
> >
> >  	/* Add the objects to the cache. */
> > @@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
> >
> >  driver_enqueue:
> >
> > -	/* increment stat now, adding in mempool always success */
> > +	/* Increment stats now, adding in mempool always succeeds. */
> >  	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> >  	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> >
> >  driver_enqueue_stats_incremented:
> >
> > -	/* push objects to the backend */
> > +	/* Push the objects to the backend. */
> >  	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
> >  }
> >
> > --
> > 2.17.1
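To make the overflow scenario from the v2 changelog concrete: with
32-bit unsigned arithmetic, cache->len + n wraps around when n is close
to UINT32_MAX, so the old comparison could wrongly take the fast path.
A minimal standalone sketch, with illustrative values and variable
names mirroring the patch:

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
    	uint32_t len = 8;		/* cache->len */
    	uint32_t flushthresh = 768;	/* cache->flushthresh */
    	uint32_t n = UINT32_MAX - 4;	/* a "really huge" request */

    	/* Old test: len + n wraps around to 3, so the test wrongly passes. */
    	printf("old: %s\n", len + n <= flushthresh ? "pass (wrong)" : "fail");

    	/*
    	 * New test: the invariant len <= flushthresh guarantees that
    	 * flushthresh - len cannot underflow, so the test is exact.
    	 */
    	printf("new: %s\n", n <= flushthresh - len ? "pass" : "fail");

    	return 0;
    }

The RTE_ASSERT added by the patch documents exactly the invariant
(cache->len <= cache->flushthresh) that makes the subtraction in the
reordered comparison safe.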