PING mempool maintainers. If you don't provide an ACK or Review to patches, how should Thomas know that they are ready for merging?
-Morten

> From: Morten Brørup [mailto:m...@smartsharesystems.com]
> Sent: Wednesday, 16 November 2022 18.39
>
> > From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> > Sent: Wednesday, 16 November 2022 17.27
> >
> > <snip>
> >
> > > > > Micro-optimization:
> > > > > Reduced the most likely code path in the generic put function
> > > > > by moving an unlikely check out of the most likely code path
> > > > > and further down.
> > > > >
> > > > > Also updated the comments in the function.
> > > > >
> > > > > v2 (feedback from Andrew Rybchenko):
> > > > > * Modified the comparison to prevent overflow if n is really
> > > > >   huge and len is non-zero.
> > > > > * Added an assertion about the invariant preventing overflow
> > > > >   in the comparison.
> > > > > * Crossing the threshold is not extremely unlikely, so removed
> > > > >   likely() from that comparison.
> > > > >   The compiler will generate code with optimal static branch
> > > > >   prediction here anyway.
> > > > >
> > > > > Signed-off-by: Morten Brørup <m...@smartsharesystems.com>
> > > > > ---
> > > > >  lib/mempool/rte_mempool.h | 36 ++++++++++++++++++++----------------
> > > > >  1 file changed, 20 insertions(+), 16 deletions(-)
> > > > >
> > > > > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > > > > index 9f530db24b..dd1a3177d6 100644
> > > > > --- a/lib/mempool/rte_mempool.h
> > > > > +++ b/lib/mempool/rte_mempool.h
> > > > > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
> > > > >  {
> > > > >  	void **cache_objs;
> > > > >
> > > > > -	/* No cache provided */
> > > > > +	/* No cache provided? */
> > > > >  	if (unlikely(cache == NULL))
> > > > >  		goto driver_enqueue;
> > > > >
> > > > > -	/* increment stat now, adding in mempool always success */
> > > > > +	/* Increment stats now, adding in mempool always succeeds. */
> > > > >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> > > > >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> > > > >
> > > > > -	/* The request itself is too big for the cache */
> > > > > -	if (unlikely(n > cache->flushthresh))
> > > > > -		goto driver_enqueue_stats_incremented;
> > > > > -
> > > > > -	/*
> > > > > -	 * The cache follows the following algorithm:
> > > > > -	 *   1. If the objects cannot be added to the cache without crossing
> > > > > -	 *      the flush threshold, flush the cache to the backend.
> > > > > -	 *   2. Add the objects to the cache.
> > > > > -	 */
> > > > > +	/* Assert the invariant preventing overflow in the comparison below. */
> > > > > +	RTE_ASSERT(cache->len <= cache->flushthresh);
> > > > >
> > > > > -	if (cache->len + n <= cache->flushthresh) {
> > > > > +	if (n <= cache->flushthresh - cache->len) {
> > > > > +		/*
> > > > > +		 * The objects can be added to the cache without crossing the
> > > > > +		 * flush threshold.
> > > > > +		 */
> > > > >  		cache_objs = &cache->objs[cache->len];
> > > > >  		cache->len += n;
> > > > > -	} else {
> > > > > +	} else if (likely(n <= cache->flushthresh)) {
> > > >
> > > > IMO, this is a misconfiguration on the application part. In the
> > > > PMDs I have looked at, the max value of 'n' is controlled by
> > > > compile-time constants. The application could do a compile-time
> > > > check on the cache threshold, or we could have another RTE_ASSERT
> > > > on this.
> > >
> > > There could be applications using a mempool for something other
> > > than mbufs.
> >
> > Agree
> >
> > > In that case, the application should be allowed to get/put many
> > > objects in one transaction.
> >
> > Still, this is a misconfiguration on the application. On one hand,
> > the threshold is configured for 'x', but they are sending a request
> > which is more than 'x'. It should be possible to change the
> > threshold configuration or reduce the request size.
> >
> > If the application does not fix the misconfiguration, it is possible
> > that it will always hit this case and never get the benefit of using
> > the per-core cache.
>
> Correct. I suppose this is the intended behavior of this API.
>
> The zero-copy API proposed in another patch [1] has stricter
> requirements on the bulk size.
>
> [1]: http://inbox.dpdk.org/dev/20221115161822.70886-1-m...@smartsharesystems.com/T/#u
>
> > With this check, we are introducing an additional memcpy as well.
> > I am not sure if reusing the latest buffers is better than having
> > a memcpy.
>
> There is no additional memcpy. The large bulk transfer is stored
> directly in the backend pool, bypassing the mempool cache.
>
> Please note that this check is not new; it has just been moved.
> Before this patch, it was checked on every call (if a cache is
> present); with this patch, it is only checked if the entire request
> cannot go directly into the cache.
>
> > > > > +		/*
> > > > > +		 * The request itself fits into the cache.
> > > > > +		 * But first, the cache must be flushed to the backend, so
> > > > > +		 * adding the objects does not cross the flush threshold.
> > > > > +		 */
> > > > >  		cache_objs = &cache->objs[0];
> > > > >  		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> > > > >  		cache->len = n;
> > > > > +	} else {
> > > > > +		/* The request itself is too big for the cache. */
> > > > > +		goto driver_enqueue_stats_incremented;
> > > > >  	}
> > > > >
> > > > >  	/* Add the objects to the cache. */
> > > > > @@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
> > > > >
> > > > >  driver_enqueue:
> > > > >
> > > > > -	/* increment stat now, adding in mempool always success */
> > > > > +	/* Increment stats now, adding in mempool always succeeds. */
> > > > >  	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> > > > >  	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> > > > >
> > > > >  driver_enqueue_stats_incremented:
> > > > >
> > > > > -	/* push objects to the backend */
> > > > > +	/* Push the objects to the backend. */
> > > > >  	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);
> > > > >  }
> > > > >
> > > > > --
> > > > > 2.17.1
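
To illustrate the v2 overflow fix discussed above: a minimal standalone
sketch, not DPDK code. The function names and values are hypothetical;
len, flushthresh and n stand in for cache->len, cache->flushthresh and
the request size. With unsigned arithmetic, "len + n" can wrap around
when n is huge, whereas "flushthresh - len" cannot wrap as long as the
asserted invariant len <= flushthresh holds.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Old comparison: "len + n" may wrap around to a small value. */
static int
fits_old(uint32_t len, uint32_t flushthresh, uint32_t n)
{
	return len + n <= flushthresh;
}

/* New comparison: "flushthresh - len" cannot wrap, given the invariant. */
static int
fits_new(uint32_t len, uint32_t flushthresh, uint32_t n)
{
	assert(len <= flushthresh);
	return n <= flushthresh - len;
}

int
main(void)
{
	uint32_t len = 16, flushthresh = 512;
	uint32_t n = UINT32_MAX - 8;	/* huge n: len + n wraps around to 7 */

	printf("old: %d (false positive)\n", fits_old(len, flushthresh, n));
	printf("new: %d (correct)\n", fits_new(len, flushthresh, n));
	return 0;
}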
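
And a sketch of the compile-time check suggested above, assuming the
application's maximum put burst and the mempool cache size are both
compile-time constants (the APP_* names are hypothetical). The flush
threshold is derived from the cache size and is no smaller than it, so
bounding the burst by the cache size conservatively guarantees that the
request also fits within the flush threshold.

#include <assert.h>	/* static_assert, C11 */

/* Hypothetical application constants. */
#define APP_MEMPOOL_CACHE_SIZE	512	/* cache_size given to rte_mempool_create() */
#define APP_MAX_PUT_BURST	64	/* largest n used with rte_mempool_put_bulk() */

/* Fail the build if a put burst could ever exceed the cache. */
static_assert(APP_MAX_PUT_BURST <= APP_MEMPOOL_CACHE_SIZE,
	"max put burst exceeds mempool cache size");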