Hi Andrew Rybchenko, This patch is merged into dpdk22.11.0-rc1 we execute meson test driver/eventdev_selftest_sw failed. Could you please have a look at it, also submitted a Bugzilla ticket: https://bugs.dpdk.org/show_bug.cgi?id=1101
Regards, Li, Weiyuan > -----Original Message----- > From: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru> > Sent: Friday, October 7, 2022 6:45 PM > To: Matz, Olivier <olivier.m...@6wind.com> > Cc: dev@dpdk.org; Morten Brørup <m...@smartsharesystems.com> > Subject: [PATCH v4] mempool: fix get objects from mempool with cache > > From: Morten Brørup <m...@smartsharesystems.com> > > A flush threshold for the mempool cache was introduced in DPDK version 1.3, > but rte_mempool_do_generic_get() was not completely updated back then, > and some inefficiencies were introduced. > > Fix the following in rte_mempool_do_generic_get(): > > 1. The code that initially screens the cache request was not updated with the > change in DPDK version 1.3. > The initial screening compared the request length to the cache size, which > was correct before, but became irrelevant with the introduction of the flush > threshold. E.g. the cache can hold up to flushthresh objects, which is more > than its size, so some requests were not served from the cache, even though > they could be. > The initial screening has now been corrected to match the initial screening in > rte_mempool_do_generic_put(), which verifies that a cache is present, and > that the length of the request does not overflow the memory allocated for > the cache. > > This bug caused a major performance degradation in scenarios where the > application burst length is the same as the cache size. In such cases, the > objects were not ever fetched from the mempool cache, regardless if they > could have been. > This scenario occurs e.g. if an application has configured a mempool with a > size matching the application's burst size. > > 2. The function is a helper for rte_mempool_generic_get(), so it must > behave according to the description of that function. > Specifically, objects must first be returned from the cache, subsequently > from the backend. > After the change in DPDK version 1.3, this was not the behavior when the > request was partially satisfied from the cache; instead, the objects from the > backend were returned ahead of the objects from the cache. > This bug degraded application performance on CPUs with a small L1 cache, > which benefit from having the hot objects first in the returned array. > (This is probably also the reason why the function returns the objects in > reverse order, which it still does.) Now, all code paths first return objects > from the cache, subsequently from the backend. > > The function was not behaving as described (by the function using it) and > expected by applications using it. This in itself is also a bug. > > 3. If the cache could not be backfilled, the function would attempt to get all > the requested objects from the backend (instead of only the number of > requested objects minus the objects available in the backend), and the > function would fail if that failed. > Now, the first part of the request is always satisfied from the cache, and if > the subsequent backfilling of the cache from the backend fails, only the > remaining requested objects are retrieved from the backend. > > The function would fail despite there are enough objects in the cache plus > the common pool. > > 4. The code flow for satisfying the request from the cache was slightly > inefficient: > The likely code path where the objects are simply served from the cache was > treated as unlikely. Now it is treated as likely. > > Signed-off-by: Morten Brørup <m...@smartsharesystems.com> > Signed-off-by: Andrew Rybchenko <andrew.rybche...@oktetlabs.ru> > Reviewed-by: Morten Brørup <m...@smartsharesystems.com> > --- > v4 changes (Andrew Rybchenko) > - Avoid usage of misleading ring, since other mempool drivers > exist, use term backend > - Avoid term ring in goto label, use driver_dequeue as a label name > - Add likely() to cache != NULL in driver dequeue, just for symmetry > - Highlight that remaining objects are deqeueued from the driver > > v3 changes (Andrew Rybchenko) > - Always get first objects from the cache even if request is bigger > than cache size. Remove one corresponding condition from the path > when request is fully served from cache. > - Simplify code to avoid duplication: > - Get objects directly from backend in single place only. > - Share code which gets from the cache first regardless if > everythihg is obtained from the cache or just the first part. > - Rollback cache length in unlikely failure branch to avoid cache > vs NULL check in success branch. > > v2 changes > - Do not modify description of return value. This belongs in a separate doc > fix. > - Elaborate even more on which bugs the modifications fix. > > lib/mempool/rte_mempool.h | 80 +++++++++++++++++++++++++----------- > ---