On Tue, May 26, 2026 at 07:45:24PM +0200, Morten Brørup wrote:
> > From: Morten Brørup [mailto:[email protected]]
> > Sent: Tuesday, 26 May 2026 12.37
> >
> > > From: Bruce Richardson [mailto:[email protected]]
> > > Sent: Tuesday, 26 May 2026 11.40
> > >
> > [...]
> >
> > > [In all this, I am making the assumption that burst size is well
> > > less than cache size. Also, similar logic would be applicable for
> > > the inverse scenario, e.g. flush to empty (and fill burst) and
> > > fill to 75%]
> >
> > I'm not so sure about this assumption.
> > With a cache size of 512 and a bursts of 64, the cache only holds 8
> > bursts.
> > 50% is 4 bursts, and 25% is only 2 bursts.
> >
> > Using a replenish/drain level in the middle requires 5 bursts in either
> > direction to pass the edge (and trigger replenish/flush).
> > Using a replenish/drain level 25% from the edge requires only 3 bursts
> > in the wrong direction to pass the edge (and trigger replenish/flush).
> > Much higher probability with random get/put.
> >
> > >
> > > Now, all said, I tend to agree that we want to leave space for a
> > > decent size burst after a fill. That is why I think that filling to
> > > 75% is reasonable. After an alloc that triggers a fill, I don't want
> > > the cache less than 50% full, but not completely full so there is
> > > room for a free without a flush, and similarly for a free that
> > > triggers a flush, the cache should not be empty, but also should not
> > > be more than half full.
> > >
> > > One suggestion - we could always add a simple tunable that specifies
> > > the margin, or reserved entries for alloc and free. We can then guide
> > > in the docs that the value should be e.g. "zero for apps where alloc
> > > and free take place on different cores. 20%-50% of cache is
> > > recommended where alloc and free take place on the same core"
> >
> > Yes, a simple tunable is a really good idea.
> >
> > At this point, I think we should optimize for use case #1, and go for
> > the 50% fill level.
> > Then we can add a tunable to optimize for use case #2 later. I will try
> > to come up with a draft for such a follow-up patch within the next few
> > days.
>
> Adding a tunable is not so simple...
> The choice of mempool cache algorithm (drain/replenish to 50% vs.
> drain/replenish completely) should be passed via the "flags" parameter in
> rte_mempool_create(), but rte_pktmbuf_pool_create() is missing the "flags"
> parameter.
> We can add it at the next ABI breaking release.
> WDYT?
>
I don't want this to be just a binary flag with two settings; I think it
should be an actual numeric value. Can we not use function versioning to
add the new parameter to all functions needing it, without worrying about
ABI breakage?

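To make the numeric-margin idea concrete, here is a rough sketch of the
arithmetic. The helper names and the percentage-based margin_pct parameter
are made up for illustration; none of this is existing DPDK API, and it
assumes a flush/refill is triggered only when a burst would pass the edge
of the cache, as in the burst counts quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not DPDK API. margin_pct is the share of the
 * cache kept in reserve: 0 means fill completely / flush to empty,
 * 50 means always return to the middle of the cache. */

/* Cache level to aim for after an alloc has triggered a refill. */
static inline uint32_t
cache_level_after_fill(uint32_t cache_size, uint32_t margin_pct)
{
	assert(margin_pct <= 50);
	return cache_size - (cache_size * margin_pct) / 100;
}

/* Cache level to aim for after a free has triggered a flush. */
static inline uint32_t
cache_level_after_flush(uint32_t cache_size, uint32_t margin_pct)
{
	assert(margin_pct <= 50);
	return (cache_size * margin_pct) / 100;
}

/* Consecutive free bursts needed to pass the top edge from 'level'. */
static inline uint32_t
bursts_to_overflow(uint32_t cache_size, uint32_t level, uint32_t burst)
{
	return (cache_size - level) / burst + 1;
}

/* Consecutive alloc bursts needed to pass the bottom edge from 'level'. */
static inline uint32_t
bursts_to_underflow(uint32_t level, uint32_t burst)
{
	return level / burst + 1;
}
```

With a cache of 512 and bursts of 64, a margin of 50 leaves the cache at 256
and takes 5 bursts in either direction to pass an edge; a margin of 25
leaves it at 384 after a fill and takes only 3 free bursts to pass the top
edge. Those are the same counts as in the quoted discussion.
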
/Bruce

