> From: huangdengdui [mailto:huangdeng...@huawei.com] > Sent: Wednesday, 15 January 2025 07.52 > > On 2025/1/15 0:39, Morten Brørup wrote: > > mbuf: add fast free bulk function > > > > When putting an mbuf back into its mempool, there are certain > requirements > > to the mbuf. Specifically, some of its fields must be initialized. > > > > These requirements are in fact invariants about free mbufs, held in > > mempools, and thus also apply when allocating an mbuf from a mempool. > > With this in mind, the additional assertions in rte_mbuf_raw_free() > were > > moved to __rte_mbuf_raw_sanity_check(). > > Furthermore, the assertion regarding pinned external buffer was > enhanced; > > it now also asserts that the referenced pinned external buffer has > > refcnt == 1. > > > > The description of RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE was updated to > > include the remaining requirements, which were missing here. > > > > And finally: > > A new rte_mbuf_fast_free_bulk() inline function was added for the > benefit > > of ethdev drivers supporting fast release of mbufs. > > It asserts these requirements and that the mbufs belong to the > specified > > mempool, and then calls rte_mempool_put_bulk(). > > > > Signed-off-by: Morten Brørup <m...@smartsharesystems.com> > > --- > > v2: > > * Fixed missing inline. > > --- > > lib/ethdev/rte_ethdev.h | 6 ++++-- > > lib/mbuf/rte_mbuf.h | 39 +++++++++++++++++++++++++++++++++++++-- > > 2 files changed, 41 insertions(+), 4 deletions(-) > > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > > index 1f71cad244..e9267fca79 100644 > > --- a/lib/ethdev/rte_ethdev.h > > +++ b/lib/ethdev/rte_ethdev.h > > @@ -1612,8 +1612,10 @@ struct rte_eth_conf { > > #define RTE_ETH_TX_OFFLOAD_MULTI_SEGS RTE_BIT64(15) > > /** > > * Device supports optimization for fast release of mbufs. > > - * When set application must guarantee that per-queue all mbufs > comes from > > - * the same mempool and has refcnt = 1. > > + * When set application must guarantee that per-queue all mbufs come > from the same mempool, > > + * are direct, have refcnt=1, next=NULL and nb_segs=1, as done by > rte_pktmbuf_prefree_seg(). > > + * > > + * @see rte_mbuf_fast_free_bulk() > > */ > > #define RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE RTE_BIT64(16) > > #define RTE_ETH_TX_OFFLOAD_SECURITY RTE_BIT64(17) > > diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h > > index 0d2e0e64b3..7590d82689 100644 > > --- a/lib/mbuf/rte_mbuf.h > > +++ b/lib/mbuf/rte_mbuf.h > > @@ -568,6 +568,10 @@ __rte_mbuf_raw_sanity_check(__rte_unused const > struct rte_mbuf *m) > > RTE_ASSERT(rte_mbuf_refcnt_read(m) == 1); > > RTE_ASSERT(m->next == NULL); > > RTE_ASSERT(m->nb_segs == 1); > > + RTE_ASSERT(!RTE_MBUF_CLONED(m)); > > + RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m) || > > + (RTE_MBUF_HAS_PINNED_EXTBUF(m) && > > + rte_mbuf_ext_refcnt_read(m->shinfo) == 1)); > > __rte_mbuf_sanity_check(m, 0); > > } > > > > @@ -623,12 +627,43 @@ static inline struct rte_mbuf > *rte_mbuf_raw_alloc(struct rte_mempool *mp) > > static __rte_always_inline void > > rte_mbuf_raw_free(struct rte_mbuf *m) > > { > > - RTE_ASSERT(!RTE_MBUF_CLONED(m) && > > - (!RTE_MBUF_HAS_EXTBUF(m) || > RTE_MBUF_HAS_PINNED_EXTBUF(m))); > > __rte_mbuf_raw_sanity_check(m); > > rte_mempool_put(m->pool, m); > > } > > > > +/** > > + * Put a bulk of mbufs allocated from the same mempool back into the > mempool. > > + * > > + * The caller must ensure that the mbufs come from the specified > mempool, > > + * are direct and properly reinitialized (refcnt=1, next=NULL, > nb_segs=1), as done by > > + * rte_pktmbuf_prefree_seg(). > > + * > > + * This function should be used with care, when optimization is > > + * required. For standard needs, prefer rte_pktmbuf_free_bulk(). > > + * > > + * @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE > > + * > > + * @param mp > > + * The mempool to which the mbufs belong. > > + * @param mbufs > > + * Array of pointers to packet mbufs. > > + * The array must not contain NULL pointers. > > + * @param count > > + * Array size. > > + */ > > +static __rte_always_inline void > > +rte_mbuf_fast_free_bulk(struct rte_mempool *mp, struct rte_mbuf > **mbufs, unsigned int count) > > +{ > > + for (unsigned int idx = 0; idx < count; idx++) { > > + const struct rte_mbuf *m = mbufs[idx]; > > + RTE_ASSERT(m != NULL); > > + RTE_ASSERT(m->pool == mp); > > + __rte_mbuf_raw_sanity_check(m); > > + } > > Is there some way to avoid executing a loop in non-debug mode? Like the > following or other better way > > #ifdef RTE_LIBRTE_MBUF_DEBUG > { > for (unsigned int idx = 0; idx < count; idx++) { > const struct rte_mbuf *m = mbufs[idx]; > RTE_ASSERT(m != NULL); > RTE_ASSERT(m->pool == mp); > __rte_mbuf_raw_sanity_check(m); > } > } > #endif
The loop is already omitted in non-debug mode: RTE_ASSERT() [1] is omitted unless RTE_ENABLE_ASSERT is set. __rte_mbuf_raw_sanity_check() [2] consist of some RTE_ASSERTs and __rte_mbuf_sanity_check(). __rte_mbuf_sanity_check() [3] is omitted unless RTE_LIBRTE_MBUF_DEBUG is set. So the compiler will detect that the loop has no effect, and optimize away the loop. [1]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/eal/include/rte_debug.h#L46 [2]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L566 [3]: https://elixir.bootlin.com/dpdk/v24.11.1/source/lib/mbuf/rte_mbuf.h#L348 > > > + > > + rte_mempool_put_bulk(mp, (void **)mbufs, count); > > Can the mp be obtained from the mbuf? Passing "mp" as a parameter avoids a potential CPU cache miss by not dereferencing the first mbuf, if the driver/application already has the mempool pointer readily available (and hot in the CPU cache) from somewhere else. If the driver/or application doesn't have the mempool pointer readily available, it can obtain it from the mbuf when calling this function. And as a bonus side effect, passing "mp" as a parameter allows calling the function with count=0 without special handling inside the function. Obviously, if the driver/application gets "mp" from mbuf[0]->pool, it needs to first check that count>0; but that would be the situation for the driver/application whenever it accesses an mbuf array, regardless what it is doing with that array.