> > Hi Morten,
> >
> > Good patch. Please see below.
> >
> > On Sat, 26 Jul 2025, Morten Brørup wrote:
> >
> > > Added fast mbuf release, re-using the existing mbuf pool pointer
> > > in the queue structure.
> > >
> > > Signed-off-by: Morten Brørup <m...@smartsharesystems.com>
> > > ---
> > > v2:
> > > * Also announce the offload as a per-queue capability.
> > > * Added missing test of per-device offload configuration when configuring
> > >  the queue.
> > > ---
> > > drivers/net/null/rte_eth_null.c | 33 ++++++++++++++++++++++++++++++---
> > > 1 file changed, 30 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
> > > index 8a9b74a03b..09cfc74494 100644
> > > --- a/drivers/net/null/rte_eth_null.c
> > > +++ b/drivers/net/null/rte_eth_null.c
> > > @@ -34,6 +34,17 @@ struct pmd_internals;
> > > struct null_queue {
> > >   struct pmd_internals *internals;
> > >
> > > + /**
> > > +  * For RX queue:
> > > +  *  Mempool to allocate mbufs from.
> > > +  *
> > > +  * For TX queue:
> >
> > Perhaps spell it 'Rx', 'Tx', but this is up to you.
> 
> I just checked, and it seems all three spellings "rx", "Rx" and "RX" are 
> being used in DPDK.
> I personally prefer RX, so I'll keep that.
> 
> >
> > > +  *  Mempool to free mbufs to, if fast release of mbufs is enabled.
> > > +  *  UINTPTR_MAX if the mempool for fast release of mbufs has not yet been detected.
> > > +  *  NULL if fast release of mbufs is not enabled.
> > > +  *
> > > +  *  @see RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE
> > > +  */
> >
> > Maybe it would be better to have a separate 'tx_pkt_burst' callback, to
> > avoid conditional checks below. Though, I understand this will downgrade
> > the per-queue capability to per-port only, so feel free to disregard this
> > point.
> 
> I considered this, and I can imagine an application using FAST_FREE for its
> primary queues, and normal free for some secondary queues.
> Looking at other drivers - which have implemented a runtime check, not
> separate callbacks for FAST_FREE - I guess they came to the same conclusion.
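
Makes sense. For reference, a minimal sketch of that runtime-check pattern
(the structure and names below are hypothetical, not from any real PMD):

	#include <rte_mbuf.h>

	struct sketch_txq {
		/* Pool to raw-free to when FAST_FREE is enabled, else NULL. */
		struct rte_mempool *fast_free_pool;
	};

	static uint16_t
	sketch_tx_burst(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
	{
		struct sketch_txq *txq = q;

		/* ... descriptor filling would happen here ... */

		if (txq->fast_free_pool != NULL)
			/* FAST_FREE contract: one pool, refcnt == 1, direct mbufs. */
			rte_mbuf_raw_free_bulk(txq->fast_free_pool, bufs, nb_bufs);
		else
			rte_pktmbuf_free_bulk(bufs, nb_bufs);
		return nb_bufs;
	}

So one branch per burst, instead of doubling the callback set.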
> 
> >
> > >   struct rte_mempool *mb_pool;
> > >   void *dummy_packet;
> > >
> > > @@ -151,7 +162,16 @@ eth_null_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
> > >   for (i = 0; i < nb_bufs; i++)
> > >           bytes += rte_pktmbuf_pkt_len(bufs[i]);
> > >
> > > - rte_pktmbuf_free_bulk(bufs, nb_bufs);
> > > + if (h->mb_pool != NULL) { /* RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE */
> > > +         if (unlikely(h->mb_pool == (void *)UINTPTR_MAX)) {
> > > +                 if (unlikely(nb_bufs == 0))
> > > +                         return 0; /* Do not dereference uninitialized bufs[0]. */
> > > +                 h->mb_pool = bufs[0]->pool;
> > > +         }
> > > +         rte_mbuf_raw_free_bulk(h->mb_pool, bufs, nb_bufs);
> > > + } else {
> > > +         rte_pktmbuf_free_bulk(bufs, nb_bufs);
> > > + }
> > >   rte_atomic_fetch_add_explicit(&h->tx_pkts, nb_bufs, rte_memory_order_relaxed);
> > >   rte_atomic_fetch_add_explicit(&h->tx_bytes, bytes, rte_memory_order_relaxed);
> > >
> > > @@ -259,7 +279,7 @@ static int
> > > eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> > >           uint16_t nb_tx_desc __rte_unused,
> > >           unsigned int socket_id __rte_unused,
> > > -         const struct rte_eth_txconf *tx_conf __rte_unused)
> > > +         const struct rte_eth_txconf *tx_conf)
> > > {
> > >   struct rte_mbuf *dummy_packet;
> > >   struct pmd_internals *internals;
> > > @@ -284,6 +304,10 @@ eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
> > >
> > >   internals->tx_null_queues[tx_queue_id].internals = internals;
> > >   internals->tx_null_queues[tx_queue_id].dummy_packet = dummy_packet;
> > > + internals->tx_null_queues[tx_queue_id].mb_pool =
> > > +                 (dev->data->dev_conf.txmode.offloads | tx_conf->offloads) &
> > > +                 RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE ?
> > > +                 (void *)UINTPTR_MAX : NULL;
> >
> > Given the fact that FAST_FREE and MULTI_SEGS are effectively conflicting,
> > wouldn't it be better to have a check for the presence of both flags here?
> > I'm not sure whether this is already checked at the generic layer above
> > the PMD.
> 
> Interesting thought - got me looking deeper into this.
> 
> It seems MULTI_SEGS is primarily a capability flag.
> The description of the MULTI_SEGS flag [1] could be interpreted that way too:
> /** Device supports multi segment send. */
> 
> [1]: https://elixir.bootlin.com/dpdk/v25.07/source/lib/ethdev/rte_ethdev.h#L1614

In fact, I believe it serves both purposes: reporting capabilities and
requesting offloads to enable.
A few examples that, I believe, request this offload:
https://elixir.bootlin.com/dpdk/v25.07/source/examples/ip_fragmentation/main.c#L156
https://elixir.bootlin.com/dpdk/v25.07/source/examples/ipsec-secgw/ipsec-secgw.c#L1985
https://elixir.bootlin.com/dpdk/v25.07/source/examples/ip_reassembly/main.c#L177
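
Roughly, they do something like the following at configure time (abridged
sketch, error handling omitted; port_id, nb_rxq and nb_txq are assumed to be
set up elsewhere):

	struct rte_eth_conf port_conf = {0};
	struct rte_eth_dev_info dev_info;

	rte_eth_dev_info_get(port_id, &dev_info);
	if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MULTI_SEGS)
		port_conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_MULTI_SEGS;
	rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &port_conf);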

> 
> E.g. the i40e driver offers MULTI_SEGS capability per-device, but not
> per-queue. And it doesn't use the MULTI_SEGS flag for any purpose (beyond
> capability reporting).
> 
> Furthermore, enabling MULTI_SEGS on TX (per device or per queue) wouldn't
> mean that all transmitted packets are segmented; it only means that the
> driver must be able to handle segmented packets.

Yep.
 
> I.e. MULTI_SEGS could be enabled on a device, and yet it would be acceptable 
> to enable FAST_FREE on a queue on that device.

In theory, yes... you probably can have one TX queue with FAST_FREE (no
multi-seg packets) and another TX queue serving multi-seg packets.
Again, some drivers can probably even support both offloads on the same TX
queue, as long as the conditions for the FAST_FREE offload are still
satisfied: single mempool, refcnt==1, no indirect mbufs, etc.
Though in practice, using MULTI_SEG usually implies use of all those mbuf
features that are incompatible with FAST_FREE.
BTW, I see that many of the DPDK examples do use both FAST_FREE and
MULTI_SEG.
TBH, I don't understand how that works together; from my memory, in many
cases it just shouldn't.
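To spell out the FAST_FREE conditions I mean, as a sketch (this is the
application's contract with the driver, not a check drivers perform at
runtime, and the helper name here is made up):

	#include <stdbool.h>
	#include <rte_mbuf.h>

	/* True if 'm' may legally be freed with FAST_FREE semantics to 'pool'. */
	static inline bool
	mbuf_fast_free_ok(const struct rte_mbuf *m, const struct rte_mempool *pool)
	{
		return RTE_MBUF_DIRECT(m) &&           /* no indirect/extbuf mbufs */
		       rte_mbuf_refcnt_read(m) == 1 && /* refcnt must be 1 */
		       m->pool == pool &&              /* single mempool per queue */
		       m->next == NULL;                /* single segment, per above */
	}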
 
> 
> >
> > Thank you.
> 
> Thank you for reviewing.
> 
> >
> > >
> > >   return 0;
> > > }
> > > @@ -309,7 +333,10 @@ eth_dev_info(struct rte_eth_dev *dev,
> > >   dev_info->max_rx_queues = RTE_DIM(internals->rx_null_queues);
> > >   dev_info->max_tx_queues = RTE_DIM(internals->tx_null_queues);
> > >   dev_info->min_rx_bufsize = 0;
> > > - dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_MULTI_SEGS | RTE_ETH_TX_OFFLOAD_MT_LOCKFREE;
> > > + dev_info->tx_queue_offload_capa = RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;
> > > + dev_info->tx_offload_capa = RTE_ETH_TX_OFFLOAD_MULTI_SEGS |
> > > +                 RTE_ETH_TX_OFFLOAD_MT_LOCKFREE |
> > > +                 dev_info->tx_queue_offload_capa;
> > >
> > >   dev_info->reta_size = internals->reta_size;
> > >   dev_info->flow_type_rss_offloads = internals->flow_type_rss_offloads;
> > > --
> > > 2.43.0
> > >
> > >
