<snip>

>
> > From: Kamalakshitha Aligeri [mailto:kamalakshitha.alig...@arm.com]
> > Sent: Friday, 10 February 2023 07.54
> >
> > Integrated zero-copy put API in mempool cache in i40e PMD.
> > On Ampere Altra server, l3fwd single core's performance improves by 5%
> > with the new API
> >
> > Signed-off-by: Kamalakshitha Aligeri <kamalakshitha.alig...@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > Reviewed-by: Feifei Wang <feifei.wa...@arm.com>
> > ---
> > Link:
> > https://patchwork.dpdk.org/project/dpdk/patch/20230209145833.129986-1-
> > m...@smartsharesystems.com/
>
> If you agree with the referred patch, please review or acknowledge it on the
> mailing list, so it can be merged.
>
> >
> >  .mailmap                                |  1 +
> >  drivers/net/i40e/i40e_rxtx_vec_common.h | 28 ++++++++++++++++++++-----
> >  2 files changed, 24 insertions(+), 5 deletions(-)
> >
> > diff --git a/.mailmap b/.mailmap
> > index 75884b6fe2..05a42edbcf 100644
> > --- a/.mailmap
> > +++ b/.mailmap
> > @@ -670,6 +670,7 @@ Kai Ji <kai...@intel.com>
> >  Kaiwen Deng <kaiwenx.d...@intel.com>
> >  Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
> >  Kamalakannan R <kamalakanna...@intel.com>
> > +Kamalakshitha Aligeri <kamalakshitha.alig...@arm.com>
> >  Kamil Bednarczyk <kamil.bednarc...@intel.com>
> >  Kamil Chalupnik <kamilx.chalup...@intel.com>
> >  Kamil Rytarowski <kamil.rytarow...@caviumnetworks.com>
> > diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h
> > b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > index fe1a6ec75e..113599d82b 100644
> > --- a/drivers/net/i40e/i40e_rxtx_vec_common.h
> > +++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > @@ -95,18 +95,36 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
> >
> >  	n = txq->tx_rs_thresh;
> >
> > -	/* first buffer to free from S/W ring is at index
> > -	 * tx_next_dd - (tx_rs_thresh-1)
> > -	 */
> > +	/* first buffer to free from S/W ring is at index
> > +	 * tx_next_dd - (tx_rs_thresh-1)
> > +	 */
> >  	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
> >
> >  	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
> > +		struct rte_mempool *mp = txep[0].mbuf->pool;
> > +		struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, rte_lcore_id());
> > +		void **cache_objs;
> > +
> > +		if (unlikely(!cache))
> > +			goto fallback;
> > +
> > +		cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
> > +		if (unlikely(!cache_objs))
> > +			goto fallback;
> > +
> >  		for (i = 0; i < n; i++) {
> > -			free[i] = txep[i].mbuf;
> > +			cache_objs[i] = txep->mbuf;
> >  			/* no need to reset txep[i].mbuf in vector path */
> > +			txep++;
>
> Why the change from "xyz[i] = txep[i].mbuf;" to "xyz[i] = txep->mbuf;
> txep++;"? I don't see "txep" being used after the "done" label. And at the
> fallback, you still use "xyz[i] = txep[i].mbuf;". It would look cleaner using
> the same method in both places.

+1

>
> It's not important, so feel free to keep as is or change as suggested. Both
> ways,
>
> Acked-by: Morten Brørup <m...@smartsharesystems.com>
>
> >  		}
> > -		rte_mempool_put_bulk(free[0]->pool, (void **)free, n);
> >  		goto done;
> > +
> > +fallback:
> > +		for (i = 0; i < n; i++)
> > +			free[i] = txep[i].mbuf;
> > +		rte_mempool_generic_put(mp, (void **)free, n, cache);
> > +		goto done;
> > +
> >  	}
> >
> >  	m = rte_pktmbuf_prefree_seg(txep[0].mbuf);
> > --
> > 2.25.1
> >
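
To illustrate the point about the two loop styles, here is a minimal standalone sketch. It is not DPDK code: struct toy_cache, struct toy_entry, toy_zc_put_bulk() and the other toy_* names are hypothetical stand-ins invented for this example, and the reservation logic is a simplification I assume for illustration only. It shows a "reserve slots, then store in place" put, and that indexing with txep[i] and walking with txep++ fill the reserved slots identically.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SIZE 8

/* Hypothetical stand-in for a per-lcore mempool cache (not the DPDK struct). */
struct toy_cache {
	size_t len;                 /* number of object pointers currently held */
	void *objs[TOY_CACHE_SIZE]; /* cached object pointers */
};

/* Hypothetical stand-in for a Tx software ring entry. */
struct toy_entry {
	void *mbuf;
};

/*
 * Toy zero-copy put: reserve n slots in the cache and return a pointer to
 * them so the caller can store objects in place. Returns NULL when the
 * request does not fit, which is the caller's cue to take a fallback path.
 */
static void **toy_zc_put_bulk(struct toy_cache *c, size_t n)
{
	void **slots;

	assert(c->len <= TOY_CACHE_SIZE);
	if (n > TOY_CACHE_SIZE - c->len)
		return NULL;
	slots = &c->objs[c->len];
	c->len += n;
	return slots;
}

/* Fill the reserved slots using array indexing on both sides. */
static int toy_free_indexed(struct toy_cache *c, const struct toy_entry *txep,
			    size_t n)
{
	void **cache_objs = toy_zc_put_bulk(c, n);
	size_t i;

	if (cache_objs == NULL)
		return -1;
	for (i = 0; i < n; i++)
		cache_objs[i] = txep[i].mbuf;
	return 0;
}

/* Same work, but walking txep with a pointer increment. */
static int toy_free_walking(struct toy_cache *c, const struct toy_entry *txep,
			    size_t n)
{
	void **cache_objs = toy_zc_put_bulk(c, n);
	size_t i;

	if (cache_objs == NULL)
		return -1;
	for (i = 0; i < n; i++) {
		cache_objs[i] = txep->mbuf;
		txep++;
	}
	return 0;
}

/* Returns 1 if both loop styles leave identical cache contents, else 0. */
int toy_styles_match(void)
{
	int bufs[4];
	struct toy_entry ring[4];
	struct toy_cache a, b;
	size_t i;

	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));
	for (i = 0; i < 4; i++)
		ring[i].mbuf = &bufs[i];

	if (toy_free_indexed(&a, ring, 4) != 0 ||
	    toy_free_walking(&b, ring, 4) != 0)
		return 0;
	return a.len == b.len && memcmp(a.objs, b.objs, sizeof(a.objs)) == 0;
}
```

In this toy version the two helpers are interchangeable, so the choice between them is purely about readability, which is why using one style in both the zero-copy loop and the fallback loop would look cleaner.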