Hi, Morten

> -----Original Message-----
> From: Morten Brørup <m...@smartsharesystems.com>
> Sent: Thursday, February 9, 2023 5:34 PM
> To: Kamalakshitha Aligeri <kamalakshitha.alig...@arm.com>;
> yuying.zh...@intel.com; beilei.x...@intel.com; olivier.m...@6wind.com;
> andrew.rybche...@oktetlabs.ru; bruce.richard...@intel.com;
> konstantin.anan...@huawei.com; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>
> Cc: dev@dpdk.org; nd <n...@arm.com>; Ruifeng Wang
> <ruifeng.w...@arm.com>; Feifei Wang <feifei.wa...@arm.com>
> Subject: RE: [PATCH 1/2] net/i40e: replace put function
> 
> > From: Kamalakshitha Aligeri [mailto:kamalakshitha.alig...@arm.com]
> > Sent: Thursday, 9 February 2023 07.25
> >
> > Integrated zero-copy put API in mempool cache in i40e PMD.
> > On Ampere Altra server, l3fwd single core's performance improves by 5%
> > with the new API
> >
> > Signed-off-by: Kamalakshitha Aligeri <kamalakshitha.alig...@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com>
> > Reviewed-by: Feifei Wang <feifei.wa...@arm.com>
> > ---
> > Link:
> > https://patchwork.dpdk.org/project/dpdk/patch/20221227151700.80887-1-
> > m...@smartsharesystems.com/
> >
> >  .mailmap                                |  1 +
> >  drivers/net/i40e/i40e_rxtx_vec_common.h | 34 ++++++++++++++++++++-----
> >  2 files changed, 28 insertions(+), 7 deletions(-)
> >
> > diff --git a/.mailmap b/.mailmap
> > index 75884b6fe2..05a42edbcf 100644
> > --- a/.mailmap
> > +++ b/.mailmap
> > @@ -670,6 +670,7 @@ Kai Ji <kai...@intel.com>
> >  Kaiwen Deng <kaiwenx.d...@intel.com>
> >  Kalesh AP <kalesh-anakkur.pura...@broadcom.com>
> >  Kamalakannan R <kamalakanna...@intel.com>
> > +Kamalakshitha Aligeri <kamalakshitha.alig...@arm.com>
> >  Kamil Bednarczyk <kamil.bednarc...@intel.com>
> >  Kamil Chalupnik <kamilx.chalup...@intel.com>
> >  Kamil Rytarowski <kamil.rytarow...@caviumnetworks.com>
> > diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h
> > b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > index fe1a6ec75e..80d4a159e6 100644
> > --- a/drivers/net/i40e/i40e_rxtx_vec_common.h
> > +++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
> > @@ -95,17 +95,37 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
> >
> >     n = txq->tx_rs_thresh;
> >
> > -    /* first buffer to free from S/W ring is at index
> > -     * tx_next_dd - (tx_rs_thresh-1)
> > -     */
> > +   /* first buffer to free from S/W ring is at index
> > +    * tx_next_dd - (tx_rs_thresh-1)
> > +    */
> >     txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
> >
> >     if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
> > -           for (i = 0; i < n; i++) {
> > -                   free[i] = txep[i].mbuf;
> > -                   /* no need to reset txep[i].mbuf in vector path */
> > +           struct rte_mempool *mp = txep[0].mbuf->pool;
> > +           struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, rte_lcore_id());
> > +
> > +           if (!cache || n > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> 
> If the mempool has a cache, do not compare n to
> RTE_MEMPOOL_CACHE_MAX_SIZE. Instead, call
> rte_mempool_cache_zc_put_bulk() to determine if n is acceptable for zero-
> copy.
> 

> It looks like this patch behaves incorrectly if the cache is configured to be
> smaller than RTE_MEMPOOL_CACHE_MAX_SIZE. Let's say the cache size is 8,
> which will make the flush threshold 12. If n is 32, your code will not enter 
> this
> branch, but proceed to call rte_mempool_cache_zc_put_bulk(), which will
> return NULL, and then you will goto done.
> 
> Obviously, if there is no cache, fall back to the standard
> rte_mempool_put_bulk().

Agreed. The patch overlooks the case where (cache->flushthresh < n <
RTE_MEMPOOL_CACHE_MAX_SIZE).

Our goal is: if (!cache || n > cache->flushthresh), the buffers should be put
back into the mempool directly.

Thus maybe we can change it to:

struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, rte_lcore_id());

if (!cache || n > cache->flushthresh) {
	for (i = 0; i < n; i++)
		free[i] = txep[i].mbuf;
	if (!cache) {
		rte_mempool_generic_put(mp, (void **)free, n, cache);
		goto done;
	} else {
		rte_mempool_ops_enqueue_bulk(mp, (void **)free, n);
		goto done;
	}
}

Would a change like this be acceptable?

> 
> > +                   for (i = 0; i < n ; i++)
> > +                           free[i] = txep[i].mbuf;
> > +                   if (!cache) {
> > +                           rte_mempool_generic_put(mp, (void **)free, n, cache);
> > +                           goto done;
> > +                   }
> > +                   if (n > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> > +                           rte_mempool_ops_enqueue_bulk(mp, (void **)free, n);
> > +                           goto done;
> > +                   }
> > +           }
> > +           void **cache_objs;
> > +
> > +           cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
> > +           if (cache_objs) {
> > +                   for (i = 0; i < n; i++) {
> > +                           cache_objs[i] = txep->mbuf;
> > +                           /* no need to reset txep[i].mbuf in vector path */
> > +                           txep++;
> > +                   }
> >             }
> > -           rte_mempool_put_bulk(free[0]->pool, (void **)free, n);
> >             goto done;
> >     }
> >
> > --
> > 2.25.1
> >
