> > Subject: Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance
> > drop
> >
> > On 2/9/2024 1:10 PM, Rahul Bhansali wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Ferruh Yigit <ferruh.yi...@amd.com>
> > >> Sent: Wednesday, February 7, 2024 4:06 PM
> > >> To: Rahul Bhansali <rbhans...@marvell.com>; dev@dpdk.org; Radu
> > >> Nicolau <radu.nico...@intel.com>; Akhil Goyal <gak...@marvell.com>;
> > >> Konstantin Ananyev <konstantin.anan...@huawei.com>; Anoob Joseph
> > >> <ano...@marvell.com>
> > >> Subject: Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec
> > >> performance drop
> > >>
> > >> On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
> > >>>> Sent: Tuesday, February 6, 2024 11:55 PM
> > >>>> To: Rahul Bhansali <rbhans...@marvell.com>; dev@dpdk.org; Radu
> > >>>> Nicolau <radu.nico...@intel.com>; Akhil Goyal <gak...@marvell.com>;
> > >>>> Konstantin Ananyev <konstantin.anan...@huawei.com>; Anoob Joseph
> > >>>> <ano...@marvell.com>
> > >>>> Subject: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec
> > >>>> performance drop
> > >>>>
> > >>>> External Email
> > >>>>
> > >>>> ----------------------------------------------------------------------
> > >>>> On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
> > >>>>> Single packet free using rte_pktmbuf_free_bulk() is dropping the
> > >>>>> performance. On cn10k, maximum of ~4% drop observed for IPsec
> > >>>>> event mode single SA outbound case.
> > >>>>>
> > >>>>> To fix this issue, single packet free will use rte_pktmbuf_free API.
> > >>>>>
> > >>>>> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
> > >>>>>
> > >>>>> Signed-off-by: Rahul Bhansali <rbhans...@marvell.com>
> > >>>>> ---
> > >>>>>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
> > >>>>>  1 file changed, 3 insertions(+), 4 deletions(-)
> > >>>>>
> > >>>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.h
> > >>>>> b/examples/ipsec-secgw/ipsec-secgw.h
> > >>>>> index 8baab44ee7..ec33a982df 100644
> > >>>>> --- a/examples/ipsec-secgw/ipsec-secgw.h
> > >>>>> +++ b/examples/ipsec-secgw/ipsec-secgw.h
> > >>>>> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
> > >>>>>  }
> > >>>>>
> > >>>>>  /* helper routine to free bulk of packets */
> > >>>>> -static inline void
> > >>>>> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
> > >>>>> +static __rte_always_inline void
> > >>>>> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
> > >>>>>  {
> > >>>>> -     rte_pktmbuf_free_bulk(mb, n);
> > >>>>> -
> > >>>>> +     n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
> > >>>>>       core_stats_update_drop(n);
> > >>>>>  }
> > >>>>>
> > >>>>
> > >>>> Hi Rahul,
> > >>>>
> > >>>> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be
> > >>>> improved by similar change?
> > >>>
> > >>> Hi Ferruh,
> > >>> Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make it and
> > >>> '__rte_pktmbuf_free_seg_via_array()' both inline, then performance can
> > >>> be improved similarly.
> > >>>
> > >>
> > >> Ah, so performance improvement is coming from 'rte_pktmbuf_free()'
> > >> being inline, OK.
> > >>
> > >> As you are doing performance testing in that area, can you please
> > >> check if '__rte_pktmbuf_free_seg_via_array()' is inlined, as it is
> > >> static function I expect it to be inlined. If not, can you please
> > >> test with force inlining it (__rte_always_inline)?
> > > It was not inlined. I also checked with force inline and saw no impact,
> > > so I can make it force inline.
> > >
> >
> > If there is no performance improvement, I think no need to force inline
> > '__rte_pktmbuf_free_seg_via_array()'.
> >
> > >>
> > >>
> > >> And I wonder, if the bulk() API getting a single mbuf is a common
> > >> theme, does it make sense to add a new inline wrapper to the library
> > >> to cover this case, if it is bringing ~4% improvement, like:
> > >> ```
> > >> static inline void
> > >> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n)
> > >> {
> > >>  if (n == 1)
> > >>          return rte_pktmbuf_free(mb[0]);
> > >>  return rte_pktmbuf_free_bulk(mb, n);
> > >> }
> > >> ```
> > > Agree, I can make this wrapper to cover the case where the bulk free API
> > > is called but might have a single mbuf, to get better perf. The
> > > "if (n == 1)" can be further optimized with a compile-time constant check:
> > > ```
> > > static inline void
> > > rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
> > > {
> > >        if (__builtin_constant_p(n) && (n == 1))
> > >                rte_pktmbuf_free(mb[0]);
> > >        else
> > >                rte_pktmbuf_free_bulk(mb, n);
> > > }
> > > ```
> > > Let me know if it is fine. I'll send v2. And this will be
> > > "__rte_experimental", right?
> > >
> >
> > A compile-time constant check can prevent the penalty from the additional
> > check, which is good, and I can see this can work for the
> > examples/ipsec-secgw usecase above, which has some hardcoded single mbuf
> > free calls.
> >
> > But in most of the other usecases I think 'n' won't be known at compile
> > time, so the API will be effectively the same as free_bulk().
> Agree.
> >
> > If you have it with a runtime check, do you still observe any performance
> > improvement? If not, perhaps we can go only with the example code update,
> > without a new API.
> With the runtime check, the performance improvement is only small compared
> to the compile-time check. So we can continue without this new API.

Acked-by: Akhil Goyal <gak...@marvell.com>
Applied to dpdk-next-crypto
Thanks.
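
For reference, a minimal standalone sketch of the two variants compared in
the thread (runtime check vs. compile-time constant check). It is not the
applied patch and does not use DPDK: stub_free() and stub_free_bulk() are
hypothetical stand-ins for rte_pktmbuf_free() and rte_pktmbuf_free_bulk()
that only count calls, and struct mbuf is a placeholder type. It assumes
GCC or Clang for __builtin_constant_p() and the always_inline attribute.

```c
/* Hypothetical stand-ins so the sketch builds without DPDK. */
struct mbuf { int id; };

static unsigned int single_calls; /* calls to the single-free stub */
static unsigned int bulk_calls;   /* calls to the bulk-free stub */
static unsigned int freed;        /* total mbufs "freed" */

/* Stand-in for the inline single-mbuf free. */
static inline void
stub_free(struct mbuf *m)
{
	(void)m;
	single_calls++;
	freed++;
}

/* Stand-in for the non-inline bulk free. */
static void
stub_free_bulk(struct mbuf **mb, unsigned int n)
{
	(void)mb;
	bulk_calls++;
	freed += n;
}

/* Variant 1: runtime check, always takes the single path when n == 1. */
static inline void
free_pkts_runtime(struct mbuf **mb, unsigned int n)
{
	if (n == 1)
		stub_free(mb[0]);
	else
		stub_free_bulk(mb, n);
}

/*
 * Variant 2: compile-time check. The single path is taken only when the
 * compiler can prove n is the constant 1 at the call site; otherwise the
 * call falls through to the bulk path with no extra runtime branch.
 */
static inline __attribute__((always_inline)) void
free_pkts_const(struct mbuf **mb, unsigned int n)
{
	if (__builtin_constant_p(n) && n == 1)
		stub_free(mb[0]);
	else
		stub_free_bulk(mb, n);
}
```

Both variants free the same number of mbufs; they differ only in which
path handles n == 1. For a function parameter, __builtin_constant_p(n)
typically evaluates to true only when the call is inlined with a literal
argument and optimization is enabled, so in an unoptimized build variant 2
generally takes the bulk path even for free_pkts_const(mb, 1).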
