> -----Original Message-----
> From: Ferruh Yigit <ferruh.yi...@amd.com>
> Sent: Wednesday, February 7, 2024 4:06 PM
> To: Rahul Bhansali <rbhans...@marvell.com>; dev@dpdk.org; Radu Nicolau
> <radu.nico...@intel.com>; Akhil Goyal <gak...@marvell.com>; Konstantin
> Ananyev <konstantin.anan...@huawei.com>; Anoob Joseph
> <ano...@marvell.com>
> Subject: Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance 
> drop
> 
> On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
> >
> >
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yi...@amd.com>
> >> Sent: Tuesday, February 6, 2024 11:55 PM
> >> To: Rahul Bhansali <rbhans...@marvell.com>; dev@dpdk.org; Radu
> >> Nicolau <radu.nico...@intel.com>; Akhil Goyal <gak...@marvell.com>;
> >> Konstantin Ananyev <konstantin.anan...@huawei.com>; Anoob Joseph
> >> <ano...@marvell.com>
> >> Subject: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec
> >> performance drop
> >>
> >> External Email
> >>
> >> ---------------------------------------------------------------------
> >> - On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
> >>> Single packet free using rte_pktmbuf_free_bulk() is dropping the
> >>> performance. On cn10k, maximum of ~4% drop observed for IPsec event
> >>> mode single SA outbound case.
> >>>
> >>> To fix this issue, single packet free will use rte_pktmbuf_free API.
> >>>
> >>> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
> >>>
> >>> Signed-off-by: Rahul Bhansali <rbhans...@marvell.com>
> >>> ---
> >>>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++----
> >>>  1 file changed, 3 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/examples/ipsec-secgw/ipsec-secgw.h
> >>> b/examples/ipsec-secgw/ipsec-secgw.h
> >>> index 8baab44ee7..ec33a982df 100644
> >>> --- a/examples/ipsec-secgw/ipsec-secgw.h
> >>> +++ b/examples/ipsec-secgw/ipsec-secgw.h
> >>> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)
> >>> }
> >>>
> >>>  /* helper routine to free bulk of packets */ -static inline void
> >>> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
> >>> +static __rte_always_inline void
> >>> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
> >>>  {
> >>> - rte_pktmbuf_free_bulk(mb, n);
> >>> -
> >>> + n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
> >>>   core_stats_update_drop(n);
> >>>  }
> >>>
> >>
> >> Hi Rahul,
> >>
> >> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be
> >> improved by similar change?
> >
> > Hi Ferruh,
> > Currently 'rte_pktmbuf_free_bulk() is not inline. If we make that along with
> __rte_pktmbuf_free_seg_via_array()  both inline then performance can be
> improved similar.
> >
> 
> Ah, so performance improvement is coming from 'rte_pktmbuf_free()' being
> inline, OK.
> 
> As you are doing performance testing in that area, can you please check if
> '__rte_pktmbuf_free_seg_via_array()' is inlined, as it is static function I 
> expect it
> to be inlined. If not, can you please test with force inlining it
> (__rte_always_inline)?
It was not inline, did check with force inline also and no impact with this, so 
I can make it force inline.
> 
> 
> And I wonder if bulk() API may get single mbuf is a common theme, does it 
> makes
> sense add a new inline wrapper to library to cover this case, if it is 
> bringing ~4%
> improvement, like:
> ```
> static inline void
> rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n) {
>       if (n == 1)
>               return rte_pktmbuf_free(mb[0]);
>       return rte_pktmbuf_free_bulk(mb, n);
> }
Agree, can make this wrapper to cover a case where bulk free API is called but 
might have single mbuf to get better perf. It can be further optimize " if (n 
== 1)" with compile time constant check,
```
static inline void
rte_pktmbuf_free_bulk_or_one(struct rte_mbuf **mb, unsigned int n)
{
       if (__builtin_constant_p(n) && (n == 1))
               rte_pktmbuf_free(mb[0]);
       else
               rte_pktmbuf_free_bulk(mb, n);
}
```
Let me know if it is fine. I'll send v2. And, this will be " 
__rte_experimental" right ?
> ```

Reply via email to