> -----Original Message-----
> From: Akhil Goyal [mailto:akhil.go...@nxp.com]
> Sent: Monday, December 24, 2018 11:25 AM
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org
> Cc: Nicolau, Radu <radu.nico...@intel.com>; Horton, Remy 
> <remy.hor...@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 1/9] examples/ipsec-secgw: avoid to request 
> unused TX offloads
> 
> 
> 
> On 12/24/2018 4:52 PM, Ananyev, Konstantin wrote:
> >
> >> -----Original Message-----
> >> From: Akhil Goyal [mailto:akhil.go...@nxp.com]
> >> Sent: Monday, December 24, 2018 10:54 AM
> >> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org
> >> Cc: Nicolau, Radu <radu.nico...@intel.com>; Horton, Remy 
> >> <remy.hor...@intel.com>
> >> Subject: Re: [dpdk-dev] [PATCH v4 1/9] examples/ipsec-secgw: avoid to 
> >> request unused TX offloads
> >>
> >>
> >>
> >> On 12/24/2018 3:49 PM, Ananyev, Konstantin wrote:
> >>>>>> On 12/14/2018 10:10 PM, Konstantin Ananyev wrote:
> >>>>>>> ipsec-secgw always enables TX offloads
> >>>>>>> (DEV_TX_OFFLOAD_MULTI_SEGS, DEV_TX_OFFLOAD_SECURITY),
> >>>>>>> even when they are not requested by the config.
> >>>>>>> That causes many PMD to choose full-featured TX function,
> >>>>>>> which in many cases is much slower then one without offloads.
> >>>>>>> That patch adds checks to enabled extra HW offloads, only when
> >>>>>>> they were requested.
> >>>>>>> Plus it enables DEV_TX_OFFLOAD_IPV4_CKSUM,
> >>>>>>> only when other HW TX ofloads are going to be enabled.
> >>>>>>> Otherwise SW version of ip cksum calculation is used.
> >>>>>>> That allows to use vector TX function, when inline-ipsec is not
> >>>>>>> requested.
> >>>>>>>
> >>>>>>> Signed-off-by: Remy Horton <remy.hor...@intel.com>
> >>>>>>> Signed-off-by: Konstantin Ananyev <konstantin.anan...@intel.com>
> >>>>>>> Acked-by: Radu Nicolau <radu.nico...@intel.com>
> >>>>>>> ---
> >>>>>>>      examples/ipsec-secgw/ipsec-secgw.c | 44 +++++++++++++++--------
> >>>>>>>      examples/ipsec-secgw/ipsec.h       |  6 ++++
> >>>>>>>      examples/ipsec-secgw/sa.c          | 56 
> >>>>>>> ++++++++++++++++++++++++++++++
> >>>>>>>      3 files changed, 91 insertions(+), 15 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
> >>>>>>> b/examples/ipsec-secgw/ipsec-secgw.c
> >>>>>>> index 1bc0b5b50..cfc2b05e5 100644
> >>>>>>> --- a/examples/ipsec-secgw/ipsec-secgw.c
> >>>>>>> +++ b/examples/ipsec-secgw/ipsec-secgw.c
> >>>>>>> @@ -208,8 +208,6 @@ static struct rte_eth_conf port_conf = {
> >>>>>>>       },
> >>>>>>>       .txmode = {
> >>>>>>>               .mq_mode = ETH_MQ_TX_NONE,
> >>>>>>> -             .offloads = (DEV_TX_OFFLOAD_IPV4_CKSUM |
> >>>>>>> -                          DEV_TX_OFFLOAD_MULTI_SEGS),
> >>>>>> I believe this is disabling checksum offload for all cases and then
> >>>>>> enabling only for inline crypto and inline proto.
> >>>>> Yes.
> >>>>>
> >>>>>> This is breaking lookaside proto and lookaside none cases. Please
> >>>>>> correct me if I am wrong.
> >>>>> Why breaking?
> >>>> reduction in performance is kind of breaking the code.
> >>> I didn’t observe any performance drop with that patch.
> >>> In fact there was a tiny improvement (see below).
> >>> Did you see any regression with this patch on your HW?
> >> NXP hardware are low -end to mid end devices and we are always
> >> bottleneck by core cycles.
> >> So we would like to have as much offloads to HW as possible.
> > Ok, then I suppose we need to introduce new cmd-line options,
> > Something like: --txoffloads=<tx_offload_mask> 
> > --rx_offloads=<rx_offload_mask>
> > to keep everyone happy.
> > Are you ok with that?
> I think it should be taken from the PMD capabilities. 
Don't see how?
Let say, Intel NICs do support HW IPv4 cksum offload, but we don't want
to enable it on its own - only if IPsec offload is also enabled.
From other side you want HW IPv4 cksum offload to be always enabled
if present.
As I can see, to fulfill everyone needs we need to provide user ability
to specify which HW offloads to use. 

> cmd line for every
> parameter will make it very complex.

> > Konstantin
> >
> >>>>> For cases when HW cksum offload is disabled, IPv4 cksum calculation
> >>>>> will be done in SW, see below:
> >>>>> prepare_tx_pkt(...)
> >>>>> {
> >>>>>       ...
> >>>>>        +
> >>>>>        +                /* calculate IPv4 cksum in SW */
> >>>>>        +                if ((pkt->ol_flags & PKT_TX_IP_CKSUM) == 0)
> >>>>>        +                        ip->ip_sum = rte_ipv4_cksum((struct 
> >>>>> ipv4_hdr *)ip);
> >>>>>
> >>>>>
> >>>>> We tested lookaside-none case quite extensively - all works well,
> >>>>> in fact on Intel NICs it became even a bit faster because of that change
> >>>>> (though not much).
> >>>> yes, it may work well on one hardware, but may not perform good in other
> >>>> hardware where cores are limited.
> >>> Could you elaborate a bit more what do you mean by 'cores are limited' 
> >>> here?
> >> we have single core devices as well on which we run ipsec-secgw.
> >>> Do you mean that for some low end cpus calculating IPv4 cksum in SW is 
> >>> too expensive?
> >> yes, limited by core cycles and not by HW
> >>> Note that prepare_tx_pkts() and friends read/write L2/L3 packet headers 
> >>> anyway -
> >>> so IPv4 header will be in L1 cache already.
> >> Agreed, but still it will consume some cycles which are more than that
> >> of HW.
> >>>>> Disabling HW offloads when they are not really required has 2 benefits:
> >>>>>     1) allows app to be run on NICs without HW offloads support.
> >>>>>     2) allows dev_configure() for TX path to select simple/vector TX 
> >>>>> functions
> >>>>>         which for many NICs are significantly faster.
> >>>>>
> >>>>> Konstantin
> >>>>>
> >>>>>> So a NACK for this if my understanding is correct.
> >>>>>>

Reply via email to