On 12/24/2018 5:07 PM, Ananyev, Konstantin wrote:
>
>> -----Original Message-----
>> From: Akhil Goyal [mailto:akhil.go...@nxp.com]
>> Sent: Monday, December 24, 2018 11:25 AM
>> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org
>> Cc: Nicolau, Radu <radu.nico...@intel.com>; Horton, Remy 
>> <remy.hor...@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v4 1/9] examples/ipsec-secgw: avoid to 
>> request unused TX offloads
>>
>>
>>
>> On 12/24/2018 4:52 PM, Ananyev, Konstantin wrote:
>>>> -----Original Message-----
>>>> From: Akhil Goyal [mailto:akhil.go...@nxp.com]
>>>> Sent: Monday, December 24, 2018 10:54 AM
>>>> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; dev@dpdk.org
>>>> Cc: Nicolau, Radu <radu.nico...@intel.com>; Horton, Remy 
>>>> <remy.hor...@intel.com>
>>>> Subject: Re: [dpdk-dev] [PATCH v4 1/9] examples/ipsec-secgw: avoid to 
>>>> request unused TX offloads
>>>>
>>>>
>>>>
>>>> On 12/24/2018 3:49 PM, Ananyev, Konstantin wrote:
>>>>>>>> On 12/14/2018 10:10 PM, Konstantin Ananyev wrote:
>>>>>>>>> ipsec-secgw always enables TX offloads
>>>>>>>>> (DEV_TX_OFFLOAD_MULTI_SEGS, DEV_TX_OFFLOAD_SECURITY),
>>>>>>>>> even when they are not requested by the config.
>>>>>>>>> That causes many PMD to choose full-featured TX function,
>>>>>>>>> which in many cases is much slower then one without offloads.
>>>>>>>>> That patch adds checks to enabled extra HW offloads, only when
>>>>>>>>> they were requested.
>>>>>>>>> Plus it enables DEV_TX_OFFLOAD_IPV4_CKSUM,
>>>>>>>>> only when other HW TX ofloads are going to be enabled.
>>>>>>>>> Otherwise SW version of ip cksum calculation is used.
>>>>>>>>> That allows to use vector TX function, when inline-ipsec is not
>>>>>>>>> requested.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Remy Horton <remy.hor...@intel.com>
>>>>>>>>> Signed-off-by: Konstantin Ananyev <konstantin.anan...@intel.com>
>>>>>>>>> Acked-by: Radu Nicolau <radu.nico...@intel.com>
>>>>>>>>> ---
>>>>>>>>>       examples/ipsec-secgw/ipsec-secgw.c | 44 +++++++++++++++--------
>>>>>>>>>       examples/ipsec-secgw/ipsec.h       |  6 ++++
>>>>>>>>>       examples/ipsec-secgw/sa.c          | 56 
>>>>>>>>> ++++++++++++++++++++++++++++++
>>>>>>>>>       3 files changed, 91 insertions(+), 15 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
>>>>>>>>> b/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>>> index 1bc0b5b50..cfc2b05e5 100644
>>>>>>>>> --- a/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>>> +++ b/examples/ipsec-secgw/ipsec-secgw.c
>>>>>>>>> @@ -208,8 +208,6 @@ static struct rte_eth_conf port_conf = {
>>>>>>>>>               },
>>>>>>>>>               .txmode = {
>>>>>>>>>                       .mq_mode = ETH_MQ_TX_NONE,
>>>>>>>>> -             .offloads = (DEV_TX_OFFLOAD_IPV4_CKSUM |
>>>>>>>>> -                          DEV_TX_OFFLOAD_MULTI_SEGS),
>>>>>>>> I believe this is disabling checksum offload for all cases and then
>>>>>>>> enabling only for inline crypto and inline proto.
>>>>>>> Yes.
>>>>>>>
>>>>>>>> This is breaking lookaside proto and lookaside none cases. Please
>>>>>>>> correct me if I am wrong.
>>>>>>> Why breaking?
>>>>>> reduction in performance is kind of breaking the code.
>>>>> I didn’t observe any performance drop with that patch.
>>>>> In fact there was a tiny improvement (see below).
>>>>> Did you see any regression with this patch on your HW?
>>>> NXP hardware are low -end to mid end devices and we are always
>>>> bottleneck by core cycles.
>>>> So we would like to have as much offloads to HW as possible.
>>> Ok, then I suppose we need to introduce new cmd-line options,
>>> Something like: --txoffloads=<tx_offload_mask> 
>>> --rx_offloads=<rx_offload_mask>
>>> to keep everyone happy.
>>> Are you ok with that?
>> I think it should be taken from the PMD capabilities.
> Don't see how?
> Let say, Intel NICs do support HW IPv4 cksum offload, but we don't want
> to enable it on its own - only if IPsec offload is also enabled.
>  From other side you want HW IPv4 cksum offload to be always enabled
> if present.
> As I can see, to fulfill everyone needs we need to provide user ability
> to specify which HW offloads to use.
if there is no other way to resolve it, then probably you can add an 
optional parameter to disable offloads when it is required.
I believe default behavior should not be changed.
>
>> cmd line for every
>> parameter will make it very complex.
>>> Konstantin
>>>
>>>>>>> For cases when HW cksum offload is disabled, IPv4 cksum calculation
>>>>>>> will be done in SW, see below:
>>>>>>> prepare_tx_pkt(...)
>>>>>>> {
>>>>>>>        ...
>>>>>>>         +
>>>>>>>         +               /* calculate IPv4 cksum in SW */
>>>>>>>         +               if ((pkt->ol_flags & PKT_TX_IP_CKSUM) == 0)
>>>>>>>         +                       ip->ip_sum = rte_ipv4_cksum((struct 
>>>>>>> ipv4_hdr *)ip);
>>>>>>>
>>>>>>>
>>>>>>> We tested lookaside-none case quite extensively - all works well,
>>>>>>> in fact on Intel NICs it became even a bit faster because of that change
>>>>>>> (though not much).
>>>>>> yes, it may work well on one hardware, but may not perform good in other
>>>>>> hardware where cores are limited.
>>>>> Could you elaborate a bit more what do you mean by 'cores are limited' 
>>>>> here?
>>>> we have single core devices as well on which we run ipsec-secgw.
>>>>> Do you mean that for some low end cpus calculating IPv4 cksum in SW is 
>>>>> too expensive?
>>>> yes, limited by core cycles and not by HW
>>>>> Note that prepare_tx_pkts() and friends read/write L2/L3 packet headers 
>>>>> anyway -
>>>>> so IPv4 header will be in L1 cache already.
>>>> Agreed, but still it will consume some cycles which are more than that
>>>> of HW.
>>>>>>> Disabling HW offloads when they are not really required has 2 benefits:
>>>>>>>      1) allows app to be run on NICs without HW offloads support.
>>>>>>>      2) allows dev_configure() for TX path to select simple/vector TX 
>>>>>>> functions
>>>>>>>          which for many NICs are significantly faster.
>>>>>>>
>>>>>>> Konstantin
>>>>>>>
>>>>>>>> So a NACK for this if my understanding is correct.
>>>>>>>>

Reply via email to