Re: [dpdk-dev] [PATCH 1/7] eal: move OS common functions to single file

2020-04-23 Thread Thomas Monjalon
23/04/2020 01:51, Ranjit Menon:
> On 4/22/2020 12:27 AM, tal...@mellanox.com wrote:
> > From: Tal Shnaiderman 
> > 
> > Move common functions between Unix and Windows to eal_config.c.
> 
> Like other files in common, we should call this eal_common_config.c

I am not sure about the interest of repeating the directory name
in the file name in general.
Do you see a real benefit?

Note: the naming in lib/librte_eal/common/ is not uniform.




Re: [dpdk-dev] [EXT] Re: [PATCH v3 1/1] bus/pci: optimise scanning with whitelist/blacklist

2020-04-23 Thread Sunil Kumar Kori
>-Original Message-
>From: Gaëtan Rivet 
>Sent: Wednesday, April 22, 2020 3:09 PM
>To: Sunil Kumar Kori 
>Cc: step...@networkplumber.org; david.march...@redhat.com; Jerin Jacob
>Kollanukkaran ; dev@dpdk.org
>Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v3 1/1] bus/pci: optimise scanning
>with whitelist/blacklist
>
>On 22/04/20 06:17 +, Sunil Kumar Kori wrote:
>> >-Original Message-
>> >From: Gaëtan Rivet 
>> >Sent: Tuesday, April 21, 2020 8:48 PM
>> >To: Sunil Kumar Kori 
>> >Cc: step...@networkplumber.org; david.march...@redhat.com; Jerin
>> >Jacob Kollanukkaran ; dev@dpdk.org
>> >Subject: [EXT] Re: [dpdk-dev] [PATCH v3 1/1] bus/pci: optimise
>> >scanning with whitelist/blacklist
>> >
>> >External Email
>> >
>> >-
>> >- On 20/04/20 12:25 +0530, Sunil Kumar Kori wrote:
>> >> rte_bus_scan API scans all the available PCI devices irrespective
>> >> of white or black listing parameters then further devices are
>> >> probed based on white or black listing parameters. So unnecessary
>> >> CPU cycles are wasted during rte_pci_scan.
>> >>
>> >> For Octeontx2 platform with core frequency 2.4 Ghz, rte_bus_scan
>> >> consumes around 26ms to scan around 90 PCI devices but all may not
>> >> be used by the application. So for the application which uses 2
>> >> NICs, rte_bus_scan consumes few microseconds and rest time is saved
>> >> with this
>> >patch.
>> >>
>> >
>> >Hi Sunil,
>> >
>> >The PCI bus was written at first with the understanding that all PCI
>> >devices were scanned and made available on the bus -- the probe will filter
>afterward.
>> >
>> >Device hotplug and iteration were written with this in mind. Changing
>> >this principle might have unintended consequences in other EAL parts.
>> >I'm not fundamentally against it, but it is not how buses are
>> >currently designed in DPDK.
>> >
>> I am also not sure about this. I would request you provide suggestion
>> to ensure that there won't be any negative consequences if any.  So that I
>can handle those too.
>>
>
>I would like also to hear from other stakeholders for the PCI bus.
>
>Generally, as long as the blacklist mode is the default, behavior should not
>change, but devil is in the details.
>
>I would have some comments on the patch itself if everyone agrees with this
>direction.
>
>If the principle of the patch is accepted, it would be great for you to test
>hotplug and device listing with testpmd:
>
>   hotplug:
>* You can spawn VMs with virtual e1000 ports on PCI using QEMU for this,
>  and within testpmd `port attach ` -- of course, the
>  port(s) should not be attached when starting testpmd. You might
>  have to either blacklist them, or you could hotplug them in QEMU using 
> the
>  monitor. I don't recall the QEMU commands to do that, sorry.
>
>   device list:
>* `show device info all` in testpmd. I thought I had added a command to
>  test the device iterator, taking an arbitrary device string, but
>  the patch has been dropped it seems.
>
>If you have no segfault and no surprise, it is a good start.
>
Okay but before verification I would appreciate to have more comments and
closure on principle from other PCI stakeholders. If there is no objection on
principle then I will invest energy in testing.
I will wait for inputs by next week and if there are no inputs then
 assuming it fundamentally correct, I will start verification of above test 
cases. 

Also If anyone has already validated above mentioned test cases then
Suggestions, about the impact of this patch on PCI bus design, will be very
helpful to understand the real issues with this. 

>> >To me, a one-time 26ms gain is not enough justification to change
>> >this principle. How problematic is this for you? Do you encounter
>> >specific issues due to this delay?
>> >
>> >Thanks,
>>
>> Recently we observed this requirement to cater a use of having lowest
>bootup time for DPDK application.
>> One of the use-case for this to reduce the downtime as part of DPDK SW
>> upgrade in the field. i.e after the SW update, time to close the application
>and restart it again for packet processing.
>> Having this solution application will be up soon and lesser traffic impact 
>> will
>be there in a deployed system.
>
>DPDK startup was not written with low latency in mind. You will find here and
>there minute improvements, but I think it is a pipedream to reduce service
>disruption on binary upgrade.
>
>People in the field would be better served with HA, not relying on a critical
>apps restarting as fast as possible.
>
Recently we had a requirement to have bootup time <= 500ms and find
it as one of the candidate to be improved. So thought of to upstream it. 
Also having mechanism to improve bootup time is  good. I think, there is
no harm in this.

>Cheers,
>--
>Gaëtan


Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin
> 
> Hi Konstantin,
> 
> These are data path ops and so it will be better if we can avoid such checks 
> in the datapath. The same is done in ethdev also.

AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
data-path.
So far there was no strict requirement for the rte_security PMDs to *always* 
implement it.
So what you guys did is a silent change of public API behaviour.
As result ixgbe, (and probably some others rte_security PMDs) stopped working 
properly.
I don't see any point in these changes, but if you'd like to do that, at
least our usual procedure has to be followed:
1. Send and RFC to get an agreement with rte_security PMDs maintainers (one 
release ahead)
2. send a deprecation note (one release ahead)
3. change the behaviour of the public API
4. update release notes 

AFAIK 1), 2), 4) wasn't done.
So I think right now we need to revert original behaviour.

> 
> http://code.dpdk.org/dpdk/v20.02/source/lib/librte_ethdev/rte_ethdev.h#L4372
> 
> Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
> checks.
> http://code.dpdk.org/dpdk/v20.02/source/lib/librte_cryptodev/rte_cryptodev.h#L962

That's a different story:
rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that
have to be implemented by each  ethdev/cryptodev API.

> 
> 
> Thanks,
> Anoob
> 
> > -Original Message-
> > From: dev  On Behalf Of Konstantin Ananyev
> > Sent: Thursday, April 23, 2020 5:22 AM
> > To: dev@dpdk.org
> > Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
> > 
> > Subject: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented
> > ops
> >
> > Valid checks for optional function pointers inside dev-ops were disabled by
> > undefined macro.
> >
> > Fixes: b6ee98547847 ("security: fix verification of parameters")
> >
> > Signed-off-by: Konstantin Ananyev 
> > ---
> >  lib/librte_security/rte_security.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/lib/librte_security/rte_security.c 
> > b/lib/librte_security/rte_security.c
> > index d475b0977..b65430ce2 100644
> > --- a/lib/librte_security/rte_security.c
> > +++ b/lib/librte_security/rte_security.c
> > @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
> > *instance,
> >   struct rte_security_session *sess,
> >   struct rte_mbuf *m, void *params)  { -#ifdef
> > RTE_DEBUG
> > RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
> > EINVAL,
> > -ENOTSUP);
> > RTE_PTR_OR_ERR_RET(sess, -EINVAL);
> > -#endif
> > return instance->ops->set_pkt_metadata(instance->device,
> >sess, m, params);
> >  }
> > @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
> > *instance, uint64_t md)  {
> > void *userdata = NULL;
> >
> > -#ifdef RTE_DEBUG
> > RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> > NULL); -#endif
> > if (instance->ops->get_userdata(instance->device, md, &userdata))
> > return NULL;
> >
> > --
> > 2.17.1



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski


W dniu 23.04.2020 o 02:11, Ananyev, Konstantin pisze:
> Actually looking at app/test/test_security.c
> I also see a few '#ifdef RTE_DEBUG's.
> Let say:
>
> +static int
> +test_get_userdata_inv_context(void)
> +{
> +#ifdef RTE_DEBUG
> +   uint64_t md = 0xDEADBEEF;
> +
> +   void *ret = rte_security_get_userdata(NULL, md);
> +   TEST_ASSERT_MOCK_FUNCTION_CALL_RET(rte_security_get_userdata,
> +   ret, NULL, "%p");
> +   TEST_ASSERT_MOCK_CALLS(mock_get_userdata_exp, 0);
> +
> +   return TEST_SUCCESS;
> +#else
> +   return TEST_SKIPPED;
> +#endif
> +}
>
> What is the point?
> Why not always run the test unconditionally?

If there is no RTE_DEBUG defined, the tested functionality is not 
compiled, so the tests won't work.

They must be wrapped with same macro as library code.

>
>
>> -Original Message-
>> From: Ananyev, Konstantin 
>> Sent: Thursday, April 23, 2020 12:52 AM
>> To: dev@dpdk.org
>> Cc: akhil.go...@nxp.com; Doherty, Declan ; 
>> Ananyev, Konstantin 
>> Subject: [PATCH] security: fix crash at accessing non-implemented ops
>>
>> Valid checks for optional function pointers inside dev-ops
>> were disabled by undefined macro.
>>
>> Fixes: b6ee98547847 ("security: fix verification of parameters")
>>
>> Signed-off-by: Konstantin Ananyev 
>> ---
>>   lib/librte_security/rte_security.c | 4 
>>   1 file changed, 4 deletions(-)
>>
>> diff --git a/lib/librte_security/rte_security.c 
>> b/lib/librte_security/rte_security.c
>> index d475b0977..b65430ce2 100644
>> --- a/lib/librte_security/rte_security.c
>> +++ b/lib/librte_security/rte_security.c
>> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx 
>> *instance,
>>struct rte_security_session *sess,
>>struct rte_mbuf *m, void *params)
>>   {
>> -#ifdef RTE_DEBUG
>>  RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -EINVAL,
>>  -ENOTSUP);
>>  RTE_PTR_OR_ERR_RET(sess, -EINVAL);
>> -#endif
>>  return instance->ops->set_pkt_metadata(instance->device,
>> sess, m, params);
>>   }
>> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx 
>> *instance, uint64_t md)
>>   {
>>  void *userdata = NULL;
>>
>> -#ifdef RTE_DEBUG
>>  RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
>> -#endif
>>  if (instance->ops->get_userdata(instance->device, md, &userdata))
>>  return NULL;
>>
>> --
>> 2.17.1

-- 

Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski


W dniu 23.04.2020 o 06:07, Anoob Joseph pisze:
> Hi Konstantin,
>
> These are data path ops and so it will be better if we can avoid such checks 
> in the datapath. The same is done in ethdev also.
>
> https://protect2.fireeye.com/url?k=d44931cf-89d2cdac-d448ba80-0cc47a31cdbc-8281a62b4c91d848&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_ethdev%2Frte_ethdev.h%23L4372
>
> Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
> checks.
> https://protect2.fireeye.com/url?k=51324200-0ca9be63-5133c94f-0cc47a31cdbc-11f88758fc12c996&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_cryptodev%2Frte_cryptodev.h%23L962
>
>
> Thanks,
> Anoob

Hi Konstantine,

It's my fault. Sorry.

These checks need to be disabled in non-debug code, so they should be 
wrapped in a macro. It's just not the valid macro.
The discussion about rte_debug mode is ongoing 
(https://patchwork.dpdk.org/patch/68815/)
and currently the v2 version of patches is prepared to gather 
maintainers opinion.

After the rte_debug is introduced the proper macro to use will be 
RTE_DEBUG_SECURITY.

Until then, the RTE_DEBUG macro can stay as like Anoob mentioned the 
checks will have impact on dataplane performance.

If you want to enable this code, please use CFLAGS="-DRTE_DEBUG"

>
>> -Original Message-
>> From: dev  On Behalf Of Konstantin Ananyev
>> Sent: Thursday, April 23, 2020 5:22 AM
>> To: dev@dpdk.org
>> Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
>> 
>> Subject: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented
>> ops
>>
>> Valid checks for optional function pointers inside dev-ops were disabled by
>> undefined macro.
>>
>> Fixes: b6ee98547847 ("security: fix verification of parameters")
>>
>> Signed-off-by: Konstantin Ananyev 
>> ---
>>   lib/librte_security/rte_security.c | 4 
>>   1 file changed, 4 deletions(-)
>>
>> diff --git a/lib/librte_security/rte_security.c 
>> b/lib/librte_security/rte_security.c
>> index d475b0977..b65430ce2 100644
>> --- a/lib/librte_security/rte_security.c
>> +++ b/lib/librte_security/rte_security.c
>> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
>> *instance,
>>struct rte_security_session *sess,
>>struct rte_mbuf *m, void *params)  { -#ifdef
>> RTE_DEBUG
>>  RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
>> EINVAL,
>>  -ENOTSUP);
>>  RTE_PTR_OR_ERR_RET(sess, -EINVAL);
>> -#endif
>>  return instance->ops->set_pkt_metadata(instance->device,
>> sess, m, params);
>>   }
>> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
>> *instance, uint64_t md)  {
>>  void *userdata = NULL;
>>
>> -#ifdef RTE_DEBUG
>>  RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
>> NULL); -#endif
>>  if (instance->ops->get_userdata(instance->device, md, &userdata))
>>  return NULL;
>>
>> --
>> 2.17.1
>
-- 

Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski


W dniu 23.04.2020 o 09:54, Ananyev, Konstantin pisze:
>> Hi Konstantin,
>>
>> These are data path ops and so it will be better if we can avoid such checks 
>> in the datapath. The same is done in ethdev also.
> AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
> data-path.
> So far there was no strict requirement for the rte_security PMDs to *always* 
> implement it.
> So what you guys did is a silent change of public API behaviour.
> As result ixgbe, (and probably some others rte_security PMDs) stopped working 
> properly.
> I don't see any point in these changes, but if you'd like to do that, at
> least our usual procedure has to be followed:
> 1. Send and RFC to get an agreement with rte_security PMDs maintainers (one 
> release ahead)
> 2. send a deprecation note (one release ahead)
> 3. change the behaviour of the public API
> 4. update release notes
>
> AFAIK 1), 2), 4) wasn't done.
> So I think right now we need to revert original behaviour.
The current changes were made in patch: b6ee9854784 security: fix 
verification of parameters


@@ -91,7 +119,9 @@ rte_security_get_userdata(struct rte_security_ctx 
*instance, uint64_t md)
  {
     void *userdata = NULL;

- RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
+#ifdef RTE_DEBUG
+   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
+#endif
     if (instance->ops->get_userdata(instance->device, md, &userdata))
     return NULL;
  


So as you can see, the checks were already there. They've just been 
wrapped up with RTE_DEBUG macro for disabling them in non-debug 
compilation mode and the validation of paramter was change to avoid 
possible segmentation fault if instance lub ops would be NULL

>> https://protect2.fireeye.com/url?k=e0478418-bdd92a82-e0460f57-0cc47a336fae-55cc35a7b94c97c0&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_ethdev%2Frte_ethdev.h%23L4372
>>
>> Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
>> checks.
>> https://protect2.fireeye.com/url?k=79d7974a-244939d0-79d61c05-0cc47a336fae-19f540008a9467cf&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_cryptodev%2Frte_cryptodev.h%23L962
> That's a different story:
> rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that
> have to be implemented by each  ethdev/cryptodev API.
>
>>
>> Thanks,
>> Anoob
>>
>>> -Original Message-
>>> From: dev  On Behalf Of Konstantin Ananyev
>>> Sent: Thursday, April 23, 2020 5:22 AM
>>> To: dev@dpdk.org
>>> Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
>>> 
>>> Subject: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented
>>> ops
>>>
>>> Valid checks for optional function pointers inside dev-ops were disabled by
>>> undefined macro.
>>>
>>> Fixes: b6ee98547847 ("security: fix verification of parameters")
>>>
>>> Signed-off-by: Konstantin Ananyev 
>>> ---
>>>   lib/librte_security/rte_security.c | 4 
>>>   1 file changed, 4 deletions(-)
>>>
>>> diff --git a/lib/librte_security/rte_security.c 
>>> b/lib/librte_security/rte_security.c
>>> index d475b0977..b65430ce2 100644
>>> --- a/lib/librte_security/rte_security.c
>>> +++ b/lib/librte_security/rte_security.c
>>> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
>>> *instance,
>>>   struct rte_security_session *sess,
>>>   struct rte_mbuf *m, void *params)  { -#ifdef
>>> RTE_DEBUG
>>> RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
>>> EINVAL,
>>> -ENOTSUP);
>>> RTE_PTR_OR_ERR_RET(sess, -EINVAL);
>>> -#endif
>>> return instance->ops->set_pkt_metadata(instance->device,
>>>sess, m, params);
>>>   }
>>> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
>>> *instance, uint64_t md)  {
>>> void *userdata = NULL;
>>>
>>> -#ifdef RTE_DEBUG
>>> RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
>>> NULL); -#endif
>>> if (instance->ops->get_userdata(instance->device, md, &userdata))
>>> return NULL;
>>>
>>> --
>>> 2.17.1
>
-- 

Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com



Re: [dpdk-dev] [PATCH v8 1/9] net/virtio: add Rx free threshold setting

2020-04-23 Thread Maxime Coquelin



On 4/23/20 2:30 PM, Marvin Liu wrote:
> Introduce free threshold setting in Rx queue, default value of it is 32.
> Limiated threshold size to multiple of four as only vectorized packed Rx
s/Limiated/Limit the/

> function will utilize it. Virtio driver will rearm Rx queue when more
> than rx_free_thresh descs were dequeued.
> 
> Signed-off-by: Marvin Liu 
> 

Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin


> -Original Message-
> From: Lukasz Wojciechowski 
> Sent: Thursday, April 23, 2020 9:06 AM
> To: Ananyev, Konstantin ; Anoob Joseph 
> ; dev@dpdk.org
> Cc: akhil.go...@nxp.com; Doherty, Declan ; 
> techbo...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] security: fix crash at accessing 
> non-implemented ops
> 
> 
> W dniu 23.04.2020 o 09:54, Ananyev, Konstantin pisze:
> >> Hi Konstantin,
> >>
> >> These are data path ops and so it will be better if we can avoid such 
> >> checks in the datapath. The same is done in ethdev also.
> > AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
> > data-path.
> > So far there was no strict requirement for the rte_security PMDs to 
> > *always* implement it.
> > So what you guys did is a silent change of public API behaviour.
> > As result ixgbe, (and probably some others rte_security PMDs) stopped 
> > working properly.
> > I don't see any point in these changes, but if you'd like to do that, at
> > least our usual procedure has to be followed:
> > 1. Send and RFC to get an agreement with rte_security PMDs maintainers (one 
> > release ahead)
> > 2. send a deprecation note (one release ahead)
> > 3. change the behaviour of the public API
> > 4. update release notes
> >
> > AFAIK 1), 2), 4) wasn't done.
> > So I think right now we need to revert original behaviour.
> The current changes were made in patch: b6ee9854784 security: fix
> verification of parameters
> 
> 
> @@ -91,7 +119,9 @@ rte_security_get_userdata(struct rte_security_ctx
> *instance, uint64_t md)
>   {
>      void *userdata = NULL;
> 
> - RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
> +#ifdef RTE_DEBUG
> +   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
> +#endif
>      if (instance->ops->get_userdata(instance->device, md, &userdata))
>      return NULL;
>   
> 
> 
> So as you can see, the checks were already there. They've just been
> wrapped up with RTE_DEBUG macro for disabling them in non-debug
> compilation mode and the validation of paramter was change to avoid
> possible segmentation fault if instance lub ops would be NULL
> 
> >> https://protect2.fireeye.com/url?k=e0478418-bdd92a82-e0460f57-0cc47a336fae-
> 55cc35a7b94c97c0&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_ethdev%2Frte_ethdev.h%23L43
> 72
> >>
> >> Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
> >> checks.
> >> https://protect2.fireeye.com/url?k=79d7974a-244939d0-79d61c05-0cc47a336fae-
> 19f540008a9467cf&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_cryptodev%2Frte_cryptodev.h%
> 23L962
> > That's a different story:
> > rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that
> > have to be implemented by each  ethdev/cryptodev API.
> >
> >>
> >> Thanks,
> >> Anoob
> >>
> >>> -Original Message-
> >>> From: dev  On Behalf Of Konstantin Ananyev
> >>> Sent: Thursday, April 23, 2020 5:22 AM
> >>> To: dev@dpdk.org
> >>> Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
> >>> 
> >>> Subject: [dpdk-dev] [PATCH] security: fix crash at accessing 
> >>> non-implemented
> >>> ops
> >>>
> >>> Valid checks for optional function pointers inside dev-ops were disabled 
> >>> by
> >>> undefined macro.
> >>>
> >>> Fixes: b6ee98547847 ("security: fix verification of parameters")
> >>>
> >>> Signed-off-by: Konstantin Ananyev 
> >>> ---
> >>>   lib/librte_security/rte_security.c | 4 
> >>>   1 file changed, 4 deletions(-)
> >>>
> >>> diff --git a/lib/librte_security/rte_security.c 
> >>> b/lib/librte_security/rte_security.c
> >>> index d475b0977..b65430ce2 100644
> >>> --- a/lib/librte_security/rte_security.c
> >>> +++ b/lib/librte_security/rte_security.c
> >>> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
> >>> *instance,
> >>> struct rte_security_session *sess,
> >>> struct rte_mbuf *m, void *params)  { 
> >>> -#ifdef
> >>> RTE_DEBUG
> >>>   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
> >>> EINVAL,
> >>>   -ENOTSUP);
> >>>   RTE_PTR_OR_ERR_RET(sess, -EINVAL);
> >>> -#endif
> >>>   return instance->ops->set_pkt_metadata(instance->device,
> >>>  sess, m, params);
> >>>   }
> >>> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
> >>> *instance, uint64_t md)  {
> >>>   void *userdata = NULL;
> >>>
> >>> -#ifdef RTE_DEBUG
> >>>   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> >>> NULL); -#endif
> >>>   if (instance->ops->get_userdata(instance->device, md, 
> >>> &userdata))
> >>>   return NULL;
> >>>
> >>> --
> >>> 2.17.1
> >
> --
> 
> Lukasz Wojciechowski
> Principal Software Engineer
> 
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 

Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin

> >> Hi Konstantin,
> >>
> >> These are data path ops and so it will be better if we can avoid such 
> >> checks in the datapath. The same is done in ethdev also.
> > AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
> > data-path.
> > So far there was no strict requirement for the rte_security PMDs to 
> > *always* implement it.
> > So what you guys did is a silent change of public API behaviour.
> > As result ixgbe, (and probably some others rte_security PMDs) stopped 
> > working properly.
> > I don't see any point in these changes, but if you'd like to do that, at
> > least our usual procedure has to be followed:
> > 1. Send and RFC to get an agreement with rte_security PMDs maintainers (one 
> > release ahead)
> > 2. send a deprecation note (one release ahead)
> > 3. change the behaviour of the public API
> > 4. update release notes
> >
> > AFAIK 1), 2), 4) wasn't done.
> > So I think right now we need to revert original behaviour.
> The current changes were made in patch: b6ee9854784 security: fix
> verification of parameters
> 
> 
> @@ -91,7 +119,9 @@ rte_security_get_userdata(struct rte_security_ctx
> *instance, uint64_t md)
>   {
>      void *userdata = NULL;
> 
> - RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
> +#ifdef RTE_DEBUG
> +   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
> +#endif
>      if (instance->ops->get_userdata(instance->device, md, &userdata))
>      return NULL;
>   
> 
> 
> So as you can see, the checks were already there. 
>They've just been
> wrapped up with RTE_DEBUG macro for disabling them in non-debug
> compilation mode and the validation of paramter was change to avoid
> possible segmentation fault if instance lub ops would be NULL

Sigh, that's what I am talking about:
you effectively complied out valid checks for non-debug mode. 
Yes, these checks have been there and they *should* stay there
for *ANY* mode (both debug and non-debug).
This is an *optional* dev-ops function.
PMD has a freedom not to implement optional function.
It is a rte_security framework responsibility to check that these functions
are implemented or not.
If you like to change that - the procedure described above has to be followed.

Konstantin 

> 
> >> https://protect2.fireeye.com/url?k=e0478418-bdd92a82-e0460f57-0cc47a336fae-
> 55cc35a7b94c97c0&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_ethdev%2Frte_ethdev.h%23L43
> 72
> >>
> >> Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
> >> checks.
> >> https://protect2.fireeye.com/url?k=79d7974a-244939d0-79d61c05-0cc47a336fae-
> 19f540008a9467cf&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_cryptodev%2Frte_cryptodev.h%
> 23L962
> > That's a different story:
> > rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that
> > have to be implemented by each  ethdev/cryptodev API.
> >
> >>
> >> Thanks,
> >> Anoob
> >>
> >>> -Original Message-
> >>> From: dev  On Behalf Of Konstantin Ananyev
> >>> Sent: Thursday, April 23, 2020 5:22 AM
> >>> To: dev@dpdk.org
> >>> Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
> >>> 
> >>> Subject: [dpdk-dev] [PATCH] security: fix crash at accessing 
> >>> non-implemented
> >>> ops
> >>>
> >>> Valid checks for optional function pointers inside dev-ops were disabled 
> >>> by
> >>> undefined macro.
> >>>
> >>> Fixes: b6ee98547847 ("security: fix verification of parameters")
> >>>
> >>> Signed-off-by: Konstantin Ananyev 
> >>> ---
> >>>   lib/librte_security/rte_security.c | 4 
> >>>   1 file changed, 4 deletions(-)
> >>>
> >>> diff --git a/lib/librte_security/rte_security.c 
> >>> b/lib/librte_security/rte_security.c
> >>> index d475b0977..b65430ce2 100644
> >>> --- a/lib/librte_security/rte_security.c
> >>> +++ b/lib/librte_security/rte_security.c
> >>> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
> >>> *instance,
> >>> struct rte_security_session *sess,
> >>> struct rte_mbuf *m, void *params)  { 
> >>> -#ifdef
> >>> RTE_DEBUG
> >>>   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
> >>> EINVAL,
> >>>   -ENOTSUP);
> >>>   RTE_PTR_OR_ERR_RET(sess, -EINVAL);
> >>> -#endif
> >>>   return instance->ops->set_pkt_metadata(instance->device,
> >>>  sess, m, params);
> >>>   }
> >>> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
> >>> *instance, uint64_t md)  {
> >>>   void *userdata = NULL;
> >>>
> >>> -#ifdef RTE_DEBUG
> >>>   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> >>> NULL); -#endif
> >>>   if (instance->ops->get_userdata(instance->device, md, 
> >>> &userdata))
> >>>   return NULL;
> >>>
> >>> --
> >>> 2.17.1
> >
> --
> 
> Lukasz Wo

Re: [dpdk-dev] [PATCH V1] add meson build 32-bits on x86_64

2020-04-23 Thread Zhang, XuemingX


Many thanks Bruce, 
Your suggestion is very good, I will try to do it

>-Original Message-
>From: Bruce Richardson [mailto:bruce.richard...@intel.com]
>Sent: Wednesday, April 22, 2020 7:12 PM
>To: Zhang, XuemingX 
>Cc: dev@dpdk.org; tho...@monjalon.net; Chen, Zhaoyan
>; Ma, LihongX 
>Subject: Re: [dpdk-dev] [PATCH V1] add meson build 32-bits on x86_64
>
>On Wed, Apr 22, 2020 at 09:48:54AM +, xueming wrote:
>> Add user interaction features. The default build is 64-bits.
>> if you want to build 32-bits on x86_64, need to pass parameters of i686.
>> Merge -vv and -v into the method
>>
>> Signed-off-by: xueming 
>> ---
>>  devtools/test-meson-builds.sh | 64
>> +++
>>  1 file changed, 58 insertions(+), 6 deletions(-)
>>
>I'm wondering if this would be better done using a cross-file, rather than
>overriding a bunch of environment variables - the pkg-config one would be
>especially relevant. A separate cross-file could be used for debian-based
>and fedora-based OS's with correct 32-bit paths and parameters in each.
>
>What do you think?
>
>/Bruce


Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin

> W dniu 23.04.2020 o 06:07, Anoob Joseph pisze:
> > Hi Konstantin,
> >
> > These are data path ops and so it will be better if we can avoid such 
> > checks in the datapath. The same is done in ethdev also.
> >
> > https://protect2.fireeye.com/url?k=d44931cf-89d2cdac-d448ba80-0cc47a31cdbc-
> 8281a62b4c91d848&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_ethdev%2Frte_ethdev.h%23L43
> 72
> >
> > Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such 
> > checks.
> > https://protect2.fireeye.com/url?k=51324200-0ca9be63-5133c94f-0cc47a31cdbc-
> 11f88758fc12c996&q=1&u=http%3A%2F%2Fcode.dpdk.org%2Fdpdk%2Fv20.02%2Fsource%2Flib%2Flibrte_cryptodev%2Frte_cryptodev.h%
> 23L962
> >
> >
> > Thanks,
> > Anoob
> 
> Hi Konstantine,
> 
> It's my fault. Sorry.
> 
> These checks need to be disabled in non-debug code, so they should be
> wrapped in a macro. It's just not the valid macro.
> The discussion about rte_debug mode is ongoing
> (https://patchwork.dpdk.org/patch/68815/)
> and currently the v2 version of patches is prepared to gather
> maintainers opinion.
> 
> After the rte_debug is introduced the proper macro to use will be
> RTE_DEBUG_SECURITY.
> 
> Until then, the RTE_DEBUG macro can stay as like Anoob mentioned the
> checks will have impact on dataplane performance.
> 
> If you want to enable this code, please use CFLAGS="-DRTE_DEBUG"

Really? So what we have to tell now to our customers?
"Yes, rte_security is broken and can easily crash your app.
But we might fix it in future versions... or maybe not.
For now just recompile our source with that flag enabled?"
Obviously this is not an option.
It is a bug and it is a stopper for 20.05 release.
It has to be fixed asap. 


> 
> >
> >> -Original Message-
> >> From: dev  On Behalf Of Konstantin Ananyev
> >> Sent: Thursday, April 23, 2020 5:22 AM
> >> To: dev@dpdk.org
> >> Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin Ananyev
> >> 
> >> Subject: [dpdk-dev] [PATCH] security: fix crash at accessing 
> >> non-implemented
> >> ops
> >>
> >> Valid checks for optional function pointers inside dev-ops were disabled by
> >> undefined macro.
> >>
> >> Fixes: b6ee98547847 ("security: fix verification of parameters")
> >>
> >> Signed-off-by: Konstantin Ananyev 
> >> ---
> >>   lib/librte_security/rte_security.c | 4 
> >>   1 file changed, 4 deletions(-)
> >>
> >> diff --git a/lib/librte_security/rte_security.c 
> >> b/lib/librte_security/rte_security.c
> >> index d475b0977..b65430ce2 100644
> >> --- a/lib/librte_security/rte_security.c
> >> +++ b/lib/librte_security/rte_security.c
> >> @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct rte_security_ctx
> >> *instance,
> >>  struct rte_security_session *sess,
> >>  struct rte_mbuf *m, void *params)  { -#ifdef
> >> RTE_DEBUG
> >>RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
> >> EINVAL,
> >>-ENOTSUP);
> >>RTE_PTR_OR_ERR_RET(sess, -EINVAL);
> >> -#endif
> >>return instance->ops->set_pkt_metadata(instance->device,
> >>   sess, m, params);
> >>   }
> >> @@ -121,9 +119,7 @@ rte_security_get_userdata(struct rte_security_ctx
> >> *instance, uint64_t md)  {
> >>void *userdata = NULL;
> >>
> >> -#ifdef RTE_DEBUG
> >>RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> >> NULL); -#endif
> >>if (instance->ops->get_userdata(instance->device, md, &userdata))
> >>return NULL;
> >>
> >> --
> >> 2.17.1
> >
> --
> 
> Lukasz Wojciechowski
> Principal Software Engineer
> 
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 88 25
> l.wojciec...@partner.samsung.com



Re: [dpdk-dev] [PATCH v8 2/9] net/virtio: enable vectorized path

2020-04-23 Thread Maxime Coquelin



On 4/23/20 2:30 PM, Marvin Liu wrote:
> Previously, virtio split ring vectorized path is enabled as default.

s/is/was/
s/as/by/

> This is not suitable for everyone because of that path not follow virtio

s/because of that path not follow/because that path does not follow the/

> spec. Add new config for virtio vectorized path selection. By default
> vectorized path is disabled.

I think we can keep it enabled by default for consistency between make &
meson, now that you are providing a devarg for it that is disabled by
default.

Maybe we can just drop this config flag, what do you think?

Thanks,
Maxime

> Signed-off-by: Marvin Liu 
> 
> diff --git a/config/common_base b/config/common_base
> index 00d8d0792..334a26a17 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -456,6 +456,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
> +CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR=n
>  
>  #
>  # Compile virtio device emulation inside virtio PMD driver
> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> index c9edb84ee..4b69827ab 100644
> --- a/drivers/net/virtio/Makefile
> +++ b/drivers/net/virtio/Makefile
> @@ -28,6 +28,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
>  
> +ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR),y)
>  ifeq ($(CONFIG_RTE_ARCH_X86),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c
>  else ifeq ($(CONFIG_RTE_ARCH_PPC_64),y)
> @@ -35,6 +36,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += 
> virtio_rxtx_simple_altivec.c
>  else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) $(CONFIG_RTE_ARCH_ARM64)),)
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c
>  endif
> +endif
>  
>  ifeq ($(CONFIG_RTE_VIRTIO_USER),y)
>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c
> diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
> index 15150eea1..ce3525ef5 100644
> --- a/drivers/net/virtio/meson.build
> +++ b/drivers/net/virtio/meson.build
> @@ -8,6 +8,7 @@ sources += files('virtio_ethdev.c',
>   'virtqueue.c')
>  deps += ['kvargs', 'bus_pci']
>  
> +dpdk_conf.set('RTE_LIBRTE_VIRTIO_INC_VECTOR', 1)
>  if arch_subdir == 'x86'
>   sources += files('virtio_rxtx_simple_sse.c')
>  elif arch_subdir == 'ppc'
> 



[dpdk-dev] [PATCH v2] app/testpmd: add parsing for multiple VLAN headers

2020-04-23 Thread Raslan Darawsheh
When having multiple VLANs in the packet, parse_ethernet
is cabable of parsing only the first vlan.

add parsing for mutliple VLAN headers in the packet.

Fixes: 51f694dd40f5 ("app/testpmd: rework checksum forward engine")
Cc: sta...@dpdk.org

Signed-off-by: Raslan Darawsheh 
Acked-by: Ori Kam 
Acked-by: Bernard Iremonger 
---
v2: added QinQ to check for multiple vlan's
---
 app/test-pmd/csumonly.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index fe19615..8626223 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -139,22 +139,23 @@ parse_ipv6(struct rte_ipv6_hdr *ipv6_hdr, struct 
testpmd_offload_info *info)
 
 /*
  * Parse an ethernet header to fill the ethertype, l2_len, l3_len and
- * ipproto. This function is able to recognize IPv4/IPv6 with one optional vlan
- * header. The l4_len argument is only set in case of TCP (useful for TSO).
+ * ipproto. This function is able to recognize IPv4/IPv6 with optional VLAN
+ * headers. The l4_len argument is only set in case of TCP (useful for TSO).
  */
 static void
 parse_ethernet(struct rte_ether_hdr *eth_hdr, struct testpmd_offload_info 
*info)
 {
struct rte_ipv4_hdr *ipv4_hdr;
struct rte_ipv6_hdr *ipv6_hdr;
+   struct rte_vlan_hdr *vlan_hdr;
 
info->l2_len = sizeof(struct rte_ether_hdr);
info->ethertype = eth_hdr->ether_type;
 
-   if (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN)) {
-   struct rte_vlan_hdr *vlan_hdr = (
-   struct rte_vlan_hdr *)(eth_hdr + 1);
-
+   while (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN) ||
+  info->ethertype == _htons(RTE_ETHER_TYPE_QINQ)) {
+   vlan_hdr = (struct rte_vlan_hdr *)
+   ((char *)eth_hdr + info->l2_len);
info->l2_len  += sizeof(struct rte_vlan_hdr);
info->ethertype = vlan_hdr->eth_proto;
}
-- 
2.7.4



Re: [dpdk-dev] [PATCH v8 3/9] net/virtio: inorder should depend on feature bit

2020-04-23 Thread Maxime Coquelin



On 4/23/20 2:31 PM, Marvin Liu wrote:
> Ring initialzation is different when inorder feature negotiated. This
s/initialzation/initialization/
> action should dependent on negotiated feature bits.
> 
> Signed-off-by: Marvin Liu 
> 

Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



Re: [dpdk-dev] [PATCH v8 2/9] net/virtio: enable vectorized path

2020-04-23 Thread Liu, Yong



> -Original Message-
> From: Maxime Coquelin 
> Sent: Thursday, April 23, 2020 4:34 PM
> To: Liu, Yong ; Ye, Xiaolong ;
> Wang, Zhihong 
> Cc: Van Haaren, Harry ; dev@dpdk.org
> Subject: Re: [PATCH v8 2/9] net/virtio: enable vectorized path
> 
> 
> 
> On 4/23/20 2:30 PM, Marvin Liu wrote:
> > Previously, virtio split ring vectorized path is enabled as default.
> 
> s/is/was/
> s/as/by/
> 
> > This is not suitable for everyone because of that path not follow virtio
> 
> s/because of that path not follow/because that path does not follow the/
> 
> > spec. Add new config for virtio vectorized path selection. By default
> > vectorized path is disabled.
> 
> I think we can keep it enabled by default for consistency between make &
> meson, now that you are providing a devarg for it that is disabled by
> default.
> 
> Maybe we can just drop this config flag, what do you think?
> 

Maxime, 
Devarg will only have effect on virtio-user path selection, while DPDK 
configuration can affect both virtio pmd and virtio-user.
It maybe worth to add new configuration as it can allow user to choice whether 
disabled vectorized path in virtio pmd.  
IMHO, AVX512 instructions should be selective in each component. 

Regards,
Marvin

> Thanks,
> Maxime
> 
> > Signed-off-by: Marvin Liu 
> >
> > diff --git a/config/common_base b/config/common_base
> > index 00d8d0792..334a26a17 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -456,6 +456,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
> >  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
> >  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
> >  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
> > +CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR=n
> >
> >  #
> >  # Compile virtio device emulation inside virtio PMD driver
> > diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> > index c9edb84ee..4b69827ab 100644
> > --- a/drivers/net/virtio/Makefile
> > +++ b/drivers/net/virtio/Makefile
> > @@ -28,6 +28,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
> virtio_rxtx.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
> >  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> >
> > +ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR),y)
> >  ifeq ($(CONFIG_RTE_ARCH_X86),y)
> >  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c
> >  else ifeq ($(CONFIG_RTE_ARCH_PPC_64),y)
> > @@ -35,6 +36,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
> virtio_rxtx_simple_altivec.c
> >  else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM)
> $(CONFIG_RTE_ARCH_ARM64)),)
> >  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c
> >  endif
> > +endif
> >
> >  ifeq ($(CONFIG_RTE_VIRTIO_USER),y)
> >  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c
> > diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
> > index 15150eea1..ce3525ef5 100644
> > --- a/drivers/net/virtio/meson.build
> > +++ b/drivers/net/virtio/meson.build
> > @@ -8,6 +8,7 @@ sources += files('virtio_ethdev.c',
> > 'virtqueue.c')
> >  deps += ['kvargs', 'bus_pci']
> >
> > +dpdk_conf.set('RTE_LIBRTE_VIRTIO_INC_VECTOR', 1)
> >  if arch_subdir == 'x86'
> > sources += files('virtio_rxtx_simple_sse.c')
> >  elif arch_subdir == 'ppc'
> >



Re: [dpdk-dev] [PATCH v8 2/9] net/virtio: enable vectorized path

2020-04-23 Thread Maxime Coquelin



On 4/23/20 10:46 AM, Liu, Yong wrote:
> 
> 
>> -Original Message-
>> From: Maxime Coquelin 
>> Sent: Thursday, April 23, 2020 4:34 PM
>> To: Liu, Yong ; Ye, Xiaolong ;
>> Wang, Zhihong 
>> Cc: Van Haaren, Harry ; dev@dpdk.org
>> Subject: Re: [PATCH v8 2/9] net/virtio: enable vectorized path
>>
>>
>>
>> On 4/23/20 2:30 PM, Marvin Liu wrote:
>>> Previously, virtio split ring vectorized path is enabled as default.
>>
>> s/is/was/
>> s/as/by/
>>
>>> This is not suitable for everyone because of that path not follow virtio
>>
>> s/because of that path not follow/because that path does not follow the/
>>
>>> spec. Add new config for virtio vectorized path selection. By default
>>> vectorized path is disabled.
>>
>> I think we can keep it enabled by default for consistency between make &
>> meson, now that you are providing a devarg for it that is disabled by
>> default.
>>
>> Maybe we can just drop this config flag, what do you think?
>>
> 
> Maxime, 
> Devarg will only have effect on virtio-user path selection, while DPDK 
> configuration can affect both virtio pmd and virtio-user.
> It maybe worth to add new configuration as it can allow user to choice 
> whether disabled vectorized path in virtio pmd. 

Ok, so we had a misunderstanding. I was requesting the the devarg to be
effective also for the Virtio PMD, disabled by default.

Thanks,
Maxime
> IMHO, AVX512 instructions should be selective in each component. 
> 
> Regards,
> Marvin
> 
>> Thanks,
>> Maxime
>>
>>> Signed-off-by: Marvin Liu 
>>>
>>> diff --git a/config/common_base b/config/common_base
>>> index 00d8d0792..334a26a17 100644
>>> --- a/config/common_base
>>> +++ b/config/common_base
>>> @@ -456,6 +456,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
>>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
>>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
>>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
>>> +CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR=n
>>>
>>>  #
>>>  # Compile virtio device emulation inside virtio PMD driver
>>> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
>>> index c9edb84ee..4b69827ab 100644
>>> --- a/drivers/net/virtio/Makefile
>>> +++ b/drivers/net/virtio/Makefile
>>> @@ -28,6 +28,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
>> virtio_rxtx.c
>>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
>>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
>>>
>>> +ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR),y)
>>>  ifeq ($(CONFIG_RTE_ARCH_X86),y)
>>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c
>>>  else ifeq ($(CONFIG_RTE_ARCH_PPC_64),y)
>>> @@ -35,6 +36,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
>> virtio_rxtx_simple_altivec.c
>>>  else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM)
>> $(CONFIG_RTE_ARCH_ARM64)),)
>>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c
>>>  endif
>>> +endif
>>>
>>>  ifeq ($(CONFIG_RTE_VIRTIO_USER),y)
>>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c
>>> diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
>>> index 15150eea1..ce3525ef5 100644
>>> --- a/drivers/net/virtio/meson.build
>>> +++ b/drivers/net/virtio/meson.build
>>> @@ -8,6 +8,7 @@ sources += files('virtio_ethdev.c',
>>> 'virtqueue.c')
>>>  deps += ['kvargs', 'bus_pci']
>>>
>>> +dpdk_conf.set('RTE_LIBRTE_VIRTIO_INC_VECTOR', 1)
>>>  if arch_subdir == 'x86'
>>> sources += files('virtio_rxtx_simple_sse.c')
>>>  elif arch_subdir == 'ppc'
>>>
> 



Re: [dpdk-dev] [PATCH v2] app/testpmd: add parsing for multiple VLAN headers

2020-04-23 Thread Iremonger, Bernard
Hi Raslan,

> -Original Message-
> From: Raslan Darawsheh 
> Sent: Thursday, April 23, 2020 9:41 AM
> To: Iremonger, Bernard ; Wu, Jingjing
> ; Lu, Wenzhuo 
> Cc: Yigit, Ferruh ; dev@dpdk.org; sta...@dpdk.org
> Subject: [PATCH v2] app/testpmd: add parsing for multiple VLAN headers

Might be better to replace "multiple" with QINQ in the commit message.
 
> When having multiple VLANs in the packet, parse_ethernet is cabable of

Might be better to replace "multiple" with QINQ

> parsing only the first vlan.
> 
> add parsing for mutliple VLAN headers in the packet.

Might be better to replace "multiple" with QINQ

> 
> Fixes: 51f694dd40f5 ("app/testpmd: rework checksum forward engine")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Raslan Darawsheh 
> Acked-by: Ori Kam 
> Acked-by: Bernard Iremonger 
> ---
> v2: added QinQ to check for multiple vlan's
> ---
>  app/test-pmd/csumonly.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
> fe19615..8626223 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -139,22 +139,23 @@ parse_ipv6(struct rte_ipv6_hdr *ipv6_hdr, struct
> testpmd_offload_info *info)
> 
>  /*
>   * Parse an ethernet header to fill the ethertype, l2_len, l3_len and
> - * ipproto. This function is able to recognize IPv4/IPv6 with one optional 
> vlan
> - * header. The l4_len argument is only set in case of TCP (useful for TSO).
> + * ipproto. This function is able to recognize IPv4/IPv6 with optional
> + VLAN
> + * headers. The l4_len argument is only set in case of TCP (useful for TSO).
>   */
>  static void
>  parse_ethernet(struct rte_ether_hdr *eth_hdr, struct
> testpmd_offload_info *info)  {
>   struct rte_ipv4_hdr *ipv4_hdr;
>   struct rte_ipv6_hdr *ipv6_hdr;
> + struct rte_vlan_hdr *vlan_hdr;
> 
>   info->l2_len = sizeof(struct rte_ether_hdr);
>   info->ethertype = eth_hdr->ether_type;
> 
> - if (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN)) {
> - struct rte_vlan_hdr *vlan_hdr = (
> - struct rte_vlan_hdr *)(eth_hdr + 1);
> -
> + while (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN) ||
> +info->ethertype == _htons(RTE_ETHER_TYPE_QINQ)) {
> + vlan_hdr = (struct rte_vlan_hdr *)
> + ((char *)eth_hdr + info->l2_len);
>   info->l2_len  += sizeof(struct rte_vlan_hdr);
>   info->ethertype = vlan_hdr->eth_proto;
>   }
> --
> 2.7.4
Otherwise

Acked-by: Bernard Iremonger 



Re: [dpdk-dev] [PATCH v2] app/testpmd: add parsing for multiple VLAN headers

2020-04-23 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: Iremonger, Bernard 
> Sent: Thursday, April 23, 2020 12:00 PM
> To: Raslan Darawsheh ; Wu, Jingjing
> ; Lu, Wenzhuo 
> Cc: Yigit, Ferruh ; dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH v2] app/testpmd: add parsing for multiple VLAN headers
> 
> Hi Raslan,
> 
> > -Original Message-
> > From: Raslan Darawsheh 
> > Sent: Thursday, April 23, 2020 9:41 AM
> > To: Iremonger, Bernard ; Wu, Jingjing
> > ; Lu, Wenzhuo 
> > Cc: Yigit, Ferruh ; dev@dpdk.org;
> sta...@dpdk.org
> > Subject: [PATCH v2] app/testpmd: add parsing for multiple VLAN headers
> 
> Might be better to replace "multiple" with QINQ in the commit message.
> 
> > When having multiple VLANs in the packet, parse_ethernet is cabable of
> 
> Might be better to replace "multiple" with QINQ
> 
> > parsing only the first vlan.
> >
> > add parsing for mutliple VLAN headers in the packet.
> 
> Might be better to replace "multiple" with QINQ

Sure, will handle in v3

> 
> >
> > Fixes: 51f694dd40f5 ("app/testpmd: rework checksum forward engine")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Raslan Darawsheh 
> > Acked-by: Ori Kam 
> > Acked-by: Bernard Iremonger 
> > ---
> > v2: added QinQ to check for multiple vlan's
> > ---
> >  app/test-pmd/csumonly.c | 13 +++--
> >  1 file changed, 7 insertions(+), 6 deletions(-)
> >
> > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c index
> > fe19615..8626223 100644
> > --- a/app/test-pmd/csumonly.c
> > +++ b/app/test-pmd/csumonly.c
> > @@ -139,22 +139,23 @@ parse_ipv6(struct rte_ipv6_hdr *ipv6_hdr, struct
> > testpmd_offload_info *info)
> >
> >  /*
> >   * Parse an ethernet header to fill the ethertype, l2_len, l3_len and
> > - * ipproto. This function is able to recognize IPv4/IPv6 with one optional
> vlan
> > - * header. The l4_len argument is only set in case of TCP (useful for TSO).
> > + * ipproto. This function is able to recognize IPv4/IPv6 with optional
> > + VLAN
> > + * headers. The l4_len argument is only set in case of TCP (useful for 
> > TSO).
> >   */
> >  static void
> >  parse_ethernet(struct rte_ether_hdr *eth_hdr, struct
> > testpmd_offload_info *info)  {
> > struct rte_ipv4_hdr *ipv4_hdr;
> > struct rte_ipv6_hdr *ipv6_hdr;
> > +   struct rte_vlan_hdr *vlan_hdr;
> >
> > info->l2_len = sizeof(struct rte_ether_hdr);
> > info->ethertype = eth_hdr->ether_type;
> >
> > -   if (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN)) {
> > -   struct rte_vlan_hdr *vlan_hdr = (
> > -   struct rte_vlan_hdr *)(eth_hdr + 1);
> > -
> > +   while (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN) ||
> > +  info->ethertype == _htons(RTE_ETHER_TYPE_QINQ)) {
> > +   vlan_hdr = (struct rte_vlan_hdr *)
> > +   ((char *)eth_hdr + info->l2_len);
> > info->l2_len  += sizeof(struct rte_vlan_hdr);
> > info->ethertype = vlan_hdr->eth_proto;
> > }
> > --
> > 2.7.4
> Otherwise
> 
> Acked-by: Bernard Iremonger 

Kindest regards,
Raslan Darawsheh


[dpdk-dev] [PATCH v3] app/testpmd: add parsing for QINQ VLAN headers

2020-04-23 Thread Raslan Darawsheh
When having QINQ VLAN headers in the packet, parse_ethernet
is cabable of parsing only the first vlan.

add parsing for QINQ VLAN headers in the packet.

Fixes: 51f694dd40f5 ("app/testpmd: rework checksum forward engine")
Cc: sta...@dpdk.org

Signed-off-by: Raslan Darawsheh 
Acked-by: Ori Kam 
Acked-by: Bernard Iremonger 
---
v2: added QinQ to check for multiple vlan's
v3: reword commit to replace Multiple with QINQ
---
 app/test-pmd/csumonly.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index fe19615..8626223 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -139,22 +139,23 @@ parse_ipv6(struct rte_ipv6_hdr *ipv6_hdr, struct 
testpmd_offload_info *info)
 
 /*
  * Parse an ethernet header to fill the ethertype, l2_len, l3_len and
- * ipproto. This function is able to recognize IPv4/IPv6 with one optional vlan
- * header. The l4_len argument is only set in case of TCP (useful for TSO).
+ * ipproto. This function is able to recognize IPv4/IPv6 with optional VLAN
+ * headers. The l4_len argument is only set in case of TCP (useful for TSO).
  */
 static void
 parse_ethernet(struct rte_ether_hdr *eth_hdr, struct testpmd_offload_info 
*info)
 {
struct rte_ipv4_hdr *ipv4_hdr;
struct rte_ipv6_hdr *ipv6_hdr;
+   struct rte_vlan_hdr *vlan_hdr;
 
info->l2_len = sizeof(struct rte_ether_hdr);
info->ethertype = eth_hdr->ether_type;
 
-   if (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN)) {
-   struct rte_vlan_hdr *vlan_hdr = (
-   struct rte_vlan_hdr *)(eth_hdr + 1);
-
+   while (info->ethertype == _htons(RTE_ETHER_TYPE_VLAN) ||
+  info->ethertype == _htons(RTE_ETHER_TYPE_QINQ)) {
+   vlan_hdr = (struct rte_vlan_hdr *)
+   ((char *)eth_hdr + info->l2_len);
info->l2_len  += sizeof(struct rte_vlan_hdr);
info->ethertype = vlan_hdr->eth_proto;
}
-- 
2.7.4



Re: [dpdk-dev] [PATCH 1/7] eal: move OS common functions to single file

2020-04-23 Thread Dmitry Kozlyuk
On 2020-04-23 09:27 GMT+0200 Thomas Monjalon wrote:
> 23/04/2020 01:51, Ranjit Menon:
> > On 4/22/2020 12:27 AM, tal...@mellanox.com wrote:  
> > > From: Tal Shnaiderman 
> > > 
> > > Move common functions between Unix and Windows to eal_config.c.  
> > 
> > Like other files in common, we should call this eal_common_config.c  
> 
> I am not sure about the interest of repeating the directory name
> in the file name in general.
> Do you see a real benefit?

It allows using VPATH in Makefile. If filenames are identical in different
VPATH directories, make can't pick both. Makefiles are being deprecated, but
they'll be around for some more time.

-- 
Dmitry Kozlyuk


Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Anoob Joseph
Hi Konstantin,

Please see inline.

Thanks,
Anoob

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Thursday, April 23, 2020 1:24 PM
> To: Anoob Joseph ; dev@dpdk.org
> Cc: akhil.go...@nxp.com; Doherty, Declan ;
> techbo...@dpdk.org
> Subject: [EXT] RE: [dpdk-dev] [PATCH] security: fix crash at accessing non-
> implemented ops
> 
> External Email
> 
> --
> >
> > Hi Konstantin,
> >
> > These are data path ops and so it will be better if we can avoid such 
> > checks in
> the datapath. The same is done in ethdev also.
> 
> AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
> data-
> path.
> So far there was no strict requirement for the rte_security PMDs to *always*
> implement it.

[Anoob] I don't think DPDK categorizes dev-ops as *optional* and *always*. If 
yes, can you point me?

My understanding is, all ops are optional. For example, I could implement a 
crypto PMD which is doing packet delivery only via event device (using crypto 
adapter). So dequeue op will not be implemented in that case and DPDK spec 
allows it. 
 
> So what you guys did is a silent change of public API behaviour.

[Anoob] I believe Lukasz had submitted 3 or 4 revisions and it was all in the 
ML. RTE_DEBUG was suggested by Thomas I guess.
 
> As result ixgbe, (and probably some others rte_security PMDs) stopped working
> properly.

[Anoob] set_pkt_metadata() is the only one of interest to IXGBE. And I believe 
the function is implemented as well. So what exactly is the concern?
 
> I don't see any point in these changes, but if you'd like to do that, at 
> least our
> usual procedure has to be followed:
> 1. Send and RFC to get an agreement with rte_security PMDs maintainers (one
> release ahead) 2. send a deprecation note (one release ahead) 3. change the
> behaviour of the public API 4. update release notes
> 
> AFAIK 1), 2), 4) wasn't done.
> So I think right now we need to revert original behaviour.
> 
> >
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__code.dpdk.org_dpdk
> > _v20.02_source_lib_librte-5Fethdev_rte-5Fethdev.h-23L4372&d=DwIFAg&c=n
> > KjWec2b6R0mOyPaz7xtfQ&r=jPfB8rwwviRSxyLWs2n6B-
> WYLn1v9SyTMrT5EQqh2TU&m=
> > 6ObfSanVVuHOsiqVlWxXsFWi-
> 2XNp76HCOX0vbUfma4&s=jDVyDDEILmgY1Yb9ZBswBVbn
> > 8FpZuQI5ukH_osmtUiI&e=
> >
> > Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such
> checks.
> > https://urldefense.proofpoint.com/v2/url?u=http-3A__code.dpdk.org_dpdk
> > _v20.02_source_lib_librte-5Fcryptodev_rte-5Fcryptodev.h-23L962&d=DwIFA
> > g&c=nKjWec2b6R0mOyPaz7xtfQ&r=jPfB8rwwviRSxyLWs2n6B-
> WYLn1v9SyTMrT5EQqh2
> > TU&m=6ObfSanVVuHOsiqVlWxXsFWi-
> 2XNp76HCOX0vbUfma4&s=LEWQOKs0r2Im_zL95VI
> > df4kQ2Pu0iRHV9Co2J1gsNBE&e=
> 
> That's a different story:
> rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that have
> to be implemented by each  ethdev/cryptodev API.

[Anoob] I couldn't find any reference stating that way. If you can point me, I 
can update that to include datapath ops required for inline protocol processing.

> 
> >
> >
> > Thanks,
> > Anoob
> >
> > > -Original Message-
> > > From: dev  On Behalf Of Konstantin Ananyev
> > > Sent: Thursday, April 23, 2020 5:22 AM
> > > To: dev@dpdk.org
> > > Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin
> > > Ananyev 
> > > Subject: [dpdk-dev] [PATCH] security: fix crash at accessing
> > > non-implemented ops
> > >
> > > Valid checks for optional function pointers inside dev-ops were
> > > disabled by undefined macro.
> > >
> > > Fixes: b6ee98547847 ("security: fix verification of parameters")
> > >
> > > Signed-off-by: Konstantin Ananyev 
> > > ---
> > >  lib/librte_security/rte_security.c | 4 
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/lib/librte_security/rte_security.c
> > > b/lib/librte_security/rte_security.c
> > > index d475b0977..b65430ce2 100644
> > > --- a/lib/librte_security/rte_security.c
> > > +++ b/lib/librte_security/rte_security.c
> > > @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct
> > > rte_security_ctx *instance,
> > > struct rte_security_session *sess,
> > > struct rte_mbuf *m, void *params)  { -#ifdef
> RTE_DEBUG
> > >   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -
> > > EINVAL,
> > >   -ENOTSUP);
> > >   RTE_PTR_OR_ERR_RET(sess, -EINVAL); -#endif
> > >   return instance->ops->set_pkt_metadata(instance->device,
> > >  sess, m, params);
> > >  }
> > > @@ -121,9 +119,7 @@ rte_security_get_userdata(struct
> > > rte_security_ctx *instance, uint64_t md)  {
> > >   void *userdata = NULL;
> > >
> > > -#ifdef RTE_DEBUG
> > >   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> > > NULL); -#endif
> > >   if (instance->ops->get_userdata(instance->device, md, &userdata))
> > >   return NULL;
> > >
> > > --
> > > 2.17.1



Re: [dpdk-dev] [PATCH v8 2/9] net/virtio: enable vectorized path

2020-04-23 Thread Liu, Yong



> -Original Message-
> From: Maxime Coquelin 
> Sent: Thursday, April 23, 2020 4:50 PM
> To: Liu, Yong ; Ye, Xiaolong ;
> Wang, Zhihong 
> Cc: Van Haaren, Harry ; dev@dpdk.org
> Subject: Re: [PATCH v8 2/9] net/virtio: enable vectorized path
> 
> 
> 
> On 4/23/20 10:46 AM, Liu, Yong wrote:
> >
> >
> >> -Original Message-
> >> From: Maxime Coquelin 
> >> Sent: Thursday, April 23, 2020 4:34 PM
> >> To: Liu, Yong ; Ye, Xiaolong ;
> >> Wang, Zhihong 
> >> Cc: Van Haaren, Harry ; dev@dpdk.org
> >> Subject: Re: [PATCH v8 2/9] net/virtio: enable vectorized path
> >>
> >>
> >>
> >> On 4/23/20 2:30 PM, Marvin Liu wrote:
> >>> Previously, virtio split ring vectorized path is enabled as default.
> >>
> >> s/is/was/
> >> s/as/by/
> >>
> >>> This is not suitable for everyone because of that path not follow virtio
> >>
> >> s/because of that path not follow/because that path does not follow the/
> >>
> >>> spec. Add new config for virtio vectorized path selection. By default
> >>> vectorized path is disabled.
> >>
> >> I think we can keep it enabled by default for consistency between make &
> >> meson, now that you are providing a devarg for it that is disabled by
> >> default.
> >>
> >> Maybe we can just drop this config flag, what do you think?
> >>
> >
> > Maxime,
> > Devarg will only have effect on virtio-user path selection, while DPDK
> configuration can affect both virtio pmd and virtio-user.
> > It maybe worth to add new configuration as it can allow user to choice
> whether disabled vectorized path in virtio pmd.
> 
> Ok, so we had a misunderstanding. I was requesting the the devarg to be
> effective also for the Virtio PMD, disabled by default.
> 
Got you, will change in next vesion.

> Thanks,
> Maxime
> > IMHO, AVX512 instructions should be selective in each component.
> >
> > Regards,
> > Marvin
> >
> >> Thanks,
> >> Maxime
> >>
> >>> Signed-off-by: Marvin Liu 
> >>>
> >>> diff --git a/config/common_base b/config/common_base
> >>> index 00d8d0792..334a26a17 100644
> >>> --- a/config/common_base
> >>> +++ b/config/common_base
> >>> @@ -456,6 +456,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
> >>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
> >>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
> >>>  CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
> >>> +CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR=n
> >>>
> >>>  #
> >>>  # Compile virtio device emulation inside virtio PMD driver
> >>> diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
> >>> index c9edb84ee..4b69827ab 100644
> >>> --- a/drivers/net/virtio/Makefile
> >>> +++ b/drivers/net/virtio/Makefile
> >>> @@ -28,6 +28,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
> >> virtio_rxtx.c
> >>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_ethdev.c
> >>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c
> >>>
> >>> +ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_INC_VECTOR),y)
> >>>  ifeq ($(CONFIG_RTE_ARCH_X86),y)
> >>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c
> >>>  else ifeq ($(CONFIG_RTE_ARCH_PPC_64),y)
> >>> @@ -35,6 +36,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) +=
> >> virtio_rxtx_simple_altivec.c
> >>>  else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM)
> >> $(CONFIG_RTE_ARCH_ARM64)),)
> >>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c
> >>>  endif
> >>> +endif
> >>>
> >>>  ifeq ($(CONFIG_RTE_VIRTIO_USER),y)
> >>>  SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c
> >>> diff --git a/drivers/net/virtio/meson.build
> b/drivers/net/virtio/meson.build
> >>> index 15150eea1..ce3525ef5 100644
> >>> --- a/drivers/net/virtio/meson.build
> >>> +++ b/drivers/net/virtio/meson.build
> >>> @@ -8,6 +8,7 @@ sources += files('virtio_ethdev.c',
> >>>   'virtqueue.c')
> >>>  deps += ['kvargs', 'bus_pci']
> >>>
> >>> +dpdk_conf.set('RTE_LIBRTE_VIRTIO_INC_VECTOR', 1)
> >>>  if arch_subdir == 'x86'
> >>>   sources += files('virtio_rxtx_simple_sse.c')
> >>>  elif arch_subdir == 'ppc'
> >>>
> >



Re: [dpdk-dev] [PATCH 2/2] eal: resolve getentropy at run time for random seed

2020-04-23 Thread Luca Boccassi
On Wed, 2020-04-22 at 17:35 -0300, Dan Gora wrote:
> On Wed, Apr 22, 2020 at 5:14 PM Mattias Rönnblom
>  wrote:
> > On 2020-04-22 19:44, Dan Gora wrote:
> > > On Wed, Apr 22, 2020 at 5:28 AM Mattias Rönnblom
> > >  wrote:
> > > > On 2020-04-21 21:54, Dan Gora wrote:
> > > > > The getentropy() function was introduced into glibc v2.25 and so is
> > > > > not available on all supported platforms.  Previously, if DPDK was
> > > > > compiled (using meson) on a system which has getentropy(), it would
> > > > > introduce a dependency on glibc v2.25 which would prevent that binary
> > > > > from running on a system with an older glibc.  Similarly if DPDK was
> > > > > compiled on a system which did not have getentropy(), getentropy()
> > > > > could not be used even if the execution system supported it.
> > > > > 
> > > > > Introduce a new static function, __rte_getentropy() which will try to
> > > > > resolve the getentropy() function dynamically using dlopen()/dlsym(),
> > > > > returning failure if the getentropy() function cannot be resolved or
> > > > > if it fails.
> > > > 
> > > > Two other options: providing a DPDK-native syscall wrapper for
> > > > getrandom(), or falling back to reading /dev/urandom. Have you
> > > > considered any of those two options? If so, why do you prefer
> > > > dlopen()/dlsym()?
> > > I didn't give any thought at all to using /dev/urandom.  The goal was
> > > not really to change how the thing worked, just to remove the
> > > dependency on glibc 2.25.
> > 
> > /dev/urandom is basically only a different interface to the same
> > underlying mechanism.

This is not the whole story though - while the end result when all
works is the same, there are important differences in getting there.
There's a reason a programmatic interface was added - it's just better
in general.
Just to name one - opening files has implications for LSMs like
SELinux. You now need a specific policy to allow it, which means
applications that upgrade from one version of DPDK to the next will
break.

In general, I do not think we should go backwards. The programmatic
interface to the random pools are good and we should use them by
default - of course by all means add fallbacks to urandom if they are
not available.

But as Stephen said glibc generally does not support compiling on new +
running on old - so if it's not this that breaks, it will be something
else.

-- 
Kind regards,
Luca Boccassi


[dpdk-dev] [PATCH v1] abi: document reasons behind the three part versioning

2020-04-23 Thread Ray Kinsella
Clarify the reasons behind the three part version numbering scheme.
Documents the fixes made in f26c2b3.

Signed-off-by: Ray Kinsella 
Signed-off-by: Bruce Richardson 
---
 doc/guides/contributing/abi_policy.rst |  3 ++-
 doc/guides/rel_notes/release_20_05.rst | 12 
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/doc/guides/contributing/abi_policy.rst 
b/doc/guides/contributing/abi_policy.rst
index 05ca959..86e7dd9 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -39,7 +39,8 @@ General Guidelines
releases, over a number of release cycles. This change begins with
maintaining ABI stability through one year of DPDK releases starting from
DPDK 19.11. This policy will be reviewed in 2020, with intention of
-   lengthening the stability period.
+   lengthening the stability period. Additional implementation detail can be
+   found in the :ref:`release notes <20_05_abi_changes>`.
 
 What is an ABI?
 ~~~
diff --git a/doc/guides/rel_notes/release_20_05.rst 
b/doc/guides/rel_notes/release_20_05.rst
index 7f2049a..8653f7a 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -164,6 +164,7 @@ API Changes
Also, make sure to start the actual text at the margin.
=
 
+.. _20_05_abi_changes:
 
 ABI Changes
 ---
@@ -180,6 +181,17 @@ ABI Changes
Also, make sure to start the actual text at the margin.
=
 
+* The soname for each stable ABI version should be just the ABI version major
+  number without the minor number. Unfortunately both major and minor were used
+  in the v19.11 release, causing version v20.x releases to be incompatible with
+  ABI v20.0.
+
+  The `commit f26c2b3
+  
`_
+  fixed the issue by switching from 2-part to 3-part ABI version numbers so 
that
+  we can keep v20.0 as soname and using the final digits to identify the DPDK
+  20.x releases which are ABI compatible.
+
 * No ABI change that would break compatibility with DPDK 20.02 and 19.11.
 
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2 00/16] update and simplify telemetry library.

2020-04-23 Thread Luca Boccassi
On Thu, 2020-04-09 at 11:37 +0200, Thomas Monjalon wrote:
> 09/04/2020 11:19, Bruce Richardson:
> > On Wed, Apr 08, 2020 at 08:03:26PM +0200, Thomas Monjalon wrote:
> > > 08/04/2020 18:49, Ciara Power:
> > > > This patchset extensively reworks the telemetry library adding new
> > > > functionality and simplifying much of the existing code, while
> > > > maintaining backward compatibility.
> > > > 
> > > > This work is based on the previously sent RFC for a "process info"
> > > > library: https://patchwork.dpdk.org/project/dpdk/list/?series=7741
> > > > However, rather than creating a new library, this patchset takes
> > > > that work and merges it into the existing telemetry library, as
> > > > mentioned above.
> > > > 
> > > > The telemetry library as shipped in 19.11 is based upon the metrics
> > > > library and outputs all statistics based on that as a source. However,
> > > > this limits the telemetry output to only port-level statistics
> > > > information, rather than allowing it to be used as a general scheme for
> > > > telemetry information across all DPDK libraries.
> > > > 
> > > > With this patchset applied, rather than the telemetry library being
> > > > responsible for pulling ethdev stats and pushing them into the metrics
> > > > library for retrieval later, each library e.g. ethdev, rawdev, and even
> > > > the metrics library itself (for backwards compatiblity) now handle their
> > > > own stats.  Any library or app can register a callback function with
> > > > telemetry, which will be called if requested by the client connected via
> > > > the telemetry socket. The callback function in the library/app then
> > > > formats its stats, or other data, into a JSON string, and returns it to
> > > > telemetry to be sent to the client.
> > > 
> > > I think this is a global need in DPDK, and it is usually called RPC,
> > > IPC or control messaging.
> > > We had a similar need for multi-process communication, thus rte_mp IPC.
> > > We also need a control channel for user configuration applications.
> > > We also need to control some features like logging or tracing.
> > > 
> > > In my opinion, it is time to introduce a general control channel in DPDK.
> > > The application must be in the loop of the control mechanism.
> > > Making such channel standard will ease application adoption.
> > > 
> > > Please read some comments here:
> > > http://inbox.dpdk.org/dev/2580933.jp2sp48Hzj@xps/
> > > 
> > Hi Thomas,
> > 
> > I agree that having a single control mechanism or messaging mechanism in
> > DPDK would be nice to have. However, I don't believe the plans for such a
> > scheme should impact this patchset right now as the idea of a common
> > channel was only first mooted about a week ago, and while there has been
> > some email discussion about it, there is as yet no requirements list that
> > I've seen, nobody actually doing coding work on it, no rfc and most
> > importantly no timeline for creating and merging such into DPDK.
> 
> Yes, this is a new idea.
> Throwing the idea in this "telemetry" thread and in "IF proxy" thread
> is the first step before starting a dedicated thread to design
> a generic mechanism.

May I offer the services of https://zeromq.org/ ?

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v2 00/16] update and simplify telemetry library.

2020-04-23 Thread Thomas Monjalon
23/04/2020 12:30, Luca Boccassi:
> On Thu, 2020-04-09 at 11:37 +0200, Thomas Monjalon wrote:
> > 09/04/2020 11:19, Bruce Richardson:
> > > On Wed, Apr 08, 2020 at 08:03:26PM +0200, Thomas Monjalon wrote:
> > > > 08/04/2020 18:49, Ciara Power:
> > > > > This patchset extensively reworks the telemetry library adding new
> > > > > functionality and simplifying much of the existing code, while
> > > > > maintaining backward compatibility.
> > > > > 
> > > > > This work is based on the previously sent RFC for a "process info"
> > > > > library: https://patchwork.dpdk.org/project/dpdk/list/?series=7741
> > > > > However, rather than creating a new library, this patchset takes
> > > > > that work and merges it into the existing telemetry library, as
> > > > > mentioned above.
> > > > > 
> > > > > The telemetry library as shipped in 19.11 is based upon the metrics
> > > > > library and outputs all statistics based on that as a source. However,
> > > > > this limits the telemetry output to only port-level statistics
> > > > > information, rather than allowing it to be used as a general scheme 
> > > > > for
> > > > > telemetry information across all DPDK libraries.
> > > > > 
> > > > > With this patchset applied, rather than the telemetry library being
> > > > > responsible for pulling ethdev stats and pushing them into the metrics
> > > > > library for retrieval later, each library e.g. ethdev, rawdev, and 
> > > > > even
> > > > > the metrics library itself (for backwards compatiblity) now handle 
> > > > > their
> > > > > own stats.  Any library or app can register a callback function with
> > > > > telemetry, which will be called if requested by the client connected 
> > > > > via
> > > > > the telemetry socket. The callback function in the library/app then
> > > > > formats its stats, or other data, into a JSON string, and returns it 
> > > > > to
> > > > > telemetry to be sent to the client.
> > > > 
> > > > I think this is a global need in DPDK, and it is usually called RPC,
> > > > IPC or control messaging.
> > > > We had a similar need for multi-process communication, thus rte_mp IPC.
> > > > We also need a control channel for user configuration applications.
> > > > We also need to control some features like logging or tracing.
> > > > 
> > > > In my opinion, it is time to introduce a general control channel in 
> > > > DPDK.
> > > > The application must be in the loop of the control mechanism.
> > > > Making such channel standard will ease application adoption.
> > > > 
> > > > Please read some comments here:
> > > > http://inbox.dpdk.org/dev/2580933.jp2sp48Hzj@xps/
> > > > 
> > > Hi Thomas,
> > > 
> > > I agree that having a single control mechanism or messaging mechanism in
> > > DPDK would be nice to have. However, I don't believe the plans for such a
> > > scheme should impact this patchset right now as the idea of a common
> > > channel was only first mooted about a week ago, and while there has been
> > > some email discussion about it, there is as yet no requirements list that
> > > I've seen, nobody actually doing coding work on it, no rfc and most
> > > importantly no timeline for creating and merging such into DPDK.
> > 
> > Yes, this is a new idea.
> > Throwing the idea in this "telemetry" thread and in "IF proxy" thread
> > is the first step before starting a dedicated thread to design
> > a generic mechanism.
> 
> May I offer the services of https://zeromq.org/ ?

This is what I already proposed:
http://inbox.dpdk.org/dev/20334513.huCnfhLgOn@xps/

I'm sorry, I was supposed to start a new thread for this discussion.
I will summarize my thoughts and discussions just after -rc1 is done.




Re: [dpdk-dev] [PATCH 1/7] eal: move OS common functions to single file

2020-04-23 Thread Thomas Monjalon
23/04/2020 11:06, Dmitry Kozlyuk:
> On 2020-04-23 09:27 GMT+0200 Thomas Monjalon wrote:
> > 23/04/2020 01:51, Ranjit Menon:
> > > On 4/22/2020 12:27 AM, tal...@mellanox.com wrote:  
> > > > From: Tal Shnaiderman 
> > > > 
> > > > Move common functions between Unix and Windows to eal_config.c.  
> > > 
> > > Like other files in common, we should call this eal_common_config.c  
> > 
> > I am not sure about the interest of repeating the directory name
> > in the file name in general.
> > Do you see a real benefit?
> 
> It allows using VPATH in Makefile. If filenames are identical in different
> VPATH directories, make can't pick both. Makefiles are being deprecated, but
> they'll be around for some more time.

Makefile will be removed in 20.11




Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin


> > >
> > > These are data path ops and so it will be better if we can avoid such 
> > > checks in
> > the datapath. The same is done in ethdev also.
> >
> > AFAIK,  get_userdata is an *optional* dev-ops function that can be used by 
> > data-
> > path.
> > So far there was no strict requirement for the rte_security PMDs to *always*
> > implement it.
> 
> [Anoob] I don't think DPDK categorizes dev-ops as *optional* and *always*. If 
> yes, can you point me?

> My understanding is, all ops are optional. For example, I could implement a 
> crypto PMD which is doing packet delivery only via event device
> (using crypto adapter). So dequeue op will not be implemented in that case 
> and DPDK spec allows it.

Your PMD can have enqueue_burst/dequeue_burst as NOP,
but you still have  to provide valid function pointers:
they are stored inside crypto_dev structure itself and will be called
unconditionally (without any extra checking) by
rte_cryptodev_enqueue_burst/rte_cryptodev_dequeue_burst.
For all other calls (both data and control path) there is a check
that actual function pointer is a valid one.
Same story for eth dev: pkt_rx_burst/pkt_tx_burst and rest of dev-ops.
 
> > So what you guys did is a silent change of public API behaviour.
> 
> [Anoob] I believe Lukasz had submitted 3 or 4 revisions and it was all in the 
> ML. RTE_DEBUG was suggested by Thomas I guess.

I believe it is not a right procedure to change existing behaviour of 
rte_security framework.
I think you have to communicate clear and loudly in advance (at least one 
release in advance).
Plus RTE_DEBUG has nothing to do with changing non-debug behaviour.
 
> > As result ixgbe, (and probably some others rte_security PMDs) stopped 
> > working
> > properly.
> 
> [Anoob] set_pkt_metadata() is the only one of interest to IXGBE. And I 
> believe the function is implemented as well. So what exactly is the
> concern?

Check that ops->get_userdata is a valid function pointer will be compiled out.
So PMDs that don't implement this function will crash in 
rte_security_get_userdata().
In our particular case - ixgbe.
Same story with  rte_security_set_pkt_metadata() - see the patch. 

> 
> > I don't see any point in these changes, but if you'd like to do that, at 
> > least our
> > usual procedure has to be followed:
> > 1. Send and RFC to get an agreement with rte_security PMDs maintainers (one
> > release ahead) 2. send a deprecation note (one release ahead) 3. change the
> > behaviour of the public API 4. update release notes
> >
> > AFAIK 1), 2), 4) wasn't done.
> > So I think right now we need to revert original behaviour.
> >
> > >
> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__code.dpdk.org_dpdk
> > > _v20.02_source_lib_librte-5Fethdev_rte-5Fethdev.h-23L4372&d=DwIFAg&c=n
> > > KjWec2b6R0mOyPaz7xtfQ&r=jPfB8rwwviRSxyLWs2n6B-
> > WYLn1v9SyTMrT5EQqh2TU&m=
> > > 6ObfSanVVuHOsiqVlWxXsFWi-
> > 2XNp76HCOX0vbUfma4&s=jDVyDDEILmgY1Yb9ZBswBVbn
> > > 8FpZuQI5ukH_osmtUiI&e=
> > >
> > > Datapath functions in cryptodev (enqueue/dequeue) doesn't even have such
> > checks.
> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__code.dpdk.org_dpdk
> > > _v20.02_source_lib_librte-5Fcryptodev_rte-5Fcryptodev.h-23L962&d=DwIFA
> > > g&c=nKjWec2b6R0mOyPaz7xtfQ&r=jPfB8rwwviRSxyLWs2n6B-
> > WYLn1v9SyTMrT5EQqh2
> > > TU&m=6ObfSanVVuHOsiqVlWxXsFWi-
> > 2XNp76HCOX0vbUfma4&s=LEWQOKs0r2Im_zL95VI
> > > df4kQ2Pu0iRHV9Co2J1gsNBE&e=
> >
> > That's a different story:
> > rx_burst/tx_burst, enqueue/dequeue are mandatory dev-ops functions that have
> > to be implemented by each  ethdev/cryptodev API.
> 
> [Anoob] I couldn't find any reference stating that way. If you can point me, 
> I can update that to include datapath ops required for inline
> protocol processing.

Look at the code.

> 
> >
> > >
> > >
> > > Thanks,
> > > Anoob
> > >
> > > > -Original Message-
> > > > From: dev  On Behalf Of Konstantin Ananyev
> > > > Sent: Thursday, April 23, 2020 5:22 AM
> > > > To: dev@dpdk.org
> > > > Cc: akhil.go...@nxp.com; declan.dohe...@intel.com; Konstantin
> > > > Ananyev 
> > > > Subject: [dpdk-dev] [PATCH] security: fix crash at accessing
> > > > non-implemented ops
> > > >
> > > > Valid checks for optional function pointers inside dev-ops were
> > > > disabled by undefined macro.
> > > >
> > > > Fixes: b6ee98547847 ("security: fix verification of parameters")
> > > >
> > > > Signed-off-by: Konstantin Ananyev 
> > > > ---
> > > >  lib/librte_security/rte_security.c | 4 
> > > >  1 file changed, 4 deletions(-)
> > > >
> > > > diff --git a/lib/librte_security/rte_security.c
> > > > b/lib/librte_security/rte_security.c
> > > > index d475b0977..b65430ce2 100644
> > > > --- a/lib/librte_security/rte_security.c
> > > > +++ b/lib/librte_security/rte_security.c
> > > > @@ -107,11 +107,9 @@ rte_security_set_pkt_metadata(struct
> > > > rte_security_ctx *instance,
> > > >   struct rte_security_session *sess,
> > > >  

Re: [dpdk-dev] [PATCHv3] Remove validate-abi.sh from tree

2020-04-23 Thread Neil Horman
On Wed, Apr 22, 2020 at 02:16:57PM +0200, Thomas Monjalon wrote:
> 22/04/2020 14:01, Neil Horman:
> > On Tue, Apr 21, 2020 at 11:42:42PM +0200, Thomas Monjalon wrote:
> > > 21/04/2020 20:56, Neil Horman:
> > > > On Tue, Apr 21, 2020 at 01:46:43PM +0200, Thomas Monjalon wrote:
> > > > > 21/04/2020 13:12, Neil Horman:
> > > > > > On Fri, Apr 17, 2020 at 04:42:38PM +0100, Ray Kinsella wrote:
> > > > > > > On 17/04/2020 13:10, Thomas Monjalon wrote:
> > > > > > > > 17/04/2020 13:47, Ray Kinsella:
> > > > > > > >> On 17/04/2020 11:20, Thomas Monjalon wrote:
> > > > > > > >>> 17/04/2020 12:11, Ray Kinsella:
> > > > > > >  check-abi.sh appears to be backward step in terms of 
> > > > > > >  usability.
> > > > > > > >>>
> > > > > > > >>> No, check-abi.sh benefits from a nice integration in build 
> > > > > > > >>> scripts.
> > > > > > > >>> See below.
> > > > > > > >>>
> > > > > > >  With validate-abi.sh I do can do a "validate-abi.sh HEAD~1 
> > > > > > >  HEAD".
> > > > > > >  And it will do the build, install, dump and comparison for 
> > > > > > >  me. 
> > > > > > >  And it picked up my 20.0.2 - > 21.0 changes no problem. 
> > > > > > > 
> > > > > > >  With check-abi on the other hand, I need to the build and 
> > > > > > >  install myself.
> > > > > > >  check-abi requires dump files, but I see no reference in the 
> > > > > > >  documentation to how these are created.
> > > > > > >  It silently fails when it doesn't find any ...
> > > > > > > 
> > > > > > >  Do I run abi-dumper on the so's myself, or how does it work?
> > > > > > > >>>
> > > > > > > >>> check-abi.sh is integrated in test-build.sh and 
> > > > > > > >>> test-meson-builds.sh.
> > > > > > > >>> Probably we should document usage in these scripts.
> > > > > > > >>
> > > > > > > >> Looks like I need to set DPDK_ABI_REF_VERSION=master, not 
> > > > > > > >> obvious.
> > > > > > > >> Any tips or tricks would be welcome.
> > > > > > > > 
> > > > > > > > export DPDK_ABI_REF_VERSION=v20.02
> > > > > > > > or
> > > > > > > > export DPDK_ABI_REF_VERSION=v19.11
> > > > > > > > 
> > > > > > > > Depends on which compatibility you want to test...
> > > > > > > > 
> > > > > > > 
> > > > > > > Few things ...
> > > > > > > 
> > > > > > > 1. test-meson-build.sh keep barfing complaining about reference 
> > > > > > > paths.
> > > > > > > ValueError: dst_dir must be absolute, got 
> > > > > > > reference/v19.11/build-gcc-static/usr/local/share/dpdk/examples/bbdev_app
> > > > > > > 
> > > > > > > Under the hood, ninja install is failing complaining that it 
> > > > > > > needs an absolute path.
> > > > > > > I fixed this in test_meson_build.sh and will send a patch in a 
> > > > > > > minute. 
> > > > > > > Though it's strange no-one else has seen it?
> > > > > > > 
> > > > > > > 2. test-meson-build.sh compares the abi for the static builds, 
> > > > > > > which doesn't make any sense. 
> > > > > > > 
> > > > > > > 3. test-meson-build.sh will only take a branch in 
> > > > > > > DPDK_ABI_REF_VERSION that exists locally.
> > > > > > > In order to get it to compare HEAD against HEAD~1, which you 
> > > > > > > would imagine is a pretty common case.
> > > > > > > I had a create a branch for HEAD~1, in validate-abi this a pretty 
> > > > > > > simple `validate-abi HEAD~1 HEAD`
> > > > > > > 
> > > > > >  I think this code in test-meson-build.sh should probably be fixed:
> > > > > > 
> > > > > > if [ ! -d $abirefdir/src ]; then
> > > > > > git clone --local --no-hardlinks \
> > > > > > --single-branch \
> > > > > > -b $DPDK_ABI_REF_VERSION \
> > > > > > $srcdir $abirefdir/src
> > > > > > fi
> > > > > > 
> > > > > > Like you noted, using -b allows us to checkout a tag/branch in the 
> > > > > > cloned
> > > > > > repository but requires that it exist locally.  We should probably 
> > > > > > prefix the
> > > > > > checkout with a git fetch --tags
> > > > > 
> > > > > I don't understand your concern.
> > > > > A reference is an older version, so it should be in the git tree.
> > > > > 
> > > > yes, but not unless you've done a recent pull or fetch.  If you set
> > > > DPDK_ABI_REF_VERSION to a tag/branch that didn't exist as of the last 
> > > > time you
> > > > updated the tree, it won't be there (which it sounds like what is being
> > > > encountered here).  You can fix that by doing a git pull or git fetch 
> > > > prior to
> > > > running this script (or internal to the script)
> > > 
> > > Sorry I still don't understand the case.
> > > We want to compare the current version C with a reference R which is 
> > > older.
> > > If the reference R is not in the tree, it means the version C is not in 
> > > the tree.
> > > But C is the current version, so it is in the tree by definition.
> > > 
> > 
> > 
> >  

Re: [dpdk-dev] [PATCHv3] Remove validate-abi.sh from tree

2020-04-23 Thread Neil Horman
On Wed, Apr 22, 2020 at 02:18:05PM +0200, Thomas Monjalon wrote:
> 22/04/2020 14:07, Neil Horman:
> > On Wed, Apr 22, 2020 at 12:43:44PM +0100, Ray Kinsella wrote:
> > > On 21/04/2020 22:42, Thomas Monjalon wrote:
> > > > 21/04/2020 20:56, Neil Horman:
> > > >> On Tue, Apr 21, 2020 at 01:46:43PM +0200, Thomas Monjalon wrote:
> > > >>> 21/04/2020 13:12, Neil Horman:
> > >  On Fri, Apr 17, 2020 at 04:42:38PM +0100, Ray Kinsella wrote:
> > > > On 17/04/2020 13:10, Thomas Monjalon wrote:
> > > >> 17/04/2020 13:47, Ray Kinsella:
> > > >>> On 17/04/2020 11:20, Thomas Monjalon wrote:
> > >  17/04/2020 12:11, Ray Kinsella:
> > > > check-abi.sh appears to be backward step in terms of usability.
> > > 
> > >  No, check-abi.sh benefits from a nice integration in build 
> > >  scripts.
> > >  See below.
> > > 
> > > > With validate-abi.sh I do can do a "validate-abi.sh HEAD~1 
> > > > HEAD".
> > > > And it will do the build, install, dump and comparison for me. 
> > > > And it picked up my 20.0.2 - > 21.0 changes no problem. 
> > > >
> > > > With check-abi on the other hand, I need to the build and 
> > > > install myself.
> > > > check-abi requires dump files, but I see no reference in the 
> > > > documentation to how these are created.
> > > > It silently fails when it doesn't find any ...
> > > >
> > > > Do I run abi-dumper on the so's myself, or how does it work?
> > > 
> > >  check-abi.sh is integrated in test-build.sh and 
> > >  test-meson-builds.sh.
> > >  Probably we should document usage in these scripts.
> > > >>>
> > > >>> Looks like I need to set DPDK_ABI_REF_VERSION=master, not obvious.
> > > >>> Any tips or tricks would be welcome.
> > > >>
> > > >> export DPDK_ABI_REF_VERSION=v20.02
> > > >> or
> > > >> export DPDK_ABI_REF_VERSION=v19.11
> > > >>
> > > >> Depends on which compatibility you want to test...
> > > >>
> > > >
> > > > Few things ...
> > > >
> > > > 1. test-meson-build.sh keep barfing complaining about reference 
> > > > paths.
> > > > ValueError: dst_dir must be absolute, got 
> > > > reference/v19.11/build-gcc-static/usr/local/share/dpdk/examples/bbdev_app
> > > >
> > > > Under the hood, ninja install is failing complaining that it needs 
> > > > an absolute path.
> > > > I fixed this in test_meson_build.sh and will send a patch in a 
> > > > minute. 
> > > > Though it's strange no-one else has seen it?
> > > >
> > > > 2. test-meson-build.sh compares the abi for the static builds, 
> > > > which doesn't make any sense. 
> > > >
> > > > 3. test-meson-build.sh will only take a branch in 
> > > > DPDK_ABI_REF_VERSION that exists locally.
> > > > In order to get it to compare HEAD against HEAD~1, which you would 
> > > > imagine is a pretty common case.
> > > > I had a create a branch for HEAD~1, in validate-abi this a pretty 
> > > > simple `validate-abi HEAD~1 HEAD`
> > > >
> > >   I think this code in test-meson-build.sh should probably be fixed:
> > > 
> > >  if [ ! -d $abirefdir/src ]; then
> > >  git clone --local --no-hardlinks \
> > >  --single-branch \
> > >  -b $DPDK_ABI_REF_VERSION \
> > >  $srcdir $abirefdir/src
> > >  fi
> > > 
> > >  Like you noted, using -b allows us to checkout a tag/branch in the 
> > >  cloned
> > >  repository but requires that it exist locally.  We should probably 
> > >  prefix the
> > >  checkout with a git fetch --tags
> > > >>>
> > > >>> I don't understand your concern.
> > > >>> A reference is an older version, so it should be in the git tree.
> > > >>>
> > > >> yes, but not unless you've done a recent pull or fetch.  If you set
> > > >> DPDK_ABI_REF_VERSION to a tag/branch that didn't exist as of the last 
> > > >> time you
> > > >> updated the tree, it won't be there (which it sounds like what is being
> > > >> encountered here).  You can fix that by doing a git pull or git fetch 
> > > >> prior to
> > > >> running this script (or internal to the script)
> > > > 
> > > > Sorry I still don't understand the case.
> > > > We want to compare the current version C with a reference R which is 
> > > > older.
> > > > If the reference R is not in the tree, it means the version C is not in 
> > > > the tree.
> > > > But C is the current version, so it is in the tree by definition.
> > > > 
> > > 
> > > So I can just relate my experience 
> > > 
> > > root@silpixa00395806:/build/dpdk# DPDK_ABI_REF_VERSION=HEAD~1 
> > > ./devtools/test-meson-builds.sh
> > > ninja -C ./build-gcc-static
> > > ninja: E

Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Anoob Joseph
Hi Konstantin,

Please see inline.

Thanks,
Anoob

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Thursday, April 23, 2020 4:25 PM
> To: Anoob Joseph ; dev@dpdk.org; Lukasz
> Wojciechowski 
> Cc: akhil.go...@nxp.com; Doherty, Declan 
> Subject: [EXT] RE: [dpdk-dev] [PATCH] security: fix crash at accessing non-
> implemented ops
> 
> External Email
> 
> --
> 
> > > >
> > > > These are data path ops and so it will be better if we can avoid
> > > > such checks in
> > > the datapath. The same is done in ethdev also.
> > >
> > > AFAIK,  get_userdata is an *optional* dev-ops function that can be
> > > used by data- path.
> > > So far there was no strict requirement for the rte_security PMDs to
> > > *always* implement it.
> >
> > [Anoob] I don't think DPDK categorizes dev-ops as *optional* and *always*. 
> > If
> yes, can you point me?
> 
> > My understanding is, all ops are optional. For example, I could
> > implement a crypto PMD which is doing packet delivery only via event device
> (using crypto adapter). So dequeue op will not be implemented in that case and
> DPDK spec allows it.
> 
> Your PMD can have enqueue_burst/dequeue_burst as NOP, but you still have  to
> provide valid function pointers:
> they are stored inside crypto_dev structure itself and will be called
> unconditionally (without any extra checking) by
> rte_cryptodev_enqueue_burst/rte_cryptodev_dequeue_burst.

[Anoob] I think there is a basic misunderstanding here. You are treating 
unconditional calls as mandatory implementations. If that is the case 
rte_eth_tx_burst() & rte_eth_rx_burst() shouldn't check for function pointers 
even when DEBUG is enabled.

static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
uint16_t nb_rx;

#ifdef RTE_LIBRTE_ETHDEV_DEBUG
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);

if (queue_id >= dev->data->nb_rx_queues) {
RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
return 0;
}
#endif
nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
 rx_pkts, nb_pkts);

>From my view point, function pointer checks and argument checks are required 
>in every API for stability. But having such checks in the datapath adversely 
>affects the performance. And for cases where function pointers are not set, 
>application would get one crash in the first run. And that can be debugged 
>after having the required options enabled.
 
> For all other calls (both data and control path) there is a check that actual
> function pointer is a valid one.
> Same story for eth dev: pkt_rx_burst/pkt_tx_burst and rest of dev-ops.
> 
> > > So what you guys did is a silent change of public API behaviour.
> >
> > [Anoob] I believe Lukasz had submitted 3 or 4 revisions and it was all in 
> > the ML.
> RTE_DEBUG was suggested by Thomas I guess.
> 
> I believe it is not a right procedure to change existing behaviour of 
> rte_security
> framework.
> I think you have to communicate clear and loudly in advance (at least one
> release in advance).
> Plus RTE_DEBUG has nothing to do with changing non-debug behaviour.
> 
> > > As result ixgbe, (and probably some others rte_security PMDs)
> > > stopped working properly.
> >
> > [Anoob] set_pkt_metadata() is the only one of interest to IXGBE. And I
> > believe the function is implemented as well. So what exactly is the concern?
> 
> Check that ops->get_userdata is a valid function pointer will be compiled out.
> So PMDs that don't implement this function will crash in
> rte_security_get_userdata().
> In our particular case - ixgbe.
> Same story with  rte_security_set_pkt_metadata() - see the patch.

[Anoob] But ixgbe doesn't implement inline protocol which is the primary 
consumer of this API (rte_security_get_userdata()). So what is the trouble? 

Also, application is expected to call rte_security_set_pkt_metadata() only on 
devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a PMD states it 
needs MDATA but fails to register a function pointer for doing the same, it is 
a control path problem. Checking for that in the datapath is an overkill.

> 
> >
> > > I don't see any point in these changes, but if you'd like to do
> > > that, at least our usual procedure has to be followed:
> > > 1. Send and RFC to get an agreement with rte_security PMDs
> > > maintainers (one release ahead) 2. send a deprecation note (one
> > > release ahead) 3. change the behaviour of the public API 4. update
> > > release notes
> > >
> > > AFAIK 1), 2), 4) wasn't done.
> > > So I think right now we need to revert original behaviour.
> > >
> > > >
> > > > https://urldefense.proofpoint.com/v2/url?u=http-3A__code.dp

Re: [dpdk-dev] [PATCH v2 00/16] update and simplify telemetry library.

2020-04-23 Thread Luca Boccassi
On Thu, 2020-04-23 at 12:44 +0200, Thomas Monjalon wrote:
> 23/04/2020 12:30, Luca Boccassi:
> > On Thu, 2020-04-09 at 11:37 +0200, Thomas Monjalon wrote:
> > > 09/04/2020 11:19, Bruce Richardson:
> > > > On Wed, Apr 08, 2020 at 08:03:26PM +0200, Thomas Monjalon wrote:
> > > > > 08/04/2020 18:49, Ciara Power:
> > > > > > This patchset extensively reworks the telemetry library adding new
> > > > > > functionality and simplifying much of the existing code, while
> > > > > > maintaining backward compatibility.
> > > > > > 
> > > > > > This work is based on the previously sent RFC for a "process info"
> > > > > > library: https://patchwork.dpdk.org/project/dpdk/list/?series=7741
> > > > > > However, rather than creating a new library, this patchset takes
> > > > > > that work and merges it into the existing telemetry library, as
> > > > > > mentioned above.
> > > > > > 
> > > > > > The telemetry library as shipped in 19.11 is based upon the metrics
> > > > > > library and outputs all statistics based on that as a source. 
> > > > > > However,
> > > > > > this limits the telemetry output to only port-level statistics
> > > > > > information, rather than allowing it to be used as a general scheme 
> > > > > > for
> > > > > > telemetry information across all DPDK libraries.
> > > > > > 
> > > > > > With this patchset applied, rather than the telemetry library being
> > > > > > responsible for pulling ethdev stats and pushing them into the 
> > > > > > metrics
> > > > > > library for retrieval later, each library e.g. ethdev, rawdev, and 
> > > > > > even
> > > > > > the metrics library itself (for backwards compatiblity) now handle 
> > > > > > their
> > > > > > own stats.  Any library or app can register a callback function with
> > > > > > telemetry, which will be called if requested by the client 
> > > > > > connected via
> > > > > > the telemetry socket. The callback function in the library/app then
> > > > > > formats its stats, or other data, into a JSON string, and returns 
> > > > > > it to
> > > > > > telemetry to be sent to the client.
> > > > > 
> > > > > I think this is a global need in DPDK, and it is usually called RPC,
> > > > > IPC or control messaging.
> > > > > We had a similar need for multi-process communication, thus rte_mp 
> > > > > IPC.
> > > > > We also need a control channel for user configuration applications.
> > > > > We also need to control some features like logging or tracing.
> > > > > 
> > > > > In my opinion, it is time to introduce a general control channel in 
> > > > > DPDK.
> > > > > The application must be in the loop of the control mechanism.
> > > > > Making such channel standard will ease application adoption.
> > > > > 
> > > > > Please read some comments here:
> > > > > http://inbox.dpdk.org/dev/2580933.jp2sp48Hzj@xps/
> > > > > 
> > > > Hi Thomas,
> > > > 
> > > > I agree that having a single control mechanism or messaging mechanism in
> > > > DPDK would be nice to have. However, I don't believe the plans for such 
> > > > a
> > > > scheme should impact this patchset right now as the idea of a common
> > > > channel was only first mooted about a week ago, and while there has been
> > > > some email discussion about it, there is as yet no requirements list 
> > > > that
> > > > I've seen, nobody actually doing coding work on it, no rfc and most
> > > > importantly no timeline for creating and merging such into DPDK.
> > > 
> > > Yes, this is a new idea.
> > > Throwing the idea in this "telemetry" thread and in "IF proxy" thread
> > > is the first step before starting a dedicated thread to design
> > > a generic mechanism.
> > 
> > May I offer the services of https://zeromq.org/ ?
> 
> This is what I already proposed:
> http://inbox.dpdk.org/dev/20334513.huCnfhLgOn@xps/
> 
> I'm sorry, I was supposed to start a new thread for this discussion.
> I will summarize my thoughts and discussions just after -rc1 is done.

Ah! They say great minds think alike :-P

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH] eal: add madvise to avoid dump memory

2020-04-23 Thread Burakov, Anatoly

On 23-Apr-20 7:36 AM, Feng Li wrote:

Hi,
I have tested as follows, the core dump file is ~ 200KB.
It should generate one core dump file each crash.

#include 
#include 
#include 
#include 
#include 
#include 

int main(int argc, char** argv) {
// FIXME(fengli): X
uint64_t size = 1<<30;
void* ptr = mmap(0, size , PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (ptr == (void*)-1) {
perror("[-] mmap failed with MAP_PRIVATE | MAP_ANONYMOUS");
exit(1);
}
if (madvise(ptr, size , MADV_DONTDUMP) != 0)
perror("[-] madvise failed");
while(1)
sleep(1);
return 0;
}



That's odd, your code works. Mine, even though it did the same thing, 
didn't work the same way. My compiler must like you more than it likes 
me :) (or perhaps i had a typo...)


Anyway, i can see that this indeed prevents core dumps on madvise'd 
memory (i've also tested it with PROT_NONE).


I'll go ahead and ack the original patch then.

--
Thanks,
Anatoly


[dpdk-dev] [PATCH v1] doc: QAT support for AES-256 DOCSIS

2020-04-23 Thread Mairtin o Loingsigh
Update QAT pmd to support AES-256 DOCSIS

Signed-off-by: Mairtin o Loingsigh 
---
 doc/guides/rel_notes/release_20_05.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/release_20_05.rst 
b/doc/guides/rel_notes/release_20_05.rst
index 7f2049a0f..5e81c1964 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -110,6 +110,10 @@ New Features
   any checksum calculation was requested - in such case the code falls back to
   fixed compression as before.
 
+* **Updated the QAT PMD.**
+
+  * Added AES-256 DOCSIS algorithm support to QAT PMD.
+
 * **Updated the turbo_sw bbdev PMD.**
 
   Supported large size code blocks which does not fit in one mbuf segment.
-- 
2.12.3



Re: [dpdk-dev] [PATCH 2/2] eal: resolve getentropy at run time for random seed

2020-04-23 Thread Mattias Rönnblom
On 2020-04-22 22:35, Dan Gora wrote:
> On Wed, Apr 22, 2020 at 5:14 PM Mattias Rönnblom
>  wrote:
>> On 2020-04-22 19:44, Dan Gora wrote:
>>> On Wed, Apr 22, 2020 at 5:28 AM Mattias Rönnblom
>>>  wrote:
 On 2020-04-21 21:54, Dan Gora wrote:
> The getentropy() function was introduced into glibc v2.25 and so is
> not available on all supported platforms.  Previously, if DPDK was
> compiled (using meson) on a system which has getentropy(), it would
> introduce a dependency on glibc v2.25 which would prevent that binary
> from running on a system with an older glibc.  Similarly if DPDK was
> compiled on a system which did not have getentropy(), getentropy()
> could not be used even if the execution system supported it.
>
> Introduce a new static function, __rte_getentropy() which will try to
> resolve the getentropy() function dynamically using dlopen()/dlsym(),
> returning failure if the getentropy() function cannot be resolved or
> if it fails.
 Two other options: providing a DPDK-native syscall wrapper for
 getrandom(), or falling back to reading /dev/urandom. Have you
 considered any of those two options? If so, why do you prefer
 dlopen()/dlsym()?
>>> I didn't give any thought at all to using /dev/urandom.  The goal was
>>> not really to change how the thing worked, just to remove the
>>> dependency on glibc 2.25.
>>
>> /dev/urandom is basically only a different interface to the same
>> underlying mechanism.
>>
>> Such an alternative would look something like:
>>
>> static int
>> getentropy(void *buffer, size_t length)
>> {
>>   int rc = -1;
>>   int old_errno = errno;
>>   int fd;
>>
>>   fd = open("/dev/urandom", O_RDONLY);
>>
>>   if (fd < 0)
>>   goto out;
>>
>>   if (read(fd, buffer, length) != length)
>>   goto out_close;
>>
>>   rc = 0;
>>
>> out_close:
>>   close(fd);
>> out:
>>   errno = old_errno;
>>
>>   return rc;
>> }
> That's fine with me, but like I said I wasn't trying to change how any
> of this worked, just work around glibc dependencies.  There seems to
> be some subtle difference between /dev/urandom and /dev/random, but...
>
> https://protect2.fireeye.com/v1/url?k=1705be57-4b8f6b41-1705fecc-862f14a9365e-bb983def357fdfad&q=1&e=10fec9c1-51b3-4bc3-b77d-7eb39787d007&u=https%3A%2F%2Fpatches-gcc.linaro.org%2Fcomment%2F14484%2F
>
 Failure to run on old libc seems like a non-issue to me.
>>> Well, again, it's a new dependency that didn't exist before.. We sell
>>> to telco customers, so we have to support 10s of different target
>>> platforms of various ages.  If they update their system, we'd have to
>>> recompile our code to be able to use getentropy().  Similarly, if we
>>> compiled on a system which has getentropy(), but the target system
>>> doesn't, then they cannot run our binary because of the glibc 2.25
>>> dependency.  That means that we have to have separate versions with
>>> and without getentropy().  It's a maintenance headache for no real
>>> benefit.
>>
>> I'm not sure I follow. Why would you need to recompile DPDK in case they
>> upgrade their system? It sounds like you care about initial seeding,
>> since you want getentropy() if it exists, but then in the next paragraph
>> you want to throw it out, so I'm a little confused.
> Well  _I_ wouldn't but maybe someone wants getentropy() for the
> initial seed.. I assume that's why it was added in the first place..
> For my application we don't care at all.  I just want to get rid of
> this dependency on glibc 2.25 and have the behavior be the same on
> meson and Makefile builds on the same complication system.


The reason for trying to avoid a wall time-based seed as the default is 
that application instances started at the roughly the same time might 
end up having a the same seed, which in turn might impact their behavior 
in an adverse way. For example, random back-off timers may be the same. 
On x86_64, TSC has a high resolution, but on other platforms its 
equivalent the clock rate is much lower.


>> Why doesn't the standard practice of compiling against the oldest
>> supported libc work for you?
> I guess I didn't realize that was "standard practice" but even so it
> still adds an unnecessary restriction on the complication platform.


If DPDK has the policy of attempting to allow DPDK applications compiled 
against one glibc version to run against another, older, version, we can 
go ahead and discuss the details further. That would be up to the tech 
board to decide. I would vote against it.


If the fix was simple, that's one thing. dlopen()/dlsym() doesn't 
qualify as such, nor does a syscall wrapper, as you pointed out.


>>> To my mind, since getentropy() can block it seems like it would
>>> probably be better to just remove it entirely, but I suppose that's up
>>> to the person(s) who put it in in the first place.
>>
>> Maybe I'm wrong,

Re: [dpdk-dev] [PATCH] mempool: remove inline functions from export list

2020-04-23 Thread Andrew Rybchenko
On 4/22/20 10:37 AM, Fady Bader wrote:
> The code didn't compile when using exported mempool functions under windows.
> 
> compilation error logs:
> rte_mempool_exports.def : error LNK2001:
> unresolved external symbol rte_mempool_cache_flush
> rte_mempool_exports.def : error LNK2001:
> unresolved external symbol rte_mempool_default_cache
> rte_mempool_exports.def : error LNK2001:
> unresolved external symbol rte_mempool_generic_get
> rte_mempool_exports.def : error LNK2001:
> unresolved external symbol rte_mempool_generic_put
> lib\librte_mempool.dll.a : fatal error LNK1120: 4 unresolved externals
> clang: error: linker command failed with exit code 1120 (use -v to see 
> invocation)
> [77/77] Linking target drivers/librte_bus_pci-0.200.2.dll.
> ninja: build stopped: subcommand failed.
> 
> The cause was that there were some inline functions that were included
> in the export list.
> To solve this the functions were removed from rte_mempool_version.map
> export list which are implemented in the header and shouldn't be exported.
> 
> Fixes: 4b5062755aa74517ed1d7bd ("mempool: allow user-owned cache")
> Fixes: 656f2d3ede96902202a1a5f ("mempool: deprecate specific get and put 
> functions")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Fady Bader 

Acked-by: Andrew Rybchenko 



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Akhil Goyal
Hi Anoob/Konstantin,
> >
> > Check that ops->get_userdata is a valid function pointer will be compiled 
> > out.
> > So PMDs that don't implement this function will crash in
> > rte_security_get_userdata().
> > In our particular case - ixgbe.
> > Same story with  rte_security_set_pkt_metadata() - see the patch.
> 
> [Anoob] But ixgbe doesn't implement inline protocol which is the primary
> consumer of this API (rte_security_get_userdata()). So what is the trouble?
> 
> Also, application is expected to call rte_security_set_pkt_metadata() only on
> devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a PMD
> states it needs MDATA but fails to register a function pointer for doing the 
> same,
> it is a control path problem. Checking for that in the datapath is an 
> overkill.
> 
Whatever your concern is, we can resolve it later, but for now we should have 
the same
Unconditional checks that were there earlier. We need to make RC1 
today/tomorrow.
And this cannot go as an issue.

These are optional APIs and every PMD may not have supported that.

Konstantin,
Please send an update to your patch reverting the original patch for these 2 
functions.
Currently it is adding 2 extra checks.

Regards,
Akhil



[dpdk-dev] [PATCH v2] net/ice/base: fix DCF switch rule

2020-04-23 Thread Qi Zhang
1. ln_en bit should not be turned on, since we only support Rx VEB.
2. lan_en bit need to be turn on for a DCF switch rule, otherwise
   any Tx packet that hit on a rule will be dropped.

Fixes: fed0c5ca5f19 ("net/ice/base: support programming a new switch recipe")

Signed-off-by: Qi Zhang 
---
v2:
- fix a bug

 drivers/net/ice/base/ice_switch.c   | 11 ++-
 drivers/net/ice/ice_switch_filter.c |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_switch.c 
b/drivers/net/ice/base/ice_switch.c
index fd2cf101a..0970ffdd0 100644
--- a/drivers/net/ice/base/ice_switch.c
+++ b/drivers/net/ice/base/ice_switch.c
@@ -1938,6 +1938,13 @@ static void ice_fill_sw_info(struct ice_hw *hw, struct 
ice_fltr_info *fi)
 {
fi->lb_en = false;
fi->lan_en = false;
+
+   if ((fi->flag & ICE_FLTR_RX) &&
+   (fi->fltr_act == ICE_FWD_TO_VSI ||
+fi->fltr_act == ICE_FWD_TO_VSI_LIST) &&
+   fi->lkup_type == ICE_SW_LKUP_LAST)
+   fi->lan_en = true;
+
if ((fi->flag & ICE_FLTR_TX) &&
(fi->fltr_act == ICE_FWD_TO_VSI ||
 fi->fltr_act == ICE_FWD_TO_VSI_LIST ||
@@ -6453,6 +6460,7 @@ ice_adv_add_update_vsi_list(struct ice_hw *hw,
return status;
 
ice_memset(&tmp_fltr, 0, sizeof(tmp_fltr), ICE_NONDMA_MEM);
+   tmp_fltr.flag = m_entry->rule_info.sw_act.flag;
tmp_fltr.fltr_rule_id = cur_fltr->fltr_rule_id;
tmp_fltr.fltr_act = ICE_FWD_TO_VSI_LIST;
tmp_fltr.fwd_id.vsi_list_id = vsi_list_id;
@@ -6615,7 +6623,7 @@ ice_add_adv_rule(struct ice_hw *hw, struct 
ice_adv_lkup_elem *lkups,
s_rule = (struct ice_aqc_sw_rules_elem *)ice_malloc(hw, rule_buf_sz);
if (!s_rule)
return ICE_ERR_NO_MEMORY;
-   act |= ICE_SINGLE_ACT_LB_ENABLE | ICE_SINGLE_ACT_LAN_ENABLE;
+   act |= ICE_SINGLE_ACT_LAN_ENABLE;
switch (rinfo->sw_act.fltr_act) {
case ICE_FWD_TO_VSI:
act |= (rinfo->sw_act.fwd_id.hw_vsi_id <<
@@ -6780,6 +6788,7 @@ ice_adv_rem_update_vsi_list(struct ice_hw *hw, u16 
vsi_handle,
return status;
 
ice_memset(&tmp_fltr, 0, sizeof(tmp_fltr), ICE_NONDMA_MEM);
+   tmp_fltr.flag = fm_list->rule_info.sw_act.flag;
tmp_fltr.fltr_rule_id = fm_list->rule_info.fltr_rule_id;
fm_list->rule_info.sw_act.fltr_act = ICE_FWD_TO_VSI;
tmp_fltr.fltr_act = ICE_FWD_TO_VSI;
diff --git a/drivers/net/ice/ice_switch_filter.c 
b/drivers/net/ice/ice_switch_filter.c
index 55a5618a7..8b007b7eb 100644
--- a/drivers/net/ice/ice_switch_filter.c
+++ b/drivers/net/ice/ice_switch_filter.c
@@ -1129,6 +1129,7 @@ ice_switch_parse_dcf_action(const struct rte_flow_action 
*actions,
}
 
rule_info->sw_act.src = rule_info->sw_act.vsi_handle;
+   rule_info->sw_act.flag = ICE_FLTR_RX;
rule_info->rx = 1;
rule_info->priority = 5;
 
-- 
2.13.6



[dpdk-dev] [PATCH v2] examples/ipsec-secgw: add per core packet stats

2020-04-23 Thread Anoob Joseph
Adding per core packet handling stats to analyze traffic distribution
when multiple cores are engaged.

Since aggregating the packet stats across cores would affect
performance, keeping the feature disabled using compile time flags.

Signed-off-by: Anoob Joseph 
---

v2:
* Added lookup failure cases to drop count

 examples/ipsec-secgw/ipsec-secgw.c   | 118 +--
 examples/ipsec-secgw/ipsec-secgw.h   |   2 +
 examples/ipsec-secgw/ipsec.c |  13 +++-
 examples/ipsec-secgw/ipsec.h |  22 +++
 examples/ipsec-secgw/ipsec_process.c |   5 ++
 5 files changed, 154 insertions(+), 6 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
b/examples/ipsec-secgw/ipsec-secgw.c
index 6d02341..db92ddc 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -288,6 +288,61 @@ adjust_ipv6_pktlen(struct rte_mbuf *m, const struct 
rte_ipv6_hdr *iph,
}
 }
 
+#ifdef ENABLE_STATS
+static uint64_t timer_period = 10; /* default period is 10 seconds */
+
+/* Print out statistics on packet distribution */
+static void
+print_stats(void)
+{
+   uint64_t total_packets_dropped, total_packets_tx, total_packets_rx;
+   unsigned int coreid;
+   float burst_percent;
+
+   total_packets_dropped = 0;
+   total_packets_tx = 0;
+   total_packets_rx = 0;
+
+   const char clr[] = { 27, '[', '2', 'J', '\0' };
+   const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' };
+
+   /* Clear screen and move to top left */
+   printf("%s%s", clr, topLeft);
+
+   printf("\nCore statistics ");
+
+   for (coreid = 0; coreid < RTE_MAX_LCORE; coreid++) {
+   /* skip disabled cores */
+   if (rte_lcore_is_enabled(coreid) == 0)
+   continue;
+   burst_percent = (float)(core_statistics[coreid].burst_rx * 100)/
+   core_statistics[coreid].rx;
+   printf("\nStatistics for core %u --"
+  "\nPackets received: %20"PRIu64
+  "\nPackets sent: %24"PRIu64
+  "\nPackets dropped: %21"PRIu64
+  "\nBurst percent: %23.2f",
+  coreid,
+  core_statistics[coreid].rx,
+  core_statistics[coreid].tx,
+  core_statistics[coreid].dropped,
+  burst_percent);
+
+   total_packets_dropped += core_statistics[coreid].dropped;
+   total_packets_tx += core_statistics[coreid].tx;
+   total_packets_rx += core_statistics[coreid].rx;
+   }
+   printf("\nAggregate statistics ==="
+  "\nTotal packets received: %14"PRIu64
+  "\nTotal packets sent: %18"PRIu64
+  "\nTotal packets dropped: %15"PRIu64,
+  total_packets_rx,
+  total_packets_tx,
+  total_packets_dropped);
+   printf("\n\n");
+}
+#endif /* ENABLE_STATS */
+
 static inline void
 prepare_one_packet(struct rte_mbuf *pkt, struct ipsec_traffic *t)
 {
@@ -333,6 +388,7 @@ prepare_one_packet(struct rte_mbuf *pkt, struct 
ipsec_traffic *t)
 
/* drop packet when IPv6 header exceeds first segment length */
if (unlikely(l3len > pkt->data_len)) {
+   core_stats_update_drop(1);
rte_pktmbuf_free(pkt);
return;
}
@@ -350,6 +406,7 @@ prepare_one_packet(struct rte_mbuf *pkt, struct 
ipsec_traffic *t)
/* Unknown/Unsupported type, drop the packet */
RTE_LOG(ERR, IPSEC, "Unsupported packet type 0x%x\n",
rte_be_to_cpu_16(eth->ether_type));
+   core_stats_update_drop(1);
rte_pktmbuf_free(pkt);
return;
}
@@ -471,6 +528,11 @@ send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t 
port)
int32_t ret;
uint16_t queueid;
 
+#ifdef ENABLE_STATS
+   int lcore_id = rte_lcore_id();
+   core_statistics[lcore_id].tx += n;
+#endif /* ENABLE_STATS */
+
queueid = qconf->tx_queue_id[port];
m_table = (struct rte_mbuf **)qconf->tx_mbufs[port].m_table;
 
@@ -478,6 +540,9 @@ send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t 
port)
 
ret = rte_eth_tx_burst(port, queueid, m_table, n);
if (unlikely(ret < n)) {
+#ifdef ENABLE_STATS
+   core_statistics[lcore_id].dropped += n-ret;
+#endif /* ENABLE_STATS */
do {
rte_pktmbuf_free(m_table[ret]);
} while (++ret < n);
@@ -525,6 +590,7 @@ send_fragment_packet(struct lcore_conf *qconf, struct 
rte_mbuf *m,
"error code: %d\n

[dpdk-dev] DPDK Release Status Meeting 23/04/2020

2020-04-23 Thread Ferruh Yigit
Minutes 23 April 2020
-

Agenda:
* Release Dates
* Subtrees

Participants:
* Arm
* Debian/Microsoft
* Intel
* Marvell
* Mellanox
* NXP
* Red Hat


Release Dates
-

* v20.05 dates:
  * Integration/Merge/RC1 pushed to *Friday 24 April 2020*
  * Release:Wednesday 20 May 2020

  * PRC holiday on 1-5 May, need to take into account for validation effort


Subtrees


* main
  * crypto tree pulled, in progress of pulling next-net
  * Not able to review rte_graph, planning to merge it in -rc2
  * Merged ring patches from Konstantin and RCU patches from Honnappa
  * Looking trace series, planning to have them in -rc1
  * Some fixes for eal and vfio merged
  * Checking fixes and changes related to ABI, from Ray and Neil
  * gcc10 support
* Some fixes are missing
* Fixes can go after -rc1
* Need a long term plan for the new compiler support
  * CI checks exists for new compilers
  * Hash library not reviewed
  * Some patches related atomic discussed in techboard,
not much to do for the release
  * Windows, some patches was not ready postponed to 20.08
  * Telemetry may go in -rc2
* There will be a new version, possibly today, to make it more generic
  * Call to review for Windows patches

* next-net
  * closed the tree yesterday, ready for -rc1

* next-crypto
  * pulled for -rc1
  * There are two existing issues
* Cryptodev ABI versioning, Fiona working on
  * Not sure if it will be ready for -rc1
* rte_security issue, mainline is broken as of now
  * Fix is in the mail list, should be merged quickly

* next-eventdev
  * No more patches for -rc1, already pulled to main

* next-virtio
  * Pulled for -rc1
  * Some fixes for next -rc
  * Mellanox vdpa queue stats patch postponed to next release
  * Mellanox vdpa virtq patch requires rebase, it may wait next -rc
  * Packet ring vectorized path patch under review, should be OK for -rc2
* Good to ask vector implementation from all architectures

* next-net-intel
  * Most patches merged, only some fixes for -rc2
  * fm10k still has build errors, postponed to -rc2

* next-net-mlx
  * Some small fixes for -rc2

* next-net-mrvl
  * All patches merged for -rc1



DPDK Release Status Meetings


The DPDK Release Status Meeting is intended for DPDK Committers to discuss
the status of the master tree and sub-trees, and for project managers to
track progress or milestone dates.

The meeting occurs on Thursdays at 8:30 UTC. If you wish to attend just
send an email to "John McNamara " for the invite.


Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski


W dniu 23.04.2020 o 14:55, Akhil Goyal pisze:
> Hi Anoob/Konstantin,
>>> Check that ops->get_userdata is a valid function pointer will be compiled 
>>> out.
>>> So PMDs that don't implement this function will crash in
>>> rte_security_get_userdata().
>>> In our particular case - ixgbe.
>>> Same story with  rte_security_set_pkt_metadata() - see the patch.
>> [Anoob] But ixgbe doesn't implement inline protocol which is the primary
>> consumer of this API (rte_security_get_userdata()). So what is the trouble?
>>
>> Also, application is expected to call rte_security_set_pkt_metadata() only on
>> devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a PMD
>> states it needs MDATA but fails to register a function pointer for doing the 
>> same,
>> it is a control path problem. Checking for that in the datapath is an 
>> overkill.
>>
> Whatever your concern is, we can resolve it later, but for now we should have 
> the same
> Unconditional checks that were there earlier. We need to make RC1 
> today/tomorrow.
> And this cannot go as an issue.
>
> These are optional APIs and every PMD may not have supported that.
>
> Konstantin,
> Please send an update to your patch reverting the original patch for these 2 
> functions.
> Currently it is adding 2 extra checks.
>
> Regards,
> Akhil
>
Please remember also about updating app/test.
I will be glad to help with this matter.

-- 

Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com



Re: [dpdk-dev] [PATCH v3 1/4] hash: add k32v64 hash library

2020-04-23 Thread Ananyev, Konstantin
Hi Vladimir,

Apologies for late review.
My comments below. 

> K32V64 hash is a hash table that supports 32 bit keys and 64 bit values.
> This table is hash function agnostic so user must provide
> precalculated hash signature for add/delete/lookup operations.
> 
> Signed-off-by: Vladimir Medvedkin 
> ---
> 
> --- /dev/null
> +++ b/lib/librte_hash/rte_k32v64_hash.c
> @@ -0,0 +1,315 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +TAILQ_HEAD(rte_k32v64_hash_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem rte_k32v64_hash_tailq = {
> + .name = "RTE_K32V64_HASH",
> +};
> +
> +EAL_REGISTER_TAILQ(rte_k32v64_hash_tailq);
> +
> +#define VALID_KEY_MSK   ((1 << RTE_K32V64_KEYS_PER_BUCKET) - 1)
> +
> +#ifdef CC_AVX512VL_SUPPORT
> +int
> +k32v64_hash_bulk_lookup_avx512vl(struct rte_k32v64_hash_table *table,
> + uint32_t *keys, uint32_t *hashes, uint64_t *values, unsigned int n);
> +#endif
> +
> +static int
> +k32v64_hash_bulk_lookup(struct rte_k32v64_hash_table *table, uint32_t *keys,
> + uint32_t *hashes, uint64_t *values, unsigned int n)
> +{
> + int ret, cnt = 0;
> + unsigned int i;
> +
> + if (unlikely((table == NULL) || (keys == NULL) || (hashes == NULL) ||
> + (values == NULL)))
> + return -EINVAL;
> +
> + for (i = 0; i < n; i++) {
> + ret = rte_k32v64_hash_lookup(table, keys[i], hashes[i],
> + &values[i]);
> + if (ret == 0)
> + cnt++;
> + }
> + return cnt;
> +}
> +
> +static rte_k32v64_hash_bulk_lookup_t
> +get_lookup_bulk_fn(void)
> +{
> +#ifdef CC_AVX512VL_SUPPORT
> + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F))
> + return k32v64_hash_bulk_lookup_avx512vl;
> +#endif
> + return k32v64_hash_bulk_lookup;
> +}
> +
> +int
> +rte_k32v64_hash_add(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash, uint64_t value)
> +{
> + uint32_t bucket;
> + int i, idx, ret;
> + uint8_t msk;
> + struct rte_k32v64_ext_ent *tmp, *ent, *prev = NULL;
> +
> + if (table == NULL)
> + return -EINVAL;
> +

I think for add you also need to do update bucket.cnt
at the start/end of updates (as you do for del). 
 
> + bucket = hash & table->bucket_msk;
> + /* Search key in table. Update value if exists */
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + table->t[bucket].val[i] = value;
> + return 0;
> + }
> + }
> +
> + if (!SLIST_EMPTY(&table->t[bucket].head)) {
> + SLIST_FOREACH(ent, &table->t[bucket].head, next) {
> + if (ent->key == key) {
> + ent->val = value;
> + return 0;
> + }
> + }
> + }
> +
> + msk = ~table->t[bucket].key_mask & VALID_KEY_MSK;
> + if (msk) {
> + idx = __builtin_ctz(msk);
> + table->t[bucket].key[idx] = key;
> + table->t[bucket].val[idx] = value;
> + rte_smp_wmb();
> + table->t[bucket].key_mask |= 1 << idx;
> + table->nb_ent++;
> + return 0;
> + }
> +
> + ret = rte_mempool_get(table->ext_ent_pool, (void **)&ent);
> + if (ret < 0)
> + return ret;
> +
> + SLIST_NEXT(ent, next) = NULL;
> + ent->key = key;
> + ent->val = value;
> + rte_smp_wmb();
> + SLIST_FOREACH(tmp, &table->t[bucket].head, next)
> + prev = tmp;
> +
> + if (prev == NULL)
> + SLIST_INSERT_HEAD(&table->t[bucket].head, ent, next);
> + else
> + SLIST_INSERT_AFTER(prev, ent, next);
> +
> + table->nb_ent++;
> + table->nb_ext_ent++;
> + return 0;
> +}
> +
> +int
> +rte_k32v64_hash_delete(struct rte_k32v64_hash_table *table, uint32_t key,
> + uint32_t hash)
> +{
> + uint32_t bucket;
> + int i;
> + struct rte_k32v64_ext_ent *ent;
> +
> + if (table == NULL)
> + return -EINVAL;
> +
> + bucket = hash & table->bucket_msk;
> +
> + for (i = 0; i < RTE_K32V64_KEYS_PER_BUCKET; i++) {
> + if ((key == table->t[bucket].key[i]) &&
> + (table->t[bucket].key_mask & (1 << i))) {
> + ent = SLIST_FIRST(&table->t[bucket].head);
> + if (ent) {
> + rte_atomic32_inc(&table->t[bucket].cnt);

I know that right now rte_atomic32 uses _sync gcc builtins underneath, 
so it should be safe.
But I think the proper way would be:
table->t[bucket].cnt++;
rte_smp_wmb();  
or as alternative probably use C11 atomic ACQUIRE/RELEASE

> +   

Re: [dpdk-dev] [PATCH] examples/fips_validation: fix parsing of algo from NIST TDES test files

2020-04-23 Thread Anoob Joseph
> 
> Few of the NIST TDES test files don't contain TDES string.
> Added indicators to identify such files. These indicators are part of only 
> NIST
> TDES test vector files.
> 
> Fixes: 527cbf3d5ee3 ("examples/fips_validation: support TDES parsing")
> 
> Signed-off-by: Archana Muniganti 
> Signed-off-by: Ayuj Verma 
 
Acked-by: Anoob Joseph 


Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops

2020-04-23 Thread Andrew Rybchenko
On 4/13/20 5:21 PM, xiangxia.m@gmail.com wrote:
> From: Tonghao Zhang 
> 
> The order of mempool initiation affects mempool index in the
> rte_mempool_ops_table. For example, when building APPs with:
> 
> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> 
> The "bucket" mempool will be registered firstly, and its index
> in table is 0 while the index of "ring" mempool is 1. DPDK
> uses the mk/rte.app.mk to build APPs, and others, for example,
> Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> The mempool lib linked in dpdk and Open vSwitch is different.
> 
> The mempool can be used between primary and secondary process,
> such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> ring which index in table is 0, but the index of "bucket" ring
> is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> mempool ops and malloc memory from mempool. The crash will occur:
> 
> bucket_dequeue (access null and crash)
> rte_mempool_get_ops (should get "ring_mp_mc",
>  but get "bucket" mempool)
> rte_mempool_ops_dequeue_bulk
> ...
> rte_pktmbuf_alloc
> rte_pktmbuf_copy
> pdump_copy
> pdump_rx
> rte_eth_rx_burst
> 
> To avoid the crash, there are some solution:
> * constructor priority: Different mempool uses different
>   priority in RTE_INIT, but it's not easy to maintain.
> 
> * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
>   be same as libdpdk.a/libdpdk.so, but when adding a new mempool
>   driver in future, we must make sure the order.
> 
> * register mempool orderly: Sort the mempool when registering,
>   so the lib linked will not affect the index in mempool table.
>   but the number of mempool libraries may be different.
> 
> * shared memzone: The primary process allocates a struct in
>   shared memory named memzone, When we register a mempool ops,
>   we first get a name and id from the shared struct: with the lock held,
>   lookup for the registered name and return its index, else
>   get the last id and copy the name in the struct.
> 
> Previous discussion: 
> https://mails.dpdk.org/archives/dev/2020-March/159354.html
> 
> Suggested-by: Olivier Matz 
> Suggested-by: Jerin Jacob 
> Signed-off-by: Tonghao Zhang 
> ---
> v2:
> * fix checkpatch warning
> ---
>  lib/librte_mempool/rte_mempool.h | 28 +++-
>  lib/librte_mempool/rte_mempool_ops.c | 89 
> 
>  2 files changed, 96 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index c90cf31467b2..2709b9e1d51b 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -50,6 +50,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #ifdef __cplusplus
>  extern "C" {
> @@ -678,7 +679,6 @@ struct rte_mempool_ops {
>   */
>  struct rte_mempool_ops_table {
>   rte_spinlock_t sl; /**< Spinlock for add/delete. */
> - uint32_t num_ops;  /**< Number of used ops structs in the table. */
>   /**
>* Storage for all possible ops structs.
>*/
> @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool 
> *mp,
>   */
>  int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
>  
> +struct rte_mempool_shared_ops {
> + size_t num_mempool_ops;

Is there any specific reason to change type from uint32_t used
above to size_t? I think that uint32_t is better here since
it is just a number, not a size of memory or related value.

> + struct {
> + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> +
> + rte_spinlock_t mempool;
> +};
> +
> +static inline int
> +mempool_ops_register_cb(const void *arg)
> +{
> + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> +
> + return rte_mempool_register_ops(h);
> +}
> +
> +static inline void
> +mempool_ops_register(const struct rte_mempool_ops *ops)
> +{
> + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> +   RTE_INIT_PRE);
> +}
> +
>  /**
>   * Macro to statically register the ops of a mempool handler.
>   * Note that the rte_mempool_register_ops fails silently here when
> @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool *mp,
>  #define MEMPOOL_REGISTER_OPS(ops)\
>   RTE_INIT(mp_hdlr_init_##ops)\
>   {   \
> - rte_mempool_register_ops(&ops); \
> + mempool_ops_register(&ops); \
>   }
>  
>  /**
> diff --git a/lib/librte_mempool/rte_mempool_ops.c 
> b/lib/librte_mempool/rte_mempool_ops.c
> index 22c5251eb068..b10fda662db6 100644
> --- a/lib/librte_mempool/rte_mempool_ops.c
> +++ b/lib/librte_mempool/rte_mempool_ops.c
> @@ -14,43 +14

[dpdk-dev] [PATCH v2] crypto/aesni_mb: fix DOCSIS AES-256

2020-04-23 Thread Pablo de Lara
When adding support for DOCSIS AES-256,
when setting the cipher parameters, all key sizes
were accepted, but only 128-bit and 256-bit keys
are supported.

Fixes: 124d04b43743 ("crypto/aesni_mb: support DOCSIS AES-256")

Signed-off-by: Pablo de Lara 
Acked-by: Mairtin o Loingsigh 
---

v2:
- Fixed commit message (missing a word).
- Rebased on top of dpdk-next-crypto

 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index a1d59e8..5ff6a79 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -381,6 +381,7 @@ aesni_mb_set_session_cipher_parameters(const MB_MGR *mb_mgr,
 {
uint8_t is_aes = 0;
uint8_t is_3DES = 0;
+   uint8_t is_docsis = 0;
 
if (xform == NULL) {
sess->cipher.mode = NULL_CIPHER;
@@ -417,7 +418,7 @@ aesni_mb_set_session_cipher_parameters(const MB_MGR *mb_mgr,
break;
case RTE_CRYPTO_CIPHER_AES_DOCSISBPI:
sess->cipher.mode = DOCSIS_SEC_BPI;
-   is_aes = 1;
+   is_docsis = 1;
break;
case RTE_CRYPTO_CIPHER_DES_CBC:
sess->cipher.mode = DES;
@@ -463,6 +464,26 @@ aesni_mb_set_session_cipher_parameters(const MB_MGR 
*mb_mgr,
AESNI_MB_LOG(ERR, "Invalid cipher key length");
return -EINVAL;
}
+   } else if (is_docsis) {
+   switch (xform->cipher.key.length) {
+   case AES_128_BYTES:
+   sess->cipher.key_length_in_bytes = AES_128_BYTES;
+   IMB_AES_KEYEXP_128(mb_mgr, xform->cipher.key.data,
+   sess->cipher.expanded_aes_keys.encode,
+   sess->cipher.expanded_aes_keys.decode);
+   break;
+#if IMB_VERSION_NUM >= IMB_VERSION(0, 53, 3)
+   case AES_256_BYTES:
+   sess->cipher.key_length_in_bytes = AES_256_BYTES;
+   IMB_AES_KEYEXP_256(mb_mgr, xform->cipher.key.data,
+   sess->cipher.expanded_aes_keys.encode,
+   sess->cipher.expanded_aes_keys.decode);
+   break;
+#endif
+   default:
+   AESNI_MB_LOG(ERR, "Invalid cipher key length");
+   return -EINVAL;
+   }
} else if (is_3DES) {
uint64_t *keys[3] = {sess->cipher.exp_3des_keys.key[0],
sess->cipher.exp_3des_keys.key[1],
-- 
2.7.5



Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin
Hi Akhil,

> 
> Hi Anoob/Konstantin,
> > >
> > > Check that ops->get_userdata is a valid function pointer will be compiled 
> > > out.
> > > So PMDs that don't implement this function will crash in
> > > rte_security_get_userdata().
> > > In our particular case - ixgbe.
> > > Same story with  rte_security_set_pkt_metadata() - see the patch.
> >
> > [Anoob] But ixgbe doesn't implement inline protocol which is the primary
> > consumer of this API (rte_security_get_userdata()). So what is the trouble?
> >
> > Also, application is expected to call rte_security_set_pkt_metadata() only 
> > on
> > devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a PMD
> > states it needs MDATA but fails to register a function pointer for doing 
> > the same,
> > it is a control path problem. Checking for that in the datapath is an 
> > overkill.
> >
> Whatever your concern is, we can resolve it later, but for now we should have 
> the same
> Unconditional checks that were there earlier. We need to make RC1 
> today/tomorrow.
> And this cannot go as an issue.
> 
> These are optional APIs and every PMD may not have supported that.
> 
> Konstantin,
> Please send an update to your patch reverting the original patch for these 2 
> functions.
> Currently it is adding 2 extra checks.
> 

I am afraid we can't do just that.
As in that case /app/test/test_security.c build wih -DRE_DEBUG will start 
crashing.

I think we have 3 alternative how to fix it:

1. Keep all these 3 checks for debug and non-debug mode (that what my current 
patch does).
2. Have both: existed 1 check in non-debug mode, plus new checks in debug mode, 
i.e.:
rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
 {
void *userdata = NULL;
 
+#ifdef RTE_DEBUG
+   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
+#else
RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
+#endif

 

3. Keep only 1 existed check in non-debug mode and remove cases in 
app/test/test_security.c
that would crash with -DRTE_DEBUG.  

My preference is 1), I don't think these 2 extra checks will affect performance 
greatly.
Also with 1) we can make these new test-case to be executed for non-debug mode 
too.
2) is probably also ok - but I think RTE_DEBUG concept should be a separate 
patch series,
and I don't want to mix things.
What is your opinion here?

Konstantin






Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Akhil Goyal


> 
> Hi Akhil,
> 
> >
> > Hi Anoob/Konstantin,
> > > >
> > > > Check that ops->get_userdata is a valid function pointer will be 
> > > > compiled
> out.
> > > > So PMDs that don't implement this function will crash in
> > > > rte_security_get_userdata().
> > > > In our particular case - ixgbe.
> > > > Same story with  rte_security_set_pkt_metadata() - see the patch.
> > >
> > > [Anoob] But ixgbe doesn't implement inline protocol which is the primary
> > > consumer of this API (rte_security_get_userdata()). So what is the 
> > > trouble?
> > >
> > > Also, application is expected to call rte_security_set_pkt_metadata() 
> > > only on
> > > devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a
> PMD
> > > states it needs MDATA but fails to register a function pointer for doing 
> > > the
> same,
> > > it is a control path problem. Checking for that in the datapath is an 
> > > overkill.
> > >
> > Whatever your concern is, we can resolve it later, but for now we should 
> > have
> the same
> > Unconditional checks that were there earlier. We need to make RC1
> today/tomorrow.
> > And this cannot go as an issue.
> >
> > These are optional APIs and every PMD may not have supported that.
> >
> > Konstantin,
> > Please send an update to your patch reverting the original patch for these 2
> functions.
> > Currently it is adding 2 extra checks.
> >
> 
> I am afraid we can't do just that.
> As in that case /app/test/test_security.c build wih -DRE_DEBUG will start
> crashing.
> 
> I think we have 3 alternative how to fix it:
> 
> 1. Keep all these 3 checks for debug and non-debug mode (that what my current
> patch does).
> 2. Have both: existed 1 check in non-debug mode, plus new checks in debug
> mode, i.e.:
> rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
>  {
>   void *userdata = NULL;
> 
> +#ifdef RTE_DEBUG
> + RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> NULL);
> +#else
>   RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
> +#endif
> 
>  
> 
> 3. Keep only 1 existed check in non-debug mode and remove cases in
> app/test/test_security.c
> that would crash with -DRTE_DEBUG.
> 
> My preference is 1), I don't think these 2 extra checks will affect 
> performance
> greatly.
> Also with 1) we can make these new test-case to be executed for non-debug
> mode too.
> 2) is probably also ok - but I think RTE_DEBUG concept should be a separate
> patch series,
> and I don't want to mix things.
> What is your opinion here?
> 
I am OK with both 1 and 2.
Anoob may be concerned about the performance.
 But if we go with 2, it would be better to have
 rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
  {
void *userdata = NULL;
 
 +#ifdef RTE_DEBUG
 +  RTE_PTR_OR_ERR_RET(instance, NULL); 
 +  RTE_PTR_OR_ERR_RET(instance->ops, NULL); 
 +#endif
RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);

}

And for security test, we can have a separate patch. Lukasz or you can send 
that later if not now.


Re: [dpdk-dev] [PATCH v7 00/32] DPDK Trace support

2020-04-23 Thread David Marchand
On Wed, Apr 22, 2020 at 9:04 PM  wrote:
> This patch set contains
> 
>
> # The native implementation of common trace format(CTF)[1] based tracer
> # Public API to create the trace points.
> # Add tracepoints to eal, ethdev, mempool, eventdev and cryptodev
> library for tracing support
> # A unit test case
> # Performance test case to measure the trace overhead. (See eal/trace:
> # add trace performance test cases, patch)
> # Programmers guide for Trace support(See doc: add trace library guide,
> # patch)
>
> # Tested OS:
> ~~~
> - Linux
> - FreeBSD
>
> # Tested open source CTF trace viewers
> ~~
> - Babeltrace
> - Tracecompass
>
> # Trace overhead comparison with LTTng
> ~~~
>
> trace overhead data on x86:[2]
> # 236 cycles with LTTng(>100ns)
> # 18 cycles(7ns) with Native DPDK CTF emitter.(See eal/trace: add trace
> # performance test cases patch)
>
> trace overhead data on arm64:
> #  312  cycles to  1100 cycles with LTTng based on the class of arm64  CPU.
> #  11 cycles to 13 cycles with Native DPDK CTF emitter based on the
> class of arm64 CPU.
>
> 18 cycles(on x86) vs 11 cycles(on arm64) is due to rdtsc() overhead in
> x86. It seems  rdtsc takes around 15cycles in x86.
>
> More details:
> ~
>
> # The Native DPDK CTF trace support does not have any dependency on
> third-party library.
> The generated output file is compatible with LTTng as both are using
> CTF trace format.
>
> The performance gain comes from:
> 1) exploit dpdk worker thread usage model to avoid atomics and use per
> core variables
> 2) use hugepage,
> 3) avoid a lot function pointers in fast-path etc
> 4) avoid unaligned store for arm64 etc
>
> Features:
> ~
> - No specific limit on the events. A string-based event like rte_log
> for pattern matching
> - Dynamic enable/disable support.
> - Instructmention overhead is ~1 cycle. i.e cost of adding the code
> wth out using trace feature.
> - Timestamp support for all the events using DPDK rte_rtdsc
> - No dependency on another library. Clean room native implementation of CTF.
>
> Functional test case:
> a) echo "trace_autotest" | sudo ./build/app/test/dpdk-test  -c 0x3 --trace=.*
>
> The above command emits the following trace events
> 
> uint8_t i;
>
> rte_trace_lib_eal_generic_void();
> rte_trace_lib_eal_generic_u64(0x10);
> rte_trace_lib_eal_generic_u32(0x1000);
> rte_trace_lib_eal_generic_u16(0xffee);
> rte_trace_lib_eal_generic_u8(0xc);
> rte_trace_lib_eal_generic_i64(-1234);
> rte_trace_lib_eal_generic_i32(-1234567);
> rte_trace_lib_eal_generic_i16(12);
> rte_trace_lib_eal_generic_i8(-3);
> rte_trace_lib_eal_generic_string("my string");
> rte_trace_lib_eal_generic_function(__func__);
>
> 
>
> Install babeltrace package in Linux and point the generated trace file
> to babel trace. By default trace file created under
> /dpdk-traces/time_stamp/
>
> example:
> # babeltrace /root/dpdk-traces/rte-2020-02-15-PM-02-56-51 | more
>
> [13:27:36.138468807] (+?.?) lib.eal.generic.void: { cpu_id =0, name = 
> "dpdk-test" }, { }
> [13:27:36.138468851] (+0.00044) lib.eal.generic.u64: { cpu_id = 0, name = 
> "dpdk-test" }, { in = 4503599627370496 }
> [13:27:36.138468860] (+0.9) lib.eal.generic.u32: { cpu_id = 0, name = 
> "dpdk-test" }, { in = 268435456 }
> [13:27:36.138468934] (+0.00074) lib.eal.generic.u16: { cpu_id = 0, name = 
> "dpdk-test" }, { in = 65518 }
> [13:27:36.138468949] (+0.00015) lib.eal.generic.u8: { cpu_id = 0, name = 
> "dpdk-test" }, { in = 12 }
> [13:27:36.138468956] (+0.7) lib.eal.generic.i64: { cpu_id = 0, name = 
> "dpdk-test" }, { in = -1234 }
> [13:27:36.138468963] (+0.7) lib.eal.generic.i32: { cpu_id = 0, name = 
> "dpdk-test" }, { in = -1234567 }
> [13:27:36.138469024] (+0.00061) lib.eal.generic.i16: { cpu_id = 0, name = 
> "dpdk-test" }, { in = 12 }
> [13:27:36.138469044] (+0.00020) lib.eal.generic.i8: { cpu_id = 0, name = 
> "dpdk-test" }, { in = -3 }
> [13:27:36.138469051] (+0.7) lib.eal.generic.string: { cpu_id = 0, 
> name = "dpdk-test" }, { str = "my string" }
> [13:27:36.138469203] (+0.00152) lib.eal.generic.func: { cpu_id = 0, name 
> = "dpdk-test" }, { func = "test_trace_points" }
>
> # There is a  GUI based trace viewer available in Windows, Linux and  Mac.
> It is called as tracecompass.(https://www.eclipse.org/tracecompass/)
>
> The example screenshot and Histogram of above DPDK trace using
> Tracecompass.
>
> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG

This series is quite big and did not get a lot of comments/reviews:
especially the tracepoints added to important subsystems.

- fixed some typos, some missed renames in commit logs and
intermediate patches, some nits on coding style and reworded comments.
- added the unit tests in MAINTAINERS.
- mo

[dpdk-dev] [PATCH] eal: fix build on armv7

2020-04-23 Thread David Marchand
Caught by OBS on armv7:

In file included from .../lib/librte_eal/include/rte_string_fns.h:21,
 from .../lib/librte_kvargs/rte_kvargs.c:9:
.../lib/librte_eal/include/rte_common.h:67:37: error: expected '=', ',',
 ';', 'asm' or '__attribute__' before '__rte_aligned'
   67 | typedef uint64_t unaligned_uint64_t __rte_aligned(1);
  | ^
.../lib/librte_eal/include/rte_common.h:68:37: error: expected '=', ',',
 ';', 'asm' or '__attribute__' before '__rte_aligned'
   68 | typedef uint32_t unaligned_uint32_t __rte_aligned(1);
  | ^
.../lib/librte_eal/include/rte_common.h:69:37: error: expected '=', ',',
 ';', 'asm' or '__attribute__' before '__rte_aligned'
   69 | typedef uint16_t unaligned_uint16_t __rte_aligned(1);
  | ^
make[3]: *** [.../mk/internal/rte.compile-pre.mk:116: rte_kvargs.o] Error 1

Fixes: f35e5b3e07b2 ("replace alignment attributes")

Signed-off-by: David Marchand 
---
 lib/librte_eal/include/rte_common.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/include/rte_common.h 
b/lib/librte_eal/include/rte_common.h
index 733447b736..668e8b0af8 100644
--- a/lib/librte_eal/include/rte_common.h
+++ b/lib/librte_eal/include/rte_common.h
@@ -63,6 +63,11 @@ extern "C" {
__GNUC_PATCHLEVEL__)
 #endif
 
+/**
+ * Force alignment
+ */
+#define __rte_aligned(a) __attribute__((__aligned__(a)))
+
 #ifdef RTE_ARCH_STRICT_ALIGN
 typedef uint64_t unaligned_uint64_t __rte_aligned(1);
 typedef uint32_t unaligned_uint32_t __rte_aligned(1);
@@ -73,11 +78,6 @@ typedef uint32_t unaligned_uint32_t;
 typedef uint16_t unaligned_uint16_t;
 #endif
 
-/**
- * Force alignment
- */
-#define __rte_aligned(a) __attribute__((__aligned__(a)))
-
 /**
  * Force a structure to be packed
  */
-- 
2.23.0



[dpdk-dev] [PATCH 2/2] timer: support EAL functions on Windows

2020-04-23 Thread Fady Bader
Implemented the needed Windows eal timer functions.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/common/meson.build   |  1 +
 lib/librte_eal/windows/eal.c|  6 +++
 lib/librte_eal/windows/eal_timer.c  | 67 +
 lib/librte_eal/windows/include/rte_os.h |  2 +
 lib/librte_eal/windows/meson.build  |  1 +
 5 files changed, 77 insertions(+)
 create mode 100644 lib/librte_eal/windows/eal_timer.c

diff --git a/lib/librte_eal/common/meson.build 
b/lib/librte_eal/common/meson.build
index 6dcdcc890..532330e6d 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -20,6 +20,7 @@ if is_windows
'eal_common_options.c',
'eal_common_tailqs.c',
'eal_common_thread.c',
+   'eal_common_timer.c',
'malloc_elem.c',
'malloc_heap.c',
'rte_malloc.c',
diff --git a/lib/librte_eal/windows/eal.c b/lib/librte_eal/windows/eal.c
index 38f17f09c..32853fbac 100644
--- a/lib/librte_eal/windows/eal.c
+++ b/lib/librte_eal/windows/eal.c
@@ -400,6 +400,12 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
+   if (rte_eal_timer_init() < 0) {
+   rte_eal_init_alert("Cannot init TSC timer");
+   rte_errno = EFAULT;
+   return -1;
+   }
+
eal_thread_init_master(rte_config.master_lcore);
 
RTE_LCORE_FOREACH_SLAVE(i) {
diff --git a/lib/librte_eal/windows/eal_timer.c 
b/lib/librte_eal/windows/eal_timer.c
new file mode 100644
index 0..73eaff948
--- /dev/null
+++ b/lib/librte_eal/windows/eal_timer.c
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* The frequency of the RDTSC timer resolution */
+static uint64_t eal_tsc_resolution_hz;
+
+void
+rte_delay_us_sleep(unsigned int us)
+{
+   LONGLONG ns = us * 1000;
+   HANDLE timer;
+   LARGE_INTEGER liDueTime;
+   /* create waitable timer */
+   timer = CreateWaitableTimer(NULL, TRUE, NULL);
+   if(!timer){
+   /* didnt find any better errno val */
+   rte_errno = EINVAL;
+   return;
+   }
+
+   /* set us microseconds time for timer */
+   liDueTime.QuadPart = -ns;
+   if(!SetWaitableTimer(timer, &liDueTime, 0, NULL, NULL, FALSE)){
+   CloseHandle(timer);
+   /* didnt find any better errno val */
+   rte_errno = EFAULT;
+   return;
+   }
+   /* start wait for timer for us microseconds */
+   WaitForSingleObject(timer, INFINITE);
+   CloseHandle(timer);
+}
+
+uint64_t
+get_tsc_freq(void)
+{
+   uint64_t tsc_freq;
+   LARGE_INTEGER Frequency;
+
+   QueryPerformanceFrequency(&Frequency);
+   /*
+   QueryPerformanceFrequency output is in khz.
+   Mulitply by 1K to obtain the true frequency of the CPU (khz -> hz)
+   */
+   tsc_freq = ((uint64_t)Frequency.QuadPart * 1000);
+
+   return tsc_freq;
+}
+
+
+int
+rte_eal_timer_init(void)
+{
+   set_tsc_freq();
+   return 0;
+}
+
diff --git a/lib/librte_eal/windows/include/rte_os.h 
b/lib/librte_eal/windows/include/rte_os.h
index 62805a307..951a14d72 100644
--- a/lib/librte_eal/windows/include/rte_os.h
+++ b/lib/librte_eal/windows/include/rte_os.h
@@ -24,6 +24,8 @@ extern "C" {
 #define PATH_MAX _MAX_PATH
 #endif
 
+#define sleep(x) Sleep(1000 * x)
+
 #define strerror_r(a, b, c) strerror_s(b, c, a)
 
 /* strdup is deprecated in Microsoft libc and _strdup is preferred */
diff --git a/lib/librte_eal/windows/meson.build 
b/lib/librte_eal/windows/meson.build
index 0bd56cd8f..769cde797 100644
--- a/lib/librte_eal/windows/meson.build
+++ b/lib/librte_eal/windows/meson.build
@@ -12,6 +12,7 @@ sources += files(
'eal_memory.c',
'eal_mp.c',
'eal_thread.c',
+   'eal_timer.c',
'getopt.c',
 )
 
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH 0/2] eal timer split and implementation for Windows

2020-04-23 Thread Fady Bader
This patchset splits OS dependent EAL timer functions and implements them for 
windows.

Depends-on: series-9374 ("Windows basic memory management")

Fady Bader (2):
  timer: move from common to Unix directory
  timer: support EAL functions on Windows

 lib/librte_eal/common/eal_common_timer.c | 22 ---
 lib/librte_eal/common/meson.build|  1 +
 lib/librte_eal/unix/eal_timer.c  | 29 ++
 lib/librte_eal/unix/meson.build  |  1 +
 lib/librte_eal/windows/eal.c |  6 +++
 lib/librte_eal/windows/eal_timer.c   | 67 
 lib/librte_eal/windows/include/rte_os.h  |  2 +
 lib/librte_eal/windows/meson.build   |  1 +
 8 files changed, 107 insertions(+), 22 deletions(-)
 create mode 100644 lib/librte_eal/unix/eal_timer.c
 create mode 100644 lib/librte_eal/windows/eal_timer.c

-- 
2.16.1.windows.4



[dpdk-dev] [PATCH 1/2] timer: move from common to Unix directory

2020-04-23 Thread Fady Bader
Eal common timer doesn't compile under Windows.

Compilation log:
error LNK2019:
unresolved external symbol nanosleep referenced in function rte_delay_us_sleep
error LNK2019:
unresolved external symbol get_tsc_freq referenced in function set_tsc_freq
error LNK2019:
unresolved external symbol sleep referenced in function set_tsc_freq

The reason was that some functions called POSIX functions.
The solution was to move POSIX dependent functions from common to Unix.

Signed-off-by: Fady Bader 
---
 lib/librte_eal/common/eal_common_timer.c | 22 --
 lib/librte_eal/unix/eal_timer.c  | 29 +
 lib/librte_eal/unix/meson.build  |  1 +
 3 files changed, 30 insertions(+), 22 deletions(-)
 create mode 100644 lib/librte_eal/unix/eal_timer.c

diff --git a/lib/librte_eal/common/eal_common_timer.c 
b/lib/librte_eal/common/eal_common_timer.c
index fa9ee1b22..71e0bd035 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -35,28 +35,6 @@ rte_delay_us_block(unsigned int us)
rte_pause();
 }
 
-void
-rte_delay_us_sleep(unsigned int us)
-{
-   struct timespec wait[2];
-   int ind = 0;
-
-   wait[0].tv_sec = 0;
-   if (us >= US_PER_S) {
-   wait[0].tv_sec = us / US_PER_S;
-   us -= wait[0].tv_sec * US_PER_S;
-   }
-   wait[0].tv_nsec = 1000 * us;
-
-   while (nanosleep(&wait[ind], &wait[1 - ind]) && errno == EINTR) {
-   /*
-* Sleep was interrupted. Flip the index, so the 'remainder'
-* will become the 'request' for a next call.
-*/
-   ind = 1 - ind;
-   }
-}
-
 uint64_t
 rte_get_tsc_hz(void)
 {
diff --git a/lib/librte_eal/unix/eal_timer.c b/lib/librte_eal/unix/eal_timer.c
new file mode 100644
index 0..36189c346
--- /dev/null
+++ b/lib/librte_eal/unix/eal_timer.c
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ */
+#include 
+
+#include 
+
+void
+rte_delay_us_sleep(unsigned int us)
+{
+   struct timespec wait[2];
+   int ind = 0;
+
+   wait[0].tv_sec = 0;
+   if (us >= US_PER_S) {
+   wait[0].tv_sec = us / US_PER_S;
+   us -= wait[0].tv_sec * US_PER_S;
+   }
+   wait[0].tv_nsec = 1000 * us;
+
+   while (nanosleep(&wait[ind], &wait[1 - ind]) && errno == EINTR) {
+   /*
+* Sleep was interrupted. Flip the index, so the 'remainder'
+* will become the 'request' for a next call.
+*/
+   ind = 1 - ind;
+   }
+}
+
diff --git a/lib/librte_eal/unix/meson.build b/lib/librte_eal/unix/meson.build
index 50c019a56..9da601658 100644
--- a/lib/librte_eal/unix/meson.build
+++ b/lib/librte_eal/unix/meson.build
@@ -4,4 +4,5 @@
 sources += files(
'eal.c',
'eal_memory.c',
+   'eal_timer.c',
 )
-- 
2.16.1.windows.4



[dpdk-dev] [PATCH 0/2] bnxt bug fixes

2020-04-23 Thread Kalesh A P
From: Kalesh AP 

Please apply.

Kalesh AP (1):
  net/bnxt: fix to reset VNIC rxq count on VNIC free

Rahul Gupta (1):
  net/bnxt: fix for memleak during queue restart

 drivers/net/bnxt/bnxt_ethdev.c |  2 ++
 drivers/net/bnxt/bnxt_hwrm.c   | 12 
 drivers/net/bnxt/bnxt_rxr.c| 44 --
 3 files changed, 27 insertions(+), 31 deletions(-)

-- 
2.10.1



[dpdk-dev] [PATCH 2/2] net/bnxt: fix to reset VNIC rxq count on VNIC free

2020-04-23 Thread Kalesh A P
From: Kalesh AP 

bnxt_free_one_vnic and bnxt_setup_one_vnic are called on configuring
port vlan stripping. bnxt_setup_one_vnic keeps incrementing the
vnic rx_queue_cnt. Fix to reset vnic rx_queue_cnt in bnxt_free_one_vnic.

Fixes: cfadfee41ed1 ("net/bnxt: fix VLAN strip")
Cc: sta...@dpdk.org

Signed-off-by: Kalesh AP 
Reviewed-by: Somnath Kotur 
---
 drivers/net/bnxt/bnxt_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 1a3c7e6..ecfc765 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -2173,6 +2173,8 @@ static int bnxt_free_one_vnic(struct bnxt *bp, uint16_t 
vnic_id)
rte_free(vnic->fw_grp_ids);
vnic->fw_grp_ids = NULL;
 
+   vnic->rx_queue_cnt = 0;
+
return 0;
 }
 
-- 
2.10.1



[dpdk-dev] [PATCH 1/2] net/bnxt: fix for memleak during queue restart

2020-04-23 Thread Kalesh A P
From: Rahul Gupta 

During port 0 rxq 1 start ie queue start,
bnxt_free_hwrm_rx_ring() we are clearing the pointers to mbuf array.
Due to this we overwrite the queue with fresh mbuf allocations
causing previously allocated mbufs to leak.
Add a check before allocating mbuf to replenish only empty mbuf slots
in the RxQ.

Fixes: 2eb53b134aae ("net/bnxt: add initial Rx code")
Cc: sta...@dpdk.org

Signed-off-by: Rahul Gupta 
Reviewed-by: Somnath Kotur 
Reviewed-by: Ajit Kumar Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c | 12 
 drivers/net/bnxt/bnxt_rxr.c  | 44 +---
 2 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index ebf73e4..b0a7835 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2476,13 +2476,6 @@ void bnxt_free_hwrm_rx_ring(struct bnxt *bp, int 
queue_index)
if (BNXT_HAS_RING_GRPS(bp))
bp->grp_info[queue_index].rx_fw_ring_id =
INVALID_HW_RING_ID;
-   memset(rxr->rx_desc_ring, 0,
-  rxr->rx_ring_struct->ring_size *
-  sizeof(*rxr->rx_desc_ring));
-   memset(rxr->rx_buf_ring, 0,
-  rxr->rx_ring_struct->ring_size *
-  sizeof(*rxr->rx_buf_ring));
-   rxr->rx_prod = 0;
}
ring = rxr->ag_ring_struct;
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
@@ -2490,11 +2483,6 @@ void bnxt_free_hwrm_rx_ring(struct bnxt *bp, int 
queue_index)
BNXT_CHIP_THOR(bp) ?
HWRM_RING_FREE_INPUT_RING_TYPE_RX_AGG :
HWRM_RING_FREE_INPUT_RING_TYPE_RX);
-   ring->fw_ring_id = INVALID_HW_RING_ID;
-   memset(rxr->ag_buf_ring, 0,
-  rxr->ag_ring_struct->ring_size *
-  sizeof(*rxr->ag_buf_ring));
-   rxr->ag_prod = 0;
if (BNXT_HAS_RING_GRPS(bp))
bp->grp_info[queue_index].ag_fw_ring_id =
INVALID_HW_RING_ID;
diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c
index 40da2f2..a657150 100644
--- a/drivers/net/bnxt/bnxt_rxr.c
+++ b/drivers/net/bnxt/bnxt_rxr.c
@@ -963,14 +963,16 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
 
prod = rxr->rx_prod;
for (i = 0; i < ring->ring_size; i++) {
-   if (bnxt_alloc_rx_data(rxq, rxr, prod) != 0) {
-   PMD_DRV_LOG(WARNING,
-   "init'ed rx ring %d with %d/%d mbufs only\n",
-   rxq->queue_id, i, ring->ring_size);
-   break;
+   if (unlikely(!rxr->rx_buf_ring[i].mbuf)) {
+   if (bnxt_alloc_rx_data(rxq, rxr, prod) != 0) {
+   PMD_DRV_LOG(WARNING,
+   "init'ed rx ring %d with %d/%d 
mbufs only\n",
+   rxq->queue_id, i, ring->ring_size);
+   break;
+   }
+   rxr->rx_prod = prod;
+   prod = RING_NEXT(rxr->rx_ring_struct, prod);
}
-   rxr->rx_prod = prod;
-   prod = RING_NEXT(rxr->rx_ring_struct, prod);
}
 
ring = rxr->ag_ring_struct;
@@ -979,14 +981,16 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
prod = rxr->ag_prod;
 
for (i = 0; i < ring->ring_size; i++) {
-   if (bnxt_alloc_ag_data(rxq, rxr, prod) != 0) {
-   PMD_DRV_LOG(WARNING,
-   "init'ed AG ring %d with %d/%d mbufs only\n",
-   rxq->queue_id, i, ring->ring_size);
-   break;
+   if (unlikely(!rxr->ag_buf_ring[i].mbuf)) {
+   if (bnxt_alloc_ag_data(rxq, rxr, prod) != 0) {
+   PMD_DRV_LOG(WARNING,
+   "init'ed AG ring %d with %d/%d 
mbufs only\n",
+   rxq->queue_id, i, ring->ring_size);
+   break;
+   }
+   rxr->ag_prod = prod;
+   prod = RING_NEXT(rxr->ag_ring_struct, prod);
}
-   rxr->ag_prod = prod;
-   prod = RING_NEXT(rxr->ag_ring_struct, prod);
}
PMD_DRV_LOG(DEBUG, "AGG Done!\n");
 
@@ -994,11 +998,13 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
unsigned int max_aggs = BNXT_TPA_MAX_AGGS(rxq->bp);
 
for (i = 0; i < max_aggs; i++) {
-   rxr->tpa_info[i].mbuf =
-   __bnxt_alloc_rx_data(r

Re: [dpdk-dev] [PATCH] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Ananyev, Konstantin


> >
> > Hi Akhil,
> >
> > >
> > > Hi Anoob/Konstantin,
> > > > >
> > > > > Check that ops->get_userdata is a valid function pointer will be 
> > > > > compiled
> > out.
> > > > > So PMDs that don't implement this function will crash in
> > > > > rte_security_get_userdata().
> > > > > In our particular case - ixgbe.
> > > > > Same story with  rte_security_set_pkt_metadata() - see the patch.
> > > >
> > > > [Anoob] But ixgbe doesn't implement inline protocol which is the primary
> > > > consumer of this API (rte_security_get_userdata()). So what is the 
> > > > trouble?
> > > >
> > > > Also, application is expected to call rte_security_set_pkt_metadata() 
> > > > only on
> > > > devices with offload flag RTE_SECURITY_TX_OLOAD_NEED_MDATA. If a
> > PMD
> > > > states it needs MDATA but fails to register a function pointer for 
> > > > doing the
> > same,
> > > > it is a control path problem. Checking for that in the datapath is an 
> > > > overkill.
> > > >
> > > Whatever your concern is, we can resolve it later, but for now we should 
> > > have
> > the same
> > > Unconditional checks that were there earlier. We need to make RC1
> > today/tomorrow.
> > > And this cannot go as an issue.
> > >
> > > These are optional APIs and every PMD may not have supported that.
> > >
> > > Konstantin,
> > > Please send an update to your patch reverting the original patch for 
> > > these 2
> > functions.
> > > Currently it is adding 2 extra checks.
> > >
> >
> > I am afraid we can't do just that.
> > As in that case /app/test/test_security.c build wih -DRE_DEBUG will start
> > crashing.
> >
> > I think we have 3 alternative how to fix it:
> >
> > 1. Keep all these 3 checks for debug and non-debug mode (that what my 
> > current
> > patch does).
> > 2. Have both: existed 1 check in non-debug mode, plus new checks in debug
> > mode, i.e.:
> > rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
> >  {
> > void *userdata = NULL;
> >
> > +#ifdef RTE_DEBUG
> > +   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL,
> > NULL);
> > +#else
> > RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
> > +#endif
> >
> >  
> >
> > 3. Keep only 1 existed check in non-debug mode and remove cases in
> > app/test/test_security.c
> > that would crash with -DRTE_DEBUG.
> >
> > My preference is 1), I don't think these 2 extra checks will affect 
> > performance
> > greatly.
> > Also with 1) we can make these new test-case to be executed for non-debug
> > mode too.
> > 2) is probably also ok - but I think RTE_DEBUG concept should be a separate
> > patch series,
> > and I don't want to mix things.
> > What is your opinion here?
> >
> I am OK with both 1 and 2.
> Anoob may be concerned about the performance.
>  But if we go with 2, it would be better to have
>  rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
>   {
>   void *userdata = NULL;
> 
>  +#ifdef RTE_DEBUG
>  +RTE_PTR_OR_ERR_RET(instance, NULL);
>  +RTE_PTR_OR_ERR_RET(instance->ops, NULL);
>  +#endif
>   RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
> 
> }
> 
> And for security test, we can have a separate patch. Lukasz or you can send 
> that later if not now.

Ok, then to keep everyone happy will go with your code snippet above.




[dpdk-dev] [PATCH v2] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Konstantin Ananyev
Valid checks for optional function pointers inside dev-ops
were disabled by undefined macro.

Fixes: b6ee98547847 ("security: fix verification of parameters")
Cc: sta...@dpdk.org

Signed-off-by: Konstantin Ananyev 
---
 lib/librte_security/rte_security.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_security/rte_security.c 
b/lib/librte_security/rte_security.c
index d475b0977..dc9a3e89c 100644
--- a/lib/librte_security/rte_security.c
+++ b/lib/librte_security/rte_security.c
@@ -108,10 +108,11 @@ rte_security_set_pkt_metadata(struct rte_security_ctx 
*instance,
  struct rte_mbuf *m, void *params)
 {
 #ifdef RTE_DEBUG
-   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, set_pkt_metadata, -EINVAL,
-   -ENOTSUP);
RTE_PTR_OR_ERR_RET(sess, -EINVAL);
+   RTE_PTR_OR_ERR_RET(instance, -EINVAL);
+   RTE_PTR_OR_ERR_RET(instance->ops, -EINVAL);
 #endif
+   RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->set_pkt_metadata, -ENOTSUP);
return instance->ops->set_pkt_metadata(instance->device,
   sess, m, params);
 }
@@ -122,8 +123,10 @@ rte_security_get_userdata(struct rte_security_ctx 
*instance, uint64_t md)
void *userdata = NULL;
 
 #ifdef RTE_DEBUG
-   RTE_PTR_CHAIN3_OR_ERR_RET(instance, ops, get_userdata, NULL, NULL);
+   RTE_PTR_OR_ERR_RET(instance, NULL);
+   RTE_PTR_OR_ERR_RET(instance->ops, NULL);
 #endif
+   RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->get_userdata, NULL);
if (instance->ops->get_userdata(instance->device, md, &userdata))
return NULL;
 
-- 
2.17.1



Re: [dpdk-dev] [PATCH 2/2] timer: support EAL functions on Windows

2020-04-23 Thread Dmitry Kozlyuk
On 2020-04-23 17:43 GMT+0300 Fady Bader wrote:
[snip]
> diff --git a/lib/librte_eal/windows/eal_timer.c 
> b/lib/librte_eal/windows/eal_timer.c
> new file mode 100644
> index 0..73eaff948
> --- /dev/null
> +++ b/lib/librte_eal/windows/eal_timer.c
> @@ -0,0 +1,67 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + */
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/* The frequency of the RDTSC timer resolution */
> +static uint64_t eal_tsc_resolution_hz;
> +
> +void
> +rte_delay_us_sleep(unsigned int us)
> +{
> + LONGLONG ns = us * 1000;
> + HANDLE timer;
> + LARGE_INTEGER liDueTime;

Shouldn't Windows code follow DPDK naming conventions?

> + /* create waitable timer */
> + timer = CreateWaitableTimer(NULL, TRUE, NULL);
> + if(!timer){

Missing spaces. There are more styling issues below that could be detected by
running ./devtools/checkpatches.sh.

> + /* didnt find any better errno val */
> + rte_errno = EINVAL;
> + return;

ENOMEM probably indicates lack of resources better. EINVAL usually implies
wrong arguments to the function. You can also use RTE_WIN32_LOG_ERR() here to
log exact error code on debug level.

> + }
> +
> + /* set us microseconds time for timer */
> + liDueTime.QuadPart = -ns;

Timeout is neither in microseconds, nor in nanoseconds, it's in 100-nanosecond
intervals.

> + if(!SetWaitableTimer(timer, &liDueTime, 0, NULL, NULL, FALSE)){
> + CloseHandle(timer);
> + /* didnt find any better errno val */
> + rte_errno = EFAULT;

And here, EINVAL is probably better, because result depends on function
argument and this probably will be the most frequent source of errors.

> + return;
> + }
> + /* start wait for timer for us microseconds */
> + WaitForSingleObject(timer, INFINITE);
> + CloseHandle(timer);
> +}
[snip]
> diff --git a/lib/librte_eal/windows/include/rte_os.h 
> b/lib/librte_eal/windows/include/rte_os.h
> index 62805a307..951a14d72 100644
> --- a/lib/librte_eal/windows/include/rte_os.h
> +++ b/lib/librte_eal/windows/include/rte_os.h
> @@ -24,6 +24,8 @@ extern "C" {
>  #define PATH_MAX _MAX_PATH
>  #endif
>  
> +#define sleep(x) Sleep(1000 * x)

It's better to enclose "x" in parentheses or to use inline function.

-- 
Dmitry Kozlyuk


Re: [dpdk-dev] [PATCH v8 0/9] add packed ring vectorized path

2020-04-23 Thread Wang, Yinan
Tested-by: Wang, Yinan 

> -Original Message-
> From: dev  On Behalf Of Marvin Liu
> Sent: 2020年4月23日 20:31
> To: maxime.coque...@redhat.com; Ye, Xiaolong ;
> Wang, Zhihong 
> Cc: Van Haaren, Harry ; dev@dpdk.org; Liu,
> Yong 
> Subject: [dpdk-dev] [PATCH v8 0/9] add packed ring vectorized path
> 
> This patch set introduced vectorized path for packed ring.
> 
> The size of packed ring descriptor is 16Bytes. Four batched descriptors are
> just placed into one cacheline. AVX512 instructions can well handle this kind
> of data. Packed ring TX path can fully transformed into vectorized path.
> Packed ring Rx path can be vectorized when requirements met(LRO and
> mergeable disabled).
> 
> New option RTE_LIBRTE_VIRTIO_INC_VECTOR will be introduced in this patch
> set. This option will unify split and packed ring vectorized path default 
> setting.
> Meanwhile user can specify whether enable vectorized path at runtime by
> 'vectorized' parameter of virtio user vdev.
> 
> v8:
> * fix meson build error on ubuntu16.04 and suse15
> 
> v7:
> * default vectorization is disabled
> * compilation time check dependency on rte_mbuf structure
> * offsets are calcuated when compiling
> * remove useless barrier as descs are batched store&load
> * vindex of scatter is directly set
> * some comments updates
> * enable vectorized path in meson build
> 
> v6:
> * fix issue when size not power of 2
> 
> v5:
> * remove cpuflags definition as required extensions always come with
>   AVX512F on x86_64
> * inorder actions should depend on feature bit
> * check ring type in rx queue setup
> * rewrite some commit logs
> * fix some checkpatch warnings
> 
> v4:
> * rename 'packed_vec' to 'vectorized', also used in split ring
> * add RTE_LIBRTE_VIRTIO_INC_VECTOR config for virtio ethdev
> * check required AVX512 extensions cpuflags
> * combine split and packed ring datapath selection logic
> * remove limitation that size must power of two
> * clear 12Bytes virtio_net_hdr
> 
> v3:
> * remove virtio_net_hdr array for better performance
> * disable 'packed_vec' by default
> 
> v2:
> * more function blocks replaced by vector instructions
> * clean virtio_net_hdr by vector instruction
> * allow header room size change
> * add 'packed_vec' option in virtio_user vdev
> * fix build not check whether AVX512 enabled
> * doc update
> 
> 
> Marvin Liu (9):
>   net/virtio: add Rx free threshold setting
>   net/virtio: enable vectorized path
>   net/virtio: inorder should depend on feature bit
>   net/virtio-user: add vectorized path parameter
>   net/virtio: add vectorized packed ring Rx path
>   net/virtio: reuse packed ring xmit functions
>   net/virtio: add vectorized packed ring Tx path
>   net/virtio: add election for vectorized path
>   doc: add packed vectorized path
> 
>  config/common_base  |   1 +
>  doc/guides/nics/virtio.rst  |  43 +-
>  drivers/net/virtio/Makefile |  37 ++
>  drivers/net/virtio/meson.build  |  15 +
>  drivers/net/virtio/virtio_ethdev.c  |  95 ++-
>  drivers/net/virtio/virtio_ethdev.h  |   6 +
>  drivers/net/virtio/virtio_pci.h |   3 +-
>  drivers/net/virtio/virtio_rxtx.c| 212 ++-
>  drivers/net/virtio/virtio_rxtx_packed_avx.c | 665 
>  drivers/net/virtio/virtio_user_ethdev.c |  37 +-
>  drivers/net/virtio/virtqueue.c  |   7 +-
>  drivers/net/virtio/virtqueue.h  | 168 -
>  12 files changed, 1075 insertions(+), 214 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_rxtx_packed_avx.c
> 
> --
> 2.17.1



[dpdk-dev] [PATCH v5] test/ipsec: measure libipsec performance

2020-04-23 Thread Savinay Dharmappa
Add new test-case to measure performance of
ipsec data-path functions.

Signed-off-by: Savinay Dharmappa 
---
 MAINTAINERS|   2 +-
 app/test/Makefile  |   2 +-
 app/test/meson.build   |   2 +
 app/test/test_ipsec_perf.c | 614 +
 4 files changed, 618 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_ipsec_perf.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e18b2dba7..8fd029572 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1251,7 +1251,7 @@ M: Konstantin Ananyev 
 T: git://dpdk.org/next/dpdk-next-crypto
 F: lib/librte_ipsec/
 M: Bernard Iremonger 
-F: app/test/test_ipsec.c
+F: app/test/test_ipsec*
 F: doc/guides/prog_guide/ipsec_lib.rst
 M: Vladimir Medvedkin 
 F: app/test/test_ipsec_sad.c
diff --git a/app/test/Makefile b/app/test/Makefile
index 4582eca6c..54c72c8a9 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -240,7 +240,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_RCU) += test_rcu_qsbr.c 
test_rcu_qsbr_perf.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_SECURITY) += test_security.c
 
-SRCS-$(CONFIG_RTE_LIBRTE_IPSEC) += test_ipsec.c
+SRCS-$(CONFIG_RTE_LIBRTE_IPSEC) += test_ipsec.c test_ipsec_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_IPSEC) += test_ipsec_sad.c
 ifeq ($(CONFIG_RTE_LIBRTE_IPSEC),y)
 LDLIBS += -lrte_ipsec
diff --git a/app/test/meson.build b/app/test/meson.build
index a9a8eabcd..2055c6ea0 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -60,6 +60,7 @@ test_sources = files('commands.c',
'test_interrupts.c',
'test_ipsec.c',
'test_ipsec_sad.c',
+   'test_ipsec_perf.c',
'test_kni.c',
'test_kvargs.c',
'test_link_bonding.c',
@@ -285,6 +286,7 @@ perf_test_names = [
 'hash_readwrite_perf_autotest',
 'hash_readwrite_lf_perf_autotest',
 'trace_perf_autotest',
+   'ipsec_perf_autotest',
 ]
 
 driver_test_names = [
diff --git a/app/test/test_ipsec_perf.c b/app/test/test_ipsec_perf.c
new file mode 100644
index 0..92106bf37
--- /dev/null
+++ b/app/test/test_ipsec_perf.c
@@ -0,0 +1,614 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+#include "test_cryptodev.h"
+
+#define RING_SIZE  4096
+#define BURST_SIZE 64
+#define NUM_MBUF   4095
+#define DEFAULT_SPI 7
+
+struct ipsec_test_cfg {
+   uint32_t replay_win_sz;
+   uint32_t esn;
+   uint64_t flags;
+   enum rte_crypto_sym_xform_type type;
+};
+
+struct rte_mempool *mbuf_pool, *cop_pool;
+
+struct stats_counter {
+   uint64_t nb_prepare_call;
+   uint64_t nb_prepare_pkt;
+   uint64_t nb_process_call;
+   uint64_t nb_process_pkt;
+   uint64_t prepare_ticks_elapsed;
+   uint64_t process_ticks_elapsed;
+};
+
+struct ipsec_sa {
+   struct rte_ipsec_session ss[2];
+   struct rte_ipsec_sa_prm sa_prm;
+   struct rte_security_ipsec_xform ipsec_xform;
+   struct rte_crypto_sym_xform cipher_xform;
+   struct rte_crypto_sym_xform auth_xform;
+   struct rte_crypto_sym_xform aead_xform;
+   struct rte_crypto_sym_xform *crypto_xforms;
+   struct rte_crypto_op *cop[BURST_SIZE];
+   enum rte_crypto_sym_xform_type type;
+   struct stats_counter cnt;
+   uint32_t replay_win_sz;
+   uint32_t sa_flags;
+};
+
+static const struct ipsec_test_cfg test_cfg[] = {
+   {0, 0, 0, RTE_CRYPTO_SYM_XFORM_AEAD},
+   {0, 0, 0, RTE_CRYPTO_SYM_XFORM_CIPHER},
+   {128, 1, 0, RTE_CRYPTO_SYM_XFORM_AEAD},
+   {128, 1, 0, RTE_CRYPTO_SYM_XFORM_CIPHER},
+
+};
+
+static struct rte_ipv4_hdr ipv4_outer  = {
+   .version_ihl = IPVERSION << 4 |
+   sizeof(ipv4_outer) / RTE_IPV4_IHL_MULTIPLIER,
+   .time_to_live = IPDEFTTL,
+   .next_proto_id = IPPROTO_ESP,
+   .src_addr = RTE_IPV4(192, 168, 1, 100),
+   .dst_addr = RTE_IPV4(192, 168, 2, 100),
+};
+
+static struct rte_ring *ring_inb_prepare;
+static struct rte_ring *ring_inb_process;
+static struct rte_ring *ring_outb_prepare;
+static struct rte_ring *ring_outb_process;
+
+struct supported_cipher_algo {
+   const char *keyword;
+   enum rte_crypto_cipher_algorithm algo;
+   uint16_t iv_len;
+   uint16_t block_size;
+   uint16_t key_len;
+};
+
+struct supported_auth_algo {
+   const char *keyword;
+   enum rte_crypto_auth_algorithm algo;
+   uint16_t digest_len;
+   uint16_t key_len;
+   uint8_t key_not_req;
+};
+
+struct supported_aead_algo {
+   const char *keyword;
+   enum rte_crypto_aead_algorithm algo;
+   uint16_t iv_len;
+   uint16_t block_size;
+   uint16_t digest_len;
+   uint16_t key_len;
+   uint8_t aad_len;
+};
+
+const struct supported_cipher_algo cipher_algo[] = {
+   {
+   .keyword = "aes-128-cbc",
+   .algo = RTE_CRYPTO_CIPHER_AES_CBC,
+   .iv_len = 16,
+   .block_size = 16,

Re: [dpdk-dev] [PATCH] eal: fix build on armv7

2020-04-23 Thread Thomas Monjalon
23/04/2020 16:24, David Marchand:
> Caught by OBS on armv7:
> 
> In file included from .../lib/librte_eal/include/rte_string_fns.h:21,
>  from .../lib/librte_kvargs/rte_kvargs.c:9:
> .../lib/librte_eal/include/rte_common.h:67:37: error: expected '=', ',',
>  ';', 'asm' or '__attribute__' before '__rte_aligned'
>67 | typedef uint64_t unaligned_uint64_t __rte_aligned(1);
>   | ^
> .../lib/librte_eal/include/rte_common.h:68:37: error: expected '=', ',',
>  ';', 'asm' or '__attribute__' before '__rte_aligned'
>68 | typedef uint32_t unaligned_uint32_t __rte_aligned(1);
>   | ^
> .../lib/librte_eal/include/rte_common.h:69:37: error: expected '=', ',',
>  ';', 'asm' or '__attribute__' before '__rte_aligned'
>69 | typedef uint16_t unaligned_uint16_t __rte_aligned(1);
>   | ^
> make[3]: *** [.../mk/internal/rte.compile-pre.mk:116: rte_kvargs.o] Error 1
> 
> Fixes: f35e5b3e07b2 ("replace alignment attributes")
> 
> Signed-off-by: David Marchand 

It deserves few words explanation about RTE_ARCH_STRICT_ALIGN.

Acked-by: Thomas Monjalon 




[dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory

2020-04-23 Thread Li Feng
Avoid dump all mapped memory to a core dump file when crash.
Otherwise it will very large and it's hard to analyze with gdb.

In my test, it will dump 128GiB memory to a core dump file when integrated
to spdk with default configuration.

Signed-off-by: Li Feng 
---
 lib/librte_eal/common/eal_common_memory.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index cc7d54e0c..2d9564b28 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
after_len = RTE_PTR_DIFF(map_end, aligned_end);
if (after_len > 0)
munmap(aligned_end, after_len);
+
+   /*
+* Exclude this pages from a core dump.
+*/
+   if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
+   RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP 
failed: %s\n",
+   strerror(errno));
+   } else {
+   /*
+* Exclude this pages from a core dump.
+*/
+   if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
+   RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP 
failed: %s\n",
+   strerror(errno));
}
 
return aligned_addr;
-- 
2.11.0


-- 
The SmartX email address is only for business purpose. Any sent message 
that is not related to the business is not authorized or permitted by 
SmartX.
本邮箱为北京志凌海纳科技有限公司(SmartX)工作邮箱. 如本邮箱发出的邮件与工作无关,该邮件未得到本公司任何的明示或默示的授权.




Re: [dpdk-dev] [PATCH v2] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Akhil Goyal


> Valid checks for optional function pointers inside dev-ops
> were disabled by undefined macro.
> 
> Fixes: b6ee98547847 ("security: fix verification of parameters")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Konstantin Ananyev 
> ---

Acked-by: Akhil Goyal 

Anoob,

Do you have any concerns over this patch?

Regards,
Akhil


Re: [dpdk-dev] [PATCH v2] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Anoob Joseph
Hi Akhil,

I have my concerns over unwanted checks in the datapath. Something that crypto 
enqueue/dequeue APIs are not doing is being enforced on other APIs. As 
Konstantin had suggested, PMDs (IXGBE here) could define a function which 
returns -ENOTSUP and it would have been win-win for everyone.

Anyway, I don't have any objections to this.

Thanks,
Anoob

> -Original Message-
> From: Akhil Goyal 
> Sent: Thursday, April 23, 2020 9:22 PM
> To: Konstantin Ananyev ; dev@dpdk.org;
> Anoob Joseph 
> Cc: declan.dohe...@intel.com; sta...@dpdk.org
> Subject: [EXT] RE: [PATCH v2] security: fix crash at accessing non-implemented
> ops
> 
> External Email
> 
> --
> 
> > Valid checks for optional function pointers inside dev-ops were
> > disabled by undefined macro.
> >
> > Fixes: b6ee98547847 ("security: fix verification of parameters")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Konstantin Ananyev 
> > ---
> 
> Acked-by: Akhil Goyal 
> 
> Anoob,
> 
> Do you have any concerns over this patch?
> 
> Regards,
> Akhil


Re: [dpdk-dev] [PATCH v2] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Akhil Goyal
Hi Anoob,
> 
> Hi Akhil,
> 
> I have my concerns over unwanted checks in the datapath. Something that
> crypto enqueue/dequeue APIs are not doing is being enforced on other APIs. As
> Konstantin had suggested, PMDs (IXGBE here) could define a function which
> returns -ENOTSUP and it would have been win-win for everyone.
> 
> Anyway, I don't have any objections to this.
> 
Your concerns are valid and those can be handled separately.
We will apply this patch.



Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table

2020-04-23 Thread Ananyev, Konstantin
Hi everyone,
 
> >
> > On 2020-04-16 12:18, Medvedkin, Vladimir wrote:
> > > Hi Mattias,
> > >
> > > -Original Message-
> > > From: Mattias Rönnblom 
> > > Sent: Wednesday, April 15, 2020 7:52 PM
> > > To: Medvedkin, Vladimir ; dev@dpdk.org
> > > Cc: Ananyev, Konstantin ; Wang, Yipeng1
> > > ; Gobriel, Sameh ;
> > > Richardson, Bruce 
> > > Subject: Re: [dpdk-dev] [PATCH v3 0/4] add new k32v64 hash table
> > >
> > > On 2020-04-15 20:17, Vladimir Medvedkin wrote:
> > >> Currently DPDK has a special implementation of a hash table for
> > >> 4 byte keys which is called FBK hash. Unfortunately its main drawback
> > >> is that it only supports 2 byte values.
> > >> The new implementation called K32V64 hash supports 4 byte keys and 8
> > >> byte associated values, which is enough to store a pointer.
> > >>
> > >> It would also be nice to get feedback on whether to leave the old FBK
> > >> and new k32v64 implementations or deprecate the old one?
> > >
> > > Do you think it would be feasible to support custom-sized values and 
> > > remain
> > efficient, in a similar manner to how rte_ring_elem.h does things?
> > >
> > > I'm afraid it is not feasible. For the performance reason keys and
> > corresponding values resides in single cache line so there are no extra 
> > memory
> > for bigger values, such as 16B.
> >
> >
> > Well, if you have a smaller value type (or key type) you would fit into
> > something less-than-a-cache line, and thus reduce your memory working set
> > further.
> >
> >
> > >> v3:
> > >> - added bulk lookup
> > >> - avx512 key comparizon is removed from .h
> > >>
> > >> v2:
> > >> - renamed from rte_dwk to rte_k32v64 as was suggested
> > >> - reworked lookup function, added inlined subroutines
> > >> - added avx512 key comparizon routine
> > >> - added documentation
> > >> - added statistic counters for total entries and extended
> > >> entries(linked list)
> > >>
> > >> Vladimir Medvedkin (4):
> > >> hash: add k32v64 hash library
> > >> hash: add documentation for k32v64 hash library
> > >> test: add k32v64 hash autotests
> > >> test: add k32v64 perf tests
> > >>
> > >>app/test/Makefile |   1 +
> > >>app/test/autotest_data.py |  12 ++
> > >>app/test/meson.build  |   3 +
> > >>app/test/test_hash_perf.c | 130 
> > >>app/test/test_k32v64_hash.c   | 229 ++
> > >>doc/api/doxy-api-index.md |   1 +
> > >>doc/guides/prog_guide/index.rst   |   1 +
> > >>doc/guides/prog_guide/k32v64_hash_lib.rst |  66 +++
> > >>lib/Makefile  |   2 +-
> > >>lib/librte_hash/Makefile  |  13 +-
> > >>lib/librte_hash/k32v64_hash_avx512vl.c|  56 ++
> > >>lib/librte_hash/meson.build   |  17 +-
> > >>lib/librte_hash/rte_hash_version.map  |   6 +-
> > >>lib/librte_hash/rte_k32v64_hash.c | 315
> > ++
> > >>lib/librte_hash/rte_k32v64_hash.h | 211 
> > >>15 files changed, 1058 insertions(+), 5 deletions(-)
> > >>create mode 100644 app/test/test_k32v64_hash.c
> > >>create mode 100644 doc/guides/prog_guide/k32v64_hash_lib.rst
> > >>create mode 100644 lib/librte_hash/k32v64_hash_avx512vl.c
> > >>create mode 100644 lib/librte_hash/rte_k32v64_hash.c
> > >>create mode 100644 lib/librte_hash/rte_k32v64_hash.h
> > >>
> [Wang, Yipeng]
> Hi, Vladimir,
> Thanks for responding with the use cases earlier.
> I discussed with Sameh offline, here are some comments.
> 
> 1. Since the proposed hash table also has some similarities to rte_table 
> library used by packet framework,
> have you tried it yet? Although it is mainly for packet framework, I believe 
> you can use it independently as well.
> It has implementations for special key value sizes.
> I added Cristian for his comment.
> 
> 2. We tend to agree with Mattias that it would be better if we have a more 
> generic API name and with the same
> API we can do multiple key/value size implementations.
> This is to avoid adding new APIs in future to again handle different key/value
> use cases.  For example, we call it rte_kv_hash, and through the parameter 
> struct we pass in a key-value size pair
> we want to use.
> Implementation-wise, we may only provide implementations for certain popular 
> use cases (like the one you provided).
> For other general use cases, people should go with the more flexible and 
> generic cuckoo hash.
> Then we should also merge the FBK under the new API.

From my perspective Vladimir work is not an attempt to introduce new API -
but to fix (extend) existing FBK one.
Right now there is a contradictory situation:
from one side with 4B keys hash tables are quite common, from other side
because of its limitations (2B value, no mechanism to resolve collisions) 
fbk_hash
is hardly usable i

[dpdk-dev] [PATCH] test/security: enable tests for non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski
After re-enabling checks for non-implemneted ops in non-debug mode
in librte_security set_pkt_metadata and get_userdata functions,
tests verifying proper work of tests can be enabled also.

Signed-off-by: Lukasz Wojciechowski 
---
 app/test/test_security.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/app/test/test_security.c b/app/test/test_security.c
index 724ce56f4..3076a4c5a 100644
--- a/app/test/test_security.c
+++ b/app/test/test_security.c
@@ -1474,7 +1474,6 @@ test_set_pkt_metadata_inv_context_ops(void)
 static int
 test_set_pkt_metadata_inv_context_ops_fun(void)
 {
-#ifdef RTE_DEBUG
struct security_unittest_params *ut_params = &unittest_params;
struct rte_mbuf m;
int params;
@@ -1487,9 +1486,6 @@ test_set_pkt_metadata_inv_context_ops_fun(void)
TEST_ASSERT_MOCK_CALLS(mock_set_pkt_metadata_exp, 0);
 
return TEST_SUCCESS;
-#else
-   return TEST_SKIPPED;
-#endif
 }
 
 /**
@@ -1621,7 +1617,6 @@ test_get_userdata_inv_context_ops(void)
 static int
 test_get_userdata_inv_context_ops_fun(void)
 {
-#ifdef RTE_DEBUG
struct security_unittest_params *ut_params = &unittest_params;
uint64_t md = 0xDEADBEEF;
ut_params->ctx.ops = &empty_ops;
@@ -1632,9 +1627,6 @@ test_get_userdata_inv_context_ops_fun(void)
TEST_ASSERT_MOCK_CALLS(mock_get_userdata_exp, 0);
 
return TEST_SUCCESS;
-#else
-   return TEST_SKIPPED;
-#endif
 }
 
 /**
-- 
2.17.1



Re: [dpdk-dev] [PATCH v2] security: fix crash at accessing non-implemented ops

2020-04-23 Thread Lukasz Wojciechowski
W dniu 23.04.2020 o 18:14, Akhil Goyal pisze:
> Hi Anoob,
>> Hi Akhil,
>>
>> I have my concerns over unwanted checks in the datapath. Something that
>> crypto enqueue/dequeue APIs are not doing is being enforced on other APIs. As
>> Konstantin had suggested, PMDs (IXGBE here) could define a function which
>> returns -ENOTSUP and it would have been win-win for everyone.
>>
>> Anyway, I don't have any objections to this.
>>
> Your concerns are valid and those can be handled separately.
> We will apply this patch.

I just pushed a patch enabling 2 tests in non-debug build mode.

Sorry for the trouble I caused


Best regards

Lukasz

-- 

Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciec...@partner.samsung.com



[dpdk-dev] [PATCH v2 0/6] use c11 atomics for service core lib

2020-04-23 Thread Phil Yang
The rte_atomic ops and rte_smp barriers enforce DMB barriers on aarch64.
Using c11 atomics with explicit memory ordering instead of the rte_atomic
ops and rte_smp barriers for inter-threads synchronization can uplift the
performance on aarch64 and no performance loss on x86.

This patchset contains:
1) fix race condition for MT unsafe service.
2) clean up redundant code.
3) use c11 atomics for service core lib to avoid unnecessary barriers.

v2:
Still waiting on Harry for the final solution on the MT unsafe race
condition issue. But I have incorporated the comments so far.
1. add 'Fixes' tag for bug-fix patches.
2. remove 'Fixes' tag for code cleanup patches.
3. remove unused parameter for service_dump_one function.
4. replace the execute_lock atomic CAS operation to spinlock_try_lock.
5. use c11 atomics with RELAXED memory ordering for num_mapped_cores.
6. relax barriers for guard variables runstate, comp_runstate and
app_runstate with c11 one-way barriers.

Honnappa Nagarahalli (2):
  service: fix race condition for MT unsafe service
  service: identify service running on another core correctly

Phil Yang (4):
  service: remove rte prefix from static functions
  service: remove redundant code
  service: optimize with c11 atomics
  service: relax barriers with C11 atomics

 lib/librte_eal/common/rte_service.c | 234 +++-
 lib/librte_eal/meson.build  |   4 +
 2 files changed, 130 insertions(+), 108 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH v2 2/6] service: identify service running on another core correctly

2020-04-23 Thread Phil Yang
From: Honnappa Nagarahalli 

The logic to identify if the MT unsafe service is running on another
core can return -EBUSY spuriously. In such cases, running the service
becomes costlier than using atomic operations. Assume that the
application passes the right parameters and reduces the number of
instructions for all cases.

Cc: sta...@dpdk.org
Fixes: 8d39d3e237c2 ("service: fix race in service on app lcore function")

Signed-off-by: Honnappa Nagarahalli 
Reviewed-by: Phil Yang 
---
 lib/librte_eal/common/rte_service.c | 26 --
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index b8c465e..c89472b 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -360,7 +360,7 @@ rte_service_runner_do_callback(struct rte_service_spec_impl 
*s,
 /* Expects the service 's' is valid. */
 static int32_t
 service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
-   struct rte_service_spec_impl *s)
+   struct rte_service_spec_impl *s, uint32_t serialize_mt_unsafe)
 {
if (!s)
return -EINVAL;
@@ -374,7 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t 
service_mask,
 
cs->service_active_on_lcore[i] = 1;
 
-   if (service_mt_safe(s) == 0) {
+   if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
return -EBUSY;
 
@@ -412,24 +412,14 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t 
serialize_mt_unsafe)
 
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
-   /* Atomically add this core to the mapped cores first, then examine if
-* we can run the service. This avoids a race condition between
-* checking the value, and atomically adding to the mapped count.
+   /* Increment num_mapped_cores to indicate that the service
+* is running on a core.
 */
-   if (serialize_mt_unsafe)
-   rte_atomic32_inc(&s->num_mapped_cores);
+   rte_atomic32_inc(&s->num_mapped_cores);
 
-   if (service_mt_safe(s) == 0 &&
-   rte_atomic32_read(&s->num_mapped_cores) > 1) {
-   if (serialize_mt_unsafe)
-   rte_atomic32_dec(&s->num_mapped_cores);
-   return -EBUSY;
-   }
-
-   int ret = service_run(id, cs, UINT64_MAX, s);
+   int ret = service_run(id, cs, UINT64_MAX, s, serialize_mt_unsafe);
 
-   if (serialize_mt_unsafe)
-   rte_atomic32_dec(&s->num_mapped_cores);
+   rte_atomic32_dec(&s->num_mapped_cores);
 
return ret;
 }
@@ -449,7 +439,7 @@ rte_service_runner_func(void *arg)
if (!service_valid(i))
continue;
/* return value ignored as no change to code flow */
-   service_run(i, cs, service_mask, service_get(i));
+   service_run(i, cs, service_mask, service_get(i), 1);
}
 
cs->loops++;
-- 
2.7.4



[dpdk-dev] [PATCH v2 1/6] service: fix race condition for MT unsafe service

2020-04-23 Thread Phil Yang
From: Honnappa Nagarahalli 

The MT unsafe service might get configured to run on another core
while the service is running currently. This might result in the
MT unsafe service running on multiple cores simultaneously. Use
'execute_lock' always when the service is MT unsafe.

Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
Cc: sta...@dpdk.org

Signed-off-by: Honnappa Nagarahalli 
Reviewed-by: Phil Yang 
---
 lib/librte_eal/common/rte_service.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 70d17a5..b8c465e 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -50,6 +50,10 @@ struct rte_service_spec_impl {
uint8_t internal_flags;
 
/* per service statistics */
+   /* Indicates how many cores the service is mapped to run on.
+* It does not indicate the number of cores the service is running
+* on currently.
+*/
rte_atomic32_t num_mapped_cores;
uint64_t calls;
uint64_t cycles_spent;
@@ -370,12 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t 
service_mask,
 
cs->service_active_on_lcore[i] = 1;
 
-   /* check do we need cmpset, if MT safe or <= 1 core
-* mapped, atomic ops are not required.
-*/
-   const int use_atomics = (service_mt_safe(s) == 0) &&
-   (rte_atomic32_read(&s->num_mapped_cores) > 1);
-   if (use_atomics) {
+   if (service_mt_safe(s) == 0) {
if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
return -EBUSY;
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH 1/7] eal: move OS common functions to single file

2020-04-23 Thread Ranjit Menon

On 4/23/2020 3:48 AM, Thomas Monjalon wrote:

23/04/2020 11:06, Dmitry Kozlyuk:

On 2020-04-23 09:27 GMT+0200 Thomas Monjalon wrote:

23/04/2020 01:51, Ranjit Menon:

On 4/22/2020 12:27 AM, tal...@mellanox.com wrote:

From: Tal Shnaiderman 

Move common functions between Unix and Windows to eal_config.c.


Like other files in common, we should call this eal_common_config.c


I am not sure about the interest of repeating the directory name
in the file name in general.
Do you see a real benefit?


In general, no. But in this case, it does make a difference, IMO.
When seeing all the files in EAL together, it clearly stands out that 
these are common/shared files and any change therein will affect others 
beyond Windows. IMO, I prefer it the way it is.




It allows using VPATH in Makefile. If filenames are identical in different
VPATH directories, make can't pick both. Makefiles are being deprecated, but
they'll be around for some more time.


Makefile will be removed in 20.11





ranjit m.


[dpdk-dev] [PATCH v2 3/6] service: remove rte prefix from static functions

2020-04-23 Thread Phil Yang
clean up rte prefix from static functions.
remove unused parameter for service_dump_one function.

Signed-off-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 
---
 lib/librte_eal/common/rte_service.c | 34 +++---
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index c89472b..ed20702 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -340,7 +340,7 @@ rte_service_runstate_get(uint32_t id)
 }
 
 static inline void
-rte_service_runner_do_callback(struct rte_service_spec_impl *s,
+service_runner_do_callback(struct rte_service_spec_impl *s,
   struct core_state *cs, uint32_t service_idx)
 {
void *userdata = s->spec.callback_userdata;
@@ -378,10 +378,10 @@ service_run(uint32_t i, struct core_state *cs, uint64_t 
service_mask,
if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
return -EBUSY;
 
-   rte_service_runner_do_callback(s, cs, i);
+   service_runner_do_callback(s, cs, i);
rte_atomic32_clear(&s->execute_lock);
} else
-   rte_service_runner_do_callback(s, cs, i);
+   service_runner_do_callback(s, cs, i);
 
return 0;
 }
@@ -425,14 +425,14 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t 
serialize_mt_unsafe)
 }
 
 static int32_t
-rte_service_runner_func(void *arg)
+service_runner_func(void *arg)
 {
RTE_SET_USED(arg);
uint32_t i;
const int lcore = rte_lcore_id();
struct core_state *cs = &lcore_states[lcore];
 
-   while (lcore_states[lcore].runstate == RUNSTATE_RUNNING) {
+   while (cs->runstate == RUNSTATE_RUNNING) {
const uint64_t service_mask = cs->service_mask;
 
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
@@ -693,9 +693,9 @@ rte_service_lcore_start(uint32_t lcore)
/* set core to run state first, and then launch otherwise it will
 * return immediately as runstate keeps it in the service poll loop
 */
-   lcore_states[lcore].runstate = RUNSTATE_RUNNING;
+   cs->runstate = RUNSTATE_RUNNING;
 
-   int ret = rte_eal_remote_launch(rte_service_runner_func, 0, lcore);
+   int ret = rte_eal_remote_launch(service_runner_func, 0, lcore);
/* returns -EBUSY if the core is already launched, 0 on success */
return ret;
 }
@@ -774,13 +774,9 @@ rte_service_lcore_attr_get(uint32_t lcore, uint32_t 
attr_id,
 }
 
 static void
-rte_service_dump_one(FILE *f, struct rte_service_spec_impl *s,
-uint64_t all_cycles, uint32_t reset)
+service_dump_one(FILE *f, struct rte_service_spec_impl *s, uint32_t reset)
 {
/* avoid divide by zero */
-   if (all_cycles == 0)
-   all_cycles = 1;
-
int calls = 1;
if (s->calls != 0)
calls = s->calls;
@@ -807,7 +803,7 @@ rte_service_attr_reset_all(uint32_t id)
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
int reset = 1;
-   rte_service_dump_one(NULL, s, 0, reset);
+   service_dump_one(NULL, s, reset);
return 0;
 }
 
@@ -851,21 +847,13 @@ rte_service_dump(FILE *f, uint32_t id)
uint32_t i;
int print_one = (id != UINT32_MAX);
 
-   uint64_t total_cycles = 0;
-
-   for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (!service_valid(i))
-   continue;
-   total_cycles += rte_services[i].cycles_spent;
-   }
-
/* print only the specified service */
if (print_one) {
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
fprintf(f, "Service %s Summary\n", s->spec.name);
uint32_t reset = 0;
-   rte_service_dump_one(f, s, total_cycles, reset);
+   service_dump_one(f, s, reset);
return 0;
}
 
@@ -875,7 +863,7 @@ rte_service_dump(FILE *f, uint32_t id)
if (!service_valid(i))
continue;
uint32_t reset = 0;
-   rte_service_dump_one(f, &rte_services[i], total_cycles, reset);
+   service_dump_one(f, &rte_services[i], reset);
}
 
fprintf(f, "Service Cores Summary\n");
-- 
2.7.4



[dpdk-dev] [PATCH v2 4/6] service: remove redundant code

2020-04-23 Thread Phil Yang
The service id validation is duplicated, remove the redundant code
in the calling functions.

Signed-off-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 
---
 lib/librte_eal/common/rte_service.c | 28 ++--
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index ed20702..9c1a1d5 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -541,24 +541,12 @@ rte_service_start_with_defaults(void)
 }
 
 static int32_t
-service_update(struct rte_service_spec *service, uint32_t lcore,
+service_update(uint32_t sid, uint32_t lcore,
uint32_t *set, uint32_t *enabled)
 {
-   uint32_t i;
-   int32_t sid = -1;
-
-   for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if ((struct rte_service_spec *)&rte_services[i] == service &&
-   service_valid(i)) {
-   sid = i;
-   break;
-   }
-   }
-
-   if (sid == -1 || lcore >= RTE_MAX_LCORE)
-   return -EINVAL;
-
-   if (!lcore_states[lcore].is_service_core)
+   /* validate ID, or return error value */
+   if (sid >= RTE_SERVICE_NUM_MAX || !service_valid(sid) ||
+   lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
return -EINVAL;
 
uint64_t sid_mask = UINT64_C(1) << sid;
@@ -587,19 +575,15 @@ service_update(struct rte_service_spec *service, uint32_t 
lcore,
 int32_t
 rte_service_map_lcore_set(uint32_t id, uint32_t lcore, uint32_t enabled)
 {
-   struct rte_service_spec_impl *s;
-   SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
uint32_t on = enabled > 0;
-   return service_update(&s->spec, lcore, &on, 0);
+   return service_update(id, lcore, &on, 0);
 }
 
 int32_t
 rte_service_map_lcore_get(uint32_t id, uint32_t lcore)
 {
-   struct rte_service_spec_impl *s;
-   SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
uint32_t enabled;
-   int ret = service_update(&s->spec, lcore, 0, &enabled);
+   int ret = service_update(id, lcore, 0, &enabled);
if (ret == 0)
return enabled;
return ret;
-- 
2.7.4



[dpdk-dev] [PATCH v2 5/6] service: optimize with c11 atomics

2020-04-23 Thread Phil Yang
The num_mapped_cores is used as a statistics. Use c11 atomics with
RELAXED ordering for num_mapped_cores instead of rte_atomic ops which
enforce unnessary barriers on aarch64.

Replace execute_lock operations to spinlock_try_lock to avoid duplicate
code.

Signed-off-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 
---
 lib/librte_eal/common/rte_service.c | 32 ++--
 lib/librte_eal/meson.build  |  4 
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 9c1a1d5..8cac265 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "eal_private.h"
 
@@ -38,11 +39,11 @@ struct rte_service_spec_impl {
/* public part of the struct */
struct rte_service_spec spec;
 
-   /* atomic lock that when set indicates a service core is currently
+   /* spin lock that when set indicates a service core is currently
 * running this service callback. When not set, a core may take the
 * lock and then run the service callback.
 */
-   rte_atomic32_t execute_lock;
+   rte_spinlock_t execute_lock;
 
/* API set/get-able variables */
int8_t app_runstate;
@@ -54,7 +55,7 @@ struct rte_service_spec_impl {
 * It does not indicate the number of cores the service is running
 * on currently.
 */
-   rte_atomic32_t num_mapped_cores;
+   uint32_t num_mapped_cores;
uint64_t calls;
uint64_t cycles_spent;
 } __rte_cache_aligned;
@@ -332,7 +333,8 @@ rte_service_runstate_get(uint32_t id)
rte_smp_rmb();
 
int check_disabled = !(s->internal_flags & SERVICE_F_START_CHECK);
-   int lcore_mapped = (rte_atomic32_read(&s->num_mapped_cores) > 0);
+   int lcore_mapped = (__atomic_load_n(&s->num_mapped_cores,
+   __ATOMIC_RELAXED) > 0);
 
return (s->app_runstate == RUNSTATE_RUNNING) &&
(s->comp_runstate == RUNSTATE_RUNNING) &&
@@ -375,11 +377,11 @@ service_run(uint32_t i, struct core_state *cs, uint64_t 
service_mask,
cs->service_active_on_lcore[i] = 1;
 
if ((service_mt_safe(s) == 0) && (serialize_mt_unsafe == 1)) {
-   if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
+   if (!rte_spinlock_trylock(&s->execute_lock))
return -EBUSY;
 
service_runner_do_callback(s, cs, i);
-   rte_atomic32_clear(&s->execute_lock);
+   rte_spinlock_unlock(&s->execute_lock);
} else
service_runner_do_callback(s, cs, i);
 
@@ -415,11 +417,11 @@ rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t 
serialize_mt_unsafe)
/* Increment num_mapped_cores to indicate that the service
 * is running on a core.
 */
-   rte_atomic32_inc(&s->num_mapped_cores);
+   __atomic_add_fetch(&s->num_mapped_cores, 1, __ATOMIC_RELAXED);
 
int ret = service_run(id, cs, UINT64_MAX, s, serialize_mt_unsafe);
 
-   rte_atomic32_dec(&s->num_mapped_cores);
+   __atomic_sub_fetch(&s->num_mapped_cores, 1, __ATOMIC_RELAXED);
 
return ret;
 }
@@ -556,19 +558,19 @@ service_update(uint32_t sid, uint32_t lcore,
 
if (*set && !lcore_mapped) {
lcore_states[lcore].service_mask |= sid_mask;
-   rte_atomic32_inc(&rte_services[sid].num_mapped_cores);
+   __atomic_add_fetch(&rte_services[sid].num_mapped_cores,
+   1, __ATOMIC_RELAXED);
}
if (!*set && lcore_mapped) {
lcore_states[lcore].service_mask &= ~(sid_mask);
-   rte_atomic32_dec(&rte_services[sid].num_mapped_cores);
+   __atomic_sub_fetch(&rte_services[sid].num_mapped_cores,
+   1, __ATOMIC_RELAXED);
}
}
 
if (enabled)
*enabled = !!(lcore_states[lcore].service_mask & (sid_mask));
 
-   rte_smp_wmb();
-
return 0;
 }
 
@@ -616,7 +618,8 @@ rte_service_lcore_reset_all(void)
}
}
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++)
-   rte_atomic32_set(&rte_services[i].num_mapped_cores, 0);
+   __atomic_store_n(&rte_services[i].num_mapped_cores, 0,
+   __ATOMIC_RELAXED);
 
rte_smp_wmb();
 
@@ -699,7 +702,8 @@ rte_service_lcore_stop(uint32_t lcore)
int32_t enabled = service_mask & (UINT64_C(1) << i);
int32_t service_running = rte_service_runstate_get(i);
int32_t only_core = (1 ==
-   rte_atomic32_read(&rte_services[i].num_mapped_cores));
+   __atomic_load_n(&rt

[dpdk-dev] [PATCH v2 6/6] service: relax barriers with C11 atomics

2020-04-23 Thread Phil Yang
The runstate, comp_runstate and app_runstate are used as guard variables
in the service core lib. To guarantee the inter-threads visibility of
these guard variables, it uses rte_smp_r/wmb. This patch use c11 atomic
built-ins to relax these barriers.

Signed-off-by: Phil Yang 
Reviewed-by: Honnappa Nagarahalli 
---
 lib/librte_eal/common/rte_service.c | 115 ++--
 1 file changed, 84 insertions(+), 31 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 8cac265..dbb8211 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -265,7 +265,6 @@ rte_service_component_register(const struct 
rte_service_spec *spec,
s->spec = *spec;
s->internal_flags |= SERVICE_F_REGISTERED | SERVICE_F_START_CHECK;
 
-   rte_smp_wmb();
rte_service_count++;
 
if (id_ptr)
@@ -282,7 +281,6 @@ rte_service_component_unregister(uint32_t id)
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
rte_service_count--;
-   rte_smp_wmb();
 
s->internal_flags &= ~(SERVICE_F_REGISTERED);
 
@@ -301,12 +299,17 @@ rte_service_component_runstate_set(uint32_t id, uint32_t 
runstate)
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
+   /* comp_runstate act as the guard variable. Use store-release
+* memory order. This synchronizes with load-acquire in
+* service_run and service_runstate_get function.
+*/
if (runstate)
-   s->comp_runstate = RUNSTATE_RUNNING;
+   __atomic_store_n(&s->comp_runstate, RUNSTATE_RUNNING,
+   __ATOMIC_RELEASE);
else
-   s->comp_runstate = RUNSTATE_STOPPED;
+   __atomic_store_n(&s->comp_runstate, RUNSTATE_STOPPED,
+   __ATOMIC_RELEASE);
 
-   rte_smp_wmb();
return 0;
 }
 
@@ -316,12 +319,17 @@ rte_service_runstate_set(uint32_t id, uint32_t runstate)
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
+   /* app_runstate act as the guard variable. Use store-release
+* memory order. This synchronizes with load-acquire in
+* service_run runstate_get function.
+*/
if (runstate)
-   s->app_runstate = RUNSTATE_RUNNING;
+   __atomic_store_n(&s->app_runstate, RUNSTATE_RUNNING,
+   __ATOMIC_RELEASE);
else
-   s->app_runstate = RUNSTATE_STOPPED;
+   __atomic_store_n(&s->app_runstate, RUNSTATE_STOPPED,
+   __ATOMIC_RELEASE);
 
-   rte_smp_wmb();
return 0;
 }
 
@@ -330,15 +338,24 @@ rte_service_runstate_get(uint32_t id)
 {
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
-   rte_smp_rmb();
 
-   int check_disabled = !(s->internal_flags & SERVICE_F_START_CHECK);
-   int lcore_mapped = (__atomic_load_n(&s->num_mapped_cores,
+   /* comp_runstate and app_runstate act as the guard variables.
+* Use load-acquire memory order. This synchronizes with
+* store-release in service state set functions.
+*/
+   if (__atomic_load_n(&s->comp_runstate,
+   __ATOMIC_ACQUIRE) == RUNSTATE_RUNNING &&
+__atomic_load_n(&s->app_runstate,
+   __ATOMIC_ACQUIRE) == RUNSTATE_RUNNING) {
+   int check_disabled = !(s->internal_flags &
+   SERVICE_F_START_CHECK);
+   int lcore_mapped = (__atomic_load_n(&s->num_mapped_cores,
__ATOMIC_RELAXED) > 0);
 
-   return (s->app_runstate == RUNSTATE_RUNNING) &&
-   (s->comp_runstate == RUNSTATE_RUNNING) &&
-   (check_disabled | lcore_mapped);
+   return (check_disabled | lcore_mapped);
+   } else
+   return 0;
+
 }
 
 static inline void
@@ -367,9 +384,15 @@ service_run(uint32_t i, struct core_state *cs, uint64_t 
service_mask,
if (!s)
return -EINVAL;
 
-   if (s->comp_runstate != RUNSTATE_RUNNING ||
-   s->app_runstate != RUNSTATE_RUNNING ||
-   !(service_mask & (UINT64_C(1) << i))) {
+   /* comp_runstate and app_runstate act as the guard variables.
+* Use load-acquire memory order. This synchronizes with
+* store-release in service state set functions.
+*/
+   if (__atomic_load_n(&s->comp_runstate,
+   __ATOMIC_ACQUIRE) != RUNSTATE_RUNNING ||
+__atomic_load_n(&s->app_runstate,
+   __ATOMIC_ACQUIRE) != RUNSTATE_RUNNING ||
+   !(service_mask & (UINT64_C(1) << i))) {
cs->service_active_on_lcore[i] = 0;
return -ENOEXEC;
}
@@ -434,7 +457,12 @@ s

Re: [dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory

2020-04-23 Thread Burakov, Anatoly

On 23-Apr-20 4:43 PM, Li Feng wrote:

Avoid dump all mapped memory to a core dump file when crash.
Otherwise it will very large and it's hard to analyze with gdb.

In my test, it will dump 128GiB memory to a core dump file when integrated
to spdk with default configuration.


Suggested rewording:

Currently, even though memory is mapped with PROT_NONE, this does not 
cause it to be excluded from core dumps. This is counter-productive, 
because in a lot of cases, this memory will go unused (e.g. when the 
memory subsystem preallocates VA space but hasn't yet mapped physical 
pages into it).


Use `madvise()` call with MADV_DONTDUMP parameter to exclude the 
unmapped memory from being dumped.




Signed-off-by: Li Feng 
---
  lib/librte_eal/common/eal_common_memory.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index cc7d54e0c..2d9564b28 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
after_len = RTE_PTR_DIFF(map_end, aligned_end);
if (after_len > 0)
munmap(aligned_end, after_len);
+
+   /*
+* Exclude this pages from a core dump.
+*/
+   if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
+   RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: 
%s\n",
+   strerror(errno)); > +} else {
+   /*
+* Exclude this pages from a core dump.
+*/
+   if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
+   RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP failed: 
%s\n",
+   strerror(errno));
}
  
  	return aligned_addr;




For the contents of this patch,

Acked-by: Anatoly Burakov 

However, even though this is good to have, after some more thought, i 
believe the fix is incomplete, because this is not the only place we're 
reserving anonymous memory. We're also doing so in 
`eal_memalloc.c:free_seg()`, so an `madvise()` call should also be added 
there.


@David, now that i think of it, the PROT_NONE patch also was incomplete, 
as we only set PROT_NONE to memory that's initially reserved, but not 
when it's unmapped and returned back to the pool of anonymous memory. 
So, eal_memalloc.c should also remap anonymous memory with PROT_NONE.


@Li Feng, would you be so kind as to provide a patch replacing PROT_READ 
with PROT_NONE in eal_memalloc.c as well? Thank you very much!


--
Thanks,
Anatoly


[dpdk-dev] [PATCH v2] vhost: optimize broadcast rarp sync with c11 atomic

2020-04-23 Thread Phil Yang
The rarp packet broadcast flag is synchronized with rte_atomic_XX APIs
which is a full barrier, DMB, on aarch64. This patch optimized it with
c11 atomic one-way barrier.

Signed-off-by: Phil Yang 
Reviewed-by: Gavin Hu 
Reviewed-by: Honnappa Nagarahalli 
Reviewed-by: Joyce Kong 
---
v2:
split from the 'generic rte atomic APIs deprecate proposal' patchset.

 lib/librte_vhost/vhost.h  |  2 +-
 lib/librte_vhost/vhost_user.c |  7 +++
 lib/librte_vhost/virtio_net.c | 16 +---
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 2087d14..0e22125 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -350,7 +350,7 @@ struct virtio_net {
uint32_tflags;
uint16_tvhost_hlen;
/* to tell if we need broadcast rarp packet */
-   rte_atomic16_t  broadcast_rarp;
+   int16_t broadcast_rarp;
uint32_tnr_vring;
int dequeue_zero_copy;
int extbuf;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index bd1be01..857187d 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -2145,11 +2145,10 @@ vhost_user_send_rarp(struct virtio_net **pdev, struct 
VhostUserMsg *msg,
 * Set the flag to inject a RARP broadcast packet at
 * rte_vhost_dequeue_burst().
 *
-* rte_smp_wmb() is for making sure the mac is copied
-* before the flag is set.
+* __ATOMIC_RELEASE ordering is for making sure the mac is
+* copied before the flag is set.
 */
-   rte_smp_wmb();
-   rte_atomic16_set(&dev->broadcast_rarp, 1);
+   __atomic_store_n(&dev->broadcast_rarp, 1, __ATOMIC_RELEASE);
did = dev->vdpa_dev_id;
vdpa_dev = rte_vdpa_get_device(did);
if (vdpa_dev && vdpa_dev->ops->migration_done)
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 37c47c7..fa10deb 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -2203,6 +2203,7 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
struct virtio_net *dev;
struct rte_mbuf *rarp_mbuf = NULL;
struct vhost_virtqueue *vq;
+   int16_t success = 1;
 
dev = get_device(vid);
if (!dev)
@@ -2249,16 +2250,17 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 *
 * broadcast_rarp shares a cacheline in the virtio_net structure
 * with some fields that are accessed during enqueue and
-* rte_atomic16_cmpset() causes a write if using cmpxchg. This could
-* result in false sharing between enqueue and dequeue.
+* __atomic_compare_exchange_n causes a write if performed compare
+* and exchange. This could result in false sharing between enqueue
+* and dequeue.
 *
 * Prevent unnecessary false sharing by reading broadcast_rarp first
-* and only performing cmpset if the read indicates it is likely to
-* be set.
+* and only performing compare and exchange if the read indicates it
+* is likely to be set.
 */
-   if (unlikely(rte_atomic16_read(&dev->broadcast_rarp) &&
-   rte_atomic16_cmpset((volatile uint16_t *)
-   &dev->broadcast_rarp.cnt, 1, 0))) {
+   if (unlikely(__atomic_load_n(&dev->broadcast_rarp, __ATOMIC_ACQUIRE) &&
+   __atomic_compare_exchange_n(&dev->broadcast_rarp,
+   &success, 0, 0, __ATOMIC_RELEASE, __ATOMIC_RELAXED))) {
 
rarp_mbuf = rte_net_make_rarp_packet(mbuf_pool, &dev->mac);
if (rarp_mbuf == NULL) {
-- 
2.7.4



[dpdk-dev] [PATCH v2] ipsec: optimize with c11 atomic for sa outbound sqn update

2020-04-23 Thread Phil Yang
For SA outbound packets, rte_atomic64_add_return is used to generate
SQN atomically. This introduced an unnecessary full barrier by calling
the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
patch optimized it with c11 atomic and eliminated the expensive barrier
for aarch64.

Signed-off-by: Phil Yang 
Reviewed-by: Ruifeng Wang 
Reviewed-by: Gavin Hu 
---
v2:
split from the "generic rte atomic APIs deprecate proposal" patchset.


 lib/librte_ipsec/ipsec_sqn.h | 3 ++-
 lib/librte_ipsec/meson.build | 5 +
 lib/librte_ipsec/sa.h| 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
index 0c2f76a..e884af7 100644
--- a/lib/librte_ipsec/ipsec_sqn.h
+++ b/lib/librte_ipsec/ipsec_sqn.h
@@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t *num)
 
n = *num;
if (SQN_ATOMIC(sa))
-   sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
+   sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
+   __ATOMIC_RELAXED);
else {
sqn = sa->sqn.outb.raw + n;
sa->sqn.outb.raw = sqn;
diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
index fc69970..9335f28 100644
--- a/lib/librte_ipsec/meson.build
+++ b/lib/librte_ipsec/meson.build
@@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 
'ipsec_sad.c')
 headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 
'rte_ipsec_sad.h')
 
 deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
+
+# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
+if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
+ext_deps += cc.find_library('atomic')
+endif
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index d22451b..cab9a2e 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -120,7 +120,7 @@ struct rte_ipsec_sa {
 */
union {
union {
-   rte_atomic64_t atom;
+   uint64_t atom;
uint64_t raw;
} outb;
struct {
-- 
2.7.4



Re: [dpdk-dev] [PATCH 2/2] eal: resolve getentropy at run time for random seed

2020-04-23 Thread Dan Gora
On Thu, Apr 23, 2020 at 9:36 AM Mattias Rönnblom
 wrote:
> >>
> >> /dev/urandom is basically only a different interface to the same
> >> underlying mechanism.
> >>
> >> Such an alternative would look something like:
> >>
> >> static int
> >> getentropy(void *buffer, size_t length)
> >> {
> >>   int rc = -1;
> >>   int old_errno = errno;
> >>   int fd;
> >>
> >>   fd = open("/dev/urandom", O_RDONLY);
> >>
> >>   if (fd < 0)
> >>   goto out;
> >>
> >>   if (read(fd, buffer, length) != length)
> >>   goto out_close;
> >>
> >>   rc = 0;
> >>
> >> out_close:
> >>   close(fd);
> >> out:
> >>   errno = old_errno;
> >>
> >>   return rc;
> >> }
> > That's fine with me, but like I said I wasn't trying to change how any
> > of this worked, just work around glibc dependencies.  There seems to
> > be some subtle difference between /dev/urandom and /dev/random, but...
> >
> > https://protect2.fireeye.com/v1/url?k=1705be57-4b8f6b41-1705fecc-862f14a9365e-bb983def357fdfad&q=1&e=10fec9c1-51b3-4bc3-b77d-7eb39787d007&u=https%3A%2F%2Fpatches-gcc.linaro.org%2Fcomment%2F14484%2F
> >
>  Failure to run on old libc seems like a non-issue to me.
> >>> Well, again, it's a new dependency that didn't exist before.. We sell
> >>> to telco customers, so we have to support 10s of different target
> >>> platforms of various ages.  If they update their system, we'd have to
> >>> recompile our code to be able to use getentropy().  Similarly, if we
> >>> compiled on a system which has getentropy(), but the target system
> >>> doesn't, then they cannot run our binary because of the glibc 2.25
> >>> dependency.  That means that we have to have separate versions with
> >>> and without getentropy().  It's a maintenance headache for no real
> >>> benefit.
> >>
> >> I'm not sure I follow. Why would you need to recompile DPDK in case they
> >> upgrade their system? It sounds like you care about initial seeding,
> >> since you want getentropy() if it exists, but then in the next paragraph
> >> you want to throw it out, so I'm a little confused.
> > Well  _I_ wouldn't but maybe someone wants getentropy() for the
> > initial seed.. I assume that's why it was added in the first place..
> > For my application we don't care at all.  I just want to get rid of
> > this dependency on glibc 2.25 and have the behavior be the same on
> > meson and Makefile builds on the same complication system.
>
>
> The reason for trying to avoid a wall time-based seed as the default is
> that application instances started at the roughly the same time might
> end up having a the same seed, which in turn might impact their behavior
> in an adverse way. For example, random back-off timers may be the same.
> On x86_64, TSC has a high resolution, but on other platforms its
> equivalent the clock rate is much lower.
>
>
> >> Why doesn't the standard practice of compiling against the oldest
> >> supported libc work for you?
> > I guess I didn't realize that was "standard practice" but even so it
> > still adds an unnecessary restriction on the complication platform.
>
>
> If DPDK has the policy of attempting to allow DPDK applications compiled
> against one glibc version to run against another, older, version, we can
> go ahead and discuss the details further. That would be up to the tech
> board to decide. I would vote against it.

I don't know why anyone would vote against removing an unnecessary
dependency, which was only introduced in v19.08 anyways.

> If the fix was simple, that's one thing. dlopen()/dlsym() doesn't
> qualify as such, nor does a syscall wrapper, as you pointed out.

The dlopen/dlsym() method is used in at least 4 other places in DPDK.
It's not that complicated.  There is plenty of precedence for it being
done this way.

I sent a v4 of the patch which emulated getentropy() using
/dev/urandom as you suggested. Did you see that?

thanks
dan


Re: [dpdk-dev] [PATCH 2/2] eal: resolve getentropy at run time for random seed

2020-04-23 Thread Dan Gora
On Thu, Apr 23, 2020 at 12:59 PM Luca Boccassi  wrote:
> > >
> > > /dev/urandom is basically only a different interface to the same
> > > underlying mechanism.
>
> This is not the whole story though - while the end result when all
> works is the same, there are important differences in getting there.
> There's a reason a programmatic interface was added - it's just better
> in general.
> Just to name one - opening files has implications for LSMs like
> SELinux. You now need a specific policy to allow it, which means
> applications that upgrade from one version of DPDK to the next will
> break.

DPDK opens _tons_ of files. This would not be the first file that DPDK
has to open.  And it's not like /dev/urandom is a new interface.  It's
been around forever.

If this is such a major problem, then that would argue for using the
dlsym()/dlopen() method to try to find the getentropy glibc function
that I sent in v3 of these patches.

> In general, I do not think we should go backwards. The programmatic
> interface to the random pools are good and we should use them by
> default - of course by all means add fallbacks to urandom if they are
> not available.

The original problem was that the "programmatic interface to the
random pools" (that is, getentropy()) can only be determined at
compilation time and if found introduce a new dependency on glibc 2.25
that can easily be avoided by emulating it (as I did here in v4 of the
patches) or by trying to dynamically find the symbol at run time using
dlopen()/dlsym() (as I did in v3 of the patches).

> But as Stephen said glibc generally does not support compiling on new +
> running on old - so if it's not this that breaks, it will be something
> else.

Well that's not necessarily true.  Most glibc interfaces have been
around forever and you can easily see what versions of glibc are
needed by running ldd on your application.  I don't see the point in
introducing a new dependency on a very recent version of glibc which
is not supported by all supported DPDK platforms when it can easily be
worked around.

The issue here is that the original patch to add getentropy():
1) Added a _new_ dependency on glibc 2.25.
2) Added a _new_ dependency that the rdseed CPU flag on the execution
machine has to match the complication machine.
3) Has different behavior if the DPDK is compiled with meson or with
Make on the same complication platform.

thanks,
dan


Re: [dpdk-dev] [PATCH v4 2/2] eal: emulate glibc getentropy for initial random seed

2020-04-23 Thread Dan Gora
On Wed, Apr 22, 2020 at 11:39 PM Stephen Hemminger
 wrote:
>
> On Wed, 22 Apr 2020 20:42:54 -0300
> Dan Gora  wrote:
>
> > + fd = open("/dev/urandom", O_RDONLY);
> > + if (fd < 0) {
> > + errno = ENODEV;
> > + return -1;
> > + }
> > +
> > + end = start + length;
> > + while (start < end) {
> > + bytes = read(fd, start, end - start);
> > + if (bytes < 0) {
>
> You are overdoing the complexity here. More error handling is not better.

I've definitely never heard that expression before!

> 1. This should only be called once at startup EINTR is not an issue then
> 2. The amount requested is always returned when using urandom (see man page 
> for random(4))
>
>The  O_NONBLOCK  flag  has  no effect when opening /dev/urandom.  When 
> calling
>read(2) for the device /dev/urandom, reads of up to 256 bytes will  
> return  as
>many  bytes  as are requested and will not be interrupted by a signal 
> handler.
>Reads with a buffer over this limit may return less than the requested 
>  number
>of bytes or fail with the error EINTR, if interrupted by a signal 
> handler.

I didn't just make this up out of whole cloth... This code was lifted,
almost verbatim, from the glibc implementation of getentropy(), which
is the function that we are trying to emulate:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getentropy.c;h=1778632ff1f1fd77019401c3fbaa164c167248b0;hb=92dcaa3e2f7bf0f7f1c04cd2fb6a317df1a4e225

I assumed that they added this error handling for a reason.

Yes, since this function is only called once at startup EINTR should
not be an issue, but if we need to add __rte_getentropy() as a
generic, exported interface later, that error case would already be
taken care of.

thanks
dan


Re: [dpdk-dev] [PATCH v2] ipsec: optimize with c11 atomic for sa outbound sqn update

2020-04-23 Thread Jerin Jacob
On Thu, Apr 23, 2020 at 10:47 PM Phil Yang  wrote:
>
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
>
> Signed-off-by: Phil Yang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Gavin Hu 

> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 
> 'ipsec_sad.c')
>  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 
> 'rte_ipsec_sad.h')
>
>  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> +ext_deps += cc.find_library('atomic')
> +endif


The following patch has been merged in master now. You don't need this anymore.

commit da4eae278b56e698c64d0c39939a7a55c5b6abdd
Author: Pavan Nikhilesh 
Date:   Sun Apr 19 15:31:01 2020 +0530

build: add global libatomic dependency for 32-bit clang

Add libatomic as a global dependency when compiling for 32-bit using
clang. As we need libatomic for 64-bit atomic ops.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Bruce Richardson 


Re: [dpdk-dev] [PATCH v2] ipsec: optimize with c11 atomic for sa outbound sqn update

2020-04-23 Thread Ananyev, Konstantin
> 
> For SA outbound packets, rte_atomic64_add_return is used to generate
> SQN atomically. This introduced an unnecessary full barrier by calling
> the '__sync' builtin implemented rte_atomic_XX API on aarch64. This
> patch optimized it with c11 atomic and eliminated the expensive barrier
> for aarch64.
> 
> Signed-off-by: Phil Yang 
> Reviewed-by: Ruifeng Wang 
> Reviewed-by: Gavin Hu 
> ---
> v2:
> split from the "generic rte atomic APIs deprecate proposal" patchset.
> 
> 
>  lib/librte_ipsec/ipsec_sqn.h | 3 ++-
>  lib/librte_ipsec/meson.build | 5 +
>  lib/librte_ipsec/sa.h| 2 +-
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_ipsec/ipsec_sqn.h b/lib/librte_ipsec/ipsec_sqn.h
> index 0c2f76a..e884af7 100644
> --- a/lib/librte_ipsec/ipsec_sqn.h
> +++ b/lib/librte_ipsec/ipsec_sqn.h
> @@ -128,7 +128,8 @@ esn_outb_update_sqn(struct rte_ipsec_sa *sa, uint32_t 
> *num)
> 
>   n = *num;
>   if (SQN_ATOMIC(sa))
> - sqn = (uint64_t)rte_atomic64_add_return(&sa->sqn.outb.atom, n);
> + sqn = __atomic_add_fetch(&sa->sqn.outb.atom, n,
> + __ATOMIC_RELAXED);
>   else {
>   sqn = sa->sqn.outb.raw + n;
>   sa->sqn.outb.raw = sqn;
> diff --git a/lib/librte_ipsec/meson.build b/lib/librte_ipsec/meson.build
> index fc69970..9335f28 100644
> --- a/lib/librte_ipsec/meson.build
> +++ b/lib/librte_ipsec/meson.build
> @@ -6,3 +6,8 @@ sources = files('esp_inb.c', 'esp_outb.c', 'sa.c', 'ses.c', 
> 'ipsec_sad.c')
>  headers = files('rte_ipsec.h', 'rte_ipsec_group.h', 'rte_ipsec_sa.h', 
> 'rte_ipsec_sad.h')
> 
>  deps += ['mbuf', 'net', 'cryptodev', 'security', 'hash']
> +
> +# for clang 32-bit compiles we need libatomic for 64-bit atomic ops
> +if cc.get_id() == 'clang' and dpdk_conf.get('RTE_ARCH_64') == false
> +ext_deps += cc.find_library('atomic')
> +endif
> diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> index d22451b..cab9a2e 100644
> --- a/lib/librte_ipsec/sa.h
> +++ b/lib/librte_ipsec/sa.h
> @@ -120,7 +120,7 @@ struct rte_ipsec_sa {
>*/
>   union {
>   union {
> - rte_atomic64_t atom;
> + uint64_t atom;
>   uint64_t raw;
>   } outb;
>   struct {

Seems  you missed my comments for previous version, so I put here:

If we don't need rte_atomic64 here anymore,
then I think we can collapse the union to just:
uint64_t outb;

Konstantin


[dpdk-dev] [PATCH] doc: refine ethernet and VLAN flow rule items

2020-04-23 Thread Dekel Peled
Specified pattern may be translated in different manner.
For example the pattern "eth / ipv4" can be translated to match
untagged packets only, since the pattern doesn't specify a vlan item.
It can also be translated to match both tagged and untagged packets,
for the same reason.
This patch updates the rte_flow documentation to clearly specify the
required pattern to use.
For example:
To match tagged ipv4 packets, the pattern "eth type is 0x8100 /
vlan / ipv4 / end" should be used.
To match untagged ipv4 packets, the pattern "eth type is 0x0800 /
ipv4 / end" should be used.
To match both tagged and untagged packets, the pattern "eth / end"
should be used.

Signed-off-by: Dekel Peled 
---
 doc/guides/prog_guide/rte_flow.rst | 8 
 lib/librte_ethdev/rte_flow.h   | 9 +
 2 files changed, 17 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index cf4368e..0d1c305 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -905,6 +905,12 @@ so-called layer 2.5 pattern items such as 
``RTE_FLOW_ITEM_TYPE_VLAN``. In
 the latter case, ``type`` refers to that of the outer header, with the inner
 EtherType/TPID provided by the subsequent pattern item. This is the same
 order as on the wire.
+If the ``type`` field contains a TPID value, then only tagged packets will 
match
+the pattern.
+If the ``type`` field contains another EtherType value, then only untagged
+packets will match the pattern.
+If the ``ETH`` item is the only item in the pattern, and the ``type`` field is
+not specified, then both tagged and untagged packets will match the pattern.
 
 - ``dst``: destination MAC.
 - ``src``: source MAC.
@@ -919,6 +925,8 @@ Matches an 802.1Q/ad VLAN tag.
 The corresponding standard outer EtherType (TPID) values are
 ``RTE_ETHER_TYPE_VLAN`` or ``RTE_ETHER_TYPE_QINQ``. It can be overridden by the
 preceding pattern item.
+If a ``VLAN`` item is present in the pattern, then only tagged packets will
+match the pattern.
 
 - ``tci``: tag control information.
 - ``inner_type``: inner EtherType or TPID.
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 132b44e..178e87e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -710,6 +710,13 @@ struct rte_flow_item_raw {
  * the latter case, @p type refers to that of the outer header, with the
  * inner EtherType/TPID provided by the subsequent pattern item. This is the
  * same order as on the wire.
+ * If the @p type field contains a TPID value, then only tagged packets will
+ * match the pattern.
+ * If the @p type field contains another EtherType value, then only untagged
+ * packets will match the pattern.
+ * If the @p ETH item is the only item in the pattern, and the @p type field
+ * is not specified, then both tagged and untagged packets will match the
+ * pattern.
  */
 struct rte_flow_item_eth {
struct rte_ether_addr dst; /**< Destination MAC. */
@@ -734,6 +741,8 @@ struct rte_flow_item_eth {
  * The corresponding standard outer EtherType (TPID) values are
  * RTE_ETHER_TYPE_VLAN or RTE_ETHER_TYPE_QINQ. It can be overridden by
  * the preceding pattern item.
+ * If a @p VLAN item is present in the pattern, then only tagged packets will
+ * match the pattern.
  */
 struct rte_flow_item_vlan {
rte_be16_t tci; /**< Tag control information. */
-- 
1.8.3.1



[dpdk-dev] DPDK techboard minutes for Apr 22nd 2020

2020-04-23 Thread Honnappa Nagarahalli
Meeting notes for the DPDK technical board meeting held on 2020-04-22

Attendees:
- Bruce Richardson
- Ferruh Yigit
- Hemant Agrawal
- Honnappa Nagarahalli (Chair)
- Jerin Jacob
- Kevin Traynor
- Konstantin Ananyev
- Maxime Coquelin
- Olivier Matz
- Stephen Hemminger
- Thomas Monjalon

NOTE: The technical board meetings every second Wednesday on IRC channel  
#dpdk-board, at 3pm UTC. Meetings are public and DPDK community members are 
welcome to attend.

NOTE: Next meeting will be on Wednesday 2020-05-06 @3pm UTC, and will be 
chaired by Jerin

1) Merging app/test-flow-perf in 20.05
- This application will be kept separate from testpmd
- It is approved to be merged in 20.05

2) Timeline to stop using rte_atomicNN_xx, rte_smp_*mb APIs (other barrier 
APIs, rte_*mb, rte_cio_*mb, rte_io_*mb, are allowed to be used)
- The formal deprecation of these APIs, which affects the DPDK 
applications, is postponed to a later point
- It was agreed to stop accepting new code with the above APIs after 
20.05 release
(subject to the availability of the patch for wrappers for C11 
atomic built-ins) - AI, Honnappa
- Add a note in the release notes to indicate the same - AI, Honnappa

3) Wrappers for C11 atomic built-ins will not be available for 20.05. Patches 
using C11 atomic built-ins will be accepted for 20.05
(as there is already lot of code with C11 atomic built-ins)

4) Need volunteers for maintaining C11 code
- This role is to support the maintainers with questions, help debug 
issues
- It would be good to have one maintainer from each supported 
architecture, volunteers are needed
- Honnappa is the maintainer from Arm

5) Ask Windows team to build and test with C11 code enabled in CI (for ex: 
rte_ring) - AI, Thomas

6) Multi-function chaining APIs
- A conclusion could not be reached. The discussion/voting is still 
ongoing. I will post the update once the decision is done.

Thank you,
Honnappa


Re: [dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory

2020-04-23 Thread David Marchand
On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
 wrote:
> > diff --git a/lib/librte_eal/common/eal_common_memory.c 
> > b/lib/librte_eal/common/eal_common_memory.c
> > index cc7d54e0c..2d9564b28 100644
> > --- a/lib/librte_eal/common/eal_common_memory.c
> > +++ b/lib/librte_eal/common/eal_common_memory.c
> > @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, size_t 
> > *size,
> >   after_len = RTE_PTR_DIFF(map_end, aligned_end);
> >   if (after_len > 0)
> >   munmap(aligned_end, after_len);
> > +
> > + /*
> > +  * Exclude this pages from a core dump.
> > +  */
> > + if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> > + RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP 
> > failed: %s\n",
> > + strerror(errno));> +   } else {
> > + /*
> > +  * Exclude this pages from a core dump.
> > +  */
> > + if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> > + RTE_LOG(WARNING, EAL, "Madvise with MADV_DONTDUMP 
> > failed: %s\n",
> > + strerror(errno));
> >   }
> >
> >   return aligned_addr;
> >
>
> For the contents of this patch,

MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
be a MADV_NOCORE option on FreeBSD.
1: 
http://inbox.dpdk.org/dev/cajfav8y9ytt-7njuz+md6u8+3xuqyrgp28kd7jy2923epac...@mail.gmail.com/


>
> Acked-by: Anatoly Burakov 
>
> However, even though this is good to have, after some more thought, i
> believe the fix is incomplete, because this is not the only place we're
> reserving anonymous memory. We're also doing so in
> `eal_memalloc.c:free_seg()`, so an `madvise()` call should also be added
> there.
>
> @David, now that i think of it, the PROT_NONE patch also was incomplete,
> as we only set PROT_NONE to memory that's initially reserved, but not
> when it's unmapped and returned back to the pool of anonymous memory.
> So, eal_memalloc.c should also remap anonymous memory with PROT_NONE.

I can't disagree if you say so :-).

>
> @Li Feng, would you be so kind as to provide a patch replacing PROT_READ
> with PROT_NONE in eal_memalloc.c as well? Thank you very much!
>

Once we have the proper fixes, I'd like to get this Cc: sta...@dpdk.org.
Thanks.


-- 
David Marchand



Re: [dpdk-dev] [PATCH v2] lib/timer: relax barrier for status update

2020-04-23 Thread Honnappa Nagarahalli
Hi Erik,

> Subject: [PATCH v2] lib/timer: relax barrier for status update
> 
> Volatile has no ordering semantics. The rte_timer structure defines timer
> status as a volatile variable and uses the rte_r/wmb barrier to guarantee
> inter-thread visibility.
> 
> This patch optimized the volatile operation with c11 atomic operations and
> one-way barrier to save the performance penalty. According to the
> timer_perf_autotest benchmarking results, this patch can uplift 10%~16%
> timer appending performance, 3%~20% timer resetting performance and 45%
> timer callbacks scheduling performance on aarch64 and no loss in
> performance for x86.
> 
> Suggested-by: Honnappa Nagarahalli 
> Signed-off-by: Phil Yang 
> Reviewed-by: Gavin Hu 
> 
> ---
> This patch depends on patch:
> http://patchwork.dpdk.org/patch/65997/
> 
> v2:
> 1. Changed the memory ordering comment in timer_set_config_state.
> 2. It is still using built-ins as the wrapper functions for C11 built-ins are 
> not
> defined yet.
It is too late to get the wrapper functions done for 20.05. It was decided in 
yesterday's tech board meeting to go ahead with C11 atomic built-ins (since 
there is lot of code in DPDK that uses C11 built-ins). If there are no further 
comments, can you please provide your ack?

> 
>  lib/librte_timer/rte_timer.c | 85 ++---
> ---
>  lib/librte_timer/rte_timer.h |  2 +-
>  2 files changed, 60 insertions(+), 27 deletions(-)
> 
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c index
> 269e921..ba17216 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -10,7 +10,6 @@
>  #include 
>  #include 
> 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -218,7 +217,7 @@ rte_timer_init(struct rte_timer *tim)
> 
>   status.state = RTE_TIMER_STOP;
>   status.owner = RTE_TIMER_NO_OWNER;
> - tim->status.u32 = status.u32;
> + __atomic_store_n(&tim->status.u32, status.u32,
> __ATOMIC_RELAXED);
>  }
> 
>  /*
> @@ -239,9 +238,9 @@ timer_set_config_state(struct rte_timer *tim,
> 
>   /* wait that the timer is in correct status before update,
>* and mark it as being configured */
> - while (success == 0) {
> - prev_status.u32 = tim->status.u32;
> + prev_status.u32 = __atomic_load_n(&tim->status.u32,
> __ATOMIC_RELAXED);
> 
> + while (success == 0) {
>   /* timer is running on another core
>* or ready to run on local core, exit
>*/
> @@ -258,9 +257,15 @@ timer_set_config_state(struct rte_timer *tim,
>* mark it atomically as being configured */
>   status.state = RTE_TIMER_CONFIG;
>   status.owner = (int16_t)lcore_id;
> - success = rte_atomic32_cmpset(&tim->status.u32,
> -   prev_status.u32,
> -   status.u32);
> + /* CONFIG states are acting as locked states. If the
> +  * timer is in CONFIG state, the state cannot be changed
> +  * by other threads. So, we should use ACQUIRE here.
> +  */
> + success = __atomic_compare_exchange_n(&tim->status.u32,
> +   &prev_status.u32,
> +   status.u32, 0,
> +   __ATOMIC_ACQUIRE,
> +   __ATOMIC_RELAXED);
>   }
> 
>   ret_prev_status->u32 = prev_status.u32; @@ -279,20 +284,27 @@
> timer_set_running_state(struct rte_timer *tim)
> 
>   /* wait that the timer is in correct status before update,
>* and mark it as running */
> - while (success == 0) {
> - prev_status.u32 = tim->status.u32;
> + prev_status.u32 = __atomic_load_n(&tim->status.u32,
> __ATOMIC_RELAXED);
> 
> + while (success == 0) {
>   /* timer is not pending anymore */
>   if (prev_status.state != RTE_TIMER_PENDING)
>   return -1;
> 
>   /* here, we know that timer is stopped or pending,
> -  * mark it atomically as being configured */
> +  * mark it atomically as being running
> +  */
>   status.state = RTE_TIMER_RUNNING;
>   status.owner = (int16_t)lcore_id;
> - success = rte_atomic32_cmpset(&tim->status.u32,
> -   prev_status.u32,
> -   status.u32);
> + /* RUNNING states are acting as locked states. If the
> +  * timer is in RUNNING state, the state cannot be changed
> +  * by other threads. So, we should use ACQUIRE here.
> +  */
> + success = __atomic_compare_exchange_n(&tim->status.u32,
> +   &prev_status.u32,
> +   status.u32, 0

Re: [dpdk-dev] [PATCH 0/2] bnxt bug fixes

2020-04-23 Thread Ajit Khaparde
On Thu, Apr 23, 2020 at 7:46 AM Kalesh A P <
kalesh-anakkur.pura...@broadcom.com> wrote:

> From: Kalesh AP 
>
> Please apply.
>
Applied to dpdk-next-net-brcm. Thanks


>
> Kalesh AP (1):
>   net/bnxt: fix to reset VNIC rxq count on VNIC free
>
> Rahul Gupta (1):
>   net/bnxt: fix for memleak during queue restart
>
>  drivers/net/bnxt/bnxt_ethdev.c |  2 ++
>  drivers/net/bnxt/bnxt_hwrm.c   | 12 
>  drivers/net/bnxt/bnxt_rxr.c| 44
> --
>  3 files changed, 27 insertions(+), 31 deletions(-)
>
> --
> 2.10.1
>
>


Re: [dpdk-dev] [PATCH v2] lib/timer: relax barrier for status update

2020-04-23 Thread Carrillo, Erik G
Hi Honnappa,

> -Original Message-
> From: Honnappa Nagarahalli 
> Sent: Thursday, April 23, 2020 3:06 PM
> To: Phil Yang ; Carrillo, Erik G
> ; rsanf...@akamai.com; dev@dpdk.org
> Cc: tho...@monjalon.net; david.march...@redhat.com; Ananyev,
> Konstantin ; jer...@marvell.com;
> hemant.agra...@nxp.com; Gavin Hu ; nd
> ; Honnappa Nagarahalli ;
> nd 
> Subject: RE: [PATCH v2] lib/timer: relax barrier for status update
> 
> Hi Erik,
> 
> > Subject: [PATCH v2] lib/timer: relax barrier for status update
> >
> > Volatile has no ordering semantics. The rte_timer structure defines
> > timer status as a volatile variable and uses the rte_r/wmb barrier to
> > guarantee inter-thread visibility.
> >
> > This patch optimized the volatile operation with c11 atomic operations
> > and one-way barrier to save the performance penalty. According to the
> > timer_perf_autotest benchmarking results, this patch can uplift
> > 10%~16% timer appending performance, 3%~20% timer resetting
> > performance and 45% timer callbacks scheduling performance on aarch64
> > and no loss in performance for x86.
> >
> > Suggested-by: Honnappa Nagarahalli 
> > Signed-off-by: Phil Yang 
> > Reviewed-by: Gavin Hu 
> >
> > ---
> > This patch depends on patch:
> > http://patchwork.dpdk.org/patch/65997/
> >
> > v2:
> > 1. Changed the memory ordering comment in timer_set_config_state.
> > 2. It is still using built-ins as the wrapper functions for C11
> > built-ins are not defined yet.
> It is too late to get the wrapper functions done for 20.05. It was decided in
> yesterday's tech board meeting to go ahead with C11 atomic built-ins (since
> there is lot of code in DPDK that uses C11 built-ins). If there are no further
> comments, can you please provide your ack?
> 

Ok, thanks for letting me know.  Based on that decision,  I've taken another 
look 
and done some testing and it looks good to me.  I've made one comment in-line
below and acked it.

<... snipped ...>

> > @@ -258,9 +257,15 @@ timer_set_config_state(struct rte_timer *tim,
> >  * mark it atomically as being configured */
> > status.state = RTE_TIMER_CONFIG;
> > status.owner = (int16_t)lcore_id;
> > -   success = rte_atomic32_cmpset(&tim->status.u32,
> > - prev_status.u32,
> > - status.u32);
> > +   /* CONFIG states are acting as locked states. If the
> > +* timer is in CONFIG state, the state cannot be changed
> > +* by other threads. So, we should use ACQUIRE here.
> > +*/
> > +   success = __atomic_compare_exchange_n(&tim-
> >status.u32,
> > + &prev_status.u32,
> > + status.u32, 0,
> > + __ATOMIC_ACQUIRE,
> > + __ATOMIC_RELAXED);
> > }
> >
> > ret_prev_status->u32 = prev_status.u32; @@ -279,20 +284,27 @@
> > timer_set_running_state(struct rte_timer *tim)
> >
> > /* wait that the timer is in correct status before update,
> >  * and mark it as running */
> > -   while (success == 0) {
> > -   prev_status.u32 = tim->status.u32;
> > +   prev_status.u32 = __atomic_load_n(&tim->status.u32,
> > __ATOMIC_RELAXED);
> >
> > +   while (success == 0) {
> > /* timer is not pending anymore */
> > if (prev_status.state != RTE_TIMER_PENDING)
> > return -1;
> >
> > /* here, we know that timer is stopped or pending,

We know that the timer will be pending at this point... Since we're correcting 
the comment below, we can correct this part too.

With that change:
Acked-by: Erik Gabriel Carrillo 

> > -* mark it atomically as being configured */
> > +* mark it atomically as being running
> > +*/
> > status.state = RTE_TIMER_RUNNING;
> > status.owner = (int16_t)lcore_id;
> > -   success = rte_atomic32_cmpset(&tim->status.u32,
> > - prev_status.u32,
> > - status.u32);
> > +   /* RUNNING states are acting as locked states. If the
> > +* timer is in RUNNING state, the state cannot be changed
> > +* by other threads. So, we should use ACQUIRE here.
> > +*/
> > +   success = __atomic_compare_exchange_n(&tim-
> >status.u32,
> > + &prev_status.u32,
> > + status.u32, 0,
> > + __ATOMIC_ACQUIRE,
> > + __ATOMIC_RELAXED);
> > }
> >
> > return 0;

Thanks,
Erik


[dpdk-dev] [PATCH v9 0/9] add packed ring vectorized path

2020-04-23 Thread Marvin Liu
This patch set introduced vectorized path for packed ring.

The size of packed ring descriptor is 16Bytes. Four batched descriptors
are just placed into one cacheline. AVX512 instructions can well handle
this kind of data. Packed ring TX path can fully transformed into
vectorized path. Packed ring Rx path can be vectorized when requirements
met(LRO and mergeable disabled).

New option RTE_LIBRTE_VIRTIO_INC_VECTOR will be introduced in this
patch set. This option will unify split and packed ring vectorized
path default setting. Meanwhile user can specify whether enable
vectorized path at runtime by 'vectorized' parameter of virtio user
vdev.

v9:
* replace RTE_LIBRTE_VIRTIO_INC_VECTOR with vectorized devarg
* reorder patch sequence

v8:
* fix meson build error on ubuntu16.04 and suse15

v7:
* default vectorization is disabled
* compilation time check dependency on rte_mbuf structure
* offsets are calcuated when compiling
* remove useless barrier as descs are batched store&load
* vindex of scatter is directly set
* some comments updates
* enable vectorized path in meson build

v6:
* fix issue when size not power of 2

v5:
* remove cpuflags definition as required extensions always come with
  AVX512F on x86_64
* inorder actions should depend on feature bit
* check ring type in rx queue setup
* rewrite some commit logs
* fix some checkpatch warnings

v4:
* rename 'packed_vec' to 'vectorized', also used in split ring
* add RTE_LIBRTE_VIRTIO_INC_VECTOR config for virtio ethdev
* check required AVX512 extensions cpuflags
* combine split and packed ring datapath selection logic
* remove limitation that size must power of two
* clear 12Bytes virtio_net_hdr

v3:
* remove virtio_net_hdr array for better performance
* disable 'packed_vec' by default

v2:
* more function blocks replaced by vector instructions
* clean virtio_net_hdr by vector instruction
* allow header room size change
* add 'packed_vec' option in virtio_user vdev 
* fix build not check whether AVX512 enabled
* doc update

Tested-by: Wang, Yinan 

Marvin Liu (9):
  net/virtio: add Rx free threshold setting
  net/virtio: inorder should depend on feature bit
  net/virtio: add vectorized devarg
  net/virtio-user: add vectorized devarg
  net/virtio: add vectorized packed ring Rx path
  net/virtio: reuse packed ring xmit functions
  net/virtio: add vectorized packed ring Tx path
  net/virtio: add election for vectorized path
  doc: add packed vectorized path

 doc/guides/nics/virtio.rst  |  52 +-
 drivers/net/virtio/Makefile |  35 ++
 drivers/net/virtio/meson.build  |  14 +
 drivers/net/virtio/virtio_ethdev.c  | 136 +++-
 drivers/net/virtio/virtio_ethdev.h  |   6 +
 drivers/net/virtio/virtio_pci.h |   3 +-
 drivers/net/virtio/virtio_rxtx.c| 212 ++-
 drivers/net/virtio/virtio_rxtx_packed_avx.c | 665 
 drivers/net/virtio/virtio_user_ethdev.c |  32 +-
 drivers/net/virtio/virtqueue.c  |   7 +-
 drivers/net/virtio/virtqueue.h  | 168 -
 11 files changed, 1112 insertions(+), 218 deletions(-)
 create mode 100644 drivers/net/virtio/virtio_rxtx_packed_avx.c

-- 
2.17.1



[dpdk-dev] [PATCH v9 2/9] net/virtio: inorder should depend on feature bit

2020-04-23 Thread Marvin Liu
Ring initialization is different when inorder feature negotiated. This
action should dependent on negotiated feature bits.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 94ba7a3ec..e450477e8 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -989,6 +989,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, 
uint16_t queue_idx)
struct rte_mbuf *m;
uint16_t desc_idx;
int error, nbufs, i;
+   bool in_order = vtpci_with_feature(hw, VIRTIO_F_IN_ORDER);
 
PMD_INIT_FUNC_TRACE();
 
@@ -1018,7 +1019,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, 
uint16_t queue_idx)
virtio_rxq_rearm_vec(rxvq);
nbufs += RTE_VIRTIO_VPMD_RX_REARM_THRESH;
}
-   } else if (hw->use_inorder_rx) {
+   } else if (!vtpci_packed_queue(vq->hw) && in_order) {
if ((!virtqueue_full(vq))) {
uint16_t free_cnt = vq->vq_free_cnt;
struct rte_mbuf *pkts[free_cnt];
@@ -1133,7 +1134,7 @@ virtio_dev_tx_queue_setup_finish(struct rte_eth_dev *dev,
PMD_INIT_FUNC_TRACE();
 
if (!vtpci_packed_queue(hw)) {
-   if (hw->use_inorder_tx)
+   if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER))
vq->vq_split.ring.desc[vq->vq_nentries - 1].next = 0;
}
 
@@ -2046,7 +2047,7 @@ virtio_xmit_pkts_packed(void *tx_queue, struct rte_mbuf 
**tx_pkts,
struct virtio_hw *hw = vq->hw;
uint16_t hdr_size = hw->vtnet_hdr_size;
uint16_t nb_tx = 0;
-   bool in_order = hw->use_inorder_tx;
+   bool in_order = vtpci_with_feature(hw, VIRTIO_F_IN_ORDER);
 
if (unlikely(hw->started == 0 && tx_pkts != hw->inject_pkts))
return nb_tx;
-- 
2.17.1



[dpdk-dev] [PATCH v9 1/9] net/virtio: add Rx free threshold setting

2020-04-23 Thread Marvin Liu
Introduce free threshold setting in Rx queue, its default value is 32.
Limit the threshold size to multiple of four as only vectorized packed
Rx function will utilize it. Virtio driver will rearm Rx queue when
more than rx_free_thresh descs were dequeued.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 060410577..94ba7a3ec 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -936,6 +936,7 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = hw->vqs[vtpci_queue_idx];
struct virtnet_rx *rxvq;
+   uint16_t rx_free_thresh;
 
PMD_INIT_FUNC_TRACE();
 
@@ -944,6 +945,28 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev,
return -EINVAL;
}
 
+   rx_free_thresh = rx_conf->rx_free_thresh;
+   if (rx_free_thresh == 0)
+   rx_free_thresh =
+   RTE_MIN(vq->vq_nentries / 4, DEFAULT_RX_FREE_THRESH);
+
+   if (rx_free_thresh & 0x3) {
+   RTE_LOG(ERR, PMD, "rx_free_thresh must be multiples of four."
+   " (rx_free_thresh=%u port=%u queue=%u)\n",
+   rx_free_thresh, dev->data->port_id, queue_idx);
+   return -EINVAL;
+   }
+
+   if (rx_free_thresh >= vq->vq_nentries) {
+   RTE_LOG(ERR, PMD, "rx_free_thresh must be less than the "
+   "number of RX entries (%u)."
+   " (rx_free_thresh=%u port=%u queue=%u)\n",
+   vq->vq_nentries,
+   rx_free_thresh, dev->data->port_id, queue_idx);
+   return -EINVAL;
+   }
+   vq->vq_free_thresh = rx_free_thresh;
+
if (nb_desc == 0 || nb_desc > vq->vq_nentries)
nb_desc = vq->vq_nentries;
vq->vq_free_cnt = RTE_MIN(vq->vq_free_cnt, nb_desc);
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 58ad7309a..6301c56b2 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -18,6 +18,8 @@
 
 struct rte_mbuf;
 
+#define DEFAULT_RX_FREE_THRESH 32
+
 /*
  * Per virtio_ring.h in Linux.
  * For virtio_pci on SMP, we don't need to order with respect to MMIO
-- 
2.17.1



[dpdk-dev] [PATCH v9 5/9] net/virtio: add vectorized packed ring Rx path

2020-04-23 Thread Marvin Liu
Optimize packed ring Rx path with SIMD instructions. Solution of
optimization is pretty like vhost, is that split path into batch and
single functions. Batch function is further optimized by AVX512
instructions. Also pad desc extra structure to 16 bytes aligned, thus
four elements will be saved in one batch.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile
index c9edb84ee..102b1deab 100644
--- a/drivers/net/virtio/Makefile
+++ b/drivers/net/virtio/Makefile
@@ -36,6 +36,41 @@ else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) 
$(CONFIG_RTE_ARCH_ARM64)),)
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c
 endif
 
+ifneq ($(FORCE_DISABLE_AVX512), y)
+   CC_AVX512_SUPPORT=\
+   $(shell $(CC) -march=native -dM -E - &1 | \
+   sed '/./{H;$$!d} ; x ; /AVX512F/!d; /AVX512BW/!d; /AVX512VL/!d' | \
+   grep -q AVX512 && echo 1)
+endif
+
+ifeq ($(CC_AVX512_SUPPORT), 1)
+CFLAGS += -DCC_AVX512_SUPPORT
+SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_packed_avx.c
+
+ifeq ($(RTE_TOOLCHAIN), gcc)
+ifeq ($(shell test $(GCC_VERSION) -ge 83 && echo 1), 1)
+CFLAGS += -DVIRTIO_GCC_UNROLL_PRAGMA
+endif
+endif
+
+ifeq ($(RTE_TOOLCHAIN), clang)
+ifeq ($(shell test $(CLANG_MAJOR_VERSION)$(CLANG_MINOR_VERSION) -ge 37 && echo 
1), 1)
+CFLAGS += -DVIRTIO_CLANG_UNROLL_PRAGMA
+endif
+endif
+
+ifeq ($(RTE_TOOLCHAIN), icc)
+ifeq ($(shell test $(ICC_MAJOR_VERSION) -ge 16 && echo 1), 1)
+CFLAGS += -DVIRTIO_ICC_UNROLL_PRAGMA
+endif
+endif
+
+CFLAGS_virtio_rxtx_packed_avx.o += -mavx512f -mavx512bw -mavx512vl
+ifeq ($(shell test $(GCC_VERSION) -ge 100 && echo 1), 1)
+CFLAGS_virtio_rxtx_packed_avx.o += -Wno-zero-length-bounds
+endif
+endif
+
 ifeq ($(CONFIG_RTE_VIRTIO_USER),y)
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c
 SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_kernel.c
diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
index 15150eea1..8e68c3039 100644
--- a/drivers/net/virtio/meson.build
+++ b/drivers/net/virtio/meson.build
@@ -9,6 +9,20 @@ sources += files('virtio_ethdev.c',
 deps += ['kvargs', 'bus_pci']
 
 if arch_subdir == 'x86'
+   if '-mno-avx512f' not in machine_args
+   if cc.has_argument('-mavx512f') and 
cc.has_argument('-mavx512vl') and cc.has_argument('-mavx512bw')
+   cflags += ['-mavx512f', '-mavx512bw', '-mavx512vl']
+   cflags += ['-DCC_AVX512_SUPPORT']
+   if (toolchain == 'gcc' and 
cc.version().version_compare('>=8.3.0'))
+   cflags += '-DVHOST_GCC_UNROLL_PRAGMA'
+   elif (toolchain == 'clang' and 
cc.version().version_compare('>=3.7.0'))
+   cflags += '-DVHOST_CLANG_UNROLL_PRAGMA'
+   elif (toolchain == 'icc' and 
cc.version().version_compare('>=16.0.0'))
+   cflags += '-DVHOST_ICC_UNROLL_PRAGMA'
+   endif
+   sources += files('virtio_rxtx_packed_avx.c')
+   endif
+   endif
sources += files('virtio_rxtx_simple_sse.c')
 elif arch_subdir == 'ppc'
sources += files('virtio_rxtx_simple_altivec.c')
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index febaf17a8..5c112cac7 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -105,6 +105,9 @@ uint16_t virtio_xmit_pkts_inorder(void *tx_queue, struct 
rte_mbuf **tx_pkts,
 uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
 
+uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts);
+
 int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 
 void virtio_interrupt_handler(void *param);
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 84f4cf946..c9b6e7844 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -2329,3 +2329,11 @@ virtio_xmit_pkts_inorder(void *tx_queue,
 
return nb_tx;
 }
+
+__rte_weak uint16_t
+virtio_recv_pkts_packed_vec(void *rx_queue __rte_unused,
+   struct rte_mbuf **rx_pkts __rte_unused,
+   uint16_t nb_pkts __rte_unused)
+{
+   return 0;
+}
diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c 
b/drivers/net/virtio/virtio_rxtx_packed_avx.c
new file mode 100644
index 0..8a7b459eb
--- /dev/null
+++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c
@@ -0,0 +1,374 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "virtio_logs.h"
+#include "virtio_ethdev.h"
+#include "virtio_pci.h"
+#include "virtqueue.h"
+
+#define BYTE_SIZE 8
+/* flag bits offset in packed ring desc higher 64bits */
+#define FLAGS_BITS_OFF

[dpdk-dev] [PATCH v9 7/9] net/virtio: add vectorized packed ring Tx path

2020-04-23 Thread Marvin Liu
Optimize packed ring Tx path alike Rx path. Split Tx path into batch and
single Tx functions. Batch function is further optimized by AVX512
instructions.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index 5c112cac7..b7d52d497 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -108,6 +108,9 @@ uint16_t virtio_recv_pkts_vec(void *rx_queue, struct 
rte_mbuf **rx_pkts,
 uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
 
+uint16_t virtio_xmit_pkts_packed_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 
 void virtio_interrupt_handler(void *param);
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index cf18fe564..f82fe8d64 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -2175,3 +2175,11 @@ virtio_recv_pkts_packed_vec(void *rx_queue __rte_unused,
 {
return 0;
 }
+
+__rte_weak uint16_t
+virtio_xmit_pkts_packed_vec(void *tx_queue __rte_unused,
+   struct rte_mbuf **tx_pkts __rte_unused,
+   uint16_t nb_pkts __rte_unused)
+{
+   return 0;
+}
diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c 
b/drivers/net/virtio/virtio_rxtx_packed_avx.c
index 8a7b459eb..c023ace4e 100644
--- a/drivers/net/virtio/virtio_rxtx_packed_avx.c
+++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c
@@ -23,6 +23,24 @@
 #define PACKED_FLAGS_MASK ((0ULL | VRING_PACKED_DESC_F_AVAIL_USED) << \
FLAGS_BITS_OFFSET)
 
+/* reference count offset in mbuf rearm data */
+#define REFCNT_BITS_OFFSET ((offsetof(struct rte_mbuf, refcnt) - \
+   offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
+/* segment number offset in mbuf rearm data */
+#define SEG_NUM_BITS_OFFSET ((offsetof(struct rte_mbuf, nb_segs) - \
+   offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE)
+
+/* default rearm data */
+#define DEFAULT_REARM_DATA (1ULL << SEG_NUM_BITS_OFFSET | \
+   1ULL << REFCNT_BITS_OFFSET)
+
+/* id bits offset in packed ring desc higher 64bits */
+#define ID_BITS_OFFSET ((offsetof(struct vring_packed_desc, id) - \
+   offsetof(struct vring_packed_desc, len)) * BYTE_SIZE)
+
+/* net hdr short size mask */
+#define NET_HDR_MASK 0x3F
+
 #define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \
sizeof(struct vring_packed_desc))
 #define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1)
@@ -47,6 +65,48 @@
for (iter = val; iter < num; iter++)
 #endif
 
+static inline void
+virtio_xmit_cleanup_packed_vec(struct virtqueue *vq)
+{
+   struct vring_packed_desc *desc = vq->vq_packed.ring.desc;
+   struct vq_desc_extra *dxp;
+   uint16_t used_idx, id, curr_id, free_cnt = 0;
+   uint16_t size = vq->vq_nentries;
+   struct rte_mbuf *mbufs[size];
+   uint16_t nb_mbuf = 0, i;
+
+   used_idx = vq->vq_used_cons_idx;
+
+   if (!desc_is_used(&desc[used_idx], vq))
+   return;
+
+   id = desc[used_idx].id;
+
+   do {
+   curr_id = used_idx;
+   dxp = &vq->vq_descx[used_idx];
+   used_idx += dxp->ndescs;
+   free_cnt += dxp->ndescs;
+
+   if (dxp->cookie != NULL) {
+   mbufs[nb_mbuf] = dxp->cookie;
+   dxp->cookie = NULL;
+   nb_mbuf++;
+   }
+
+   if (used_idx >= size) {
+   used_idx -= size;
+   vq->vq_packed.used_wrap_counter ^= 1;
+   }
+   } while (curr_id != id);
+
+   for (i = 0; i < nb_mbuf; i++)
+   rte_pktmbuf_free(mbufs[i]);
+
+   vq->vq_used_cons_idx = used_idx;
+   vq->vq_free_cnt += free_cnt;
+}
+
 static inline void
 virtio_update_batch_stats(struct virtnet_stats *stats,
  uint16_t pkt_len1,
@@ -60,6 +120,237 @@ virtio_update_batch_stats(struct virtnet_stats *stats,
stats->bytes += pkt_len4;
 }
 
+static inline int
+virtqueue_enqueue_batch_packed_vec(struct virtnet_tx *txvq,
+  struct rte_mbuf **tx_pkts)
+{
+   struct virtqueue *vq = txvq->vq;
+   uint16_t head_size = vq->hw->vtnet_hdr_size;
+   uint16_t idx = vq->vq_avail_idx;
+   struct virtio_net_hdr *hdr;
+   uint16_t i, cmp;
+
+   if (vq->vq_avail_idx & PACKED_BATCH_MASK)
+   return -1;
+
+   if (unlikely((idx + PACKED_BATCH_SIZE) > vq->vq_nentries))
+   return -1;
+
+   /* Load four mbufs rearm data */
+   RTE_BUILD_BUG_ON(REFCNT_BITS_OFFSET >= 64);
+   RTE_BUILD_BUG_ON(SEG_NUM_BITS_OFFSET >= 64);
+   __m256i mbufs = _mm256_set_epi64x(*tx_pkts[3]->rearm_data,
+ *tx_pkts[2]->rearm_data,
+ *tx_pkts[1]->rearm_data,
+ 

[dpdk-dev] [PATCH v9 4/9] net/virtio-user: add vectorized devarg

2020-04-23 Thread Marvin Liu
Add new devarg for virtio user device vectorized path selection. By
default vectorized path is disabled.

Signed-off-by: Marvin Liu 

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 902a1f0cf..d59add23e 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -424,6 +424,12 @@ Below devargs are supported by the virtio-user vdev:
 rte_eth_link_get_nowait function.
 (Default: 1 (10G))
 
+#.  ``vectorized``:
+
+It is used to specify whether virtio device perfer to use vectorized path.
+Afterwards, dependencies of vectorized path will be checked in path
+election.
+(Default: 0 (disabled))
 
 Virtio paths Selection and Usage
 
diff --git a/drivers/net/virtio/virtio_user_ethdev.c 
b/drivers/net/virtio/virtio_user_ethdev.c
index 150a8d987..40ad786cc 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -452,6 +452,8 @@ static const char *valid_args[] = {
VIRTIO_USER_ARG_PACKED_VQ,
 #define VIRTIO_USER_ARG_SPEED  "speed"
VIRTIO_USER_ARG_SPEED,
+#define VIRTIO_USER_ARG_VECTORIZED "vectorized"
+   VIRTIO_USER_ARG_VECTORIZED,
NULL
 };
 
@@ -559,6 +561,7 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
uint64_t mrg_rxbuf = 1;
uint64_t in_order = 1;
uint64_t packed_vq = 0;
+   uint64_t vectorized = 0;
char *path = NULL;
char *ifname = NULL;
char *mac_addr = NULL;
@@ -675,6 +678,15 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
}
}
 
+   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_VECTORIZED) == 1) {
+   if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_VECTORIZED,
+  &get_integer_arg, &vectorized) < 0) {
+   PMD_INIT_LOG(ERR, "error to parse %s",
+VIRTIO_USER_ARG_VECTORIZED);
+   goto end;
+   }
+   }
+
if (queues > 1 && cq == 0) {
PMD_INIT_LOG(ERR, "multi-q requires ctrl-q");
goto end;
@@ -727,6 +739,9 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
goto end;
}
 
+   if (vectorized)
+   hw->use_vec_rx = 1;
+
rte_eth_dev_probing_finish(eth_dev);
ret = 0;
 
@@ -785,4 +800,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_virtio_user,
"mrg_rxbuf=<0|1> "
"in_order=<0|1> "
"packed_vq=<0|1> "
-   "speed=");
+   "speed= "
+   "vectorized=<0|1>");
-- 
2.17.1



[dpdk-dev] [PATCH v9 6/9] net/virtio: reuse packed ring xmit functions

2020-04-23 Thread Marvin Liu
Move xmit offload and packed ring xmit enqueue function to header file.
These functions will be reused by packed ring vectorized Tx function.

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index c9b6e7844..cf18fe564 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -264,10 +264,6 @@ virtqueue_dequeue_rx_inorder(struct virtqueue *vq,
return i;
 }
 
-#ifndef DEFAULT_TX_FREE_THRESH
-#define DEFAULT_TX_FREE_THRESH 32
-#endif
-
 static void
 virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int num)
 {
@@ -562,68 +558,7 @@ virtio_tso_fix_cksum(struct rte_mbuf *m)
 }
 
 
-/* avoid write operation when necessary, to lessen cache issues */
-#define ASSIGN_UNLESS_EQUAL(var, val) do { \
-   if ((var) != (val)) \
-   (var) = (val);  \
-} while (0)
-
-#define virtqueue_clear_net_hdr(_hdr) do { \
-   ASSIGN_UNLESS_EQUAL((_hdr)->csum_start, 0); \
-   ASSIGN_UNLESS_EQUAL((_hdr)->csum_offset, 0);\
-   ASSIGN_UNLESS_EQUAL((_hdr)->flags, 0);  \
-   ASSIGN_UNLESS_EQUAL((_hdr)->gso_type, 0);   \
-   ASSIGN_UNLESS_EQUAL((_hdr)->gso_size, 0);   \
-   ASSIGN_UNLESS_EQUAL((_hdr)->hdr_len, 0);\
-} while (0)
-
-static inline void
-virtqueue_xmit_offload(struct virtio_net_hdr *hdr,
-   struct rte_mbuf *cookie,
-   bool offload)
-{
-   if (offload) {
-   if (cookie->ol_flags & PKT_TX_TCP_SEG)
-   cookie->ol_flags |= PKT_TX_TCP_CKSUM;
-
-   switch (cookie->ol_flags & PKT_TX_L4_MASK) {
-   case PKT_TX_UDP_CKSUM:
-   hdr->csum_start = cookie->l2_len + cookie->l3_len;
-   hdr->csum_offset = offsetof(struct rte_udp_hdr,
-   dgram_cksum);
-   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-   break;
-
-   case PKT_TX_TCP_CKSUM:
-   hdr->csum_start = cookie->l2_len + cookie->l3_len;
-   hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum);
-   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-   break;
-
-   default:
-   ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
-   break;
-   }
 
-   /* TCP Segmentation Offload */
-   if (cookie->ol_flags & PKT_TX_TCP_SEG) {
-   hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
-   VIRTIO_NET_HDR_GSO_TCPV6 :
-   VIRTIO_NET_HDR_GSO_TCPV4;
-   hdr->gso_size = cookie->tso_segsz;
-   hdr->hdr_len =
-   cookie->l2_len +
-   cookie->l3_len +
-   cookie->l4_len;
-   } else {
-   ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
-   }
-   }
-}
 
 static inline void
 virtqueue_enqueue_xmit_inorder(struct virtnet_tx *txvq,
@@ -725,102 +660,6 @@ virtqueue_enqueue_xmit_packed_fast(struct virtnet_tx 
*txvq,
virtqueue_store_flags_packed(dp, flags, vq->hw->weak_barriers);
 }
 
-static inline void
-virtqueue_enqueue_xmit_packed(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
- uint16_t needed, int can_push, int in_order)
-{
-   struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr;
-   struct vq_desc_extra *dxp;
-   struct virtqueue *vq = txvq->vq;
-   struct vring_packed_desc *start_dp, *head_dp;
-   uint16_t idx, id, head_idx, head_flags;
-   int16_t head_size = vq->hw->vtnet_hdr_size;
-   struct virtio_net_hdr *hdr;
-   uint16_t prev;
-   bool prepend_header = false;
-
-   id = in_order ? vq->vq_avail_idx : vq->vq_desc_head_idx;
-
-   dxp = &vq->vq_descx[id];
-   dxp->ndescs = needed;
-   dxp->cookie = cookie;
-
-   head_idx = vq->vq_avail_idx;
-   idx = head_idx;
-   prev = head_idx;
-   start_dp = vq->vq_packed.ring.desc;
-
-   head_dp = &vq->vq_packed.ring.desc[idx];
-   head_flags = cookie->next ? VRING_DESC_F_NEXT : 0;
-   head_flags |= vq->vq_packed.cached_flags;
-
-   if (can_push) {
-   /* prepend cannot fail, checked by caller */
-   hdr = rte_pktmbuf_mtod_offset(cookie, struct virtio_net_hdr *,
- -head_size);
-   prepend_header = true;
-
-   /* if offload disabled, it is not zeroed below, do it now */
- 

[dpdk-dev] [PATCH v9 3/9] net/virtio: add vectorized devarg

2020-04-23 Thread Marvin Liu
Previously, virtio split ring vectorized path was enabled by default.
This is not suitable for everyone because that path dose not follow
virtio spec. Add new devarg for virtio vectorized path selection. By
default vectorized path is disabled.

Signed-off-by: Marvin Liu 

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 6286286db..902a1f0cf 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -363,6 +363,13 @@ Below devargs are supported by the PCI virtio driver:
 rte_eth_link_get_nowait function.
 (Default: 1 (10G))
 
+#.  ``vectorized``:
+
+It is used to specify whether virtio device perfer to use vectorized path.
+Afterwards, dependencies of vectorized path will be checked in path
+election.
+(Default: 0 (disabled))
+
 Below devargs are supported by the virtio-user vdev:
 
 #.  ``path``:
diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 37766cbb6..0a69a4db1 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -48,7 +48,8 @@ static int virtio_dev_allmulticast_disable(struct rte_eth_dev 
*dev);
 static uint32_t virtio_dev_speed_capa_get(uint32_t speed);
 static int virtio_dev_devargs_parse(struct rte_devargs *devargs,
int *vdpa,
-   uint32_t *speed);
+   uint32_t *speed,
+   int *vectorized);
 static int virtio_dev_info_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
 static int virtio_dev_link_update(struct rte_eth_dev *dev,
@@ -1551,8 +1552,8 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
}
} else {
-   if (hw->use_simple_rx) {
-   PMD_INIT_LOG(INFO, "virtio: using simple Rx path on 
port %u",
+   if (hw->use_vec_rx) {
+   PMD_INIT_LOG(INFO, "virtio: using vectorized Rx path on 
port %u",
eth_dev->data->port_id);
eth_dev->rx_pkt_burst = virtio_recv_pkts_vec;
} else if (hw->use_inorder_rx) {
@@ -1886,6 +1887,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
uint32_t speed = SPEED_UNKNOWN;
+   int vectorized = 0;
int ret;
 
if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) {
@@ -1912,7 +1914,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
return 0;
}
ret = virtio_dev_devargs_parse(eth_dev->device->devargs,
-NULL, &speed);
+NULL, &speed, &vectorized);
if (ret < 0)
return ret;
hw->speed = speed;
@@ -1949,6 +1951,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
if (ret < 0)
goto err_virtio_init;
 
+   if (vectorized) {
+   if (!vtpci_packed_queue(hw))
+   hw->use_vec_rx = 1;
+   }
+
hw->opened = true;
 
return 0;
@@ -2021,9 +2028,20 @@ virtio_dev_speed_capa_get(uint32_t speed)
}
 }
 
+static int vectorized_check_handler(__rte_unused const char *key,
+   const char *value, void *ret_val)
+{
+   if (strcmp(value, "1") == 0)
+   *(int *)ret_val = 1;
+   else
+   *(int *)ret_val = 0;
+
+   return 0;
+}
 
 #define VIRTIO_ARG_SPEED  "speed"
 #define VIRTIO_ARG_VDPA   "vdpa"
+#define VIRTIO_ARG_VECTORIZED "vectorized"
 
 
 static int
@@ -2045,7 +2063,7 @@ link_speed_handler(const char *key __rte_unused,
 
 static int
 virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa,
-   uint32_t *speed)
+   uint32_t *speed, int *vectorized)
 {
struct rte_kvargs *kvlist;
int ret = 0;
@@ -2081,6 +2099,18 @@ virtio_dev_devargs_parse(struct rte_devargs *devargs, 
int *vdpa,
}
}
 
+   if (vectorized &&
+   rte_kvargs_count(kvlist, VIRTIO_ARG_VECTORIZED) == 1) {
+   ret = rte_kvargs_process(kvlist,
+   VIRTIO_ARG_VECTORIZED,
+   vectorized_check_handler, vectorized);
+   if (ret < 0) {
+   PMD_INIT_LOG(ERR, "Failed to parse %s",
+   VIRTIO_ARG_VECTORIZED);
+   goto exit;
+   }
+   }
+
 exit:
rte_kvargs_free(kvlist);
return ret;
@@ -2092,7 +2122,8 @@ static int eth_virtio_pci_probe(struct rte_pci_driver 
*pci_drv __rte_unused,
int vdpa = 0;
int ret = 0;
 
-   ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa, NULL);
+   ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa, NULL,
+   NULL);
if (ret < 0) {
PMD_INIT_LOG(ERR, "devargs parsing is failed");
return ret;
@@ -2257

[dpdk-dev] [PATCH v9 9/9] doc: add packed vectorized path

2020-04-23 Thread Marvin Liu
Document packed virtqueue vectorized path selection logic in virtio net
PMD.

Signed-off-by: Marvin Liu 

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index d59add23e..dbcf49ae1 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -482,6 +482,13 @@ according to below configuration:
both negotiated, this path will be selected.
 #. Packed virtqueue in-order non-mergeable path: If in-order feature is 
negotiated and
Rx mergeable is not negotiated, this path will be selected.
+#. Packed virtqueue vectorized Rx path: If building and running environment 
support
+   AVX512 && in-order feature is negotiated && Rx mergeable is not negotiated 
&&
+   TCP_LRO Rx offloading is disabled && vectorized option enabled,
+   this path will be selected.
+#. Packed virtqueue vectorized Tx path: If building and running environment 
support
+   AVX512 && in-order feature is negotiated && vectorized option enabled,
+   this path will be selected.
 
 Rx/Tx callbacks of each Virtio path
 ~~~
@@ -504,6 +511,8 @@ are shown in below table:
Packed virtqueue non-meregable path  virtio_recv_pkts_packed
   virtio_xmit_pkts_packed
Packed virtqueue in-order mergeable path 
virtio_recv_mergeable_pkts_packed virtio_xmit_pkts_packed
Packed virtqueue in-order non-mergeable path virtio_recv_pkts_packed
   virtio_xmit_pkts_packed
+   Packed virtqueue vectorized Rx path  virtio_recv_pkts_packed_vec
   virtio_xmit_pkts_packed
+   Packed virtqueue vectorized Tx path  virtio_recv_pkts_packed
   virtio_xmit_pkts_packed_vec
 
= 
 
 Virtio paths Support Status from Release to Release
@@ -521,20 +530,22 @@ All virtio paths support status are shown in below table:
 
 .. table:: Virtio Paths and Releases
 
-    = = 
=
-  Virtio paths  16.11 ~ 18.05 18.08 ~ 18.11 
19.02 ~ 19.11
-    = = 
=
-   Split virtqueue mergeable path Y Y  
   Y
-   Split virtqueue non-mergeable path Y Y  
   Y
-   Split virtqueue vectorized Rx path Y Y  
   Y
-   Split virtqueue simple Tx path Y N  
   N
-   Split virtqueue in-order mergeable path  Y  
   Y
-   Split virtqueue in-order non-mergeable path  Y  
   Y
-   Packed virtqueue mergeable path 
   Y
-   Packed virtqueue non-mergeable path 
   Y
-   Packed virtqueue in-order mergeable path
   Y
-   Packed virtqueue in-order non-mergeable path
   Y
-    = = 
=
+    = = 
= ===
+  Virtio paths  16.11 ~ 18.05 18.08 ~ 18.11 
19.02 ~ 19.11 20.05 ~
+    = = 
= ===
+   Split virtqueue mergeable path Y Y  
   Y  Y
+   Split virtqueue non-mergeable path Y Y  
   Y  Y
+   Split virtqueue vectorized Rx path Y Y  
   Y  Y
+   Split virtqueue simple Tx path Y N  
   N  N
+   Split virtqueue in-order mergeable path  Y  
   Y  Y
+   Split virtqueue in-order non-mergeable path  Y  
   Y  Y
+   Packed virtqueue mergeable path 
   Y  Y
+   Packed virtqueue non-mergeable path 
   Y  Y
+   Packed virtqueue in-order mergeable path
   Y  Y
+   Packed virtqueue in-order non-mergeable path
   Y  Y
+   Packed virtqueue vectorized Rx path 
  Y
+   Packed virtqueue vectorized Tx path 
  Y
+    = = 
= ===
 
 QEMU Support Status
 ~~~
-- 
2.17.1



[dpdk-dev] [PATCH v9 8/9] net/virtio: add election for vectorized path

2020-04-23 Thread Marvin Liu
Rewrite vectorized path selection logic. Default setting comes from
vectorized devarg, then checks each criteria.

Packed ring vectorized path need:
AVX512F and required extensions are supported by compiler and host
VERSION_1 and IN_ORDER features are negotiated
mergeable feature is not negotiated
LRO offloading is disabled

Split ring vectorized rx path need:
mergeable and IN_ORDER features are not negotiated
LRO, chksum and vlan strip offloadings are disabled

Signed-off-by: Marvin Liu 

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 0a69a4db1..8a9545dd8 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1523,9 +1523,12 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
if (vtpci_packed_queue(hw)) {
PMD_INIT_LOG(INFO,
"virtio: using packed ring %s Tx path on port %u",
-   hw->use_inorder_tx ? "inorder" : "standard",
+   hw->use_vec_tx ? "vectorized" : "standard",
eth_dev->data->port_id);
-   eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed;
+   if (hw->use_vec_tx)
+   eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed_vec;
+   else
+   eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed;
} else {
if (hw->use_inorder_tx) {
PMD_INIT_LOG(INFO, "virtio: using inorder Tx path on 
port %u",
@@ -1539,7 +1542,13 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
}
 
if (vtpci_packed_queue(hw)) {
-   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
+   if (hw->use_vec_rx) {
+   PMD_INIT_LOG(INFO,
+   "virtio: using packed ring vectorized Rx path 
on port %u",
+   eth_dev->data->port_id);
+   eth_dev->rx_pkt_burst =
+   &virtio_recv_pkts_packed_vec;
+   } else if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
PMD_INIT_LOG(INFO,
"virtio: using packed ring mergeable buffer Rx 
path on port %u",
eth_dev->data->port_id);
@@ -1952,8 +1961,17 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
goto err_virtio_init;
 
if (vectorized) {
-   if (!vtpci_packed_queue(hw))
+   if (!vtpci_packed_queue(hw)) {
+   hw->use_vec_rx = 1;
+   } else {
+#if !defined(CC_AVX512_SUPPORT)
+   PMD_DRV_LOG(INFO,
+   "building environment do not support packed 
ring vectorized");
+#else
hw->use_vec_rx = 1;
+   hw->use_vec_tx = 1;
+#endif
+   }
}
 
hw->opened = true;
@@ -2099,11 +2117,10 @@ virtio_dev_devargs_parse(struct rte_devargs *devargs, 
int *vdpa,
}
}
 
-   if (vectorized &&
-   rte_kvargs_count(kvlist, VIRTIO_ARG_VECTORIZED) == 1) {
+   if (vectorized && rte_kvargs_count(kvlist, VIRTIO_ARG_VECTORIZED) == 1) 
{
ret = rte_kvargs_process(kvlist,
-   VIRTIO_ARG_VECTORIZED,
-   vectorized_check_handler, vectorized);
+   VIRTIO_ARG_VECTORIZED,
+   vectorized_check_handler, vectorized);
if (ret < 0) {
PMD_INIT_LOG(ERR, "Failed to parse %s",
VIRTIO_ARG_VECTORIZED);
@@ -2288,31 +2305,61 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return -EBUSY;
}
 
-   if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) {
-   hw->use_inorder_tx = 1;
-   hw->use_inorder_rx = 1;
-   hw->use_vec_rx = 0;
-   }
-
if (vtpci_packed_queue(hw)) {
-   hw->use_vec_rx = 0;
-   hw->use_inorder_rx = 0;
-   }
+   if ((hw->use_vec_rx || hw->use_vec_tx) &&
+   (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) ||
+!vtpci_with_feature(hw, VIRTIO_F_IN_ORDER) ||
+!vtpci_with_feature(hw, VIRTIO_F_VERSION_1))) {
+   PMD_DRV_LOG(INFO,
+   "disabled packed ring vectorized path for 
requirements not met");
+   hw->use_vec_rx = 0;
+   hw->use_vec_tx = 0;
+   }
 
+   if (hw->use_vec_rx) {
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
+   PMD_DRV_LOG(INFO,
+   "disabled packed ring vectorized rx for 
mrg_rxbuf enabled");
+

  1   2   >