Re: [dpdk-dev] [PATCH v3 1/3] examples/ip_reassembly: add parse-ptype option

2017-02-10 Thread Thomas Monjalon
2017-02-10 07:53, Liu, Yong:
> From: Thomas Monjalon
> > 2017-02-09 22:25, Marvin Liu:
> > > Add a new parse-ptype option to this sample for the case where the PMD
> > > does not provide packet type info. If this option is enabled, the packet
> > > type is analyzed in an Rx callback function.
> > [...]
> > > + if (parse_ptype) {
> > > + if (add_cb_parse_ptype(portid, queueid) < 0)
> > > + rte_exit(EXIT_FAILURE,
> > > + "Fail to add ptype cb\n");
> > > + } else if (!check_ptype(portid))
> > > + rte_exit(EXIT_FAILURE,
> > > + "PMD can not provide needed ptypes\n");
> > 
> > Instead of adding a new option, why not add the callback automatically
> > if the packet type is not supported by the hardware?
> 
> Thomas,
> We want to let the user choose the method used for packet type parsing.
> If the application is started with the parse-ptype option, the user wants
> software parsing; otherwise hardware parsing is used.

I do not understand why this user choice matters.
If it is available, hardware ptype is better, isn't it?
If it is not available, we need to be aware of this specific issue,
otherwise we have the error "PMD can not provide needed ptypes"
(without suggesting to use the option).


Re: [dpdk-dev] [PATCH] Fill speed_capa for virtio

2017-02-10 Thread Thomas Monjalon
2017-02-10 07:49, Ido Barnea:
> On 09/02/2017, 6:19 PM, "Thomas Monjalon"  wrote:
> >2017-02-02 12:05, Ido Barnea:
> >> From: Ido Barnea 
> >> 
> >> Signed-off-by: Ido Barnea 
> >> ---
> >>  drivers/net/virtio/virtio_ethdev.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/drivers/net/virtio/virtio_ethdev.c 
> >> b/drivers/net/virtio/virtio_ethdev.c
> >> index d1ff234..1d572b5 100644
> >> --- a/drivers/net/virtio/virtio_ethdev.c
> >> +++ b/drivers/net/virtio/virtio_ethdev.c
> >> @@ -1869,6 +1869,7 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct 
> >> rte_eth_dev_info *dev_info)
> >>(1ULL << VIRTIO_NET_F_HOST_TSO6);
> >>if ((hw->guest_features & tso_mask) == tso_mask)
> >>dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_TCP_TSO;
> >> +  dev_info->speed_capa = ETH_LINK_SPEED_10G;
> >
> >Why 10G ?
> >Yuanhan, any opinion?
> 
> Just wanted this to be consistent with below (From virtio_dev_link_update):
> link.link_speed = SPEED_10G;

OK, that's the kind of justification that is good to have in
the commit message.



Re: [dpdk-dev] [PATCH v2 00/11] moving away from coremask to corelist

2017-02-10 Thread Thomas Monjalon
Hi Keith,

2017-02-09 17:42, Keith Wiles:
> The coremask option in DPDK is difficult to use and we should be
> promoting the use of the corelist (-l) option. The patch series
> adjusts the docs to use -l EAL option instead of the -c option.
> 
> The patch series is a doc change only and is not required to be done
> in the 17.02 release, but should be added to the 17.05 release.
> The -c option will be kept and not removed for now unless in the
> future we decide to deprecate the code.
> 
> v2 - Fix taskset back to using -c
> 
> Keith Wiles (11):
>   doc/cryptodev: use corelist instead of coremask
>   doc/faq: use corelist instead of coremask
>   doc/freebsd: use corelist instead of coremask
>   doc/howto: use corelist instead of coremask
>   doc/linux: use corelist instead of coremask
>   doc/nics: use corelist instead of coremask
>   doc/prog_guide: use corelist instead of coremask
>   doc/testpmd: use corelist instead of coremask
>   doc/cryptoperf: use corelist instead of coremask
>   doc/xen: use corelist instead of coremask
>   doc/sample_app: use corelist instead of coremask

In case you make new revisions, I think you can squash all the patches
into a single one. They are all doing the same thing in different files.
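
For readers comparing the two EAL options discussed in the cover letter: a coremask (-c) is a hexadecimal bitmap of lcore IDs, while a corelist (-l) names the same lcores explicitly. The sketch below converts one form to the other. It is only an illustration; coremask_to_corelist() is a hypothetical helper and not part of DPDK.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a DPDK API): write the lcore IDs set in a
 * 64-bit coremask as a comma-separated list suitable for the -l option.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int
coremask_to_corelist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	unsigned int core;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (core = 0; core < 64; core++) {
		int n;

		/* Skip lcores whose bit is not set in the mask. */
		if (!(mask & (UINT64_C(1) << core)))
			continue;

		/* Prefix every entry except the first with a comma. */
		n = snprintf(buf + used, len - used, "%s%u",
			     used ? "," : "", core);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

With a mask of 0x5 the helper produces "0,2", so "-c 0x5" and "-l 0,2" select the same lcores. The -l option also accepts ranges such as "0-3", which this sketch does not generate.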


Re: [dpdk-dev] [PATCH v3 1/3] examples/ip_reassembly: add parse-ptype option

2017-02-10 Thread Tan, Jianfeng
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
> Sent: Friday, February 10, 2017 4:36 PM
> To: Liu, Yong
> Cc: Tan, Jianfeng; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/3] examples/ip_reassembly: add parse-
> ptype option
> 
> 2017-02-10 07:53, Liu, Yong:
> > From: Thomas Monjalon
> > > 2017-02-09 22:25, Marvin Liu:
> > > > Add a new parse-ptype option to this sample for the case where the PMD
> > > > does not provide packet type info. If this option is enabled, the packet
> > > > type is analyzed in an Rx callback function.
> > > [...]
> > > > +   if (parse_ptype) {
> > > > +   if (add_cb_parse_ptype(portid, queueid) < 0)
> > > > +   rte_exit(EXIT_FAILURE,
> > > > +   "Fail to add ptype cb\n");
> > > > +   } else if (!check_ptype(portid))
> > > > +   rte_exit(EXIT_FAILURE,
> > > > +   "PMD can not provide needed ptypes\n");
> > >
> > > Instead of adding a new option, why not add the callback automatically
> > > if the packet type is not supported by the hardware?
> >
> > Thomas,
> > We want to let the user choose the method used for packet type parsing.
> > If the application is started with the parse-ptype option, the user wants
> > software parsing; otherwise hardware parsing is used.
> 
> I do not understand why this user choice matters.
> If it is available, hardware ptype is better, isn't it?
> If it is not available, we need to be aware of this specific issue,
> otherwise we have the error "PMD can not provide needed ptypes"
> (without suggesting to use the option).

Actually, Konstantin suggested the following approach, which I quote here:
1. if '--parse-ptype' is present, always use SW parsing;
2. else check whether HW supports ptype recognition:
   - if yes, then use HW offload
   - else use SW

This way, in most cases the user does not need to specify this option. The
exception is when the user wants to compare the performance of the HW and SW
ptype versions on a NIC that actually supports HW ptypes.
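
The selection rules listed above can be sketched in C as follows. This is a minimal illustration and not the code from the patch: select_ptype_mode() and the ptype_mode enum are hypothetical names, and the hw_supports_ptype flag stands in for whatever check_ptype(portid) reports.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not part of the patch. */
enum ptype_mode {
	PTYPE_MODE_HW, /* rely on packet types reported by the PMD */
	PTYPE_MODE_SW  /* parse packet types in an Rx callback */
};

/*
 * parse_ptype_opt:   true if --parse-ptype was given on the command line.
 * hw_supports_ptype: true if the PMD can provide the needed ptypes
 *                    (assumed to come from something like check_ptype()).
 */
static enum ptype_mode
select_ptype_mode(bool parse_ptype_opt, bool hw_supports_ptype)
{
	if (parse_ptype_opt)
		return PTYPE_MODE_SW; /* 1. explicit request: always SW */
	if (hw_supports_ptype)
		return PTYPE_MODE_HW; /* 2a. HW offload is available */
	return PTYPE_MODE_SW;         /* 2b. fall back to SW parsing */
}
```

Under this scheme the application only exits with an error if registering the software callback itself fails; a missing hardware capability is no longer fatal.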

I agree with this approach. What do you think?

Thanks,
Jianfeng





Re: [dpdk-dev] [PATCH v3 1/3] examples/ip_reassembly: add parse-ptype option

2017-02-10 Thread Liu, Yong


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
> Sent: Friday, February 10, 2017 4:36 PM
> To: Liu, Yong 
> Cc: Tan, Jianfeng ; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 1/3] examples/ip_reassembly: add parse-
> ptype option
> 
> 2017-02-10 07:53, Liu, Yong:
> > From: Thomas Monjalon
> > > 2017-02-09 22:25, Marvin Liu:
> > > > Add a new parse-ptype option to this sample for the case where the PMD
> > > > does not provide packet type info. If this option is enabled, the packet
> > > > type is analyzed in an Rx callback function.
> > > [...]
> > > > +   if (parse_ptype) {
> > > > +   if (add_cb_parse_ptype(portid, queueid) < 0)
> > > > +   rte_exit(EXIT_FAILURE,
> > > > +   "Fail to add ptype cb\n");
> > > > +   } else if (!check_ptype(portid))
> > > > +   rte_exit(EXIT_FAILURE,
> > > > +   "PMD can not provide needed ptypes\n");
> > >
> > > Instead of adding a new option, why not add the callback automatically
> > > if the packet type is not supported by the hardware?
> >
> > Thomas,
> > We want to let the user choose the method used for packet type parsing.
> > If the application is started with the parse-ptype option, the user wants
> > software parsing; otherwise hardware parsing is used.
> 
> I do not understand why this user choice matters.
> If it is available, hardware ptype is better, isn't it?
> If it is not available, we need to be aware of this specific issue,
> otherwise we have the error "PMD can not provide needed ptypes"
> (without suggesting to use the option).

Yes, hardware always has better performance than software. I think the choice
matters in some performance measurement scenarios.
For example, in l3fwd we compared performance with the software and hardware
packet parsers; this option may not have much value in other samples.
I will rework this patch to fall back to software parsing if the hardware does
not support it.

BRs,
Marvin



[dpdk-dev] [PATCH] eventdev: amend comment for timeout_ticks in rte_event_dequeue_burst()

2017-02-10 Thread Nipun Gupta
Signed-off-by: Nipun Gupta 
---
 lib/librte_eventdev/rte_eventdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..49a4739 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -1216,7 +1216,7 @@ struct rte_eventdev {
  *   - 0 no-wait, returns immediately if there is no event.
  *   - >0 wait for the event, if the device is configured with
  *   RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT then this function will wait until
- *   the event available or *timeout_ticks* time.
+ *   atleast one event is available or *timeout_ticks* time.
  *   if the device is not configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
  *   then this function will wait until the event available or
  *   *dequeue_timeout_ns* ns which was previously supplied to
-- 
1.9.1



[dpdk-dev] [PATCH] eventdev: amend comments for nb_events_limit and new_event_threshold

2017-02-10 Thread Nipun Gupta
Updated the comments on 'nb_events_limit' of 'struct rte_event_dev_config'
and 'new_event_threshold' of 'struct rte_event_port_conf' for open system
configuration.

Signed-off-by: Nipun Gupta 
---
 lib/librte_eventdev/rte_eventdev.h | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..171e52e 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -404,11 +404,12 @@ struct rte_event_dev_config {
 * @see RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
 */
int32_t nb_events_limit;
-   /**< Applies to *closed system* event dev only. This field indicates a
-* limit to ethdev-like devices to limit the number of events injected
-* into the system to not overwhelm core-to-core events.
+   /**< In a *closed system* this field indicates a limit to ethdev-like
+* devices to limit the number of events injected into the system to
+* not overwhelm core-to-core events.
 * This value cannot exceed the *max_num_events* which previously
-* provided in rte_event_dev_info_get()
+* provided in rte_event_dev_info_get().
+* This should be set to '-1' for *open system*.
 */
uint8_t nb_event_queues;
/**< Number of event queues to configure on this device.
@@ -633,7 +634,8 @@ struct rte_event_port_conf {
 * can have a lower threshold so as not to overwhelm the device,
 * while ports used for worker pools can have a higher threshold.
 * This value cannot exceed the *nb_events_limit*
-* which previously supplied to rte_event_dev_configure()
+* which previously supplied to rte_event_dev_configure().
+* This should be set to '-1' for *open system*.
 */
uint8_t dequeue_depth;
/**< Configure number of bulk dequeues for this event port.
-- 
1.9.1



[dpdk-dev] [PATCH v2] app/test-crypto-perf: fix uninitialized scalar variable

2017-02-10 Thread Aleksander Gajewski
Fix the problem with the uninitialized nb_cryptodevs variable by
initializing it to 0. The program could jump to the err label
without running cperf_initialize_cryptodev(). Also assign 0
to nb_cryptodevs when cperf_initialize_cryptodev() returns a
negative value.

Coverity issue: 141073
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
application")

Signed-off-by: Aleksander Gajewski 
---
v2:
 * When nb_cryptodevs is negative after cperf_initialize_cryptodev()
assign 0 value to it.
---
 app/test-crypto-perf/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/test-crypto-perf/main.c b/app/test-crypto-perf/main.c
index 6c128d8..08bf5e4 100644
--- a/app/test-crypto-perf/main.c
+++ b/app/test-crypto-perf/main.c
@@ -264,7 +264,7 @@
 
void *ctx[RTE_MAX_LCORE] = { };
 
-   int nb_cryptodevs;
+   int nb_cryptodevs = 0;
uint8_t cdev_id, i;
uint8_t enabled_cdevs[RTE_CRYPTO_MAX_DEVS] = { 0 };
 
@@ -300,6 +300,7 @@
if (nb_cryptodevs < 1) {
RTE_LOG(ERR, USER1, "Failed to initialise requested crypto "
"device type\n");
+   nb_cryptodevs = 0;
goto err;
}
 
@@ -397,7 +398,6 @@
 err:
i = 0;
RTE_LCORE_FOREACH_SLAVE(lcore_id) {
-
if (i == nb_cryptodevs)
break;
 
-- 
1.9.1






[dpdk-dev] [PATCH v2] eventdev: amend timeout criteria comment for burst dequeue

2017-02-10 Thread Nipun Gupta
Signed-off-by: Nipun Gupta 
---
Changes for v2:
 - Fix errors reported by check-git-log.sh

 lib/librte_eventdev/rte_eventdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..49a4739 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -1216,7 +1216,7 @@ struct rte_eventdev {
  *   - 0 no-wait, returns immediately if there is no event.
  *   - >0 wait for the event, if the device is configured with
  *   RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT then this function will wait until
- *   the event available or *timeout_ticks* time.
+ *   atleast one event is available or *timeout_ticks* time.
  *   if the device is not configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
  *   then this function will wait until the event available or
  *   *dequeue_timeout_ns* ns which was previously supplied to
-- 
1.9.1



Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix uninitialized scalar variable

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Gajewski, AleksanderX
> Sent: Friday, February 10, 2017 9:23 AM
> To: Doherty, Declan
> Cc: dev@dpdk.org; De Lara Guarch, Pablo; Gajewski, AleksanderX
> Subject: [PATCH v2] app/test-crypto-perf: fix uninitialized scalar variable
> 
> Fix the problem with the uninitialized nb_cryptodevs variable by
> initializing it to 0. The program could jump to the err label
> without running cperf_initialize_cryptodev(). Also assign 0
> to nb_cryptodevs when cperf_initialize_cryptodev() returns a
> negative value.
> 
> Coverity issue: 141073
> Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> application")
> 
> Signed-off-by: Aleksander Gajewski 

Acked-by: Pablo de Lara 
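
To see why the count has to be clamped to 0 before jumping to the err label, consider a cleanup loop with the same shape as the one in the diff (stop once i equals the device count). The sketch below is a stand-alone illustration; MAX_WORKERS and cleanup_workers() are hypothetical and only model the loop, on the assumption that the real loop increments i once per worker.

```c
#define MAX_WORKERS 4 /* hypothetical stand-in for the number of slave lcores */

/*
 * Models the err: loop: walk the workers and stop once nb_devs of them
 * have been handled. Returns how many workers were touched.
 */
static int
cleanup_workers(int nb_devs)
{
	int i = 0;
	int worker;

	for (worker = 0; worker < MAX_WORKERS; worker++) {
		if (i == nb_devs)
			break;
		i++; /* the real code would release per-device state here */
	}
	return i;
}
```

With nb_devs equal to 0 the loop exits immediately. With a negative value the equality test never matches, so every worker is visited even though no device was initialised, which is the case the patch guards against.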


[dpdk-dev] [PATCH v2] eventdev: amend comments for events limit and threshold

2017-02-10 Thread Nipun Gupta
Updated the comments on 'nb_events_limit' of 'struct rte_event_dev_config'
and 'new_event_threshold' of 'struct rte_event_port_conf' for open system
configuration.

Signed-off-by: Nipun Gupta 
---
Changes for v2:
 - Fix errors reported by check-git-log.sh

 lib/librte_eventdev/rte_eventdev.h | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..171e52e 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -404,11 +404,12 @@ struct rte_event_dev_config {
 * @see RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
 */
int32_t nb_events_limit;
-   /**< Applies to *closed system* event dev only. This field indicates a
-* limit to ethdev-like devices to limit the number of events injected
-* into the system to not overwhelm core-to-core events.
+   /**< In a *closed system* this field indicates a limit to ethdev-like
+* devices to limit the number of events injected into the system to
+* not overwhelm core-to-core events.
 * This value cannot exceed the *max_num_events* which previously
-* provided in rte_event_dev_info_get()
+* provided in rte_event_dev_info_get().
+* This should be set to '-1' for *open system*.
 */
uint8_t nb_event_queues;
/**< Number of event queues to configure on this device.
@@ -633,7 +634,8 @@ struct rte_event_port_conf {
 * can have a lower threshold so as not to overwhelm the device,
 * while ports used for worker pools can have a higher threshold.
 * This value cannot exceed the *nb_events_limit*
-* which previously supplied to rte_event_dev_configure()
+* which previously supplied to rte_event_dev_configure().
+* This should be set to '-1' for *open system*.
 */
uint8_t dequeue_depth;
/**< Configure number of bulk dequeues for this event port.
-- 
1.9.1



Re: [dpdk-dev] [PATCH v2] eventdev: amend timeout criteria comment for burst dequeue

2017-02-10 Thread Van Haaren, Harry
> -Original Message-
> From: Nipun Gupta [mailto:nipun.gu...@nxp.com]
> Sent: Friday, February 10, 2017 3:48 PM
> To: dev@dpdk.org
> Cc: hemant.agra...@nxp.com; jerin.ja...@caviumnetworks.com; Richardson, Bruce
> ; Eads, Gage ; Van Haaren, 
> Harry
> ; Nipun Gupta 
> Subject: [PATCH v2] eventdev: amend timeout criteria comment for burst dequeue
> 
> Signed-off-by: Nipun Gupta 

Comment inline

> ---
> Changes for v2:
>  - Fix errors reported by check-git-log.sh
> 
>  lib/librte_eventdev/rte_eventdev.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.h 
> b/lib/librte_eventdev/rte_eventdev.h
> index c2f9310..49a4739 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -1216,7 +1216,7 @@ struct rte_eventdev {
>   *   - 0 no-wait, returns immediately if there is no event.
>   *   - >0 wait for the event, if the device is configured with
>   *   RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT then this function will wait until
> - *   the event available or *timeout_ticks* time.
> + *   atleast one event is available or *timeout_ticks* time.

There should be a space between the words in "at least".

Send v3 with

Acked-by: Harry van Haaren 


>   *   if the device is not configured with 
> RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
>   *   then this function will wait until the event available or
>   *   *dequeue_timeout_ns* ns which was previously supplied to
> --
> 1.9.1



Re: [dpdk-dev] [PATCH] net/i40e: fix wrong handle when enable interrupt

2017-02-10 Thread Ferruh Yigit
On 2/9/2017 8:02 PM, Qi Zhang wrote:
> In i40e_dev_interrupt_handler, when calling rte_intr_enable,
> we should pass dev->intr_handle rather than intr_handle.
> intr_handle is the copy of dev->intr_handle made when
> it is registered, but the parameters of dev->intr_handle
> may be modified later in the i40e driver.

If dev->intr_handle is modified, shouldn't the driver unregister the old one
and register the new interrupt handle?

Registering one handle but using another in the handler function seems wrong.

> 
> Fixes: 2ce7a1ed09fc ("net/i40e: localize mapping of ethdev to PCI device")
> 
> Signed-off-by: Qi Zhang 

<...>


Re: [dpdk-dev] [PATCH v2 2/2] mk: move PMD libraries to applications

2017-02-10 Thread Thomas Monjalon
2017-01-31 15:04, Ferruh Yigit:
> Some PMDs provide device specific APIs. Bond and xenvirt are existing
> samples for this.
> 
> And since these are PMD libraries, there are two options on how to link
> them for shared library build:
> 
> 1- They can be linked to all applications by default, using common
> rte.app.mk file.
> 
> 2- They can be explicitly linked to applications that use device
> specific API.
> 
> Currently option one is in use, this patch switches to the option two.
> 
> Move library linking to the Makefile of each application that uses
> the device specific API.
> 
> This prevents these PMD libraries from being a dependency of applications
> that don't use these device specific APIs.
> 
> Signed-off-by: Ferruh Yigit 

Series applied, thanks


Re: [dpdk-dev] Hotplug support for VFIO

2017-02-10 Thread Alejandro Lucero
On Thu, Feb 9, 2017 at 3:16 AM, Tetsuya Mukawa  wrote:

> 2017-02-09 0:48 GMT+09:00 Alejandro Lucero  >:
> > I just wanted to clarify that VFIO hotplug is not the problem, as I see
> > it, but the unplug. When attaching a device the current VFIO code will be
> > used, but there is no code for doing the IOMMU unmapping when unplugging.
> >
> > On Wed, Feb 8, 2017 at 3:43 PM, Alejandro Lucero
> >  wrote:
> >>
> >> Hi Eelco,
> >>
> >> On Wed, Feb 8, 2017 at 9:41 AM, Eelco Chaudron 
> >> wrote:
> >>>
> >>> Hi Anatoly,
> >>>
> >>> This will be great... If you want I can do some testing on (early)
> >>> patches if required.
> >>>
> >>> Also do you know why it is currently not supported, i.e. what are the
> >>> limitations?
> >>>
> >>> Thanks,
> >>
> >>
> >> I assume your reply was to my previous email and not to Anatoly's one. My
> >> apologies if this assumption is wrong.
> >> By the way, you did not reply to the list, but I'm going to do so because
> >> this could be interesting to other people. Also, I'm CCing the hotplug UIO
> >> developer, Tetsuya Mukawa, since there is no specific maintainer for this
> >> functionality.
>
> Hi,
>
> Sorry for late reply.
> Honestly, I am not so familiar with VFIO, so I may be wrong.
> But please check my below comments.
>
> >>
> >> I think the main problem for supporting VFIO hotplug is to deal with what
> >> VFIO does, that is, IOMMU mappings.
> >> There is no code for doing so, and although the implementation would use
> >> most of the code already there for UIO hotplug, which is basically related
> >> to unmapping PCI resources, and the interrupt part will likely be quite
> >> similar, the IOMMU unmapping requires more work.
> >>
> >> I dare to say this is the main reason for not having VFIO hotplug support
> >> right now. Maybe Tetsuya can confirm this or give us other reasons.
>
> Yes, this is the main reason.
> When I implemented hot plugging, I also needed to implement the UIO
> detaching code.
> At the time, I also tried to implement the VFIO code, but I
> gave up for the reasons below.
>  - The patch would be bigger and more difficult to merge into mainline code.
>  - I did not have enough time to check what kind of code is actually
> needed to detach it.
>
>
I'm quite familiar with the VFIO code so I know what it requires. At least
the main part. You know there are details not so trivial to see initially
until you start doing the work.


> So the above were not technical reasons.
>
> One thing I am not clear about is whether we may need to do something
> special for the multi-process case.
> For example, if the primary process dies suddenly, what kind of error
> will happen in the slave process, and what is a good way to handle it?
>
>
I do not think this is a problem specific to the hotplug mechanism, but
thanks for the heads up.

I will start working on this. Hopefully it could be part of the 17.05
release.

Thank you


> Thanks,
> Tetsuya
>
> >>
> >>>
> >>>
> >>> Eelco
> >>>
> >>>
> >>>
> >>> On 07/02/17 10:29, Alejandro Lucero wrote:
> >>>
> >>> It seems no one is working on this VFIO support.
> >>>
> >>> I will work on this if there is no reply to this thread saying the
> >>> opposite in the next few days.
> >>>
> >>> On Thu, Feb 2, 2017 at 12:58 PM, Eelco Chaudron 
> >>> wrote:
> 
>  On 02/02/17 13:05, Burakov, Anatoly wrote:
> >
> > Hi Eelco,
> >
> > Please forgive my ignorance on this matter, but doesn't it work at
> > the moment? I would assume that if regular PCI hotplug works (with
> > igb_uio), then so would hotplug with VFIO, as it basically utilizes the
> > same PCI infrastructure igb_uio does. That said, I'm not aware of any
> > patches submitted that had to do with VFIO and hotplug, so I guess the
> > answer is, not at the moment.
> >
> > Thanks,
> > Anatoly
> 
>  I was asking as the documentation explicitly mentions it's not supported.
> 
>  http://dpdk.org/doc/guides/prog_guide/port_hotplug_framework.html#hotplug
> 
>  - "To detach a port, the port should be backed by a device that igb_uio
>  manages. VFIO is not supported."
> 
>  I could not find any specific reason why it's not supported, so if
>  someone can explain this it would help also...
> 
>  Cheers,
> 
>  Eelco
> 
> 
> >>>
> >>>
> >>
> >
>


Re: [dpdk-dev] [PATCH] eal: fix max number of interrupt request

2017-02-10 Thread Thomas Monjalon
2017-02-09 14:59, Qi Zhang:
> The max number of interrupt requests may
> change after rte_intr_callback_register, so
> in get_max_intr we need to check whether it is necessary to
> update max_intr.

So you are using rte_intr_enable() to update the max_intr field
in the case of VFIO_MSIX.
What about MSI, INTX and UIO cases?



[dpdk-dev] [PATCH v3] eventdev: amend timeout criteria comment for burst dequeue

2017-02-10 Thread Nipun Gupta
Signed-off-by: Nipun Gupta 
Acked-by: Harry van Haaren 
---
Changes for v2:
 - Fix errors reported by check-git-log.sh
Changes for v3:
 - Corrected comment's language

 lib/librte_eventdev/rte_eventdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..29f0f46 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -1216,7 +1216,7 @@ struct rte_eventdev {
  *   - 0 no-wait, returns immediately if there is no event.
  *   - >0 wait for the event, if the device is configured with
  *   RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT then this function will wait until
- *   the event available or *timeout_ticks* time.
+ *   at least one event is available or *timeout_ticks* time.
  *   if the device is not configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
  *   then this function will wait until the event available or
  *   *dequeue_timeout_ns* ns which was previously supplied to
-- 
1.9.1



Re: [dpdk-dev] [PATCH] net/i40e: fix vlan insert code redundance

2017-02-10 Thread Ferruh Yigit
On 2/10/2017 1:26 AM, Qiming Yang wrote:
> This patch removed useless tx_flags in vlan insertion.

Overall this looks good. I wonder what the initial intention of this
code was; understanding it helps to figure out if there is a hidden defect.

This patch does not fix a defect, but improves the code. Is there any
performance gain with this?
If not, I am for deferring this to the next release.

> 
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> 
> Signed-off-by: Qiming Yang 
> ---
>  drivers/net/i40e/i40e_rxtx.c | 8 +---
>  drivers/net/i40e/i40e_rxtx.h | 2 --
>  2 files changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 608685f..b91cd70 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -1026,7 +1026,6 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf 
> **tx_pkts, uint16_t nb_pkts)
>   uint16_t nb_tx;
>   uint32_t td_cmd;
>   uint32_t td_offset;
> - uint32_t tx_flags;
>   uint32_t td_tag;
>   uint64_t ol_flags;
>   uint16_t nb_used;
> @@ -1050,7 +1049,6 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf 
> **tx_pkts, uint16_t nb_pkts)
>   td_cmd = 0;
>   td_tag = 0;
>   td_offset = 0;
> - tx_flags = 0;
>  
>   tx_pkt = *tx_pkts++;
>   RTE_MBUF_PREFETCH_TO_FREE(txe->mbuf);
> @@ -1097,12 +1095,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf 
> **tx_pkts, uint16_t nb_pkts)
>  
>   /* Descriptor based VLAN insertion */
>   if (ol_flags & (PKT_TX_VLAN_PKT | PKT_TX_QINQ_PKT)) {
> - tx_flags |= tx_pkt->vlan_tci <<
> - I40E_TX_FLAG_L2TAG1_SHIFT;
> - tx_flags |= I40E_TX_FLAG_INSERT_VLAN;

The I40E_TX_FLAG_INSERT_VLAN flag also seems to be used only here, and can be
removed.

Also, I40E_TX_FLAG_CSUM and I40E_TX_FLAG_TSYN seem not to be used at all;
understanding why they were introduced in the first place could be useful.

>   td_cmd |= I40E_TX_DESC_CMD_IL2TAG1;
> - td_tag = (tx_flags & I40E_TX_FLAG_L2TAG1_MASK) >>
> - I40E_TX_FLAG_L2TAG1_SHIFT;
> + td_tag = tx_pkt->vlan_tci;
>   }
>  
>   /* Always enable CRC offload insertion */
> diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
> index 9df8a56..3d4abdc 100644
> --- a/drivers/net/i40e/i40e_rxtx.h
> +++ b/drivers/net/i40e/i40e_rxtx.h
> @@ -38,8 +38,6 @@
>   * 32 bits tx flags, high 16 bits for L2TAG1 (VLAN),
>   * low 16 bits for others.
>   */
> -#define I40E_TX_FLAG_L2TAG1_SHIFT 16
> -#define I40E_TX_FLAG_L2TAG1_MASK  0x
>  #define I40E_TX_FLAG_CSUM ((uint32_t)(1 << 0))
>  #define I40E_TX_FLAG_INSERT_VLAN  ((uint32_t)(1 << 1))
>  #define I40E_TX_FLAG_TSYN ((uint32_t)(1 << 2))
> 
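
The reason the tx_flags round trip is redundant can be checked in isolation. The sketch below is illustrative only: the function names are hypothetical, the 16-bit shift is taken from the removed define, and the mask value is an assumption derived from the header comment ("high 16 bits for L2TAG1"), since the value is cut off in the quoted diff.

```c
#include <stdint.h>

/* Shift as in the removed define; the mask value is an assumption. */
#define TX_FLAG_L2TAG1_SHIFT 16
#define TX_FLAG_L2TAG1_MASK  0xFFFF0000u
#define TX_FLAG_INSERT_VLAN  ((uint32_t)(1 << 1))

/* Old path: park the VLAN TCI in the high half of tx_flags, then read it back. */
static uint32_t
td_tag_via_tx_flags(uint16_t vlan_tci)
{
	uint32_t tx_flags = 0;

	tx_flags |= (uint32_t)vlan_tci << TX_FLAG_L2TAG1_SHIFT;
	tx_flags |= TX_FLAG_INSERT_VLAN; /* low bit, masked out below */
	return (tx_flags & TX_FLAG_L2TAG1_MASK) >> TX_FLAG_L2TAG1_SHIFT;
}

/* New path: use the VLAN TCI directly. */
static uint32_t
td_tag_direct(uint16_t vlan_tci)
{
	return vlan_tci;
}
```

Both functions return the same tag for every 16-bit TCI, so under these assumptions the intermediate variable only adds work to the transmit path.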



Re: [dpdk-dev] [PATCH v2] eventdev: amend comments for events limit and threshold

2017-02-10 Thread Van Haaren, Harry
> From: Nipun Gupta [mailto:nipun.gu...@nxp.com]
> Sent: Friday, February 10, 2017 3:50 PM
> To: dev@dpdk.org
> Cc: hemant.agra...@nxp.com; jerin.ja...@caviumnetworks.com; Richardson, Bruce
> ; Eads, Gage ; Van Haaren, 
> Harry
> ; Nipun Gupta 
> Subject: [PATCH v2] eventdev: amend comments for events limit and threshold
> 
> Updated the comments on 'nb_events_limit' of 'struct rte_event_dev_config'
> and 'new_event_threshold' of 'struct rte_event_port_conf' for open system
> configuration.
> 
> Signed-off-by: Nipun Gupta 
> ---
> Changes for v2:
>  - Fix errors reported by check-git-log.sh
> 
>  lib/librte_eventdev/rte_eventdev.h | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.h 
> b/lib/librte_eventdev/rte_eventdev.h
> index c2f9310..171e52e 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -404,11 +404,12 @@ struct rte_event_dev_config {
>* @see RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
>*/
>   int32_t nb_events_limit;
> - /**< Applies to *closed system* event dev only. This field indicates a
> -  * limit to ethdev-like devices to limit the number of events injected
> -  * into the system to not overwhelm core-to-core events.
> + /**< In a *closed system* this field indicates a limit to ethdev-like
> +  * devices to limit the number of events injected into the system to
> +  * not overwhelm core-to-core events.
>* This value cannot exceed the *max_num_events* which previously
> -  * provided in rte_event_dev_info_get()
> +  * provided in rte_event_dev_info_get().
> +  * This should be set to '-1' for *open system*.


I don't think we should mention ethdev explicitly here - it applies to any
port that is attempting to enqueue work into a closed-system eventdev.

What do you think of the following wording? (Suggestion only, feel free to
re-word if required).

/**< In a closed system this field is the limit on the maximum number of events
 that can be inflight in the eventdev at a given time. The limit is required
 to ensure that the finite space in a closed system is not overwhelmed. The
 value cannot exceed the *max_num_events* as provided by 
rte_event_dev_info_get().
 This value should be set to -1 for open systems.
 */

>*/
>   uint8_t nb_event_queues;
>   /**< Number of event queues to configure on this device.
> @@ -633,7 +634,8 @@ struct rte_event_port_conf {
>* can have a lower threshold so as not to overwhelm the device,
>* while ports used for worker pools can have a higher threshold.
>* This value cannot exceed the *nb_events_limit*
> -  * which previously supplied to rte_event_dev_configure()
> +  * which previously supplied to rte_event_dev_configure().
> +  * This should be set to '-1' for *open system*.
>*/

Minor grammar issue (that was previously there too, but worth fixing anyway):
there is a _was_ missing from the sentence:

+   which was previously supplied to rte_event_dev_configure().
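
The constraints described in these comments can be written down as two small checks. This is a sketch under stated assumptions rather than eventdev code: both function names are hypothetical, -1 is treated as the "open system" value as the amended comments suggest, and rejecting other negative values is an assumption of this example.

```c
#include <stdint.h>

/*
 * Hypothetical check: nb_events_limit is either -1 (open system) or a
 * value no larger than the max_num_events reported by the device.
 */
static int
events_limit_is_valid(int32_t nb_events_limit, int32_t max_num_events)
{
	if (nb_events_limit == -1)
		return 1;
	return nb_events_limit >= 0 && nb_events_limit <= max_num_events;
}

/*
 * Hypothetical check: new_event_threshold is either -1 (open system) or a
 * value no larger than the configured nb_events_limit.
 */
static int
new_event_threshold_is_valid(int32_t new_event_threshold,
			     int32_t nb_events_limit)
{
	if (new_event_threshold == -1)
		return 1;
	return new_event_threshold >= 0 &&
	       new_event_threshold <= nb_events_limit;
}
```

For a closed system that reports max_num_events of 4096, a limit of 4096 passes and 4097 does not; a port threshold of 1024 is then accepted, while 8192 is rejected.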




Re: [dpdk-dev] [PATCH] eal: fix bug in x86 cmpset

2017-02-10 Thread Hunt, David


On 9/2/2017 4:53 PM, Thomas Monjalon wrote:
> 2016-11-06 22:09, Thomas Monjalon:
> > 2016-09-29 18:34, Thomas Monjalon:
> > > 2016-09-30 02:54, Nikhil Rao:
> > > > The original code used movl instead of xchgl, this caused
> > > > rte_atomic64_cmpset to use ebx as the lower dword of the source
> > > > to cmpxchg8b instead of the lower dword of function argument "src".
> > >
> > > Could you please start the explanation with a statement of
> > > what is wrong from a user point of view?
> > > It could help to understand how severe it is.
> >
> > Please, we need a clear explanation of the bug, and an acknowledgement.
>
> Should we close this bug?


I took a few minutes to look at this, and the issue can easily be 
reproduced with a small snippet of code.
With the 'mov', the lower dword of the result is incorrect. This is 
resolved by using 'xchgl'.


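#include <inttypes.h>
#include <stdio.h>
#include <rte_atomic.h>

int main(void)
{
    uint64_t a = 0xff00ff;

    /* a equals the expected value, so it should become 0xfa00fa */
    rte_atomic64_cmpset(&a, 0xff00ff, 0xfa00fa);
    printf("0x%" PRIx64 "\n", a);
    return 0;
}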

When using 'mov', the result is 0xfa
When using 'xchgl', the result is 0xfa00fa, as expected.
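
For comparison, the expected compare-and-set behaviour can be sketched
without DPDK. This is only an illustration built on the GCC/Clang
__atomic builtins, not the rte_atomic64_cmpset() implementation;
cmpset64_sketch is a made-up name.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit compare-and-set:
 * if *dst == exp, store src into *dst and return 1;
 * otherwise leave *dst untouched and return 0.
 */
static int
cmpset64_sketch(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
	/* strong compare-exchange, sequentially consistent ordering */
	return __atomic_compare_exchange_n(dst, &exp, src, 0,
			__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}
```

Calling it with a = 0xff00ff, exp = 0xff00ff and src = 0xfa00fa should
leave all 64 bits of a equal to src, which is the behaviour the snippet
above checks for.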

Rgds,
Dave.



Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio

2017-02-10 Thread Thomas Monjalon
2017-02-09 17:40, Ferruh Yigit:
> On 2/9/2017 4:06 PM, Jianfeng Tan wrote:
> > This ABI changes to remove iomem and ioport mapping in igb_uio. The
> > purpose of this changes was to fix a bug: when DPDK app crashes,
> > those devices by igb_uio are not stopped either DPDK PMD driver or
> > igb_uio driver.
> > 
> > Then it has been pointed out by Stephen Hemminger that it has
> > backward compatibility issue: cannot run old version DPDK on
> > modified igb_uio.
> > 
> > However, we still have not figure out a new way to fix this bug
> > without this change. Let's postpone this deprecation announcement
> > in case this change cannot be avoided.
> > 
> > Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from igb_uio")
> > 
> > Suggested-by: Stephen Hemminger 
> > Suggested-by: Ferruh Yigit 
> > Suggested-by: Thomas Monjalon 
> > Signed-off-by: Jianfeng Tan 
> 
> Acked-by: Ferruh Yigit 

Applied, thanks

The images are not real vector images and are almost unreadable.
Please make the effort to use inkscape in order to have images
we can update.

I did some changes: s/virtio_user/virtio-user/ in order to be consistent.
Like for vhost-user, we use the underscore only in code.


Re: [dpdk-dev] [PATCH] eal: fix bug in x86 cmpset

2017-02-10 Thread Thomas Monjalon
2017-02-10 10:39, Hunt, David:
> 
> On 9/2/2017 4:53 PM, Thomas Monjalon wrote:
> > 2016-11-06 22:09, Thomas Monjalon:
> >> 2016-09-29 18:34, Thomas Monjalon:
> >>> 2016-09-30 02:54, Nikhil Rao:
> >>>> The original code used movl instead of xchgl, this caused
> >>>> rte_atomic64_cmpset to use ebx as the lower dword of the source
> >>>> to cmpxchg8b instead of the lower dword of function argument "src".
> >>> Could you please start the explanation with a statement of
> >>> what is wrong from an user point of view?
> >>> It could help to understand how severe it is.
> >> Please, we need a clear explanation of the bug, and an acknowledgement.
> > Should we close this bug?
> 
> I took a few minutes to look at this, and the issue can easily be 
> reproduced with a small snippet of code.
> With the 'mov', the lower dword of the result is incorrect. This is 
> resolved by using 'xchgl'.
> 
> void main()
> {
>  uint64_t a = 0xff00ff;
> 
>  rte_atomic64_cmpset( &a, 0xff00ff, 0xfa00fa);
>  printf("0x%lx\n", a);
> }
> 
> When using 'mov', the result is 0xfa
> When using 'xchgl', the result is 0xfa00fa, as expected.

This operation is used a lot in drivers for link status.

I think we need to clearly explain what was the consequence of this bug.


[dpdk-dev] [PATCH v2] net/virtio: add speed capability

2017-02-10 Thread Thomas Monjalon
From: Ido Barnea 

The chosen fake capability (10G) is consistent with the reported
link speed in virtio_dev_link_update():
link.link_speed = SPEED_10G;

The feature is not marked in doc/guides/nics/features/virtio.ini
as it is only a fake value.

Signed-off-by: Ido Barnea 
[Thomas: comments added]
Acked-by: Thomas Monjalon 
---
 drivers/net/virtio/virtio_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index d1ff234..4dc03b9 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1835,6 +1835,8 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
uint64_t tso_mask, host_features;
struct virtio_hw *hw = dev->data->dev_private;
 
+   dev_info->speed_capa = ETH_LINK_SPEED_10G; /* fake value */
+
dev_info->pci_dev = dev->device ? RTE_DEV_TO_PCI(dev->device) : NULL;
dev_info->max_rx_queues =
RTE_MIN(hw->max_queue_pairs, VIRTIO_MAX_RX_QUEUES);
-- 
2.7.0



Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix uninitialized scalar variable

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 9:47 AM
> To: Gajewski, AleksanderX; Doherty, Declan
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix uninitialized
> scalar variable
> 
> 
> 
> > -Original Message-
> > From: Gajewski, AleksanderX
> > Sent: Friday, February 10, 2017 9:23 AM
> > To: Doherty, Declan
> > Cc: dev@dpdk.org; De Lara Guarch, Pablo; Gajewski, AleksanderX
> > Subject: [PATCH v2] app/test-crypto-perf: fix uninitialized scalar variable
> >
> > Fix problem with uninitialized nb_cryptodevs variable by
> > initialize it with 0 value. Program could jump to err label
> > without running cperf_initialize_cryptodev() function. Also assign 0
> > value to nb_cryptodevs after cperf_initialize_cryptodev() when value is
> > negative.
> >
> > Coverity issue: 141073
> > Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> > application")
> >
> > Signed-off-by: Aleksander Gajewski 
> 
> Acked-by: Pablo de Lara 

Applied to dpdk-next-crypto.
Thanks,

Pablo


Re: [dpdk-dev] [PATCH v2] net/virtio: add speed capability

2017-02-10 Thread Thomas Monjalon
2017-02-10 12:05, Thomas Monjalon:
> From: Ido Barnea 
> 
> The chosen fake capability (10G) is consistent with the reported
> link speed in virtio_dev_link_update():
>   link.link_speed = SPEED_10G;
> 
> The feature is not marked in doc/guides/nics/features/virtio.ini
> as it is only a fake value.
> 
> Signed-off-by: Ido Barnea 
> [Thomas: comments added]
> Acked-by: Thomas Monjalon 

Applied, thanks


Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio

2017-02-10 Thread Tan, Jianfeng


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
> Sent: Friday, February 10, 2017 6:44 PM
> To: Tan, Jianfeng
> Cc: Yigit, Ferruh; dev@dpdk.org; Mcnamara, John;
> yuanhan@linux.intel.com; step...@networkplumber.org
> Subject: Re: [dpdk-dev] [PATCH v2 3/3] doc: postpone ABI changes in igb_uio
> 
> 2017-02-09 17:40, Ferruh Yigit:
> > On 2/9/2017 4:06 PM, Jianfeng Tan wrote:
> > > This ABI changes to remove iomem and ioport mapping in igb_uio. The
> > > purpose of this changes was to fix a bug: when DPDK app crashes,
> > > those devices by igb_uio are not stopped either DPDK PMD driver or
> > > igb_uio driver.
> > >
> > > Then it has been pointed out by Stephen Hemminger that it has
> > > backward compatibility issue: cannot run old version DPDK on
> > > modified igb_uio.
> > >
> > > However, we still have not figure out a new way to fix this bug
> > > without this change. Let's postpone this deprecation announcement
> > > in case this change cannot be avoided.
> > >
> > > Fixes: 3bac1dbc1ed ("doc: announce iomem and ioport removal from
> igb_uio")
> > >
> > > Suggested-by: Stephen Hemminger 
> > > Suggested-by: Ferruh Yigit 
> > > Suggested-by: Thomas Monjalon 
> > > Signed-off-by: Jianfeng Tan 
> >
> > Acked-by: Ferruh Yigit 
> 
> Applied, thanks
> 
> The images are not real vector images and are almost unreadable.
> Please make the effort to use inkscape in order to have images
> we can update.

Apologies for that. I've submitted a patch to change the images. And thank you
for the solution.

> 
> I did some changes: s/virtio_user/virtio-user/ in order to be consistent.
> Like for vhost-user, we use the underscore only in code.

Thank you for that.

Regards,
Jianfeng


Re: [dpdk-dev] [PATCH] net/i40e: fix vlan insert code redundance

2017-02-10 Thread Yang, Qiming


> -Original Message-
> From: Yigit, Ferruh
> Sent: Friday, February 10, 2017 6:25 PM
> To: Yang, Qiming ; dev@dpdk.org
> Cc: Wu, Jingjing 
> Subject: Re: [dpdk-dev] [PATCH] net/i40e: fix vlan insert code redundance
> 
> On 2/10/2017 1:26 AM, Qiming Yang wrote:
> > This patch removed useless tx_flags in vlan insertion.
> 
> Overall this looks good, I wonder what was the initial intention of this code,
> understanding it helps to figure out if there is a hidden defect. 

Thank you for the reminder. I'll investigate it.

> 
> This code not fixes a defect, but improves the code, is there any
> performance gain with this?

I'll do more testing and give you feedback.

> If not, I am for deferring this to next release.
> 
> >
> > Fixes: 4861cde46116 ("i40e: new poll mode driver")
> >
> > Signed-off-by: Qiming Yang 
> > ---
> >  drivers/net/i40e/i40e_rxtx.c | 8 +---
> > drivers/net/i40e/i40e_rxtx.h | 2 --
> >  2 files changed, 1 insertion(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> > index 608685f..b91cd70 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -1026,7 +1026,6 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> > uint16_t nb_tx;
> > uint32_t td_cmd;
> > uint32_t td_offset;
> > -   uint32_t tx_flags;
> > uint32_t td_tag;
> > uint64_t ol_flags;
> > uint16_t nb_used;
> > @@ -1050,7 +1049,6 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> > td_cmd = 0;
> > td_tag = 0;
> > td_offset = 0;
> > -   tx_flags = 0;
> >
> > tx_pkt = *tx_pkts++;
> > RTE_MBUF_PREFETCH_TO_FREE(txe->mbuf);
> > @@ -1097,12 +1095,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> >
> > /* Descriptor based VLAN insertion */
> > if (ol_flags & (PKT_TX_VLAN_PKT | PKT_TX_QINQ_PKT)) {
> > -   tx_flags |= tx_pkt->vlan_tci <<
> > -   I40E_TX_FLAG_L2TAG1_SHIFT;
> > -   tx_flags |= I40E_TX_FLAG_INSERT_VLAN;
> 
> The I40E_TX_FLAG_INSERT_VLAN flag also seems used only here, and can be
> removed.
> 
> Also I40E_TX_FLAG_CSUM and I40E_TX_FLAG_TSYN seems not used at all,
> understanding why they are introduced at first place can be useful.
> 
> > td_cmd |= I40E_TX_DESC_CMD_IL2TAG1;
> > -   td_tag = (tx_flags & I40E_TX_FLAG_L2TAG1_MASK) >>
> > -   I40E_TX_FLAG_L2TAG1_SHIFT;
> > +   td_tag = tx_pkt->vlan_tci;
> > }
> >
> > /* Always enable CRC offload insertion */
> > diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h
> > index 9df8a56..3d4abdc 100644
> > --- a/drivers/net/i40e/i40e_rxtx.h
> > +++ b/drivers/net/i40e/i40e_rxtx.h
> > @@ -38,8 +38,6 @@
> >   * 32 bits tx flags, high 16 bits for L2TAG1 (VLAN),
> >   * low 16 bits for others.
> >   */
> > -#define I40E_TX_FLAG_L2TAG1_SHIFT 16
> > -#define I40E_TX_FLAG_L2TAG1_MASK  0x
> >  #define I40E_TX_FLAG_CSUM ((uint32_t)(1 << 0))
> >  #define I40E_TX_FLAG_INSERT_VLAN  ((uint32_t)(1 << 1))
> >  #define I40E_TX_FLAG_TSYN ((uint32_t)(1 << 2))
> >
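
As a quick sanity check on the "redundant" claim, the round trip through
tx_flags can be compared with the direct assignment. This is only a
sketch: the SKETCH_* names are made up, the shift of 16 and the
INSERT_VLAN bit are taken from the quoted header, and the mask value
0xFFFF0000 is an assumption based on the "high 16 bits for L2TAG1"
comment, since the value is cut off in the quoted diff.

```c
#include <stdint.h>

#define SKETCH_L2TAG1_SHIFT     16
#define SKETCH_L2TAG1_MASK      0xFFFF0000u          /* assumed value */
#define SKETCH_FLAG_INSERT_VLAN ((uint32_t)(1 << 1))

/* Old path: pack the TCI into the high half of tx_flags, then unpack it. */
static uint32_t
tag_via_flags(uint16_t vlan_tci)
{
	uint32_t tx_flags = 0;

	tx_flags |= (uint32_t)vlan_tci << SKETCH_L2TAG1_SHIFT;
	tx_flags |= SKETCH_FLAG_INSERT_VLAN;
	return (tx_flags & SKETCH_L2TAG1_MASK) >> SKETCH_L2TAG1_SHIFT;
}

/* New path: use the TCI directly. */
static uint32_t
tag_direct(uint16_t vlan_tci)
{
	return vlan_tci;
}
```

Under those assumptions both helpers return the same td_tag for every
16-bit TCI, so the patch would not change the descriptor contents.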



[dpdk-dev] [PATCH v2] app/test-crypto-perf: fix incorrect size of expression

2017-02-10 Thread Jacek Piasecki
Fix the problem of applying sizeof() to a pointer. The size of the
enabled_cdevs array is now passed as RTE_CRYPTO_MAX_DEVS.

Coverity issue: 141068
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application")

Signed-off-by: Jacek Piasecki 
---
v2:
* RTE_CRYPTO_MAX_DEVS is passed to rte_cryptodev_devices_get() directly

 app/test-crypto-perf/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-crypto-perf/main.c b/app/test-crypto-perf/main.c
index 634ea5f..ac4f484 100644
--- a/app/test-crypto-perf/main.c
+++ b/app/test-crypto-perf/main.c
@@ -45,7 +45,7 @@
int ret;
 
enabled_cdev_count = rte_cryptodev_devices_get(opts->device_type,
-   enabled_cdevs, RTE_DIM(enabled_cdevs));
+   enabled_cdevs, RTE_CRYPTO_MAX_DEVS);
if (enabled_cdev_count == 0) {
printf("No crypto devices type %s available\n",
opts->device_type);
-- 
1.9.1
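
For anyone wondering why a sizeof-based element count goes wrong here,
a minimal sketch follows. It assumes a uint8_t element type, and the
sketch_* names and the value 64 are made up for illustration; none of
this is taken from the application code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS 64

static uint8_t sketch_devs[SKETCH_MAX_DEVS];

/* The array type is visible here, so sizeof gives the full array size. */
static size_t
count_in_scope(void)
{
	return sizeof(sketch_devs) / sizeof(sketch_devs[0]);
}

/*
 * Inside a callee the array has decayed to a pointer, so sizeof yields
 * the pointer size (elements are one byte, so no division is needed).
 */
static size_t
count_through_pointer(uint8_t *devs)
{
	return sizeof(devs);
}
```

count_in_scope() gives 64, while count_through_pointer(sketch_devs)
gives sizeof(uint8_t *), which is why passing the constant explicitly
is the safer choice.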



Re: [dpdk-dev] [PATCH] net/ixgbe: fix tci mask check in fdir pasrer

2017-02-10 Thread Ferruh Yigit
On 2/10/2017 1:37 AM, Wei Zhao wrote:
> It must use big endian when check on the tci mask of vlan
> and vxlan parser in fdir filter rule pattern parser.Because
> rte layer send out tci mask using big endian mode.
> 
> Fixes: 11777435c727 ("net/ixgbe: parse flow director filter")
> Fixes: cc83320af286 ("net/ixgbe: add tci mask check
> in fdir parser")
> 
> Signed-off-by: Wei Zhao 

net/ixgbe: fix VLAN mask TCI in flow rule parser

Fixes: 11777435c727 ("net/ixgbe: parse flow director filter")
Fixes: c7753a7e6968 ("net/ixgbe: add tci mask check in fdir parser")

Applied to dpdk-next-net/master, thanks.
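
For reference, the endianness point can be sketched in plain C. The
helper names are made up, htons() stands in for the DPDK byte-order
helpers, and the mask value in the note below is only an illustration,
not the value used by the driver.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Wrong on little-endian hosts: compares a big-endian field with a
 * host-order constant. */
static int
mask_matches_host_order(uint16_t tci_mask_be, uint16_t expected_host)
{
	return tci_mask_be == expected_host;
}

/* Converts the constant to big endian before comparing. */
static int
mask_matches_be(uint16_t tci_mask_be, uint16_t expected_host)
{
	return tci_mask_be == htons(expected_host);
}
```

With a mask of 0x0FFF stored in network order, mask_matches_be() accepts
it on any host, while mask_matches_host_order() accepts it only on
big-endian machines.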




Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix segmentation fault when use qat pmd

2017-02-10 Thread De Lara Guarch, Pablo
Hi Slawomir,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
> Mrozowicz
> Sent: Thursday, February 09, 2017 1:57 PM
> To: Doherty, Declan
> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> Subject: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix segmentation fault
> when use qat pmd
> 
> Fix segmentation fault happened when use QAT PMD's kasumi, snow3g or
> zug
> algorithm to do cipher-then-auth performance test application.
> The mentioned algorithms required authentication key data be set.
> This patch fix issue that gmac algorithm required authentication key data
> be set value equal to cipher key data.
> 
> Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> application")
> 
> Signed-off-by: Slawomir Mrozowicz 

This also happened for other SW PMDs and not just QAT, but an incorrect 
implementation in them was hiding this issue.
I will reword this commit. Also, make sure to run check-git-log.sh next time, 
as I am seeing:

Wrong headline lowercase:
app/test-crypto-perf: fix segmentation fault when use qat pmd
Headline too long:
app/test-crypto-perf: fix segmentation fault when use qat pmd

Apart from this,

Acked-by: Pablo de Lara 






Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix incorrect size of expression

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Piasecki, JacekX
> Sent: Friday, February 10, 2017 1:26 PM
> To: Doherty, Declan
> Cc: dev@dpdk.org; De Lara Guarch, Pablo; Piasecki, JacekX
> Subject: [PATCH v2] app/test-crypto-perf: fix incorrect size of expression
> 
> Fix problem of passing a pointer to sizeof() function. Now the size
> of enabled_cdevs structure is passed by RTE_CRYPTO_MAX_DEVS.
> 
> Coverity issue: 141068
> Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> application")
> 
> Signed-off-by: Jacek Piasecki 

Acked-by: Pablo de Lara 


Re: [dpdk-dev] [PATCH] net/i40e: fix wrong definition of TC bandwidth

2017-02-10 Thread Ferruh Yigit
On 2/10/2017 5:25 AM, Wenzhuo Lu wrote:
> The range of TC bandwidth is 0 ~ 800, it's 16bits not 8bits.
> 
> Fixes: c8b9a3e3fe1b ("i40e: support DCB mode")
> CC: sta...@dpdk.org
> 
> Signed-off-by: Wenzhuo Lu 

Applied to dpdk-next-net/master, thanks.
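
To see why an 8-bit field cannot hold the 0 ~ 800 range, a small sketch
(the helper names are made up for illustration and are not i40e code):

```c
#include <stdint.h>

/* Storing into 8 bits keeps only the low byte of the value. */
static uint8_t
store_bw_8(unsigned int bw)
{
	return (uint8_t)bw;
}

/* 16 bits are enough for the whole 0 ~ 800 range. */
static uint16_t
store_bw_16(unsigned int bw)
{
	return (uint16_t)bw;
}
```

A bandwidth of 800 would come back as 32 from the 8-bit version
(800 modulo 256) and as 800 from the 16-bit one.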



[dpdk-dev] [PATCH] doc: annouce ABI change for cryptodev ops structure

2017-02-10 Thread Fan Zhang
Signed-off-by: Fan Zhang 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 755dc65..564d93a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -62,3 +62,7 @@ Deprecation Notices
   PMDs that implement the latter.
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.
+
+* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
+  The field ``cryptodev_configure_t`` function prototype will be added a
+  parameter of a struct rte_cryptodev_config type pointer.
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix incorrect size of expression

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 11:29 AM
> To: Piasecki, JacekX; Doherty, Declan
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix incorrect size of
> expression
> 
> 
> 
> > -Original Message-
> > From: Piasecki, JacekX
> > Sent: Friday, February 10, 2017 1:26 PM
> > To: Doherty, Declan
> > Cc: dev@dpdk.org; De Lara Guarch, Pablo; Piasecki, JacekX
> > Subject: [PATCH v2] app/test-crypto-perf: fix incorrect size of expression
> >
> > Fix problem of passing a pointer to sizeof() function. Now the size
> > of enabled_cdevs structure is passed by RTE_CRYPTO_MAX_DEVS.
> >
> > Coverity issue: 141068
> > Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> > application")
> >
> > Signed-off-by: Jacek Piasecki 
> 
> Acked-by: Pablo de Lara 

Applied to dpdk-next-crypto.
Thanks,

Pablo


Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix segmentation fault when use qat pmd

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 11:26 AM
> To: Mrozowicz, SlawomirX; Doherty, Declan
> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> Subject: Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix segmentation
> fault when use qat pmd
> 
> Hi Slawomir,
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
> > Mrozowicz
> > Sent: Thursday, February 09, 2017 1:57 PM
> > To: Doherty, Declan
> > Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> > Subject: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix segmentation
> fault
> > when use qat pmd
> >
> > Fix segmentation fault happened when use QAT PMD's kasumi, snow3g or
> > zug
> > algorithm to do cipher-then-auth performance test application.
> > The mentioned algorithms required authentication key data be set.
> > This patch fix issue that gmac algorithm required authentication key data
> > be set value equal to cipher key data.
> >
> > Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> > application")
> >
> > Signed-off-by: Slawomir Mrozowicz 
> 
> This also happened for other SW PMDs and not just QAT, but an incorrect
> implementation in them was hiding this issue.
> I will reword this commit. Also, make sure to run check-git-log.sh next time,
> as I am seeing:
> 
> Wrong headline lowercase:
> app/test-crypto-perf: fix segmentation fault when use qat pmd
> Headline too long:
> app/test-crypto-perf: fix segmentation fault when use qat pmd
> 
> Apart from this,
> 
> Acked-by: Pablo de Lara 
> 

Applied to dpdk-next-crypto.
Thanks,

Pablo



Re: [dpdk-dev] [PATCH] eal: fix bug in x86 cmpset

2017-02-10 Thread Hunt, David



On 10/2/2017 10:53 AM, Thomas Monjalon wrote:
> 2017-02-10 10:39, Hunt, David:
>> On 9/2/2017 4:53 PM, Thomas Monjalon wrote:
>>> 2016-11-06 22:09, Thomas Monjalon:
>>>> 2016-09-29 18:34, Thomas Monjalon:
>>>>> 2016-09-30 02:54, Nikhil Rao:
>>>>>> The original code used movl instead of xchgl, this caused
>>>>>> rte_atomic64_cmpset to use ebx as the lower dword of the source
>>>>>> to cmpxchg8b instead of the lower dword of function argument "src".
>>>>> Could you please start the explanation with a statement of
>>>>> what is wrong from an user point of view?
>>>>> It could help to understand how severe it is.
>>>> Please, we need a clear explanation of the bug, and an acknowledgement.
>>> Should we close this bug?
>>
>> I took a few minutes to look at this, and the issue can easily be
>> reproduced with a small snippet of code.
>> With the 'mov', the lower dword of the result is incorrect. This is
>> resolved by using 'xchgl'.
>>
>> void main()
>> {
>>   uint64_t a = 0xff00ff;
>>
>>   rte_atomic64_cmpset( &a, 0xff00ff, 0xfa00fa);
>>   printf("0x%lx\n", a);
>> }
>>
>> When using 'mov', the result is 0xfa
>> When using 'xchgl', the result is 0xfa00fa, as expected.
>
> This operation is used a lot in drivers for link status.
>
> I think we need to clearly explain what was the consequence of this bug.


Agreed. It's probably also worth noting that it's only on the __PIC__-enabled
code path, so it would have more of an effect on the distros.




[dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null return value

2017-02-10 Thread Slawomir Mrozowicz
key_token might be null when it is dereferenced in the call to strstr.
Check that the pointer is not null first.

Coverity issue: 141071
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application")

Signed-off-by: Slawomir Mrozowicz 
---
v2 changes:
- print message only if key_token exist
---
 app/test-crypto-perf/cperf_test_vector_parsing.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/app/test-crypto-perf/cperf_test_vector_parsing.c b/app/test-crypto-perf/cperf_test_vector_parsing.c
index e0bcb20..e442489 100644
--- a/app/test-crypto-perf/cperf_test_vector_parsing.c
+++ b/app/test-crypto-perf/cperf_test_vector_parsing.c
@@ -234,15 +234,19 @@ parse_entry(char *entry, struct cperf_test_vector *vector,
uint8_t *data = NULL;
char *token, *key_token;
 
+   if (entry == NULL) {
+   printf("Expected entry value\n");
+   return -1;
+   }
+
/* get key */
token = strtok(entry, CPERF_ENTRY_DELIMITER);
key_token = token;
-
/* get values for key */
token = strtok(NULL, CPERF_ENTRY_DELIMITER);
-   if (token == NULL) {
-   printf("Expected 'key = values' but was '%.40s'..\n",
-   key_token);
+
+   if (key_token == NULL || token == NULL) {
+   printf("Expected 'key = values' but was '%.40s'..\n", entry);
return -1;
}
 
-- 
2.5.0
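
The null checks around strtok() can be sketched in isolation as below.
This is a simplified illustration, not the parse_entry() code: the
function name, the out-parameters and the "=" delimiter are assumptions,
and unlike the patch it stops right after the first failed strtok().

```c
#include <string.h>

#define SKETCH_ENTRY_DELIMITER "="   /* assumed delimiter */

/*
 * Split "key = values" in place. Returns 0 and sets *key / *values on
 * success, -1 if the entry is NULL or either part is missing.
 */
static int
parse_entry_sketch(char *entry, char **key, char **values)
{
	char *token, *key_token;

	if (entry == NULL)
		return -1;

	/* get key */
	key_token = strtok(entry, SKETCH_ENTRY_DELIMITER);
	if (key_token == NULL)
		return -1;

	/* get values for key */
	token = strtok(NULL, SKETCH_ENTRY_DELIMITER);
	if (token == NULL)
		return -1;

	*key = key_token;
	*values = token;
	return 0;
}
```

Both tokens are checked before they are used, so a later strstr() on
the key never sees a null pointer.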



[dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version support

2017-02-10 Thread Pablo de Lara
The AES-NI MB PMD uses the external Multi-Buffer library,
which is hosted on GitHub, but the version was not specified
in the documentation.

Signed-off-by: Pablo de Lara 
---
 doc/guides/cryptodevs/aesni_mb.rst | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index 8b18eba..a492b6f 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -70,9 +70,11 @@ Limitations
 Installation
 
 
-To build DPDK with the AESNI_MB_PMD the user is required to download the mult-
-buffer library from `here `_
+To build DPDK with the AESNI_MB_PMD the user is required to download the 
multi-buffer
+library from `here `_
 and compile it on their user system before building DPDK.
+The latest version of the library supported by this PMD is v0.44, which
+can be downloaded in 
``_.
 
 .. code-block:: console
 
-- 
2.7.4



[dpdk-dev] [PATCH] eal/linux: fix fd check before close

2017-02-10 Thread Yong Wang
The "dev->intr_handle.fd" may be negative when it is passed as an
argument to "close". Fix the check on the fd.

Signed-off-by: Yong Wang 
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 3e4ffb5..20a4a66 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -230,7 +230,7 @@
close(dev->intr_handle.uio_cfg_fd);
dev->intr_handle.uio_cfg_fd = -1;
}
-   if (dev->intr_handle.fd) {
+   if (dev->intr_handle.fd >= 0) {
close(dev->intr_handle.fd);
dev->intr_handle.fd = -1;
dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
-- 
1.8.3.1
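
The difference between the two checks can be sketched outside the EAL.
The struct and function names below are made up for illustration; only
the "-1 means no descriptor" convention is taken from the patch context.

```c
#include <unistd.h>

struct sketch_handle {
	int fd;		/* -1 when no descriptor is held */
};

/* Old check: true for -1 (invalid) and false for 0 (a valid fd). */
static int
old_check(int fd)
{
	return fd != 0;
}

/* New check: true exactly for the valid range of descriptors. */
static int
new_check(int fd)
{
	return fd >= 0;
}

/* Close the descriptor if one is held; return 1 if close() was called. */
static int
sketch_close(struct sketch_handle *h)
{
	if (!new_check(h->fd))
		return 0;
	close(h->fd);
	h->fd = -1;
	return 1;
}
```

With the old test, a handle holding -1 would reach close(), and a handle
holding descriptor 0 would never be closed.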




Re: [dpdk-dev] [PATCH v2] cryptodev: fix segmentation fault

2017-02-10 Thread Mrozowicz, SlawomirX

>-Original Message-
>From: De Lara Guarch, Pablo
>Sent: Thursday, February 9, 2017 11:21 PM
>To: De Lara Guarch, Pablo ; Mrozowicz,
>SlawomirX ; Doherty, Declan
>
>Cc: dev@dpdk.org; Mrozowicz, SlawomirX 
>Subject: RE: [dpdk-dev] [PATCH v2] cryptodev: fix segmentation fault
>
>
>
>> -Original Message-
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
>> Pablo
>> Sent: Thursday, February 09, 2017 3:29 PM
>> To: Mrozowicz, SlawomirX; Doherty, Declan
>> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
>> Subject: Re: [dpdk-dev] [PATCH v2] cryptodev: fix segmentation fault
>>
>> Hi Slawomir,
>>
>> > -Original Message-
>> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
>> > Mrozowicz
>> > Sent: Friday, February 03, 2017 3:55 PM
>> > To: Doherty, Declan
>> > Cc: dev@dpdk.org; Mrozowicz, SlawomirX
>> > Subject: [dpdk-dev] [PATCH v2] cryptodev: fix segmentation fault
>> >
>> > This patch fix problem in function rte_cryptodev_devices_get().
>> > Program received signal SIGSEGV, Segmentation fault.
>> > It rework the function to use correct types and clean up visibility.
>> > It also fix Coverity ID 141073
>> >
>> > Fixes: 38227c0e3ad2 ("cryptodev: retrieve device info")
>> >
>> > Signed-off-by: Slawomir Mrozowicz 
>>
>> I think this patch fixes coverity issue 141067, not 141073.
>> Could you submit a v3 for this patch? Make sure to follow the format:
>
>I have done it for you.
>
>Applied to dpdk-next-crypto.
>Thanks,
>
>Pablo

Thanks Pablo.
Sławomir


Re: [dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version support

2017-02-10 Thread Jain, Deepak K

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Pablo de Lara
> Sent: Friday, February 10, 2017 12:34 PM
> To: dec...@intel.com; Mcnamara, John 
> Cc: dev@dpdk.org; De Lara Guarch, Pablo 
> Subject: [dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version support
> 
> AES-NI MB PMD uses external Multi-Buffer library, which is hosted in github,
> but the version was not specified in the documentation.
> 
> Signed-off-by: Pablo de Lara 
> ---
>  doc/guides/cryptodevs/aesni_mb.rst | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/cryptodevs/aesni_mb.rst
> b/doc/guides/cryptodevs/aesni_mb.rst
> index 8b18eba..a492b6f 100644
> --- a/doc/guides/cryptodevs/aesni_mb.rst
> +++ b/doc/guides/cryptodevs/aesni_mb.rst
> @@ -70,9 +70,11 @@ Limitations
>  Installation
>  
> 
> -To build DPDK with the AESNI_MB_PMD the user is required to download
> --
> 2.7.4
Acked-by: Deepak Kumar Jain


Re: [dpdk-dev] [PATCH] doc: fix unreadable images

2017-02-10 Thread Thomas Monjalon
2017-02-10 11:18, Jianfeng Tan:
> The images by below two commits are very unclear. Fix it.
> 
> Fixes: 50665deebda ("doc: add guide to use virtio-user for container 
> networking")
> Fixes: 0ba3870e755 ("doc: add guide to use virtio-user as exceptional path")
> 
> Suggested-by: Thomas Monjalon 
> Signed-off-by: Jianfeng Tan 

I suggested taking the time to make a real SVG, not replacing it with a PNG.
Please understand that this git repository is for hosting sources which can
be modified, not binary formats.


Re: [dpdk-dev] [PATCH] eal/linux: fix fd check before close

2017-02-10 Thread Thomas Monjalon
2017-02-10 08:53, Yong Wang:
> The "dev->intr_handle.fd" is possibly a negative value while it is
> passed as an argument to function "close". Fix the check to the fd.
> 
> Signed-off-by: Yong Wang 

pci: fix UIO interrupt file descriptor check before close

Fixes: 5a60a7ffc801 ("pci: introduce functions to alloc and free uio resource")

Applied, thanks


Re: [dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null return value

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
> Mrozowicz
> Sent: Friday, February 10, 2017 2:23 PM
> To: Doherty, Declan
> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> Subject: [dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null return
> value
> 
> Dereferencing a pointer that might be null key_token when calling strstr.
> Check if the pointer is null before.
> 
> Coverity issue: 141071
> Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> application")
> 
> Signed-off-by: Slawomir Mrozowicz 

Acked-by: Pablo de Lara 



Re: [dpdk-dev] [PATCH] crypto/scheduler: fix session backup

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Zhang, Roy Fan
> Sent: Thursday, February 09, 2017 6:46 PM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH] crypto/scheduler: fix session backup
> 
> Fixes the missed session backup during enqueue.
> 
> Fixes: 100e4f7e44ab ("crypto/scheduler: add round-robin mode")
> 
> Signed-off-by: Fan Zhang 
> ---

Acked-by: Pablo de Lara 



Re: [dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null return value

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 1:25 PM
> To: Mrozowicz, SlawomirX; Doherty, Declan
> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> Subject: Re: [dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null
> return value
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
> > Mrozowicz
> > Sent: Friday, February 10, 2017 2:23 PM
> > To: Doherty, Declan
> > Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> > Subject: [dpdk-dev] [PATCH v2] app/crypto-perf: fix dereference null
> return
> > value
> >
> > Dereferencing a pointer that might be null key_token when calling strstr.
> > Check if the pointer is null before.
> >
> > Coverity issue: 141071
> > Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test
> > application")
> >
> > Signed-off-by: Slawomir Mrozowicz 
> 
> Acked-by: Pablo de Lara 

Applied to dpdk-next-crypto.
Thanks,

Pablo


Re: [dpdk-dev] [PATCH v2 00/11] moving away from coremask to corelist

2017-02-10 Thread Wiles, Keith

> On Feb 10, 2017, at 2:46 AM, Thomas Monjalon  
> wrote:
> 
> Hi Keith,
> 
> 2017-02-09 17:42, Keith Wiles:
>> The coremask option in DPDK is difficult to use and we should be
>> promoting the use of the corelist (-l) option. The patch series
>> adjusts the docs to use -l EAL option instead of the -c option.
>> 
>> The patch series doc change only and is not required to be done
>> in 17.02 release, but should be added to the 17.05 release.
>> The -c option will be kept and not removed for now unless in the
>> future we decide to deprecate the code.
>> 
>> v2 - Fix taskset back to using -c
>> 
>> Keith Wiles (11):
>>  doc/cryptodev: use corelist instead of coremask
>>  doc/faq: use corelist instead of coremask
>>  doc/freebsd: use corelist instead of coremask
>>  doc/howto: use corelist instead of coremask
>>  doc/linux: use corelist instead of coremask
>>  doc/nics: use corelist instead of coremask
>>  doc/prog_guide: use corelist instead of coremask
>>  doc/testpmd: use corelist instead of coremask
>>  doc/cryptoperf: use corelist instead of coremask
>>  doc/xen: use corelist instead of coremask
>>  doc/sample_app: use corelist instead of coremask
> 
> In case you make new revisions, I think you can squash every patches
> in a single one. They are all doing the same thing in different files.

I have been known to squash patches before, so did not want to do it again :-)

Will submit a single patch as v3.



Regards,
Keith
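
To illustrate how the two notations relate, here is a rough sketch that
turns a corelist string into the equivalent mask. It is not the EAL
parser: the function name is made up, it only handles plain "a", "a-b"
and comma-separated forms, and it is limited to core ids below 64.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 0 and fills *mask on success, -1 on malformed input. */
static int
corelist_to_mask_sketch(const char *list, uint64_t *mask)
{
	uint64_t m = 0;
	const char *p = list;

	if (list == NULL || *list == '\0')
		return -1;

	while (*p != '\0') {
		char *end;
		unsigned long lo, hi;

		if (*p < '0' || *p > '9')
			return -1;
		lo = strtoul(p, &end, 10);
		hi = lo;
		p = end;

		if (*p == '-') {	/* range "lo-hi" */
			p++;
			if (*p < '0' || *p > '9')
				return -1;
			hi = strtoul(p, &end, 10);
			p = end;
		}
		if (lo > hi || hi >= 64)
			return -1;
		for (; lo <= hi; lo++)
			m |= UINT64_C(1) << lo;

		if (*p == ',') {
			p++;
			if (*p == '\0')	/* trailing comma */
				return -1;
		} else if (*p != '\0') {
			return -1;
		}
	}
	*mask = m;
	return 0;
}
```

For example, the list "0-3" maps to the mask 0xf and "1,3,5" to 0x2a,
which shows why the list form is easier to read and type.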



Re: [dpdk-dev] [PATCH v3] crypto/scheduler: fix initialization

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Zhang, Roy Fan
> Sent: Thursday, February 09, 2017 6:50 PM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo
> Subject: [PATCH v3] crypto/scheduler: fix initialization
> 
> Fixes the wrong slave initialization issue on start-up
> 
> Fixes: 100e4f7e44ab ("crypto/scheduler: add round-robin mode")
> 
> Signed-off-by: Fan Zhang 

Acked-by: Pablo de Lara 


Re: [dpdk-dev] [PATCH] doc: annouce ABI change for cryptodev ops structure

2017-02-10 Thread Trahe, Fiona
Hi Fan,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Fan Zhang
> Sent: Friday, February 10, 2017 11:39 AM
> To: dev@dpdk.org
> Cc: De Lara Guarch, Pablo 
> Subject: [dpdk-dev] [PATCH] doc: annouce ABI change for cryptodev ops
> structure
> 
> Signed-off-by: Fan Zhang 
> ---
>  doc/guides/rel_notes/deprecation.rst | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 755dc65..564d93a 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -62,3 +62,7 @@ Deprecation Notices
>PMDs that implement the latter.
>Target release for removal of the legacy API will be defined once most
>PMDs have switched to rte_flow.
> +
> +* ABI changes are planned for 17.05 in the ``rte_cryptodev_ops`` structure.
> +  The field ``cryptodev_configure_t`` function prototype will be added a
> +  parameter of a struct rte_cryptodev_config type pointer.
> --
> 2.7.4

Can you fix the grammar here, please? I'm not sure what the change is.


[dpdk-dev] [PATCH v3] doc: use corelist instead of coremask

2017-02-10 Thread Keith Wiles
The coremask option in DPDK is difficult to use and we should be
promoting the use of the corelist (-l) option. The patch
adjusts the docs to use -l EAL option instead of the -c option.

The patch only changes the docs and not the code as the -c option
will continue to exist unless it is removed in the future. The -c
option should be kept to maintain backward compatibility.
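
As a rough illustration of how the two options relate (a sketch only, not
part of the patch): -c takes a hexadecimal bitmask in which bit N selects
core N, while -l takes the core numbers themselves. The helper name
coremask_to_corelist() below is hypothetical and does not exist in DPDK; it
only shows the conversion a user otherwise has to do by hand.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Convert a 64-bit core mask (the value passed to -c, e.g. 0x40) into a
 * comma-separated core list of the kind accepted by -l (e.g. "6").
 * Hypothetical helper for illustration; not a DPDK function.
 * Returns the number of cores set, or -1 if buf is too small.
 */
static int
coremask_to_corelist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (unsigned int core = 0; core < 64; core++) {
		/* Skip cores whose bit is not set in the mask. */
		if (!(mask & (UINT64_C(1) << core)))
			continue;

		/* Append ",<core>" (no comma before the first entry). */
		int n = snprintf(buf + used, len - used, "%s%u",
				 count ? "," : "", core);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
		count++;
	}
	return count;
}
```

With this helper, a mask of 0x40 yields the list "6" and a mask of 0xf
yields "0,1,2,3", which is why the list form tends to be easier to read
and to type correctly.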

v3 - squash all of the changes into a single commit.
v2 - reset changes to taskset back to using -c.

Signed-off-by: Keith Wiles 
---
 doc/guides/cryptodevs/aesni_gcm.rst|  2 +-
 doc/guides/cryptodevs/aesni_mb.rst |  2 +-
 doc/guides/cryptodevs/kasumi.rst   |  2 +-
 doc/guides/cryptodevs/null.rst |  2 +-
 doc/guides/cryptodevs/openssl.rst  |  2 +-
 doc/guides/cryptodevs/snow3g.rst   |  2 +-
 doc/guides/cryptodevs/zuc.rst  |  2 +-
 doc/guides/faq/faq.rst |  4 ++-
 doc/guides/freebsd_gsg/build_sample_apps.rst   |  9 ---
 doc/guides/freebsd_gsg/install_from_ports.rst  |  2 +-
 doc/guides/howto/flow_bifurcation.rst  |  2 +-
 doc/guides/howto/lm_bond_virtio_sriov.rst  |  2 +-
 doc/guides/howto/lm_virtio_vhost_user.rst  |  2 +-
 doc/guides/linux_gsg/build_sample_apps.rst | 19 +++---
 doc/guides/linux_gsg/nic_perf_intel_platform.rst   |  2 +-
 doc/guides/linux_gsg/quick_start.rst   |  2 +-
 doc/guides/nics/bnx2x.rst  |  2 +-
 doc/guides/nics/cxgbe.rst  |  4 +--
 doc/guides/nics/ena.rst|  2 +-
 doc/guides/nics/i40e.rst   |  4 +--
 doc/guides/nics/intel_vf.rst   |  7 +++--
 doc/guides/nics/ixgbe.rst  |  4 +--
 doc/guides/nics/mlx4.rst   |  2 +-
 doc/guides/nics/mlx5.rst   |  2 +-
 doc/guides/nics/pcap_ring.rst  | 14 +-
 doc/guides/nics/qede.rst   |  2 +-
 doc/guides/nics/szedata2.rst   |  2 +-
 doc/guides/nics/thunderx.rst   |  2 +-
 doc/guides/nics/vhost.rst  |  2 +-
 doc/guides/nics/virtio.rst |  6 ++---
 doc/guides/prog_guide/kernel_nic_interface.rst |  2 +-
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  | 10 
 doc/guides/prog_guide/multi_proc_support.rst   |  2 +-
 doc/guides/sample_app_ug/cmd_line.rst  |  2 +-
 doc/guides/sample_app_ug/dist_app.rst  |  2 +-
 doc/guides/sample_app_ug/exception_path.rst|  4 +--
 doc/guides/sample_app_ug/hello_world.rst   |  2 +-
 doc/guides/sample_app_ug/intel_quickassist.rst |  2 +-
 doc/guides/sample_app_ug/ip_frag.rst   |  4 +--
 doc/guides/sample_app_ug/ip_reassembly.rst |  4 +--
 doc/guides/sample_app_ug/ipv4_multicast.rst|  2 +-
 doc/guides/sample_app_ug/keep_alive.rst|  2 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |  4 +--
 doc/guides/sample_app_ug/l2_forward_cat.rst|  6 ++---
 doc/guides/sample_app_ug/l2_forward_crypto.rst |  2 +-
 doc/guides/sample_app_ug/l2_forward_job_stats.rst  |  2 +-
 .../sample_app_ug/l2_forward_real_virtual.rst  |  2 +-
 .../sample_app_ug/l3_forward_access_ctrl.rst   |  2 +-
 doc/guides/sample_app_ug/link_status_intr.rst  |  2 +-
 doc/guides/sample_app_ug/load_balancer.rst |  2 +-
 doc/guides/sample_app_ug/multi_process.rst | 30 +++---
 doc/guides/sample_app_ug/performance_thread.rst| 20 +++
 doc/guides/sample_app_ug/ptpclient.rst |  2 +-
 doc/guides/sample_app_ug/qos_scheduler.rst |  6 ++---
 doc/guides/sample_app_ug/quota_watermark.rst   |  2 +-
 doc/guides/sample_app_ug/rxtx_callbacks.rst|  2 +-
 doc/guides/sample_app_ug/skeleton.rst  |  2 +-
 doc/guides/sample_app_ug/tep_termination.rst   | 14 +-
 doc/guides/sample_app_ug/test_pipeline.rst |  2 +-
 doc/guides/sample_app_ug/timer.rst |  2 +-
 doc/guides/sample_app_ug/vhost.rst |  4 +--
 doc/guides/sample_app_ug/vm_power_management.rst   |  4 +--
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |  2 +-
 doc/guides/testpmd_app_ug/run_app.rst  |  2 +-
 doc/guides/tools/cryptoperf.rst|  5 ++--
 doc/guides/xen/pkt_switch.rst  | 10 
 66 files changed, 145 insertions(+), 137 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst b/doc/guides/cryptodevs/aesni_gcm.rst
index e4b4108..ba9ecb5 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -84,7 +84,7 @@ Example:
 
 .. code-block:: console
 
-./l2fwd-crypto -c 40 -n 4 --vdev="crypto_aesni_gcm,socket_id=1,max_nb_sessions=128"
+./l2fwd-crypto -

[dpdk-dev] [PATCH 0/2] ethdev: abstraction layer for QoS hierarchical scheduler

2017-02-10 Thread Cristian Dumitrescu
This patch set introduces an ethdev-based abstraction layer for Quality of
Service (QoS) hierarchical scheduler. The goal is to provide a simple generic
API that is agnostic of the underlying HW, SW or mixed HW-SW implementation.

Patch 1 builds on the mechanism introduced by rte_flow in DPDK and generalizes
it to make it available for other ethdev features/capabilities (such as the
hierarchical scheduler). The goal is to define a plugin-like mechanism to extend
the ethdev functionality in a modular way as opposed to the current monolithic
approach.

Patch 2 introduces the generic ethdev API for hierarchical scheduler using the
above plugin-like mechanism for ethdev.

Cristian Dumitrescu (2):
  ethdev: add capability control API
  ethdev: add hierarchical scheduler API

 MAINTAINERS|4 +
 lib/librte_ether/Makefile  |5 +-
 lib/librte_ether/rte_ethdev.c  |   13 +
 lib/librte_ether/rte_ethdev.h  |   29 +
 lib/librte_ether/rte_ether_version.map |   37 +
 lib/librte_ether/rte_scheddev.c|  790 
 lib/librte_ether/rte_scheddev.h| 1273 
 lib/librte_ether/rte_scheddev_driver.h |  374 ++
 8 files changed, 2524 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ether/rte_scheddev.c
 create mode 100644 lib/librte_ether/rte_scheddev.h
 create mode 100644 lib/librte_ether/rte_scheddev_driver.h

-- 
2.5.0



[dpdk-dev] [PATCH 1/2] ethdev: add capability control API

2017-02-10 Thread Cristian Dumitrescu
The rte_flow feature breaks the current monolithic approach for ethdev and
introduces the new generic flow API to ethdev using a plugin-like approach.

Basically, the rte_flow API is still logically part of ethdev:
- It extends the ethdev functionality: rte_flow is a new feature/capability
  of ethdev;
- all its functions work on an Ethernet device: the first parameter of the
  rte_flow functions is Ethernet device port ID.

At the same time, the rte_flow API is a sort of capability plugin for ethdev:
- the rte_flow API functions have their own name space: they are called
  rte_flow_operationXYZ() as opposed to rte_eth_dev_flow_operationXYZ();
- the rte_flow API functions are placed in separate files in the same
  librte_ether folder as opposed to rte_ethdev.[hc].

The way it works is by using the existing ethdev API function
rte_eth_dev_filter_ctrl() to query the current Ethernet device port ID for the
support of the rte_flow capability and return the pointer to the
rte_flow operations when supported and NULL otherwise:

struct rte_flow_ops *eth_flow_ops;
int rte = rte_eth_dev_filter_ctrl(eth_port_id,
RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, &eth_flow_ops);

Unfortunately, the rte_flow opportunistically uses the rte_eth_dev_filter_ctrl()
API function, which is applicable just to RX-side filters as opposed to
introducing a mechanism that could be used by any capability in a generic way.

This is the gap that is addressed by the current patch. This mechanism is intended
to be used to introduce new capabilities into ethdev in a modular plugin-like
approach, such as hierarchical scheduler. Over time, if agreed, it can also be
used for exposing the existing Ethernet device capabilities in a modular way,
such as: xstats, filters, multicast, mirroring, tunnels, time stamping, eeprom,
bypass, etc.
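
To make the plugin-like dispatch easier to picture, here is a minimal
standalone sketch of the idea. It uses mock types (mock_eth_dev,
mock_sched_ops, mock_cap_ctrl and mock_capability_control are all
hypothetical names) instead of the real DPDK headers, and it assumes that
the driver callback hands back a pointer to a per-capability operations
table through arg, in the same spirit as the rte_flow query shown above.
Only the enum values and the cap_ctrl callback shape are taken from the
patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Capability identifiers, named as in the patch. */
enum rte_eth_capability {
	RTE_ETH_CAPABILITY_FLOW = 0,
	RTE_ETH_CAPABILITY_SCHED,
	RTE_ETH_CAPABILITY_MAX
};

struct mock_eth_dev;

/* Same shape as the eth_capability_control_t callback in the patch. */
typedef int (*eth_capability_control_t)(struct mock_eth_dev *dev,
		enum rte_eth_capability cap, void *arg);

/* Mock device: only the callback that matters for this sketch. */
struct mock_eth_dev {
	eth_capability_control_t cap_ctrl; /* NULL if the driver has none */
};

/* Hypothetical operations table for the scheduler capability. */
struct mock_sched_ops {
	int (*node_add)(uint32_t node_id);
};

static int
mock_node_add(uint32_t node_id)
{
	(void)node_id;
	return 0;
}

static const struct mock_sched_ops mock_sched_ops_tbl = {
	.node_add = mock_node_add,
};

/* Driver side: return the ops table for the capabilities it supports. */
static int
mock_cap_ctrl(struct mock_eth_dev *dev, enum rte_eth_capability cap,
	      void *arg)
{
	(void)dev;
	if (arg == NULL)
		return -EINVAL;
	if (cap != RTE_ETH_CAPABILITY_SCHED)
		return -ENOTSUP;
	*(const struct mock_sched_ops **)arg = &mock_sched_ops_tbl;
	return 0;
}

/* Generic entry point, following the shape of the function in the patch. */
static int
mock_capability_control(struct mock_eth_dev *dev,
			enum rte_eth_capability cap, void *arg)
{
	if (dev == NULL)
		return -ENODEV;
	if (dev->cap_ctrl == NULL)
		return -ENOTSUP;
	return dev->cap_ctrl(dev, cap, arg);
}
```

An application would call mock_capability_control() with
RTE_ETH_CAPABILITY_SCHED and the address of an ops pointer, then go
through the returned table. A driver without the callback, or without
that capability, reports -ENOTSUP.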

Signed-off-by: Cristian Dumitrescu 
---
 lib/librte_ether/rte_ethdev.c  | 13 +
 lib/librte_ether/rte_ethdev.h  | 29 +
 lib/librte_ether/rte_ether_version.map |  7 +++
 3 files changed, 49 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index eb0a94a..ae187c4 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2802,6 +2802,19 @@ rte_eth_dev_filter_ctrl(uint8_t port_id, enum rte_filter_type filter_type,
return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
 }
 
+int
+rte_eth_dev_capability_control(uint8_t port_id, enum rte_eth_capability cap,
+   void *arg)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->cap_ctrl, -ENOTSUP);
+   return (*dev->dev_ops->cap_ctrl)(dev, cap, arg);
+}
+
 void *
 rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
rte_rx_callback_fn fn, void *user_param)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c17bbda..43ffb9e 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1073,6 +1073,12 @@ TAILQ_HEAD(rte_eth_dev_cb_list, rte_eth_dev_callback);
  * structure associated with an Ethernet device.
  */
 
+enum rte_eth_capability {
+   RTE_ETH_CAPABILITY_FLOW = 0, /**< Flow */
+   RTE_ETH_CAPABILITY_SCHED, /**< Hierarchical Scheduler */
+   RTE_ETH_CAPABILITY_MAX
+};
+
 typedef int  (*eth_dev_configure_t)(struct rte_eth_dev *dev);
 /**< @internal Ethernet device configuration. */
 
@@ -1427,6 +1433,10 @@ typedef int (*eth_filter_ctrl_t)(struct rte_eth_dev *dev,
 void *arg);
 /**< @internal Take operations to assigned filter type on an Ethernet device */
 
+typedef int (*eth_capability_control_t)(struct rte_eth_dev *dev,
+   enum rte_eth_capability cap, void *arg);
+/**< @internal Take capability operations on an Ethernet device */
+
 typedef int (*eth_get_dcb_info)(struct rte_eth_dev *dev,
 struct rte_eth_dcb_info *dcb_info);
 /**< @internal Get dcb information on an Ethernet device */
@@ -1548,6 +1558,8 @@ struct eth_dev_ops {
eth_timesync_adjust_time   timesync_adjust_time; /** Adjust the device clock. */
eth_timesync_read_time timesync_read_time; /** Get the device clock time. */
eth_timesync_write_time timesync_write_time; /** Set the device clock time. */
+
+   eth_capability_control_t   cap_ctrl; /**< capability control. */
 };
 
 /**
@@ -3890,6 +3902,23 @@ int rte_eth_dev_filter_ctrl(uint8_t port_id, enum rte_filter_type filter_type,
enum rte_filter_op filter_op, void *arg);
 
 /**
+ * Take capability operations on an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param cap
+ *   The capability of the Ethernet device
+ * @param arg
+ *   A pointer to arguments defined specifically for the operation.
+ * @return
+ *

[dpdk-dev] [PATCH 2/2] ethdev: add hierarchical scheduler API

2017-02-10 Thread Cristian Dumitrescu
This patch introduces the generic ethdev API for the hierarchical scheduler
capability.

Main features:
- Exposed as ethdev plugin capability (similar to rte_flow approach)
- Capability query API per port and per hierarchy node
- Scheduling algorithms: strict priority (SP), Weighted Fair Queuing (WFQ),
  Weighted Round Robin (WRR)
- Traffic shaping: single/dual rate, private (per node) and shared (by multiple
  nodes) shapers
- Congestion management for hierarchy leaf nodes: algorithms of tail drop,
  head drop, WRED; private (per node) and shared (by multiple nodes) WRED
  contexts
- Packet marking: IEEE 802.1q (VLAN DEI), IETF RFC 3168 (IPv4/IPv6 ECN for
  TCP and SCTP), IETF RFC 2597 (IPv4 / IPv6 DSCP)

Changes since RFC [1]:
- Implemented as ethdev plugin (similar to rte_flow) as opposed to more
  monolithic additions to ethdev itself
- Implemented feedback from Jerin [2] and Hemant [3]. Implemented all the
  suggested items with only one exception, see the long list below, hopefully
  nothing was forgotten.
- The item not done (hopefully for a good reason): driver-generated object
  IDs. IMO the choice to have application-generated object IDs adds marginal
  complexity to the driver (search ID function required), but it provides
  huge simplification for the application. The app does not need to worry
  about building & managing tree-like structure for storing driver-generated
  object IDs, the app can use its own convention for node IDs depending on
  the specific hierarchy that it needs. Trivial example: identify all
  level-2 nodes with IDs like 100, 200, 300, … and the level-3 nodes based
  on their level-2 parents: 110, 120, 130, 140, …, 210, 220, 230, 240, …,
  310, 320, 330, … and level-4 nodes based on their level-3 parents: 111,
  112, 113, 114, …, 121, 122, 123, 124, …. Moreover, see the change log for
  the other related simplification that was implemented: leaf nodes now have
  predefined IDs that are the same as their Ethernet TX queue ID
  (therefore no translation is required for leaf nodes).
- Capability API. Done per port and per node as well.
- Dual rate shapers
- Added configuration of private shaper (per node) directly from the shaper
  profile as part of node API (no shaper ID needed for private shapers), while
  the shared shapers are configured outside of the node API using shaper profile
  and communicated to the node using shared shaper ID. So there is no
  configuration overhead for shared shapers if the app does not use any of them.
- Leaf nodes now have predefined IDs that are the same as their Ethernet TX
  queue ID (therefore no translation is required for leaf nodes). This is also
  used to differentiate between a leaf node and a non-leaf node.
- Domain-specific errors to give a precise indication of the error cause (same
  as done by rte_flow)
- Packet marking API
- Packet length optional adjustment for shapers, positive (e.g. for adding
  Ethernet framing overhead of 20 bytes) or negative (e.g. for rate limiting
  based on IP packet bytes)
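
The application-generated node ID convention from the trivial example above
can be sketched as follows. This is only an illustration of that one
convention (level-2 IDs X00, level-3 IDs XY0, level-4 IDs XYZ, with digits
1..9); the names node_level(), node_parent() and NODE_ID_NONE are
hypothetical and are not part of the proposed API. It also ignores the
separate rule that leaf node IDs equal their TX queue IDs.

```c
#include <stdint.h>

/* Hypothetical marker for "no parent under this convention". */
#define NODE_ID_NONE UINT32_MAX

/*
 * Return the hierarchy level (2, 3 or 4) encoded in a node ID under the
 * example convention, or -1 if the ID does not follow the convention.
 */
static int
node_level(uint32_t id)
{
	uint32_t hundreds = id / 100;
	uint32_t tens = (id / 10) % 10;
	uint32_t ones = id % 10;

	if (id > 999 || hundreds == 0)
		return -1;
	if (tens == 0)
		return ones == 0 ? 2 : -1; /* X00 is level 2, X0Z is invalid */
	return ones == 0 ? 3 : 4;          /* XY0 is level 3, XYZ is level 4 */
}

/*
 * Derive the parent node ID purely from the child ID, with no lookup
 * table. Level-2 nodes and invalid IDs yield NODE_ID_NONE here.
 */
static uint32_t
node_parent(uint32_t id)
{
	switch (node_level(id)) {
	case 4:
		return id - id % 10;  /* 121 -> 120 */
	case 3:
		return id - id % 100; /* 120 -> 100 */
	default:
		return NODE_ID_NONE;
	}
}
```

The point of the convention is visible in node_parent(): the application
can compute relationships from the IDs it chose itself, instead of storing
a tree of driver-generated IDs.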

Next steps:
- SW fallback based on librte_sched library (to be later introduced by
  standalone patch set)

[1] RFC: http://dpdk.org/ml/archives/dev/2016-November/050956.html
[2] Jerin’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054484.html
[3] Hemant’s feedback on RFC: http://www.dpdk.org/ml/archives/dev/2017-January/054866.html

Signed-off-by: Cristian Dumitrescu 
---
 MAINTAINERS|4 +
 lib/librte_ether/Makefile  |5 +-
 lib/librte_ether/rte_ether_version.map |   30 +
 lib/librte_ether/rte_scheddev.c|  790 
 lib/librte_ether/rte_scheddev.h| 1273 
 lib/librte_ether/rte_scheddev_driver.h |  374 ++
 6 files changed, 2475 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_ether/rte_scheddev.c
 create mode 100644 lib/librte_ether/rte_scheddev.h
 create mode 100644 lib/librte_ether/rte_scheddev_driver.h

diff --git a/MAINTAINERS b/MAINTAINERS
index cc3bf98..666931d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -247,6 +247,10 @@ Flow API
 M: Adrien Mazarguil 
 F: lib/librte_ether/rte_flow*
 
+SchedDev API
+M: Cristian Dumitrescu 
+F: lib/librte_ether/rte_scheddev*
+
 Crypto API
 M: Declan Doherty 
 F: lib/librte_cryptodev/
diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index 1d095a9..7e0527f 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -1,6 +1,6 @@
 #   BSD LICENSE
 #
-#   Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -45,6 +45,7 @@ LIBABIVER := 6
 
 SRCS-y += rte_ethdev.c
 SRCS-y += rte_flow.c
+SRCS-y += rte_scheddev.c
 
 #
 # Export include files
@@ -54,6 +55,8 @@ SYMLINK-y-include += rt

Re: [dpdk-dev] [PATCH] crypto/scheduler: fix session backup

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 1:37 PM
> To: Zhang, Roy Fan; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] crypto/scheduler: fix session backup
> 
> 
> 
> > -Original Message-
> > From: Zhang, Roy Fan
> > Sent: Thursday, February 09, 2017 6:46 PM
> > To: dev@dpdk.org
> > Cc: De Lara Guarch, Pablo
> > Subject: [PATCH] crypto/scheduler: fix session backup
> >
> > Fixes the missed session backup during enqueue.
> >
> > Fixes: 100e4f7e44ab ("crypto/scheduler: add round-robin mode")
> >
> > Signed-off-by: Fan Zhang 
> > ---
> 
> Acked-by: Pablo de Lara 

Applied to dpdk-next-crypto.
Thanks,

Pablo


Re: [dpdk-dev] [PATCH v3] crypto/scheduler: fix initialization

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of De Lara Guarch,
> Pablo
> Sent: Friday, February 10, 2017 1:49 PM
> To: Zhang, Roy Fan; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3] crypto/scheduler: fix initialization
> 
> 
> 
> > -Original Message-
> > From: Zhang, Roy Fan
> > Sent: Thursday, February 09, 2017 6:50 PM
> > To: dev@dpdk.org
> > Cc: De Lara Guarch, Pablo
> > Subject: [PATCH v3] crypto/scheduler: fix initialization
> >
> > Fixes the wrong slave initialization issue on start-up
> >
> > Fixes: 100e4f7e44ab ("crypto/scheduler: add round-robin mode")
> >
> > Signed-off-by: Fan Zhang 
> 
> Acked-by: Pablo de Lara 

Applied to dpdk-next-crypto.
Thanks,

Pablo


Re: [dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version support

2017-02-10 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Jain, Deepak K
> Sent: Friday, February 10, 2017 1:01 PM
> To: De Lara Guarch, Pablo; Mcnamara, John
> Cc: dev@dpdk.org; De Lara Guarch, Pablo; Doherty, Declan
> Subject: RE: [dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version
> support
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Pablo de Lara
> > Sent: Friday, February 10, 2017 12:34 PM
> > To: dec...@intel.com; Mcnamara, John 
> > Cc: dev@dpdk.org; De Lara Guarch, Pablo
> 
> > Subject: [dpdk-dev] [PATCH] doc: clarify Multi-Buffer library version
> support
> >
> > AES-NI MB PMD uses external Multi-Buffer library, which is hosted in
> github,
> > but the version was not specified in the documentation.
> >
> > Signed-off-by: Pablo de Lara 

...

> Acked-by: Deepak Kumar Jain

Applied to dpdk-next-crypto.

Pablo


Re: [dpdk-dev] [dpdk-techboard] decision process and DPDK scope

2017-02-10 Thread Bruce Richardson
On Thu, Feb 09, 2017 at 02:49:05PM -0800, Stephen Hemminger wrote:
> On Thu, 9 Feb 2017 12:20:47 +
> Bruce Richardson  wrote:
> 
> > > I think we can use this case to avoid seeing it again in the future.
> > > I suggest that the technical board should check whether every new proposed
> > > features are explained, discussed and approved enough in the community.
> > > If needed, the technical board meeting minutes will give some lights to
> > > the threads which require more attention.
> > > Before adding a new library or adding a major API, there should be
> > > some strong reviews which include discussing the DPDK scope.
> > >   
> > 
> > The bigger question here is the default position of the DPDK community -
> > default accept, or default reject. Your statements above all are very
> > much keeping in the style of default reject i.e. every patch or change
> > suggested is assumed to be unfit for acceptance unless reviewed in
> > detail to prove beyond doubt otherwise.
> > 
> > I believe that we should change this default position, as I think that
> > reject by default is hurting the community and will continue to do so.
> > 
> > NOTE: I am not suggesting that we allow all code in with zero review,
> > but I am suggesting that if something has been reviewed and acked by at
> > least one reviewer it should be automatically accepted unless some other
> > reviewer objects in a TIMELY manner.
> 
> I agree but in a more assertive manner. The maintainer should be the default
> and active reviewer of all submissions. Like other projects the maintainer's
> job is to review and accept (or provide constructive feedback). Otherwise the
> job could just be done by some manager.
> 
> But recently, I have changed my mind. The current DPDK project model is not
> scaling well. After hearing some of the arguments in favor of a multiple
> committer model (see "Maintainers Don't Scale" )
> https://kernel-recipes.org/en/2016/talks/maintainers-dont-scale/
> 
> And comments on lwn:
> https://lwn.net/Articles/703005/
> 
Might it be worthwhile to try out having 2 or 3 committers to each tree
and see how it works? From the presentation you link to, the claim is
that moving from 1 to 2 is the hardest, and expanding beyond that
becomes easier.

/Bruce


Re: [dpdk-dev] [PATCH] net/bonding: improve non-ip packets RSS

2017-02-10 Thread Declan Doherty

On 18/11/16 09:08, haifeng.lin at huawei.com (Haifeng Lin) wrote:

Most Ethernet devices do not support RSS for non-IP packets, so only the
first queue can be used to receive them. In this scenario an LACP bond can
only use one queue even if multiple queues are configured.

We use the formula below to change the mapping between bond_qid and
slave_qid so that at least slave_num queues receive packets:

slave_qid = (bond_qid + slave_id) % queue_num
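
A small standalone sketch of this mapping may help when reading the diff.
The function name slave_queue_id() is hypothetical; the expression mirrors
the one added by the patch, including the fallback to bond_qid when no RX
queues are configured.

```c
/*
 * Map a bond RX queue to the slave RX queue that would be polled,
 * following slave_qid = (bond_qid + slave_id) % queue_num.
 * If queue_num is 0 the bond queue ID is returned unchanged, which
 * avoids a modulo by zero.
 */
static int
slave_queue_id(int bond_qid, int slave_id, int queue_num)
{
	return queue_num ? (bond_qid + slave_id) % queue_num : bond_qid;
}
```

For example, with 4 queues, polling bond queue 0 reads slave queue 0 on
slave 0, queue 1 on slave 1 and queue 2 on slave 2. This also means a bond
queue no longer maps to the same queue number on every slave.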

Signed-off-by: Haifeng Lin 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index 09ce7bf..8ad843a 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -141,6 +141,8 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
uint8_t collecting;  /* current slave collecting status */
const uint8_t promisc = internals->promiscuous_en;
uint8_t i, j, k;
+   int slave_qid, bond_qid = bd_rx_q->queue_id;
+   int queue_num = internals->nb_rx_queues;

rte_eth_macaddr_get(internals->port_id, &bond_mac);
/* Copy slave list to protect against slave up/down changes during tx
@@ -154,7 +156,9 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs,
collecting = ACTOR_STATE(&mode_8023ad_ports[slaves[i]], COLLECTING);

/* Read packets from this slave */
-   num_rx_total += rte_eth_rx_burst(slaves[i], bd_rx_q->queue_id,
+   slave_qid = queue_num ? (bond_qid + slaves[i]) % queue_num :
+   bond_qid;
+   num_rx_total += rte_eth_rx_burst(slaves[i], slave_qid,
&bufs[num_rx_total], nb_pkts - num_rx_total);

for (k = j; k < 2 && k < num_rx_total; k++)



Nack, I think this could introduce unexpected behaviour, as packets could
then be read from a different slave queue than the queue id specified by
the calling function, where the expected behaviour is that there is a
1:1 queue mapping from bond to slave queues. If RSS is needed for
ethdevs which don't support it natively, I think the appropriate solution
is to create a software RSS solution which can be enabled at the slave
ethdev level itself. I don't think the bonding layer should be
implementing this functionality.


Re: [dpdk-dev] [PATCH] eal: fix bug in x86 cmpset

2017-02-10 Thread Stephen Hemminger
On Fri, 10 Feb 2017 11:53:06 +0100
Thomas Monjalon  wrote:

> 2017-02-10 10:39, Hunt, David:
> > 
> > On 9/2/2017 4:53 PM, Thomas Monjalon wrote:  
> > > 2016-11-06 22:09, Thomas Monjalon:  
> > >> 2016-09-29 18:34, Thomas Monjalon:  
> > >>> 2016-09-30 02:54, Nikhil Rao:  
> > >>>> The original code used movl instead of xchgl, this caused
> > >>>> rte_atomic64_cmpset to use ebx as the lower dword of the source
> > >>>> to cmpxchg8b instead of the lower dword of function argument "src".  
> > >>> Could you please start the explanation with a statement of
> > >>> what is wrong from an user point of view?
> > >>> It could help to understand how severe it is.  
> > >> Please, we need a clear explanation of the bug, and an acknowledgement.  
> > > Should we close this bug?  
> > 
> > I took a few minutes to look at this, and the issue can easily be 
> > reproduced with a small snippet of code.
> > With the 'mov', the lower dword of the result is incorrect. This is 
> > resolved by using 'xchgl'.
> > 
> > void main()
> > {
> >  uint64_t a = 0xff00ff;
> > 
> >  rte_atomic64_cmpset( &a, 0xff00ff, 0xfa00fa);
> >  printf("0x%lx\n", a);
> > }
> > 
> > When using 'mov', the result is 0xfa
> > When using 'xchgl', the result is 0xfa00fa, as expected.  
> 
> This operation is used a lot in drivers for link status.
> 
> I think we need to clearly explain what was the consequence of this bug.


A bigger issue is why there are a huge number of copies of the same link code
in drivers. It definitely should be common code. Also, why is cmpset used here
when a simple atomic_set would work as well for what was intended?
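
For reference, the semantics a 64-bit compare-and-set is expected to have
can be sketched portably with C11 atomics. This is not DPDK's
rte_atomic64_cmpset() implementation (which uses inline assembly on 32-bit
x86); cmpset64() is a hypothetical stand-in with the same dst/exp/src
argument order, and the test values in the note below are chosen for
illustration rather than taken from the thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically: if (*dst == exp) { *dst = src; return non-zero; }
 * otherwise leave *dst untouched and return 0.
 * All 64 bits of src must end up in *dst on success, which is exactly
 * what the reported bug violated for the lower dword.
 */
static int
cmpset64(_Atomic uint64_t *dst, uint64_t exp, uint64_t src)
{
	/* exp is a local copy, so the caller's value is never modified. */
	return atomic_compare_exchange_strong(dst, &exp, src);
}
```

A check that exercises both halves of the value is to start from
0xff000000ff, compare against that same value and set 0xfa000000fa: a
correct implementation leaves 0xfa000000fa in the destination, with both
the upper and the lower dword taken from src.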


[dpdk-dev] [dpdk-announce] release candidate 17.02-rc3

2017-02-10 Thread Thomas Monjalon
A new DPDK release candidate is ready for testing:
http://dpdk.org/browse/dpdk/tag/?id=v17.02-rc3

It is February 10th, the third RC is out, bringing a lot of fixes.
A new release must be available every three months, preferably
at the beginning of the month. It means it is time to close the 17.02.

The last step is to check the release notes and acknowledge the
deprecation notices, preparing the next release.

Please, let's target Tuesday 14th as the release date.
There is a lot of work planned for 17.05.


Re: [dpdk-dev] [dpdk-techboard] decision process and DPDK scope

2017-02-10 Thread Thomas Monjalon
2017-02-10 15:54, Bruce Richardson:
> On Thu, Feb 09, 2017 at 02:49:05PM -0800, Stephen Hemminger wrote:
> > On Thu, 9 Feb 2017 12:20:47 +
> > Bruce Richardson  wrote:
> > 
> > > > I think we can use this case to avoid seeing it again in the future.
> > > > I suggest that the technical board should check whether every new proposed
> > > > features are explained, discussed and approved enough in the community.
> > > > If needed, the technical board meeting minutes will give some lights to
> > > > the threads which require more attention.
> > > > Before adding a new library or adding a major API, there should be
> > > > some strong reviews which include discussing the DPDK scope.
> > > >   
> > > 
> > > The bigger question here is the default position of the DPDK community -
> > > default accept, or default reject. Your statements above all are very
> > > much keeping in the style of default reject i.e. every patch or change
> > > suggested is assumed to be unfit for acceptance unless reviewed in
> > > detail to prove beyond doubt otherwise.
> > > 
> > > I believe that we should change this default position, as I think that
> > > reject by default is hurting the community and will continue to do so.

It is hurting because there is no clear explanation of the process.

> > > NOTE: I am not suggesting that we allow all code in with zero review,
> > > but I am suggesting that if something has been reviewed and acked by at
> > > least one reviewer it should be automatically accepted unless some other
> > > reviewer objects in a TIMELY manner.

I see an issue with "automatic" decision after a period of time.
It puts a lot of pressure on the community to check everything.
I agree we should state this kind of default. But we should add two
exceptions:
- new API or API change
- a maintainer explicitly asks for a techboard discussion


> > I agree but in a more assertive manner. The maintainer should be the default
> > and active reviewer of all submissions. Like other projects the maintainer's
> > job is to review and accept (or provide constructive feedback). Otherwise the
> > job could just be done by some manager.
> > 
> > But recently, I have changed my mind. The current DPDK project model is not
> > scaling well. After hearing some of the arguments in favor of a multiple
> > committer model (see "Maintainers Don't Scale" )
> > https://kernel-recipes.org/en/2016/talks/maintainers-dont-scale/
> > 
> > And comments on lwn:
> > https://lwn.net/Articles/703005/
> > 
> Might it be worthwhile to try out having 2 or 3 committers to each tree
> and see how it works? From the presentation you link to, the claim is
> that moving from 1 to 2 is the hardest, and expanding beyond that
> becomes easier.

I think the first thing to improve is the decision process.
Increasing the number of committers, without agreeing on a clear
decision process, would make things worse.


Re: [dpdk-dev] [PATCH] igb_uio: map dummy dma forcing iommu domain attachment

2017-02-10 Thread Ferruh Yigit
On 2/8/2017 11:54 AM, Alejandro Lucero wrote:
> Hi Ferruh,
> 
> On Tue, Feb 7, 2017 at 3:59 PM, Ferruh Yigit wrote:
> 
> Hi Alejandro,
> 
> On 1/18/2017 12:27 PM, Alejandro Lucero wrote:
> > For using a DPDK app when iommu is enabled, it requires to
> > add iommu=pt to the kernel command line. But using igb_uio driver
> > makes DMAR errors because the device has not an IOMMU domain.
> 
> Please help to understand the scope of the problem,
> 
> 
> After reading your reply, I realize I could have explained it better.
> First of all, this is related to SRIOV, exactly when the VFs are created.
>  
> 
> 1- How can you re-produce the problem?
> 
> 
> Using a VF from a Intel card by a DPDK app in the host and a kernel >=
> 3.15. Although usually VFs are assigned to VMs, it could also be an
> option to use VFs by the host. 
> 
> BTW, I did not try to reproduce the problem with an Intel card. I
> triggered this problem with an NFP, but because of the underlying problem, I
> bet that it is going to happen with an Intel one as well.

I am able to reproduce the problem with ixgbe, by using a VF on the host.

And I verified that your patch fixes it; it causes the device to be attached
to a vfio group.

So, I believe it is good to get this patch, but it is already too late for
the 17.02 release.
I suggest getting this one in early in 17.05, so it gives more time to test.

> 
>  
> 
> 2- What happens when you get DMAR errors: does it prevent the device from
> working, or is it just some annoying error messages?
> 
> 
> A DMAR error implies the device cannot access the DMA address given
> by the host. I have experienced several situations where it is just that
> device not being able to work at all, but it also has more global
> implications and you need to reboot the system because it is unreliable.
> I think it depends on how these DMAR errors are handled, but in any
> case, this is a bad thing.

In my test, the implication was that the device did not work.

>  
> 
> 
> 3- Can you please share the error messages?
> 
> 
> With this problem you can expect something like this:
> 
> [ 559.163874] DMAR: DRHD: handling fault status reg 2
> [ 559.165427] DMAR: DMAR:[DMA Read] Request device [82:08.0] fault addr
> e7b73b000
> [ 559.165427] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 568.367417] DMAR: DRHD: handling fault status reg 102
> [ 568.369025] DMAR: DMAR:[DMA Read] Request device [82:08.1] fault addr
> ebb73b000
> [ 568.369025] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 571.773944] DMAR: DRHD: handling fault status reg 202
> [ 571.775550] DMAR: DMAR:[DMA Read] Request device [82:08.2] fault addr
> efb73b000
> [ 571.775550] DMAR:[fault reason 02] Present bit in context entry is clear
> [ 575.039654] DMAR: DRHD: handling fault status reg 302
> [ 575.041259] DMAR: DMAR:[DMA Read] Request device [82:08.3] fault addr
> f3b73b000
> [ 575.041259] DMAR:[fault reason 02] Present bit in context entry is clear
> 
> There are different DMAR errors, sometimes referring to a specific
> address being wrong. In this case it is related to the device not having
> a context or a IOMMU domain.
> 
> Also note we got these errors for different devices/VFs. This was with a
> DPDK app using several VFs.
>  
> 
> 
> 
> >
> > Since kernel 3.15, iommu=pt requires to use the internal kernel
> > DMA API for attaching the device to the IOMMU 1:1 mapping, aka
> > si_domain. Previous versions did attach the device to that
> > domain when intel iommu notifier was called.
> 
> Again, what is not working since 3.15?
> 
> 
> This specific case, yes. With older kernels, when VFs are created, IOMMU
> code is executed (notifier chain callback) and if iommu=pt, the VF is
> attached to the si_domain, this is the 1:1 mapping. But this has changed
> with newer kernels, and after VFs are created they have no IOMMU domain
> at all. The kernel expects the driver to implicitly create such a domain
> when the kernel DMA API is used.

Thanks again for clarification.
What will be the effect of your patch for kernel < 3.15, should your
update be protected with a kernel version check, or is it safe for all?

>  
> 
> 
> >
> > This is not a problem if the driver does later some call to the
> > DMA API because the mapping can be done then. But DPDK apps do
> > not use that DMA API at all.
> 
> Is this same/similar with:
> http://dpdk.org/dev/patchwork/patch/12654/
> 
> 
>  
> That case was another issue regarding IOMMU and iommu=pt. The problem
> there was when you detach a VF from a VM, but the VF was initially
> attached to the si_domain because the kernel did so. The patch helped to
> attach the VF again to that domain when binding to the UIO.
> 
> Looking at that patch now (I did comment on it then), it just solved the
> problem if the VF was detach form the UIO, something that could be
> easily forgotten or simply not done because, apparently, 

[dpdk-dev] [PATCH] eventdev: clarify nb_unlinks description

2017-02-10 Thread Gage Eads
This commit clarifies the usage of nb_unlinks when passing a NULL pointer
as the queues argument.

Signed-off-by: Gage Eads 
---
 lib/librte_eventdev/rte_eventdev.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..7b64532 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -1336,7 +1336,8 @@ rte_event_port_link(uint8_t dev_id, uint8_t port_id,
  *   event queue(s) from the event port *port_id*.
  *
  * @param nb_unlinks
- *   The number of unlinks to establish
+ *   The number of unlinks to establish. This parameter is ignored if queues is
+ *   NULL.
  *
  * @return
  * The number of unlinks actually established. The return value can be less
-- 
2.7.4
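
To illustrate the clarified wording, here is a minimal standalone sketch of the semantics, not the eventdev implementation. All names (sketch_port, sketch_port_unlink, SKETCH_MAX_QUEUES) are hypothetical, and the assumption that a NULL queues pointer unlinks every established link is taken from the surrounding API description rather than from this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 8 /* hypothetical per-port queue limit */

/* Hypothetical port state: linked[q] != 0 means queue q is linked. */
struct sketch_port {
	uint8_t linked[SKETCH_MAX_QUEUES];
};

/*
 * Returns the number of unlinks actually established.
 * When queues is NULL, nb_unlinks is ignored and every link is removed
 * (assumed behaviour, see the lead-in).
 */
static int
sketch_port_unlink(struct sketch_port *port, const uint8_t queues[],
		uint16_t nb_unlinks)
{
	int done = 0;
	uint16_t i;

	if (queues == NULL) {
		for (i = 0; i < SKETCH_MAX_QUEUES; i++) {
			if (port->linked[i]) {
				port->linked[i] = 0;
				done++;
			}
		}
		return done;
	}

	for (i = 0; i < nb_unlinks; i++) {
		uint8_t q = queues[i];

		/* Stop at the first queue that cannot be unlinked. */
		if (q >= SKETCH_MAX_QUEUES || !port->linked[q])
			break;
		port->linked[q] = 0;
		done++;
	}
	return done;
}
```

With three queues linked, sketch_port_unlink(&port, NULL, 0) removes all three even though nb_unlinks is 0, which is the case the patch documents.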



Re: [dpdk-dev] [PATCH] igb_uio: map dummy dma forcing iommu domain attachment

2017-02-10 Thread Ferruh Yigit
On 2/10/2017 7:03 PM, Ferruh Yigit wrote:
> On 2/8/2017 11:54 AM, Alejandro Lucero wrote:
>> Hi Ferruh,
>>
>> On Tue, Feb 7, 2017 at 3:59 PM, Ferruh Yigit > > wrote:
>>
>> Hi Alejandro,
>>
>> On 1/18/2017 12:27 PM, Alejandro Lucero wrote:
>> > For using a DPDK app when iommu is enabled, it requires to
>> > add iommu=pt to the kernel command line. But using igb_uio driver
>> > makes DMAR errors because the device has not an IOMMU domain.
>>
>> Please help to understand the scope of the problem,
>>
>>
>> After reading your reply, I realize I could have explained it better.
>> First of all, this is related to SRIOV, exactly when the VFs are created.
>>  
>>
>> 1- How can you re-produce the problem?
>>
>>
>> Using a VF from a Intel card by a DPDK app in the host and a kernel >=
>> 3.15. Although usually VFs are assigned to VMs, it could also be an
>> option to use VFs by the host. 
>>
>> BTW, I did not try to reproduce the problem with an Intel card. I
>> triggered this problem with an NFP, but because the problem behind, I
>> bet that is going to happen for an Intel one as well.
> 
> I can able to reproduce the problem with ixgbe, by using VF on the host.
> 
> And I verified your patch fixes it, it cause device attached to a vfio
> group.

I want to send this in a separate mail, since it is not directly related to
your patch: while I was testing with vfio-pci I got lower numbers compared
to igb_uio, which is unexpected AFAIK.

Most probably I am doing something wrong, but I would like to ask whether
you are observing the same behavior?

Thanks,
ferruh



[dpdk-dev] [PATCH] net/bnx2x: Fix transmit queue free threshold

2017-02-10 Thread Charles (Chas) Williams
The default tx_free_thresh is potentially larger than the allocated queue
which will result in TX queue cleanup never happening.  To fix this,
lower the default free threshold and ensure that the free threshold is
never greater than the maximum outstanding transmit buffers.

Fixes: 827ed2a118cc ("net/bnx2x: restructure Tx routine")

Signed-off-by: Chas Williams 
---
 drivers/net/bnx2x/bnx2x_rxtx.c | 2 ++
 drivers/net/bnx2x/bnx2x_rxtx.h | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bnx2x/bnx2x_rxtx.c b/drivers/net/bnx2x/bnx2x_rxtx.c
index 170e48f..adf0309 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.c
+++ b/drivers/net/bnx2x/bnx2x_rxtx.c
@@ -273,6 +273,8 @@ bnx2x_dev_tx_queue_setup(struct rte_eth_dev *dev,
 
txq->tx_free_thresh = tx_conf->tx_free_thresh ?
tx_conf->tx_free_thresh : DEFAULT_TX_FREE_THRESH;
+   txq->tx_free_thresh = min(txq->tx_free_thresh,
+ txq->nb_tx_desc - BDS_PER_TX_PKT);
 
PMD_INIT_LOG(DEBUG, "fp[%02d] req_bd=%u, thresh=%u, usable_bd=%lu, "
 "total_bd=%lu, tx_pages=%u",
diff --git a/drivers/net/bnx2x/bnx2x_rxtx.h b/drivers/net/bnx2x/bnx2x_rxtx.h
index dd251aa..2e38ec2 100644
--- a/drivers/net/bnx2x/bnx2x_rxtx.h
+++ b/drivers/net/bnx2x/bnx2x_rxtx.h
@@ -11,7 +11,7 @@
 #ifndef _BNX2X_RXTX_H_
 #define _BNX2X_RXTX_H_
 
-#define DEFAULT_TX_FREE_THRESH   512
+#define DEFAULT_TX_FREE_THRESH   64
 #define RTE_PMD_BNX2X_TX_MAX_BURST 1
 
 /**
-- 
2.1.4
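
For readers following the fix, the clamp can be sketched in isolation. This is a minimal standalone sketch, not driver code: clamp_tx_free_thresh and the SKETCH_ constants are hypothetical names, and the value 3 for buffer descriptors per packet is an assumption chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's constants (values assumed). */
#define SKETCH_DEFAULT_TX_FREE_THRESH 64
#define SKETCH_BDS_PER_TX_PKT 3

/*
 * Take the requested threshold (or the default when 0 is requested),
 * then cap it so it never exceeds the number of descriptors that can
 * be outstanding on a ring of nb_tx_desc entries.
 */
static uint16_t
clamp_tx_free_thresh(uint16_t requested, uint16_t nb_tx_desc)
{
	uint16_t thresh = requested ? requested : SKETCH_DEFAULT_TX_FREE_THRESH;
	uint16_t max_outstanding;

	/* Guard against underflow on a degenerate ring size. */
	if (nb_tx_desc <= SKETCH_BDS_PER_TX_PKT)
		return 0;

	max_outstanding = (uint16_t)(nb_tx_desc - SKETCH_BDS_PER_TX_PKT);
	return thresh < max_outstanding ? thresh : max_outstanding;
}
```

Under these assumed values, a ring of 128 descriptors with a requested threshold of 512 ends up with a threshold of 125, so cleanup can still trigger.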



Re: [dpdk-dev] [PATCH 1/2] ethdev: add capability control API

2017-02-10 Thread Wiles, Keith


> On Feb 10, 2017, at 8:06 AM, Cristian Dumitrescu 
>  wrote:
> 
> The rte_flow feature breaks the current monolithic approach for ethdev and
> introduces the new generic flow API to ethdev using a plugin-like approach.
> 
> Basically, the rte_flow API is still logically part of ethdev:
> - It extends the ethdev functionality: rte_flow is a new feature/capability
>  of ethdev;
> - all its functions work on an Ethernet device: the first parameter of the
>  rte_flow functions is Ethernet device port ID.
> 
> At the same time, the rte_flow API is a sort of capability plugin for ethdev:
> - the rte_flow API functions have their own name space: they are called
>  rte_flow_operationXYZ() as opposed to rte_eth_dev_flow_operationXYZ());
> - the rte_flow API functions are placed in separate files in the same
>  librte_ether folder as opposed to rte_ethdev.[hc].
> 
> The way it works is by using the existing ethdev API function
> rte_eth_dev_filter_ctrl() to query the current Ethernet device port ID for the
> support of the rte_flow capability and return the pointer to the
> rte_flow operations when supported and NULL otherwise:
> 
> struct rte_flow_ops *eth_flow_ops;
> int rte = rte_eth_dev_filter_ctrl(eth_port_id,
>RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, &eth_flow_ops);
> 
> Unfortunately, the rte_flow opportunistically uses the 
> rte_eth_dev_filter_ctrl()
> API function, which is applicable just to RX-side filters as opposed to
> introducing a mechanism that could be used by any capability in a generic way.
> 
> This is the gap that addressed by the current patch. This mechanism is 
> intended
> to be used to introduce new capabilities into ethdev in a modular plugin-like
> approach, such as hierarchical scheduler. Over time, if agreed, it can also be
> used for exposing the existing Ethernet device capabilities in a modular way,
> such as: xstats, filters, multicast, mirroring, tunnels, time stamping, 
> eeprom,
> bypass, etc.
> 
> Signed-off-by: Cristian Dumitrescu

Acked by: keith.wi...@intel.com

> 
> ---
> lib/librte_ether/rte_ethdev.c  | 13 +
> lib/librte_ether/rte_ethdev.h  | 29 +
> lib/librte_ether/rte_ether_version.map |  7 +++
> 3 files changed, 49 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index eb0a94a..ae187c4 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -2802,6 +2802,19 @@ rte_eth_dev_filter_ctrl(uint8_t port_id, enum 
> rte_filter_type filter_type,
>return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
> }
> 
> +int
> +rte_eth_dev_capability_control(uint8_t port_id, enum rte_eth_capability cap,
> +void *arg)
> +{
> +struct rte_eth_dev *dev;
> +
> +RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +
> +dev = &rte_eth_devices[port_id];
> +RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->cap_ctrl, -ENOTSUP);
> +return (*dev->dev_ops->cap_ctrl)(dev, cap, arg);
> +}
> +
> void *
> rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>rte_rx_callback_fn fn, void *user_param)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index c17bbda..43ffb9e 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1073,6 +1073,12 @@ TAILQ_HEAD(rte_eth_dev_cb_list, rte_eth_dev_callback);
>  * structure associated with an Ethernet device.
>  */
> 
> +enum rte_eth_capability {
> +RTE_ETH_CAPABILITY_FLOW = 0, /**< Flow */
> +RTE_ETH_CAPABILITY_SCHED, /**< Hierarchical Scheduler */
> +RTE_ETH_CAPABILITY_MAX
> +};
> +
> typedef int  (*eth_dev_configure_t)(struct rte_eth_dev *dev);
> /**< @internal Ethernet device configuration. */
> 
> @@ -1427,6 +1433,10 @@ typedef int (*eth_filter_ctrl_t)(struct rte_eth_dev 
> *dev,
> void *arg);
> /**< @internal Take operations to assigned filter type on an Ethernet device 
> */
> 
> +typedef int (*eth_capability_control_t)(struct rte_eth_dev *dev,
> +enum rte_eth_capability cap, void *arg);
> +/**< @internal Take capability operations on an Ethernet device */
> +
> typedef int (*eth_get_dcb_info)(struct rte_eth_dev *dev,
> struct rte_eth_dcb_info *dcb_info);
> /**< @internal Get dcb information on an Ethernet device */
> @@ -1548,6 +1558,8 @@ struct eth_dev_ops {
>eth_timesync_adjust_time   timesync_adjust_time; /** Adjust the device 
> clock. */
>eth_timesync_read_time timesync_read_time; /** Get the device clock 
> time. */
>eth_timesync_write_timetimesync_write_time; /** Set the device clock 
> time. */
> +
> +eth_capability_control_t   cap_ctrl; /**< capability control. */
> };
> 
> /**
> @@ -3890,6 +3902,23 @@ int rte_eth_dev_filter_ctrl(uint8_t port_id, enum 
> rte_filter_type filter_type,
>enum rte_filter_op filter_op, void *arg);
> 
> /**
> + * Take capability operations on an Ethernet device.
> + *
> + * @param port_id
>
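
The dispatch pattern the patch adds (validate the port, check that the driver provides the callback, then forward the call) can be sketched without DPDK headers. This is a minimal sketch under stated assumptions: every sketch_ name is hypothetical, and the driver callback that hands back an ops pointer for the flow capability is only one possible way a driver might use the void *arg parameter.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the proposed capability enum. */
enum sketch_capability {
	SKETCH_CAP_FLOW = 0,
	SKETCH_CAP_SCHED,
	SKETCH_CAP_MAX
};

struct sketch_dev;

typedef int (*sketch_cap_ctrl_t)(struct sketch_dev *dev,
		enum sketch_capability cap, void *arg);

/* Hypothetical device: cap_ctrl stays NULL when the driver lacks support. */
struct sketch_dev {
	sketch_cap_ctrl_t cap_ctrl;
	const void *flow_ops;
};

/* Example driver callback: returns the flow ops pointer through arg. */
static int
sketch_driver_cap_ctrl(struct sketch_dev *dev, enum sketch_capability cap,
		void *arg)
{
	if (cap != SKETCH_CAP_FLOW || arg == NULL)
		return -EINVAL;
	*(const void **)arg = dev->flow_ops;
	return 0;
}

/* Generic entry point: same shape as the wrapper in the patch. */
static int
sketch_capability_control(struct sketch_dev *dev, enum sketch_capability cap,
		void *arg)
{
	if (dev == NULL)
		return -ENODEV;
	if (dev->cap_ctrl == NULL)
		return -ENOTSUP;
	return dev->cap_ctrl(dev, cap, arg);
}
```

A caller would treat -ENOTSUP as "this driver exposes no capabilities through this path" and fall back accordingly.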

[dpdk-dev] [PATCH] eventdev: Add rte_errno return values to the enqueue and dequeue functions

2017-02-10 Thread Gage Eads
This change allows user software to differentiate between an invalid argument
(such as an invalid queue_id or sched_type in an enqueued event) and
backpressure from the event device.

The port and device ID checks are placed in RTE_LIBRTE_EVENTDEV_DEBUG header
guards to avoid the performance hit in non-debug execution.

Signed-off-by: Gage Eads 
---
 lib/librte_eventdev/rte_eventdev.h | 42 +++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eventdev/rte_eventdev.h 
b/lib/librte_eventdev/rte_eventdev.h
index c2f9310..ef21205 100644
--- a/lib/librte_eventdev/rte_eventdev.h
+++ b/lib/librte_eventdev/rte_eventdev.h
@@ -245,6 +245,7 @@ extern "C" {
 
 #include 
 #include 
+#include 
 
 struct rte_mbuf; /* we just use mbuf pointers; no need to include rte_mbuf.h */
 
@@ -1116,9 +1117,14 @@ rte_event_schedule(uint8_t dev_id)
  *   The number of event objects actually enqueued on the event device. The
  *   return value can be less than the value of the *nb_events* parameter when
  *   the event devices queue is full or if invalid parameters are specified in 
a
- *   *rte_event*. If return value is less than *nb_events*, the remaining 
events
- *   at the end of ev[] are not consumed,and the caller has to take care of 
them
- *
+ *   *rte_event*. If the return value is less than *nb_events*, the remaining
+ *   events at the end of ev[] are not consumed and the caller has to take care
+ *   of them, and rte_errno is set accordingly. Possible errno values include:
+ *   -(-EINVAL) The port ID is invalid, device ID is invalid, an event's queue
+ *  ID is invalid, or an event's sched type doesn't match the
+ *  capabilities of the destination queue.
+ *   -(-ENOSPC) The event port was backpressured and unable to enqueue
+ *  one or more events.
  * @see rte_event_port_enqueue_depth()
  */
 static inline uint16_t
@@ -1127,6 +1133,21 @@ rte_event_enqueue_burst(uint8_t dev_id, uint8_t port_id,
 {
struct rte_eventdev *dev = &rte_eventdevs[dev_id];
 
+   rte_errno = 0;
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+   if (rte_eventdevs[dev_id].attached == RTE_EVENTDEV_DETACHED) {
+   RTE_EDEV_LOG_DEBUG("Invalid dev_id=%d\n", dev_id);
+   rte_errno = -EINVAL;
+   return 0;
+   }
+
+   if (port_id >= dev->data->nb_ports) {
+   RTE_EDEV_LOG_DEBUG("Invalid port_id=%d\n", port_id);
+   rte_errno = -EINVAL;
+   return 0;
+   }
+#endif
+
/*
 * Allow zero cost non burst mode routine invocation if application
 * requests nb_events as const one
@@ -1235,6 +1256,21 @@ rte_event_dequeue_burst(uint8_t dev_id, uint8_t port_id, 
struct rte_event ev[],
 {
struct rte_eventdev *dev = &rte_eventdevs[dev_id];
 
+#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
+   rte_errno = 0;
+   if (rte_eventdevs[dev_id].attached == RTE_EVENTDEV_DETACHED) {
+   RTE_EDEV_LOG_DEBUG("Invalid dev_id=%d\n", dev_id);
+   rte_errno = -EINVAL;
+   return 0;
+   }
+
+   if (port_id >= dev->data->nb_ports) {
+   RTE_EDEV_LOG_DEBUG("Invalid port_id=%d\n", port_id);
+   rte_errno = -EINVAL;
+   return 0;
+   }
+#endif
+
/*
 * Allow zero cost non burst mode routine invocation if application
 * requests nb_events as const one
-- 
2.7.4
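
How an application might use the new error codes can be sketched with a mock. This is a standalone sketch, not the eventdev API: sketch_errno stands in for rte_errno, sketch_enqueue_burst is a hypothetical mock with a fixed capacity, and positive errno values are used purely for illustration (the patch as posted assigns negative values).

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_errno. */
static int sketch_errno;

#define SKETCH_NB_QUEUES 4     /* hypothetical number of valid queues */
#define SKETCH_PORT_CAPACITY 2 /* hypothetical room left in the port */

struct sketch_event {
	uint8_t queue_id;
};

/*
 * Mock enqueue: returns how many events were accepted and records why
 * it stopped early, mirroring the behaviour described in the patch.
 */
static uint16_t
sketch_enqueue_burst(const struct sketch_event ev[], uint16_t nb_events)
{
	uint16_t i;

	sketch_errno = 0;
	for (i = 0; i < nb_events; i++) {
		if (ev[i].queue_id >= SKETCH_NB_QUEUES) {
			sketch_errno = EINVAL; /* bad argument in ev[i] */
			break;
		}
		if (i >= SKETCH_PORT_CAPACITY) {
			sketch_errno = ENOSPC; /* backpressure */
			break;
		}
	}
	return i;
}

enum sketch_action { SKETCH_DONE, SKETCH_RETRY, SKETCH_DROP };

/* Caller-side decision: retry on backpressure, drop on a bad event. */
static enum sketch_action
sketch_handle_short_enqueue(uint16_t sent, uint16_t nb_events)
{
	if (sent == nb_events)
		return SKETCH_DONE;
	return sketch_errno == ENOSPC ? SKETCH_RETRY : SKETCH_DROP;
}
```

The point of the distinction is that retrying is only useful for backpressure; an invalid event will fail again no matter how often it is resubmitted.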



[dpdk-dev] GSO/GRO support

2017-02-10 Thread Kiran KN
We at Juniper OpenContrail have added software support for TCP send offload
and receive offload to DPDK.

If the community is interested, we can publish/upstream it.

Please let us know what you think of it.

Thanks,
 Kiran


Re: [dpdk-dev] [PATCH] eal: sPAPR IOMMU support in pci probing for vfio-pci in ppc64le

2017-02-10 Thread gowrishankar muthukrishnan

Hi Thomas,
I see rc3 is out. Could this patch also go into 17.02 (rc4?).

This patch is ppc64le specific (without affecting other architectures) and
it makes PMDs over vfio-pci usable on this architecture.


Thanks,
Gowrishankar

On Friday 10 February 2017 11:48 AM, Gowrishankar wrote:

From: Gowrishankar Muthukrishnan 

The changes below add PCI probing support for vfio-pci devices on POWER8.

Signed-off-by: Gowrishankar Muthukrishnan 
Acked-by: Chao Zhu 
---
  lib/librte_eal/linuxapp/eal/eal_vfio.c | 88 ++
  lib/librte_eal/linuxapp/eal/eal_vfio.h |  1 +
  2 files changed, 89 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 702f7a2..1d4fea6 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -50,12 +50,15 @@
  static struct vfio_config vfio_cfg;

  static int vfio_type1_dma_map(int);
+static int vfio_spapr_dma_map(int);
  static int vfio_noiommu_dma_map(int);

  /* IOMMU types we support */
  static const struct vfio_iommu_type iommu_types[] = {
/* x86 IOMMU, otherwise known as type 1 */
{ RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+   /* ppc64 IOMMU, otherwise known as spapr */
+   { RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
/* IOMMU-less mode */
{ RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
  };
@@ -540,6 +543,91 @@ int vfio_setup_device(const char *sysfs_base, const char 
*dev_addr,
  }

  static int
+vfio_spapr_dma_map(int vfio_container_fd)
+{
+   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
+   int i, ret;
+
+   struct vfio_iommu_spapr_register_memory reg = {
+   .argsz = sizeof(reg),
+   .flags = 0
+   };
+   struct vfio_iommu_spapr_tce_info info = {
+   .argsz = sizeof(info),
+   };
+   struct vfio_iommu_spapr_tce_create create = {
+   .argsz = sizeof(create),
+   };
+   struct vfio_iommu_spapr_tce_remove remove = {
+   .argsz = sizeof(remove),
+   };
+
+   /* query spapr iommu info */
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot get iommu info, "
+   "error %i (%s)\n", errno, strerror(errno));
+   return -1;
+   }
+
+   /* remove default DMA of 32 bit window */
+   remove.start_addr = info.dma32_window_start;
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove);
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot remove default DMA window, "
+   "error %i (%s)\n", errno, strerror(errno));
+   return -1;
+   }
+
+   /* calculate window size based on number of hugepages configured */
+   create.window_size = rte_eal_get_physmem_size();
+   create.page_shift = __builtin_ctzll(ms->hugepage_sz);
+   create.levels = 2;
+
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot create new DMA window, "
+   "error %i (%s)\n", errno, strerror(errno));
+   return -1;
+   }
+
+   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
+   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+   struct vfio_iommu_type1_dma_map dma_map;
+
+   if (ms[i].addr == NULL)
+   break;
+
+   reg.vaddr = (uintptr_t) ms[i].addr;
+   reg.size = ms[i].len;
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_REGISTER_MEMORY, 
&reg);
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot register vaddr for IOMMU, "
+   "error %i (%s)\n", errno, 
strerror(errno));
+   return -1;
+   }
+
+   memset(&dma_map, 0, sizeof(dma_map));
+   dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
+   dma_map.vaddr = ms[i].addr_64;
+   dma_map.size = ms[i].len;
+   dma_map.iova = ms[i].phys_addr;
+   dma_map.flags = VFIO_DMA_MAP_FLAG_READ | 
VFIO_DMA_MAP_FLAG_WRITE;
+
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
+
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
+   "error %i (%s)\n", errno, 
strerror(errno));
+   return -1;
+   }
+
+   }
+
+   return 0;
+}
+
+static int
  vfio_noiommu_dma_map(int __rte_unused vfio_container_fd)
  {
/* No-IOMMU mode does not need DMA mapping */
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.h 
b/lib/librte_eal/linuxapp/eal/eal_vfio.h
index 29f7f3e..533b854 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.h
+++ b/lib/librte_eal/li
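
The page_shift computation in the patch relies on the hugepage size being a power of two, so counting trailing zero bits yields log2 of the size. A minimal standalone sketch, assuming a GCC or Clang compiler for __builtin_ctzll; sketch_page_shift is a hypothetical helper name, not part of the patch:

```c
#include <stdint.h>

/*
 * Return log2(hugepage_sz) for a power-of-two page size.
 * __builtin_ctzll is undefined for 0, so that case is handled explicitly.
 */
static unsigned int
sketch_page_shift(uint64_t hugepage_sz)
{
	if (hugepage_sz == 0)
		return 0;
	return (unsigned int)__builtin_ctzll(hugepage_sz);
}
```

For example, a 16 MB hugepage gives a shift of 24 and a 1 GB hugepage gives 30.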

Re: [dpdk-dev] [PATCH] eventdev: clarify nb_unlinks description

2017-02-10 Thread Jerin Jacob
On Fri, Feb 10, 2017 at 01:04:33PM -0600, Gage Eads wrote:
> This commit clarifies the usage of nb_unlinks when passing a NULL pointer
> as the queues argument.
> 
> Signed-off-by: Gage Eads 
> ---
>  lib/librte_eventdev/rte_eventdev.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.h 
> b/lib/librte_eventdev/rte_eventdev.h
> index c2f9310..7b64532 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -1336,7 +1336,8 @@ rte_event_port_link(uint8_t dev_id, uint8_t port_id,
>   *   event queue(s) from the event port *port_id*.
>   *
>   * @param nb_unlinks
> - *   The number of unlinks to establish
> + *   The number of unlinks to establish. This parameter is ignored if queues 
> is
> + *   NULL.

Add a similar description to rte_event_port_link too. With that,

Acked-by: Jerin Jacob 

>   *
>   * @return
>   * The number of unlinks actually established. The return value can be less
> -- 
> 2.7.4
> 


Re: [dpdk-dev] [PATCH 1/2] ethdev: add capability control API

2017-02-10 Thread Jerin Jacob
On Fri, Feb 10, 2017 at 02:05:49PM +, Cristian Dumitrescu wrote:
> The rte_flow feature breaks the current monolithic approach for ethdev and
> introduces the new generic flow API to ethdev using a plugin-like approach.
> 
> Basically, the rte_flow API is still logically part of ethdev:
> - It extends the ethdev functionality: rte_flow is a new feature/capability
>   of ethdev;
> - all its functions work on an Ethernet device: the first parameter of the
>   rte_flow functions is Ethernet device port ID.
> 
> At the same time, the rte_flow API is a sort of capability plugin for ethdev:
> - the rte_flow API functions have their own name space: they are called
>   rte_flow_operationXYZ() as opposed to rte_eth_dev_flow_operationXYZ());
> - the rte_flow API functions are placed in separate files in the same
>   librte_ether folder as opposed to rte_ethdev.[hc].
> 
> The way it works is by using the existing ethdev API function
> rte_eth_dev_filter_ctrl() to query the current Ethernet device port ID for the
> support of the rte_flow capability and return the pointer to the
> rte_flow operations when supported and NULL otherwise:
> 
> struct rte_flow_ops *eth_flow_ops;
> int rte = rte_eth_dev_filter_ctrl(eth_port_id,
>   RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, &eth_flow_ops);
> 
> Unfortunately, the rte_flow opportunistically uses the 
> rte_eth_dev_filter_ctrl()
> API function, which is applicable just to RX-side filters as opposed to
> introducing a mechanism that could be used by any capability in a generic way.
> 
> This is the gap that addressed by the current patch. This mechanism is 
> intended
> to be used to introduce new capabilities into ethdev in a modular plugin-like
> approach, such as hierarchical scheduler. Over time, if agreed, it can also be
> used for exposing the existing Ethernet device capabilities in a modular way,
> such as: xstats, filters, multicast, mirroring, tunnels, time stamping, 
> eeprom,
> bypass, etc.
> 
> Signed-off-by: Cristian Dumitrescu 
> ---
>  lib/librte_ether/rte_ethdev.c  | 13 +
>  lib/librte_ether/rte_ethdev.h  | 29 +
>  lib/librte_ether/rte_ether_version.map |  7 +++
>  3 files changed, 49 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index eb0a94a..ae187c4 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -2802,6 +2802,19 @@ rte_eth_dev_filter_ctrl(uint8_t port_id, enum 
> rte_filter_type filter_type,
>   return (*dev->dev_ops->filter_ctrl)(dev, filter_type, filter_op, arg);
>  }
>  
> +int
> +rte_eth_dev_capability_control(uint8_t port_id, enum rte_eth_capability cap,
> + void *arg)
> +{
> + struct rte_eth_dev *dev;
> +
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +
> + dev = &rte_eth_devices[port_id];
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->cap_ctrl, -ENOTSUP);
> + return (*dev->dev_ops->cap_ctrl)(dev, cap, arg);
> +}
> +
>  void *
>  rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
>   rte_rx_callback_fn fn, void *user_param)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index c17bbda..43ffb9e 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1073,6 +1073,12 @@ TAILQ_HEAD(rte_eth_dev_cb_list, rte_eth_dev_callback);
>   * structure associated with an Ethernet device.
>   */
>  
> +enum rte_eth_capability {
> + RTE_ETH_CAPABILITY_FLOW = 0, /**< Flow */
> + RTE_ETH_CAPABILITY_SCHED, /**< Hierarchical Scheduler */
> + RTE_ETH_CAPABILITY_MAX
> +};

Shouldn't it be a flag? Meaning, to represent that an ethdev port can have both.
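
One way to read this suggestion is a bit-flag encoding, so that a single value can advertise several capabilities at once. This is only a sketch of the idea; the SKETCH_ names and sketch_has_caps are hypothetical and are not proposed API.

```c
#include <stdint.h>

/* Hypothetical flag encoding: one bit per capability. */
enum sketch_cap_flag {
	SKETCH_CAP_F_FLOW  = 1u << 0, /* flow API */
	SKETCH_CAP_F_SCHED = 1u << 1  /* hierarchical scheduler */
};

/* True when every capability bit in wanted is present in supported. */
static int
sketch_has_caps(uint32_t supported, uint32_t wanted)
{
	return (supported & wanted) == wanted;
}
```

A port supporting both would report SKETCH_CAP_F_FLOW | SKETCH_CAP_F_SCHED, which a plain sequential enum cannot express in one value.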

> +