Re: [dpdk-dev] [PATCH v3] net/i40e: Improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:22 AM, Xing, Beilei wrote:
> Hi,
> 
> Seems my comments in v2 are not addressed, add the comments here again.

Hi Michael,

Also, can you please use the "--in-reply-to" option of git send-email when
sending the new version of the patch?
That keeps the new version in the same mail thread, as a reply to the
previous version.
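For reference, a threaded resend could look roughly like this (the Message-ID, address, and file name are placeholders, not taken from this thread):

```shell
# Look up the Message-ID of the previous version (in the v3 mail's headers
# or the dpdk.org archive), then pass it to --in-reply-to so the v4 patch
# is threaded as a reply to v3.  All values below are illustrative.
git send-email \
    --in-reply-to='<20170516220211.1234-1-example@example.com>' \
    --to=dev@dpdk.org \
    v4-0001-net-i40e-improved-FDIR-programming-times.patch
```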

Thanks,
ferruh

> 
>> -Original Message-
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Michael Lilja
>> Sent: Wednesday, May 17, 2017 6:02 AM
>> To: dev@dpdk.org
>> Cc: Michael Lilja 
>> Subject: [dpdk-dev] [PATCH v3] net/i40e: Improved FDIR programming times
>>
>> Previously, the FDIR programming time was more than 11 ms on i40e.
>> This patch results in an average programming time of 22 usec, with a
>> max of 60 usec.
>>
>> Signed-off-by: Michael Lilja 
<...>


Re: [dpdk-dev] [PATCH] cryptodev: remove crypto device type enumeration

2017-05-17 Thread De Lara Guarch, Pablo
Hi Slawomir,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Slawomir
> Mrozowicz
> Sent: Monday, May 15, 2017 10:56 AM
> To: Doherty, Declan
> Cc: dev@dpdk.org; Mrozowicz, SlawomirX
> Subject: [dpdk-dev] [PATCH] cryptodev: remove crypto device type
> enumeration
> 
> Changes device type identification to be based on a unique
> driver id replacing the current device type enumeration, which needed
> library changes every time a new crypto driver was added.
> 
> The driver id is assigned dynamically during driver registration using
> the new macro RTE_PMD_REGISTER_CRYPTO_DRIVER which returns a
> unique
> uint8_t identifier for that driver. New APIs are also introduced
> to allow retrieval of the driver id using the driver name.
> 
> Signed-off-by: Slawomir Mrozowicz 

There are compilation issues with the patch: 
http://dpdk.org/ml/archives/test-report/2017-May/019815.html

Could you add some information in the release notes?
There is an API change here, that should be noted.

> ---
>  doc/guides/prog_guide/cryptodev_lib.rst|   5 +-
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c   |  12 +-
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c   |   2 +-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  12 +-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c |   2 +-
>  drivers/crypto/armv8/rte_armv8_pmd.c   |  12 +-
>  drivers/crypto/armv8/rte_armv8_pmd_ops.c   |   2 +-
>  drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c|  10 +-
>  drivers/crypto/kasumi/rte_kasumi_pmd.c |  12 +-
>  drivers/crypto/kasumi/rte_kasumi_pmd_ops.c |   2 +-
>  drivers/crypto/null/null_crypto_pmd.c  |  12 +-
>  drivers/crypto/null/null_crypto_pmd_ops.c  |   2 +-
>  drivers/crypto/openssl/rte_openssl_pmd.c   |  12 +-
>  drivers/crypto/openssl/rte_openssl_pmd_ops.c   |   2 +-
>  drivers/crypto/qat/qat_crypto.c|   7 +-
>  drivers/crypto/qat/rte_qat_cryptodev.c |   8 +-
>  drivers/crypto/scheduler/rte_cryptodev_scheduler.c |  31 ++--
>  drivers/crypto/scheduler/scheduler_pmd.c   |   7 +-
>  drivers/crypto/scheduler/scheduler_pmd_ops.c   |   2 +-
>  drivers/crypto/scheduler/scheduler_pmd_private.h   |   2 +-
>  drivers/crypto/snow3g/rte_snow3g_pmd.c |  12 +-
>  drivers/crypto/snow3g/rte_snow3g_pmd_ops.c |   2 +-
>  drivers/crypto/zuc/rte_zuc_pmd.c   |  12 +-
>  drivers/crypto/zuc/rte_zuc_pmd_ops.c   |   2 +-
>  lib/librte_cryptodev/rte_cryptodev.c   |  39 -
>  lib/librte_cryptodev/rte_cryptodev.h   |  68 ---
>  lib/librte_cryptodev/rte_cryptodev_pmd.h   |   2 +-
>  lib/librte_cryptodev/rte_cryptodev_version.map |  11 +-
>  test/test/test_cryptodev.c | 195 ++---
>  test/test/test_cryptodev_blockcipher.c |  68 ---
>  test/test/test_cryptodev_blockcipher.h |   2 +-
>  test/test/test_cryptodev_perf.c| 124 -
>  32 files changed, 472 insertions(+), 221 deletions(-)
> 

...

> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> index 101ef98..64a0ba0 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> @@ -143,8 +143,9 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp,
> struct rte_crypto_sym_op *op)
>   struct aesni_gcm_session *sess = NULL;
> 
>   if (op->sess_type == RTE_CRYPTO_SYM_OP_WITH_SESSION) {
> - if (unlikely(op->session->dev_type
> - !=
> RTE_CRYPTODEV_AESNI_GCM_PMD))
> + if (unlikely(op->session->driver_id !=
> + rte_cryptodev_driver_id_get(
> +

I would store the driver id of the PMD somewhere (maybe in aesni_gcm_qp or 
access dev->driver_id?),
and not call this function for all operations (same for other PMDs).

...

> diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> b/lib/librte_cryptodev/rte_cryptodev.h
> index 88aeb87..533017c 100644
> --- a/lib/librte_cryptodev/rte_cryptodev.h
> +++ b/lib/librte_cryptodev/rte_cryptodev.h

...

> +
> +/**
> + * Provide driver name.
> + *
> + * @param driver_id
> + *   The driver identifier.
> + * @return
> + *  The driver name or null if no driver found
> + */
> +char *rte_cryptodev_driver_name_get(uint8_t driver_id);
> +
> +/**
> + * Allocate driver identifier.
> + *

Mark this as internal, as this will only be used by PMDs, not apps.


[dpdk-dev] [PATCH v4] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Previously, the FDIR programming time was more than 11 ms on i40e.
This patch results in an average programming time of 22 usec, with a
max of 60 usec.

Signed-off-by: Michael Lilja 

---
v4:
* Code style fix
---
---
 drivers/net/i40e/i40e_fdir.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 28cc554f5..32f6aeafb 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -1296,27 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
rte_wmb();
I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
 
-   for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
-   rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
+   for (i = 0; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
if ((txdp->cmd_type_offset_bsz &
rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
break;
+   rte_delay_us(1);
}
-   if (i >= I40E_FDIR_WAIT_COUNT) {
+   if (i >= (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)) {
PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
" time out to get DD on tx queue.");
return -ETIMEDOUT;
}
/* totally delay 10 ms to check programming status*/
-   rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
-   if (i40e_check_fdir_programming_status(rxq) < 0) {
-   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
-   " programming status reported.");
-   return -ENOSYS;
+   for (i = 0; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
+   if (i40e_check_fdir_programming_status(rxq) >= 0)
+   return 0;
+   rte_delay_us(1);
}
-
-   return 0;
+   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
+   " programming status reported.");
+   return -ENOSYS;
 }
 
 /*
-- 
2.12.2



Re: [dpdk-dev] [PATCH v4] net/i40e: improved FDIR programming times

2017-05-17 Thread Xing, Beilei
Hi Michael,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Michael Lilja
> Sent: Wednesday, May 17, 2017 5:12 PM
> To: dev@dpdk.org
> Cc: Michael Lilja 
> Subject: [dpdk-dev] [PATCH v4] net/i40e: improved FDIR programming times
> 
> Previously, the FDIR programming time was more than 11 ms on i40e.
> This patch results in an average programming time of 22 usec, with a
> max of 60 usec.
> 
> Signed-off-by: Michael Lilja 
> 
> ---
> v4:
> * Code style fix
> ---
> ---
>  drivers/net/i40e/i40e_fdir.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c index
> 28cc554f5..32f6aeafb 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -1296,27 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>   rte_wmb();
>   I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
> 
> - for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
> - rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
> + for (i = 0; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
>   if ((txdp->cmd_type_offset_bsz &
> 
>   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
> 
>   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>   break;
> + rte_delay_us(1);
>   }
> - if (i >= I40E_FDIR_WAIT_COUNT) {
> + if (i >= (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)) {
>   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>   " time out to get DD on tx queue.");
>   return -ETIMEDOUT;
>   }
>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (i = 0; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {

To keep the original intent, "i" shouldn't be reset to 0 here; it should
keep the value from the loop above.
Please refer to "rte_delay_us((I40E_FDIR_WAIT_COUNT - i) *
I40E_FDIR_WAIT_INTERVAL_US)" in the original code.
Sorry for missing it before.

Overall it's OK for me, thanks.

Beilei



Re: [dpdk-dev] [RFC] Service Cores concept

2017-05-17 Thread Bruce Richardson
On Wed, May 17, 2017 at 12:11:10AM +0200, Thomas Monjalon wrote:
> 03/05/2017 13:29, Harry van Haaren:
> > The concept is to allow a software function to register itself with EAL as
> > a "service", which requires CPU time to perform its duties. Multiple
> > services can be registered in an application, if more than one service
> > exists. The application can retrieve a list of services, and decide how
> > many "service cores" to use. The number of service cores is removed
> > from the application usage, and they are mapped to services based on
> > an application supplied coremask.
> > 
> > The application now continues as normal, without having to manually
> > schedule and implement arbitration of CPU time for the SW services.
> 
> I think it should not be the DPDK responsibility to schedule threads.
> The mainloops and scheduling are application design choices.
> 
> If I understand well the idea of your proposal, it is a helper for
> the application to configure the thread scheduling of known services.
> So I think we could add interrupt processing and other thread creations
> in this concept.
> Could we also embed the rte_eal_mp_remote_launch() calls in this concept?


There are a couple of parts of this:
1. Allowing libraries and drivers to register the fact that they require
background processing, e.g. as a SW fallback for functionality that
would otherwise be implemented in hardware
2. Providing support for easily multiplexing these independent
functions from different libs onto a shared core, compared to the
normal DPDK operation of firing a single run-forever function on each
core.
3. Providing support for the application to configure the running of
these background services on specific cores.
4. Once configured, hiding these services and the cores they run on from
the rest of the application, so that the rest of the app logic does not
need to change depending on whether service cores are in use or not. For
instance, removing the service cores from the core list in foreach-lcore
loops, and preventing the EAL from trying to run app functions on the
cores when the app calls mp_remote_launch.

Overall, the objective is to provide us a way to have software
equivalents of hardware functions in as transparent a manner as
possible. There is a certain amount of scheduling being done by the
DPDK, but it is still very much under the control of the app.

As for other things being able to use this concept, definite +1 for
interrupt threads and similar. I would not see mp_remote_launch as being
affected here in any significant way (except from the hiding service
cores from it, obviously)

/Bruce


Re: [dpdk-dev] [RFC 0/2] ethdev: add new attribute for signature match

2017-05-17 Thread Adrien Mazarguil
On Sun, May 14, 2017 at 03:50:04PM -0400, Qi Zhang wrote:
> We tried to enable ixgbe's signature match with rte_flow, but didn't
> find a way with the current APIs, so this RFC proposes a new flow
> attribute "sig_match" to indicate whether the flow is a "perfect match"
> or a "signature match".
> With perfect match (the default), if a packet does not match the
> pattern, the actions will not be taken (identical to current behavior).
> With signature match, a packet that does not match the pattern may
> still trigger the actions; this happens when the device considers the
> signature of the pattern matched.
> Signature match is expected to perform better than perfect match, at
> the cost of accuracy.
> When a flow rule has this attribute set, identical behavior can ONLY
> be guaranteed if the packet matches the pattern, since different
> devices may implement different signature calculation algorithms.
> Drivers of devices that do not support signature match are not required
> to return an error; they can simply ignore this attribute, because the
> default "perfect match" can still be regarded as a special case of
> "signature match".
> 
> Qi Zhang (2):
>   rte_flow: add attribute for signature match
>   doc/guides/prog_guide: add new rte_flow attribute
> 
>  app/test-pmd/cmdline_flow.c| 11 +++
>  doc/guides/prog_guide/rte_flow.rst | 12 
>  lib/librte_ether/rte_flow.h|  3 ++-
>  3 files changed, 25 insertions(+), 1 deletion(-)

As discussed offline, modifying struct rte_flow_attr for this purpose is not
ideal. We've agreed that a new meta pattern item should be defined instead,
as described in the FDIR rules conversion section (8.9.7) of the
documentation [1].

[1] 
http://dpdk.org/doc/guides/prog_guide/rte_flow.html#fdir-to-most-item-types-queue-drop-passthru

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH v5] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Previously, the FDIR programming time was more than 11 ms on i40e.
This patch results in an average programming time of 22 usec, with a
max of 60 usec.

Signed-off-by: Michael Lilja 

---
v5:
* Reinitialization of "i" inconsistent with original intent
---
---
 drivers/net/i40e/i40e_fdir.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 28cc554f5..85fd827e1 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -1296,27 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
rte_wmb();
I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
 
-   for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
-   rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
+   for (i = 0; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
if ((txdp->cmd_type_offset_bsz &
rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
break;
+   rte_delay_us(1);
}
-   if (i >= I40E_FDIR_WAIT_COUNT) {
+   if (i >= (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)) {
PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
" time out to get DD on tx queue.");
return -ETIMEDOUT;
}
/* totally delay 10 ms to check programming status*/
-   rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
-   if (i40e_check_fdir_programming_status(rxq) < 0) {
-   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
-   " programming status reported.");
-   return -ENOSYS;
+   for (; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
+   if (i40e_check_fdir_programming_status(rxq) >= 0)
+   return 0;
+   rte_delay_us(1);
}
-
-   return 0;
+   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
+   " programming status reported.");
+   return -ENOSYS;
 }
 
 /*
-- 
2.12.2



Re: [dpdk-dev] [PATCH v5] net/i40e: improved FDIR programming times

2017-05-17 Thread Xing, Beilei
Hi,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Michael Lilja
> Sent: Wednesday, May 17, 2017 6:38 PM
> To: dev@dpdk.org
> Cc: Michael Lilja 
> Subject: [dpdk-dev] [PATCH v5] net/i40e: improved FDIR programming times
> 
> Previously, the FDIR programming time was more than 11 ms on i40e.
> This patch results in an average programming time of 22 usec, with a
> max of 60 usec.
> 
> Signed-off-by: Michael Lilja 

Acked-by: Beilei Xing 



[dpdk-dev] [RFC][PATCH] vfio: allow to map other memory regions

2017-05-17 Thread Pawel Wodkowski
Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario arises in vhost applications (like SPDK)
where the guest sends its own memory table. To fill this gap, provide an
API to allow registering an arbitrary address range in the VFIO
container.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_eal/linuxapp/eal/Makefile|   3 +
 lib/librte_eal/linuxapp/eal/eal_vfio.c  | 127 
 lib/librte_eal/linuxapp/eal/eal_vfio.h  |  10 ++
 lib/librte_eal/linuxapp/eal/include/rte_iommu.h |  76 ++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   7 ++
 5 files changed, 206 insertions(+), 17 deletions(-)
 create mode 100644 lib/librte_eal/linuxapp/eal/include/rte_iommu.h

diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 640afd0887de..f0d8ae6ab4a3 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -126,6 +126,9 @@ ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
 CFLAGS_eal_thread.o += -Wno-return-type
 endif
 
+SYMLINK-$(CONFIG_RTE_EXEC_ENV_LINUXAPP)-include = \
+   include/rte_iommu.h
+
 INC := rte_interrupts.h rte_kni_common.h rte_dom0_common.h
 
 SYMLINK-$(CONFIG_RTE_EXEC_ENV_LINUXAPP)-include/exec-env := \
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 53ac725d22e0..549c9824fdd7 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "eal_filesystem.h"
 #include "eal_vfio.h"
@@ -50,17 +51,19 @@
 static struct vfio_config vfio_cfg;
 
 static int vfio_type1_dma_map(int);
+static int vfio_type1_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 static int vfio_spapr_dma_map(int);
 static int vfio_noiommu_dma_map(int);
+static int vfio_noiommu_dma_mem_map(int, uint64_t, uint64_t, uint64_t, int);
 
 /* IOMMU types we support */
 static const struct vfio_iommu_type iommu_types[] = {
/* x86 IOMMU, otherwise known as type 1 */
-   { RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
+   { RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map, &vfio_type1_dma_mem_map},
/* ppc64 IOMMU, otherwise known as spapr */
-   { RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
+   { RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map, NULL},
/* IOMMU-less mode */
-   { RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
+   { RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map, &vfio_noiommu_dma_mem_map},
 };
 
 int
@@ -378,6 +381,8 @@ vfio_setup_device(const char *sysfs_base, const char *dev_addr,
clear_group(vfio_group_fd);
return -1;
}
+
+   vfio_cfg.vfio_iommu_type = t;
}
}
 
@@ -690,33 +695,61 @@ vfio_get_group_no(const char *sysfs_base,
 }
 
 static int
-vfio_type1_dma_map(int vfio_container_fd)
+vfio_type1_dma_mem_map(int vfio_container_fd, uint64_t vaddr, uint64_t iova,
+  uint64_t len, int do_map)
 {
-   const struct rte_memseg *ms = rte_eal_get_physmem_layout();
-   int i, ret;
-
-   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
-   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
-   struct vfio_iommu_type1_dma_map dma_map;
-
-   if (ms[i].addr == NULL)
-   break;
+   struct vfio_iommu_type1_dma_map dma_map;
+   struct vfio_iommu_type1_dma_unmap dma_unmap;
+   int ret;
 
+   if (do_map != 0) {
memset(&dma_map, 0, sizeof(dma_map));
dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
-   dma_map.vaddr = ms[i].addr_64;
-   dma_map.size = ms[i].len;
-   dma_map.iova = ms[i].phys_addr;
+   dma_map.vaddr = vaddr;
+   dma_map.size = len;
+   dma_map.iova = iova;
dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 
ret = ioctl(vfio_container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
-
if (ret) {
RTE_LOG(ERR, EAL, "  cannot set up DMA remapping, "
  "error %i (%s)\n", errno,
  strerror(errno));
return -1;
}
+
+   } else {
+   memset(&dma_unmap, 0, sizeof(dma_unmap));
+   dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap);
+   dma_unmap.size = len;
+   dma_unmap.iova = iova;
+
+   ret = ioctl(vfio_container_fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap);
+   if (ret) {
+   RTE_LOG(ERR, EAL, "  cannot clear DMA remapping, "
+ "error %i (%s)\n", errno,
+ str

Re: [dpdk-dev] [PATCH v3 0/6] enic flow api support

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 4:03 AM, John Daley wrote:
> This V3 is rebased on dpdk-next-net and retargeted at 17.08.  Also,
> inner ipv4, ipv6, udp, tcp support added for 1300 series VICs.
> 
> thank you,
> johnd
> 
> John Daley (6):
>   net/enic: flow API skeleton
>   net/enic: flow API for NICs with advanced filters enabled
>   net/enic: flow API for NICs with advanced filters disabled
>   net/enic: flow API for Legacy NICs
>   net/enic: flow API debug
>   net/enic: flow API documentation

Hi John,

I am getting multiple build errors from multiple reasons, I suspect
there was a git merge issue in the patchset, can you please double check
building the patchset?

Thanks,
ferruh


Re: [dpdk-dev] [PATCH v3 3/6] net/enic: flow API for NICs with advanced filters disabled

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 4:03 AM, John Daley wrote:
> Flow support for 1300 series adapters with the 'Advanced Filter'
> mode disabled via the UCS management interface. This allows:
> Attributes: ingress
> Items: Outer eth, ipv4, ipv6, udp, sctp, tcp, vxlan. Inner eth, ipv4,
>ipv6, udp, tcp.
> Actions: queue and void
> Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
> 
> With advanced filters disabled, an IPv4 or IPv6 item must be specified
> in the pattern.
> 
> Signed-off-by: John Daley 
> Reviewed-by: Nelson Escobar 

<...>

> @@ -193,6 +279,10 @@ static const enum rte_flow_action_type enic_supported_actions_v2[] = {
>  
>  /** Action capabilites indexed by NIC version information */
>  static const struct enic_action_cap enic_action_cap[] = {
> + [FILTER_ACTION_RQ_STEERING_FLAG] = {

FILTER_ACTION_RQ_STEERING_FLAG isn't defined anywhere, which causes a
build error; the compiler asks whether you meant FILTER_ACTION_RQ_STEERING:

drivers/net/enic/enic_flow.c:318:3: error: use of undeclared identifier
'FILTER_ACTION_RQ_STEERING_FLAG'; did you mean 'FILTER_ACTION_RQ_STEERING'?


Re: [dpdk-dev] [PATCH v3 6/6] net/enic: flow API documentation

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 4:03 AM, John Daley wrote:
> Update enic NIC guide, release notes and add flow API to the
> supported features list.
> 
> Signed-off-by: John Daley 
> Reviewed-by: Nelson Escobar 

<...>

>  How to build the suite
>  --
> +===

This part looks like git merge error?

> +- The number of filters that can be specified with the Generic Flow API is
> +  dependent on how many header fields are being masked. Use 'flow create' in
> +  a loop to determine how many filters your VIC will support (not more than
> +  1000 for 1300 series VICs. Filter are checked for matching in the order 
> they
> +  were added. Since there currently is no grouping or priority support,
> +  'catch-all' filters should be added last.
> +
> +How to build the suite?
> +---
> +The build instructions for the DPDK suite should be followed. By default
> +the ENIC PMD library will be built into the DPDK library.

<...>


Re: [dpdk-dev] [PATCH v3 1/6] net/enic: flow API skeleton

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 4:03 AM, John Daley wrote:
> Stub callbacks for the generic flow API and a new FLOW debug define.
> 
> Signed-off-by: John Daley 
> Reviewed-by: Nelson Escobar 

<...>

> diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
> index 8e16a71b7..4e8a0d9e0 100644
> --- a/drivers/net/enic/enic_ethdev.c
> +++ b/drivers/net/enic/enic_ethdev.c
> @@ -116,13 +116,28 @@ enicpmd_dev_filter_ctrl(struct rte_eth_dev *dev,
>enum rte_filter_op filter_op,
>void *arg)
>  {
> - int ret = -EINVAL;
> + int ret = 0;
> +
> + ENICPMD_FUNC_TRACE();
>  
> - if (RTE_ETH_FILTER_FDIR == filter_type)
> + if (dev == NULL)
> + return -EINVAL;

dev can't be NULL here if it is only called via filter_ctrl eth_dev_ops

<...>

> diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
> new file mode 100644
> index 0..d25390f8a
> --- /dev/null
> +++ b/drivers/net/enic/enic_flow.c
> @@ -0,0 +1,154 @@
> +/*
> + * Copyright 2008-2017 Cisco Systems, Inc.  All rights reserved.
> + * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
> + *
> + * Copyright (c) 2017, Cisco Systems, Inc.
> + * All rights reserved.

Is this file header correct? The dates, the "Nuova Systems" notice, and
the duplicated Cisco copyright look suspicious.

As a side note, there is also another LICENSE file under net/enic folder

<...>


Re: [dpdk-dev] [PATCH v3 2/6] net/enic: flow API for NICs with advanced filters enabled

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 4:03 AM, John Daley wrote:
> Flow support for 1300 series adapters with the 'Advanced Filter'
> mode enabled via the UCS management interface. This enables:
> Attributes: ingress
> Items: Outer eth, ipv4, ipv6, udp, sctp, tcp, vxlan. Inner eth, ipv4,
>ipv6, udp, tcp.
> Actions: queue, mark, flag and void
> Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
> 
> Signed-off-by: John Daley 
> Reviewed-by: Nelson Escobar 

<...>

> +/** Get the NIC filter capabilties structure */
> +static const struct enic_filter_cap *
> +enic_get_filter_cap(struct enic *enic)
> +{
> + /* FIXME: only support advanced filters for now */
> + if (enic->flow_filter_mode != FILTER_DPDK_1)
> + return (const struct enic_filter_cap *)NULL;
> +
> + if (enic->flow_filter_mode)
> + return &enic_filter_cap[enic->flow_filter_mode];
> +
> + return (const struct enic_filter_cap *)NULL;

Do we need this casting?

<...>

> diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
> index ba0cfd01a..5867acf19 100644
> --- a/drivers/net/enic/enic_rxtx.c
> +++ b/drivers/net/enic/enic_rxtx.c
> @@ -253,8 +253,20 @@ enic_cq_rx_to_pkt_flags(struct cq_desc *cqd, struct 
> rte_mbuf *mbuf)
>   }
>   mbuf->vlan_tci = vlan_tci;
>  
> - /* RSS flag */
> - if (enic_cq_rx_desc_rss_type(cqrd)) {
> + if ((cqd->type_color & CQ_DESC_TYPE_MASK) == CQ_DESC_TYPE_CLASSIFIER) {
> + struct cq_enet_rq_clsf_desc *clsf_cqd;
> + uint16_t filter_id;
> + clsf_cqd = (struct cq_enet_rq_clsf_desc *)cqd;
> + filter_id = clsf_cqd->filter_id;
> + if (filter_id) {
> + pkt_flags |= PKT_RX_FDIR;
> + if (filter_id != ENIC_MAGIC_FILTER_ID) {
> + mbuf->hash.fdir.hi = clsf_cqd->filter_id;
> + pkt_flags |= PKT_RX_FDIR_ID;
> + }
> + }
> + } else if (enic_cq_rx_desc_rss_type(cqrd)) {
> + /* RSS flag */

Is this piece of code related to rte_flow?

"struct cq_enet_rq_clsf_desc" is not defined anywhere, causing build errors.

>   pkt_flags |= PKT_RX_RSS_HASH;
>   mbuf->hash.rss = enic_cq_rx_desc_rss_hash(cqrd);
>   }
> 



Re: [dpdk-dev] [PATCH v5] net/i40e: improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 11:38 AM, Michael Lilja wrote:
> Previously, the FDIR programming time was more than 11 ms on i40e.
> This patch results in an average programming time of 22 usec, with a
> max of 60 usec.
> 
> Signed-off-by: Michael Lilja 

Please keep the maintainers in CC when sending patches.

> 
> ---
> v5:
> * Reinitialization of "i" inconsistent with original intent

It can be useful to keep history about older versions.

> ---

There are two checkpatch warnings, can you please fix them [1], you can
keep Beilei's ack in next version.

[1]
WARNING:LONG_LINE: line over 80 characters
#39: FILE: drivers/net/i40e/i40e_fdir.c:1299:
+   for (i = 0; i < (I40E_FDIR_WAIT_COUNT *
I40E_FDIR_WAIT_INTERVAL_US); i++) {

WARNING:ENOSYS: ENOSYS means 'invalid syscall nr' and nothing else
#67: FILE: drivers/net/i40e/i40e_fdir.c:1319:
+   return -ENOSYS;

<...>


Re: [dpdk-dev] [RFC 17.08] Flow classification library

2017-05-17 Thread Ferruh Yigit
On 5/9/2017 8:24 PM, Morten Brørup wrote:
>> -Original Message-
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Tuesday, May 9, 2017 3:38 PM
>> To: Morten Brørup; Mcnamara, John; dev@dpdk.org
>> Cc: Tahhan, Maryam; Gaëtan Rivet
>> Subject: Re: [dpdk-dev] [RFC 17.08] Flow classification library
>>
>> On 5/6/2017 3:04 PM, Morten Brørup wrote:
>>> Carthago delenda est: Again with the callbacks... why not just let the
>> application call the library's processing functions where appropriate. The
>> hook+callback design pattern tends to impose a specific framework (or order
>> of execution) on the DPDK user, rather than just being a stand-alone
>> library offering some functions. DPDK is not a stack; and one of the
>> reasons we are moving our firmware away from Linux is to avoid being
>> enforced a specific order of processing the packets (through a whole bunch
>> of hooks everywhere in the stack).
>>>
>>
>> I agree on callbacks usage, but I can't see the other option for this case.
>>
>> This is for additional functionality to get flow information, while
>> packet processing happens. So with don't want this functionality to be
>> available always or to be part of the processing. And this data requires
>> each packet to be processed, what can be the "library's processing
>> function" alternative can be?
>>
> 
> As I understand it, your library (and other libraries using the same hook) 
> calls a function for each packet via the PMD RX hook. Why not just let the 
> application call this function (i.e. the callback function) wherever the 
> application developer thinks it is appropriate? If the application calls it 
> as the first thing after rte_eth_rx_burst(), the result will probably be the 
> same as the current hook+callback design.
>

Agreed, I will send an updated RFC, thanks.

> 
>>> Maybe I missed the point of this library, so bear with me if my example
>> is stupid:
>>>
>>> Consider a NAT router application. Does this library support processing
>> ingress packets in the outside->inside direction after they have been
>> processed by the NAT engine? Or how about IP fragments after passing the
>> reassembly engine?
>>
>> The implementation is not there yet; we have packet information, and I
>> guess with more processing of packets, the proper flow information can
>> be created for various cases. But my concern is whether this should be
>> in DPDK.
>>
>> I was thinking to provide API to the application to give the flow
>> information with a specific key, and rest of the processing can be done
>> in upper layer, who calls these APIs.
>>
>>>
>>>
>>> Generally, a generic flow processing library would be great; but such a
>> library would need to support flow processing applications, not just byte
>> counting. Four key functions would be required: 1. Identify which flow a
>> packet belongs to (or "not found"), 2. Create a flow, 3. Destroy a flow,
>> and 4. Iterate through flows (e.g. for aging or listing purposes).
>>
>> Agreed, and where should this be?
>> Part of DPDK, or DPDK providing some APIs to enable this kind of library
>> on top of DPDK?
>>
> 
> Part of DPDK, so it will take advantage of any offload features provided by 
> the advanced NICs. Most network security appliances are flow based, not 
> packet based, so I thought your RFC intended to add flow support beyond RSS 
> hashing to DPDK 17.08.
> 
> Our StraightShaper product is flow based and stateful for each flow. As a 
> simplified example, consider a web server implemented using DPDK... It must 
> get all the packets related to the HTTP request, regardless how these packets 
> arrive (possibly fragmented, possibly via multiple interfaces through 
> multipath routing or link aggregation, etc.). Your current library does not 
> support this, so a flow based product like ours cannot use your library. But 
> it might still be perfectly viable for IPFIX for simple L2/L3 forwarding 
> products.
> 
> 
> Med venlig hilsen / kind regards
> - Morten Brørup
> 
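
A plain-C sketch of the four functions Morten lists (all names here are
hypothetical; a direct-mapped array stands in for a real hash table, so
colliding keys simply evict each other — a real implementation would resolve
collisions and could use hardware flow offloads):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLOW_TABLE_SIZE 1024   /* must be a power of two */

struct flow {
	uint64_t key;       /* e.g. a hash of the 5-tuple */
	uint64_t packets;   /* per-flow state kept by the application */
	uint64_t bytes;
	int in_use;
};

struct flow_table {
	struct flow slots[FLOW_TABLE_SIZE];
};

static void flow_table_init(struct flow_table *t)
{
	memset(t, 0, sizeof(*t));
}

/* 1. Identify which flow a packet belongs to (NULL means "not found"). */
static struct flow *flow_lookup(struct flow_table *t, uint64_t key)
{
	struct flow *f = &t->slots[key & (FLOW_TABLE_SIZE - 1)];

	return (f->in_use && f->key == key) ? f : NULL;
}

/* 2. Create a flow (direct-mapped: an occupied slot is simply evicted). */
static struct flow *flow_create(struct flow_table *t, uint64_t key)
{
	struct flow *f = &t->slots[key & (FLOW_TABLE_SIZE - 1)];

	memset(f, 0, sizeof(*f));
	f->key = key;
	f->in_use = 1;
	return f;
}

/* 3. Destroy a flow. */
static void flow_destroy(struct flow_table *t, uint64_t key)
{
	struct flow *f = flow_lookup(t, key);

	if (f != NULL)
		f->in_use = 0;
}

/* 4. Iterate through flows, e.g. for aging or listing purposes. */
static unsigned int flow_iterate(struct flow_table *t,
				 void (*cb)(struct flow *, void *), void *arg)
{
	unsigned int i, n = 0;

	for (i = 0; i < FLOW_TABLE_SIZE; i++) {
		if (t->slots[i].in_use) {
			if (cb != NULL)
				cb(&t->slots[i], arg);
			n++;
		}
	}
	return n;
}
```

An aging pass, for instance, would be a flow_iterate() callback that calls
flow_destroy() on flows whose timestamp is too old.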



Re: [dpdk-dev] [RFC] Service Cores concept

2017-05-17 Thread Thomas Monjalon
17/05/2017 12:32, Bruce Richardson:
> On Wed, May 17, 2017 at 12:11:10AM +0200, Thomas Monjalon wrote:
> > 03/05/2017 13:29, Harry van Haaren:
> > > The concept is to allow a software function to register itself with EAL as
> > > a "service", which requires CPU time to perform its duties. Multiple
> > > services can be registered in an application, if more than one service
> > > exists. The application can retrieve a list of services, and decide how
> > > many "service cores" to use. The number of service cores is removed
> > > from the application usage, and they are mapped to services based on
> > > an application supplied coremask.
> > > 
> > > The application now continues as normal, without having to manually
> > > schedule and implement arbitration of CPU time for the SW services.
> > 
> > I think it should not be the DPDK responsibility to schedule threads.
> > The mainloops and scheduling are application design choices.
> > 
> > If I understand well the idea of your proposal, it is a helper for
> > the application to configure the thread scheduling of known services.
> > So I think we could add interrupt processing and other thread creations
> > in this concept.
> > Could we also embed the rte_eal_mp_remote_launch() calls in this concept?
> 
> 
> There are a couple of parts of this:
> 1. Allowing libraries and drivers to register the fact that they require
> background processing, e.g. as a SW fallback for functionality that
> would otherwise be implemented in hardware
> 2. Providing support for easily multi-plexing these independent
> functions from different libs onto a different core, compared to the
> normal operation of DPDK of firing a single run-forever function on each
> core.
> 3. Providing support for the application to configure the running of
> these background services on specific cores.
> 4. Once configured, hiding these services and the cores they run on from
> the rest of the application, so that the rest of the app logic does not
> need to change depending on whether service cores are in use or not. For
> instance, removing the service cores from the core list in foreach-lcore
> loops, and preventing the EAL from trying to run app functions on the
> cores when the app calls mp_remote_launch.
> 
> Overall, the objective is to provide us a way to have software
> equivalents of hardware functions in as transparent a manner as
> possible. There is a certain amount of scheduling being done by the
> DPDK, but it is still very much under the control of the app.
> 
> As for other things being able to use this concept, definite +1 for
> interrupt threads and similar. I would not see mp_remote_launch as being
> affected here in any significant way (except from the hiding service
> cores from it, obviously)

OK to register CPU needs for services (including interrupts processing).

Then we could take this opportunity to review how threads are managed.
We will have three types of cores:
- not used
- reserved for services
- used for polling / application processing
It is fine to reserve/define CPU from DPDK point of view.

Then DPDK launch threads on cores. Maybe we should allow the application
to choose how threads are launched and managed.
Keith was talking about a plugin approach for thread management I think.
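
As a rough model of the registry being discussed — names and layout are
hypothetical, loosely following the RFC's rte_service_config (the coremask is
kept as a uint64_t for brevity, which limits this toy model to 64 lcores):

```c
#include <stdint.h>
#include <string.h>

#define MAX_SERVICES 8
#define NAMESIZE 32

typedef void (*service_func)(void *args);

struct service {
	char name[NAMESIZE];
	service_func cb;
	void *cb_args;
	uint64_t coremask;  /* which service cores may run this service */
};

static struct service registry[MAX_SERVICES];
static unsigned int nb_services;

/* A driver registers its background work; it cannot launch threads itself.
 * Returns the service id, or -1 on error. */
static int service_register(const char *name, service_func cb, void *args)
{
	struct service *s;

	if (nb_services >= MAX_SERVICES || name == NULL || cb == NULL)
		return -1;
	s = &registry[nb_services];
	strncpy(s->name, name, NAMESIZE - 1);
	s->cb = cb;
	s->cb_args = args;
	s->coremask = 0;
	return (int)nb_services++;
}

static unsigned int service_get_count(void)
{
	return nb_services;
}

/* The application decides which service cores run which services. */
static int service_set_coremask(int id, uint64_t coremask)
{
	if (id < 0 || (unsigned int)id >= nb_services)
		return -1;
	registry[id].coremask = coremask;
	return 0;
}

/* One iteration of a service core's main loop: run every service
 * whose coremask includes this lcore. */
static void service_core_iter(unsigned int lcore_id)
{
	unsigned int i;

	for (i = 0; i < nb_services; i++)
		if (registry[i].coremask & (1ULL << lcore_id))
			registry[i].cb(registry[i].cb_args);
}

/* Example callback: counts invocations via its argument. */
static void count_cb(void *args)
{
	(*(int *)args)++;
}
```

The application remains in control: it queries the count, inspects names such
as "ethdev_p1_rx_q3" (the example from the RFC), and assigns coremasks before
starting its service cores.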


Re: [dpdk-dev] [RFC PATCH 00/11] net/virtio: packed ring layout

2017-05-17 Thread Jens Freimann
Hi Yuanhan,

On Mon, May 08, 2017 at 01:02:57PM +0800, Yuanhan Liu wrote:
> On Fri, May 05, 2017 at 09:57:11AM -0400, Jens Freimann wrote:
> > I'm implementing the receive path now to eventually get some benchmark
> > results for that as well (Patches not included yet)
> 
> Good to know. Any progress? I'm asking because that was also my plan for
> this week: make Rx work. I haven't started it though.

just curious if you already had a chance to work on this? 

regards,
Jens 


Re: [dpdk-dev] [PATCH v1] net: support PPPOE in software packet type parser

2017-05-17 Thread Chilikin, Andrey
PPPoE consists of two stages with different ethertypes - Discovery and Session, 
so RTE_PTYPE_L2_ETHER_PPPOE looks ambiguous to me. This patch adds only PPPoE 
Session Stage packet type. Should it include PPPoE Discovery Stage as well 
(Ethertype 0x8863)?

/Andrey

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Tuesday, May 16, 2017 9:47 PM
> To: dev@dpdk.org; olivier.m...@6wind.com; Richardson, Bruce
> 
> Cc: Ray Zhang 
> Subject: Re: [dpdk-dev] [PATCH v1] net: support PPPOE in software packet type
> parser
> 
> Any volunteer to review this patch, please?
> 
> 07/04/2017 12:26, Ray Zhang:
> > From: Ray Zhang 
> >
> > Add a new RTE_PTYPE_L2_ETHER_PPPOE and its support in
> > rte_net_get_ptype()
> >
> > Signed-off-by: Ray Zhang 
> > ---
> >
> > Resend the patch
> >
> >  lib/librte_mbuf/rte_mbuf_ptype.h |  7 +++
> >  lib/librte_net/rte_ether.h   | 12 
> >  lib/librte_net/rte_net.c | 19 +++
> >  3 files changed, 38 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h
> > b/lib/librte_mbuf/rte_mbuf_ptype.h
> > index ff6de9d..7dd03de 100644
> > --- a/lib/librte_mbuf/rte_mbuf_ptype.h
> > +++ b/lib/librte_mbuf/rte_mbuf_ptype.h
> > @@ -150,6 +150,13 @@
> >   */
> >  #define RTE_PTYPE_L2_ETHER_QINQ 0x0007
> >  /**
> > + * PPPOE packet type.
> > + *
> > + * Packet format:
> > + * <'ether type'=[0x8864]>
> > + */
> > +#define RTE_PTYPE_L2_ETHER_PPPOE0x0008
> > +/**
> >   * Mask of layer 2 packet types.
> >   * It is used for outer packet for tunneling cases.
> >   */
> > diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
> > index ff3d065..d76edb3 100644
> > --- a/lib/librte_net/rte_ether.h
> > +++ b/lib/librte_net/rte_ether.h
> > @@ -323,12 +323,24 @@ struct vxlan_hdr {
> > uint32_t vx_vni;   /**< VNI (24) + Reserved (8). */
> >  } __attribute__((__packed__));
> >
> > +/**
> > + * PPPOE protocol header
> > + */
> > +struct pppoe_hdr {
> > +   uint8_t  type_ver;
> > +   uint8_t  code;
> > +   uint16_t sid;
> > +   uint16_t length;
> > +   uint16_t proto;
> > +} __attribute__((packed));
> > +
> >  /* Ethernet frame types */
> >  #define ETHER_TYPE_IPv4 0x0800 /**< IPv4 Protocol. */
> >  #define ETHER_TYPE_IPv6 0x86DD /**< IPv6 Protocol. */
> >  #define ETHER_TYPE_ARP  0x0806 /**< Arp Protocol. */
> >  #define ETHER_TYPE_RARP 0x8035 /**< Reverse Arp Protocol. */
> >  #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
> > +#define ETHER_TYPE_PPPOE 0x8864 /**< PPPoE Protocol */
> >  #define ETHER_TYPE_QINQ 0x88A8 /**< IEEE 802.1ad QinQ tagging. */
> >  #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */
> >  #define ETHER_TYPE_SLOW 0x8809 /**< Slow protocols (LACP and Marker). */
> > diff --git a/lib/librte_net/rte_net.c b/lib/librte_net/rte_net.c
> > index a8c7aff..439c2f6 100644
> > --- a/lib/librte_net/rte_net.c
> > +++ b/lib/librte_net/rte_net.c
> > @@ -302,6 +302,25 @@ uint32_t rte_net_get_ptype(const struct rte_mbuf
> *m,
> > off += 2 * sizeof(*vh);
> > hdr_lens->l2_len += 2 * sizeof(*vh);
> > proto = vh->eth_proto;
> > +   } else if (proto == rte_cpu_to_be_16(ETHER_TYPE_PPPOE)) {
> > +   const struct pppoe_hdr *ph;
> > +   struct pppoe_hdr ph_copy;
> > +
> > +   pkt_type = RTE_PTYPE_L2_ETHER_PPPOE;
> > +   ph = rte_pktmbuf_read(m, off, sizeof(*ph), &ph_copy);
> > +   if (unlikely(ph == NULL))
> > +   return pkt_type;
> > +
> > +   off += sizeof(*ph);
> > +   hdr_lens->l2_len += sizeof(*ph);
> > +   if (ph->code != 0) /* Not Session Data */
> > +   return pkt_type;
> > +   if (ph->proto == rte_cpu_to_be_16(0x21))
> > +   proto = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
> > +   else if (ph->proto == rte_cpu_to_be_16(0x57))
> > +   proto = rte_cpu_to_be_16(ETHER_TYPE_IPv6);
> > +   else
> > +   return pkt_type;
> > }
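
For reference, the Session/Discovery distinction can be captured in a small
helper. This is a hypothetical sketch, not part of the patch; it mirrors the
patch's logic of descending only into Session-stage (0x8864) data frames with
an IPv4/IPv6 payload, and treats Discovery-stage (0x8863) frames as terminal
since they carry no L3 payload:

```c
#include <stdint.h>

/* Ethertypes: 0x8864 (Session) is what the patch parses; 0x8863 (Discovery)
 * is the stage Andrey asks about. */
#define ETHER_TYPE_PPPOE_DISC 0x8863
#define ETHER_TYPE_PPPOE_SESS 0x8864
#define ETHER_TYPE_IPv4       0x0800
#define ETHER_TYPE_IPv6       0x86DD

/* PPP payload protocol numbers used by the patch
 * (host byte order here; rte_net.c compares big-endian values). */
#define PPP_PROTO_IPV4 0x0021
#define PPP_PROTO_IPV6 0x0057

/* Returns the ethertype the parser should continue with, or 0 to stop
 * at the PPPoE layer. */
static uint16_t
pppoe_next_ethertype(uint16_t ether_type, uint8_t code, uint16_t ppp_proto)
{
	if (ether_type != ETHER_TYPE_PPPOE_SESS)
		return 0;    /* Discovery stage (or not PPPoE at all) */
	if (code != 0)
		return 0;    /* not Session data */
	if (ppp_proto == PPP_PROTO_IPV4)
		return ETHER_TYPE_IPv4;
	if (ppp_proto == PPP_PROTO_IPV6)
		return ETHER_TYPE_IPv6;
	return 0;            /* e.g. LCP/IPCP control frames: stop here */
}
```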



[dpdk-dev] [PATCH 1/3] net/sfc: carefully cleanup on init failure and shutdown

2017-05-17 Thread Andrew Rybchenko
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
 drivers/net/sfc/sfc_ethdev.c | 31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index d5583ec..e4f051a 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1407,7 +1407,7 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
"Insufficient Hw/FW capabilities to use Rx datapath %s",
rx_name);
rc = EINVAL;
-   goto fail_dp_rx;
+   goto fail_dp_rx_caps;
}
} else {
sa->dp_rx = sfc_dp_find_rx_by_caps(&sfc_dp_head, avail_caps);
@@ -1440,7 +1440,7 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
"Insufficient Hw/FW capabilities to use Tx datapath %s",
tx_name);
rc = EINVAL;
-   goto fail_dp_tx;
+   goto fail_dp_tx_caps;
}
} else {
sa->dp_tx = sfc_dp_find_tx_by_caps(&sfc_dp_head, avail_caps);
@@ -1460,14 +1460,33 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 
return 0;
 
+fail_dp_tx_caps:
+   sa->dp_tx = NULL;
+
 fail_dp_tx:
 fail_kvarg_tx_datapath:
+fail_dp_rx_caps:
+   sa->dp_rx = NULL;
+
 fail_dp_rx:
 fail_kvarg_rx_datapath:
return rc;
 }
 
 static void
+sfc_eth_dev_clear_ops(struct rte_eth_dev *dev)
+{
+   struct sfc_adapter *sa = dev->data->dev_private;
+
+   dev->dev_ops = NULL;
+   dev->rx_pkt_burst = NULL;
+   dev->tx_pkt_burst = NULL;
+
+   sa->dp_tx = NULL;
+   sa->dp_rx = NULL;
+}
+
+static void
 sfc_register_dp(void)
 {
/* Register once */
@@ -1549,6 +1568,8 @@ sfc_eth_dev_init(struct rte_eth_dev *dev)
return 0;
 
 fail_attach:
+   sfc_eth_dev_clear_ops(dev);
+
 fail_set_ops:
sfc_unprobe(sa);
 
@@ -1577,16 +1598,14 @@ sfc_eth_dev_uninit(struct rte_eth_dev *dev)
 
sfc_adapter_lock(sa);
 
+   sfc_eth_dev_clear_ops(dev);
+
sfc_detach(sa);
sfc_unprobe(sa);
 
rte_free(dev->data->mac_addrs);
dev->data->mac_addrs = NULL;
 
-   dev->dev_ops = NULL;
-   dev->rx_pkt_burst = NULL;
-   dev->tx_pkt_burst = NULL;
-
sfc_kvargs_cleanup(sa);
 
sfc_adapter_unlock(sa);
-- 
2.9.4



[dpdk-dev] [PATCH 2/3] net/sfc: use locally stored data for logging

2017-05-17 Thread Andrew Rybchenko
Required to be able to use logging in the secondary process
where Ethernet device pointer stored in sfc_adapter is invalid.

Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
 drivers/net/sfc/sfc.h|  2 ++
 drivers/net/sfc/sfc_debug.h  | 10 --
 drivers/net/sfc/sfc_ethdev.c |  3 +++
 drivers/net/sfc/sfc_log.h| 14 ++
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 7927678..772a713 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -179,6 +179,8 @@ struct sfc_adapter {
 */
rte_spinlock_t  lock;
enum sfc_adapter_state  state;
+   struct rte_pci_addr pci_addr;
+   uint16_tport_id;
struct rte_eth_dev  *eth_dev;
struct rte_kvargs   *kvargs;
booldebug_init;
diff --git a/drivers/net/sfc/sfc_debug.h b/drivers/net/sfc/sfc_debug.h
index f4fe044..92eba9c 100644
--- a/drivers/net/sfc/sfc_debug.h
+++ b/drivers/net/sfc/sfc_debug.h
@@ -47,14 +47,12 @@
 /* Log PMD message, automatically add prefix and \n */
 #define sfc_panic(sa, fmt, args...) \
do {\
-   const struct rte_eth_dev *_dev = (sa)->eth_dev; \
-   const struct rte_pci_device *_pci_dev = \
-   RTE_ETH_DEV_TO_PCI(_dev);   \
+   const struct sfc_adapter *_sa = (sa);   \
\
rte_panic("sfc " PCI_PRI_FMT " #%" PRIu8 ": " fmt "\n", \
- _pci_dev->addr.domain, _pci_dev->addr.bus,\
- _pci_dev->addr.devid, _pci_dev->addr.function,\
- _dev->data->port_id, ##args); \
+ _sa->pci_addr.domain, _sa->pci_addr.bus,  \
+ _sa->pci_addr.devid, _sa->pci_addr.function,  \
+ _sa->port_id, ##args);\
} while (0)
 
 #endif /* _SFC_DEBUG_H_ */
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index e4f051a..d6bba1d 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -1513,6 +1513,9 @@ sfc_eth_dev_init(struct rte_eth_dev *dev)
sfc_register_dp();
 
/* Required for logging */
+   sa->pci_addr = pci_dev->addr;
+   sa->port_id = dev->data->port_id;
+
sa->eth_dev = dev;
 
/* Copy PCI device info to the dev->data */
diff --git a/drivers/net/sfc/sfc_log.h b/drivers/net/sfc/sfc_log.h
index d0f8921..6c43925 100644
--- a/drivers/net/sfc/sfc_log.h
+++ b/drivers/net/sfc/sfc_log.h
@@ -35,18 +35,16 @@
 /* Log PMD message, automatically add prefix and \n */
 #define SFC_LOG(sa, level, ...) \
do {\
-   const struct rte_eth_dev *_dev = (sa)->eth_dev; \
-   const struct rte_pci_device *_pci_dev = \
-   RTE_ETH_DEV_TO_PCI(_dev);   \
+   const struct sfc_adapter *_sa = (sa);   \
\
RTE_LOG(level, PMD, \
RTE_FMT("sfc_efx " PCI_PRI_FMT " #%" PRIu8 ": " \
RTE_FMT_HEAD(__VA_ARGS__,) "\n",\
-   _pci_dev->addr.domain,  \
-   _pci_dev->addr.bus, \
-   _pci_dev->addr.devid,   \
-   _pci_dev->addr.function,\
-   _dev->data->port_id,\
+   _sa->pci_addr.domain,   \
+   _sa->pci_addr.bus,  \
+   _sa->pci_addr.devid,\
+   _sa->pci_addr.function, \
+   _sa->port_id,   \
RTE_FMT_TAIL(__VA_ARGS__,)));   \
} while (0)
 
-- 
2.9.4



[dpdk-dev] [PATCH 3/3] net/sfc: support multi-process

2017-05-17 Thread Andrew Rybchenko
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
 doc/guides/nics/features/sfc_efx.ini |   1 +
 drivers/net/sfc/sfc.h|  11 +++
 drivers/net/sfc/sfc_dp_rx.h  |   1 +
 drivers/net/sfc/sfc_dp_tx.h  |   1 +
 drivers/net/sfc/sfc_ef10_rx.c|   2 +-
 drivers/net/sfc/sfc_ef10_tx.c|   5 +-
 drivers/net/sfc/sfc_ethdev.c | 133 ++-
 7 files changed, 148 insertions(+), 6 deletions(-)

diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini
index 7957b5e..1db7f67 100644
--- a/doc/guides/nics/features/sfc_efx.ini
+++ b/doc/guides/nics/features/sfc_efx.ini
@@ -28,6 +28,7 @@ Packet type parsing  = Y
 Basic stats  = Y
 Extended stats   = Y
 FW version   = Y
+Multiprocess aware   = Y
 BSD nic_uio  = Y
 Linux UIO= Y
 Linux VFIO   = Y
diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 772a713..007ed24 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -225,7 +225,18 @@ struct sfc_adapter {
uint8_t rss_key[SFC_RSS_KEY_SIZE];
 #endif
 
+   /*
+* Shared memory copy of the Rx datapath name to be used by
+* the secondary process to find Rx datapath to be used.
+*/
+   char*dp_rx_name;
const struct sfc_dp_rx  *dp_rx;
+
+   /*
+* Shared memory copy of the Tx datapath name to be used by
+* the secondary process to find Tx datapath to be used.
+*/
+   char*dp_tx_name;
const struct sfc_dp_tx  *dp_tx;
 };
 
diff --git a/drivers/net/sfc/sfc_dp_rx.h b/drivers/net/sfc/sfc_dp_rx.h
index 9d05a4b..a7b8278 100644
--- a/drivers/net/sfc/sfc_dp_rx.h
+++ b/drivers/net/sfc/sfc_dp_rx.h
@@ -161,6 +161,7 @@ struct sfc_dp_rx {
 
unsigned intfeatures;
 #define SFC_DP_RX_FEAT_SCATTER 0x1
+#define SFC_DP_RX_FEAT_MULTI_PROCESS   0x2
sfc_dp_rx_qcreate_t *qcreate;
sfc_dp_rx_qdestroy_t*qdestroy;
sfc_dp_rx_qstart_t  *qstart;
diff --git a/drivers/net/sfc/sfc_dp_tx.h b/drivers/net/sfc/sfc_dp_tx.h
index 2bb9a2e..c1c3419 100644
--- a/drivers/net/sfc/sfc_dp_tx.h
+++ b/drivers/net/sfc/sfc_dp_tx.h
@@ -135,6 +135,7 @@ struct sfc_dp_tx {
 #define SFC_DP_TX_FEAT_VLAN_INSERT 0x1
 #define SFC_DP_TX_FEAT_TSO 0x2
 #define SFC_DP_TX_FEAT_MULTI_SEG   0x4
+#define SFC_DP_TX_FEAT_MULTI_PROCESS   0x8
sfc_dp_tx_qcreate_t *qcreate;
sfc_dp_tx_qdestroy_t*qdestroy;
sfc_dp_tx_qstart_t  *qstart;
diff --git a/drivers/net/sfc/sfc_ef10_rx.c b/drivers/net/sfc/sfc_ef10_rx.c
index 1484bab..60812cb 100644
--- a/drivers/net/sfc/sfc_ef10_rx.c
+++ b/drivers/net/sfc/sfc_ef10_rx.c
@@ -699,7 +699,7 @@ struct sfc_dp_rx sfc_ef10_rx = {
.type   = SFC_DP_RX,
.hw_fw_caps = SFC_DP_HW_FW_CAP_EF10,
},
-   .features   = 0,
+   .features   = SFC_DP_RX_FEAT_MULTI_PROCESS,
.qcreate= sfc_ef10_rx_qcreate,
.qdestroy   = sfc_ef10_rx_qdestroy,
.qstart = sfc_ef10_rx_qstart,
diff --git a/drivers/net/sfc/sfc_ef10_tx.c b/drivers/net/sfc/sfc_ef10_tx.c
index bac9baa..5482db8 100644
--- a/drivers/net/sfc/sfc_ef10_tx.c
+++ b/drivers/net/sfc/sfc_ef10_tx.c
@@ -534,7 +534,8 @@ struct sfc_dp_tx sfc_ef10_tx = {
.type   = SFC_DP_TX,
.hw_fw_caps = SFC_DP_HW_FW_CAP_EF10,
},
-   .features   = SFC_DP_TX_FEAT_MULTI_SEG,
+   .features   = SFC_DP_TX_FEAT_MULTI_SEG |
+ SFC_DP_TX_FEAT_MULTI_PROCESS,
.qcreate= sfc_ef10_tx_qcreate,
.qdestroy   = sfc_ef10_tx_qdestroy,
.qstart = sfc_ef10_tx_qstart,
@@ -549,7 +550,7 @@ struct sfc_dp_tx sfc_ef10_simple_tx = {
.name   = SFC_KVARG_DATAPATH_EF10_SIMPLE,
.type   = SFC_DP_TX,
},
-   .features   = 0,
+   .features   = SFC_DP_TX_FEAT_MULTI_PROCESS,
.qcreate= sfc_ef10_tx_qcreate,
.qdestroy   = sfc_ef10_tx_qdestroy,
.qstart = sfc_ef10_tx_qstart,
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index d6bba1d..0bd2de4 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -913,6 +913,10 @@ sfc_set_mc_addr_list(struct rte_eth_dev *dev, struct ether_addr *mc_addr_set,
return -rc;
 }
 
+/*
+ * The function is used by the secondary process as well. It must not
+ * use any process-local pointers from the adapter data.
+ */
 static void
 sfc_rx_queue_info_get(str
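
The feature bits added by the patch are OR'd into a per-datapath mask, so
selecting a usable datapath reduces to a subset check. A minimal sketch — the
flag values mirror the patch, while the helper name is hypothetical:

```c
#include <stdint.h>

/* Tx datapath feature flags, as defined in sfc_dp_tx.h by the patch. */
#define SFC_DP_TX_FEAT_VLAN_INSERT   0x1
#define SFC_DP_TX_FEAT_TSO           0x2
#define SFC_DP_TX_FEAT_MULTI_SEG     0x4
#define SFC_DP_TX_FEAT_MULTI_PROCESS 0x8

/* Hypothetical helper: does a datapath advertise every required feature? */
static int dp_supports(unsigned int features, unsigned int required)
{
	return (features & required) == required;
}
```

A secondary process, for example, would require SFC_DP_TX_FEAT_MULTI_PROCESS
in the mask of the datapath chosen by the primary before attaching to it.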

Re: [dpdk-dev] [RFC] Service Cores concept

2017-05-17 Thread Bruce Richardson
On Wed, May 17, 2017 at 01:28:14PM +0200, Thomas Monjalon wrote:
> 17/05/2017 12:32, Bruce Richardson:
> > On Wed, May 17, 2017 at 12:11:10AM +0200, Thomas Monjalon wrote:
> > > 03/05/2017 13:29, Harry van Haaren:
> > > > The concept is to allow a software function to register itself with EAL as
> > > > a "service", which requires CPU time to perform its duties. Multiple
> > > > services can be registered in an application, if more than one service
> > > > exists. The application can retrieve a list of services, and decide how
> > > > many "service cores" to use. The number of service cores is removed
> > > > from the application usage, and they are mapped to services based on
> > > > an application supplied coremask.
> > > > 
> > > > The application now continues as normal, without having to manually
> > > > schedule and implement arbitration of CPU time for the SW services.
> > > 
> > > I think it should not be the DPDK responsibility to schedule threads.
> > > The mainloops and scheduling are application design choices.
> > > 
> > > If I understand well the idea of your proposal, it is a helper for
> > > the application to configure the thread scheduling of known services.
> > > So I think we could add interrupt processing and other thread creations
> > > in this concept.
> > > Could we also embed the rte_eal_mp_remote_launch() calls in this concept?
> > 
> > 
> > There are a couple of parts of this:
> > 1. Allowing libraries and drivers to register the fact that they require
> > background processing, e.g. as a SW fallback for functionality that
> > would otherwise be implemented in hardware
> > 2. Providing support for easily multi-plexing these independent
> > functions from different libs onto a different core, compared to the
> > normal operation of DPDK of firing a single run-forever function on each
> > core.
> > 3. Providing support for the application to configure the running of
> > these background services on specific cores.
> > 4. Once configured, hiding these services and the cores they run on from
> > the rest of the application, so that the rest of the app logic does not
> > need to change depending on whether service cores are in use or not. For
> > instance, removing the service cores from the core list in foreach-lcore
> > loops, and preventing the EAL from trying to run app functions on the
> > cores when the app calls mp_remote_launch.
> > 
> > Overall, the objective is to provide us a way to have software
> > equivalents of hardware functions in as transparent a manner as
> > possible. There is a certain amount of scheduling being done by the
> > DPDK, but it is still very much under the control of the app.
> > 
> > As for other things being able to use this concept, definite +1 for
> > interrupt threads and similar. I would not see mp_remote_launch as being
> > affected here in any significant way (except from the hiding service
> > cores from it, obviously)
> 
> OK to register CPU needs for services (including interrupts processing).
> 
> Then we could take this opportunity to review how threads are managed.
> We will have three types of cores:
> - not used
> - reserved for services
> - used for polling / application processing
> It is fine to reserve/define CPU from DPDK point of view.
> 
> Then DPDK launch threads on cores. Maybe we should allow the application
> to choose how threads are launched and managed.
> Keith was talking about a plugin approach for thread management I think.

For thread management, I'd view us as extending what we have with some
EAL APIs rather than replacing what is there already. What I think we
could with would be APIs to:
* spawn an additional thread on a core i.e. add a bit to the coremask
* shutdown a thread on a core i.e. remove a bit from the coremask
* register an existing thread with DPDK, i.e. give it an lcore_id
  internally so that it can use DPDK data structures as a first-class
  citizen.

However, while needed, this is separate work from the service cores concept.

Regards,
/Bruce
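
The three APIs sketched above can be modeled in plain C with pthreads.
Everything here is hypothetical (the names, the uint64_t coremask, the
thread-local lcore id) and it omits real CPU pinning, which a full
implementation would add via pthread_setaffinity_np():

```c
#include <pthread.h>
#include <stdint.h>

#define MAX_LCORES 64

static uint64_t active_coremask;  /* which "lcores" currently have a thread */
static pthread_mutex_t reg_lock = PTHREAD_MUTEX_INITIALIZER;
static __thread int per_thread_lcore_id = -1;  /* like DPDK's per-lcore id */

/* Third API: register an existing thread, giving it an lcore id so it can
 * use per-lcore data structures as a first-class citizen. */
static int lcore_register_self(int lcore_id)
{
	if (lcore_id < 0 || lcore_id >= MAX_LCORES)
		return -1;
	pthread_mutex_lock(&reg_lock);
	if (active_coremask & (1ULL << lcore_id)) {
		pthread_mutex_unlock(&reg_lock);
		return -1;  /* lcore id already taken */
	}
	active_coremask |= 1ULL << lcore_id;
	pthread_mutex_unlock(&reg_lock);
	per_thread_lcore_id = lcore_id;
	return 0;
}

/* Second API: remove the calling thread's bit from the coremask. */
static void lcore_unregister_self(void)
{
	if (per_thread_lcore_id < 0)
		return;
	pthread_mutex_lock(&reg_lock);
	active_coremask &= ~(1ULL << per_thread_lcore_id);
	pthread_mutex_unlock(&reg_lock);
	per_thread_lcore_id = -1;
}

struct spawn_args {
	int lcore_id;
	void *(*fn)(void *);
	void *arg;
};

static void *spawn_trampoline(void *p)
{
	struct spawn_args *a = p;
	void *ret;

	if (lcore_register_self(a->lcore_id) != 0)
		return NULL;
	ret = a->fn(a->arg);
	lcore_unregister_self();  /* clear the coremask bit on exit */
	return ret;
}

/* First API: spawn an additional thread on a given lcore. */
static int lcore_spawn(pthread_t *tid, struct spawn_args *a)
{
	return pthread_create(tid, NULL, spawn_trampoline, a);
}

/* Example worker: sets a flag so the caller can observe it ran. */
static void *set_flag(void *arg)
{
	*(int *)arg = 1;
	return arg;
}
```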


Re: [dpdk-dev] [RFC] service core concept header implementation

2017-05-17 Thread Ananyev, Konstantin
Hi Harry,
I had a look, my questions/comments below.
From my perspective it looks like an attempt to introduce a simple scheduling
library inside DPDK. Some things are still left unclear to me, but probably I just
misunderstood your intentions here.
Thanks
Konstantin

> 
> This patch adds a service header to DPDK EAL. This header is
> an RFC for a mechanism to allow DPDK components to request a
> callback function to be invoked.
> 
> The application can set the number of service cores, and
> a coremask for each particular services. The implementation
> of this functionality in rte_services.c (not completed) would
> use atomics to ensure that a service can only be polled by a
> single lcore at a time.
> 
> Signed-off-by: Harry van Haaren 
> ---
>  lib/librte_eal/common/include/rte_service.h | 211 
> 
>  1 file changed, 211 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/rte_service.h
> 

...

> +
> +/**
> + * Signature of callback back function to run a service.
> + */
> +typedef void (*rte_eal_service_func)(void *args);
> +
> +struct rte_service_config {
> + /* name of the service */
> + char name[RTE_SERVICE_NAMESIZE];
> + /* cores that run this service */
> + uint64_t coremask;

Why uint64_t?
Even now by default we support more than 64 cores.
We have rte_cpuset_t for such things.

Ok, the first main question:
Is that possible to register multiple services with intersecting coremasks or 
not?
If not, then I suppose there is no practical difference from what we have right 
now
with eal_remote_launch() .
If yes, then several questions arise:
   a) How is the service scheduling function going to switch from one service
   to another on a particular core?
   As we don't have an interrupt framework for that, I suppose the only
   choice is to rely on the service voluntarily giving up the CPU. Is that so?
 b) Would it be possible to specify percentage of core cycles that service can 
consume?
 I suppose the answer is 'no', at least for current iteration. 

> + /* when set, multiple lcores can run this service simultaneously without
> +  * the need for software atomics to ensure that two cores do not
> +  * attempt to run the service at the same time.
> +  */
> + uint8_t multithread_capable;

Ok, and what would happen if this flag is not set, and user specified more
than one cpu in the coremask?

> +};
> +
> +/** @internal - this will not be visible to application, defined in a 
> separate
> + * rte_eal_service_impl.h header.

Why not?
If the application  registers the service, it for sure needs to know what 
exactly that service
would do (cb func and data).

> The application does not need to be exposed
> + * to the actual function pointers - so hide them. */
> +struct rte_service {
> + /* callback to be called to run the service */
> + rte_eal_service_func cb;
> + /* args to the callback function */
> + void *cb_args;

If user plans to run that service on multiple cores, then he might
need a separate instance of cb_args for each core.

> + /* configuration of the service */
> + struct rte_service_config config;
> +};
> +
> +/**
> + * @internal - Only DPDK internal components request "service" cores.
> + *
> + * Registers a service in EAL.
> + *
> + * Registered services' configurations are exposed to an application using
> + * *rte_eal_service_get_all*. These services have names which can be 
> understood
> + * by the application, and the application logic can decide how to allocate
> + * cores to run the various services.
> + *
> + * This function is expected to be called by a DPDK component to indicate 
> that
> + * it requires a CPU to run a specific function in order for it to perform its
> + * processing. An example of such a component is the eventdev software PMD.
> + *
> + * The config struct should be filled in as appropriate by the PMD. For 
> example
> + * the name field should include some level of detail (e.g. "ethdev_p1_rx_q3"
> + * might mean RX from an ethdev from port 1, on queue 3).
> + *
> + * @param service
> + *   The service structure to be registered with EAL.
> + *
> + * @return
> + *   On success, zero
> + *   On failure, a negative error
> + */
> +int rte_eal_service_register(const struct rte_service *service);
> +
> +/**
> + * Get the count of services registered in the EAL.
> + *
> + * @return the number of services registered with EAL.
> + */
> +uint32_t rte_eal_service_get_count();
> +
> +/**
> + * Writes all registered services to the application supplied array.
> + *
> + * This function can be used by the application to understand if and what
> + * services require running. Each service provides a config struct exposing
> + * attributes of the service, which can be used by the application to decide 
> on
> + * its strategy of running services on cores.
> + *
> + * @param service_config
> + *   An array of service config structures to be filled in
> + *
> + * @p

Re: [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD

2017-05-17 Thread Ferruh Yigit
On 3/20/2017 3:00 PM, Thomas Monjalon wrote:
> There have been some discussions on this new PMD and it will be
> discussed today in the techboard meeting.
> 
> I would like to expose my view and summarize the solutions I have heard.
> First it is important to remind that everyone agrees on the need for
> this feature, i.e. masking the hotplug events by maintaining an ethdev
> object even without real underlying device.
> 
> 1/ 
> The proposal from Gaetan is to add a failsafe driver with 2 features:
>   * masking underlying device
>   * limited and small failover code to switch from a device
> to another one, with the same centralized configuration
> The latter feature makes think to the bonding driver, but it could be
> kept limited without any intent of implementing real bonding features.
> 
> 2/ 
> If we really want to merge failsafe and bonding features, we could
> create a new bonding driver with centralized configuration.
> The legacy bonding driver lets each slave be configured separately.
> It is a different model and we should not mix them.
> If one is better, it could be deprecated later.
> 
> 3/
> It can be tried to implement the failsafe feature into the bonding
> driver, as Neil suggests.
> However, I am not sure it would work very well or would be easy to use.
> 
> 4/
> We can implement only the failsafe feature as a PMD and use it to wrap
> the slaves of the bonding driver.
> So the order of link would be 
>   bonding -> failsafe -> real device
> In this model, failsafe can have only one slave and do not implement
> the fail-over feature.
> 

Tech board decided [1] to "reconsider" the PMD for this release (17.08).
So, let's start it :)

I think it is a good idea to continue on top of the above summary. Is there a
plan for how to proceed?

Thanks,
ferruh

[1]
http://dpdk.org/ml/archives/dev/2017-March/061009.html


Re: [dpdk-dev] [RFC] service core concept header implementation

2017-05-17 Thread Bruce Richardson
On Wed, May 17, 2017 at 12:47:52PM +, Ananyev, Konstantin wrote:
> Hi Harry,
> I had a look, my questions/comments below.
> From my perspective it looks like an attempt to introduce a simple scheduling
> library inside DPDK. Some things are still left unclear to me, but probably I just
> misunderstood your intentions here.
> Thanks
> Konstantin
> 

Hi Konstantin,

Thanks for the feedback, the specific detail of which I'll perhaps leave
for Harry to reply to or include in a later version of this patchset.
However, from a higher level, I think the big difference in what we
envisage compared to what you suggest in your comments is that these are
not services set up by the application. If the app wants to run
something it uses remote_launch as now. This is instead for other
components to request threads - or a share of a thread - for their own
use, since they cannot call remote_launch directly, as the app owns the
threads, not the individual libraries.
See also comments made in reply to Thomas mail.

Hope this helps clarify things a little.

/Bruce

> > 
> > This patch adds a service header to DPDK EAL. This header is
> > an RFC for a mechanism to allow DPDK components to request a
> > callback function to be invoked.
> > 
> > The application can set the number of service cores, and
> > a coremask for each particular service. The implementation
> > of this functionality in rte_services.c (not completed) would
> > use atomics to ensure that a service can only be polled by a
> > single lcore at a time.
> > 
> > Signed-off-by: Harry van Haaren 
> > ---
> >  lib/librte_eal/common/include/rte_service.h | 211 
> > 
> >  1 file changed, 211 insertions(+)
> >  create mode 100644 lib/librte_eal/common/include/rte_service.h
> > 
> 
> ...
> 
> > +
> > +/**
> > + * Signature of callback back function to run a service.
> > + */
> > +typedef void (*rte_eal_service_func)(void *args);
> > +
> > +struct rte_service_config {
> > +   /* name of the service */
> > +   char name[RTE_SERVICE_NAMESIZE];
> > +   /* cores that run this service */
> > +   uint64_t coremask;
> 
> Why uint64_t?
> Even now by default we support more than 64 cores.
> We have rte_cpuset_t for such things.
> 
> Ok, the first main question:
> Is that possible to register multiple services with intersecting coremasks or 
> not?
> If not, then I suppose there is no practical difference from what we have 
> right now
> with eal_remote_launch() .
> If yes, then several questions arise:
>    a) How is the service scheduling function going to switch from one
> service to another on a particular core?
>    As we don't have an interrupt framework for that, I suppose the only
> choice is to rely on the service voluntarily giving up the CPU. Is that so?
>  b) Would it be possible to specify percentage of core cycles that service 
> can consume?
>  I suppose the answer is 'no', at least for current iteration. 
> 
> > +   /* when set, multiple lcores can run this service simultaneously without
> > +* the need for software atomics to ensure that two cores do not
> > +* attempt to run the service at the same time.
> > +*/
> > +   uint8_t multithread_capable;
> 
> Ok, and what would happen if this flag is not set, and user specified more
> than one cpu in the coremask?
> 
> > +};
> > +
> > +/** @internal - this will not be visible to application, defined in a 
> > separate
> > + * rte_eal_service_impl.h header.
> 
> Why not?
> If the application  registers the service, it for sure needs to know what 
> exactly that service
> would do (cb func and data).
> 
> > The application does not need to be exposed
> > + * to the actual function pointers - so hide them. */
> > +struct rte_service {
> > +   /* callback to be called to run the service */
> > +   rte_eal_service_func cb;
> > +   /* args to the callback function */
> > +   void *cb_args;
> 
> If user plans to run that service on multiple cores, then he might
> need a separate instance of cb_args for each core.
> 
> > +   /* configuration of the service */
> > +   struct rte_service_config config;
> > +};
> > +
> > +/**
> > + * @internal - Only DPDK internal components request "service" cores.
> > + *
> > + * Registers a service in EAL.
> > + *
> > + * Registered services' configurations are exposed to an application using
> > + * *rte_eal_service_get_all*. These services have names which can be 
> > understood
> > + * by the application, and the application logic can decide how to allocate
> > + * cores to run the various services.
> > + *
> > + * This function is expected to be called by a DPDK component to indicate 
> > that
> > + * it requires a CPU to run a specific function in order for it to perform 
> > its
> > + * processing. An example of such a component is the eventdev software PMD.
> > + *
> > + * The config struct should be filled in as appropriate by the PMD. For 
> > example
> > + * the name field should include some level of detail (e.g

Re: [dpdk-dev] Proposed schedule dates for DPDK 17.08, 17.11 and 18.02

2017-05-17 Thread Thomas Monjalon
Hi,

04/05/2017 18:52, Mcnamara, John:
> The current 17.08 schedule dates are:
> 
> 17.08
> * Proposal deadline:May  28, 2017
> * Integration deadline: June 29, 2017
> * Release:  August1, 2017

I think these dates are good and must be strictly respected.
It would be very bad (at least for me) to release later in August.

> The following are proposed dates for 17.11 and 18.02.
> 
> 17.11
> * Proposal deadline:August25, 2017

As we can see in the following picture [1],
we work less in August and tend to send the proposals at the last minute.
[1] https://s12.postimg.org/g9uluz9u5/patchesv1.png
I don't know what can be done to avoid having hard days of reviews
after the August deadline.
Should we cut this period short or allow a few more days?

> * Integration deadline: September 29, 2017
> * Release:  November   2, 2017

I'm OK for these ones.

> 18.02
> * Proposal deadline:November  24, 2017
> * Integration deadline: December  29, 2017

The Christmas / New Year holidays will be from Dec 25 to Jan 1.
I think we could move the integration deadline to Jan 5.

> * Release:  February   2, 2018

The Chinese Spring Festival will be from Feb 15 to 21.
So we have a good margin.

> These dates need to be discussed/agreed in the community since there are a
> lot of different holidays in these periods: August holidays, Christmas,
> New Year, Spring Festival.




Re: [dpdk-dev] [PATCH v3] drivers/net: document missing speed capabilities feature

2017-05-17 Thread Mcnamara, John


> -Original Message-
> From: Yigit, Ferruh
> Sent: Monday, May 15, 2017 1:31 PM
> To: Shepard Siegel ; Ed Czeck
> ; John Miller ;
> Mcnamara, John ; Harish Patil
> ; Rasesh Mody ; Rahul
> Lakkireddy ; Hemant Agrawal
> ; Shreyansh Jain ; Lu,
> Wenzhuo ; Marcin Wojtas ; Michal
> Krawczyk ; Guy Tzalik ; Evgeny
> Schemeilin ; Chen, Jing D ;
> Zhang, Helin ; Wu, Jingjing
> ; Ananyev, Konstantin
> ; Adrien Mazarguil
> ; Nelio Laranjeiro
> ; Matej Vido ; Pascal Mazon
> ; Yuanhan Liu ;
> Maxime Coquelin ; Shrikrishna Khare
> 
> Cc: dev@dpdk.org; Yigit, Ferruh 
> Subject: [PATCH v3] drivers/net: document missing speed capabilities
> feature
> 
> Signed-off-by: Ferruh Yigit 

Acked-by: John McNamara 



Re: [dpdk-dev] [PATCH v4] net/ixgbe: ensure link status is updated

2017-05-17 Thread Roger B Melton

Hi Laurent/Wei,

As I continue to integrate DPDK 17.05 into my application, I have hit 
another issue with this patch while testing in a VM with multispeed 
fiber ixgbe PCI passthrough.  My application periodically invokes 
rte_eth_link_get_nowait() to detect link state changes.  If the link is 
down (no cable or far end disabled), ixgbe_setup_link() will not return 
for ~1.3 seconds due to the link setup algorithm in 
ixgbe_common.c:ixgbe_multispeed_fiber():


+   if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&
+   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   speed = hw->phy.autoneg_advertised;
+   if (!speed)
+   ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+   ixgbe_setup_link(hw, speed, true); +}
+

I have two questions:

 * Shouldn't we avoid the link setup cost if the caller has specified
   not to wait_to_complete?
 * If the concern is speed may not be properly configured, shouldn't
   the link setup be deferred until state changes link up thus
   minimizing the delays enforced in ixgbe_multispeed_fiber()?


Regards,
-Roger




On 4/27/17 11:03 AM, Laurent Hardy wrote:

In case of fiber links with the speed set to 1Gb at the peer side (with
autoneg or with a defined speed), the link status may not be properly
updated when the cable is plugged in.
Indeed, if the cable was not plugged in when the device was configured
and started, the link status will not be updated with the new speed,
as no link setup will be triggered.

To avoid this issue, IXGBE_FLAG_NEED_LINK_CONFIG is set to try a link
setup each time link_update() is triggered and current link status is
down. When cable is plugged-in, link setup will be performed via
ixgbe_setup_link().

Signed-off-by: Laurent Hardy 
---
Hi Wei, please find enclosed patch v4, tested using testpmd.

v1 -> v2:
- rebase on top of head (change flag to 1<<4)
- fix regression with copper links: only update link for fiber links

v2 -> v3:
- remove unnescessary check on speed mask if autoneg is false

v3 -> v4:
- remove default speed set to 10Gb if autoneg is false, rely on
ixgbe_get_link_capabilities( ) instead.
---
  drivers/net/ixgbe/ixgbe_ethdev.c | 14 ++
  drivers/net/ixgbe/ixgbe_ethdev.h |  1 +
  2 files changed, 15 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 7b856bb..8a0c0a7 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3782,8 +3782,12 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct rte_eth_link link, old;
ixgbe_link_speed link_speed = IXGBE_LINK_SPEED_UNKNOWN;
+   struct ixgbe_interrupt *intr =
+   IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
int link_up;
int diag;
+   u32 speed = 0;
+   bool autoneg = false;
  
  	link.link_status = ETH_LINK_DOWN;

link.link_speed = 0;
@@ -3793,6 +3797,14 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
  
  	hw->mac.get_link_status = true;
  
+	if ((intr->flags & IXGBE_FLAG_NEED_LINK_CONFIG) &&

+   hw->mac.ops.get_media_type(hw) == ixgbe_media_type_fiber) {
+   speed = hw->phy.autoneg_advertised;
+   if (!speed)
+   ixgbe_get_link_capabilities(hw, &speed, &autoneg);
+   ixgbe_setup_link(hw, speed, true);
+   }
+
/* check if it needs to wait to complete, if lsc interrupt is enabled */
if (wait_to_complete == 0 || dev->data->dev_conf.intr_conf.lsc != 0)
diag = ixgbe_check_link(hw, &link_speed, &link_up, 0);
@@ -3810,10 +3822,12 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
  
  	if (link_up == 0) {

rte_ixgbe_dev_atomic_write_link_status(dev, &link);
+   intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
if (link.link_status == old.link_status)
return -1;
return 0;
}
+   intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
link.link_status = ETH_LINK_UP;
link.link_duplex = ETH_LINK_FULL_DUPLEX;
  
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h

index 5176b02..b576a6f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -45,6 +45,7 @@
  #define IXGBE_FLAG_MAILBOX  (uint32_t)(1 << 1)
  #define IXGBE_FLAG_PHY_INTERRUPT(uint32_t)(1 << 2)
  #define IXGBE_FLAG_MACSEC   (uint32_t)(1 << 3)
+#define IXGBE_FLAG_NEED_LINK_CONFIG (uint32_t)(1 << 4)
  
  /*

   * Defines that were not part of ixgbe_type.h as they are not used by the




[dpdk-dev] [PATCH v6] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Previously, FDIR programming on i40e took more than 11 ms.
This patch results in an average programming time of
22 usec, with a max of 60 usec.

Signed-off-by: Michael Lilja 

---
v6:
* Fixed code style issues

v5:
* Reinitialization of "i" inconsistent with original intent

v4:
* Code style fix

v3:
* Replaced commit message

v2:
*  Code style fix

v1:
* Initial version
---
---
 drivers/net/i40e/i40e_fdir.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 28cc554f5..16cb963ce 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -1295,28 +1295,28 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
/* Update the tx tail register */
rte_wmb();
I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
-
-   for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
-   rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
+   i = 0;
+   for (; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
if ((txdp->cmd_type_offset_bsz &
-   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
-   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
+   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
+   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
break;
+   rte_delay_us(1);
}
-   if (i >= I40E_FDIR_WAIT_COUNT) {
+   if (i >= (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)) {
PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
" time out to get DD on tx queue.");
return -ETIMEDOUT;
}
/* totally delay 10 ms to check programming status*/
-   rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
-   if (i40e_check_fdir_programming_status(rxq) < 0) {
-   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
-   " programming status reported.");
-   return -ENOSYS;
+   for (; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
+   if (i40e_check_fdir_programming_status(rxq) >= 0)
+   return 0;
+   rte_delay_us(1);
}
-
-   return 0;
+   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
+   " programming status reported.");
+   return -ETIMEDOUT;
 }
 
 /*
-- 
2.12.2



Re: [dpdk-dev] [RFC] service core concept header implementation

2017-05-17 Thread Ananyev, Konstantin
Hi Bruce,

> >
> 
> Hi Konstantin,
> 
> Thanks for the feedback, the specific detail of which I'll perhaps leave
> for Harry to reply to or include in a later version of this patchset.
> However, from a higher level, I think the big difference in what we
> envisage compared to what you suggest in your comments is that these are
> not services set up by the application. If the app wants to run
> something it uses remote_launch as now. This is instead for other
> components to request threads - or a share of a thread - for their own
> use, since they cannot call remote_launch directly, as the app owns the
> threads, not the individual libraries.

Ok, thanks for clarification about who supposed to be the main consumer for 
that.
That makes sense.
Though I am not sure why the suggested API wouldn't fit your purposes.
I think both PMD/core libraries and the app layer might be interested in
that: i.e. the app might also have some background processing tasks that
need to be run on several cores (or core slices).
Konstantin

> See also comments made in reply to Thomas mail.
> 
> Hope this helps clarify things a little.
> 
> /Bruce
> 
> > >
> > > This patch adds a service header to DPDK EAL. This header is
> > > an RFC for a mechanism to allow DPDK components request a
> > > callback function to be invoked.
> > >
> > > The application can set the number of service cores, and
> > > a coremask for each particular services. The implementation
> > > of this functionality in rte_services.c (not completed) would
> > > use atomics to ensure that a service can only be polled by a
> > > single lcore at a time.
> > >
> > > Signed-off-by: Harry van Haaren 
> > > ---
> > >  lib/librte_eal/common/include/rte_service.h | 211 
> > > 
> > >  1 file changed, 211 insertions(+)
> > >  create mode 100644 lib/librte_eal/common/include/rte_service.h
> > >
> >
> > ...
> >
> > > +
> > > +/**
> > > + * Signature of callback back function to run a service.
> > > + */
> > > +typedef void (*rte_eal_service_func)(void *args);
> > > +
> > > +struct rte_service_config {
> > > + /* name of the service */
> > > + char name[RTE_SERVICE_NAMESIZE];
> > > + /* cores that run this service */
> > > + uint64_t coremask;
> >
> > Why uint64_t?
> > Even now by default we support more than 64 cores.
> > We have rte_cpuset_t for such things.
> >
> > Ok, the first main question:
> > Is it possible to register multiple services with intersecting coremasks 
> > or not?
> > If not, then I suppose there is no practical difference from what we have 
> > right now
> > with eal_remote_launch() .
> > If yes, then several questions arise:
> >a) How is the service scheduling function going to switch from one 
> > service to another
> >on a particular core?
> >As we don't have an interrupt framework for that, I suppose the only 
> > choice is to rely
> >on the service voluntarily giving up the cpu. Is that so?
> >  b) Would it be possible to specify percentage of core cycles that service 
> > can consume?
> >  I suppose the answer is 'no', at least for current iteration.
> >
> > > + /* when set, multiple lcores can run this service simultaneously without
> > > +  * the need for software atomics to ensure that two cores do not
> > > +  * attempt to run the service at the same time.
> > > +  */
> > > + uint8_t multithread_capable;
> >
> > Ok, and what would happen if this flag is not set, and the user specified more
> > than one cpu in the coremask?
> >
> > > +};
> > > +
> > > +/** @internal - this will not be visible to application, defined in a 
> > > separate
> > > + * rte_eal_service_impl.h header.
> >
> > Why not?
> > If the application  registers the service, it for sure needs to know what 
> > exactly that service
> > would do (cb func and data).
> >
> > > The application does not need to be exposed
> > > + * to the actual function pointers - so hide them. */
> > > +struct rte_service {
> > > + /* callback to be called to run the service */
> > > + rte_eal_service_func cb;
> > > + /* args to the callback function */
> > > + void *cb_args;
> >
> > If user plans to run that service on multiple cores, then he might
> > need a separate instance of cb_args for each core.
> >
> > > + /* configuration of the service */
> > > + struct rte_service_config config;
> > > +};
> > > +
> > > +/**
> > > + * @internal - Only DPDK internal components request "service" cores.
> > > + *
> > > + * Registers a service in EAL.
> > > + *
> > > + * Registered services' configurations are exposed to an application 
> > > using
> > > + * *rte_eal_service_get_all*. These services have names which can be 
> > > understood
> > > + * by the application, and the application logic can decide how to 
> > > allocate
> > > + * cores to run the various services.
> > > + *
> > > + * This function is expected to be called by a DPDK component to 
> > > indicate that
> > > + * it requires a CPU to run a specific function in order for it to 
> > > perfor

[dpdk-dev] [PATCH] examples/helloworld: add output of core id and socket id

2017-05-17 Thread Wei Dai
Adding output of the core id and socket id of each lcore/pthread
can help to understand their relationship.
This can also help to examine the usage of the EAL lcore
settings like -c, -l and --lcore.

Signed-off-by: Wei Dai 
---
 examples/helloworld/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
index 8b7a2de..fdd8818 100644
--- a/examples/helloworld/main.c
+++ b/examples/helloworld/main.c
@@ -50,7 +50,9 @@ lcore_hello(__attribute__((unused)) void *arg)
 {
unsigned lcore_id;
lcore_id = rte_lcore_id();
-   printf("hello from core %u\n", lcore_id);
+   printf("hello from core %2u at core_id = %2u on socket_id = %2u\n",
+   lcore_id, lcore_config[lcore_id].core_id,
+   lcore_config[lcore_id].socket_id);
return 0;
 }
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH v6] net/i40e: improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 2:45 PM, Michael Lilja wrote:
> Previously, FDIR programming on i40e took more than 11 ms.
> This patch results in an average programming time of
> 22 usec, with a max of 60 usec.
> 
> Signed-off-by: Michael Lilja 

Please cc maintainers in the patch.

> 
> ---
> v6:
> * Fixed code style issues
> 
> v5:
> * Reinitialization of "i" inconsistent with original intent
> 
> v4:
> * Code style fix
> 
> v3:
> * Replaced commit message
> 
> v2:
> *  Code style fix
> 
> v1:
> * Initial version
> ---
> ---
>  drivers/net/i40e/i40e_fdir.c | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
> index 28cc554f5..16cb963ce 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -1295,28 +1295,28 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>   /* Update the tx tail register */
>   rte_wmb();
>   I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
> -
> - for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
> - rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
> + i = 0;

This is extracted out of the "for" to stay within the 80-column limit, but
what do you think instead:

Create a variable, something like "wait_us_count":
wait_us_count = I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US;

and used it below three times, and lines will stay in limit.

> + for (; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
>   if ((txdp->cmd_type_offset_bsz &
> - rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
> - rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
> + rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
> + rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))

The old indentation was correct, I think; it differentiates the code
in the line below more easily.

>   break;
> + rte_delay_us(1);
>   }
> - if (i >= I40E_FDIR_WAIT_COUNT) {
> + if (i >= (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)) {
>   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>   " time out to get DD on tx queue.");
>   return -ETIMEDOUT;
>   }
>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US); i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported.");
> + return -ETIMEDOUT;
>  }
>  
>  /*
> 



Re: [dpdk-dev] [PATCH] examples/helloworld: add output of core id and socket id

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 2:49 PM, Wei Dai wrote:
> Adding output of core id and socket id of each lcore/pthread
> can help to understand their relationship.
> And this can also help to examine the usage of the EAL lcore
> settings like -c, -l and --lcore .
> 
> Signed-off-by: Wei Dai 
> ---
>  examples/helloworld/main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
> index 8b7a2de..fdd8818 100644
> --- a/examples/helloworld/main.c
> +++ b/examples/helloworld/main.c
> @@ -50,7 +50,9 @@ lcore_hello(__attribute__((unused)) void *arg)
>  {
>   unsigned lcore_id;
>   lcore_id = rte_lcore_id();
> - printf("hello from core %u\n", lcore_id);
> + printf("hello from core %2u at core_id = %2u on socket_id = %2u\n",

It is hard to understand the difference between "core" and "core_id"; what do
you think about using "lcore" and "core" respectively in the message?

> + lcore_id, lcore_config[lcore_id].core_id,
> + lcore_config[lcore_id].socket_id);
>   return 0;
>  }
>  
> 



Re: [dpdk-dev] [PATCH] examples/helloworld: add output of core id and socket id

2017-05-17 Thread Dai, Wei
> -Original Message-
> From: Yigit, Ferruh
> Sent: Wednesday, May 17, 2017 10:16 PM
> To: Dai, Wei ; Richardson, Bruce
> ; Mcnamara, John 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] examples/helloworld: add output of core id and
> socket id
> 
> On 5/17/2017 2:49 PM, Wei Dai wrote:
> > Adding output of core id and socket id of each lcore/pthread can help
> > to understand their relationship.
> > And this can also help to examine the usage of the EAL lcore settings
> > like -c, -l and --lcore .
> >
> > Signed-off-by: Wei Dai 
> > ---
> >  examples/helloworld/main.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
> > index 8b7a2de..fdd8818 100644
> > --- a/examples/helloworld/main.c
> > +++ b/examples/helloworld/main.c
> > @@ -50,7 +50,9 @@ lcore_hello(__attribute__((unused)) void *arg)  {
> > unsigned lcore_id;
> > lcore_id = rte_lcore_id();
> > -   printf("hello from core %u\n", lcore_id);
> > +   printf("hello from core %2u at core_id = %2u on socket_id = %2u\n",
> 
> It is hard to understand the difference between "core" and "core_id"; what do you
> think about using "lcore" and "core" respectively in the message?
Yes, it is still a bit confusing.
I should change it to printf("hello from lcore_id = %2u at core_id = %2u on 
socket_id = %2u\n",

> 
> > +   lcore_id, lcore_config[lcore_id].core_id,
> > +   lcore_config[lcore_id].socket_id);
> > return 0;
> >  }
> >
> >



Re: [dpdk-dev] [PATCH] examples/helloworld: add output of core id and socket id

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:29 PM, Dai, Wei wrote:
>> -Original Message-
>> From: Yigit, Ferruh
>> Sent: Wednesday, May 17, 2017 10:16 PM
>> To: Dai, Wei ; Richardson, Bruce
>> ; Mcnamara, John 
>> Cc: dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] examples/helloworld: add output of core id 
>> and
>> socket id
>>
>> On 5/17/2017 2:49 PM, Wei Dai wrote:
>>> Adding output of core id and socket id of each lcore/pthread can help
>>> to understand their relationship.
>>> And this can also help to examine the usage of the EAL lcore settings
>>> like -c, -l and --lcore .
>>>
>>> Signed-off-by: Wei Dai 
>>> ---
>>>  examples/helloworld/main.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/examples/helloworld/main.c b/examples/helloworld/main.c
>>> index 8b7a2de..fdd8818 100644
>>> --- a/examples/helloworld/main.c
>>> +++ b/examples/helloworld/main.c
>>> @@ -50,7 +50,9 @@ lcore_hello(__attribute__((unused)) void *arg)  {
>>> unsigned lcore_id;
>>> lcore_id = rte_lcore_id();
>>> -   printf("hello from core %u\n", lcore_id);
>>> +   printf("hello from core %2u at core_id = %2u on socket_id = %2u\n",
>>
>> It is hard to understand the difference between "core" and "core_id"; what do you
>> think about using "lcore" and "core" respectively in the message?
> Yes, it is still a bit confusing.
> I should change it to printf("hello from lcore_id = %2u at core_id = %2u on 
> socket_id = %2u\n",

+1

> 
>>
>>> +   lcore_id, lcore_config[lcore_id].core_id,
>>> +   lcore_config[lcore_id].socket_id);
>>> return 0;
>>>  }
>>>
>>>
> 



[dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Previously, FDIR programming on i40e took more than 11 ms.
This patch results in an average programming time of
22 usec, with a max of 60 usec.

Signed-off-by: Michael Lilja 

---
v7:
* Code style changes

v6:
* Fixed code style issues

v5:
* Reinitialization of "i" inconsistent with original intent

v4:
* Code style fix

v3:
* Replaced commit message

v2:
*  Code style fix

v1:
* Initial version
---
---
 drivers/net/i40e/i40e_fdir.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 28cc554f5..1192d5831 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -76,6 +76,7 @@
 /* Wait count and interval for fdir filter programming */
 #define I40E_FDIR_WAIT_COUNT   10
 #define I40E_FDIR_WAIT_INTERVAL_US 1000
+#define I40E_FDIR_MAX_WAIT (I40E_FDIR_WAIT_COUNT * I40E_FDIR_WAIT_INTERVAL_US)
 
 /* Wait count and interval for fdir filter flush */
 #define I40E_FDIR_FLUSH_RETRY   50
@@ -1295,28 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
/* Update the tx tail register */
rte_wmb();
I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
-
-   for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
-   rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
+   for (i = 0; i < I40E_FDIR_MAX_WAIT; i++) {
if ((txdp->cmd_type_offset_bsz &
rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
break;
+   rte_delay_us(1);
}
-   if (i >= I40E_FDIR_WAIT_COUNT) {
+   if (i >= I40E_FDIR_MAX_WAIT) {
PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
" time out to get DD on tx queue.");
return -ETIMEDOUT;
}
/* totally delay 10 ms to check programming status*/
-   rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
-   if (i40e_check_fdir_programming_status(rxq) < 0) {
-   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
-   " programming status reported.");
-   return -ENOSYS;
+   for (; i < I40E_FDIR_MAX_WAIT; i++) {
+   if (i40e_check_fdir_programming_status(rxq) >= 0)
+   return 0;
+   rte_delay_us(1);
}
-
-   return 0;
+   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
+   " programming status reported.");
+   return -ETIMEDOUT;
 }
 
 /*
-- 
2.12.2



Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:31 PM, Michael Lilja wrote:
> Previously, FDIR programming on i40e took more than 11 ms.
> This patch results in an average programming time of
> 22 usec, with a max of 60 usec.
> 
> Signed-off-by: Michael Lilja 

Sorry for multiple, minor change requests ...

> 
> ---
> v7:
> * Code style changes
> 
> v6:
> * Fixed code style issues
> 
> v5:
> * Reinitialization of "i" inconsistent with original intent
> 
> v4:
> * Code style fix
> 
> v3:
> * Replaced commit message
> 
> v2:
> *  Code style fix
> 
> v1:
> * Initial version
> ---
> ---
>  drivers/net/i40e/i40e_fdir.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
> index 28cc554f5..1192d5831 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -76,6 +76,7 @@
>  /* Wait count and interval for fdir filter programming */
>  #define I40E_FDIR_WAIT_COUNT   10
>  #define I40E_FDIR_WAIT_INTERVAL_US 1000
> +#define I40E_FDIR_MAX_WAIT (I40E_FDIR_WAIT_COUNT * 
> I40E_FDIR_WAIT_INTERVAL_US)

It looks like I40E_FDIR_WAIT_COUNT and I40E_FDIR_WAIT_INTERVAL_US not
used anywhere else, is there any value to keep them?

why not:
#define I40E_FDIR_MAX_WAIT_US 10000 /* 10 ms */

>  
>  /* Wait count and interval for fdir filter flush */
>  #define I40E_FDIR_FLUSH_RETRY   50
> @@ -1295,28 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>   /* Update the tx tail register */
>   rte_wmb();
>   I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
> -
> - for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
> - rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
> + for (i = 0; i < I40E_FDIR_MAX_WAIT; i++) {
>   if ((txdp->cmd_type_offset_bsz &
>   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
>   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>   break;
> + rte_delay_us(1);
>   }
> - if (i >= I40E_FDIR_WAIT_COUNT) {
> + if (i >= I40E_FDIR_MAX_WAIT) {
>   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>   " time out to get DD on tx queue.");
>   return -ETIMEDOUT;
>   }
>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < I40E_FDIR_MAX_WAIT; i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported.");
> + return -ETIMEDOUT;
>  }
>  
>  /*
> 



Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
It's ok. I didn't write the original code, so I cannot tell why the two defines 
were made in the first place. It makes sense to remove them, but the 
maintainers must have had a reason; maybe they are needed in a future version 
of the code?

/Michael

-Original Message-
From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] 
Sent: 17 May 2017 16:44
To: Michael Lilja ; helin.zh...@intel.com; 
jingjing...@intel.com
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

On 5/17/2017 3:31 PM, Michael Lilja wrote:
> Previously, FDIR programming on i40e took more than 11 ms.
> This patch results in an average programming time of 22 usec, with a 
> max of 60 usec.
> 
> Signed-off-by: Michael Lilja 

Sorry for multiple, minor change requests ...

> 
> ---
> v7:
> * Code style changes
> 
> v6:
> * Fixed code style issues
> 
> v5:
> * Reinitialization of "i" inconsistent with original intent
> 
> v4:
> * Code style fix
> 
> v3:
> * Replaced commit message
> 
> v2:
> *  Code style fix
> 
> v1:
> * Initial version
> ---
> ---
>  drivers/net/i40e/i40e_fdir.c | 22 +++---
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_fdir.c 
> b/drivers/net/i40e/i40e_fdir.c index 28cc554f5..1192d5831 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -76,6 +76,7 @@
>  /* Wait count and interval for fdir filter programming */
>  #define I40E_FDIR_WAIT_COUNT   10
>  #define I40E_FDIR_WAIT_INTERVAL_US 1000
> +#define I40E_FDIR_MAX_WAIT (I40E_FDIR_WAIT_COUNT * 
> +I40E_FDIR_WAIT_INTERVAL_US)

It looks like I40E_FDIR_WAIT_COUNT and I40E_FDIR_WAIT_INTERVAL_US not used 
anywhere else, is there any value to keep them?

why not:
#define I40E_FDIR_MAX_WAIT_US 10000 /* 10 ms */

>  
>  /* Wait count and interval for fdir filter flush */
>  #define I40E_FDIR_FLUSH_RETRY   50
> @@ -1295,28 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>   /* Update the tx tail register */
>   rte_wmb();
>   I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
> -
> - for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
> - rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
> + for (i = 0; i < I40E_FDIR_MAX_WAIT; i++) {
>   if ((txdp->cmd_type_offset_bsz &
>   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
>   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>   break;
> + rte_delay_us(1);
>   }
> - if (i >= I40E_FDIR_WAIT_COUNT) {
> + if (i >= I40E_FDIR_MAX_WAIT) {
>   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>   " time out to get DD on tx queue.");
>   return -ETIMEDOUT;
>   }
>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < I40E_FDIR_MAX_WAIT; i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported.");
> + return -ETIMEDOUT;
>  }
>  
>  /*
> 



Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:46 PM, Michael Lilja wrote:
> It's ok. I didn't write the original code, so I cannot tell why the two 
> defines were made in the first place. It makes sense to remove them, but the 
> maintainers must have had a reason; maybe they are needed in a future version 
> of the code?

In original code, they have a meaning:
for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++)
rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);

wait step is I40E_FDIR_WAIT_INTERVAL_US.

But you changed it to a fixed 1us stepping, so WAIT_COUNT and
WAIT_INTERVAL_US are no longer meaningful. And since they are not used
anywhere else, I think they can go away.

And we can wait for the maintainers' ack on any "plan to use in the future"
case.

Thanks,
ferruh

> 
> /Michael
> 
> -Original Message-
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: 17 May 2017 16:44
> To: Michael Lilja ; helin.zh...@intel.com; 
> jingjing...@intel.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times
> 
> On 5/17/2017 3:31 PM, Michael Lilja wrote:
>> Previously, the FDIR programming time is +11ms on i40e.
>> This patch will result in an average programming time of 22usec with a
>> max of 60usec .
>>
>> Signed-off-by: Michael Lilja 
> 
> Sorry for multiple, minor change requests ...
> 
>>
>> ---
>> v7:
>> * Code style changes
>>
>> v6:
>> * Fixed code style issues
>>
>> v5:
>> * Reinitialization of "i" inconsistent with original intent
>>
>> v4:
>> * Code style fix
>>
>> v3:
>> * Replaced commit message
>>
>> v2:
>> *  Code style fix
>>
>> v1:
>> * Initial version
>> ---
>> ---
>>  drivers/net/i40e/i40e_fdir.c | 22 +++---
>>  1 file changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/i40e/i40e_fdir.c
>> b/drivers/net/i40e/i40e_fdir.c index 28cc554f5..1192d5831 100644
>> --- a/drivers/net/i40e/i40e_fdir.c
>> +++ b/drivers/net/i40e/i40e_fdir.c
>> @@ -76,6 +76,7 @@
>>  /* Wait count and interval for fdir filter programming */
>>  #define I40E_FDIR_WAIT_COUNT   10
>>  #define I40E_FDIR_WAIT_INTERVAL_US 1000
>> +#define I40E_FDIR_MAX_WAIT (I40E_FDIR_WAIT_COUNT *
>> +I40E_FDIR_WAIT_INTERVAL_US)
> 
> It looks like I40E_FDIR_WAIT_COUNT and I40E_FDIR_WAIT_INTERVAL_US not used 
> anywhere else, is there any value to keep them?
> 
> why not:
> #define I40E_FDIR_MAX_WAIT_US 10000 /* 10 ms */
> 
>>
>>  /* Wait count and interval for fdir filter flush */
>>  #define I40E_FDIR_FLUSH_RETRY   50
>> @@ -1295,28 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>>  /* Update the tx tail register */
>>  rte_wmb();
>>  I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
>> -
>> -for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
>> -rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
>> +for (i = 0; i < I40E_FDIR_MAX_WAIT; i++) {
>>  if ((txdp->cmd_type_offset_bsz &
>>  rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
>>  rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>>  break;
>> +rte_delay_us(1);
>>  }
>> -if (i >= I40E_FDIR_WAIT_COUNT) {
>> +if (i >= I40E_FDIR_MAX_WAIT) {
>>  PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>>  " time out to get DD on tx queue.");
>>  return -ETIMEDOUT;
>>  }
>>  /* totally delay 10 ms to check programming status*/
>> -rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
>> -if (i40e_check_fdir_programming_status(rxq) < 0) {
>> -PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>> -" programming status reported.");
>> -return -ENOSYS;
>> +for (; i < I40E_FDIR_MAX_WAIT; i++) {
>> +if (i40e_check_fdir_programming_status(rxq) >= 0)
>> +return 0;
>> +rte_delay_us(1);
>>  }
>> -
>> -return 0;
>> +PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>> +" programming status reported.");
>> +return -ETIMEDOUT;
>>  }
>>
>>  /*
>>
> 
> Disclaimer: This email and any files transmitted with it may contain 
> confidential information intended for the addressee(s) only. The information 
> is not to be surrendered or copied to unauthorized persons. If you have 
> received this communication in error, please notify the sender immediately 
> and delete this e-mail from your system.
> 



[dpdk-dev] Discuss plugin threading model for DPDK.

2017-05-17 Thread Wiles, Keith

> On May 17, 2017, at 4:28 AM, Thomas Monjalon  wrote:
> 
> OK to register CPU needs for services (including interrupts processing).
> 
> Then we could take this opportunity to review how threads are managed.
> We will have three types of cores:
> - not used
> - reserved for services
> - used for polling / application processing
> It is fine to reserve/define CPU from DPDK point of view.
> 
> Then DPDK launch threads on cores. Maybe we should allow the application
> to choose how threads are launched and managed.
> Keith was talking about a plugin approach for thread management I think.
Thomas,
So, to avoid hijacking this thread (or maybe I misunderstood your comment), I
changed the subject.

Maybe we can look at the plugin model for a DPDK threading model to allow 
someone to use their own threading solution.

Is this required or just another enhancement?

Regards,
Keith



Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Ok, I'll make a v8 removing the define.

/Michael

-Original Message-
From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] 
Sent: 17 May 2017 16:50
To: Michael Lilja ; helin.zh...@intel.com; 
jingjing...@intel.com
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming times

On 5/17/2017 3:46 PM, Michael Lilja wrote:
> It's ok. I didn't write the original code so I cannot tell why the two 
> defines were made in the initial case. It makes sense to remove them, but the 
> maintainers must have had a reason; maybe they are needed in a future version 
> of the code?

In original code, they have a meaning:
for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++)
rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);

wait step is I40E_FDIR_WAIT_INTERVAL_US.

But you changed it to a fixed 1us stepping, so WAIT_COUNT and WAIT_INTERVAL_US are
no longer meaningful. And since they are not used anywhere else, I think they can
go away.

And we can wait for the maintainers' ack on any "plan to use in the future"
case.

Thanks,
ferruh

> 
> /Michael
> 
> -Original Message-
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: 17 May 2017 16:44
> To: Michael Lilja ; helin.zh...@intel.com; 
> jingjing...@intel.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v7] net/i40e: improved FDIR programming 
> times
> 
> On 5/17/2017 3:31 PM, Michael Lilja wrote:
>> Previously, the FDIR programming time is +11ms on i40e.
>> This patch will result in an average programming time of 22usec with 
>> a max of 60usec .
>>
>> Signed-off-by: Michael Lilja 
> 
> Sorry for multiple, minor change requests ...
> 
>>
>> ---
>> v7:
>> * Code style changes
>>
>> v6:
>> * Fixed code style issues
>>
>> v5:
>> * Reinitialization of "i" inconsistent with original intent
>>
>> v4:
>> * Code style fix
>>
>> v3:
>> * Replaced commit message
>>
>> v2:
>> *  Code style fix
>>
>> v1:
>> * Initial version
>> ---
>> ---
>>  drivers/net/i40e/i40e_fdir.c | 22 +++---
>>  1 file changed, 11 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/i40e/i40e_fdir.c 
>> b/drivers/net/i40e/i40e_fdir.c index 28cc554f5..1192d5831 100644
>> --- a/drivers/net/i40e/i40e_fdir.c
>> +++ b/drivers/net/i40e/i40e_fdir.c
>> @@ -76,6 +76,7 @@
>>  /* Wait count and interval for fdir filter programming */
>>  #define I40E_FDIR_WAIT_COUNT   10
>>  #define I40E_FDIR_WAIT_INTERVAL_US 1000
>> +#define I40E_FDIR_MAX_WAIT (I40E_FDIR_WAIT_COUNT *
>> +I40E_FDIR_WAIT_INTERVAL_US)
> 
> It looks like I40E_FDIR_WAIT_COUNT and I40E_FDIR_WAIT_INTERVAL_US not used 
> anywhere else, is there any value to keep them?
> 
> why not:
> #define I40E_FDIR_MAX_WAIT_US 10000 /* 10 ms */
> 
>>
>>  /* Wait count and interval for fdir filter flush */
>>  #define I40E_FDIR_FLUSH_RETRY   50
>> @@ -1295,28 +1296,27 @@ i40e_fdir_filter_programming(struct i40e_pf 
>> *pf,
>>  /* Update the tx tail register */
>>  rte_wmb();
>>  I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
>> -
>> -for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) { 
>> -rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
>> +for (i = 0; i < I40E_FDIR_MAX_WAIT; i++) {
>>  if ((txdp->cmd_type_offset_bsz &
>>  rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
>>  rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>>  break;
>> +rte_delay_us(1);
>>  }
>> -if (i >= I40E_FDIR_WAIT_COUNT) {
>> +if (i >= I40E_FDIR_MAX_WAIT) {
>>  PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>>  " time out to get DD on tx queue.");  return -ETIMEDOUT;  }
>>  /* totally delay 10 ms to check programming status*/ 
>> -rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * 
>> I40E_FDIR_WAIT_INTERVAL_US); -if 
>> (i40e_check_fdir_programming_status(rxq) < 0) { -PMD_DRV_LOG(ERR, "Failed to 
>> program FDIR filter:"
>> -" programming status reported.");
>> -return -ENOSYS;
>> +for (; i < I40E_FDIR_MAX_WAIT; i++) { if 
>> +(i40e_check_fdir_programming_status(rxq) >= 0) return 0; 
>> +rte_delay_us(1);
>>  }
>> -
>> -return 0;
>> +PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>> +" programming status reported.");
>> +return -ETIMEDOUT;
>>  }
>>
>>  /*
>>
> 
> 



Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Ananyev, Konstantin
Hi Ferruh,
Please see my comments/questions below.
Thanks
Konstantin

> +
> +/**
> + * @file
> + *
> + * RTE Flow Classify Library
> + *
> + * This library provides flow record information with some measured 
> properties.
> + *
> + * Application can select variety of flow types based on various flow keys.
> + *
> + * Library only maintains flow records between rte_flow_classify_stats_get()
> + * calls and with a maximum limit.
> + *
> + * Provided flow record will be linked list rte_flow_classify_stat_xxx
> + * structure.
> + *
> + * Library is responsible from allocating and freeing memory for flow record
> + * table. Previous table freed with next rte_flow_classify_stats_get() call 
> and
> + * all tables are freed with rte_flow_classify_type_reset() or
> + * rte_flow_classify_type_set(x, 0). Memory for table allocated on the fly 
> while
> + * creating records.
> + *
> + * A rte_flow_classify_type_set() with a valid type will register Rx/Tx
> + * callbacks and start filling flow record table.
> + * With rte_flow_classify_stats_get(), pointer sent to caller and meanwhile
> + * library continues collecting records.
> + *
> + *  Usage:
> + *  - application calls rte_flow_classify_type_set() for a device
> + *  - library creates Rx/Tx callbacks for packets and start filling flow 
> table

Is it necessary to use an RX callback here?
Can the library provide an API like collect(port_id, input_mbuf[], pkt_num) instead?
So the user would have a choice either to set up a callback or to call collect()
directly.

> + *for that type of flow (currently only one flow type supported)
> + *  - application calls rte_flow_classify_stats_get() to get pointer to 
> linked
> + *listed flow table. Library assigns this pointer to another value and 
> keeps
> + *collecting flow data. In next rte_flow_classify_stats_get(), library 
> first
> + *free the previous table, and pass current table to the application, 
> keep
> + *collecting data.

Ok, but that means that you can't use stats_get() for the same type
from 2 different threads without explicit synchronization?

> + *  - application calls rte_flow_classify_type_reset(), library unregisters 
> the
> + *callbacks and free all flow table data.
> + *
> + */
> +
> +enum rte_flow_classify_type {
> + RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
> + RTE_FLOW_CLASSIFY_TYPE_MAX,
> +};
> +
> +#define RTE_FLOW_CLASSIFY_TYPE_MASK = (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) << 
> 1) - 1)
> +
> +/**
> + * Global configuration struct
> + */
> +struct rte_flow_classify_config {
> + uint32_t type; /* bitwise enum rte_flow_classify_type values */
> + void *flow_table_prev;
> + uint32_t flow_table_prev_item_count;
> + void *flow_table_current;
> + uint32_t flow_table_current_item_count;
> +} rte_flow_classify_config[RTE_MAX_ETHPORTS];
> +
> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX
> +
> +/**
> + * Classification stats data struct
> + */
> +struct rte_flow_classify_stat_generic {
> + struct rte_flow_classify_stat_generic *next;
> + uint32_t id;
> + uint64_t timestamp;
> +
> + struct ether_addr src_mac;
> + struct ether_addr dst_mac;
> + uint32_t src_ipv4;
> + uint32_t dst_ipv4;
> + uint8_t l3_protocol_id;
> + uint16_t src_port;
> + uint16_t dst_port;
> +
> + uint64_t packet_count;
> + uint64_t packet_size; /* bytes */
> +};

Ok, so if I understood things right, for generic type it will always classify 
all incoming packets by:
 
all by absolute values, and represent results as a linked list.
Is that correct, or did I misunderstand your intentions here?
If so, then I see several disadvantages here:
1) It is really hard to predict what kind of stats is required for that 
particular cases.
 Let say some people would like to collect stat by  ,
another by , third ones by  and so on.
Having just one hardcoded filter doesn't seem very flexible/usable.
I think you need to find a way to allow the user to define what type of filter
they want to apply.
I think it was discussed already, but I still wonder why rte_flow_item can't be 
used for that approach?
2) Even one 10G port can produce ~14M rte_flow_classify_stat_generic
entries in one second
(all packets have different ipv4/ports or so).
Accessing/retrieving items over a linked list with 14M entries doesn't sound
like a good idea.
I'd say we need some better way to retrieve/present collected data.



[dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Previously, the FDIR programming time was +11 ms on i40e.
This patch results in an average programming time of
22 usec with a max of 60 usec.

Signed-off-by: Michael Lilja 

---
v8:
* Merged two defines into one handling max wait time

v7:
* Code style changes

v6:
* Fixed code style issues

v5:
* Reinitialization of "i" inconsistent with original intent

v4:
* Code style fix

v3:
* Replaced commit message

v2:
*  Code style fix

v1:
* Initial version
---
---
 drivers/net/i40e/i40e_fdir.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 28cc554f5..f94e1c3b8 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -73,9 +73,8 @@
 #define I40E_FDIR_IPv6_PAYLOAD_LEN  380
 #define I40E_FDIR_UDP_DEFAULT_LEN   400
 
-/* Wait count and interval for fdir filter programming */
-#define I40E_FDIR_WAIT_COUNT   10
-#define I40E_FDIR_WAIT_INTERVAL_US 1000
+/* Wait time for fdir filter programming */
+#define I40E_FDIR_MAX_WAIT_US 10000
 
 /* Wait count and interval for fdir filter flush */
 #define I40E_FDIR_FLUSH_RETRY   50
@@ -1295,28 +1294,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
/* Update the tx tail register */
rte_wmb();
I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
-
-   for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
-   rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
+   for (i = 0; i < I40E_FDIR_MAX_WAIT_US; i++) {
if ((txdp->cmd_type_offset_bsz &
rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
break;
+   rte_delay_us(1);
}
-   if (i >= I40E_FDIR_WAIT_COUNT) {
+   if (i >= I40E_FDIR_MAX_WAIT_US) {
PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
" time out to get DD on tx queue.");
return -ETIMEDOUT;
}
/* totally delay 10 ms to check programming status*/
-   rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
-   if (i40e_check_fdir_programming_status(rxq) < 0) {
-   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
-   " programming status reported.");
-   return -ENOSYS;
+   for (; i < I40E_FDIR_MAX_WAIT_US; i++) {
+   if (i40e_check_fdir_programming_status(rxq) >= 0)
+   return 0;
+   rte_delay_us(1);
}
-
-   return 0;
+   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
+   " programming status reported.");
+   return -ETIMEDOUT;
 }
 
 /*
-- 
2.12.2



Re: [dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:57 PM, Michael Lilja wrote:
> Previously, the FDIR programming time is +11ms on i40e.
> This patch will result in an average programming time of
> 22usec with a max of 60usec .
> 
> Signed-off-by: Michael Lilja 

<...>

>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < I40E_FDIR_MAX_WAIT_US; i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported."
I am aware that you just moved this log, but since you have touched
it, can you please fix it too [1]:

PMD_DRV_LOG(ERR,
"Failed to program FDIR filter: programming status reported.");

[1] Or if you prefer please let me know, so I can fix it while applying.

> + return -ETIMEDOUT;
>  }
>  
>  /*
> 



Re: [dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times

2017-05-17 Thread Michael Lilja
Please fix while applying.

-Original Message-
From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] 
Sent: 17 May 2017 17:17
To: Michael Lilja ; helin.zh...@intel.com; 
jingjing...@intel.com
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times

On 5/17/2017 3:57 PM, Michael Lilja wrote:
> Previously, the FDIR programming time is +11ms on i40e.
> This patch will result in an average programming time of 22usec with a 
> max of 60usec .
> 
> Signed-off-by: Michael Lilja 

<...>

>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) * I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < I40E_FDIR_MAX_WAIT_US; i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported."
I am aware that you just moved this log, but since you have touched it, can
you please fix it too [1]:

PMD_DRV_LOG(ERR,
"Failed to program FDIR filter: programming status reported.");

[1] Or if you prefer please let me know, so I can fix it while applying.

> + return -ETIMEDOUT;
>  }
>  
>  /*
> 



Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:54 PM, Ananyev, Konstantin wrote:
> Hi Ferruh,
> Please see my comments/questions below.
> Thanks
> Konstantin
> 
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE Flow Classify Library
>> + *
>> + * This library provides flow record information with some measured 
>> properties.
>> + *
>> + * Application can select variety of flow types based on various flow keys.
>> + *
>> + * Library only maintains flow records between rte_flow_classify_stats_get()
>> + * calls and with a maximum limit.
>> + *
>> + * Provided flow record will be linked list rte_flow_classify_stat_xxx
>> + * structure.
>> + *
>> + * Library is responsible from allocating and freeing memory for flow record
>> + * table. Previous table freed with next rte_flow_classify_stats_get() call 
>> and
>> + * all tables are freed with rte_flow_classify_type_reset() or
>> + * rte_flow_classify_type_set(x, 0). Memory for table allocated on the fly 
>> while
>> + * creating records.
>> + *
>> + * A rte_flow_classify_type_set() with a valid type will register Rx/Tx
>> + * callbacks and start filling flow record table.
>> + * With rte_flow_classify_stats_get(), pointer sent to caller and meanwhile
>> + * library continues collecting records.
>> + *
>> + *  Usage:
>> + *  - application calls rte_flow_classify_type_set() for a device
>> + *  - library creates Rx/Tx callbacks for packets and start filling flow 
>> table
> 
> Does it necessary to use an  RX callback here?
> Can library provide an API like collect(port_id, input_mbuf[], pkt_num) 
> instead?
> So the user would have a choice either setup a callback or call collect() 
> directly. 

This was also comment from Morten, I will update RFC to use direct API call.

> 
>> + *for that type of flow (currently only one flow type supported)
>> + *  - application calls rte_flow_classify_stats_get() to get pointer to 
>> linked
>> + *listed flow table. Library assigns this pointer to another value and 
>> keeps
>> + *collecting flow data. In next rte_flow_classify_stats_get(), library 
>> first
>> + *free the previous table, and pass current table to the application, 
>> keep
>> + *collecting data.
> 
> Ok, but that means that you can't use stats_get() for the same type
> from 2 different threads without explicit synchronization?

Correct.
And multiple threads shouldn't be calling this API. It doesn't store
previous flow data, so multiple threads calling it can each get only a piece
of the information. Do you see any use case where multiple threads would call
this API?

> 
>> + *  - application calls rte_flow_classify_type_reset(), library unregisters 
>> the
>> + *callbacks and free all flow table data.
>> + *
>> + */
>> +
>> +enum rte_flow_classify_type {
>> +RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
>> +RTE_FLOW_CLASSIFY_TYPE_MAX,
>> +};
>> +
>> +#define RTE_FLOW_CLASSIFY_TYPE_MASK = (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) << 
>> 1) - 1)
>> +
>> +/**
>> + * Global configuration struct
>> + */
>> +struct rte_flow_classify_config {
>> +uint32_t type; /* bitwise enum rte_flow_classify_type values */
>> +void *flow_table_prev;
>> +uint32_t flow_table_prev_item_count;
>> +void *flow_table_current;
>> +uint32_t flow_table_current_item_count;
>> +} rte_flow_classify_config[RTE_MAX_ETHPORTS];
>> +
>> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX
>> +
>> +/**
>> + * Classification stats data struct
>> + */
>> +struct rte_flow_classify_stat_generic {
>> +struct rte_flow_classify_stat_generic *next;
>> +uint32_t id;
>> +uint64_t timestamp;
>> +
>> +struct ether_addr src_mac;
>> +struct ether_addr dst_mac;
>> +uint32_t src_ipv4;
>> +uint32_t dst_ipv4;
>> +uint8_t l3_protocol_id;
>> +uint16_t src_port;
>> +uint16_t dst_port;
>> +
>> +uint64_t packet_count;
>> +uint64_t packet_size; /* bytes */
>> +};
> 
> Ok, so if I understood things right, for generic type it will always classify 
> all incoming packets by:
>  
> all by absolute values, and represent results as a linked list.
> Is that correct, or I misunderstood your intentions here?

Correct.

> If so, then I see several disadvantages here:
> 1) It is really hard to predict what kind of stats is required for that 
> particular cases.
>  Let say some people would like to collect stat by  ,
> another by , third ones by  and so on.
> Having just one hardcoded filter doesn't seem very flexible/usable.
> I think you need to find a way to allow user to define what type of filter 
> they want to apply.

The flow type should be provided by applications according to their needs,
and needs to be implemented in this library. The generic one will be the
only one implemented in the first version:
enum rte_flow_classify_type {
RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
RTE_FLOW_CLASSIFY_TYPE_MAX,
};


App should set the type first via the API:
rte_flow_classify_type_set(uint8_t port_id, uint32_t type);


And the stats for this type will be returned, because returned type ca

Re: [dpdk-dev] Discuss plugin threading model for DPDK.

2017-05-17 Thread Thomas Monjalon
17/05/2017 16:51, Wiles, Keith:
> 
> > On May 17, 2017, at 4:28 AM, Thomas Monjalon  wrote:
> > 
> > OK to register CPU needs for services (including interrupts processing).
> > 
> > Then we could take this opportunity to review how threads are managed.
> > We will have three types of cores:
> > - not used
> > - reserved for services
> > - used for polling / application processing
> > It is fine to reserve/define CPU from DPDK point of view.
> > 
> > Then DPDK launch threads on cores. Maybe we should allow the application
> > to choose how threads are launched and managed.
> > Keith was talking about a plugin approach for thread management I think.
> Thomas,
> So, not to hijack this thread or maybe I misunderstood your comment I changed 
> the subject.
> 
> Maybe we can look at the plugin model for a DPDK threading model to allow 
> someone to use their own threading solution.
> 
> Is this required or just another enhancement?

It is another enhancement.
As the service core would be a new API, we should check that it is
compatible with a possible evolution of the underlying thread model.
And I think it can be a good opportunity to draw a complete view of
how DPDK could evolve regarding the thread model.


Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Ferruh Yigit
On 5/17/2017 3:54 PM, Ananyev, Konstantin wrote:
> Hi Ferruh,
> Please see my comments/questions below.

Thanks for review.

> Thanks
> Konstantin

<...>

> I think it was discussed already, but I still wonder why rte_flow_item can't 
> be used for that approach?

Missed this one:

Gaëtan also had same comment, copy-paste from other mail related to my
concerns using rte_flow:

"
rte_flow is to create flow rules in PMD level, but what this library
aims to collect flow information, independent from if underlying PMD
implemented rte_flow or not.

So issues with using rte_flow for this use case:
1- It may not be implemented for all PMDs (including virtual ones).
2- It may conflict with other rte_flow rules created by user.
3- It may not gather all information required. (I mean some actions
here, count like ones are easy but rte_flow may not be so flexible to
extract different metrics from flows)
"


Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Ananyev, Konstantin
> > Hi Ferruh,
> > Please see my comments/questions below.
> > Thanks
> > Konstantin
> >
> >> +
> >> +/**
> >> + * @file
> >> + *
> >> + * RTE Flow Classify Library
> >> + *
> >> + * This library provides flow record information with some measured 
> >> properties.
> >> + *
> >> + * Application can select variety of flow types based on various flow 
> >> keys.
> >> + *
> >> + * Library only maintains flow records between 
> >> rte_flow_classify_stats_get()
> >> + * calls and with a maximum limit.
> >> + *
> >> + * Provided flow record will be linked list rte_flow_classify_stat_xxx
> >> + * structure.
> >> + *
> >> + * Library is responsible from allocating and freeing memory for flow 
> >> record
> >> + * table. Previous table freed with next rte_flow_classify_stats_get() 
> >> call and
> >> + * all tables are freed with rte_flow_classify_type_reset() or
> >> + * rte_flow_classify_type_set(x, 0). Memory for table allocated on the 
> >> fly while
> >> + * creating records.
> >> + *
> >> + * A rte_flow_classify_type_set() with a valid type will register Rx/Tx
> >> + * callbacks and start filling flow record table.
> >> + * With rte_flow_classify_stats_get(), pointer sent to caller and 
> >> meanwhile
> >> + * library continues collecting records.
> >> + *
> >> + *  Usage:
> >> + *  - application calls rte_flow_classify_type_set() for a device
> >> + *  - library creates Rx/Tx callbacks for packets and start filling flow 
> >> table
> >
> > Does it necessary to use an  RX callback here?
> > Can library provide an API like collect(port_id, input_mbuf[], pkt_num) 
> > instead?
> > So the user would have a choice either setup a callback or call collect() 
> > directly.
> 
> This was also comment from Morten, I will update RFC to use direct API call.
> 
> >
> >> + *for that type of flow (currently only one flow type supported)
> >> + *  - application calls rte_flow_classify_stats_get() to get pointer to 
> >> linked
> >> + *listed flow table. Library assigns this pointer to another value 
> >> and keeps
> >> + *collecting flow data. In next rte_flow_classify_stats_get(), 
> >> library first
> >> + *free the previous table, and pass current table to the application, 
> >> keep
> >> + *collecting data.
> >
> > Ok, but that means that you can't use stats_get() for the same type
> > from 2 different threads without explicit synchronization?
> 
> Correct.
> And multiple threads shouldn't be calling this API. It doesn't store
> previous flow data, so multiple threads calling this only can have piece
> of information. Do you see any use case that multiple threads can call
> this API?

One example would be when you have multiple queues per port,
managed/monitored by different cores.
BTW, how are you going to collect the stats in that way?

> 
> >
> >> + *  - application calls rte_flow_classify_type_reset(), library 
> >> unregisters the
> >> + *callbacks and free all flow table data.
> >> + *
> >> + */
> >> +
> >> +enum rte_flow_classify_type {
> >> +  RTE_FLOW_CLASSIFY_TYPE_GENERIC = (1 << 0),
> >> +  RTE_FLOW_CLASSIFY_TYPE_MAX,
> >> +};
> >> +
> >> +#define RTE_FLOW_CLASSIFY_TYPE_MASK = (((RTE_FLOW_CLASSIFY_TYPE_MAX - 1) 
> >> << 1) - 1)
> >> +
> >> +/**
> >> + * Global configuration struct
> >> + */
> >> +struct rte_flow_classify_config {
> >> +  uint32_t type; /* bitwise enum rte_flow_classify_type values */
> >> +  void *flow_table_prev;
> >> +  uint32_t flow_table_prev_item_count;
> >> +  void *flow_table_current;
> >> +  uint32_t flow_table_current_item_count;
> >> +} rte_flow_classify_config[RTE_MAX_ETHPORTS];
> >> +
> >> +#define RTE_FLOW_CLASSIFY_STAT_MAX UINT16_MAX
> >> +
> >> +/**
> >> + * Classification stats data struct
> >> + */
> >> +struct rte_flow_classify_stat_generic {
> >> +  struct rte_flow_classify_stat_generic *next;
> >> +  uint32_t id;
> >> +  uint64_t timestamp;
> >> +
> >> +  struct ether_addr src_mac;
> >> +  struct ether_addr dst_mac;
> >> +  uint32_t src_ipv4;
> >> +  uint32_t dst_ipv4;
> >> +  uint8_t l3_protocol_id;
> >> +  uint16_t src_port;
> >> +  uint16_t dst_port;
> >> +
> >> +  uint64_t packet_count;
> >> +  uint64_t packet_size; /* bytes */
> >> +};
> >
> > Ok, so if I understood things right, for generic type it will always 
> > classify all incoming packets by:
> > 
> > all by absolute values, and represent results as a linked list.
> > Is that correct, or I misunderstood your intentions here?
> 
> Correct.
> 
> > If so, then I see several disadvantages here:
> > 1) It is really hard to predict what kind of stats is required for that 
> > particular cases.
> >  Let say some people would like to collect stat by  ,
> > another by , third ones by  and so on.
> > Having just one hardcoded filter doesn't seem very flexible/usable.
> > I think you need to find a way to allow user to define what type of filter 
> > they want to apply.
> 
> The flow type should be provided by applications, according their needs,
> and needs to be implemented in this library. The g

Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Ananyev, Konstantin


> 
> On 5/17/2017 3:54 PM, Ananyev, Konstantin wrote:
> > Hi Ferruh,
> > Please see my comments/questions below.
> 
> Thanks for review.
> 
> > Thanks
> > Konstantin
> 
> <...>
> 
> > I think it was discussed already, but I still wonder why rte_flow_item 
> > can't be used for that approach?
> 
> Missed this one:
> 
> Gaëtan also had same comment, copy-paste from other mail related to my
> concerns using rte_flow:
> 
> "
> rte_flow is to create flow rules at the PMD level, but this library
> aims to collect flow information, independent of whether the underlying PMD
> implemented rte_flow or not.
> 
> So issues with using rte_flow for this use case:
> 1- It may not be implemented for all PMDs (including virtual ones).
> 2- It may conflict with other rte_flow rules created by the user.
> 3- It may not gather all information required. (I mean some actions
> here; count-like ones are easy, but rte_flow may not be so flexible to
> extract different metrics from flows)
> "

I am not talking about actions - I am talking about using rte_flow_item
(or similar approach) to allow the user to define what flow they would like to have.
Then the flow_classify library would use that information to generate
the internal structures(/code) it will use to classify the incoming packets. 
I understand that we might not support all defined rte_flow_items straight away;
we could start with some limited set and add new ones on an iterative basis.
Basically what I am talking about - SW implementation for rte_flow.
Konstantin





Re: [dpdk-dev] [RFC 17.08] flow_classify: add librte_flow_classify library

2017-05-17 Thread Gaëtan Rivet

Hi Ferruh,

On Wed, May 17, 2017 at 05:02:50PM +0100, Ferruh Yigit wrote:

On 5/17/2017 3:54 PM, Ananyev, Konstantin wrote:

Hi Ferruh,
Please see my comments/questions below.


Thanks for review.


Thanks
Konstantin


<...>


I think it was discussed already, but I still wonder why rte_flow_item can't be 
used for that approach?


Missed this one:

Gaëtan also had the same comment; copy-pasted from another mail related to my
concerns about using rte_flow:

"
rte_flow is to create flow rules at the PMD level, but this library
aims to collect flow information, independent of whether the underlying PMD
implemented rte_flow or not.

So issues with using rte_flow for this use case:
1- It may not be implemented for all PMDs (including virtual ones).
2- It may conflict with other rte_flow rules created by the user.
3- It may not gather all information required. (I mean some actions
here; count-like ones are easy, but rte_flow may not be so flexible to
extract different metrics from flows)
"


There are two separate elements to using rte_flow in this context I think.

One is the use of the existing actions, and as you say, this makes the
support of this library dependent on the rte_flow support in PMDs.

The other is the expression of flows through a shared syntax. Using
flags to propose presets can be simpler, but will probably not be flexible
enough. rte_flow_items are a first-class citizen in DPDK and are
already a data type that can express flows with flexibility. As
mentioned, they are however missing a few elements to fully cover IPFIX
meters, but nothing that cannot be added I think.

So I was probably not clear enough, but I was thinking about
supporting rte_flow_items in rte_flow_classify as the possible key
applications would use to configure their measurements. This should not
require rte_flow support from the PMDs they would be using, only
rte_flow_item parsing from the rte_flow_classify library.

Otherwise, DPDK will probably end up with two competing flow
representations. Additionally, it may be interesting for applications
to bind these data directly to rte_flow actions once the
classification has been analyzed.

--
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH v2 00/13] introduce fail-safe PMD

2017-05-17 Thread Gaëtan Rivet

On Wed, May 17, 2017 at 01:50:40PM +0100, Ferruh Yigit wrote:

On 3/20/2017 3:00 PM, Thomas Monjalon wrote:

There have been some discussions on this new PMD and it will be
discussed today in the techboard meeting.

I would like to expose my view and summarize the solutions I have heard.
First it is important to remind that everyone agrees on the need for
this feature, i.e. masking the hotplug events by maintaining an ethdev
object even without a real underlying device.

1/
The proposal from Gaetan is to add a failsafe driver with 2 features:
* masking underlying device
* limited and small failover code to switch from a device
  to another one, with the same centralized configuration
The latter feature brings the bonding driver to mind, but it could be
kept limited without any intent of implementing real bonding features.

2/
If we really want to merge failsafe and bonding features, we could
create a new bonding driver with centralized configuration.
The legacy bonding driver lets each slave be configured separately.
It is a different model and we should not mix them.
If one is better, it could be deprecated later.

3/
It can be tried to implement the failsafe feature into the bonding
driver, as Neil suggests.
However, I am not sure it would work very well or would be easy to use.

4/
We can implement only the failsafe feature as a PMD and use it to wrap
the slaves of the bonding driver.
So the order of link would be
bonding -> failsafe -> real device
In this model, failsafe can have only one slave and does not implement
the fail-over feature.



Tech board decided [1] to "reconsider" the PMD for this release (17.08).
So, let's start it :)

I think it is a good idea to continue on top of the above summary; is there a
plan for how to proceed?



The fail-safe proposal has not evolved from the techboard point of view.
The salient point is still choosing between those 4 possible integrations.

To give a quick overview of its current state:

I have started working on a v3 to be integrated to v17.08. The work
however was exceedingly complicated due to deep-rooted dependencies in
the PCI implementation within the EAL, which has evolved in v17.05 and
will evolve during this release.

The current rte_bus rework from Jan Blunck and myself will greatly
simplify the sub-EAL layer that was used in the fail-safe. I am thus
waiting on Jan Blunck's series on attach / detach, to propose mine in
turn for rte_devargs, move the PCI bus where it belongs and, finally,
rebase the fail-safe upon it.

The form this work is taking however is still the same as previously,
thus currently aiming at solutions 1 or 2.


Thanks,
ferruh

[1]
http://dpdk.org/ml/archives/dev/2017-March/061009.html


--
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [RFC][PATCH] vfio: allow to map other memory regions

2017-05-17 Thread Stephen Hemminger
On Wed, 17 May 2017 16:44:46 +0200
Pawel Wodkowski  wrote:

>  /* IOMMU types we support */
>  static const struct vfio_iommu_type iommu_types[] = {
>   /* x86 IOMMU, otherwise known as type 1 */
> - { RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map},
> + { RTE_VFIO_TYPE1, "Type 1", &vfio_type1_dma_map, 
> &vfio_type1_dma_mem_map},
>   /* ppc64 IOMMU, otherwise known as spapr */
> - { RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map},
> + { RTE_VFIO_SPAPR, "sPAPR", &vfio_spapr_dma_map, NULL},
>   /* IOMMU-less mode */
> - { RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map},
> + { RTE_VFIO_NOIOMMU, "No-IOMMU", &vfio_noiommu_dma_map, 
> &vfio_noiommu_dma_mem_map},
>  };

For complex tables like this why not use C99 style initializer.


Re: [dpdk-dev] [PATCH 1/3] examples/eventdev_pipeline: added sample app

2017-05-17 Thread Jerin Jacob
-Original Message-
> Date: Fri, 21 Apr 2017 10:51:37 +0100
> From: Harry van Haaren 
> To: dev@dpdk.org
> CC: jerin.ja...@caviumnetworks.com, Harry van Haaren
>  , Gage Eads , Bruce
>  Richardson 
> Subject: [PATCH 1/3] examples/eventdev_pipeline: added sample app
> X-Mailer: git-send-email 2.7.4
> 
> This commit adds a sample app for the eventdev library.
> The app has been tested with DPDK 17.05-rc2, hence this
> release (or later) is recommended.
> 
> The sample app showcases a pipeline processing use-case,
> with event scheduling and processing defined per stage.
> The application receives traffic as normal, with each
> packet traversing the pipeline. Once the packet has
> been processed by each of the pipeline stages, it is
> transmitted again.
> 
> The app provides a framework to utilize cores for a single
> role or multiple roles. Examples of roles are the RX core,
> TX core, Scheduling core (in the case of the event/sw PMD),
> and worker cores.
> 
> Various flags are available to configure numbers of stages,
> cycles of work at each stage, type of scheduling, number of
> worker cores, queue depths etc. For a full explanation,
> please refer to the documentation.
> 
> Signed-off-by: Gage Eads 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 
> ---
> +
> +static inline void
> +schedule_devices(uint8_t dev_id, unsigned lcore_id)
> +{
> + if (rx_core[lcore_id] && (rx_single ||
> + rte_atomic32_cmpset(&rx_lock, 0, 1))) {
> + producer();
> + rte_atomic32_clear((rte_atomic32_t *)&rx_lock);
> + }
> +
> + if (sched_core[lcore_id] && (sched_single ||
> + rte_atomic32_cmpset(&sched_lock, 0, 1))) {
> + rte_event_schedule(dev_id);

One question here,

Is rte_event_schedule()'s SW PMD implementation capable of running
concurrently on multiple cores?

Context:
Currently I am writing a testpmd-like test framework to realize
different use cases along with performance test cases like throughput
and latency and making sure it works on SW and HW driver.

I see the following segfault problem when rte_event_schedule() is invoked on
multiple cores currently. Is it expected?

#0  0x0043e945 in __pull_port_lb (allow_reorder=0, port_id=2,
sw=0x7ff93f3cb540) at
/export/dpdk-thunderx/drivers/event/sw/sw_evdev_scheduler.c:406
/export/dpdk-thunderx/drivers/event/sw/sw_evdev_scheduler.c:406:11647:beg:0x43e945
[Current thread is 1 (Thread 0x7ff9fbd34700 (LWP 796))]
(gdb) bt
#0  0x0043e945 in __pull_port_lb (allow_reorder=0, port_id=2,
sw=0x7ff93f3cb540) at
/export/dpdk-thunderx/drivers/event/sw/sw_evdev_scheduler.c:406
#1  sw_schedule_pull_port_no_reorder (port_id=2, sw=0x7ff93f3cb540) at
/export/dpdk-thunderx/drivers/event/sw/sw_evdev_scheduler.c:495
#2  sw_event_schedule (dev=) at
/export/dpdk-thunderx/drivers/event/sw/sw_evdev_scheduler.c:566
#3  0x0040b4af in rte_event_schedule (dev_id=) at
/export/dpdk-thunderx/build/include/rte_eventdev.h:1092
#4  worker (arg=) at
/export/dpdk-thunderx/app/test-eventdev/test_queue_order.c:200
#5  0x0042d14b in eal_thread_loop (arg=) at
/export/dpdk-thunderx/lib/librte_eal/linuxapp/eal/eal_thread.c:184
#6  0x7ff9fd8e32e7 in start_thread () from /usr/lib/libpthread.so.0
#7  0x7ff9fd62454f in clone () from /usr/lib/libc.so.6
(gdb) list
401  */
402 uint32_t iq_num = PRIO_TO_IQ(qe->priority);
403 struct sw_qid *qid = &sw->qids[qe->queue_id];
404
405 if ((flags & QE_FLAG_VALID) &&
406
iq_ring_free_count(qid->iq[iq_num]) == 0)
407 break;
408
409 /* now process based on flags. Note that for
directed
410  * queues, the enqueue_flush masks off all but
the
(gdb) 





> + if (dump_dev_signal) {
> + rte_event_dev_dump(0, stdout);
> + dump_dev_signal = 0;
> + }
> + rte_atomic32_clear((rte_atomic32_t *)&sched_lock);
> + }
> +
> + if (tx_core[lcore_id] && (tx_single ||
> + rte_atomic32_cmpset(&tx_lock, 0, 1))) {
> + consumer();
> + rte_atomic32_clear((rte_atomic32_t *)&tx_lock);
> + }
> +}
> +


[dpdk-dev] [PATCH 1/3] net/af_packet: handle strdup() failures

2017-05-17 Thread Charles (Chas) Williams
Fixes: 1b93c2aa81b4 ("net/af_packet: add interface name to internals")

Signed-off-by: Chas Williams 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index a03966a..ce4dc07 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -630,6 +630,8 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
goto error_early;
}
(*internals)->if_name = strdup(pair->value);
+   if ((*internals)->if_name == NULL)
+   goto error_early;
(*internals)->if_index = ifr.ifr_ifindex;
 
if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
-- 
2.1.4



[dpdk-dev] [PATCH 2/3] net/af_packet: add accessor for iface

2017-05-17 Thread Charles (Chas) Williams
Provide an accessor for the name of the underlying Linux interface
used by the AF_PACKET-based interface.

Signed-off-by: Charles (Chas) Williams 
---
 drivers/net/af_packet/Makefile |  2 +
 drivers/net/af_packet/rte_eth_af_packet.c  | 17 +++
 drivers/net/af_packet/rte_eth_af_packet.h  | 55 ++
 .../net/af_packet/rte_pmd_af_packet_version.map|  6 +++
 4 files changed, 80 insertions(+)
 create mode 100644 drivers/net/af_packet/rte_eth_af_packet.h

diff --git a/drivers/net/af_packet/Makefile b/drivers/net/af_packet/Makefile
index 70d517c..5ea058c 100644
--- a/drivers/net/af_packet/Makefile
+++ b/drivers/net/af_packet/Makefile
@@ -50,4 +50,6 @@ CFLAGS += $(WERROR_FLAGS)
 #
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += rte_eth_af_packet.c
 
+SYMLINK-y-include += rte_eth_af_packet.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index ce4dc07..6927f70 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -43,6 +43,8 @@
 #include 
 #include 
 
+#include "rte_eth_af_packet.h"
+
 #include 
 #include 
 #include 
@@ -125,6 +127,21 @@ static struct rte_eth_link pmd_link = {
.link_autoneg = ETH_LINK_SPEED_AUTONEG
 };
 
+int
+rte_af_packet_get_ifname(uint8_t port, char *buf, size_t len)
+{
+   struct rte_eth_dev *dev;
+   struct pmd_internals *internals;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+   dev = &rte_eth_devices[port];
+   internals = dev->data->dev_private;
+   snprintf(buf, len, "%s", internals->if_name);
+
+   return 0;
+}
+
 static uint16_t
 eth_af_packet_rx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
 {
diff --git a/drivers/net/af_packet/rte_eth_af_packet.h 
b/drivers/net/af_packet/rte_eth_af_packet.h
new file mode 100644
index 000..c5276f5
--- /dev/null
+++ b/drivers/net/af_packet/rte_eth_af_packet.h
@@ -0,0 +1,55 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright (c) 2017 Brocade Communications Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *   * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *   * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *   * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_AF_PACKET_H_
+#define _RTE_ETH_AF_PACKET_H_
+
+#include 
+
+/**
+ * Get the name of the underlying kernel interface associated
+ * with this AF_PACKET device.
+ *
+ * @param port
+ *  The port identifier of the AF_PACKET device.
+ * @param buf
+ *  The buffer to store the queried ifname
+ * @param len
+ *  The length of buf
+ *
+ * @return
+ *  0 on success, negative errno on failure
+ */
+int rte_af_packet_get_ifname(uint8_t port, char *buf, size_t len);
+
+#endif /* _RTE_ETH_AF_PACKET_H_ */
diff --git a/drivers/net/af_packet/rte_pmd_af_packet_version.map 
b/drivers/net/af_packet/rte_pmd_af_packet_version.map
index ef35398..2231699 100644
--- a/drivers/net/af_packet/rte_pmd_af_packet_version.map
+++ b/drivers/net/af_packet/rte_pmd_af_packet_version.map
@@ -2,3 +2,9 @@ DPDK_2.0 {
 
local: *;
 };
+
+DPDK_17.08 {
+   global:
+
+   rte_af_packet_get_ifname;
+};
-- 
2.1.4



[dpdk-dev] [PATCH 3/3] net/af_packet: fix packet bytes counting

2017-05-17 Thread Charles (Chas) Williams
On error, we also need to zero the bytes transmitted.

Signed-off-by: Charles (Chas) Williams 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index 6927f70..e456836 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -269,8 +269,11 @@ eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, 
uint16_t nb_pkts)
}
 
/* kick-off transmits */
-   if (sendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0) == -1)
-   num_tx = 0; /* error sending -- no packets transmitted */
+   if (sendto(pkt_q->sockfd, NULL, 0, MSG_DONTWAIT, NULL, 0) == -1) {
+   /* error sending -- no packets transmitted */
+   num_tx = 0;
+   num_tx_bytes = 0;
+   }
 
pkt_q->framenum = framenum;
pkt_q->tx_pkts += num_tx;
-- 
2.1.4



[dpdk-dev] [PATCH] net/e1000: support MAC filters for i210 and i211 chips

2017-05-17 Thread Markus Theil
i210 and i211 also support unicast MAC filters.
The patch was tested on i210-based hw; for i211,
support was looked up in the specs.

Signed-off-by: Markus Theil 
---
 drivers/net/e1000/igb_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index e1702d8..43d1f5f 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -3551,7 +3551,8 @@ eth_igb_rss_reta_query(struct rte_eth_dev *dev,
 
 #define MAC_TYPE_FILTER_SUP(type)do {\
if ((type) != e1000_82580 && (type) != e1000_i350 &&\
-   (type) != e1000_82576)\
+   (type) != e1000_82576 && (type) != e1000_i210 &&\
+   (type) != e1000_i211)\
return -ENOTSUP;\
 } while (0)
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] eventdev: clarify atomic and ordered queue config

2017-05-17 Thread Jerin Jacob
-Original Message-
> Date: Mon, 15 May 2017 14:40:14 -0500
> From: Gage Eads 
> To: dev@dpdk.org
> CC: jerin.ja...@caviumnetworks.com
> Subject: [PATCH v2] eventdev: clarify atomic and ordered queue config
> X-Mailer: git-send-email 2.7.4
> 
> The nb_atomic_flows and nb_atomic_order_sequences fields are only inspected
> if the queue is configured for atomic or ordered scheduling, respectively.
> This commit updates the documentation to reflect that.
> 
> Signed-off-by: Gage Eads 

Applied to dpdk-next-eventdev/master. Thanks.

> ---
> v2: Fixed doxygen output issue and tweaked the ranges
> 
>  lib/librte_eventdev/rte_eventdev.h | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.h 
> b/lib/librte_eventdev/rte_eventdev.h
> index 20e7293..9428433 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -521,9 +521,11 @@ rte_event_dev_configure(uint8_t dev_id,
>  struct rte_event_queue_conf {
>   uint32_t nb_atomic_flows;
>   /**< The maximum number of active flows this queue can track at any
> -  * given time. The value must be in the range of
> -  * [1 - nb_event_queue_flows)] which previously provided in
> -  * rte_event_dev_info_get().
> +  * given time. If the queue is configured for atomic scheduling (by
> +  * applying the RTE_EVENT_QUEUE_CFG_ALL_TYPES or
> +  * RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY flags to event_queue_cfg), then the
> +  * value must be in the range of [1, nb_event_queue_flows], which was
> +  * previously provided in rte_event_dev_configure().
>*/
>   uint32_t nb_atomic_order_sequences;
>   /**< The maximum number of outstanding events waiting to be
> @@ -533,8 +535,11 @@ struct rte_event_queue_conf {
>* scheduler cannot schedule the events from this queue and invalid
>* event will be returned from dequeue until one or more entries are
>* freed up/released.
> -  * The value must be in the range of [1 - nb_event_queue_flows)]
> -  * which previously supplied to rte_event_dev_configure().
> +  * If the queue is configured for ordered scheduling (by applying the
> +  * RTE_EVENT_QUEUE_CFG_ALL_TYPES or RTE_EVENT_QUEUE_CFG_ORDERED_ONLY
> +  * flags to event_queue_cfg), then the value must be in the range of
> +  * [1, nb_event_queue_flows], which was previously supplied to
> +  * rte_event_dev_configure().
>*/
>   uint32_t event_queue_cfg; /**< Queue cfg flags(EVENT_QUEUE_CFG_) */
>   uint8_t priority;
> -- 
> 2.7.4
> 


[dpdk-dev] [PATCH] examples/performance-thread: add arm64 support

2017-05-17 Thread Ashwin Sekhar T K
Updated Makefile to allow compilation for the arm64 architecture.

Moved the code for setting the initial stack to an architecture-specific
directory.

Added implementation of context-switch for arm64 architecture.

Fixed minor compilation errors for the arm64 build.

Signed-off-by: Ashwin Sekhar T K 
---
 examples/performance-thread/Makefile   |   4 +-
 .../performance-thread/common/arch/arm64/ctx.c |  99 ++
 .../performance-thread/common/arch/arm64/ctx.h |  95 ++
 .../performance-thread/common/arch/arm64/stack.h   | 111 +
 .../performance-thread/common/arch/x86/stack.h |  65 
 examples/performance-thread/common/common.mk   |  10 +-
 examples/performance-thread/common/lthread.c   |  11 +-
 examples/performance-thread/l3fwd-thread/main.c|   2 +-
 8 files changed, 383 insertions(+), 14 deletions(-)
 create mode 100644 examples/performance-thread/common/arch/arm64/ctx.c
 create mode 100644 examples/performance-thread/common/arch/arm64/ctx.h
 create mode 100644 examples/performance-thread/common/arch/arm64/stack.h
 create mode 100644 examples/performance-thread/common/arch/x86/stack.h

diff --git a/examples/performance-thread/Makefile 
b/examples/performance-thread/Makefile
index d19f8489e..0c5edfdb9 100644
--- a/examples/performance-thread/Makefile
+++ b/examples/performance-thread/Makefile
@@ -38,8 +38,8 @@ RTE_TARGET ?= x86_64-native-linuxapp-gcc
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
-ifneq ($(CONFIG_RTE_ARCH),"x86_64")
-$(error This application is only supported for x86_64 targets)
+ifeq ($(filter y,$(CONFIG_RTE_ARCH_X86_64) $(CONFIG_RTE_ARCH_ARM64)),)
+$(error This application is only supported for x86_64 and arm64 targets)
 endif
 
 DIRS-y += l3fwd-thread
diff --git a/examples/performance-thread/common/arch/arm64/ctx.c 
b/examples/performance-thread/common/arch/arm64/ctx.c
new file mode 100644
index 0..7073cfd75
--- /dev/null
+++ b/examples/performance-thread/common/arch/arm64/ctx.c
@@ -0,0 +1,99 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) Cavium networks Ltd. 2017.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Cavium networks nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/*
+ * https://github.com/halayli/lthread which carries the following license.
+ *
+ * Copyright (C) 2012, Hasan Alayli 
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDIN

Re: [dpdk-dev] [PATCH] examples/performance-thread: add arm64 support

2017-05-17 Thread Jerin Jacob
-Original Message-
> Date: Wed, 17 May 2017 11:19:49 -0700
> From: Ashwin Sekhar T K 
> To: jerin.ja...@caviumnetworks.com, john.mcnam...@intel.com,
>  jianbo@linaro.org
> Cc: dev@dpdk.org, Ashwin Sekhar T K 
> Subject: [dpdk-dev] [PATCH] examples/performance-thread: add arm64 support
> X-Mailer: git-send-email 2.12.2
> 
> Updated Makefile to allow compilation for the arm64 architecture.
> 
> Moved the code for setting the initial stack to an architecture-specific
> directory.

Please split the patch to two
- "arch_set_stack" abstraction and associated x86 change
- arm64 support

Thanks Ashwin.

I think this may be the last feature needed to bring arm64 to par with the x86
features supported in DPDK.

/Jerin


[dpdk-dev] [PATCH v4 4/8] net/enic: flow API mark and flag support

2017-05-17 Thread John Daley
For VICs with filter tagging, support the MARK and FLAG actions
by setting appropriate mbuf ol_flags if there is a filter match.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 drivers/net/enic/enic_rxtx.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index ba0cfd01a..5867acf19 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -253,8 +253,20 @@ enic_cq_rx_to_pkt_flags(struct cq_desc *cqd, struct 
rte_mbuf *mbuf)
}
mbuf->vlan_tci = vlan_tci;
 
-   /* RSS flag */
-   if (enic_cq_rx_desc_rss_type(cqrd)) {
+   if ((cqd->type_color & CQ_DESC_TYPE_MASK) == CQ_DESC_TYPE_CLASSIFIER) {
+   struct cq_enet_rq_clsf_desc *clsf_cqd;
+   uint16_t filter_id;
+   clsf_cqd = (struct cq_enet_rq_clsf_desc *)cqd;
+   filter_id = clsf_cqd->filter_id;
+   if (filter_id) {
+   pkt_flags |= PKT_RX_FDIR;
+   if (filter_id != ENIC_MAGIC_FILTER_ID) {
+   mbuf->hash.fdir.hi = clsf_cqd->filter_id;
+   pkt_flags |= PKT_RX_FDIR_ID;
+   }
+   }
+   } else if (enic_cq_rx_desc_rss_type(cqrd)) {
+   /* RSS flag */
pkt_flags |= PKT_RX_RSS_HASH;
mbuf->hash.rss = enic_cq_rx_desc_rss_hash(cqrd);
}
-- 
2.12.0



[dpdk-dev] [PATCH v4 5/8] net/enic: flow API for NICs with advanced filters disabled

2017-05-17 Thread John Daley
Flow support for 1300 series adapters with the 'Advanced Filter'
mode disabled via the UCS management interface. This allows:
Attributes: ingress
Items: Outer eth, ipv4, ipv6, udp, sctp, tcp, vxlan. Inner eth, ipv4,
   ipv6, udp, tcp.
Actions: queue and void
Selectors: 'is', 'spec' and 'mask'. 'last' is not supported

With advanced filters disabled, an IPv4 or IPv6 item must be specified
in the pattern.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 drivers/net/enic/enic_flow.c | 135 ++-
 1 file changed, 133 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
index 44efe4b81..a32e25ef9 100644
--- a/drivers/net/enic/enic_flow.c
+++ b/drivers/net/enic/enic_flow.c
@@ -98,8 +98,85 @@ static enic_copy_item_fn enic_copy_item_tcp_v2;
 static enic_copy_item_fn enic_copy_item_sctp_v2;
 static enic_copy_item_fn enic_copy_item_sctp_v2;
 static enic_copy_item_fn enic_copy_item_vxlan_v2;
+static copy_action_fn enic_copy_action_v1;
 static copy_action_fn enic_copy_action_v2;
 
+/**
+ * NICs that have the Advanced Filters capability but with it disabled. This
+ * means that layer 3 must be specified.
+ */
+static const struct enic_items enic_items_v2[] = {
+   [RTE_FLOW_ITEM_TYPE_ETH] = {
+   .copy_item = enic_copy_item_eth_v2,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_VXLAN,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_VLAN] = {
+   .copy_item = enic_copy_item_vlan_v2,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_ETH,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_IPV4] = {
+   .copy_item = enic_copy_item_ipv4_v2,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_ETH,
+  RTE_FLOW_ITEM_TYPE_VLAN,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_IPV6] = {
+   .copy_item = enic_copy_item_ipv6_v2,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_ETH,
+  RTE_FLOW_ITEM_TYPE_VLAN,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_UDP] = {
+   .copy_item = enic_copy_item_udp_v2,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_IPV4,
+  RTE_FLOW_ITEM_TYPE_IPV6,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_TCP] = {
+   .copy_item = enic_copy_item_tcp_v2,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_IPV4,
+  RTE_FLOW_ITEM_TYPE_IPV6,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_SCTP] = {
+   .copy_item = enic_copy_item_sctp_v2,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_IPV4,
+  RTE_FLOW_ITEM_TYPE_IPV6,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_VXLAN] = {
+   .copy_item = enic_copy_item_vxlan_v2,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_UDP,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+};
+
 /** NICs with Advanced filters enabled */
 static const struct enic_items enic_items_v3[] = {
[RTE_FLOW_ITEM_TYPE_ETH] = {
@@ -175,11 +252,20 @@ static const struct enic_items enic_items_v3[] = {
 
/** Filtering capabilities indexed by this NIC's supported filter type. */
 static const struct enic_filter_cap enic_filter_cap[] = {
+   [FILTER_USNIC_IP] = {
+   .item_info = enic_items_v2,
+   },
[FILTER_DPDK_1] = {
.item_info = enic_items_v3,
},
 };
 
+/** Supported actions for older NICs */
+static const enum rte_flow_action_type enic_supported_actions_v1[] = {
+   RTE_FLOW_ACTION_TYPE_QUEUE,
+   RTE_FLOW_ACTION_TYPE_END,
+};
+
 /** Supported actions for newer NICs */
 static const enum 

[dpdk-dev] [PATCH v4 2/8] net/enic: flow API skeleton

2017-05-17 Thread John Daley
Stub callbacks for the generic flow API and a new FLOW debug define.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 config/common_base |   1 +
 drivers/net/enic/Makefile  |   1 +
 drivers/net/enic/enic.h|   1 +
 drivers/net/enic/enic_ethdev.c |  18 -
 drivers/net/enic/enic_flow.c   | 151 +
 5 files changed, 169 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/enic/enic_flow.c

diff --git a/config/common_base b/config/common_base
index 8907bea36..67ef2ece0 100644
--- a/config/common_base
+++ b/config/common_base
@@ -254,6 +254,7 @@ CONFIG_RTE_LIBRTE_CXGBE_DEBUG_RX=n
 #
 CONFIG_RTE_LIBRTE_ENIC_PMD=y
 CONFIG_RTE_LIBRTE_ENIC_DEBUG=n
+CONFIG_RTE_LIBRTE_ENIC_DEBUG_FLOW=n
 
 #
 # Compile burst-oriented Netronome NFP PMD driver
diff --git a/drivers/net/enic/Makefile b/drivers/net/enic/Makefile
index 2c7496dc5..db48ff2da 100644
--- a/drivers/net/enic/Makefile
+++ b/drivers/net/enic/Makefile
@@ -56,6 +56,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_main.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_clsf.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_res.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_cq.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_wq.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_dev.c
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 2358a7f6f..9647ca21f 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -306,4 +306,5 @@ void copy_fltr_v1(struct filter_v2 *fltr, struct rte_eth_fdir_input *input,
  struct rte_eth_fdir_masks *masks);
 void copy_fltr_v2(struct filter_v2 *fltr, struct rte_eth_fdir_input *input,
  struct rte_eth_fdir_masks *masks);
+extern const struct rte_flow_ops enic_flow_ops;
 #endif /* _ENIC_H_ */
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 8e16a71b7..a8e167681 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -116,13 +116,25 @@ enicpmd_dev_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_op filter_op,
 void *arg)
 {
-   int ret = -EINVAL;
+   int ret = 0;
+
+   ENICPMD_FUNC_TRACE();
 
-   if (RTE_ETH_FILTER_FDIR == filter_type)
+   switch (filter_type) {
+   case RTE_ETH_FILTER_GENERIC:
+   if (filter_op != RTE_ETH_FILTER_GET)
+   return -EINVAL;
+   *(const void **)arg = &enic_flow_ops;
+   break;
+   case RTE_ETH_FILTER_FDIR:
ret = enicpmd_fdir_ctrl_func(dev, filter_op, arg);
-   else
+   break;
+   default:
dev_warning(enic, "Filter type (%d) not supported",
filter_type);
+   ret = -EINVAL;
+   break;
+   }
 
return ret;
 }
diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
new file mode 100644
index 0..a5c6ebd0a
--- /dev/null
+++ b/drivers/net/enic/enic_flow.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright (c) 2017, Cisco Systems, Inc.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
+ * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "enic_compat.h"
+#include "enic.h"
+#include "vnic_dev.h"
+#include "vnic_nic.h"
+
+#ifdef RTE_LIBRTE_ENIC_DEBUG_FLOW
+#define FLOW_TRACE() \
+   RTE_LOG(DEBUG, PMD, "%s()\n", __func__)
+#define FLOW_LOG(level, fmt, args...) \
+   RTE_LOG(level, PMD, fmt, ## args)
+#else
+#define FLOW_TRACE() do { } while (0)

[dpdk-dev] [PATCH v4 0/8] enic flow api support

2017-05-17 Thread John Daley
Sorry for the inconvenience Ferruh, hopefully this set will work better.

V4 patchset changes:
- include patch that was missing in V3 which caused compile errors
- put mark and flag support into a separate patch to make it clearer
  what the code changes in enic_rxtx.c were for
- fixed a documentation merge error
- fix copyright, remove unnecessary check for dev == NULL

thanks,
johnd

John Daley (8):
  net/enic: bring NIC interface functions up to date
  net/enic: flow API skeleton
  net/enic: flow API for NICs with advanced filters enabled
  net/enic: flow API mark and flag support
  net/enic: flow API for NICs with advanced filters disabled
  net/enic: flow API for Legacy NICs
  net/enic: flow API debug
  net/enic: flow API documentation

 config/common_base |1 +
 doc/guides/nics/enic.rst   |   52 ++
 doc/guides/nics/features/enic.ini  |1 +
 doc/guides/rel_notes/release_17_08.rst |6 +
 drivers/net/enic/Makefile  |1 +
 drivers/net/enic/base/cq_enet_desc.h   |   13 +
 drivers/net/enic/base/vnic_dev.c   |  162 +++-
 drivers/net/enic/base/vnic_dev.h   |5 +-
 drivers/net/enic/base/vnic_devcmd.h|   81 +-
 drivers/net/enic/enic.h|   15 +-
 drivers/net/enic/enic_clsf.c   |   16 +-
 drivers/net/enic/enic_ethdev.c |   18 +-
 drivers/net/enic/enic_flow.c   | 1544 
 drivers/net/enic/enic_main.c   |3 +
 drivers/net/enic/enic_res.c|   15 +
 drivers/net/enic/enic_rxtx.c   |   16 +-
 16 files changed, 1904 insertions(+), 45 deletions(-)
 create mode 100644 drivers/net/enic/enic_flow.c

-- 
2.12.0



[dpdk-dev] [PATCH v4 3/8] net/enic: flow API for NICs with advanced filters enabled

2017-05-17 Thread John Daley
Flow support for 1300 series adapters with the 'Advanced Filter'
mode enabled via the UCS management interface. This enables:
Attributes: ingress
Items: Outer eth, ipv4, ipv6, udp, sctp, tcp, vxlan. Inner eth, ipv4,
   ipv6, udp, tcp.
Actions: queue and void
Selectors: 'is', 'spec' and 'mask'. 'last' is not supported

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---

copy functions return positive errnos because they are then used in
rte_flow_error_set() which takes a positive errno. checkpatch flags
these as warnings, but in my opinion the implementation should be
acceptable in this case.
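
The patch drives pattern validation from a per-item table: each item type records whether it may start a pattern (`valid_start_item`) and which item types may precede it (`prev_items`). The following is a minimal, self-contained sketch of that ordering check — simplified types and the name `pattern_valid` are hypothetical, not the actual enic code:

```c
#include <assert.h>

enum item_type { ITEM_END, ITEM_ETH, ITEM_IPV4, ITEM_UDP };

struct item_info {
	int valid_start_item;                 /* may this item start the pattern? */
	const enum item_type *prev_items;     /* allowed predecessors, ITEM_END-terminated */
};

static const enum item_type prev_eth[]  = { ITEM_END };
static const enum item_type prev_ipv4[] = { ITEM_ETH, ITEM_END };
static const enum item_type prev_udp[]  = { ITEM_IPV4, ITEM_END };

static const struct item_info table[] = {
	[ITEM_ETH]  = { 1, prev_eth },
	[ITEM_IPV4] = { 1, prev_ipv4 },
	[ITEM_UDP]  = { 0, prev_udp },
};

/* Return 1 if 'pattern' (ITEM_END-terminated) obeys start/ordering rules. */
static int pattern_valid(const enum item_type *pattern)
{
	enum item_type prev = ITEM_END;
	int i, j;

	for (i = 0; pattern[i] != ITEM_END; i++) {
		const struct item_info *info = &table[pattern[i]];

		if (prev == ITEM_END) {
			/* first item in the stack */
			if (!info->valid_start_item)
				return 0;
		} else {
			int ok = 0;

			for (j = 0; info->prev_items[j] != ITEM_END; j++)
				if (info->prev_items[j] == prev)
					ok = 1;
			if (!ok)
				return 0;
		}
		prev = pattern[i];
	}
	return 1;
}
```

With this table, eth/ipv4/udp is accepted, while a pattern starting at udp (not a valid start item) or jumping eth→udp is rejected — the same shape of check the copy functions rely on before filling the NIC filter.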

 drivers/net/enic/enic.h  |  14 +-
 drivers/net/enic/enic_flow.c | 960 ++-
 drivers/net/enic/enic_main.c |   3 +
 drivers/net/enic/enic_res.c  |  15 +
 4 files changed, 975 insertions(+), 17 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 9647ca21f..e28f22352 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -80,6 +80,8 @@
 #define PCI_DEVICE_ID_CISCO_VIC_ENET 0x0043  /* ethernet vnic */
 #define PCI_DEVICE_ID_CISCO_VIC_ENET_VF  0x0071  /* enet SRIOV VF */
 
+/* Special Filter id for non-specific packet flagging. Don't change value */
+#define ENIC_MAGIC_FILTER_ID 0x
 
 #define ENICPMD_FDIR_MAX   64
 
@@ -111,6 +113,12 @@ struct enic_memzone_entry {
LIST_ENTRY(enic_memzone_entry) entries;
 };
 
+struct rte_flow {
+   LIST_ENTRY(rte_flow) next;
+   u16 enic_filter_id;
+   struct filter_v2 enic_filter;
+};
+
 /* Per-instance private data structure */
 struct enic {
struct enic *next;
@@ -135,7 +143,9 @@ struct enic {
int link_status;
u8 hw_ip_checksum;
u16 max_mtu;
-   u16 adv_filters;
+   u8 adv_filters;
+   u32 flow_filter_mode;
+   u8 filter_tags;
 
unsigned int flags;
unsigned int priv_flags;
@@ -170,6 +180,8 @@ struct enic {
rte_spinlock_t memzone_list_lock;
rte_spinlock_t mtu_lock;
 
+   LIST_HEAD(enic_flows, rte_flow) flows;
+   rte_spinlock_t flows_lock;
 };
 
 /* Get the CQ index from a Start of Packet(SOP) RQ index */
diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
index a5c6ebd0a..44efe4b81 100644
--- a/drivers/net/enic/enic_flow.c
+++ b/drivers/net/enic/enic_flow.c
@@ -52,6 +52,911 @@
 #define FLOW_LOG(level, fmt, args...) do { } while (0)
 #endif
 
+/** Info about how to copy items into enic filters. */
+struct enic_items {
+   /** Function for copying and validating an item. */
+   int (*copy_item)(const struct rte_flow_item *item,
+struct filter_v2 *enic_filter, u8 *inner_ofst);
+   /** List of valid previous items. */
+   const enum rte_flow_item_type * const prev_items;
+   /** True if it's OK for this item to be the first item. For some NIC
+* versions, it's invalid to start the stack above layer 3.
+*/
+   const u8 valid_start_item;
+};
+
+/** Filtering capabilities for various NIC and firmware versions. */
+struct enic_filter_cap {
+   /** list of valid items and their handlers and attributes. */
+   const struct enic_items *item_info;
+};
+
+/* functions for copying flow actions into enic actions */
+typedef int (copy_action_fn)(const struct rte_flow_action actions[],
+struct filter_action_v2 *enic_action);
+
+/* functions for copying items into enic filters */
+typedef int(enic_copy_item_fn)(const struct rte_flow_item *item,
+ struct filter_v2 *enic_filter, u8 *inner_ofst);
+
+/** Action capabilities for various NICs. */
+struct enic_action_cap {
+   /** list of valid actions */
+   const enum rte_flow_action_type *actions;
+   /** copy function for a particular NIC */
+   int (*copy_fn)(const struct rte_flow_action actions[],
+  struct filter_action_v2 *enic_action);
+};
+
+/* Forward declarations */
+static enic_copy_item_fn enic_copy_item_eth_v2;
+static enic_copy_item_fn enic_copy_item_vlan_v2;
+static enic_copy_item_fn enic_copy_item_ipv4_v2;
+static enic_copy_item_fn enic_copy_item_ipv6_v2;
+static enic_copy_item_fn enic_copy_item_udp_v2;
+static enic_copy_item_fn enic_copy_item_tcp_v2;
+static enic_copy_item_fn enic_copy_item_sctp_v2;
+static enic_copy_item_fn enic_copy_item_vxlan_v2;
+static copy_action_fn enic_copy_action_v2;
+
+/** NICs with Advanced filters enabled */
+static const struct enic_items enic_items_v3[] = {
+   [RTE_FLOW_ITEM_TYPE_ETH] = {
+   .copy_item = enic_copy_item_eth_v2,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_VXLAN,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_VLAN] = {
+   .copy_item = enic_copy_item_vlan_v2,

[dpdk-dev] [PATCH v4 1/8] net/enic: bring NIC interface functions up to date

2017-05-17 Thread John Daley
Update the base functions for the Cisco VIC. These files are mostly
common with other VIC drivers so they are left alone as much as possible.
Includes a new filter/action interface which is needed for Generic
Flow API PMD support. Update FDIR code to use the new interface.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---

checkpatch style warning about multiple use of 'x' in ARRAY_SIZE macro should
be waived in my opinion.

 drivers/net/enic/base/cq_enet_desc.h |  13 +++
 drivers/net/enic/base/vnic_dev.c | 162 ---
 drivers/net/enic/base/vnic_dev.h |   5 +-
 drivers/net/enic/base/vnic_devcmd.h  |  81 +-
 drivers/net/enic/enic_clsf.c |  16 ++--
 5 files changed, 238 insertions(+), 39 deletions(-)

diff --git a/drivers/net/enic/base/cq_enet_desc.h b/drivers/net/enic/base/cq_enet_desc.h
index f9822a450..e8410563a 100644
--- a/drivers/net/enic/base/cq_enet_desc.h
+++ b/drivers/net/enic/base/cq_enet_desc.h
@@ -71,6 +71,19 @@ struct cq_enet_rq_desc {
u8 type_color;
 };
 
+/* Completion queue descriptor: Ethernet receive queue, 16B */
+struct cq_enet_rq_clsf_desc {
+   __le16 completed_index_flags;
+   __le16 q_number_rss_type_flags;
+   __le16 filter_id;
+   __le16 lif;
+   __le16 bytes_written_flags;
+   __le16 vlan;
+   __le16 checksum_fcoe;
+   u8 flags;
+   u8 type_color;
+};
+
 #define CQ_ENET_RQ_DESC_FLAGS_INGRESS_PORT  (0x1 << 12)
 #define CQ_ENET_RQ_DESC_FLAGS_FCOE  (0x1 << 13)
 #define CQ_ENET_RQ_DESC_FLAGS_EOP   (0x1 << 14)
diff --git a/drivers/net/enic/base/vnic_dev.c b/drivers/net/enic/base/vnic_dev.c
index 84e4840af..7b3aed31a 100644
--- a/drivers/net/enic/base/vnic_dev.c
+++ b/drivers/net/enic/base/vnic_dev.c
@@ -387,17 +387,24 @@ static int _vnic_dev_cmd(struct vnic_dev *vdev, enum vnic_devcmd_cmd cmd,
 
 static int vnic_dev_cmd_proxy(struct vnic_dev *vdev,
enum vnic_devcmd_cmd proxy_cmd, enum vnic_devcmd_cmd cmd,
-   u64 *a0, u64 *a1, int wait)
+   u64 *args, int nargs, int wait)
 {
u32 status;
int err;
 
+   /*
+* Proxy command consumes 2 arguments. One for proxy index,
+* the other is for command to be proxied
+*/
+   if (nargs > VNIC_DEVCMD_NARGS - 2) {
+   pr_err("number of args %d exceeds the maximum\n", nargs);
+   return -EINVAL;
+   }
memset(vdev->args, 0, sizeof(vdev->args));
 
vdev->args[0] = vdev->proxy_index;
vdev->args[1] = cmd;
-   vdev->args[2] = *a0;
-   vdev->args[3] = *a1;
+   memcpy(&vdev->args[2], args, nargs * sizeof(args[0]));
 
err = _vnic_dev_cmd(vdev, proxy_cmd, wait);
if (err)
@@ -412,24 +419,26 @@ static int vnic_dev_cmd_proxy(struct vnic_dev *vdev,
return err;
}
 
-   *a0 = vdev->args[1];
-   *a1 = vdev->args[2];
+   memcpy(args, &vdev->args[1], nargs * sizeof(args[0]));
 
return 0;
 }
 
 static int vnic_dev_cmd_no_proxy(struct vnic_dev *vdev,
-   enum vnic_devcmd_cmd cmd, u64 *a0, u64 *a1, int wait)
+   enum vnic_devcmd_cmd cmd, u64 *args, int nargs, int wait)
 {
int err;
 
-   vdev->args[0] = *a0;
-   vdev->args[1] = *a1;
+   if (nargs > VNIC_DEVCMD_NARGS) {
+   pr_err("number of args %d exceeds the maximum\n", nargs);
+   return -EINVAL;
+   }
+   memset(vdev->args, 0, sizeof(vdev->args));
+   memcpy(vdev->args, args, nargs * sizeof(args[0]));
 
err = _vnic_dev_cmd(vdev, cmd, wait);
 
-   *a0 = vdev->args[0];
-   *a1 = vdev->args[1];
+   memcpy(args, vdev->args, nargs * sizeof(args[0]));
 
return err;
 }
@@ -455,24 +464,64 @@ void vnic_dev_cmd_proxy_end(struct vnic_dev *vdev)
 int vnic_dev_cmd(struct vnic_dev *vdev, enum vnic_devcmd_cmd cmd,
u64 *a0, u64 *a1, int wait)
 {
+   u64 args[2];
+   int err;
+
+   args[0] = *a0;
+   args[1] = *a1;
memset(vdev->args, 0, sizeof(vdev->args));
 
switch (vdev->proxy) {
case PROXY_BY_INDEX:
+   err =  vnic_dev_cmd_proxy(vdev, CMD_PROXY_BY_INDEX, cmd,
+   args, ARRAY_SIZE(args), wait);
+   break;
+   case PROXY_BY_BDF:
+   err =  vnic_dev_cmd_proxy(vdev, CMD_PROXY_BY_BDF, cmd,
+   args, ARRAY_SIZE(args), wait);
+   break;
+   case PROXY_NONE:
+   default:
+   err = vnic_dev_cmd_no_proxy(vdev, cmd, args, 2, wait);
+   break;
+   }
+
+   if (err == 0) {
+   *a0 = args[0];
+   *a1 = args[1];
+   }
+
+   return err;
+}
+
+int vnic_dev_cmd_args(struct vnic_dev *vdev, enum vnic_devcmd_cmd cmd,
+ u64 *args, int nargs, int wait)
+{
+   switch (vdev->proxy) {
+   case PROXY_BY_INDEX:
return vnic_dev_cmd_proxy(vdev, CMD_PROXY_BY_INDEX, cmd

[dpdk-dev] [PATCH v4 7/8] net/enic: flow API debug

2017-05-17 Thread John Daley
Added a debug function to print enic filters and actions when
rte_validate_flow is called. Compiled in when CONFIG_RTE_LIBRTE_ENIC_DEBUG_FLOW
is enabled and the log level is INFO.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 drivers/net/enic/enic_flow.c | 138 +++
 1 file changed, 138 insertions(+)

diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
index c20ff86b1..a728d0777 100644
--- a/drivers/net/enic/enic_flow.c
+++ b/drivers/net/enic/enic_flow.c
@@ -1097,6 +1097,142 @@ enic_get_action_cap(struct enic *enic)
ea = &enic_action_cap[FILTER_ACTION_RQ_STEERING_FLAG];
return ea;
 }
+
+/* Debug function to dump internal NIC action structure. */
+static void
+enic_dump_actions(const struct filter_action_v2 *ea)
+{
+   if (ea->type == FILTER_ACTION_RQ_STEERING) {
+   FLOW_LOG(INFO, "Action(V1), queue: %u\n", ea->rq_idx);
+   } else if (ea->type == FILTER_ACTION_V2) {
+   FLOW_LOG(INFO, "Actions(V2)\n");
+   if (ea->flags & FILTER_ACTION_RQ_STEERING_FLAG)
+   FLOW_LOG(INFO, "\tqueue: %u\n",
+  enic_sop_rq_idx_to_rte_idx(ea->rq_idx));
+   if (ea->flags & FILTER_ACTION_FILTER_ID_FLAG)
+   FLOW_LOG(INFO, "\tfilter_id: %u\n", ea->filter_id);
+   }
+}
+
+/* Debug function to dump internal NIC filter structure. */
+static void
+enic_dump_filter(const struct filter_v2 *filt)
+{
+   const struct filter_generic_1 *gp;
+   int i, j, mbyte;
+   char buf[128], *bp;
+   char ip4[16], ip6[16], udp[16], tcp[16], tcpudp[16], ip4csum[16];
+   char l4csum[16], ipfrag[16];
+
+   switch (filt->type) {
+   case FILTER_IPV4_5TUPLE:
+   FLOW_LOG(INFO, "FILTER_IPV4_5TUPLE\n");
+   break;
+   case FILTER_USNIC_IP:
+   case FILTER_DPDK_1:
+   /* FIXME: this should be a loop */
+   gp = &filt->u.generic_1;
+   FLOW_LOG(INFO, "Filter: vlan: 0x%04x, mask: 0x%04x\n",
+  gp->val_vlan, gp->mask_vlan);
+
+   if (gp->mask_flags & FILTER_GENERIC_1_IPV4)
+   sprintf(ip4, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_IPV4)
+? "ip4(y)" : "ip4(n)");
+   else
+   sprintf(ip4, "%s ", "ip4(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_IPV6)
+   sprintf(ip6, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_IPV4)
+? "ip6(y)" : "ip6(n)");
+   else
+   sprintf(ip6, "%s ", "ip6(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_UDP)
+   sprintf(udp, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_UDP)
+? "udp(y)" : "udp(n)");
+   else
+   sprintf(udp, "%s ", "udp(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_TCP)
+   sprintf(tcp, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_TCP)
+? "tcp(y)" : "tcp(n)");
+   else
+   sprintf(tcp, "%s ", "tcp(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_TCP_OR_UDP)
+   sprintf(tcpudp, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_TCP_OR_UDP)
+? "tcpudp(y)" : "tcpudp(n)");
+   else
+   sprintf(tcpudp, "%s ", "tcpudp(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_IP4SUM_OK)
+   sprintf(ip4csum, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_IP4SUM_OK)
+? "ip4csum(y)" : "ip4csum(n)");
+   else
+   sprintf(ip4csum, "%s ", "ip4csum(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_L4SUM_OK)
+   sprintf(l4csum, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_L4SUM_OK)
+? "l4csum(y)" : "l4csum(n)");
+   else
+   sprintf(l4csum, "%s ", "l4csum(x)");
+
+   if (gp->mask_flags & FILTER_GENERIC_1_IPFRAG)
+   sprintf(ipfrag, "%s ",
+   (gp->val_flags & FILTER_GENERIC_1_IPFRAG)
+? "ipfrag(y)" : "ipfrag(n)");
+   else
+   sprintf(ipfrag, "%s ", "ipfrag(x)");
+   FLOW_LOG(INFO, "\tFlags: %s%s%s%s%s%s%s%s\n", ip4, ip6, udp,
+tcp, tcpudp, ip4csum, l4csum, ipfrag);
+
+   for (i = 0; i < FILTER_GENERIC_1_NUM_LAYERS; i++) {
+   mbyte = FILTER_GENERIC_1_KEY_LEN - 1;
+ 

[dpdk-dev] [PATCH v4 6/8] net/enic: flow API for Legacy NICs

2017-05-17 Thread John Daley
5-tuple exact Flow support for 1200 series adapters. This allows:
Attributes: ingress
Items: ipv4, ipv6, udp, tcp (must exactly match src/dst IP
   addresses and ports and all must be specified).
Actions: queue and void
Selectors: 'is'

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---

copy functions return positive errnos because they are then used in
rte_flow_error_set() which takes a positive errno. checkpatch flags
these as warnings, but in my opinion the implementation should be
acceptable in this case.
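
Version-1 filters are perfect-match only: the supplied mask must equal the supported all-ones mask byte for byte, and both addresses must be fully specified. A self-contained sketch of that check — simplified structures and the name `ipv4_v1_usable` are hypothetical, not the enic code itself:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 only if the supplied mask matches the supported mask exactly. */
static int mask_exact_match(const uint8_t *supported, const uint8_t *supplied,
			    size_t size)
{
	return memcmp(supported, supplied, size) == 0;
}

/* Simplified stand-in for the IPv4 src/dst part of the 5-tuple key. */
struct ipv4_key {
	uint32_t src_addr;
	uint32_t dst_addr;
};

/* Accept a spec/mask pair for a v1 filter: both addresses must be set in
 * the spec, and the mask must be the full exact-match mask. */
static int ipv4_v1_usable(const struct ipv4_key *spec,
			  const struct ipv4_key *mask)
{
	const struct ipv4_key supported = { 0xffffffffu, 0xffffffffu };

	if (spec == NULL || spec->src_addr == 0 || spec->dst_addr == 0)
		return 0;
	return mask_exact_match((const uint8_t *)&supported,
				(const uint8_t *)mask, sizeof(supported));
}
```

Any partial mask (for example, dst_addr wildcarded) is rejected, which is why the legacy NICs only advertise the 'is' selector.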

 drivers/net/enic/enic_flow.c | 206 +--
 1 file changed, 201 insertions(+), 5 deletions(-)

diff --git a/drivers/net/enic/enic_flow.c b/drivers/net/enic/enic_flow.c
index a32e25ef9..c20ff86b1 100644
--- a/drivers/net/enic/enic_flow.c
+++ b/drivers/net/enic/enic_flow.c
@@ -89,6 +89,9 @@ struct enic_action_cap {
 };
 
 /* Forward declarations */
+static enic_copy_item_fn enic_copy_item_ipv4_v1;
+static enic_copy_item_fn enic_copy_item_udp_v1;
+static enic_copy_item_fn enic_copy_item_tcp_v1;
 static enic_copy_item_fn enic_copy_item_eth_v2;
 static enic_copy_item_fn enic_copy_item_vlan_v2;
 static enic_copy_item_fn enic_copy_item_ipv4_v2;
@@ -102,6 +105,36 @@ static copy_action_fn enic_copy_action_v1;
 static copy_action_fn enic_copy_action_v2;
 
 /**
+ * Legacy NICs or NICs with outdated firmware. Only 5-tuple perfect match
+ * is supported.
+ */
+static const struct enic_items enic_items_v1[] = {
+   [RTE_FLOW_ITEM_TYPE_IPV4] = {
+   .copy_item = enic_copy_item_ipv4_v1,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_UDP] = {
+   .copy_item = enic_copy_item_udp_v1,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_IPV4,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+   [RTE_FLOW_ITEM_TYPE_TCP] = {
+   .copy_item = enic_copy_item_tcp_v1,
+   .valid_start_item = 0,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_IPV4,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
+};
+
+/**
  * NICs have Advanced Filters capability but they are disabled. This means
  * that layer 3 must be specified.
  */
@@ -252,6 +285,9 @@ static const struct enic_items enic_items_v3[] = {
 
 /** Filtering capabilities indexed this NICs supported filter type. */
 static const struct enic_filter_cap enic_filter_cap[] = {
+   [FILTER_IPV4_5TUPLE] = {
+   .item_info = enic_items_v1,
+   },
[FILTER_USNIC_IP] = {
.item_info = enic_items_v2,
},
@@ -285,6 +321,171 @@ static const struct enic_action_cap enic_action_cap[] = {
.copy_fn = enic_copy_action_v2,
},
 };
+
+static int
+mask_exact_match(const u8 *supported, const u8 *supplied,
+unsigned int size)
+{
+   unsigned int i;
+   for (i = 0; i < size; i++) {
+   if (supported[i] != supplied[i])
+   return 0;
+   }
+   return 1;
+}
+
+/**
+ * Copy IPv4 item into version 1 NIC filter.
+ *
+ * @param item[in]
+ *   Item specification.
+ * @param enic_filter[out]
+ *   Partially filled in NIC filter structure.
+ * @param inner_ofst[in]
+ *   Should always be 0 for version 1.
+ */
+static int
+enic_copy_item_ipv4_v1(const struct rte_flow_item *item,
+  struct filter_v2 *enic_filter, u8 *inner_ofst)
+{
+   const struct rte_flow_item_ipv4 *spec = item->spec;
+   const struct rte_flow_item_ipv4 *mask = item->mask;
+   struct filter_ipv4_5tuple *enic_5tup = &enic_filter->u.ipv4;
+   struct ipv4_hdr supported_mask = {
+   .src_addr = 0xffffffff,
+   .dst_addr = 0xffffffff,
+   };
+
+   FLOW_TRACE();
+
+   if (*inner_ofst)
+   return ENOTSUP;
+
+   if (!mask)
+   mask = &rte_flow_item_ipv4_mask;
+
+   /* This is an exact match filter, both fields must be set */
+   if (!spec || !spec->hdr.src_addr || !spec->hdr.dst_addr) {
+   FLOW_LOG(ERR, "IPv4 exact match src/dst addr");
+   return ENOTSUP;
+   }
+
+   /* check that the supplied mask exactly matches capability */
+   if (!mask_exact_match((const u8 *)&supported_mask,
+ (const u8 *)item->mask, sizeof(*mask))) {
+   FLOW_LOG(ERR, "IPv4 exact match mask");
+   return ENOTSUP;
+   }
+
+   enic_filter->u.ipv4.flags = FILTER_FIELDS_IPV4_5TUPLE;
+   enic_5tup->src_addr = spec->hdr.src_addr;
+   enic_5tup->dst_addr = spec->hdr.dst_addr;
+
+   return 0;
+}
+
+/**
+ * Copy UDP item into version 1 NIC filter.

[dpdk-dev] [PATCH v4 8/8] net/enic: flow API documentation

2017-05-17 Thread John Daley
Update enic NIC guide, release notes and add flow API to the
supported features list.

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 doc/guides/nics/enic.rst   | 52 ++
 doc/guides/nics/features/enic.ini  |  1 +
 doc/guides/rel_notes/release_17_08.rst |  6 
 3 files changed, 59 insertions(+)

diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index 89a301585..cb5ae1250 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -213,6 +213,45 @@ or ``vfio`` in non-IOMMU mode.
Please see :ref:`Limitations <enic_limitations>` for limitations in
 the use of SR-IOV.
 
+.. _enic-genic-flow-api:
+
+Generic Flow API support
+
+
+Generic Flow API is supported. The baseline support is:
+
+- **1200 series VICs**
+
+  5-tuple exact Flow support for 1200 series adapters. This allows:
+
+  - Attributes: ingress
+  - Items: ipv4, ipv6, udp, tcp (must exactly match src/dst IP
+addresses and ports and all must be specified).
+  - Actions: queue and void
+  - Selectors: 'is'
+
+- **1300 series VICs with Advanced filters disabled**
+
+  With advanced filters disabled, an IPv4 or IPv6 item must be specified
+  in the pattern.
+
+  - Attributes: ingress
+  - Items: eth, ipv4, ipv6, udp, tcp, vxlan, inner eth, ipv4, ipv6, udp, tcp
+  - Actions: queue and void
+  - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
+  - In total, up to 64 bytes of mask is allowed across all headers
+
+- **1300 series VICs with Advanced filters enabled**
+
+  - Attributes: ingress
+  - Items: eth, ipv4, ipv6, udp, tcp, vxlan, inner eth, ipv4, ipv6, udp, tcp
+  - Actions: queue, mark, flag and void
+  - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
+  - In total, up to 64 bytes of mask is allowed across all headers
+
+More features may be added in future firmware and new versions of the VIC.
+Please refer to the release notes.
+
 .. _enic_limitations:
 
 Limitations
@@ -260,9 +299,21 @@ Limitations
   - The number of SR-IOV devices is limited to 256. Components on target system
 might limit this number to fewer than 256.
 
+- **Flow API**
+
+  - The number of filters that can be specified with the Generic Flow API is
+dependent on how many header fields are being masked. Use 'flow create' in
+a loop to determine how many filters your VIC will support (not more than
+1000 for 1300 series VICs). Filters are checked for matching in the order they
+were added. Since there currently is no grouping or priority support,
+'catch-all' filters should be added last.
+
 How to build the suite
 --
 
+The build instructions for the DPDK suite should be followed. By default
+the ENIC PMD library will be built into the DPDK library.
+
 Refer to the document :ref:`compiling and testing a PMD for a NIC
 ` for details.
 
@@ -313,6 +364,7 @@ Supported features
 - Scattered Rx
 - MTU update
 - SR-IOV on UCS managed servers connected to Fabric Interconnects.
+- Flow API
 
 Known bugs and unsupported features in this release
 ---
diff --git a/doc/guides/nics/features/enic.ini b/doc/guides/nics/features/enic.ini
index 94e7f3cba..0de3ef53c 100644
--- a/doc/guides/nics/features/enic.ini
+++ b/doc/guides/nics/features/enic.ini
@@ -20,6 +20,7 @@ VLAN filter  = Y
 CRC offload  = Y
 VLAN offload = Y
 Flow director= Y
+Flow API = Y
 L3 checksum offload  = Y
 L4 checksum offload  = Y
 Packet type parsing  = Y
diff --git a/doc/guides/rel_notes/release_17_08.rst b/doc/guides/rel_notes/release_17_08.rst
index 74aae10f7..e3a920438 100644
--- a/doc/guides/rel_notes/release_17_08.rst
+++ b/doc/guides/rel_notes/release_17_08.rst
@@ -41,6 +41,12 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
+* **Added Generic Flow API support to enic.**
+
+  Flow API support for outer Ethernet, VLAN, IPv4, IPv6, UDP, TCP, SCTP, VxLAN
+  and inner Ethernet, VLAN, IPv4, IPv6, UDP and TCP pattern items with QUEUE,
+  MARK, FLAG and VOID actions for ingress traffic.
+
 
 Resolved Issues
 ---
-- 
2.12.0



Re: [dpdk-dev] [PATCH] net/e1000: support MAC filters for i210 and i211 chips

2017-05-17 Thread Lu, Wenzhuo
Hi,


> -Original Message-
> From: Markus Theil [mailto:markus.th...@tu-ilmenau.de]
> Sent: Thursday, May 18, 2017 2:06 AM
> To: Lu, Wenzhuo
> Cc: dev@dpdk.org; Markus Theil
> Subject: [PATCH] net/e1000: support MAC filters for i210 and i211 chips
> 
> i210 and i211 also support unicast MAC filters.
> The patch was tested on i210-based hw; for i211, support was looked up in
> the specs.
> 
> Signed-off-by: Markus Theil 
Acked-by: Wenzhuo Lu 

Thanks for this patch.


Re: [dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times

2017-05-17 Thread Xing, Beilei
Hi,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Michael Lilja
> Sent: Wednesday, May 17, 2017 10:58 PM
> To: Zhang, Helin ; Wu, Jingjing
> 
> Cc: dev@dpdk.org; Michael Lilja 
> Subject: [dpdk-dev] [PATCH v8] net/i40e: improved FDIR programming times
> 
> Previously, the FDIR programming time was +11 ms on i40e.
> This patch results in an average programming time of 22 usec, with a max of
> 60 usec.
> 
> Signed-off-by: Michael Lilja 
> 
> ---
> v8:
> * Merged two defines into one handling max wait time
> 
> v7:
> * Code style changes
> 
> v6:
> * Fixed code style issues
> 
> v5:
> * Reinitialization of "i" inconsistent with original intent
> 
> v4:
> * Code style fix
> 
> v3:
> * Replaced commit message
> 
> v2:
> *  Code style fix
> 
> v1:
> * Initial version
> ---
> ---
>  drivers/net/i40e/i40e_fdir.c | 26 --
>  1 file changed, 12 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c index
> 28cc554f5..f94e1c3b8 100644
> --- a/drivers/net/i40e/i40e_fdir.c
> +++ b/drivers/net/i40e/i40e_fdir.c
> @@ -73,9 +73,8 @@
>  #define I40E_FDIR_IPv6_PAYLOAD_LEN  380
>  #define I40E_FDIR_UDP_DEFAULT_LEN   400
> 
> -/* Wait count and interval for fdir filter programming */
> -#define I40E_FDIR_WAIT_COUNT   10
> -#define I40E_FDIR_WAIT_INTERVAL_US 1000
> +/* Wait time for fdir filter programming */
> +#define I40E_FDIR_MAX_WAIT_US 10000
> 
>  /* Wait count and interval for fdir filter flush */
>  #define I40E_FDIR_FLUSH_RETRY   50
> @@ -1295,28 +1294,27 @@ i40e_fdir_filter_programming(struct i40e_pf *pf,
>   /* Update the tx tail register */
>   rte_wmb();
>   I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
> -
> - for (i = 0; i < I40E_FDIR_WAIT_COUNT; i++) {
> - rte_delay_us(I40E_FDIR_WAIT_INTERVAL_US);
> + for (i = 0; i < I40E_FDIR_MAX_WAIT_US; i++) {
>   if ((txdp->cmd_type_offset_bsz &
> 
>   rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) ==
> 
>   rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE))
>   break;
> + rte_delay_us(1);
>   }
> - if (i >= I40E_FDIR_WAIT_COUNT) {
> + if (i >= I40E_FDIR_MAX_WAIT_US) {
>   PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
>   " time out to get DD on tx queue.");
>   return -ETIMEDOUT;
>   }
>   /* totally delay 10 ms to check programming status*/
> - rte_delay_us((I40E_FDIR_WAIT_COUNT - i) *
> I40E_FDIR_WAIT_INTERVAL_US);
> - if (i40e_check_fdir_programming_status(rxq) < 0) {
> - PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> - " programming status reported.");
> - return -ENOSYS;
> + for (; i < I40E_FDIR_MAX_WAIT_US; i++) {
> + if (i40e_check_fdir_programming_status(rxq) >= 0)
> + return 0;
> + rte_delay_us(1);
>   }
> -
> - return 0;
> + PMD_DRV_LOG(ERR, "Failed to program FDIR filter:"
> + " programming status reported.");
> + return -ETIMEDOUT;
>  }
> 
>  /*
> --
> 2.12.2

Acked-by: Beilei Xing 
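
The patch above replaces two fixed 1 ms waits with per-microsecond polling against the same total budget, which is what brings the average programming time down to tens of microseconds. A generic, self-contained sketch of that pattern — `usleep(1)` stands in for `rte_delay_us(1)`, and the names `poll_until`/`after_n_calls` are hypothetical:

```c
#include <assert.h>
#include <unistd.h>

#define MAX_WAIT_US 10000 /* total budget, as in the patch's 10 ms */

/* Poll a completion predicate once per microsecond instead of sleeping in
 * coarse 1 ms steps; returns the number of extra polls used on success, or
 * -1 on timeout.  Worst-case wait is unchanged, but the common case returns
 * as soon as the hardware reports done. */
static int poll_until(int (*done)(void *), void *arg)
{
	int i;

	for (i = 0; i < MAX_WAIT_US; i++) {
		if (done(arg))
			return i;
		usleep(1);
	}
	return -1;
}

/* Example predicate: reports completion after a fixed number of polls. */
static int after_n_calls(void *arg)
{
	int *calls_left = arg;

	return --(*calls_left) <= 0;
}
```

The FDIR code applies this twice: first waiting for the DD bit on the tx descriptor, then spending whatever remains of the budget polling the programming status on the rx queue.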



Re: [dpdk-dev] [RFC] Enable primary/secondary model for vdev

2017-05-17 Thread Tan, Jianfeng
Any comments please?

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Tan, Jianfeng
> Sent: Thursday, May 11, 2017 2:20 PM
> To: dev@dpdk.org
> Cc: Thomas Monjalon; Richardson, Bruce; Ananyev, Konstantin; Stephen
> Hemminger; Yuanhan Liu; Yigit, Ferruh; Yang, Zhiyong;
> huangh...@meituan.com
> Subject: [dpdk-dev] [RFC] Enable primary/secondary model for vdev
> 
> Hi,
> 
> Status quo:
> Almost no vdev supports the primary/secondary model. Two exceptions are
> rte_ring and virtio-user (limited support). Two problems face this issue.
> 
> P1: How to attach vdev in secondary process?
> 
> Previous discussion: http://dpdk.org/ml/archives/dev/2017-
> April/063370.html
> According current implementation, vdev can be recognized on vdev bus
> only if it is explicitly assigned in the parameters, or invokes call
> rte_eal_vdev_init() to create one. Assigning --vdev parameter explicitly
> on secondary process is not the optimal way. Instead, we can iterate
> rte_eth_dev_data array to discover/obtain all devices, which seems a
> more unified solution with PCI devices.
> 
> If we only need to obtain the statistics of vdev in secondary process
> (like what dpdk-procinfo does), solving above issue is enough, but if we
> need to Rx/Tx packets in secondary process for vdev, we need to solve
> the following problem too.
> 
> P2: How to share dev-specific FDs to secondary process?
> 
> We need to share the FDs which are opened at primary process, for
> example, the socket file of pcap, the char device of tap, the QEMU
> memory region fds of vhost-user, etc. The only (as far as we know) way
> to share FD between processes in Linux is to use ancillary data of
> sendmsg/recvmsg on unix socket. That means we propose a new socket file
> created for each primary process, and secondary processes will connect
> with this unix socket; and for simplicity, we also propose a new API to
> share FD, so that vdev can make use of it to share FDs.
> 
> Another problem I'd like to make clear is: can secondary processes own
> ports that do not belong to the primary?
> 
> Besides, there are two errors in the documents: (1) rte_ring supports the
> primary/secondary model but the document does not state that; (2) pcap
> does not support the primary/secondary model but the document states that
> it does.
> 
> Thanks,
> Jianfeng
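For reference, the FD-sharing mechanism proposed above (SCM_RIGHTS ancillary data over a unix socket) can be sketched in plain POSIX C as below. This is only an illustrative sketch, not part of any proposed DPDK API; the helper names send_fd/recv_fd are hypothetical:

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one open file descriptor over a connected AF_UNIX socket
 * using SCM_RIGHTS ancillary data. Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
	struct msghdr msg;
	char data = 'F';	/* one dummy payload byte is required */
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	char cbuf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(cbuf, 0, sizeof(cbuf));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;	/* kernel duplicates the fd for the peer */
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a file descriptor sent with send_fd().
 * Returns the new descriptor, or -1 on error. */
static int recv_fd(int sock)
{
	struct msghdr msg;
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	char cbuf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the proposed model, the primary process would listen on its per-process unix socket and call send_fd() for each device-specific descriptor (pcap socket, tap char device, vhost-user memory region fds); the secondary process connects and calls recv_fd() to obtain a working duplicate of each descriptor.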



[dpdk-dev] [PATCH 00/23] bnxt patchset

2017-05-17 Thread Ajit Khaparde
This patchset, amongst other changes, adds support for a few more dev_ops,
updates HWRM to version 1.7.5, switches to polling stats from the
hardware, and adds support for LRO, etc.

  bnxt: add various hwrm input/output structures
  bnxt: code reorg to properly allocate resources in PF/VF modes
  bnxt: add tunneling support
  bnxt: support lack of huge pages
  bnxt: add functions for tx_loopback, set_vf_mac and queues_drop_en
  bnxt: add support for set VF QOS and MAC anti spoof
  bnxt: add support for VLAN stripq, VLAN anti spoof and VLAN filtering for VFs
  bnxt: add support to get and clear VF specific stats
  bnxt: add code to determine the Rx status of VF
  bnxt: add support to add a VF MAC address
  bnxt: add support for xstats get/reset
  bnxt: Add support for VLAN filter and strip dev_ops
  bnxt: add code to configure a default VF VLAN
  bnxt: add support for set_mc_addr_list and mac_addr_set
  bnxt: add support for fw_version_get dev_op
  bnxt: add support to set MTU
  bnxt: add support for LRO
  bnxt: add rxq_info_get and txq_info_get dev_ops
  bnxt: add additional HWRM debug info to error messages
  bnxt: reorg the query stats code
  bnxt: update to HWRM version 1.7.5
  bnxt: Add support to set VF rxmode
  bnxt: add code to support vlan_pvid_set dev_op

 drivers/net/bnxt/Makefile |4 +
 drivers/net/bnxt/bnxt.h   |  104 +-
 drivers/net/bnxt/bnxt_cpr.c   |  129 +-
 drivers/net/bnxt/bnxt_cpr.h   |   17 +
 drivers/net/bnxt/bnxt_ethdev.c|  818 ++-
 drivers/net/bnxt/bnxt_filter.c|   58 +-
 drivers/net/bnxt/bnxt_filter.h|3 +
 drivers/net/bnxt/bnxt_hwrm.c  | 1523 -
 drivers/net/bnxt/bnxt_hwrm.h  |   62 +-
 drivers/net/bnxt/bnxt_irq.c   |   21 +-
 drivers/net/bnxt/bnxt_ring.c  |  159 +-
 drivers/net/bnxt/bnxt_ring.h  |4 +-
 drivers/net/bnxt/bnxt_rxq.c   |   54 +-
 drivers/net/bnxt/bnxt_rxq.h   |3 +
 drivers/net/bnxt/bnxt_rxr.c   |  396 +-
 drivers/net/bnxt/bnxt_rxr.h   |   46 +
 drivers/net/bnxt/bnxt_stats.c |  261 +-
 drivers/net/bnxt/bnxt_stats.h |   10 +
 drivers/net/bnxt/bnxt_txr.c   |3 +-
 drivers/net/bnxt/bnxt_vnic.c  |   70 +-
 drivers/net/bnxt/bnxt_vnic.h  |   20 +-
 drivers/net/bnxt/hsi_struct_def_dpdk.h| 8633 -
 drivers/net/bnxt/rte_pmd_bnxt.c   |  701 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   |  322 ++
 drivers/net/bnxt/rte_pmd_bnxt_version.map |   22 +
 25 files changed, 10407 insertions(+), 3036 deletions(-)
 create mode 100644 drivers/net/bnxt/rte_pmd_bnxt.c
 create mode 100644 drivers/net/bnxt/rte_pmd_bnxt.h

-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH 03/23] bnxt: add tunneling support

2017-05-17 Thread Ajit Khaparde
Add support for udp_tunnel_port_add/del dev_ops to configure a UDP port
for VXLAN and Geneve Tunnel protocols.
The HWRM supports only one global destination port per tunnel type, so
use a reference counter to keep track of its usage.
Cache the configured VXLAN/Geneve ports and use that value to check
if the right UDP port is being freed up.
Skip calling bnxt_hwrm_tunnel_dst_port_alloc if the same UDP port is
being programmed.
Skip calling bnxt_hwrm_tunnel_dst_port_free if no UDP port has been
configured.
Also update tx offload capabilities.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|   6 +
 drivers/net/bnxt/bnxt_ethdev.c | 119 +-
 drivers/net/bnxt/bnxt_hwrm.c   |  56 +++
 drivers/net/bnxt/bnxt_hwrm.h   |   5 +
 drivers/net/bnxt/bnxt_txr.c|   3 +-
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 273 +
 6 files changed, 460 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 1522eb4..e85cbee 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -188,6 +188,12 @@ struct bnxt {
struct bnxt_pf_info pf;
uint8_t port_partition_type;
uint8_t dev_stopped;
+   uint8_t vxlan_port_cnt;
+   uint8_t geneve_port_cnt;
+   uint16_tvxlan_port;
+   uint16_tgeneve_port;
+   uint16_tvxlan_fw_dst_port_id;
+   uint16_tgeneve_fw_dst_port_id;
uint32_tfw_ver;
 };
 
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index a7f183c..18263a4 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -363,7 +363,12 @@ static void bnxt_dev_info_get_op(struct rte_eth_dev 
*eth_dev,
dev_info->tx_offload_capa = DEV_TX_OFFLOAD_IPV4_CKSUM |
DEV_TX_OFFLOAD_TCP_CKSUM |
DEV_TX_OFFLOAD_UDP_CKSUM |
-   DEV_TX_OFFLOAD_TCP_TSO;
+   DEV_TX_OFFLOAD_TCP_TSO |
+   DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM |
+   DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
+   DEV_TX_OFFLOAD_GRE_TNL_TSO |
+   DEV_TX_OFFLOAD_IPIP_TNL_TSO |
+   DEV_TX_OFFLOAD_GENEVE_TNL_TSO;
 
/* *INDENT-OFF* */
dev_info->default_rxconf = (struct rte_eth_rxconf) {
@@ -975,6 +980,116 @@ static int bnxt_flow_ctrl_set_op(struct rte_eth_dev *dev,
return bnxt_set_hwrm_link_config(bp, true);
 }
 
+/* Add UDP tunneling port */
+static int
+bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev,
+struct rte_eth_udp_tunnel *udp_tunnel)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   uint16_t tunnel_type = 0;
+   int rc = 0;
+
+   switch (udp_tunnel->prot_type) {
+   case RTE_TUNNEL_TYPE_VXLAN:
+   if (bp->vxlan_port_cnt) {
+   RTE_LOG(ERR, PMD, "Tunnel Port %d already programmed\n",
+   udp_tunnel->udp_port);
+   if (bp->vxlan_port != udp_tunnel->udp_port) {
+   RTE_LOG(ERR, PMD, "Only one port allowed\n");
+   return -ENOSPC;
+   }
+   bp->vxlan_port_cnt++;
+   return 0;
+   }
+   tunnel_type =
+   HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_VXLAN;
+   bp->vxlan_port_cnt++;
+   break;
+   case RTE_TUNNEL_TYPE_GENEVE:
+   if (bp->geneve_port_cnt) {
+   RTE_LOG(ERR, PMD, "Tunnel Port %d already programmed\n",
+   udp_tunnel->udp_port);
+   if (bp->geneve_port != udp_tunnel->udp_port) {
+   RTE_LOG(ERR, PMD, "Only one port allowed\n");
+   return -ENOSPC;
+   }
+   bp->geneve_port_cnt++;
+   return 0;
+   }
+   tunnel_type =
+   HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_GENEVE;
+   bp->geneve_port_cnt++;
+   break;
+   default:
+   RTE_LOG(ERR, PMD, "Tunnel type is not supported\n");
+   return -ENOTSUP;
+   }
+   rc = bnxt_hwrm_tunnel_dst_port_alloc(bp, udp_tunnel->udp_port,
+tunnel_type);
+   return rc;
+}
+
+static int
+bnxt_udp_tunnel_port_del_op(struct rte_eth_dev *eth_dev,
+struct rte_eth_udp_tunnel *udp_tunnel)
+{
+   struct bnxt *

[dpdk-dev] [PATCH 02/23] bnxt: code reorg to properly allocate resources in PF/VF modes

2017-05-17 Thread Ajit Khaparde
1) Move the function reset to bnxt_dev_init()
2) After a function reset, configure the VFs.  Distribute resources
evenly between all functions (PF and VF) for now. In the future, this
should be controllable.
3) Since PF/VF need to allocate resources from a pool in the hardware,
use func_qcaps and func_qcfg to appropriately query the capabilities
and available resources.
4) If a PF is being initialized and no VFs are allocated, explicitly
call func_cfg to allocate the resources.
5) The bnxt_vf_info and bnxt_pf_info had a lot of duplication. Move the
common items to struct bnxt. And only unique items specific to PF remain
in the struct bnxt_pf_info.
6) Once resources are requested from the firmware, update local copy
of resource count in struct bnxt only after sending the func_qcfg to
make sure the allocation request in the firmware went through.
7) Do not initialize the default completion ring in
bnxt_alloc_hwrm_rings().

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|  81 +++--
 drivers/net/bnxt/bnxt_cpr.c| 128 +--
 drivers/net/bnxt/bnxt_cpr.h|  17 +
 drivers/net/bnxt/bnxt_ethdev.c | 187 +++---
 drivers/net/bnxt/bnxt_filter.c |  36 +-
 drivers/net/bnxt/bnxt_hwrm.c   | 760 +
 drivers/net/bnxt/bnxt_hwrm.h   |  20 +-
 drivers/net/bnxt/bnxt_irq.c|  21 +-
 drivers/net/bnxt/bnxt_ring.c   |  17 +-
 drivers/net/bnxt/bnxt_vnic.c   |  47 +--
 drivers/net/bnxt/bnxt_vnic.h   |  11 +-
 11 files changed, 1045 insertions(+), 280 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 4418c7f..1522eb4 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -35,6 +35,7 @@
 #define _BNXT_H_
 
 #include 
+#include 
 #include 
 
 #include 
@@ -54,17 +55,19 @@ enum bnxt_hw_context {
HW_CONTEXT_IS_LB= 3,
 };
 
-struct bnxt_vf_info {
-   uint16_tfw_fid;
-   uint8_t mac_addr[ETHER_ADDR_LEN];
-   uint16_tmax_rsscos_ctx;
-   uint16_tmax_cp_rings;
-   uint16_tmax_tx_rings;
-   uint16_tmax_rx_rings;
-   uint16_tmax_l2_ctx;
-   uint16_tmax_vnics;
-   uint16_tvlan;
-   struct bnxt_pf_info *pf;
+struct bnxt_vlan_table_entry {
+   uint16_ttpid;
+   uint16_tvid;
+} __attribute__((packed));
+
+struct bnxt_child_vf_info {
+   void*req_buf;
+   struct bnxt_vlan_table_entry*vlan_table;
+   STAILQ_HEAD(, bnxt_filter_info) filter;
+   uint32_tfunc_cfg_flags;
+   uint32_tl2_rx_mask;
+   uint16_tfid;
+   boolrandom_mac;
 };
 
 struct bnxt_pf_info {
@@ -73,22 +76,20 @@ struct bnxt_pf_info {
 #define BNXT_FIRST_VF_FID  128
 #define BNXT_PF_RINGS_USED(bp) bnxt_get_num_queues(bp)
 #define BNXT_PF_RINGS_AVAIL(bp)(bp->pf.max_cp_rings - 
BNXT_PF_RINGS_USED(bp))
-   uint32_tfw_fid;
uint8_t port_id;
-   uint8_t mac_addr[ETHER_ADDR_LEN];
-   uint16_tmax_rsscos_ctx;
-   uint16_tmax_cp_rings;
-   uint16_tmax_tx_rings;
-   uint16_tmax_rx_rings;
-   uint16_tmax_l2_ctx;
-   uint16_tmax_vnics;
uint16_tfirst_vf_id;
uint16_tactive_vfs;
uint16_tmax_vfs;
+   uint32_tfunc_cfg_flags;
void*vf_req_buf;
phys_addr_t vf_req_buf_dma_addr;
uint32_tvf_req_fwd[8];
-   struct bnxt_vf_info *vf;
+   uint16_ttotal_vnics;
+   struct bnxt_child_vf_info   *vf_info;
+#define BNXT_EVB_MODE_NONE 0
+#define BNXT_EVB_MODE_VEB  1
+#define BNXT_EVB_MODE_VEPA 2
+   uint8_t evb_mode;
 };
 
 /* Max wait time is 10 * 100ms = 1s */
@@ -174,12 +175,50 @@ struct bnxt {
struct bnxt_link_info   link_info;
struct bnxt_cos_queue_info  cos_queue[BNXT_COS_QUEUE_COUNT];
 
+   uint16_tfw_fid;
+   uint8_t dflt_mac_addr[ETHER_ADDR_LEN];
+   uint16_tmax_rsscos_ctx;
+   uint16_tmax_cp_rings;
+   uint16_tmax_tx_rings;
+   uint16_tmax_rx_rings;
+   uint16_tmax_l2_ctx;
+   uint16_tmax_vnics;
+   uint16_tmax_stat_ctx;
+   uint16_tvlan;
struct bnxt_pf_info pf;
-   struct bnxt_vf_info vf;
uint8_t port_partition_type;
uint8_t dev_stopped;
+   uint32_tfw_ver;
+};
+
+/*
+ * Response sent back to the caller afte

[dpdk-dev] [PATCH 01/23] bnxt: add various hwrm input/output structures

2017-05-17 Thread Ajit Khaparde
This patch adds various hwrm API which allows the configuration
and allocation of resources for a PCI function depending on whether
it is a PF or a VF. These structures will be used in subsequent
patches.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 1497 ++--
 1 file changed, 1417 insertions(+), 80 deletions(-)

diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h 
b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index f024837..a091a0b 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -85,20 +85,63 @@ struct ctx_hw_stats64 {
  * Request types
  */
 #define HWRM_VER_GET   (UINT32_C(0x0))
+#define HWRM_FUNC_BUF_UNRGTR   (UINT32_C(0xe))
+#define HWRM_FUNC_VF_CFG   (UINT32_C(0xf))
+/* Reserved for future use */
+#define RESERVED1  (UINT32_C(0x10))
 #define HWRM_FUNC_RESET(UINT32_C(0x11))
+#define HWRM_FUNC_GETFID   (UINT32_C(0x12))
+#define HWRM_FUNC_VF_ALLOC (UINT32_C(0x13))
+#define HWRM_FUNC_VF_FREE  (UINT32_C(0x14))
 #define HWRM_FUNC_QCAPS(UINT32_C(0x15))
 #define HWRM_FUNC_QCFG (UINT32_C(0x16))
+#define HWRM_FUNC_CFG  (UINT32_C(0x17))
+#define HWRM_FUNC_QSTATS   (UINT32_C(0x18))
+#define HWRM_FUNC_CLR_STATS(UINT32_C(0x19))
 #define HWRM_FUNC_DRV_UNRGTR   (UINT32_C(0x1a))
+#define HWRM_FUNC_VF_RESC_FREE (UINT32_C(0x1b))
+#define HWRM_FUNC_VF_VNIC_IDS_QUERY(UINT32_C(0x1c))
 #define HWRM_FUNC_DRV_RGTR (UINT32_C(0x1d))
+#define HWRM_FUNC_DRV_QVER (UINT32_C(0x1e))
+#define HWRM_FUNC_BUF_RGTR (UINT32_C(0x1f))
 #define HWRM_PORT_PHY_CFG  (UINT32_C(0x20))
+#define HWRM_PORT_MAC_CFG  (UINT32_C(0x21))
+#define HWRM_PORT_QSTATS   (UINT32_C(0x23))
+#define HWRM_PORT_LPBK_QSTATS  (UINT32_C(0x24))
 #define HWRM_PORT_PHY_QCFG (UINT32_C(0x27))
+#define HWRM_PORT_MAC_QCFG (UINT32_C(0x28))
+#define HWRM_PORT_PHY_QCAPS(UINT32_C(0x2a))
+#define HWRM_PORT_LED_CFG  (UINT32_C(0x2d))
+#define HWRM_PORT_LED_QCFG (UINT32_C(0x2e))
+#define HWRM_PORT_LED_QCAPS(UINT32_C(0x2f))
 #define HWRM_QUEUE_QPORTCFG(UINT32_C(0x30))
+#define HWRM_QUEUE_QCFG(UINT32_C(0x31))
+#define HWRM_QUEUE_CFG (UINT32_C(0x32))
+/* Reserved for future use */
+#define RESERVED2  (UINT32_C(0x33))
+/* Reserved for future use */
+#define RESERVED3  (UINT32_C(0x34))
+#define HWRM_QUEUE_PFCENABLE_QCFG  (UINT32_C(0x35))
+#define HWRM_QUEUE_PFCENABLE_CFG   (UINT32_C(0x36))
+#define HWRM_QUEUE_PRI2COS_QCFG(UINT32_C(0x37))
+#define HWRM_QUEUE_PRI2COS_CFG (UINT32_C(0x38))
+#define HWRM_QUEUE_COS2BW_QCFG (UINT32_C(0x39))
+#define HWRM_QUEUE_COS2BW_CFG  (UINT32_C(0x3a))
+#define HWRM_VNIC_ALLOC(UINT32_C(0x40))
 #define HWRM_VNIC_ALLOC(UINT32_C(0x40))
 #define HWRM_VNIC_FREE (UINT32_C(0x41))
 #define HWRM_VNIC_CFG  (UINT32_C(0x42))
+#define HWRM_VNIC_QCFG (UINT32_C(0x43))
+#define HWRM_VNIC_TPA_CFG  (UINT32_C(0x44))
 #define HWRM_VNIC_RSS_CFG  (UINT32_C(0x46))
+#define HWRM_VNIC_RSS_QCFG (UINT32_C(0x47))
+#define HWRM_VNIC_PLCMODES_CFG (UINT32_C(0x48))
+#define HWRM_VNIC_PLCMODES_QCFG(UINT32_C(0x49))
 #define HWRM_RING_ALLOC(UINT32_C(0x50))
 #define HWRM_RING_FREE (UINT32_C(0x51))
+#define HWRM_RING_CMPL_RING_QAGGINT_PARAMS (UINT32_C(0x52))
+#define HWRM_RING_CMPL_RING_CFG_AGGINT_PARAM   (UINT32_C(0x53))
+#define HWRM_RING_RESET(UINT32_C(0x5e))
 #define HWRM_RING_GRP_ALLOC(UINT32_C(0x60))
 #define HWRM_RING_GRP_FREE (UINT32_C(0x61))
 #define HWRM_VNIC_RSS_COS_LB_CTX_ALLOC (UINT32_C(0x70))
@@ -107,10 +150,46 @@ struct ctx_hw_stats64 {
 #define HWRM_CFA_L2_FILTER_FREE(UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
 #define HWRM_CFA_L2_SET_RX_MASK(UINT32_C(0x93))
+/* Reserved for future use */
+#define RESERVED4  (UINT32_C(0x94))
+#define HWRM_CFA_TUNNEL_FILTER_ALLOC   (UINT32_C(0x95))
+#define HWRM_CFA_TUNNEL_FILTER_FREE(UINT32_C(0x96))
+#define HWRM_CFA_NTUPLE_FILTER_ALLOC   (UINT32_C(0x99))
+#define HWRM_CFA_NTUPLE_FILTER_FREE(UINT32_C(0x9a))
+#define HWRM_CFA_NTUPLE_FILTER_CFG (UINT32_C(0x9b))
+#define HWRM_TUNNEL_DST_PORT_QUERY (UINT32_C(0xa0))
+#define HWRM_TUNNEL_DST_PORT_ALLOC (UINT32_C(0xa1))
+#define HWRM_TUNNEL_DST_PORT_FREE  (UINT32_C(0xa2))
 #define HWRM_STAT_CTX_ALLOC(UINT32_C(0xb0))
 #define H

[dpdk-dev] [PATCH 04/23] bnxt: support lack of huge pages

2017-05-17 Thread Ajit Khaparde
rte_malloc_virt2phy() does not return a physical address if huge pages
aren't in use.  Further, rte_memzone->phys_addr is not a physical address.

Use rte_mem_virt2phy() and manually lock pages to support lack of
huge pages.

Also check the return value of rte_mem_virt2phy()

Verify the function returns an address. Otherwise return an error and
log a message.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c | 17 +++--
 drivers/net/bnxt/bnxt_ring.c | 31 +--
 drivers/net/bnxt/bnxt_vnic.c | 16 +++-
 3 files changed, 55 insertions(+), 9 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index c8abbcb..db00a74 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -487,8 +487,15 @@ int bnxt_hwrm_ver_get(struct bnxt *bp)
rc = -ENOMEM;
goto error;
}
+   rte_mem_lock_page(bp->hwrm_cmd_resp_addr);
bp->hwrm_cmd_resp_dma_addr =
-   rte_malloc_virt2phy(bp->hwrm_cmd_resp_addr);
+   rte_mem_virt2phy(bp->hwrm_cmd_resp_addr);
+   if (bp->hwrm_cmd_resp_dma_addr == 0) {
+   RTE_LOG(ERR, PMD,
+   "Unable to map response buffer to physical memory.\n");
+   rc = -ENOMEM;
+   goto error;
+   }
bp->max_resp_len = max_resp_len;
}
 
@@ -1361,10 +1368,16 @@ int bnxt_alloc_hwrm_resources(struct bnxt *bp)
bp->max_req_len = HWRM_MAX_REQ_LEN;
bp->max_resp_len = HWRM_MAX_RESP_LEN;
bp->hwrm_cmd_resp_addr = rte_malloc(type, bp->max_resp_len, 0);
+   rte_mem_lock_page(bp->hwrm_cmd_resp_addr);
if (bp->hwrm_cmd_resp_addr == NULL)
return -ENOMEM;
bp->hwrm_cmd_resp_dma_addr =
-   rte_malloc_virt2phy(bp->hwrm_cmd_resp_addr);
+   rte_mem_virt2phy(bp->hwrm_cmd_resp_addr);
+   if (bp->hwrm_cmd_resp_dma_addr == 0) {
+   RTE_LOG(ERR, PMD,
+   "unable to map response address to physical memory\n");
+   return -ENOMEM;
+   }
rte_spinlock_init(&bp->hwrm_lock);
 
return 0;
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 0ad7810..642ec53 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -32,6 +32,7 @@
  */
 
 #include 
+#include 
 
 #include "bnxt.h"
 #include "bnxt_cpr.h"
@@ -96,6 +97,8 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
struct rte_pci_device *pdev = bp->pdev;
const struct rte_memzone *mz = NULL;
char mz_name[RTE_MEMZONE_NAMESIZE];
+   phys_addr_t mz_phys_addr;
+   int sz;
 
int stats_len = (tx_ring_info || rx_ring_info) ?
RTE_CACHE_LINE_ROUNDUP(sizeof(struct ctx_hw_stats64)) : 0;
@@ -136,21 +139,37 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
mz_name[RTE_MEMZONE_NAMESIZE - 1] = 0;
mz = rte_memzone_lookup(mz_name);
if (!mz) {
-   mz = rte_memzone_reserve(mz_name, total_alloc_len,
+   mz = rte_memzone_reserve_aligned(mz_name, total_alloc_len,
 SOCKET_ID_ANY,
 RTE_MEMZONE_2MB |
-RTE_MEMZONE_SIZE_HINT_ONLY);
+RTE_MEMZONE_SIZE_HINT_ONLY,
+getpagesize());
if (mz == NULL)
return -ENOMEM;
}
memset(mz->addr, 0, mz->len);
+   mz_phys_addr = mz->phys_addr;
+   if ((unsigned long)mz->addr == mz_phys_addr) {
+   RTE_LOG(WARNING, PMD,
+   "Memzone physical address same as virtual.\n");
+   RTE_LOG(WARNING, PMD,
+   "Using rte_mem_virt2phy()\n");
+   for (sz = 0; sz < total_alloc_len; sz += getpagesize())
+   rte_mem_lock_page(((char *)mz->addr) + sz);
+   mz_phys_addr = rte_mem_virt2phy(mz->addr);
+   if (mz_phys_addr == 0) {
+   RTE_LOG(ERR, PMD,
+   "unable to map ring address to physical memory\n");
+   return -ENOMEM;
+   }
+   }
 
if (tx_ring_info) {
tx_ring = tx_ring_info->tx_ring_struct;
 
tx_ring->bd = ((char *)mz->addr + tx_ring_start);
tx_ring_info->tx_desc_ring = (struct tx_bd_long *)tx_ring->bd;
-   tx_ring->bd_dma = mz->phys_addr + tx_ring_start;
+   tx_ring->bd_dma = mz_phys_addr + tx_ring_start;
tx_ring_info->tx_desc_mapping = tx_ring->bd_dma;
tx_ring->mem_zone = (const void *)mz;
 
@@ -170,7 +189,7 @@ int bnxt_alloc_rings(struct bnxt *bp, ui

[dpdk-dev] [PATCH 05/23] bnxt: add functions for tx_loopback, set_vf_mac and queues_drop_en

2017-05-17 Thread Ajit Khaparde
Add functions rte_pmd_bnxt_set_tx_loopback,
rte_pmd_bnxt_set_all_queues_drop_en and
rte_pmd_bnxt_set_vf_mac_addr to configure tx_loopback,
queue_drop and VF MAC address setting in the hardware.
It also adds the necessary functions to send the HWRM commands
to the firmware.

Signed-off-by: Steeven Li 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/Makefile |   4 +
 drivers/net/bnxt/bnxt_hwrm.c  | 108 
 drivers/net/bnxt/bnxt_hwrm.h  |   4 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h|  79 +++
 drivers/net/bnxt/rte_pmd_bnxt.c   | 163 ++
 drivers/net/bnxt/rte_pmd_bnxt.h   |  87 
 drivers/net/bnxt/rte_pmd_bnxt_version.map |   9 ++
 7 files changed, 454 insertions(+)
 create mode 100644 drivers/net/bnxt/rte_pmd_bnxt.c
 create mode 100644 drivers/net/bnxt/rte_pmd_bnxt.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 0fffe35..b03f65d 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -38,6 +38,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
 #
 LIB = librte_pmd_bnxt.a
 
+EXPORT_MAP := rte_pmd_bnxt_version.map
+
 LIBABIVER := 1
 
 CFLAGS += -O3
@@ -60,10 +62,12 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txr.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_irq.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += rte_pmd_bnxt.c
 
 #
 # Export include files
 #
 SYMLINK-y-include +=
+SYMLINK-$(CONFIG_RTE_LIBRTE_BNXT_PMD)-include := rte_pmd_bnxt.h
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index db00a74..c1cf95d 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2071,6 +2071,24 @@ int bnxt_hwrm_allocate_vfs(struct bnxt *bp, int num_vfs)
return rc;
 }
 
+int bnxt_hwrm_pf_evb_mode(struct bnxt *bp)
+{
+   struct hwrm_func_cfg_input req = {0};
+   struct hwrm_func_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+   int rc;
+
+   HWRM_PREP(req, FUNC_CFG, -1, resp);
+
+   req.fid = rte_cpu_to_le_16(0x);
+   req.enables = rte_cpu_to_le_32(HWRM_FUNC_CFG_INPUT_ENABLES_EVB_MODE);
+   req.evb_mode = bp->pf.evb_mode;
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_tunnel_dst_port_alloc(struct bnxt *bp, uint16_t port,
uint8_t tunnel_type)
 {
@@ -2219,3 +2237,93 @@ int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, uint16_t 
target_id,
 
return rc;
 }
+
+static int bnxt_hwrm_func_vf_vnic_query(struct bnxt *bp, uint16_t vf,
+   uint16_t *vnic_ids)
+{
+   struct hwrm_func_vf_vnic_ids_query_input req = {0};
+   struct hwrm_func_vf_vnic_ids_query_output *resp =
+   bp->hwrm_cmd_resp_addr;
+   int rc;
+
+   /* First query all VNIC ids */
+   HWRM_PREP(req, FUNC_VF_VNIC_IDS_QUERY, -1, resp_vf_vnic_ids);
+
+   req.vf_id = rte_cpu_to_le_16(bp->pf.first_vf_id + vf);
+   req.max_vnic_id_cnt = rte_cpu_to_le_32(bp->pf.total_vnics);
+   req.vnic_id_tbl_addr = rte_cpu_to_le_64(rte_mem_virt2phy(vnic_ids));
+
+   if (req.vnic_id_tbl_addr == 0) {
+   RTE_LOG(ERR, PMD,
+   "unable to map VNIC ID table address to physical memory\n");
+   //rte_free(vnic_ids);
+   return -ENOMEM;
+   }
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   if (rc) {
+   RTE_LOG(ERR, PMD, "hwrm_func_vf_vnic_query failed rc:%d\n", rc);
+   return -1;
+   } else if (resp->error_code) {
+   rc = rte_le_to_cpu_16(resp->error_code);
+   RTE_LOG(ERR, PMD, "hwrm_func_vf_vnic_query error %d\n", rc);
+   return -1;
+   }
+
+   return rte_le_to_cpu_32(resp->vnic_id_cnt);
+}
+
+/*
+ * This function queries the VNIC IDs  for a specified VF. It then calls
+ * the vnic_cb to update the necessary field in vnic_info with cbdata.
+ * Then it calls the hwrm_cb function to program this new vnic configuration.
+ */
+int bnxt_hwrm_func_vf_vnic_query_and_config(struct bnxt *bp, uint16_t vf,
+   void (*vnic_cb)(struct bnxt_vnic_info *, void *), void *cbdata,
+   int (*hwrm_cb)(struct bnxt *bp, struct bnxt_vnic_info *vnic))
+{
+   struct bnxt_vnic_info vnic;
+   int rc = 0;
+   int i, num_vnic_ids;
+   uint16_t *vnic_ids;
+   size_t vnic_id_sz;
+   size_t sz;
+
+   /* First query all VNIC ids */
+   vnic_id_sz = bp->pf.total_vnics * sizeof(*vnic_ids);
+   vnic_ids = rte_malloc("bnxt_hwrm_vf_vnic_ids_query", vnic_id_sz,
+   RTE_CACHE_LINE_SIZE);
+   if (vnic_ids == NULL) {
+   rc = -ENOMEM;
+   return rc;
+   }
+   for (sz = 0; sz < vnic_id_sz; 

[dpdk-dev] [PATCH 06/23] bnxt: add support for set VF QOS and MAC anti spoof

2017-05-17 Thread Ajit Khaparde
This patch adds support to
1) enable VF MAC anti spoof.
2) QOS configuration for specified VF.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h   |  2 +
 drivers/net/bnxt/bnxt_hwrm.c  | 33 +++
 drivers/net/bnxt/bnxt_hwrm.h  |  3 +
 drivers/net/bnxt/rte_pmd_bnxt.c   | 95 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   | 37 
 drivers/net/bnxt/rte_pmd_bnxt_version.map |  2 +
 6 files changed, 172 insertions(+)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index e85cbee..8db2ed4 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -67,6 +67,8 @@ struct bnxt_child_vf_info {
uint32_tfunc_cfg_flags;
uint32_tl2_rx_mask;
uint16_tfid;
+   uint16_tmax_tx_rate;
+   uint8_t mac_spoof_en;
boolrandom_mac;
 };
 
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index c1cf95d..88704b7 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2133,6 +2133,21 @@ int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, 
uint16_t port,
return rc;
 }
 
+int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, uint16_t vf)
+{
+   struct hwrm_func_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_func_cfg_input req = {0};
+   int rc;
+
+   HWRM_PREP(req, FUNC_CFG, -1, resp);
+   req.fid = rte_cpu_to_le_16(bp->pf.vf_info[vf].fid);
+   req.flags = rte_cpu_to_le_32(bp->pf.vf_info[vf].func_cfg_flags);
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_func_buf_rgtr(struct bnxt *bp)
 {
int rc = 0;
@@ -2194,6 +2209,24 @@ int bnxt_hwrm_func_cfg_def_cp(struct bnxt *bp)
return rc;
 }
 
+int bnxt_hwrm_func_bw_cfg(struct bnxt *bp, uint16_t vf,
+   uint16_t max_bw, uint16_t enables)
+{
+   struct hwrm_func_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_func_cfg_input req = {0};
+   int rc;
+
+   HWRM_PREP(req, FUNC_CFG, -1, resp);
+   req.fid = rte_cpu_to_le_16(bp->pf.vf_info[vf].fid);
+   req.enables |= rte_cpu_to_le_32(enables);
+   req.flags = rte_cpu_to_le_32(bp->pf.vf_info[vf].func_cfg_flags);
+   req.max_bw = rte_cpu_to_le_32(max_bw);
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_reject_fwd_resp(struct bnxt *bp, uint16_t target_id,
  void *encaped, size_t ec_size)
 {
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 09b935c..857772b 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -112,6 +112,8 @@ int bnxt_hwrm_allocate_vfs(struct bnxt *bp, int num_vfs);
 int bnxt_hwrm_func_vf_mac(struct bnxt *bp, uint16_t vf,
  const uint8_t *mac_addr);
 int bnxt_hwrm_pf_evb_mode(struct bnxt *bp);
+int bnxt_hwrm_func_bw_cfg(struct bnxt *bp, uint16_t vf,
+   uint16_t max_bw, uint16_t enables);
 int bnxt_hwrm_func_qcfg_vf_default_mac(struct bnxt *bp, uint16_t vf,
   struct ether_addr *mac);
 int bnxt_hwrm_tunnel_dst_port_alloc(struct bnxt *bp, uint16_t port,
@@ -119,6 +121,7 @@ int bnxt_hwrm_tunnel_dst_port_alloc(struct bnxt *bp, 
uint16_t port,
 int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, uint16_t port,
uint8_t tunnel_type);
 void bnxt_free_tunnel_ports(struct bnxt *bp);
+int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, uint16_t vf);
 int bnxt_hwrm_func_vf_vnic_query_and_config(struct bnxt *bp, uint16_t vf,
void (*vnic_cb)(struct bnxt_vnic_info *, void *), void *cbdata,
int (*hwrm_cb)(struct bnxt *bp, struct bnxt_vnic_info *vnic));
diff --git a/drivers/net/bnxt/rte_pmd_bnxt.c b/drivers/net/bnxt/rte_pmd_bnxt.c
index 667b24c..9d331eb 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt.c
+++ b/drivers/net/bnxt/rte_pmd_bnxt.c
@@ -161,3 +161,98 @@ int rte_pmd_bnxt_set_vf_mac_addr(uint8_t port, uint16_t vf,
 
return rc;
 }
+
+int rte_pmd_bnxt_set_vf_rate_limit(uint8_t port, uint16_t vf,
+   uint16_t tx_rate, uint64_t q_msk)
+{
+   struct rte_eth_dev *eth_dev;
+   struct rte_eth_dev_info dev_info;
+   struct bnxt *bp;
+   uint16_t tot_rate = 0;
+   uint64_t idx;
+   int rc;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+   eth_dev = &rte_eth_devices[port];
+   rte_eth_dev_info_get(port, &dev_info);
+   bp = (struct bnxt *)eth_dev->data->dev_private;
+
+   if (!bp->pf.active_vfs)
+   return -EINVAL;
+
+   if (vf >= bp->pf.max_vfs)
+   return -EINVAL;
+
+   /* Add up the per queue BW and configure MAX BW of the VF */
+   for (idx = 0; i

[dpdk-dev] [PATCH 07/23] bnxt: add support for VLAN stripq, VLAN anti spoof and VLAN filtering for VFs

2017-05-17 Thread Ajit Khaparde
This patch adds support for VF VLAN stripq, VF VLAN anti spoof and
VF VLAN filtering.
The VF VLAN filtering needs the VLAN anti spoof setting to be set first
before the command to program the VLAN table is sent to the firmware.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h   |   2 +
 drivers/net/bnxt/bnxt_ethdev.c|  12 +--
 drivers/net/bnxt/bnxt_filter.h|   2 +
 drivers/net/bnxt/bnxt_hwrm.c  |  89 ++-
 drivers/net/bnxt/bnxt_hwrm.h  |   9 +-
 drivers/net/bnxt/rte_pmd_bnxt.c   | 174 ++
 drivers/net/bnxt/rte_pmd_bnxt.h   |  60 +++
 drivers/net/bnxt/rte_pmd_bnxt_version.map |   3 +
 8 files changed, 339 insertions(+), 12 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 8db2ed4..5f458cd 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -68,7 +68,9 @@ struct bnxt_child_vf_info {
uint32_tl2_rx_mask;
uint16_tfid;
uint16_tmax_tx_rate;
+   uint16_tvlan_count;
uint8_t mac_spoof_en;
+   uint8_t vlan_spoof_en;
boolrandom_mac;
 };
 
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 18263a4..fcb7679 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -278,7 +278,7 @@ static int bnxt_init_chip(struct bnxt *bp)
}
}
}
-   rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, &bp->vnic_info[0]);
+   rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, &bp->vnic_info[0], 0, NULL);
if (rc) {
RTE_LOG(ERR, PMD,
"HWRM cfa l2 rx mask failure rc: %x\n", rc);
@@ -628,7 +628,7 @@ static int bnxt_mac_addr_add_op(struct rte_eth_dev *eth_dev,
STAILQ_INSERT_TAIL(&vnic->filter, filter, next);
filter->mac_index = index;
memcpy(filter->l2_addr, mac_addr, ETHER_ADDR_LEN);
-   return bnxt_hwrm_set_filter(bp, vnic, filter);
+   return bnxt_hwrm_set_filter(bp, vnic->fw_vnic_id, filter);
 }
 
 int bnxt_link_update_op(struct rte_eth_dev *eth_dev, int wait_to_complete)
@@ -677,7 +677,7 @@ static void bnxt_promiscuous_enable_op(struct rte_eth_dev 
*eth_dev)
vnic = &bp->vnic_info[0];
 
vnic->flags |= BNXT_VNIC_INFO_PROMISC;
-   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic, 0, NULL);
 }
 
 static void bnxt_promiscuous_disable_op(struct rte_eth_dev *eth_dev)
@@ -691,7 +691,7 @@ static void bnxt_promiscuous_disable_op(struct rte_eth_dev 
*eth_dev)
vnic = &bp->vnic_info[0];
 
vnic->flags &= ~BNXT_VNIC_INFO_PROMISC;
-   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic, 0, NULL);
 }
 
 static void bnxt_allmulticast_enable_op(struct rte_eth_dev *eth_dev)
@@ -705,7 +705,7 @@ static void bnxt_allmulticast_enable_op(struct rte_eth_dev 
*eth_dev)
vnic = &bp->vnic_info[0];
 
vnic->flags |= BNXT_VNIC_INFO_ALLMULTI;
-   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic, 0, NULL);
 }
 
 static void bnxt_allmulticast_disable_op(struct rte_eth_dev *eth_dev)
@@ -719,7 +719,7 @@ static void bnxt_allmulticast_disable_op(struct rte_eth_dev 
*eth_dev)
vnic = &bp->vnic_info[0];
 
vnic->flags &= ~BNXT_VNIC_INFO_ALLMULTI;
-   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic, 0, NULL);
 }
 
 static int bnxt_reta_update_op(struct rte_eth_dev *eth_dev,
diff --git a/drivers/net/bnxt/bnxt_filter.h b/drivers/net/bnxt/bnxt_filter.h
index 06fe134..353b7f7 100644
--- a/drivers/net/bnxt/bnxt_filter.h
+++ b/drivers/net/bnxt/bnxt_filter.h
@@ -63,6 +63,8 @@ struct bnxt_filter_info {
uint32_tvni;
uint8_t pri_hint;
uint64_tl2_filter_id_hint;
+   uint32_tsrc_id;
+   uint8_t src_type;
 };
 
 struct bnxt_filter_info *bnxt_alloc_filter(struct bnxt *bp);
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 88704b7..05f7034 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -196,7 +196,10 @@ int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp, struct 
bnxt_vnic_info *vnic)
return rc;
 }
 
-int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp,
+struct bnxt_vnic_info *vnic,
+uint16_t vlan_count,
+struct bnxt_vlan_table_entry *vlan_table)
 {
int rc = 0;
struct hwrm_cfa_l2_set_rx_mask_input req = {.req_type = 0 };
@@ -215,6 +218,12 @@ int bnxt_hwrm_cfa_l2_set_rx_mask(s

[dpdk-dev] [PATCH 08/23] bnxt: add support to get and clear VF specific stats

2017-05-17 Thread Ajit Khaparde
This patch adds code to get and clear VF stats.
It also adds the necessary HWRM structures to send the command
to the firmware.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  75 
 drivers/net/bnxt/bnxt_hwrm.h   |   5 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 217 +
 drivers/net/bnxt/rte_pmd_bnxt.c|  87 +
 drivers/net/bnxt/rte_pmd_bnxt.h|  64 ++
 5 files changed, 448 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 05f7034..589d327 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -1164,6 +1164,81 @@ int bnxt_hwrm_func_vf_mac(struct bnxt *bp, uint16_t vf, const uint8_t *mac_addr)
return rc;
 }
 
+int bnxt_hwrm_func_qstats_tx_drop(struct bnxt *bp, uint16_t fid,
+ uint64_t *dropped)
+{
+   int rc = 0;
+   struct hwrm_func_qstats_input req = {.req_type = 0};
+   struct hwrm_func_qstats_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, FUNC_QSTATS, -1, resp);
+
+   req.fid = rte_cpu_to_le_16(fid);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   if (dropped)
+   *dropped = rte_le_to_cpu_64(resp->tx_drop_pkts);
+
+   return rc;
+}
+
+int bnxt_hwrm_func_qstats(struct bnxt *bp, uint16_t fid,
+ struct rte_eth_stats *stats)
+{
+   int rc = 0;
+   struct hwrm_func_qstats_input req = {.req_type = 0};
+   struct hwrm_func_qstats_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, FUNC_QSTATS, -1, resp);
+
+   req.fid = rte_cpu_to_le_16(fid);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   stats->ipackets = rte_le_to_cpu_64(resp->rx_ucast_pkts);
+   stats->ipackets += rte_le_to_cpu_64(resp->rx_mcast_pkts);
+   stats->ipackets += rte_le_to_cpu_64(resp->rx_bcast_pkts);
+   stats->ibytes = rte_le_to_cpu_64(resp->rx_ucast_bytes);
+   stats->ibytes += rte_le_to_cpu_64(resp->rx_mcast_bytes);
+   stats->ibytes += rte_le_to_cpu_64(resp->rx_bcast_bytes);
+
+   stats->opackets = rte_le_to_cpu_64(resp->tx_ucast_pkts);
+   stats->opackets += rte_le_to_cpu_64(resp->tx_mcast_pkts);
+   stats->opackets += rte_le_to_cpu_64(resp->tx_bcast_pkts);
+   stats->obytes = rte_le_to_cpu_64(resp->tx_ucast_bytes);
+   stats->obytes += rte_le_to_cpu_64(resp->tx_mcast_bytes);
+   stats->obytes += rte_le_to_cpu_64(resp->tx_bcast_bytes);
+
+   stats->ierrors = rte_le_to_cpu_64(resp->rx_err_pkts);
+   stats->oerrors = rte_le_to_cpu_64(resp->tx_err_pkts);
+
+   stats->imissed = rte_le_to_cpu_64(resp->rx_drop_pkts);
+
+   return rc;
+}
+
+int bnxt_hwrm_func_clr_stats(struct bnxt *bp, uint16_t fid)
+{
+   int rc = 0;
+   struct hwrm_func_clr_stats_input req = {.req_type = 0};
+   struct hwrm_func_clr_stats_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, FUNC_CLR_STATS, -1, resp);
+
+   req.fid = rte_cpu_to_le_16(fid);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 /*
  * HWRM utility functions
  */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 1ecd660..f5bc4cc 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -64,6 +64,11 @@ int bnxt_hwrm_func_driver_register(struct bnxt *bp);
 int bnxt_hwrm_func_qcaps(struct bnxt *bp);
 int bnxt_hwrm_func_reset(struct bnxt *bp);
 int bnxt_hwrm_func_driver_unregister(struct bnxt *bp, uint32_t flags);
+int bnxt_hwrm_func_qstats(struct bnxt *bp, uint16_t fid,
+ struct rte_eth_stats *stats);
+int bnxt_hwrm_func_qstats_tx_drop(struct bnxt *bp, uint16_t fid,
+ uint64_t *dropped);
+int bnxt_hwrm_func_clr_stats(struct bnxt *bp, uint16_t fid);
 int bnxt_hwrm_func_cfg_def_cp(struct bnxt *bp);
 
 int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index c730337..45d1117 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -2766,6 +2766,223 @@ struct hwrm_func_cfg_output {
 */
 } __attribute__((packed));
 
+/* hwrm_func_qstats */
+/*
+ * Description: This command returns statistics of a function. The input FID
+ * value is used to indicate what function is being queried. This allows a
+ * physical function driver to query virtual functions that are children of the
+ * physical function. The HWRM shall return any unsupported counter with a
+ * value of 0xFFFFFFFF for 32-bit counters and 0xFFFFFFFFFFFFFFFF for 64-bit
+ * counters.
+ */
+/* Input   (24 bytes) */
+struct hwrm_func_qstats_input {
+   uint16_t req_type;
+   /*
+* This value in

[dpdk-dev] [PATCH 09/23] bnxt: add code to determine the Rx status of VF

2017-05-17 Thread Ajit Khaparde
This patch adds code to determine the Rx status of a VF.
It adds the rte_pmd_bnxt_get_vf_rx_status call, which counts
the function's default VNICs to determine its Rx status.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c  | 24 
 drivers/net/bnxt/bnxt_hwrm.h  |  1 +
 drivers/net/bnxt/rte_pmd_bnxt.c   | 23 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   | 17 +
 drivers/net/bnxt/rte_pmd_bnxt_version.map |  5 +
 5 files changed, 70 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 589d327..8b1a73c 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2359,6 +2359,30 @@ int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, uint16_t target_id,
return rc;
 }
 
+static void bnxt_vnic_count(struct bnxt_vnic_info *vnic, void *cbdata)
+{
+   uint32_t *count = cbdata;
+
+   if (vnic->func_default)
+   *count = *count + 1;
+}
+
+static int bnxt_vnic_count_hwrm_stub(struct bnxt *bp __rte_unused,
+struct bnxt_vnic_info *vnic __rte_unused)
+{
+   return 0;
+}
+
+int bnxt_vf_default_vnic_count(struct bnxt *bp, uint16_t vf)
+{
+   uint32_t count = 0;
+
+   bnxt_hwrm_func_vf_vnic_query_and_config(bp, vf, bnxt_vnic_count,
+   &count, bnxt_vnic_count_hwrm_stub);
+
+   return count;
+}
+
 static int bnxt_hwrm_func_vf_vnic_query(struct bnxt *bp, uint16_t vf,
uint16_t *vnic_ids)
 {
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index f5bc4cc..a0779d4 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -129,6 +129,7 @@ int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, uint16_t port,
uint8_t tunnel_type);
 void bnxt_free_tunnel_ports(struct bnxt *bp);
 int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, uint16_t vf);
+int bnxt_vf_default_vnic_count(struct bnxt *bp, uint16_t vf);
 int bnxt_hwrm_func_vf_vnic_query_and_config(struct bnxt *bp, uint16_t vf,
void (*vnic_cb)(struct bnxt_vnic_info *, void *), void *cbdata,
int (*hwrm_cb)(struct bnxt *bp, struct bnxt_vnic_info *vnic));
diff --git a/drivers/net/bnxt/rte_pmd_bnxt.c b/drivers/net/bnxt/rte_pmd_bnxt.c
index f17b8c8..82fe313 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt.c
+++ b/drivers/net/bnxt/rte_pmd_bnxt.c
@@ -480,6 +480,29 @@ int rte_pmd_bnxt_reset_vf_stats(uint8_t port,
return bnxt_hwrm_func_clr_stats(bp, bp->pf.first_vf_id + vf_id);
 }
 
+int rte_pmd_bnxt_get_vf_rx_status(uint8_t port, uint16_t vf_id)
+{
+   struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
+   struct bnxt *bp;
+
+   dev = &rte_eth_devices[port];
+   rte_eth_dev_info_get(port, &dev_info);
+   bp = (struct bnxt *)dev->data->dev_private;
+
+   if (vf_id >= dev_info.max_vfs)
+   return -EINVAL;
+
+   if (!BNXT_PF(bp)) {
+   RTE_LOG(ERR, PMD,
+   "Attempt to query VF %d RX stats on non-PF port %d!\n",
+   vf_id, port);
+   return -ENOTSUP;
+   }
+
+   return bnxt_vf_default_vnic_count(bp, vf_id);
+}
+
 int rte_pmd_bnxt_get_tx_drop_count(uint8_t port, uint64_t *count)
 {
struct rte_eth_dev *dev;
diff --git a/drivers/net/bnxt/rte_pmd_bnxt.h b/drivers/net/bnxt/rte_pmd_bnxt.h
index 2fa4786..a35e577 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt.h
+++ b/drivers/net/bnxt/rte_pmd_bnxt.h
@@ -217,6 +217,23 @@ int rte_pmd_bnxt_reset_vf_stats(uint8_t port,
  */
 int rte_pmd_bnxt_set_vf_vlan_anti_spoof(uint8_t port, uint16_t vf, uint8_t on);
 
+
+/**
+ * Returns the number of default RX queues on a VF
+ *
+ * @param port
+ *The port identifier of the Ethernet device.
+ * @param vf
+ *   VF id.
+ * @return
+ *   - Non-negative value - Number of default RX queues
+ *   - (-EINVAL) if bad parameter.
+ *   - (-ENOTSUP) if on a function without VFs
+ *   - (-ENOMEM) on an allocation failure
+ *   - (-1) firmware interface error
+ */
+int rte_pmd_bnxt_get_vf_rx_status(uint8_t port, uint16_t vf_id);
+
 /**
  * Queries the TX drop counter for the function
  *
diff --git a/drivers/net/bnxt/rte_pmd_bnxt_version.map b/drivers/net/bnxt/rte_pmd_bnxt_version.map
index c98ed19..7fb9a28 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt_version.map
+++ b/drivers/net/bnxt/rte_pmd_bnxt_version.map
@@ -9,6 +9,11 @@ DPDK_17.08 {
 
rte_pmd_bnxt_set_tx_loopback;
rte_pmd_bnxt_set_all_queues_drop_en;
+   rte_pmd_bnxt_get_vf_stats;
+   rte_pmd_bnxt_reset_vf_stats;
+   rte_pmd_bnxt_get_vf_rx_status;
+   rte_pmd_bnxt_get_vf_tx_drop_count;
+   rte_pmd_bnxt_get_tx_drop_count;
rte_pmd_bnxt_set_vf_mac_addr;
rte_pmd_bnxt_set_vf_mac_anti_spoof;
rte_pmd_bnxt_set_vf_rate_limit;
-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH 12/23] bnxt: Add support for VLAN filter and strip dev_ops

2017-05-17 Thread Ajit Khaparde
This patch adds the VLAN filter and strip offload callbacks.
To add a VLAN filter:
For each VNIC and each associated filter(s)
if VLAN exists:
if VLAN matches vlan_id
VLAN filter already exists, just skip and continue
else
add a new MAC+VLAN filter
else
Remove the old MAC only filter
Add a new MAC+VLAN filter

To remove a VLAN filter:
For each VNIC and each associated filter(s)
if VLAN exists && VLAN matches vlan_id
remove the MAC+VLAN filter
add a new MAC only filter
else
VLAN filter doesn't exist, just skip and continue

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 201 +
 1 file changed, 201 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index e110e9b..1dc2327 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -141,6 +141,7 @@ static const struct rte_pci_id bnxt_pci_id_map[] = {
ETH_RSS_NONFRAG_IPV6_TCP |  \
ETH_RSS_NONFRAG_IPV6_UDP)
 
+static void bnxt_vlan_offload_set_op(struct rte_eth_dev *dev, int mask);
 /***/
 
 /*
@@ -487,6 +488,7 @@ static int bnxt_dev_lsc_intr_setup(struct rte_eth_dev *eth_dev)
 static int bnxt_dev_start_op(struct rte_eth_dev *eth_dev)
 {
struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   int vlan_mask = 0;
int rc;
 
bp->dev_stopped = 0;
@@ -496,6 +498,13 @@ static int bnxt_dev_start_op(struct rte_eth_dev *eth_dev)
goto error;
 
bnxt_link_update_op(eth_dev, 0);
+
+   if (eth_dev->data->dev_conf.rxmode.hw_vlan_filter)
+   vlan_mask |= ETH_VLAN_FILTER_MASK;
+   if (eth_dev->data->dev_conf.rxmode.hw_vlan_strip)
+   vlan_mask |= ETH_VLAN_STRIP_MASK;
+   bnxt_vlan_offload_set_op(eth_dev, vlan_mask);
+
return 0;
 
 error:
@@ -1091,6 +1100,196 @@ bnxt_udp_tunnel_port_del_op(struct rte_eth_dev *eth_dev,
return rc;
 }
 
+static int bnxt_del_vlan_filter(struct bnxt *bp, uint16_t vlan_id)
+{
+   struct bnxt_filter_info *filter, *temp_filter, *new_filter;
+   struct bnxt_vnic_info *vnic;
+   unsigned int i;
+   int rc = 0;
+   uint32_t chk = HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_OVLAN;
+
+   /* Cycle through all VNICs */
+   for (i = 0; i < bp->nr_vnics; i++) {
+   /*
+* For each VNIC and each associated filter(s)
+* if VLAN exists && VLAN matches vlan_id
+*  remove the MAC+VLAN filter
+*  add a new MAC only filter
+* else
+*  VLAN filter doesn't exist, just skip and continue
+*/
+   STAILQ_FOREACH(vnic, &bp->ff_pool[i], next) {
+   filter = STAILQ_FIRST(&vnic->filter);
+   while (filter) {
+   temp_filter = STAILQ_NEXT(filter, next);
+
+   if (filter->enables & chk &&
+   filter->l2_ovlan == vlan_id) {
+   /* Must delete the filter */
+   STAILQ_REMOVE(&vnic->filter, filter,
+ bnxt_filter_info, next);
+   bnxt_hwrm_clear_filter(bp, filter);
+   STAILQ_INSERT_TAIL(
+   &bp->free_filter_list,
+   filter, next);
+
+   /*
+* Need to examine to see if the MAC
+* filter already existed or not before
+* allocating a new one
+*/
+
+   new_filter = bnxt_alloc_filter(bp);
+   if (!new_filter) {
+   RTE_LOG(ERR, PMD,
+   "MAC/VLAN filter alloc failed\n");
+   rc = -ENOMEM;
+   goto exit;
+   }
+   STAILQ_INSERT_TAIL(&vnic->filter,
+  new_filter, next);
+   /* Inherit MAC from previous filter */
+   new_filter->mac_index =
+   filter->mac_index;
+   memcpy(new_filter->l2_addr,
+  filter->l2_addr, ET

[dpdk-dev] [PATCH 11/23] bnxt: add support for xstats get/reset

2017-05-17 Thread Ajit Khaparde
This patch adds support for the xstats get and reset dev_ops.

dev_ops added:
xstats_get, xstats_get_name, xstats_reset

HWRM commands added:
hwrm_port_qstats, hwrm_port_clr_stats

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|   7 +
 drivers/net/bnxt/bnxt_ethdev.c |  84 ++
 drivers/net/bnxt/bnxt_hwrm.c   |  33 +++
 drivers/net/bnxt/bnxt_hwrm.h   |   2 +
 drivers/net/bnxt/bnxt_stats.c  | 202 ++
 drivers/net/bnxt/bnxt_stats.h  |  10 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 495 +
 7 files changed, 833 insertions(+)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 5f458cd..8b32375 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -134,6 +134,7 @@ struct bnxt {
uint32_tflags;
 #define BNXT_FLAG_REGISTERED   (1 << 0)
 #define BNXT_FLAG_VF   (1 << 1)
+#define BNXT_FLAG_PORT_STATS   (1 << 2)
 #define BNXT_PF(bp)(!((bp)->flags & BNXT_FLAG_VF))
 #define BNXT_VF(bp)((bp)->flags & BNXT_FLAG_VF)
 #define BNXT_NPAR_ENABLED(bp)  ((bp)->port_partition_type)
@@ -142,10 +143,16 @@ struct bnxt {
unsigned intrx_nr_rings;
unsigned intrx_cp_nr_rings;
struct bnxt_rx_queue **rx_queues;
+   const void  *rx_mem_zone;
+   struct rx_port_stats*hw_rx_port_stats;
+   phys_addr_t hw_rx_port_stats_map;
 
unsigned inttx_nr_rings;
unsigned inttx_cp_nr_rings;
struct bnxt_tx_queue **tx_queues;
+   const void  *tx_mem_zone;
+   struct tx_port_stats*hw_tx_port_stats;
+   phys_addr_t hw_tx_port_stats_map;
 
/* Default completion ring */
struct bnxt_cp_ring_info*def_cp_ring;
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index fcb7679..e110e9b 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -533,6 +533,7 @@ static void bnxt_dev_stop_op(struct rte_eth_dev *eth_dev)
eth_dev->data->dev_link.link_status = 0;
}
bnxt_set_hwrm_link_config(bp, false);
+   bnxt_hwrm_port_clr_stats(bp);
bnxt_shutdown_nic(bp);
bp->dev_stopped = 1;
 }
@@ -1123,6 +1124,9 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.flow_ctrl_set = bnxt_flow_ctrl_set_op,
.udp_tunnel_port_add  = bnxt_udp_tunnel_port_add_op,
.udp_tunnel_port_del  = bnxt_udp_tunnel_port_del_op,
+   .xstats_get = bnxt_dev_xstats_get_op,
+   .xstats_get_names = bnxt_dev_xstats_get_names_op,
+   .xstats_reset = bnxt_dev_xstats_reset_op,
 };
 
 static bool bnxt_vf_pciid(uint16_t id)
@@ -1181,7 +1185,11 @@ static int
 bnxt_dev_init(struct rte_eth_dev *eth_dev)
 {
struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(eth_dev->device);
+   char mz_name[RTE_MEMZONE_NAMESIZE];
+   const struct rte_memzone *mz = NULL;
static int version_printed;
+   uint32_t total_alloc_len;
+   phys_addr_t mz_phys_addr;
struct bnxt *bp;
int rc;
 
@@ -1208,6 +1216,80 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &bnxt_recv_pkts;
eth_dev->tx_pkt_burst = &bnxt_xmit_pkts;
 
+   if (BNXT_PF(bp) && pci_dev->id.device_id != BROADCOM_DEV_ID_NS2) {
+   snprintf(mz_name, RTE_MEMZONE_NAMESIZE,
+"bnxt_%04x:%02x:%02x:%02x-%s", pci_dev->addr.domain,
+pci_dev->addr.bus, pci_dev->addr.devid,
+pci_dev->addr.function, "rx_port_stats");
+   mz_name[RTE_MEMZONE_NAMESIZE - 1] = 0;
+   mz = rte_memzone_lookup(mz_name);
+   total_alloc_len = RTE_CACHE_LINE_ROUNDUP(
+   sizeof(struct rx_port_stats) + 512);
+   if (!mz) {
+   mz = rte_memzone_reserve(mz_name, total_alloc_len,
+SOCKET_ID_ANY,
+RTE_MEMZONE_2MB |
+RTE_MEMZONE_SIZE_HINT_ONLY);
+   if (mz == NULL)
+   return -ENOMEM;
+   }
+   memset(mz->addr, 0, mz->len);
+   mz_phys_addr = mz->phys_addr;
+   if ((unsigned long)mz->addr == mz_phys_addr) {
+   RTE_LOG(WARNING, PMD,
+   "Memzone physical address same as virtual.\n");
+   RTE_LOG(WARNING, PMD,
+   "Using rte_mem_virt2phy()\n");
+   mz_phys_addr = rte_mem_virt2phy(mz->addr);
+   if (mz_phys_addr == 0) {
+   RTE_LOG(ERR, PMD,
+   "unable to map address to physical memory\n");
+  

[dpdk-dev] [PATCH 13/23] bnxt: add code to configure a default VF VLAN

2017-05-17 Thread Ajit Khaparde
This patch adds code to insert a default VF VLAN.
It also tracks the current default VLAN per VNIC for the VF.
When setting the default VLAN, the driver avoids re-setting it
to the current value.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h   |  1 +
 drivers/net/bnxt/bnxt_hwrm.c  | 39 +++
 drivers/net/bnxt/bnxt_hwrm.h  |  2 ++
 drivers/net/bnxt/rte_pmd_bnxt.c   | 35 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   | 20 
 drivers/net/bnxt/rte_pmd_bnxt_version.map |  1 +
 6 files changed, 98 insertions(+)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 8b32375..335482b 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -68,6 +68,7 @@ struct bnxt_child_vf_info {
uint32_tl2_rx_mask;
uint16_tfid;
uint16_tmax_tx_rate;
+   uint16_tdflt_vlan;
uint16_tvlan_count;
uint8_t mac_spoof_en;
uint8_t vlan_spoof_en;
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 30d4891..f66a760 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2002,6 +2002,27 @@ static void reserve_resources_from_vf(struct bnxt *bp,
bp->max_ring_grps -= rte_le_to_cpu_16(resp->max_hw_ring_grps);
 }
 
+int bnxt_hwrm_func_qcfg_current_vf_vlan(struct bnxt *bp, int vf)
+{
+   struct hwrm_func_qcfg_input req = {0};
+   struct hwrm_func_qcfg_output *resp = bp->hwrm_cmd_resp_addr;
+   int rc;
+
+   /* Check for zero MAC address */
+   HWRM_PREP(req, FUNC_QCFG, -1, resp);
+   req.fid = rte_cpu_to_le_16(bp->pf.vf_info[vf].fid);
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   if (rc) {
+   RTE_LOG(ERR, PMD, "hwrm_func_qcfg failed rc:%d\n", rc);
+   return -1;
+   } else if (resp->error_code) {
+   rc = rte_le_to_cpu_16(resp->error_code);
+   RTE_LOG(ERR, PMD, "hwrm_func_qcfg error %d\n", rc);
+   return -1;
+   }
+   return rte_le_to_cpu_16(resp->vlan);
+}
+
 static int update_pf_resource_max(struct bnxt *bp)
 {
struct hwrm_func_qcfg_input req = {0};
@@ -2315,6 +2336,24 @@ int bnxt_hwrm_func_bw_cfg(struct bnxt *bp, uint16_t vf,
return rc;
 }
 
+int bnxt_hwrm_set_vf_vlan(struct bnxt *bp, int vf)
+{
+   struct hwrm_func_cfg_input req = {0};
+   struct hwrm_func_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+   int rc = 0;
+
+   HWRM_PREP(req, FUNC_CFG, -1, resp);
+   req.flags = rte_cpu_to_le_32(bp->pf.vf_info[vf].func_cfg_flags);
+   req.fid = rte_cpu_to_le_16(bp->pf.vf_info[vf].fid);
+   req.enables |= rte_cpu_to_le_32(HWRM_FUNC_CFG_INPUT_ENABLES_DFLT_VLAN);
+   req.dflt_vlan = rte_cpu_to_le_16(bp->pf.vf_info[vf].dflt_vlan);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_reject_fwd_resp(struct bnxt *bp, uint16_t target_id,
  void *encaped, size_t ec_size)
 {
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 23f490b..46a8fde 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -121,8 +121,10 @@ int bnxt_hwrm_func_vf_mac(struct bnxt *bp, uint16_t vf,
 int bnxt_hwrm_pf_evb_mode(struct bnxt *bp);
 int bnxt_hwrm_func_bw_cfg(struct bnxt *bp, uint16_t vf,
uint16_t max_bw, uint16_t enables);
+int bnxt_hwrm_set_vf_vlan(struct bnxt *bp, int vf);
 int bnxt_hwrm_func_qcfg_vf_default_mac(struct bnxt *bp, uint16_t vf,
   struct ether_addr *mac);
+int bnxt_hwrm_func_qcfg_current_vf_vlan(struct bnxt *bp, int vf);
 int bnxt_hwrm_tunnel_dst_port_alloc(struct bnxt *bp, uint16_t port,
uint8_t tunnel_type);
 int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, uint16_t port,
diff --git a/drivers/net/bnxt/rte_pmd_bnxt.c b/drivers/net/bnxt/rte_pmd_bnxt.c
index a2bd39e..ca14ff3 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt.c
+++ b/drivers/net/bnxt/rte_pmd_bnxt.c
@@ -356,6 +356,41 @@ rte_pmd_bnxt_set_vf_vlan_stripq(uint8_t port, uint16_t vf, uint8_t on)
return rc;
 }
 
+int
+rte_pmd_bnxt_set_vf_vlan_insert(uint8_t port, uint16_t vf,
+   uint16_t vlan_id)
+{
+   struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
+   struct bnxt *bp;
+   int rc;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
+
+   dev = &rte_eth_devices[port];
+   rte_eth_dev_info_get(port, &dev_info);
+   bp = (struct bnxt *)dev->data->dev_private;
+
+   if (vf >= dev_info.max_vfs)
+   return -EINVAL;
+
+   if (!BNXT_PF(bp)) {
+   RTE_LOG(ERR, PMD,
+   "Attempt to set VF %d vlan insert on 

[dpdk-dev] [PATCH 10/23] bnxt: add support to add a VF MAC address

2017-05-17 Thread Ajit Khaparde
This patch adds support to allocate a filter and program
it in the hardware for every MAC address added to the specified
function.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_filter.c| 22 ++
 drivers/net/bnxt/bnxt_filter.h|  1 +
 drivers/net/bnxt/rte_pmd_bnxt.c   | 73 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   | 18 
 drivers/net/bnxt/rte_pmd_bnxt_version.map |  1 +
 5 files changed, 115 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_filter.c b/drivers/net/bnxt/bnxt_filter.c
index 146ab33..78aa0ae 100644
--- a/drivers/net/bnxt/bnxt_filter.c
+++ b/drivers/net/bnxt/bnxt_filter.c
@@ -68,6 +68,22 @@ struct bnxt_filter_info *bnxt_alloc_filter(struct bnxt *bp)
return filter;
 }
 
+struct bnxt_filter_info *bnxt_alloc_vf_filter(struct bnxt *bp, uint16_t vf)
+{
+   struct bnxt_filter_info *filter;
+
+   filter = rte_zmalloc("bnxt_vf_filter_info", sizeof(*filter), 0);
+   if (!filter) {
+   RTE_LOG(ERR, PMD, "Failed to alloc memory for VF %hu filters\n",
+   vf);
+   return NULL;
+   }
+
+   filter->fw_l2_filter_id = UINT64_MAX;
+   STAILQ_INSERT_TAIL(&bp->pf.vf_info[vf].filter, filter, next);
+   return filter;
+}
+
 void bnxt_init_filters(struct bnxt *bp)
 {
struct bnxt_filter_info *filter;
@@ -102,6 +118,12 @@ void bnxt_free_all_filters(struct bnxt *bp)
STAILQ_INIT(&vnic->filter);
}
}
+
+   for (i = 0; i < bp->pf.max_vfs; i++) {
+   STAILQ_FOREACH(filter, &bp->pf.vf_info[i].filter, next) {
+   bnxt_hwrm_clear_filter(bp, filter);
+   }
+   }
 }
 
 void bnxt_free_filter_mem(struct bnxt *bp)
diff --git a/drivers/net/bnxt/bnxt_filter.h b/drivers/net/bnxt/bnxt_filter.h
index 353b7f7..613b2ee 100644
--- a/drivers/net/bnxt/bnxt_filter.h
+++ b/drivers/net/bnxt/bnxt_filter.h
@@ -68,6 +68,7 @@ struct bnxt_filter_info {
 };
 
 struct bnxt_filter_info *bnxt_alloc_filter(struct bnxt *bp);
+struct bnxt_filter_info *bnxt_alloc_vf_filter(struct bnxt *bp, uint16_t vf);
 void bnxt_init_filters(struct bnxt *bp);
 void bnxt_free_all_filters(struct bnxt *bp);
 void bnxt_free_filter_mem(struct bnxt *bp);
diff --git a/drivers/net/bnxt/rte_pmd_bnxt.c b/drivers/net/bnxt/rte_pmd_bnxt.c
index 82fe313..a2bd39e 100644
--- a/drivers/net/bnxt/rte_pmd_bnxt.c
+++ b/drivers/net/bnxt/rte_pmd_bnxt.c
@@ -540,3 +540,76 @@ int rte_pmd_bnxt_get_vf_tx_drop_count(uint8_t port, uint16_t vf_id,
return bnxt_hwrm_func_qstats_tx_drop(bp, bp->pf.first_vf_id + vf_id,
 count);
 }
+
+int rte_pmd_bnxt_mac_addr_add(uint8_t port, struct ether_addr *addr,
+   uint32_t vf_id)
+{
+   struct rte_eth_dev *dev;
+   struct rte_eth_dev_info dev_info;
+   struct bnxt *bp;
+   struct bnxt_filter_info *filter;
+   struct bnxt_vnic_info vnic;
+   struct ether_addr dflt_mac;
+   int rc;
+
+   dev = &rte_eth_devices[port];
+   rte_eth_dev_info_get(port, &dev_info);
+   bp = (struct bnxt *)dev->data->dev_private;
+
+   if (vf_id >= dev_info.max_vfs)
+   return -EINVAL;
+
+   if (!BNXT_PF(bp)) {
+   RTE_LOG(ERR, PMD,
+   "Attempt to config VF %d MAC on non-PF port %d!\n",
+   vf_id, port);
+   return -ENOTSUP;
+   }
+
+   /* If the VF currently uses a random MAC, update default to this one */
+   if (bp->pf.vf_info[vf_id].random_mac) {
+   if (rte_pmd_bnxt_get_vf_rx_status(port, vf_id) <= 0)
+   rc = bnxt_hwrm_func_vf_mac(bp, vf_id, (uint8_t *)addr);
+   }
+
+   /* query the default VNIC id used by the function */
+   rc = bnxt_hwrm_func_qcfg_vf_dflt_vnic_id(bp, vf_id);
+   if (rc < 0)
+   goto exit;
+
+   memset(&vnic, 0, sizeof(struct bnxt_vnic_info));
+   vnic.fw_vnic_id = rte_le_to_cpu_16(rc);
+   rc = bnxt_hwrm_vnic_qcfg(bp, &vnic, bp->pf.first_vf_id + vf_id);
+   if (rc < 0)
+   goto exit;
+
+   STAILQ_FOREACH(filter, &bp->pf.vf_info[vf_id].filter, next) {
+   if (filter->flags ==
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_FLAGS_PATH_RX &&
+   filter->enables ==
+   (HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR |
+HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR_MASK) &&
+   memcmp(addr, filter->l2_addr, ETHER_ADDR_LEN) == 0) {
+   bnxt_hwrm_clear_filter(bp, filter);
+   break;
+   }
+   }
+
+   if (filter == NULL)
+   filter = bnxt_alloc_vf_filter(bp, vf_id);
+
+   filter->fw_l2_filter_id = UINT64_MAX;
+   filter->flags = HWRM_CFA_L2_FILTER_ALLOC_INPUT_FLAGS_PATH_RX;
+   filter->enables = HWRM_CF

[dpdk-dev] [PATCH 15/23] bnxt: add support for fw_version_get dev_op

2017-05-17 Thread Ajit Khaparde
This patch adds support for the fw_version_get dev_op.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 9a0acee..9d503b7 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1356,6 +1356,25 @@ bnxt_dev_set_mc_addr_list_op(struct rte_eth_dev *eth_dev,
return bnxt_hwrm_cfa_l2_set_rx_mask(bp, &bp->vnic_info[0], 0, NULL);
 }
 
+static int
+bnxt_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size)
+{
+   struct bnxt *bp = (struct bnxt *)dev->data->dev_private;
+   uint8_t fw_major = (bp->fw_ver >> 24) & 0xff;
+   uint8_t fw_minor = (bp->fw_ver >> 16) & 0xff;
+   uint8_t fw_updt = (bp->fw_ver >> 8) & 0xff;
+   int ret;
+
+   ret = snprintf(fw_version, fw_size, "%d.%d.%d",
+   fw_major, fw_minor, fw_updt);
+
+   ret += 1; /* add the size of '\0' */
+   if (fw_size < (uint32_t)ret)
+   return ret;
+   else
+   return 0;
+}
+
 /*
  * Initialization
  */
@@ -1395,6 +1414,7 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.xstats_get = bnxt_dev_xstats_get_op,
.xstats_get_names = bnxt_dev_xstats_get_names_op,
.xstats_reset = bnxt_dev_xstats_reset_op,
+   .fw_version_get = bnxt_fw_version_get,
.set_mc_addr_list = bnxt_dev_set_mc_addr_list_op,
 };
 
-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH 16/23] bnxt: add support to set MTU

2017-05-17 Thread Ajit Khaparde
This patch adds support to modify the MTU using the set_mtu dev_op.
To support frames larger than 2k, the PMD creates an aggregator ring.
When a frame larger than 2k is received, it is fragmented
and the resulting fragments are DMA'ed to the aggregator ring.
The driver can now support jumbo frames up to 9500 bytes.

Signed-off-by: Steeven Li 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|   4 +-
 drivers/net/bnxt/bnxt_cpr.c|   3 +-
 drivers/net/bnxt/bnxt_ethdev.c |  59 +
 drivers/net/bnxt/bnxt_hwrm.c   | 123 --
 drivers/net/bnxt/bnxt_hwrm.h   |   4 +-
 drivers/net/bnxt/bnxt_ring.c   |  81 +++--
 drivers/net/bnxt/bnxt_ring.h   |   2 +
 drivers/net/bnxt/bnxt_rxq.c|  29 --
 drivers/net/bnxt/bnxt_rxq.h|   1 +
 drivers/net/bnxt/bnxt_rxr.c| 195 -
 drivers/net/bnxt/bnxt_rxr.h|   6 ++
 11 files changed, 420 insertions(+), 87 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 335482b..f25a1c4 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -45,7 +45,7 @@
 
 #include "bnxt_cpr.h"
 
-#define BNXT_MAX_MTU   9000
+#define BNXT_MAX_MTU   9500
 #define VLAN_TAG_SIZE  4
 
 enum bnxt_hw_context {
@@ -136,6 +136,7 @@ struct bnxt {
 #define BNXT_FLAG_REGISTERED   (1 << 0)
 #define BNXT_FLAG_VF   (1 << 1)
 #define BNXT_FLAG_PORT_STATS   (1 << 2)
+#define BNXT_FLAG_JUMBO(1 << 3)
 #define BNXT_PF(bp)(!((bp)->flags & BNXT_FLAG_VF))
 #define BNXT_VF(bp)((bp)->flags & BNXT_FLAG_VF)
 #define BNXT_NPAR_ENABLED(bp)  ((bp)->port_partition_type)
@@ -239,4 +240,5 @@ struct rte_pmd_bnxt_mb_event_param {
 int bnxt_link_update_op(struct rte_eth_dev *eth_dev, int wait_to_complete);
 int bnxt_rcv_msg_from_vf(struct bnxt *bp, uint16_t vf_id, void *msg);
 
+#define RX_PROD_AGG_BD_TYPE_RX_PROD_AGG0x6
 #endif
diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index f0b8728..6eb32ab 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -159,7 +159,8 @@ int bnxt_alloc_def_cp_ring(struct bnxt *bp)
 
rc = bnxt_hwrm_ring_alloc(bp, cp_ring,
  HWRM_RING_ALLOC_INPUT_RING_TYPE_CMPL,
- 0, HWRM_NA_SIGNATURE);
+ 0, HWRM_NA_SIGNATURE,
+ HWRM_NA_SIGNATURE);
if (rc)
goto err_out;
cpr->cp_doorbell = bp->pdev->mem_resource[2].addr;
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 9d503b7..5f22091 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -199,6 +199,14 @@ static int bnxt_init_chip(struct bnxt *bp)
struct rte_eth_link new;
int rc;
 
+   if (bp->eth_dev->data->mtu > ETHER_MTU) {
+   bp->eth_dev->data->dev_conf.rxmode.jumbo_frame = 1;
+   bp->flags |= BNXT_FLAG_JUMBO;
+   } else {
+   bp->eth_dev->data->dev_conf.rxmode.jumbo_frame = 0;
+   bp->flags &= ~BNXT_FLAG_JUMBO;
+   }
+
rc = bnxt_alloc_all_hwrm_stat_ctxs(bp);
if (rc) {
RTE_LOG(ERR, PMD, "HWRM stat ctx alloc failure rc: %x\n", rc);
@@ -1375,6 +1383,56 @@ bnxt_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size)
return 0;
 }
 
+static int bnxt_mtu_set_op(struct rte_eth_dev *eth_dev, uint16_t new_mtu)
+{
+   struct bnxt *bp = eth_dev->data->dev_private;
+   struct rte_eth_dev_info dev_info;
+   uint32_t max_dev_mtu;
+   uint32_t rc = 0;
+   uint32_t i;
+
+   bnxt_dev_info_get_op(eth_dev, &dev_info);
+   max_dev_mtu = dev_info.max_rx_pktlen -
+ ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE * 2;
+
+   if (new_mtu < ETHER_MIN_MTU || new_mtu > max_dev_mtu) {
+   RTE_LOG(ERR, PMD, "MTU requested must be within (%d, %d)\n",
+   ETHER_MIN_MTU, max_dev_mtu);
+   return -EINVAL;
+   }
+
+
+   if (new_mtu > ETHER_MTU) {
+   bp->flags |= BNXT_FLAG_JUMBO;
+   eth_dev->data->dev_conf.rxmode.jumbo_frame = 1;
+   } else {
+   eth_dev->data->dev_conf.rxmode.jumbo_frame = 0;
+   bp->flags &= ~BNXT_FLAG_JUMBO;
+   }
+
+   eth_dev->data->dev_conf.rxmode.max_rx_pkt_len =
+   new_mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE * 2;
+
+   eth_dev->data->mtu = new_mtu;
+   RTE_LOG(INFO, PMD, "New MTU is %d\n", eth_dev->data->mtu);
+
+   for (i = 0; i < bp->nr_vnics; i++) {
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[i];
+
+   vnic->mru = bp->eth_dev->data->mtu + ETHER_HDR_LEN +
+   ETHER_CRC_LEN + VLAN_TAG_SIZE * 2;
+   rc = bnxt_hwrm_vnic_cfg(bp, vnic);
+   if (rc)
+

[dpdk-dev] [PATCH 17/23] bnxt: add support for LRO

2017-05-17 Thread Ajit Khaparde
This patch adds support for enabling and disabling LRO.
To support this feature, the driver creates an aggregator ring.
When the hardware starts doing LRO, it sends a tpa_start completion.
When the driver receives a tpa_end completion, it indicates that the
LRO chaining is complete.

Signed-off-by: Steeven Li 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c |   7 +
 drivers/net/bnxt/bnxt_hwrm.c   |  44 ++-
 drivers/net/bnxt/bnxt_hwrm.h   |   2 +
 drivers/net/bnxt/bnxt_ring.c   |  32 +-
 drivers/net/bnxt/bnxt_ring.h   |   4 +-
 drivers/net/bnxt/bnxt_rxq.c|  12 +
 drivers/net/bnxt/bnxt_rxq.h|   2 +
 drivers/net/bnxt/bnxt_rxr.c| 315 
 drivers/net/bnxt/bnxt_rxr.h|  40 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 647 -
 10 files changed, 1019 insertions(+), 86 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 5f22091..7fafd02 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -286,6 +286,13 @@ static int bnxt_init_chip(struct bnxt *bp)
goto err_out;
}
}
+
+   bnxt_hwrm_vnic_plcmode_cfg(bp, vnic);
+
+   if (bp->eth_dev->data->dev_conf.rxmode.enable_lro)
+   bnxt_hwrm_vnic_tpa_cfg(bp, vnic, 1);
+   else
+   bnxt_hwrm_vnic_tpa_cfg(bp, vnic, 0);
}
rc = bnxt_hwrm_cfa_l2_set_rx_mask(bp, &bp->vnic_info[0], 0, NULL);
if (rc) {
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 136365c..bea855e 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -1148,19 +1148,15 @@ int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp,
HWRM_PREP(req, VNIC_PLCMODES_CFG, -1, resp);
 
req.flags = rte_cpu_to_le_32(
-// HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_REGULAR_PLACEMENT |
HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_JUMBO_PLACEMENT);
-// HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV4 | //TODO
-// HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV6);
+
req.enables = rte_cpu_to_le_32(
HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_JUMBO_THRESH_VALID);
-// HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_HDS_THRESHOLD_VALID);
 
size = rte_pktmbuf_data_room_size(bp->rx_queues[0]->mb_pool);
size -= RTE_PKTMBUF_HEADROOM;
 
req.jumbo_thresh = rte_cpu_to_le_16(size);
-// req.hds_threshold = rte_cpu_to_le_16(size);
req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id);
 
rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
@@ -1170,6 +1166,41 @@ int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp,
return rc;
 }
 
+int bnxt_hwrm_vnic_tpa_cfg(struct bnxt *bp,
+   struct bnxt_vnic_info *vnic, bool enable)
+{
+   int rc = 0;
+   struct hwrm_vnic_tpa_cfg_input req = {.req_type = 0 };
+   struct hwrm_vnic_tpa_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, VNIC_TPA_CFG, -1, resp);
+
+   if (enable) {
+   req.enables = rte_cpu_to_le_32(
+   HWRM_VNIC_TPA_CFG_INPUT_ENABLES_MAX_AGG_SEGS |
+   HWRM_VNIC_TPA_CFG_INPUT_ENABLES_MAX_AGGS |
+   HWRM_VNIC_TPA_CFG_INPUT_ENABLES_MIN_AGG_LEN);
+   req.flags = rte_cpu_to_le_32(
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_TPA |
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_ENCAP_TPA |
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_RSC_WND_UPDATE |
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_GRO |
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_AGG_WITH_ECN |
+   HWRM_VNIC_TPA_CFG_INPUT_FLAGS_AGG_WITH_SAME_GRE_SEQ);
+   req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id);
+   req.max_agg_segs = rte_cpu_to_le_16(5);
+   req.max_aggs =
+   rte_cpu_to_le_16(HWRM_VNIC_TPA_CFG_INPUT_MAX_AGGS_MAX);
+   req.min_agg_len = rte_cpu_to_le_32(512);
+   }
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_func_vf_mac(struct bnxt *bp, uint16_t vf, const uint8_t 
*mac_addr)
 {
struct hwrm_func_cfg_input req = {0};
@@ -1562,6 +1593,9 @@ void bnxt_free_all_hwrm_resources(struct bnxt *bp)
bnxt_clear_hwrm_vnic_filters(bp, vnic);
 
bnxt_hwrm_vnic_ctx_free(bp, vnic);
+
+   bnxt_hwrm_vnic_tpa_cfg(bp, vnic, false);
+
bnxt_hwrm_vnic_free(bp, vnic);
}
/* Ring resources */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 9478d12..2d707a3 100644
--- a/dr

[dpdk-dev] [PATCH 18/23] bnxt: add rxq_info_get and txq_info_get dev_ops

2017-05-17 Thread Ajit Khaparde
Add support for txq_info_get and rxq_info_get dev_ops

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 7fafd02..d6c36b9 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1390,6 +1390,43 @@ bnxt_fw_version_get(struct rte_eth_dev *dev, char 
*fw_version, size_t fw_size)
return 0;
 }
 
+static void
+bnxt_rxq_info_get_op(struct rte_eth_dev *dev, uint16_t queue_id,
+   struct rte_eth_rxq_info *qinfo)
+{
+   struct bnxt_rx_queue *rxq;
+
+   rxq = dev->data->rx_queues[queue_id];
+
+   qinfo->mp = rxq->mb_pool;
+   qinfo->scattered_rx = dev->data->scattered_rx;
+   qinfo->nb_desc = rxq->nb_rx_desc;
+
+   qinfo->conf.rx_free_thresh = rxq->rx_free_thresh;
+   qinfo->conf.rx_drop_en = 0;
+   qinfo->conf.rx_deferred_start = 0;
+}
+
+static void
+bnxt_txq_info_get_op(struct rte_eth_dev *dev, uint16_t queue_id,
+   struct rte_eth_txq_info *qinfo)
+{
+   struct bnxt_tx_queue *txq;
+
+   txq = dev->data->tx_queues[queue_id];
+
+   qinfo->nb_desc = txq->nb_tx_desc;
+
+   qinfo->conf.tx_thresh.pthresh = txq->pthresh;
+   qinfo->conf.tx_thresh.hthresh = txq->hthresh;
+   qinfo->conf.tx_thresh.wthresh = txq->wthresh;
+
+   qinfo->conf.tx_free_thresh = txq->tx_free_thresh;
+   qinfo->conf.tx_rs_thresh = 0;
+   qinfo->conf.txq_flags = txq->txq_flags;
+   qinfo->conf.tx_deferred_start = txq->tx_deferred_start;
+}
+
 static int bnxt_mtu_set_op(struct rte_eth_dev *eth_dev, uint16_t new_mtu)
 {
struct bnxt *bp = eth_dev->data->dev_private;
@@ -1482,6 +1519,8 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.xstats_reset = bnxt_dev_xstats_reset_op,
.fw_version_get = bnxt_fw_version_get,
.set_mc_addr_list = bnxt_dev_set_mc_addr_list_op,
+   .rxq_info_get = bnxt_rxq_info_get_op,
+   .txq_info_get = bnxt_txq_info_get_op,
 };
 
 static bool bnxt_vf_pciid(uint16_t id)
-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH 14/23] bnxt: add support for set_mc_addr_list and mac_addr_set

2017-05-17 Thread Ajit Khaparde
This patch adds support for set_mc_addr_list and
mac_addr_set dev_ops

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 68 ++
 drivers/net/bnxt/bnxt_hwrm.c   | 11 +--
 drivers/net/bnxt/bnxt_vnic.c   |  7 -
 drivers/net/bnxt/bnxt_vnic.h   |  4 +++
 4 files changed, 86 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 1dc2327..9a0acee 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1290,6 +1290,72 @@ bnxt_vlan_offload_set_op(struct rte_eth_dev *dev, int 
mask)
RTE_LOG(ERR, PMD, "Extend VLAN Not supported\n");
 }
 
+static void
+bnxt_set_default_mac_addr_op(struct rte_eth_dev *dev, struct ether_addr *addr)
+{
+   struct bnxt *bp = (struct bnxt *)dev->data->dev_private;
+   /* Default Filter is tied to VNIC 0 */
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[0];
+   struct bnxt_filter_info *filter;
+   int rc;
+
+   if (BNXT_VF(bp))
+   return;
+
+   memcpy(bp->mac_addr, addr, sizeof(bp->mac_addr));
+   memcpy(&dev->data->mac_addrs[0], bp->mac_addr, ETHER_ADDR_LEN);
+
+   STAILQ_FOREACH(filter, &vnic->filter, next) {
+   /* Default Filter is at Index 0 */
+   if (filter->mac_index != 0)
+   continue;
+   rc = bnxt_hwrm_clear_filter(bp, filter);
+   if (rc)
+   break;
+   memcpy(filter->l2_addr, bp->mac_addr, ETHER_ADDR_LEN);
+   memset(filter->l2_addr_mask, 0xff, ETHER_ADDR_LEN);
+   filter->flags |= HWRM_CFA_L2_FILTER_ALLOC_INPUT_FLAGS_PATH_RX;
+   filter->enables |=
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR |
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR_MASK;
+   rc = bnxt_hwrm_set_filter(bp, vnic->fw_vnic_id, filter);
+   if (rc)
+   break;
+   filter->mac_index = 0;
+   RTE_LOG(DEBUG, PMD, "Set MAC addr\n");
+   }
+}
+
+static int
+bnxt_dev_set_mc_addr_list_op(struct rte_eth_dev *eth_dev,
+ struct ether_addr *mc_addr_set,
+ uint32_t nb_mc_addr)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   char *mc_addr_list = (char *)mc_addr_set;
+   struct bnxt_vnic_info *vnic;
+   uint32_t off = 0, i = 0;
+
+   vnic = &bp->vnic_info[0];
+
+   if (nb_mc_addr > BNXT_MAX_MC_ADDRS) {
+   vnic->flags |= BNXT_VNIC_INFO_ALLMULTI;
+   goto allmulti;
+   }
+
+   /* TODO Check for Duplicate mcast addresses */
+   vnic->flags &= ~BNXT_VNIC_INFO_ALLMULTI;
+   for (i = 0; i < nb_mc_addr; i++) {
+   memcpy(vnic->mc_list + off, &mc_addr_list[i], ETHER_ADDR_LEN);
+   off += ETHER_ADDR_LEN;
+   }
+
+   vnic->mc_addr_cnt = i;
+
+allmulti:
+   return bnxt_hwrm_cfa_l2_set_rx_mask(bp, &bp->vnic_info[0], 0, NULL);
+}
+
 /*
  * Initialization
  */
@@ -1325,9 +1391,11 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.udp_tunnel_port_del  = bnxt_udp_tunnel_port_del_op,
.vlan_filter_set = bnxt_vlan_filter_set_op,
.vlan_offload_set = bnxt_vlan_offload_set_op,
+   .mac_addr_set = bnxt_set_default_mac_addr_op,
.xstats_get = bnxt_dev_xstats_get_op,
.xstats_get_names = bnxt_dev_xstats_get_names_op,
.xstats_reset = bnxt_dev_xstats_reset_op,
+   .set_mc_addr_list = bnxt_dev_set_mc_addr_list_op,
 };
 
 static bool bnxt_vf_pciid(uint16_t id)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index f66a760..9d37a50 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -215,15 +215,20 @@ int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp,
if (vnic->flags & BNXT_VNIC_INFO_PROMISC)
mask = HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_PROMISCUOUS;
if (vnic->flags & BNXT_VNIC_INFO_ALLMULTI)
-   mask = HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_ALL_MCAST;
-   req.mask = rte_cpu_to_le_32(HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_BCAST |
-   mask);
+   mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_ALL_MCAST;
+   if (vnic->mc_addr_cnt) {
+   mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_MCAST;
+   req.num_mc_entries = rte_cpu_to_le_32(vnic->mc_addr_cnt);
+   req.mc_tbl_addr = rte_cpu_to_le_64(vnic->mc_list_dma_addr);
+   }
if (vlan_count && vlan_table) {
mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_VLANONLY;
req.vlan_tag_tbl_addr = rte_cpu_to_le_16(
 rte_mem_virt2phy(vlan_table));
req.num_vlan_tags = rte_cpu_to_le_32((uint32_t)vlan_count);
}
+   req.mask = rte_cpu_to_le_32(HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_BCAST

[dpdk-dev] [PATCH 22/23] bnxt: Add support to set VF rxmode

2017-05-17 Thread Ajit Khaparde
This patch adds support for configuring the VF L2 Rx settings.
The per-VF setting is maintained in bnxt_child_vf_info.l2_rx_mask.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c  | 20 +++-
 drivers/net/bnxt/bnxt_hwrm.h  |  2 ++
 drivers/net/bnxt/bnxt_rxq.c   | 13 ++--
 drivers/net/bnxt/bnxt_vnic.h  |  5 +++
 drivers/net/bnxt/rte_pmd_bnxt.c   | 51 +++
 drivers/net/bnxt/rte_pmd_bnxt.h   | 19 
 drivers/net/bnxt/rte_pmd_bnxt_version.map |  1 +
 7 files changed, 107 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index e16fb3a..f1dc3bb 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -227,10 +227,16 @@ int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp,
/* FIXME add multicast flag, when multicast adding options is supported
 * by ethtool.
 */
+   if (vnic->flags & BNXT_VNIC_INFO_BCAST)
+   mask = HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_BCAST;
+   if (vnic->flags & BNXT_VNIC_INFO_UNTAGGED)
+   mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_VLAN_NONVLAN;
if (vnic->flags & BNXT_VNIC_INFO_PROMISC)
-   mask = HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_PROMISCUOUS;
+   mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_PROMISCUOUS;
if (vnic->flags & BNXT_VNIC_INFO_ALLMULTI)
mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_ALL_MCAST;
+   if (vnic->flags & BNXT_VNIC_INFO_MCAST)
+   mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_MCAST;
if (vnic->mc_addr_cnt) {
mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_MCAST;
req.num_mc_entries = rte_cpu_to_le_32(vnic->mc_addr_cnt);
@@ -2337,6 +2343,18 @@ int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, 
uint16_t vf)
return rc;
 }
 
+void vf_vnic_set_rxmask_cb(struct bnxt_vnic_info *vnic, void *flagp)
+{
+   uint32_t *flag = flagp;
+
+   vnic->flags = *flag;
+}
+
+int bnxt_set_rx_mask_no_vlan(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   return bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic, 0, NULL);
+}
+
 int bnxt_hwrm_func_buf_rgtr(struct bnxt *bp)
 {
int rc = 0;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 7fc7b57..2c2c124 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -137,6 +137,8 @@ int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, 
uint16_t port,
uint8_t tunnel_type);
 void bnxt_free_tunnel_ports(struct bnxt *bp);
 int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, uint16_t vf);
+void vf_vnic_set_rxmask_cb(struct bnxt_vnic_info *vnic, void *flagp);
+int bnxt_set_rx_mask_no_vlan(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_vf_default_vnic_count(struct bnxt *bp, uint16_t vf);
 int bnxt_hwrm_func_vf_vnic_query_and_config(struct bnxt *bp, uint16_t vf,
void (*vnic_cb)(struct bnxt_vnic_info *, void *), void *cbdata,
diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c
index b0bbed1..bef48df 100644
--- a/drivers/net/bnxt/bnxt_rxq.c
+++ b/drivers/net/bnxt/bnxt_rxq.c
@@ -76,6 +76,7 @@ int bnxt_mq_rx_configure(struct bnxt *bp)
rc = -ENOMEM;
goto err_out;
}
+   vnic->flags |= BNXT_VNIC_INFO_BCAST;
STAILQ_INSERT_TAIL(&bp->ff_pool[0], vnic, next);
bp->nr_vnics++;
 
@@ -120,6 +121,9 @@ int bnxt_mq_rx_configure(struct bnxt *bp)
}
/* For each pool, allocate MACVLAN CFA rule & VNIC */
if (!pools) {
+   pools = RTE_MIN(bp->max_vnics,
+   RTE_MIN(bp->max_l2_ctx,
+RTE_MIN(bp->max_rsscos_ctx, ETH_64_POOLS)));
RTE_LOG(ERR, PMD,
"VMDq pool not set, defaulted to 64\n");
pools = ETH_64_POOLS;
@@ -137,6 +141,7 @@ int bnxt_mq_rx_configure(struct bnxt *bp)
rc = -ENOMEM;
goto err_out;
}
+   vnic->flags |= BNXT_VNIC_INFO_BCAST;
STAILQ_INSERT_TAIL(&bp->ff_pool[i], vnic, next);
bp->nr_vnics++;
 
@@ -177,6 +182,7 @@ int bnxt_mq_rx_configure(struct bnxt *bp)
rc = -ENOMEM;
goto err_out;
}
+   vnic->flags |= BNXT_VNIC_INFO_BCAST;
/* Partition the rx queues for the single pool */
for (i = 0; i < bp->rx_cp_nr_rings; i++) {
rxq = bp->eth_dev->data->rx_queues[i];
@@ -295,7 +301,7 @@ int bnxt_rx_queue_setup_op(struct rte_eth_dev *eth_dev,
int rc = 0;
 
if (!nb_desc || nb_desc > MAX_RX_DESC_CNT) {
-   RTE_LOG(ERR, PMD, "nb_desc %d is invalid", nb_desc);
+

[dpdk-dev] [PATCH 19/23] bnxt: add additional HWRM debug info to error messages

2017-05-17 Thread Ajit Khaparde
Add the cmd_err, opaque_0, and opaque_1 fields to HWRM error
messages. These allow better debugging of some classes of HWRM
errors.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 19 --
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 36 ++
 2 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index bea855e..b6fff91 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -137,7 +137,7 @@ static int bnxt_hwrm_send_message_locked(struct bnxt *bp, 
void *msg,
}
 
if (i >= HWRM_CMD_TIMEOUT) {
-   RTE_LOG(ERR, PMD, "Error sending msg %x\n",
+   RTE_LOG(ERR, PMD, "Error sending msg 0x%04x\n",
req->req_type);
goto err_ret;
}
@@ -174,7 +174,22 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void 
*msg, uint32_t msg_len)
} \
if (resp->error_code) { \
rc = rte_le_to_cpu_16(resp->error_code); \
-   RTE_LOG(ERR, PMD, "%s error %d\n", __func__, rc); \
+   if (resp->resp_len >= 16) { \
+   struct hwrm_err_output *tmp_hwrm_err_op = \
+   (void *)resp; \
+   RTE_LOG(ERR, PMD, \
+   "%s error %d:%d:%08x:%04x\n", \
+   __func__, \
+   rc, tmp_hwrm_err_op->cmd_err, \
+   rte_le_to_cpu_32(\
+   tmp_hwrm_err_op->opaque_0), \
+   rte_le_to_cpu_16(\
+   tmp_hwrm_err_op->opaque_1)); \
+   } \
+   else { \
+   RTE_LOG(ERR, PMD, \
+   "%s error %d\n", __func__, rc); \
+   } \
return rc; \
} \
}
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h 
b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 987ff53..84c53d7 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -8569,6 +8569,42 @@ struct hwrm_stat_ctx_clr_stats_output {
 */
 } __attribute__((packed));
 
+struct hwrm_err_output {
+   uint16_t error_code;
+   /*
+* Pass/Fail or error type Note: receiver to verify the in
+* parameters, and fail the call with an error when appropriate
+*/
+   uint16_t req_type;
+   /* This field returns the type of original request. */
+   uint16_t seq_id;
+   /* This field provides original sequence number of the command. */
+   uint16_t resp_len;
+   /*
+* This field is the length of the response in bytes. The last
+* byte of the response is a valid flag that will read as '1'
+* when the command has been completely written to memory.
+*/
+   uint32_t opaque_0;
+   /* debug info for this error response. */
+   uint16_t opaque_1;
+   /* debug info for this error response. */
+   uint8_t cmd_err;
+   /*
+* In the case of an error response, command specific error code
+* is returned in this field.
+*/
+   uint8_t valid;
+   /*
+* This field is used in Output records to indicate that the
+* output is completely written to RAM. This field should be
+* read as '1' to indicate that the output has been completely
+* written. When writing a command completion or response to an
+* internal processor, the order of writes has to be such that
+* this field is written last.
+*/
+} __attribute__((packed));
+
 /* Port Tx Statistics Formats  (408 bytes) */
 struct tx_port_stats {
uint64_t tx_64b_frames;
-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH 20/23] bnxt: reorg the query stats code

2017-05-17 Thread Ajit Khaparde
1) Use the hwrm_stat_ctx_query command to query statistics.
   This allows the driver to poll the statistics from the hardware
   on demand, instead of the current push model in which the hardware
   DMAs the stats to the host at fixed intervals.
2) Use rx_mbuf_alloc_fail to track mbuf allocation failures.
3) We were wrongly incrementing hwrm_cmd_seq in the bnxt_hwrm_stat_clear
   and bnxt_hwrm_stat_ctx_alloc functions. This patch fixes that.

Signed-off-by: Stephen Hurd 
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|   1 +
 drivers/net/bnxt/bnxt_ethdev.c |   1 +
 drivers/net/bnxt/bnxt_hwrm.c   |  45 --
 drivers/net/bnxt/bnxt_hwrm.h   |   2 +
 drivers/net/bnxt/bnxt_rxr.c|  16 +++--
 drivers/net/bnxt/bnxt_stats.c  |  59 +++---
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 108 +
 7 files changed, 170 insertions(+), 62 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index f25a1c4..387dab7 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -208,6 +208,7 @@ struct bnxt {
uint16_tvxlan_fw_dst_port_id;
uint16_tgeneve_fw_dst_port_id;
uint32_tfw_ver;
+   rte_atomic64_t  rx_mbuf_alloc_fail;
 };
 
 /*
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index d6c36b9..80b85ab 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1595,6 +1595,7 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
 
bp = eth_dev->data->dev_private;
 
+   rte_atomic64_init(&bp->rx_mbuf_alloc_fail);
bp->dev_stopped = 1;
 
if (bnxt_vf_pciid(pci_dev->id.device_id))
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index b6fff91..7808f28 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -832,13 +832,12 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr)
struct hwrm_stat_ctx_clr_stats_input req = {.req_type = 0 };
struct hwrm_stat_ctx_clr_stats_output *resp = bp->hwrm_cmd_resp_addr;
 
-   HWRM_PREP(req, STAT_CTX_CLR_STATS, -1, resp);
-
if (cpr->hw_stats_ctx_id == (uint32_t)HWRM_NA_SIGNATURE)
return rc;
 
+   HWRM_PREP(req, STAT_CTX_CLR_STATS, -1, resp);
+
req.stat_ctx_id = rte_cpu_to_le_16(cpr->hw_stats_ctx_id);
-   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
 
rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
 
@@ -856,9 +855,8 @@ int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
 
HWRM_PREP(req, STAT_CTX_ALLOC, -1, resp);
 
-   req.update_period_ms = rte_cpu_to_le_32(1000);
+   req.update_period_ms = rte_cpu_to_le_32(0);
 
-   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
req.stats_dma_addr =
rte_cpu_to_le_64(cpr->hw_stats_map);
 
@@ -881,7 +879,6 @@ int bnxt_hwrm_stat_ctx_free(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
HWRM_PREP(req, STAT_CTX_FREE, -1, resp);
 
req.stat_ctx_id = rte_cpu_to_le_16(cpr->hw_stats_ctx_id);
-   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
 
rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
 
@@ -2481,6 +2478,42 @@ int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, uint16_t 
target_id,
return rc;
 }
 
+int bnxt_hwrm_ctx_qstats(struct bnxt *bp, uint32_t cid, int idx,
+struct rte_eth_stats *stats)
+{
+   int rc = 0;
+   struct hwrm_stat_ctx_query_input req = {.req_type = 0};
+   struct hwrm_stat_ctx_query_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, STAT_CTX_QUERY, -1, resp);
+
+   req.stat_ctx_id = rte_cpu_to_le_32(cid);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   stats->q_ipackets[idx] = rte_le_to_cpu_64(resp->rx_ucast_pkts);
+   stats->q_ipackets[idx] += rte_le_to_cpu_64(resp->rx_mcast_pkts);
+   stats->q_ipackets[idx] += rte_le_to_cpu_64(resp->rx_bcast_pkts);
+   stats->q_ibytes[idx] = rte_le_to_cpu_64(resp->rx_ucast_bytes);
+   stats->q_ibytes[idx] += rte_le_to_cpu_64(resp->rx_mcast_bytes);
+   stats->q_ibytes[idx] += rte_le_to_cpu_64(resp->rx_bcast_bytes);
+
+   stats->q_opackets[idx] = rte_le_to_cpu_64(resp->tx_ucast_pkts);
+   stats->q_opackets[idx] += rte_le_to_cpu_64(resp->tx_mcast_pkts);
+   stats->q_opackets[idx] += rte_le_to_cpu_64(resp->tx_bcast_pkts);
+   stats->q_obytes[idx] = rte_le_to_cpu_64(resp->tx_ucast_bytes);
+   stats->q_obytes[idx] += rte_le_to_cpu_64(resp->tx_mcast_bytes);
+   stats->q_obytes[idx] += rte_le_to_cpu_64(resp->tx_bcast_bytes);
+
+   stats->q_errors[idx] = rte_le_to_cpu_64(resp->rx_err_pkts);
+   stats->q_errors[idx] += rte_le_to_cpu_64(resp->tx_err_pkts);
+   stats->q_errors[idx] += rte_le_to_cpu_64(resp->rx_dr

[dpdk-dev] [PATCH 23/23] bnxt: add code to support vlan_pvid_set dev_op

2017-05-17 Thread Ajit Khaparde
This patch adds code to support the vlan_pvid_set dev_op.

Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 21 +
 drivers/net/bnxt/bnxt_hwrm.c   | 31 +++
 drivers/net/bnxt/bnxt_hwrm.h   |  1 +
 3 files changed, 53 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 80b85ab..a8484bb 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1477,6 +1477,26 @@ static int bnxt_mtu_set_op(struct rte_eth_dev *eth_dev, 
uint16_t new_mtu)
return rc;
 }
 
+static int
+bnxt_vlan_pvid_set_op(struct rte_eth_dev *dev, uint16_t pvid, int on)
+{
+   struct bnxt *bp = (struct bnxt *)dev->data->dev_private;
+   uint16_t vlan = bp->vlan;
+   int rc;
+
+   if (BNXT_NPAR_PF(bp) || BNXT_VF(bp)) {
+   RTE_LOG(ERR, PMD,
+   "PVID cannot be modified for this function\n");
+   return -ENOTSUP;
+   }
+   bp->vlan = on ? pvid : 0;
+
+   rc = bnxt_hwrm_set_default_vlan(bp, 0, 0);
+   if (rc)
+   bp->vlan = vlan;
+   return rc;
+}
+
 /*
  * Initialization
  */
@@ -1512,6 +1532,7 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.udp_tunnel_port_del  = bnxt_udp_tunnel_port_del_op,
.vlan_filter_set = bnxt_vlan_filter_set_op,
.vlan_offload_set = bnxt_vlan_offload_set_op,
+   .vlan_pvid_set = bnxt_vlan_pvid_set_op,
.mtu_set = bnxt_mtu_set_op,
.mac_addr_set = bnxt_set_default_mac_addr_op,
.xstats_get = bnxt_dev_xstats_get_op,
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index f1dc3bb..f3a4483 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2434,6 +2434,37 @@ int bnxt_hwrm_func_bw_cfg(struct bnxt *bp, uint16_t vf,
return rc;
 }
 
+int bnxt_hwrm_set_default_vlan(struct bnxt *bp, int vf, uint8_t is_vf)
+{
+   struct hwrm_func_cfg_input req = {0};
+   struct hwrm_func_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+   uint16_t dflt_vlan, fid;
+   uint32_t func_cfg_flags;
+   int rc = 0;
+
+   HWRM_PREP(req, FUNC_CFG, -1, resp);
+
+   if (is_vf) {
+   dflt_vlan = bp->pf.vf_info[vf].dflt_vlan;
+   fid = bp->pf.vf_info[vf].fid;
+   func_cfg_flags = bp->pf.vf_info[vf].func_cfg_flags;
+   } else {
+   fid = rte_cpu_to_le_16(0x);
+   func_cfg_flags = bp->pf.func_cfg_flags;
+   dflt_vlan = bp->vlan;
+   }
+
+   req.flags = rte_cpu_to_le_32(func_cfg_flags);
+   req.fid = rte_cpu_to_le_16(fid);
+   req.enables |= rte_cpu_to_le_32(HWRM_FUNC_CFG_INPUT_ENABLES_DFLT_VLAN);
+   req.dflt_vlan = rte_cpu_to_le_16(dflt_vlan);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_set_vf_vlan(struct bnxt *bp, int vf)
 {
struct hwrm_func_cfg_input req = {0};
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 2c2c124..28203b7 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -136,6 +136,7 @@ int bnxt_hwrm_tunnel_dst_port_alloc(struct bnxt *bp, 
uint16_t port,
 int bnxt_hwrm_tunnel_dst_port_free(struct bnxt *bp, uint16_t port,
uint8_t tunnel_type);
 void bnxt_free_tunnel_ports(struct bnxt *bp);
+int bnxt_hwrm_set_default_vlan(struct bnxt *bp, int vf, uint8_t is_vf);
 int bnxt_hwrm_func_cfg_vf_set_flags(struct bnxt *bp, uint16_t vf);
 void vf_vnic_set_rxmask_cb(struct bnxt_vnic_info *vnic, void *flagp);
 int bnxt_set_rx_mask_no_vlan(struct bnxt *bp, struct bnxt_vnic_info *vnic);
-- 
2.10.1 (Apple Git-78)



[dpdk-dev] [PATCH] net/i40e/base: fix TX error stats on VF

2017-05-17 Thread Wenzhuo Lu
Unfortunately, the datasheet has a mistake: the address stride of
the per-VSI TX error counter (GLV_TEPC) is wrong.

Fixes: 8db9e2a1b232 ("i40e: base driver")
CC: sta...@dpdk.org

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/i40e/base/i40e_register.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/i40e/base/i40e_register.h 
b/drivers/net/i40e/base/i40e_register.h
index 3a305b6..b150fbd 100644
--- a/drivers/net/i40e/base/i40e_register.h
+++ b/drivers/net/i40e/base/i40e_register.h
@@ -2805,7 +2805,7 @@ CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 
PROCUREMENT OF
 #define I40E_GLV_RUPP_MAX_INDEX  383
 #define I40E_GLV_RUPP_RUPP_SHIFT 0
 #define I40E_GLV_RUPP_RUPP_MASK  I40E_MASK(0x, 
I40E_GLV_RUPP_RUPP_SHIFT)
-#define I40E_GLV_TEPC(_VSI)  (0x00344000 + ((_VSI) * 4)) /* _i=0...383 */ 
/* Reset: CORER */
+#define I40E_GLV_TEPC(_VSI)  (0x00344000 + ((_VSI) * 8)) /* _i=0...383 */ 
/* Reset: CORER */
 #define I40E_GLV_TEPC_MAX_INDEX  383
 #define I40E_GLV_TEPC_TEPC_SHIFT 0
 #define I40E_GLV_TEPC_TEPC_MASK  I40E_MASK(0x, 
I40E_GLV_TEPC_TEPC_SHIFT)
-- 
1.9.3



Re: [dpdk-dev] [PATCH] examples/performance-thread: add arm64 support

2017-05-17 Thread Jianbo Liu
On 18 May 2017 at 02:44, Jerin Jacob  wrote:
> -Original Message-
>> Date: Wed, 17 May 2017 11:19:49 -0700
>> From: Ashwin Sekhar T K 
>> To: jerin.ja...@caviumnetworks.com, john.mcnam...@intel.com,
>>  jianbo@linaro.org
>> Cc: dev@dpdk.org, Ashwin Sekhar T K 
>> Subject: [dpdk-dev] [PATCH] examples/performance-thread: add arm64 support
>> X-Mailer: git-send-email 2.12.2
>>
>> Updated Makefile to allow compilation for arm64 architecture.
>>
>> Moved the code for setting the initial stack to architecture specific
>> directory.
>
> Please split the patch to two
> - "arch_set_stack" abstraction and associated x86 change
> - arm64 support

There is a lot of redundant code in l3fwd and l3fwd-thread; I think
it's possible to merge them.

>
> Thanks Ashwin.
>
> I think this may be the last feature to make arm64 on par with x86 features
> supported in DPDK.
>
> /Jerin


Re: [dpdk-dev] [PATCH 5/6] eal/ppc64: rte pause implementation for ppc64

2017-05-17 Thread Chao Zhu
> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: May 11, 2017 18:11
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; jianbo@linaro.org; vikto...@rehivetech.com;
> Jerin Jacob ; Chao Zhu
> 
> Subject: [dpdk-dev] [PATCH 5/6] eal/ppc64: rte pause implementation for
ppc64
> 
> The patch does not provide any functional change for ppc64 with respect to
> existing rte_pause() definition.
> 
> CC: Chao Zhu 
> Signed-off-by: Jerin Jacob 
> ---
>  .../common/include/arch/ppc_64/rte_pause.h | 51
> ++
>  1 file changed, 51 insertions(+)
>  create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_pause.h
> 
> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_pause.h
> b/lib/librte_eal/common/include/arch/ppc_64/rte_pause.h
> new file mode 100644
> index 0..489cf2f13
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_pause.h
> @@ -0,0 +1,51 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) Cavium. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above
copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Cavium nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
> FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
> COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
> INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
> BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
> LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
> AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
> TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> DAMAGE.
> + */
> +
> +#ifndef _RTE_PAUSE_PPC64_H_
> +#define _RTE_PAUSE_PPC64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_pause.h"
> +
> +static inline void rte_pause(void)
> +{
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_PAUSE_PPC64_H_ */
> --
> 2.12.2

Acked-by: Chao Zhu 


