Re: [dpdk-dev] 17.11.4 patches review and test (RC2)

2018-08-27 Thread Marco Varlese
Hi,

On Fri, 2018-08-24 at 18:18 -0700, Yongseok Koh wrote:
> Hi all,
> 
> Here is a list of patches targeted for LTS release 17.11.4. Please help review
> and test. The planned date for the final release is Aug 30. Before that,
> please
> shout if anyone has objections with these patches being applied.
> 
> Also for the companies committed to running regression tests, please run the
> tests and report any issue before the release date.
I retested -rc2 and things look good now.
> 
> A release candidate tarball can be found at:
> 
> https://dpdk.org/browse/dpdk-stable/tag/?id=v17.11.4-rc2
> 
> These patches are located at branch 17.11 of dpdk-stable repo:
> 
> https://dpdk.org/browse/dpdk-stable/
> 
> 
> Thanks,
> Yongseok
Cheers,
Marco
> 
> ---
> Adrien Mazarguil (2):
>   maintainers: update for Mellanox PMDs
>   net/mlx4: fix minor resource leak during init
> 
> Ajit Khaparde (7):
>   net/bnxt: fix HW Tx checksum offload check
>   net/bnxt: fix set MTU
>   net/bnxt: fix Rx ring count limitation
>   net/bnxt: fix memory leaks in NVM commands
>   net/bnxt: fix lock release on NVM write failure
>   net/bnxt: check access denied for HWRM commands
>   net/bnxt: fix RETA size
> 
> Alejandro Lucero (6):
>   net/nfp: fix field initialization in Tx descriptor
>   mem: add function for checking memsegs IOVAs addresses
>   bus/pci: use IOVAs check when setting IOVA mode
>   mem: use address hint for mapping hugepages
>   net/nfp: check hugepages IOVAs based on DMA mask
>   net/nfp: support IOVA VA mode
> 
> Alok Makhariya (1):
>   bus/dpaa: fix phandle support for Linux 4.16
> 
> Anatoly Burakov (8):
>   eal/linux: fix invalid syntax in interrupts
>   eal/linux: fix uninitialized value
>   test: fix EAL flags autotest on FreeBSD
>   test: fix result printing
>   test: fix code on report
>   test: make autotest runner python 2/3 compliant
>   test: print autotest categories
>   test: improve filtering
> 
> Andrew Rybchenko (3):
>   net/sfc: cut non VLAN ID bits from TCI
>   net/sfc: fix assert in set multicast address list
>   net/sfc: handle unknown L3 packet class in EF10 event parser
> 
> Andy Green (1):
>   ring: fix sign conversion warning
> 
> Beilei Xing (3):
>   net/i40e: fix shifts of 32-bit value
>   net/i40e: fix packet type parsing with DDP
>   net/i40e: fix setting TPID with AQ command
> 
> Bruce Richardson (2):
>   examples/exception_path: fix out-of-bounds read
>   mk: fix permissions when using make install
> 
> Chas Williams (2):
>   net/bonding: always update bonding link status
>   net/bonding: do not clear active slave count
> 
> Dan Gora (1):
>   kni: fix crash with null name
> 
> Daria Kolistratova (1):
>   net/ena: fix SIGFPE with 0 Rx queue
> 
> Dariusz Stojaczyk (1):
>   eal: fix return codes on thread naming failure
> 
> David Marchand (1):
>   net/bnxt: add missing ids in xstats
> 
> Drocula Lambda (1):
>   kni: fix build on RHEL 7.5
> 
> Ferruh Yigit (2):
>   kni: fix build with gcc 8.1
>   net/thunderx: fix build with gcc optimization on
> 
> Fiona Trahe (1):
>   crypto/qat: fix checks for 3GPP algo bit params
> 
> Gavin Hu (3):
>   mk: fix cross build
>   net/dpaa2: remove loop for unused pool entries
>   maintainers: claim maintainership for ARM v7 and v8
> 
> Haiyue Wang (1):
>   net/i40e: workaround performance degradation
> 
> Harry van Haaren (1):
>   event: fix ring init failure handling
> 
> He Zhe (1):
>   examples: fix strncpy error for GCC8
> 
> Hemant Agrawal (2):
>   test/crypto: fix device id when stopping port
>   bus/dpaa: fix buffer offset setting in FMAN
> 
> Hyong Youb Kim (2):
>   net/enic: do not overwrite admin Tx queue limit
>   net/enic: add devarg to specify ingress VLAN rewrite mode
> 
> Ido Goshen (1):
>   net/pcap: fix multiple queues
> 
> Jananee Parthasarathy (1):
>   mk: update targets for classified tests
> 
> Jay Ding (1):
>   net/bnxt: check for invalid vNIC id
> 
> Jerin Jacob (2):
>   ethdev: fix queue statistics mapping documentation
>   eal: fix bitmap documentation
> 
> Kiran Kumar (2):
>   net/bonding: fix MAC address reset
>   net/thunderx: avoid sq door bell write on zero packet
> 
> Konstantin Ananyev (3):
>   examples/ipsec-secgw: fix IPv4 checksum at Tx
>   examples/ipsec-secgw: fix bypass rule processing
>   app/testpmd: fix DCB config
> 
> Matan Azrad (1):
>   net/tap: fix zeroed flow mask configurations
> 
> Maxime Coquelin (4):
>   vhost: improve dirty pages logging performance
>   vhost: fix missing increment of log cache count
>   vhost: flush IOTLB cache on new mem table handling
>   vhost: retranslate vring addr when memory table changes
> 
> Moti Haimovsky (2):
>   net/mlx5: fix build with old kernels
>   net/mlx4: check RSS queues number limitation
> 

[dpdk-dev] [PATCH v2] bus/fslmc: fix the undefined ref of rte dpaa2 memsegs

2018-08-27 Thread Hemant Agrawal
This patch fixes the undefined reference to rte_dpaa2_memsegs
when compiled in shared lib mode with EXTRA_CFLAGS="-g -O0".

Bugzilla ID: 61
Fixes: 365fb925d3b3 ("bus/fslmc: optimize physical to virtual address search")
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
Reported-by: Keith Wiles 
---
v2: add bugzilla id

 drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c            | 7 +++
 drivers/bus/fslmc/rte_bus_fslmc_version.map         | 1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c            | 7 ---
 drivers/mempool/dpaa2/rte_mempool_dpaa2_version.map | 1 -
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
index 39c5adf..db49d63 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c
@@ -28,6 +28,13 @@
 #include "portal/dpaa2_hw_pvt.h"
 #include "portal/dpaa2_hw_dpio.h"
 
+/* List of all the memseg information locally maintained in dpaa2 driver. This
+ * is to optimize the PA_to_VA searches until a better mechanism (algo) is
+ * available.
+ */
+struct dpaa2_memseg_list rte_dpaa2_memsegs
+   = TAILQ_HEAD_INITIALIZER(rte_dpaa2_memsegs);
+
 TAILQ_HEAD(dpbp_dev_list, dpaa2_dpbp_dev);
 static struct dpbp_dev_list dpbp_dev_list
= TAILQ_HEAD_INITIALIZER(dpbp_dev_list); /*!< DPBP device list */
diff --git a/drivers/bus/fslmc/rte_bus_fslmc_version.map b/drivers/bus/fslmc/rte_bus_fslmc_version.map
index fe45a11..b4a8817 100644
--- a/drivers/bus/fslmc/rte_bus_fslmc_version.map
+++ b/drivers/bus/fslmc/rte_bus_fslmc_version.map
@@ -114,5 +114,6 @@ DPDK_18.05 {
dpdmai_open;
dpdmai_set_rx_queue;
rte_dpaa2_free_dpci_dev;
+   rte_dpaa2_memsegs;
 
 } DPDK_18.02;
diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
index 7d0435f..84ff128 100644
--- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
+++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
@@ -33,13 +33,6 @@
 struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID];
 static struct dpaa2_bp_list *h_bp_list;
 
-/* List of all the memseg information locally maintained in dpaa2 driver. This
- * is to optimize the PA_to_VA searches until a better mechanism (algo) is
- * available.
- */
-struct dpaa2_memseg_list rte_dpaa2_memsegs
-   = TAILQ_HEAD_INITIALIZER(rte_dpaa2_memsegs);
-
 /* Dynamic logging identified for mempool */
 int dpaa2_logtype_mempool;
 
diff --git a/drivers/mempool/dpaa2/rte_mempool_dpaa2_version.map b/drivers/mempool/dpaa2/rte_mempool_dpaa2_version.map
index b9d996a..b45e7a9 100644
--- a/drivers/mempool/dpaa2/rte_mempool_dpaa2_version.map
+++ b/drivers/mempool/dpaa2/rte_mempool_dpaa2_version.map
@@ -3,7 +3,6 @@ DPDK_17.05 {
 
rte_dpaa2_bpid_info;
rte_dpaa2_mbuf_alloc_bulk;
-   rte_dpaa2_memsegs;
 
local: *;
 };
-- 
2.7.4
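
For readers unfamiliar with the pattern, here is a minimal stand-alone sketch of a locally maintained segment list: one TAILQ head defined exactly once (which is what the patch is about) and a linear PA-to-VA search over it. The struct layout and the names `memseg_entry`, `memsegs`, `memseg_add` and `pa_to_va` are hypothetical and are not taken from the dpaa2 driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical entry: one physically contiguous segment and its mapping. */
struct memseg_entry {
	TAILQ_ENTRY(memseg_entry) next;
	uint64_t phys_addr; /* start of the segment in physical space */
	void *virt_addr;    /* where that segment is mapped */
	size_t len;         /* segment length in bytes */
};

TAILQ_HEAD(memseg_list, memseg_entry);

/* Defined in one translation unit only; other files would carry an
 * extern declaration and, in a shared build, rely on the symbol export. */
struct memseg_list memsegs = TAILQ_HEAD_INITIALIZER(memsegs);

/* Append a segment to the list. */
static void
memseg_add(struct memseg_entry *e)
{
	assert(e != NULL);
	TAILQ_INSERT_TAIL(&memsegs, e, next);
}

/* Linear search: return the virtual address for pa, or NULL if unmapped. */
static void *
pa_to_va(uint64_t pa)
{
	struct memseg_entry *e;

	TAILQ_FOREACH(e, &memsegs, next) {
		if (pa >= e->phys_addr && pa - e->phys_addr < e->len)
			return (char *)e->virt_addr + (pa - e->phys_addr);
	}
	return NULL;
}
```

Because the head has external linkage, whichever library defines it also has to export it in its version map, which is why the patch moves the definition and the map entry together.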



Re: [dpdk-dev] [PATCH] net/virtio-user: fix memory hotplug support

2018-08-27 Thread Burakov, Anatoly

On 24-Aug-18 4:51 PM, Sean Harte wrote:

On Fri, 24 Aug 2018 at 16:19, Burakov, Anatoly
 wrote:


On 24-Aug-18 11:41 AM, Burakov, Anatoly wrote:

On 24-Aug-18 10:35 AM, Tiwei Bie wrote:

On Fri, Aug 24, 2018 at 09:59:42AM +0100, Burakov, Anatoly wrote:

On 24-Aug-18 5:49 AM, Tiwei Bie wrote:

On Thu, Aug 23, 2018 at 03:01:30PM +0100, Burakov, Anatoly wrote:

On 23-Aug-18 12:19 PM, Sean Harte wrote:

On Thu, 23 Aug 2018 at 10:05, Burakov, Anatoly
 wrote:


On 23-Aug-18 3:57 AM, Tiwei Bie wrote:

Deadlock can occur when allocating memory if a vhost-kernel
based virtio-user device is in use. Besides, it's possible
to have much more than 64 non-contiguous hugepage backed
memory regions due to the memory hotplug, which may cause
problems when handling VHOST_SET_MEM_TABLE request. A better
solution is to have the virtio-user pass all the VA ranges
reserved by DPDK to vhost-kernel.

Bugzilla ID: 81
Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
Cc: sta...@dpdk.org

Reported-by: Seán Harte 
Signed-off-by: Tiwei Bie 
---
  drivers/net/virtio/virtio_user/vhost_kernel.c | 64
---
  1 file changed, 27 insertions(+), 37 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
index b2444096c..49bd1b821 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -70,41 +70,12 @@ static uint64_t vhost_req_user_to_kernel[] = {
  [VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
  };

-struct walk_arg {
- struct vhost_memory_kernel *vm;
- uint32_t region_nr;
-};
-static int
-add_memory_region(const struct rte_memseg_list *msl __rte_unused,
- const struct rte_memseg *ms, size_t len, void *arg)
-{
- struct walk_arg *wa = arg;
- struct vhost_memory_region *mr;
- void *start_addr;
-
- if (wa->region_nr >= max_regions)
- return -1;
-
- mr = &wa->vm->regions[wa->region_nr++];
- start_addr = ms->addr;
-
- mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
- mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
- mr->memory_size = len;
- mr->mmap_offset = 0;
-
- return 0;
-}
-
-/* By default, vhost kernel module allows 64 regions, but DPDK allows
- * 256 segments. As a relief, below function merges those virtually
- * adjacent memsegs into one region.
- */
  static struct vhost_memory_kernel *
  prepare_vhost_memory_kernel(void)
  {
+ struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
  struct vhost_memory_kernel *vm;
- struct walk_arg wa;
+ uint32_t region_nr = 0, i;

  vm = malloc(sizeof(struct vhost_memory_kernel) +
  max_regions *
@@ -112,15 +83,34 @@ prepare_vhost_memory_kernel(void)
  if (!vm)
  return NULL;

- wa.region_nr = 0;
- wa.vm = vm;
+ for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
+ struct rte_memseg_list *msl = &mcfg->memsegs[i];
+ struct vhost_memory_region *mr;
+ void *start_addr;
+ uint64_t len;


There is a rte_memseg_list_walk() - please do not iterate over memseg
lists manually.



rte_memseg_list_walk() can't be used here because
prepare_vhost_memory_kernel() is sometimes called from a memory
callback. It will then hang trying to get a read lock on
memory_hotplug_lock.


OK, so use rte_memseg_list_walk_thread_unsafe().


I don't think the rte_memseg_list_walk_thread_unsafe() function is
appropriate because prepare_vhost_memory_kernel() may not always be
called from a memory callback.


And how is this different? What you're doing here is identical to calling
rte_memseg_list_walk_thread_unsafe() (that's precisely what it does
internally - check the code!), except that you're doing it manually and not
using DPDK API, which makes your code dependent on internals of DPDK's
memory implementation.

So, this function may or may not be called from a callback, but you're using
thread-unsafe walk anyway. I think you should call either thread-safe or
thread-unsafe version depending on whether you're being called from a
callback or not.
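
The suggestion above can be modelled without DPDK headers. The following is a stand-alone sketch, not the EAL implementation: `hotplug_lock_held` is a hypothetical flag standing in for memory_hotplug_lock, and `walk_locked()` / `walk_unlocked()` stand in for rte_memseg_list_walk() and rte_memseg_list_walk_thread_unsafe(). Using a flag lets the model report a re-entrant acquisition instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_LISTS 4               /* hypothetical number of memseg lists */
#define ERR_WOULD_DEADLOCK (-2) /* hypothetical error code for the model */

typedef int (*walk_cb_t)(int list_idx, void *arg);

/* Stand-in for the hotplug lock; true while some caller holds it. */
static bool hotplug_lock_held;

/* Thread-unsafe walk: the caller guarantees the lock is already held. */
static int
walk_unlocked(walk_cb_t cb, void *arg)
{
	assert(cb != NULL);
	for (int i = 0; i < N_LISTS; i++) {
		int ret = cb(i, arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Thread-safe walk: takes the lock itself, so it must not be re-entered. */
static int
walk_locked(walk_cb_t cb, void *arg)
{
	int ret;

	if (hotplug_lock_held)
		return ERR_WOULD_DEADLOCK; /* a real lock would hang here */
	hotplug_lock_held = true;
	ret = walk_unlocked(cb, arg);
	hotplug_lock_held = false;
	return ret;
}

/* Pick the variant from the calling context, as proposed in the review. */
static int
walk_for_context(bool in_mem_callback, walk_cb_t cb, void *arg)
{
	return in_mem_callback ? walk_unlocked(cb, arg) : walk_locked(cb, arg);
}

/* Example callback: count the lists that were visited. */
static int
count_lists(int list_idx, void *arg)
{
	(void)list_idx;
	(*(int *)arg)++;
	return 0;
}
```

The model only covers the single-lock case; it says nothing about ordering against other locks the caller may hold.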


Hmm, the real case is a bit more tricky. Even if this
function isn't called from memory event callbacks, the
"thread-safe" version list_walk() still can't be used.
Because deadlock may happen.

In virtio-user device start, it needs to do SET_MEM_TABLE
for the vhost-backend. And to make sure that preparing and
setting the memory table is atomic (and also to protect the
device state), it needs a lock. So if it calls "thread-safe"
version list_walk(), there will be two locks taken in
below order:

- the virtio-user device lock (taken by virtio_user_start_device());
- the memory hotplug lock (taken by rte_memseg_list_walk());

And above locks will be released in below order:

- the memory hotplug lock (released by rte_memseg_list_walk());
- the virtio-user device lock (released b
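
The ordering concern can be illustrated with a small stand-alone model. It assumes, based on the earlier remark in this thread, that a memory event callback runs with the hotplug lock already held and then needs the device lock; the message above is cut off before it spells that out, so treat the second path as an assumption. All names (`LOCK_DEV`, `LOCK_HOTPLUG`, `record_path`, `has_inversion`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lock_id { LOCK_DEV, LOCK_HOTPLUG, LOCK_COUNT };

/* before[a][b] is true if some code path took lock a and then lock b. */
struct lock_graph {
	bool before[LOCK_COUNT][LOCK_COUNT];
};

/* Record the acquisition order used by one code path. */
static void
record_path(struct lock_graph *g, const enum lock_id *seq, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(seq[i] < LOCK_COUNT);
		for (size_t j = i + 1; j < n; j++)
			g->before[seq[i]][seq[j]] = true;
	}
}

/* Two paths that take the same pair of locks in opposite orders can
 * deadlock against each other (the classic AB/BA inversion). */
static bool
has_inversion(const struct lock_graph *g)
{
	for (int a = 0; a < LOCK_COUNT; a++)
		for (int b = a + 1; b < LOCK_COUNT; b++)
			if (g->before[a][b] && g->before[b][a])
				return true;
	return false;
}
```

Recording the device-start path (device lock, then hotplug lock) alone reports no inversion; adding the assumed callback path (hotplug lock, then device lock) does.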

Re: [dpdk-dev] [RFC 1/1] eventdev: add distributed software (DSW) event device

2018-08-27 Thread Mattias Rönnblom

On 2018-07-22 13:32, Jerin Jacob wrote:


+static void
+dsw_stop(struct rte_eventdev *dev __rte_unused)
+{


You may implement, eventdev_stop_flush_t callback to free up the
outstanding events in the eventdev.



Is this support mandatory, or is it OK to leave it to the user to empty 
the machinery before calling stop in the initial driver version?


I can't find any other event device supporting the callback.

In DSW, the events can be a little here-and-there - in the output 
buffers, in the pause buffer, and on the input rings.


That said, assuming the worker lcore threads have stopped using the 
device and issued the appropriate barriers, it should be possible to 
round up the events from the thread running 'rte_event_dev_stop'.


Re: [dpdk-dev] 18.05.1 patches review and test

2018-08-27 Thread Christian Ehrhardt
On Wed, Aug 22, 2018 at 9:26 AM Christian Ehrhardt <
christian.ehrha...@canonical.com> wrote:

> Hi all,
>
> Here is a list of patches targeted for stable release 18.05.1. Please
> help review and test. The planned date for the final release is August,
> 29th. Before that, please shout if anyone has objections with these
> patches being applied.
>

There was neither positive nor negative feedback on 18.05.1-rc1 so far.
Maybe 17.11.x priorities and general PTO time just beat 18.05 - which is
fine to some extent.
The only private message I got was about one party needing some extra time.
For all of the above I will do two things:
1. the deadline to get back with results on 18.05.1-rc1 is extended to
Tuesday the 4th of September
2. I'd highly appreciate feedback of people involved that intend to test it
so I know what to wait for (or not)

> Also for the companies committed to running regression tests,
> please run the tests and report any issue before the release date.
>
> A release candidate tarball can be found at:
>
> https://dpdk.org/browse/dpdk-stable/tag/?id=v18.05.1-rc1
>
> These patches are located at branch 18.05 of dpdk-stable repo:
>
> https://git.dpdk.org/dpdk-stable/log/?h=18.05
>
> Thanks.
>
> Christian Ehrhardt 
>
> ---
> Adrien Mazarguil (8):
>   app/testpmd: fix crash when attaching a device
>   net/mlx4: fix minor resource leak during init
>   net/mlx5: fix errno object in probe function
>   net/mlx5: fix missing errno in probe function
>   net/mlx5: fix error message in probe function
>   net/mlx5: fix invalid error check
>   maintainers: update for Mellanox PMDs
>   net/mlx5: fix invalid network interface index
>
> Ajit Khaparde (11):
>   net/bnxt: fix clear port stats
>   net/bnxt: fix close operation
>   net/bnxt: fix HW Tx checksum offload check
>   net/bnxt: check filter type before clearing it
>   net/bnxt: fix set MTU
>   net/bnxt: fix incorrect IO address handling in Tx
>   net/bnxt: fix Rx ring count limitation
>   net/bnxt: fix memory leaks in NVM commands
>   net/bnxt: fix lock release on NVM write failure
>   net/bnxt: check access denied for HWRM commands
>   net/bnxt: fix RETA size
>
> Alejandro Lucero (2):
>   net/nfp: fix unused header reference
>   net/nfp: fix field initialization in Tx descriptor
>
> Alok Makhariya (1):
>   bus/dpaa: fix phandle support for Linux 4.16
>
> Anatoly Burakov (14):
>   ipc: fix locking while sending messages
>   mem: fix alignment of requested virtual areas
>   eal/bsd: fix memory segment index display
>   malloc: fix pad erasing
>   eal/linux: fix invalid syntax in interrupts
>   eal/linux: fix uninitialized value
>   vfio: fix uninitialized variable
>   malloc: do not skip pad on free
>   test: fix EAL flags autotest on FreeBSD
>   test: fix result printing
>   test: fix code on report
>   test: make autotest runner python 2/3 compliant
>   test: print autotest categories
>   test: improve filtering
>
> Andrew Rybchenko (7):
>   net/sfc: cut non VLAN ID bits from TCI
>   net/sfc: discard packets with bad CRC on EF10 ESSB Rx
>   net/sfc: fix double-free in EF10 ESSB Rx queue purge
>   net/sfc: move Rx checksum offload check to device level
>   net/sfc: fix Rx queue offloads reporting in queue info
>   net/sfc: fix assert in set multicast address list
>   net/sfc: handle unknown L3 packet class in EF10 event parser
>
> Andy Green (2):
>   ring: fix declaration after statement
>   ring: fix sign conversion warning
>
> Beilei Xing (5):
>   net/i40e: fix shifts of 32-bit value
>   net/i40e: fix PPPoL2TP packet type parsing
>   net/i40e: fix packet type parsing with DDP
>   net/i40e: fix setting TPID with AQ command
>   net/i40e: fix device parameter parsing
>
> Bruce Richardson (3):
>   eal: fix error message for unsupported platforms
>   examples/exception_path: fix out-of-bounds read
>   mk: fix permissions when using make install
>
> Chas Williams (2):
>   net/bonding: always update bonding link status
>   net/bonding: do not clear active slave count
>
> Christian Ehrhardt (2):
>   FIXUP: net/mlx5: fix invalid network interface index
>   version: 18.05.1-rc1
>
> Damjan Marion (1):
>   net/i40e: do not reset device info data
>
> Dan Gora (1):
>   kni: fix crash with null name
>
> Daria Kolistratova (1):
>   net/ena: fix SIGFPE with 0 Rx queue
>
> Dariusz Stojaczyk (7):
>   mem: do not leave unmapped holes in EAL memory area
>   mem: do not unmap overlapping region on mmap failure
>   mem: avoid crash on memseg query with invalid address
>   mem: fix alignment requested with --base-virtaddr
>   mem: do not use --base-virtaddr in secondary processes
>   eal: fix return codes on thread naming failure
>   eal: fix return codes on control thread failure
>
> David Marchand (1):
>

Re: [dpdk-dev] [PATCH v2] bus/fslmc: fix the undefined ref of rte dpaa2 memsegs

2018-08-27 Thread Shreyansh Jain

On Monday 27 August 2018 02:22 PM, Hemant Agrawal wrote:

This patch fixes the undefined reference to rte_dpaa2_memsegs
when compiled in shared lib mode with EXTRA_CFLAGS="-g -O0".

Bugzilla ID: 61
Fixes: 365fb925d3b3 ("bus/fslmc: optimize physical to virtual address search")
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
Reported-by: Keith Wiles 
---
v2: add bugzilla id


Acked-by: Shreyansh Jain 



[dpdk-dev] [PATCH] app/testpmd: Optimize membuf pool allocation

2018-08-27 Thread Phil Yang
By default, testpmd creates a membuf pool for every NUMA node and
ignores the EAL configuration.

Count the available NUMA nodes according to the EAL core mask or core
list configuration, and create membuf pools only for those nodes.

Fixes: d5aeab6542f ("app/testpmd: fix mempool creation by socket id")

Signed-off-by: Phil Yang 
---
 app/test-pmd/testpmd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index ee48db2..a56af2b 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -476,6 +476,8 @@ set_default_fwd_lcores_config(void)
 
nb_lc = 0;
for (i = 0; i < RTE_MAX_LCORE; i++) {
+   if (!rte_lcore_is_enabled(i))
+   continue;
sock_num = rte_lcore_to_socket_id(i);
if (new_socket_id(sock_num)) {
if (num_sockets >= RTE_MAX_NUMA_NODES) {
@@ -485,8 +487,6 @@ set_default_fwd_lcores_config(void)
}
socket_ids[num_sockets++] = sock_num;
}
-   if (!rte_lcore_is_enabled(i))
-   continue;
if (i == rte_get_master_lcore())
continue;
fwd_lcores_cpuids[nb_lc++] = i;
-- 
2.7.4
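
The effect of moving the rte_lcore_is_enabled() check can be reproduced in isolation. The sketch below is a stand-alone model, not testpmd code: the arrays `lcore_enabled` and `lcore_socket` are hypothetical stand-ins for what rte_lcore_is_enabled() and rte_lcore_to_socket_id() would report on a two-socket machine, and `count_sockets()` mirrors the loop with the enabled check either before socket discovery (patched) or absent from it (original).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LCORE 8 /* hypothetical, stands in for RTE_MAX_LCORE */
#define MAX_NODES 4 /* hypothetical, stands in for RTE_MAX_NUMA_NODES */

/* Hypothetical topology: lcores 0-3 on socket 0, lcores 4-7 on socket 1,
 * with only lcores 0 and 1 enabled by the EAL core list. */
static const bool lcore_enabled[MAX_LCORE] = {
	true, true, false, false, false, false, false, false
};
static const unsigned int lcore_socket[MAX_LCORE] = {
	0, 0, 0, 0, 1, 1, 1, 1
};

/* Return true if id is already present in ids[0..n-1]. */
static bool
seen(const unsigned int *ids, size_t n, unsigned int id)
{
	for (size_t i = 0; i < n; i++)
		if (ids[i] == id)
			return true;
	return false;
}

/* Collect distinct socket ids into ids[] and return how many were found.
 * With skip_disabled_first set, disabled lcores never contribute a socket. */
static size_t
count_sockets(bool skip_disabled_first, unsigned int *ids)
{
	size_t n = 0;

	assert(ids != NULL);
	for (unsigned int i = 0; i < MAX_LCORE; i++) {
		if (skip_disabled_first && !lcore_enabled[i])
			continue;
		/* The model silently stops adding once ids[] is full. */
		if (!seen(ids, n, lcore_socket[i]) && n < MAX_NODES)
			ids[n++] = lcore_socket[i];
	}
	return n;
}
```

With this topology the original ordering discovers both sockets, while the patched ordering discovers only socket 0, so only one pool would be created.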



Re: [dpdk-dev] [PATCH] app/testpmd: Optimize membuf pool allocation

2018-08-27 Thread Gavin Hu


> -Original Message-
> From: Phil Yang 
> Sent: Monday, August 27, 2018 5:33 PM
> To: dev@dpdk.org
> Cc: nd ; Gavin Hu 
> Subject: [PATCH] app/testpmd: Optimize membuf pool allocation
> 
> By default, testpmd will create membuf pool for all NUMA nodes and ignore
> EAL configuration.
> 
> Count the number of available NUMA according to EAL core mask or core list
> configuration. Optimized by only creating membuf pool for those nodes.
> 
> Fixes: d5aeab6542f ("app/testpmd: fix mempool creation by socket id")
> 
> Signed-off-by: Phil Yang 

Acked-by: Gavin Hu 

> ---
>  app/test-pmd/testpmd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index ee48db2..a56af2b 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -476,6 +476,8 @@ set_default_fwd_lcores_config(void)
> 
>   nb_lc = 0;
>   for (i = 0; i < RTE_MAX_LCORE; i++) {
> + if (!rte_lcore_is_enabled(i))
> + continue;
>   sock_num = rte_lcore_to_socket_id(i);
>   if (new_socket_id(sock_num)) {
>   if (num_sockets >= RTE_MAX_NUMA_NODES) {
> @@ -485,8 +487,6 @@ set_default_fwd_lcores_config(void)
>   }
>   socket_ids[num_sockets++] = sock_num;
>   }
> - if (!rte_lcore_is_enabled(i))
> - continue;
>   if (i == rte_get_master_lcore())
>   continue;
>   fwd_lcores_cpuids[nb_lc++] = i;
> --
> 2.7.4



Re: [dpdk-dev] [RFC 1/1] eventdev: add distributed software (DSW) event device

2018-08-27 Thread Jerin Jacob
-Original Message-
> Date: Mon, 27 Aug 2018 11:23:59 +0200
> From: Mattias Rönnblom 
> To: Jerin Jacob 
> CC: dev@dpdk.org, bruce.richard...@intel.com
> Subject: Re: [dpdk-dev] [RFC 1/1] eventdev: add distributed software (DSW)
>  event device
> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>  Thunderbird/52.9.1
> 
> External Email
> 
> On 2018-07-22 13:32, Jerin Jacob wrote:
> 
> > > +static void
> > > +dsw_stop(struct rte_eventdev *dev __rte_unused)
> > > +{
> > 
> > You may implement, eventdev_stop_flush_t callback to free up the
> > outstanding events in the eventdev.
> > 
> 
> Is this support mandatory, or is it OK to leave it to the user to empty
> the machinery before calling stop in the initial driver version?

This is useful to avoid event buffer leaks. The application may call stop()
abruptly, or for some reason the event device may be unable to provide the
events on dequeue().

Any trivial implementation could be doing dequeue() in the driver on
stop().

> 
> I can't find any other event device supporting the callback.

drivers/event/sw and drivers/event/octeontx/ support it.

$ grep -r "dev_stop_flush" drivers/event/
drivers/event/octeontx/ssovf_evdev.c:   if (dev->dev_ops->dev_stop_flush != NULL)
drivers/event/octeontx/ssovf_evdev.c:   dev->dev_ops->dev_stop_flush(dev->data->dev_id, event,
dev->data->dev_stop_flush_arg);
drivers/event/sw/sw_evdev_selftest.c:dev_stop_flush(struct test *t) /* test to check we can properly flush events */
drivers/event/sw/sw_evdev_selftest.c:   if (rte_event_dev_stop_flush_callback_register(evdev, flush, &count)) {
drivers/event/sw/sw_evdev_selftest.c:   if (rte_event_dev_stop_flush_callback_register(evdev, NULL, NULL)) {
drivers/event/sw/sw_evdev_selftest.c:   ret = dev_stop_flush(t);
drivers/event/sw/sw_evdev.c:eventdev_stop_flush_t flush;
drivers/event/sw/sw_evdev.c:flush = dev->dev_ops->dev_stop_flush;
drivers/event/sw/sw_evdev.c:arg = dev->data->dev_stop_flush_arg;
drivers/event/sw/sw_evdev.c:eventdev_stop_flush_t flush;
drivers/event/sw/sw_evdev.c:flush = dev->dev_ops->dev_stop_flush;
drivers/event/sw/sw_evdev.c:arg = dev->data->dev_stop_flush_arg;

> 
> In DSW, the events can be a little here-and-there - in the output
> buffers, in the pause buffer, and on the input rings.

Any trivial implementation could be doing dequeue() in the driver on
stop().

> 
> That said, assuming the worker lcore threads have stopped using the
> device and issued the appropriate barriers, it should be possible to
> round up the events from the thread running 'rte_event_dev_stop'.


Re: [dpdk-dev] [RFC 1/1] eventdev: add distributed software (DSW) event device

2018-08-27 Thread Mattias Rönnblom

On 2018-08-27 11:40, Jerin Jacob wrote:

On 2018-07-22 13:32, Jerin Jacob wrote:


+static void
+dsw_stop(struct rte_eventdev *dev __rte_unused)
+{


You may implement, eventdev_stop_flush_t callback to free up the
outstanding events in the eventdev.



Is this support mandatory, or is it OK to leave it to the user to empty
the machinery before calling stop in the initial driver version?


This is useful to avoid event buffer leaks. The application may call stop()
abruptly, or for some reason the event device may be unable to provide the
events on dequeue().



I see it being useful.

I'll implement it.


Any trivial implementation could be doing dequeue() in the driver on
stop().



I can't find any other event device supporting the callback.


drivers/event/sw and drivers/event/octeontx/ support it.

$ grep -r "dev_stop_flush" drivers/event/
drivers/event/octeontx/ssovf_evdev.c:   if (dev->dev_ops->dev_stop_flush != NULL)
drivers/event/octeontx/ssovf_evdev.c:   dev->dev_ops->dev_stop_flush(dev->data->dev_id, event,
dev->data->dev_stop_flush_arg);
drivers/event/sw/sw_evdev_selftest.c:dev_stop_flush(struct test *t) /* test to check we can properly flush events */
drivers/event/sw/sw_evdev_selftest.c:   if (rte_event_dev_stop_flush_callback_register(evdev, flush, &count)) {
drivers/event/sw/sw_evdev_selftest.c:   if (rte_event_dev_stop_flush_callback_register(evdev, NULL, NULL)) {
drivers/event/sw/sw_evdev_selftest.c:   ret = dev_stop_flush(t);
drivers/event/sw/sw_evdev.c:eventdev_stop_flush_t flush;
drivers/event/sw/sw_evdev.c:flush = dev->dev_ops->dev_stop_flush;
drivers/event/sw/sw_evdev.c:arg = dev->data->dev_stop_flush_arg;
drivers/event/sw/sw_evdev.c:eventdev_stop_flush_t flush;
drivers/event/sw/sw_evdev.c:flush = dev->dev_ops->dev_stop_flush;
drivers/event/sw/sw_evdev.c:arg = dev->data->dev_stop_flush_arg;



I needed to rebase against master. Sorry.



In DSW, the events can be a little here-and-there - in the output
buffers, in the pause buffer, and on the input rings.


Any trivial implementation could be doing dequeue() in the driver on
stop().



Sure, but how many times? One zero-dequeue is not enough. A migration
might be in progress, and the signaling needs to finish before the
events in the pause buffer are flushed to the destination in_ring.


I'll traverse the port-internal output/pause buffers instead.
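
A stand-alone sketch of that approach follows. It is not the DSW driver: `struct port`, its two buffers, and `flush_port()` are hypothetical, and `flush_fn` only imitates the general shape of a stop-flush callback (device id, event, user argument). The input rings mentioned above are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_CAP 8 /* hypothetical per-buffer capacity */

/* Hypothetical event: just a payload for the purposes of the model. */
struct event {
	uint64_t payload;
};

/* Imitates a stop-flush callback: called once per leftover event. */
typedef void (*flush_fn)(uint8_t dev_id, struct event ev, void *arg);

/* Hypothetical port with an output buffer and a pause buffer. */
struct port {
	struct event out_buf[BUF_CAP];
	size_t out_len;
	struct event pause_buf[BUF_CAP];
	size_t pause_len;
};

/* Hand every buffered event to fn (if any), then empty the buffer. */
static size_t
drain(struct event *buf, size_t *len, uint8_t dev_id, flush_fn fn, void *arg)
{
	size_t n = *len;

	for (size_t i = 0; i < n; i++)
		if (fn != NULL)
			fn(dev_id, buf[i], arg);
	*len = 0;
	return n;
}

/* Traverse both port-internal buffers; return the number of events flushed.
 * Assumes no worker thread is touching the port any more. */
static size_t
flush_port(struct port *p, uint8_t dev_id, flush_fn fn, void *arg)
{
	size_t n;

	assert(p != NULL);
	n = drain(p->out_buf, &p->out_len, dev_id, fn, arg);
	n += drain(p->pause_buf, &p->pause_len, dev_id, fn, arg);
	return n;
}

/* Example callback: accumulate payloads so a caller can check the result. */
static void
sum_payloads(uint8_t dev_id, struct event ev, void *arg)
{
	(void)dev_id;
	*(uint64_t *)arg += ev.payload;
}
```

Walking the buffers directly sidesteps the question of how many empty dequeues are enough, since nothing depends on a migration completing first.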


Re: [dpdk-dev] [dpdk-stable] 18.05.1 patches review and test

2018-08-27 Thread Marco Varlese
Hi Christian,

Apologies for being late on this... I tested 18.05.1-rc1 via the usual
test-pmd, OvS-DPDK and VPP, and did not find any issues with it.


Cheers,
Marco

On Mon, 2018-08-27 at 11:29 +0200, Christian Ehrhardt wrote:
> On Wed, Aug 22, 2018 at 9:26 AM Christian Ehrhardt <
> christian.ehrha...@canonical.com> wrote:
> 
> > Hi all,
> > 
> > Here is a list of patches targeted for stable release 18.05.1. Please
> > help review and test. The planned date for the final release is August,
> > 29th. Before that, please shout if anyone has objections with these
> > patches being applied.
> > 
> 
> There was neither positive nor negative feedback on 18.05.1-rc1 so far.
> Maybe 17.11.x priorities and general PTO time just beat 18.05 - which is
> fine to some extent.
> The only private message I got was about one party needing some extra time.
> For all of the above I will do two things:
> 1. the deadline to get back with results on 18.05.1-rc1 is extended to
> Tuesday the 4th of September
> 2. I'd highly appreciate feedback of people involved that intend to test it
> so I know what to wait for (or not)
> 
> > Also for the companies committed to running regression tests,
> > please run the tests and report any issue before the release date.
> > 
> > A release candidate tarball can be found at:
> > 
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v18.05.1-rc1
> > 
> > These patches are located at branch 18.05 of dpdk-stable repo:
> > 
> > https://git.dpdk.org/dpdk-stable/log/?h=18.05
> > 
> > Thanks.
> > 
> > Christian Ehrhardt 
> > 
> > ---
> > Adrien Mazarguil (8):
> >   app/testpmd: fix crash when attaching a device
> >   net/mlx4: fix minor resource leak during init
> >   net/mlx5: fix errno object in probe function
> >   net/mlx5: fix missing errno in probe function
> >   net/mlx5: fix error message in probe function
> >   net/mlx5: fix invalid error check
> >   maintainers: update for Mellanox PMDs
> >   net/mlx5: fix invalid network interface index
> > 
> > Ajit Khaparde (11):
> >   net/bnxt: fix clear port stats
> >   net/bnxt: fix close operation
> >   net/bnxt: fix HW Tx checksum offload check
> >   net/bnxt: check filter type before clearing it
> >   net/bnxt: fix set MTU
> >   net/bnxt: fix incorrect IO address handling in Tx
> >   net/bnxt: fix Rx ring count limitation
> >   net/bnxt: fix memory leaks in NVM commands
> >   net/bnxt: fix lock release on NVM write failure
> >   net/bnxt: check access denied for HWRM commands
> >   net/bnxt: fix RETA size
> > 
> > Alejandro Lucero (2):
> >   net/nfp: fix unused header reference
> >   net/nfp: fix field initialization in Tx descriptor
> > 
> > Alok Makhariya (1):
> >   bus/dpaa: fix phandle support for Linux 4.16
> > 
> > Anatoly Burakov (14):
> >   ipc: fix locking while sending messages
> >   mem: fix alignment of requested virtual areas
> >   eal/bsd: fix memory segment index display
> >   malloc: fix pad erasing
> >   eal/linux: fix invalid syntax in interrupts
> >   eal/linux: fix uninitialized value
> >   vfio: fix uninitialized variable
> >   malloc: do not skip pad on free
> >   test: fix EAL flags autotest on FreeBSD
> >   test: fix result printing
> >   test: fix code on report
> >   test: make autotest runner python 2/3 compliant
> >   test: print autotest categories
> >   test: improve filtering
> > 
> > Andrew Rybchenko (7):
> >   net/sfc: cut non VLAN ID bits from TCI
> >   net/sfc: discard packets with bad CRC on EF10 ESSB Rx
> >   net/sfc: fix double-free in EF10 ESSB Rx queue purge
> >   net/sfc: move Rx checksum offload check to device level
> >   net/sfc: fix Rx queue offloads reporting in queue info
> >   net/sfc: fix assert in set multicast address list
> >   net/sfc: handle unknown L3 packet class in EF10 event parser
> > 
> > Andy Green (2):
> >   ring: fix declaration after statement
> >   ring: fix sign conversion warning
> > 
> > Beilei Xing (5):
> >   net/i40e: fix shifts of 32-bit value
> >   net/i40e: fix PPPoL2TP packet type parsing
> >   net/i40e: fix packet type parsing with DDP
> >   net/i40e: fix setting TPID with AQ command
> >   net/i40e: fix device parameter parsing
> > 
> > Bruce Richardson (3):
> >   eal: fix error message for unsupported platforms
> >   examples/exception_path: fix out-of-bounds read
> >   mk: fix permissions when using make install
> > 
> > Chas Williams (2):
> >   net/bonding: always update bonding link status
> >   net/bonding: do not clear active slave count
> > 
> > Christian Ehrhardt (2):
> >   FIXUP: net/mlx5: fix invalid network interface index
> >   version: 18.05.1-rc1
> > 
> > Damjan Marion (1):
> >   net/i40e: do not reset device info data
> > 
> > Dan Gora (1):
> >   kni: fix crash with null name
> > 
> > Daria Kolistratova (1):
> >   ne

Re: [dpdk-dev] [dpdk-stable] 18.05.1 patches review and test

2018-08-27 Thread Marco Varlese
Hi Christian,

Apologies for being late on this... I tested 18.05.1-rc1 via the usual 
test-pmd, 
OvS-DPDK and VPP, and did not find any issues with it.


Cheers,
Marco

On Mon, 2018-08-27 at 11:29 +0200, Christian Ehrhardt wrote:
> On Wed, Aug 22, 2018 at 9:26 AM Christian Ehrhardt <
> christian.ehrha...@canonical.com> wrote:
> 
> > Hi all,
> > 
> > Here is a list of patches targeted for stable release 18.05.1. Please
> > help review and test. The planned date for the final release is August,
> > 29th. Before that, please shout if anyone has objections with these
> > patches being applied.
> > 
> 
> There was neither positive nor negative feedback on 18.05.1-rc1 so far.
> Maybe 17.11.x priorities and general PTO time just beats 18.05 - which is
> fine to some extent.
> The only private message I got was about one party needing some extra time.
> For all of the above I will do two things:
> 1. the deadline to get back with results on 18.05.1-rc1 is extended to
> Tuesday the 4th of September
> 2. I'd highly appreciate feedback of people involved that intend to test it
> so I know what to wait for (or not)
> 
> > Also for the companies committed to running regression tests,
> > please run the tests and report any issue before the release date.
> > 
> > A release candidate tarball can be found at:
> > 
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v18.05.1-rc1
> > 
> > These patches are located at branch 18.05 of dpdk-stable repo:
> > 
> > https://git.dpdk.org/dpdk-stable/log/?h=18.05
> > 
> > Thanks.
> > 
> > Christian Ehrhardt 
> > 
> > ---
> > Adrien Mazarguil (8):
> >   app/testpmd: fix crash when attaching a device
> >   net/mlx4: fix minor resource leak during init
> >   net/mlx5: fix errno object in probe function
> >   net/mlx5: fix missing errno in probe function
> >   net/mlx5: fix error message in probe function
> >   net/mlx5: fix invalid error check
> >   maintainers: update for Mellanox PMDs
> >   net/mlx5: fix invalid network interface index
> > 
> > Ajit Khaparde (11):
> >   net/bnxt: fix clear port stats
> >   net/bnxt: fix close operation
> >   net/bnxt: fix HW Tx checksum offload check
> >   net/bnxt: check filter type before clearing it
> >   net/bnxt: fix set MTU
> >   net/bnxt: fix incorrect IO address handling in Tx
> >   net/bnxt: fix Rx ring count limitation
> >   net/bnxt: fix memory leaks in NVM commands
> >   net/bnxt: fix lock release on NVM write failure
> >   net/bnxt: check access denied for HWRM commands
> >   net/bnxt: fix RETA size
> > 
> > Alejandro Lucero (2):
> >   net/nfp: fix unused header reference
> >   net/nfp: fix field initialization in Tx descriptor
> > 
> > Alok Makhariya (1):
> >   bus/dpaa: fix phandle support for Linux 4.16
> > 
> > Anatoly Burakov (14):
> >   ipc: fix locking while sending messages
> >   mem: fix alignment of requested virtual areas
> >   eal/bsd: fix memory segment index display
> >   malloc: fix pad erasing
> >   eal/linux: fix invalid syntax in interrupts
> >   eal/linux: fix uninitialized value
> >   vfio: fix uninitialized variable
> >   malloc: do not skip pad on free
> >   test: fix EAL flags autotest on FreeBSD
> >   test: fix result printing
> >   test: fix code on report
> >   test: make autotest runner python 2/3 compliant
> >   test: print autotest categories
> >   test: improve filtering
> > 
> > Andrew Rybchenko (7):
> >   net/sfc: cut non VLAN ID bits from TCI
> >   net/sfc: discard packets with bad CRC on EF10 ESSB Rx
> >   net/sfc: fix double-free in EF10 ESSB Rx queue purge
> >   net/sfc: move Rx checksum offload check to device level
> >   net/sfc: fix Rx queue offloads reporting in queue info
> >   net/sfc: fix assert in set multicast address list
> >   net/sfc: handle unknown L3 packet class in EF10 event parser
> > 
> > Andy Green (2):
> >   ring: fix declaration after statement
> >   ring: fix sign conversion warning
> > 
> > Beilei Xing (5):
> >   net/i40e: fix shifts of 32-bit value
> >   net/i40e: fix PPPoL2TP packet type parsing
> >   net/i40e: fix packet type parsing with DDP
> >   net/i40e: fix setting TPID with AQ command
> >   net/i40e: fix device parameter parsing
> > 
> > Bruce Richardson (3):
> >   eal: fix error message for unsupported platforms
> >   examples/exception_path: fix out-of-bounds read
> >   mk: fix permissions when using make install
> > 
> > Chas Williams (2):
> >   net/bonding: always update bonding link status
> >   net/bonding: do not clear active slave count
> > 
> > Christian Ehrhardt (2):
> >   FIXUP: net/mlx5: fix invalid network interface index
> >   version: 18.05.1-rc1
> > 
> > Damjan Marion (1):
> >   net/i40e: do not reset device info data
> > 
> > Dan Gora (1):
> >   kni: fix crash with null name
> > 
> > Daria Kolistratova (1):
> >   ne

[dpdk-dev] [PATCH 2/2] net/mlx: add meson build support

2018-08-27 Thread Nelio Laranjeiro
Add configuration items to enable these drivers and to build them in
'glue mode'.  Option driver_mlx{4,5}_glue_enable enables compilation and
creates the librte_pmd_mlx{4,5}_glue.so.xx.yy.z library, whereas
driver_mlx{4,5}_enable configures compilation without this glue library.
driver_mlx{4,5}_glue_enable overrides meson's driver_mlx{4,5}_enable
option.

To avoid modifying the whole sources and keep the compatibility with
current build systems, the mlx{4,5}_autoconf.h is still generated by
invoking DPDK scripts through meson's run_command() instead of using
has_types, has_members, ... commands.

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/meson.build      |   2 +
 drivers/net/mlx4/meson.build | 109 +++
 drivers/net/mlx5/meson.build | 561 +++
 meson_options.txt            |   8 +
 4 files changed, 680 insertions(+)
 create mode 100644 drivers/net/mlx4/meson.build
 create mode 100644 drivers/net/mlx5/meson.build

diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 9c28ed4da..c7a2d0e7d 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -18,6 +18,8 @@ drivers = ['af_packet',
'ixgbe',
'kni',
'liquidio',
+   'mlx4',
+   'mlx5',
'mvpp2',
'netvsc',
'nfp',
diff --git a/drivers/net/mlx4/meson.build b/drivers/net/mlx4/meson.build
new file mode 100644
index 0..ccaf03433
--- /dev/null
+++ b/drivers/net/mlx4/meson.build
@@ -0,0 +1,109 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 6WIND S.A.
+# Copyright 2018 Mellanox Technologies, Ltd
+
+# As there is no more configuration file to activate/configure the PMD it will
+# use some variables here to configure it.
+pmd_dlopen = get_option('enable_driver_mlx4_glue')
+build = get_option('enable_driver_mlx4') or pmd_dlopen
+# dpdk_conf.set('RTE_LIBRTE_MLX4_DEBUG', 1)
+# Glue configuration
+LIB_GLUE_BASE = 'librte_pmd_mlx4_glue.so'
+LIB_GLUE_VERSION = '18.02.0'
+LIB_GLUE = LIB_GLUE_BASE + '.' + LIB_GLUE_VERSION
+if pmd_dlopen
+dpdk_conf.set('RTE_LIBRTE_MLX4_DLOPEN_DEPS', 1)
+cflags += [
+'-DMLX4_GLUE="@0@"'.format(LIB_GLUE),
+'-DMLX4_GLUE_VERSION="@0@"'.format(LIB_GLUE_VERSION),
+'-ldl',
+]
+endif
+# Compile PMD
+if build
+allow_experimental_apis = true
+ext_deps += [ cc.find_library('mnl') ]
+# Search for ibverbs and mlx4 library.
+# note: meson find_library accept directories arrays but they must be
+# stripped i.e. no extra space.
+flags = get_option('extra_ldflags')
+libs_dir = []
+foreach flag:flags.split('-L')
+flag = flag.strip()
+if flag != ''
+libs_dir += [ flag.strip() ]
+endif
+endforeach
+foreach libname:['ibverbs', 'mlx4']
+lib = cc.find_library(libname, dirs:libs_dir, required:false)
+if not lib.found()
+lib = cc.find_library(libname, required:true)
+endif
+ext_deps += [ lib ]
+endforeach
+sources = files(
+   'mlx4.c',
+   'mlx4_ethdev.c',
+   'mlx4_flow.c',
+   'mlx4_intr.c',
+   'mlx4_mr.c',
+   'mlx4_rxq.c',
+   'mlx4_rxtx.c',
+   'mlx4_txq.c',
+   'mlx4_utils.c',
+)
+if not pmd_dlopen
+sources += files('mlx4_glue.c')
+endif
+cflags += [
+   '-O3',
+   '-Wall',
+   '-Wextra',
+   '-g',
+   '-std=c11',
+   '-I.',
+   '-D_BSD_SOURCE',
+   '-D_DEFAULT_SOURCE',
+   '-D_XOPEN_SOURCE=600',
+   '-Wno-strict-prototypes',
+]
+if dpdk_conf.has('RTE_LIBRTE_MLX4_DEBUG')
+cflags += [ '-pedantic', '-UNDEBUG', '-DPEDANTIC' ]
+else
+cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
+endif
+# To maintain the compatibility with the make build system
+# mlx4_autoconf.h file is still generated.
+r = run_command('sh', '../../../buildtools/auto-config-h.sh',
+'mlx4_autoconf.h',
+'HAVE_IBV_MLX4_WQE_LSO_SEG',
+'infiniband/mlx4dv.h',
+'type', 'struct mlx4_wqe_lso_seg')
+if r.returncode() != 0
+error('autoconfiguration fail')
+endif
+endif
+# Build Glue Library
+if pmd_dlopen
+dlopen_name = 'mlx4_glue'
+dlopen_lib_name = driver_name_fmt.format(dlopen_name)
+dlopen_so_version = LIB_GLUE_VERSION
+dlopen_sources = files('mlx4_glue.c')
+dlopen_install_dir = [ eal_pmd_path + '-glue' ]
+shared_lib = shared_library(
+   dlopen_lib_name,
+   dlopen

[dpdk-dev] [PATCH 1/2] build: add extra cflags ldflags to meson option

2018-08-27 Thread Nelio Laranjeiro
Almost equivalent to the make system build which uses those options
through environment variables (EXTRA_{CFLAGS,LDFLAGS}).

Signed-off-by: Nelio Laranjeiro 
---
 drivers/meson.build | 2 +-
 meson_options.txt   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/meson.build b/drivers/meson.build
index f94e2fe67..008aac62c 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -11,7 +11,7 @@ driver_classes = ['common',
   'event',   # depends on common, bus, mempool and net.
   'raw'] # depends on common, bus, mempool, net and event.
 
-default_cflags = machine_args
+default_cflags = machine_args + [get_option('extra_cflags'), get_option('extra_ldflags')]
 if cc.has_argument('-Wno-format-truncation')
default_cflags += '-Wno-format-truncation'
 endif
diff --git a/meson_options.txt b/meson_options.txt
index c84327858..da6373a2c 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -22,3 +22,5 @@ option('use_hpet', type: 'boolean', value: false,
description: 'use HPET timer in EAL')
 option('tests', type: 'boolean', value: true,
description: 'build unit tests')
+option('extra_cflags', type: 'string', description: 'Extra compiler flags')
+option('extra_ldflags', type: 'string', description: 'Extra linker flags')
-- 
2.18.0



Re: [dpdk-dev] [dpdk-stable] [PATCH v2] bus/fslmc: fix the undefined ref of rte dpaa2 memsegs

2018-08-27 Thread Ferruh Yigit
On 8/27/2018 10:33 AM, Shreyansh Jain wrote:
> On Monday 27 August 2018 02:22 PM, Hemant Agrawal wrote:
>> This patch fixes the undefined reference issue with rte_dpaa2_memsegs
>> when compiled in shared lib mode with EXTRA_CFLAGS="-g -O0"
>>
>> Bugzilla ID: 61
>> Fixes: 365fb925d3b3 ("bus/fslmc: optimize physical to virtual address search")
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: Hemant Agrawal 
>> Reported-by: Keith Wiles 
>> ---
>> v2: add bugzilla id
> 
> Acked-by: Shreyansh Jain 

Applied to dpdk/master, thanks.


Re: [dpdk-dev] [PATCH 1/2] build: add extra cflags ldflags to meson option

2018-08-27 Thread Bruce Richardson
On Mon, Aug 27, 2018 at 01:10:52PM +0200, Nelio Laranjeiro wrote:
> Almost equivalent to the make system build which uses those options
> through environment variables (EXTRA_{CFLAGS,LDFLAGS}).
> 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/meson.build | 2 +-
>  meson_options.txt   | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/meson.build b/drivers/meson.build
> index f94e2fe67..008aac62c 100644
> --- a/drivers/meson.build
> +++ b/drivers/meson.build
> @@ -11,7 +11,7 @@ driver_classes = ['common',
>  'event',   # depends on common, bus, mempool and net.
>  'raw'] # depends on common, bus, mempool, net and event.
>  
> -default_cflags = machine_args
> +default_cflags = machine_args + [get_option('extra_cflags'), get_option('extra_ldflags')]
>  if cc.has_argument('-Wno-format-truncation')
>   default_cflags += '-Wno-format-truncation'
>  endif
> diff --git a/meson_options.txt b/meson_options.txt
> index c84327858..da6373a2c 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -22,3 +22,5 @@ option('use_hpet', type: 'boolean', value: false,
>   description: 'use HPET timer in EAL')
>  option('tests', type: 'boolean', value: true,
>   description: 'build unit tests')
> +option('extra_cflags', type: 'string', description: 'Extra compiler flags')
> +option('extra_ldflags', type: 'string', description: 'Extra linker flags')

This should not be needed. Meson should pick up CFLAGS and LDFLAGS from the
environment without having to add options for them.

https://mesonbuild.com/howtox.html#set-extra-compiler-and-linker-flags-from-the-outside-when-eg-building-distro-packages

/Bruce


Re: [dpdk-dev] [PATCH v3 0/6] net/mvpp2: changes and features

2018-08-27 Thread Ferruh Yigit
On 8/24/2018 7:29 PM, Tomasz Duszynski wrote:
> This patch series introduces following changes:
> 
> * Common code responsible for DMA memory initialization
>   is now available under drivers/common/mvep. MVEP stands for
>   Marvell Embedded Processors. This eases maintenance and avoids
>   boilerplate code across Marvell PMDs. MVEP will grow over time as new
>   features and PMDs are added.
> 
> * Couple of minor fixes.
> 
> * Support for reading VLAN information from descriptor.
> 
> v3:
>  * Change exported symbols version to 18.11.
>  * Drop excessive new lines from messages passed to MRVL_LOG().
> 
> v2:
>  * Remove CONFIG_RTE_LIBRTE_MVEP_COMMON. Use CONFIG_RTE_LIBRTE_MVPP2_PMD
>to control common/mvep compilation instead.
> 
> Liron Himi (2):
>   drivers/common: add mvep common code for MRVL PMDs
>   net/mvpp2: use common code to initialize DMA
> 
> Natalie Samsonov (3):
>   net/mvpp2: fix comments and error messages
>   net/mvpp2: make private variables static
>   net/mvpp2: add VLAN packet type support for parser offload
> 
> Tomasz Duszynski (1):
>   net/mvpp2: fix array initialization

Series applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH 1/2] build: add extra cflags ldflags to meson option

2018-08-27 Thread Nélio Laranjeiro
On Mon, Aug 27, 2018 at 12:24:11PM +0100, Bruce Richardson wrote:
> On Mon, Aug 27, 2018 at 01:10:52PM +0200, Nelio Laranjeiro wrote:
> > Almost equivalent to the make system build which uses those options
> > through environment variables (EXTRA_{CFLAGS,LDFLAGS}).
> > 
> > Signed-off-by: Nelio Laranjeiro 
> > ---
> >  drivers/meson.build | 2 +-
> >  meson_options.txt   | 2 ++
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/meson.build b/drivers/meson.build
> > index f94e2fe67..008aac62c 100644
> > --- a/drivers/meson.build
> > +++ b/drivers/meson.build
> > @@ -11,7 +11,7 @@ driver_classes = ['common',
> >'event',   # depends on common, bus, mempool and net.
> >'raw'] # depends on common, bus, mempool, net and event.
> >  
> > -default_cflags = machine_args
> > +default_cflags = machine_args + [get_option('extra_cflags'), get_option('extra_ldflags')]
> >  if cc.has_argument('-Wno-format-truncation')
> > default_cflags += '-Wno-format-truncation'
> >  endif
> > diff --git a/meson_options.txt b/meson_options.txt
> > index c84327858..da6373a2c 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -22,3 +22,5 @@ option('use_hpet', type: 'boolean', value: false,
> > description: 'use HPET timer in EAL')
> >  option('tests', type: 'boolean', value: true,
> > description: 'build unit tests')
> > +option('extra_cflags', type: 'string', description: 'Extra compiler flags')
> > +option('extra_ldflags', type: 'string', description: 'Extra linker flags')
> 
> This should not be needed. Meson should pick up CFLAGS and LDFLAGS from the
> environment without having to add options for them.
> 
> https://mesonbuild.com/howtox.html#set-extra-compiler-and-linker-flags-from-the-outside-when-eg-building-distro-packages
> 
> /Bruce

Indeed this works with the CFLAGS/LDFLAGS way, but to find the library
dependencies correctly, LD_LIBRARY_PATH also needs to be set to the
correct path.

This patch will be discarded in the next version.

Thanks,

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] 18.08 build error on ppc64el - bool as vector type

2018-08-27 Thread Adrien Mazarguil
Hi Christian,

On Wed, Aug 22, 2018 at 05:11:41PM +0200, Christian Ehrhardt wrote:
> Just FYI the simple change hits similar issues later on.
> 
> The (not really) proposed patch would have to be extended as follows.
> We really need a better solution (or somebody has to convince me that my
> change is better than a band aid).

Thanks for reporting. I've made a quick investigation on my own and believe
it's a toolchain issue which may affect more than this PMD; potentially all
users of stdbool.h (C11) on this platform.

C11's stdbool.h defines a bool macro as _Bool (big B) along with
true/false. On PPC targets, another file (altivec.h) defines bool as __bool
(small b) but not true/false:

 #if !defined(__APPLE_ALTIVEC__)
 /* You are allowed to undef these for C++ compatibility.  */
 #define vector __vector
 #define pixel __pixel
 #define bool __bool
 #endif

mlx5_nl.c explicitly includes stdbool.h to get the above definitions then
includes mlx5.h -> rte_ether.h -> ppc_64/rte_memcpy.h -> altivec.h.

For some reason the conflicting bool redefinition doesn't seem to raise any
warnings, but results in mismatching bool and true/false definitions; an
integer value cannot be assigned to a bool variable anymore, hence the build
failure.

The inability to assign integer values to bool is, in my opinion, a
fundamental issue caused by altivec.h. If there is no way to fix this on the
system, there are a few workarounds for DPDK, in order of preference:

1. Always #undef bool after including altivec.h in
   lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h. I do not think
   anyone expects this type to be unusable with true/false or integer values
   anyway. The version of altivec.h I have doesn't rely on this macro at
   all so it's probably not a big loss.

   Ditto for "pixel" and "vector" keywords. Alternatively you could #define
   __APPLE_ALTIVEC__ before including altivec.h to prevent them from getting
   defined in the first place.

2. Add Altivec detection to impacted users of stdbool.h, which #undef and
   redefine bool as _Bool on their own with a short comment about broken
   toolchains.

3. Replace bool with _Bool in impacted users of stdbool.h. Basically what
   you did below with "int" but slightly more correct since true/false can
   still be used with _Bool. A comment explaining why is necessary after the
   inclusion of stdbool.h.
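
To make workaround 1 concrete, here is a rough sketch of what the include
site could look like. It is untested on a PPC toolchain and only meant as an
illustration: the helper bool_accepts_int() is hypothetical and exists purely
so the behaviour can be checked, and the altivec.h include is guarded so the
snippet also builds on other targets.

```c
#include <assert.h>
#include <stdbool.h>

/* altivec.h only exists on PPC; guard it so the sketch builds anywhere. */
#if defined(__ALTIVEC__)
#include <altivec.h>
#endif

/*
 * Workaround 1: drop the keyword macros altivec.h may have defined and
 * restore the stdbool.h meaning of bool. Using #undef on a name that is
 * not defined is a no-op, so this is harmless on other platforms.
 */
#undef vector
#undef pixel
#undef bool
#define bool _Bool

/* Hypothetical helper: integer values can be assigned to bool again. */
int
bool_accepts_int(int value)
{
	bool flag = value;	/* any non-zero value converts to 1 */

	return flag == true;
}
```

With the macros undefined right after the include, code such as mlx5_nl.c
would keep using bool/true/false unchanged, which is why this option comes
first in the list.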

Can you validate these suggestions? I don't have the right setup for that.

Thanks.

> Description: Fix ppc64le build error between altivec and bool
> 
> We really hope there will eventually be a better fix for this, but currently
> we have to unbreak building this code so until something better is available
> let's use this modification.
> 
> Forwarded: yes
> Forward-info: http://mails.dpdk.org/archives/dev/2018-August/109926.html
> Author: Christian Ehrhardt 
> Last-Update: 2018-08-22
> --- a/drivers/net/mlx5/mlx5_nl.c
> +++ b/drivers/net/mlx5/mlx5_nl.c
> @@ -834,8 +834,8 @@ mlx5_nl_switch_info_cb(struct nlmsghdr *
>.switch_id = 0,
>};
>size_t off = NLMSG_LENGTH(sizeof(struct ifinfomsg));
> -   bool port_name_set = false;
> -   bool switch_id_set = false;
> +   int port_name_set = 0;
> +   int switch_id_set = 0;
> 
>if (nh->nlmsg_type != RTM_NEWLINK)
>goto error;
> @@ -854,7 +854,7 @@ mlx5_nl_switch_info_cb(struct nlmsghdr *
>if (errno ||
>(size_t)(end - (char *)payload) !=
> strlen(payload))
>goto error;
> -   port_name_set = true;
> +   port_name_set = 1;
>break;
>case IFLA_PHYS_SWITCH_ID:
>info.switch_id = 0;
> @@ -862,7 +862,7 @@ mlx5_nl_switch_info_cb(struct nlmsghdr *
>info.switch_id <<= 8;
>info.switch_id |= ((uint8_t *)payload)[i];
>}
> -   switch_id_set = true;
> +   switch_id_set = 1;
>break;
>}
>off += RTA_ALIGN(ra->rta_len);
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -1335,8 +1335,8 @@ mlx5_sysfs_switch_info(unsigned int ifin
>char ifname[IF_NAMESIZE];
>FILE *file;
>struct mlx5_switch_info data = { .master = 0, };
> -   bool port_name_set = false;
> -   bool port_switch_id_set = false;
> +   int port_name_set = 0;
> +   int port_switch_id_set = 0;
>char c;
> 
>if (!if_indextoname(ifindex, ifname)) {
> --- a/drivers/net/mlx5/mlx5_nl_flow.c
> +++ b/drivers/net/mlx5/mlx5_nl_flow.c
> @@ -385,11 +385,11 @@ mlx5_nl_flow_transpose(void *buf,
>const struct rte_flow_action *action;
>unsigned int n;
>uint32_t act_index_cur;
> -   bool in_port_id_set;
> -   bool eth_type_set;
> -   

[dpdk-dev] [PATCH 0/3] crypto/mvsam: yet another round of features

2018-08-27 Thread Tomasz Duszynski
Following changes are introduced in this patch series:

* Add S/G support.
* Start using common/mvep for DMA memory initialization.
* Add dynamic logging support.

Note that 'crypto/mvsam: use common initialization' relies on
'net/mvpp2: use common code to initialize DMA' which is now on
dpdk-next-net/master.

Dmitri Epshtein (1):
  crypto/mvsam: use common initialization

Tomasz Duszynski (1):
  crypto/mvsam: add dynamic logging support

Zyta Szpak (1):
  crypto/mvsam: add S/G support to crypto driver

 config/common_base                          |   1 -
 drivers/common/Makefile                     |   4 +-
 drivers/crypto/mvsam/Makefile               |   3 +-
 drivers/crypto/mvsam/meson.build            |   2 +-
 drivers/crypto/mvsam/rte_mrvl_pmd.c         | 192 +---
 drivers/crypto/mvsam/rte_mrvl_pmd_ops.c     |  10 +-
 drivers/crypto/mvsam/rte_mrvl_pmd_private.h |  34 ++---
 mk/rte.app.mk                               |   4 +-
 8 files changed, 143 insertions(+), 107 deletions(-)

--
2.7.4



[dpdk-dev] [PATCH 1/3] crypto/mvsam: add S/G support to crypto driver

2018-08-27 Thread Tomasz Duszynski
From: Zyta Szpak 

The patch adds support for chained source mbufs given
to crypto operations. The crypto engine accepts source buffer
containing a number of segments. The destination buffer
stays the same - always one segment.
On decryption, the EIP engine looks for the digest at the 'auth_icv_offset'
offset in the SRC buffer. The digest must be placed in the last segment, and
the offset must be set to reach it there.
If the application has not placed the digest in the source mbuf, the driver
tries to copy it to the last segment.
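
The segment walk described above can be sketched in isolation as follows.
This is only an illustration, not the driver code: struct seg and
struct buf_desc are hypothetical stand-ins for an mbuf segment and an engine
buffer descriptor, and fill_src_descriptors() is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of a chained source mbuf. */
struct seg {
	void *data;
	uint16_t data_len;
	struct seg *next;
};

/* Hypothetical stand-in for one engine buffer descriptor. */
struct buf_desc {
	void *vaddr;
	uint32_t len;
};

/*
 * Fill one descriptor per source segment. Returns the number of
 * descriptors written, or -1 if the chain has more than max_bds segments
 * or contains an empty segment (0-length buffers are not supported).
 */
int
fill_src_descriptors(const struct seg *src, struct buf_desc *bds, int max_bds)
{
	int n = 0;

	for (; src != NULL; src = src->next) {
		if (n == max_bds || src->data_len == 0)
			return -1;
		bds[n].vaddr = src->data;
		bds[n].len = src->data_len;
		n++;
	}

	return n;
}
```

The destination side needs no such loop since it always stays a single
segment.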

Signed-off-by: Zyta Szpak 
Signed-off-by: Natalie Samsonov 
Reviewed-by: Dmitri Epshtein 
---
 drivers/crypto/mvsam/rte_mrvl_pmd.c         | 96 +
 drivers/crypto/mvsam/rte_mrvl_pmd_private.h |  7 +++
 2 files changed, 76 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/mvsam/rte_mrvl_pmd.c b/drivers/crypto/mvsam/rte_mrvl_pmd.c
index 961802e..001aa28 100644
--- a/drivers/crypto/mvsam/rte_mrvl_pmd.c
+++ b/drivers/crypto/mvsam/rte_mrvl_pmd.c
@@ -452,8 +452,10 @@ mrvl_request_prepare(struct sam_cio_op_params *request,
struct rte_crypto_op *op)
 {
struct mrvl_crypto_session *sess;
-   struct rte_mbuf *dst_mbuf;
+   struct rte_mbuf *src_mbuf, *dst_mbuf;
+   uint16_t segments_nb;
uint8_t *digest;
+   int i;
 
if (unlikely(op->sess_type == RTE_CRYPTO_OP_SESSIONLESS)) {
MRVL_CRYPTO_LOG_ERR("MRVL CRYPTO PMD only supports session "
@@ -469,29 +471,47 @@ mrvl_request_prepare(struct sam_cio_op_params *request,
return -EINVAL;
}
 
-   /*
+   request->sa = sess->sam_sess;
+   request->cookie = op;
+
+   src_mbuf = op->sym->m_src;
+   segments_nb = src_mbuf->nb_segs;
+   /* The following conditions must be met:
+* - Destination buffer is required when segmented source buffer
+* - Segmented destination buffer is not supported
+*/
+   if ((segments_nb > 1) && (!op->sym->m_dst)) {
+   MRVL_CRYPTO_LOG_ERR("op->sym->m_dst = NULL!\n");
+   return -1;
+   }
+   /* For non SG case:
 * If application delivered us null dst buffer, it means it expects
 * us to deliver the result in src buffer.
 */
dst_mbuf = op->sym->m_dst ? op->sym->m_dst : op->sym->m_src;
 
-   request->sa = sess->sam_sess;
-   request->cookie = op;
-
-   /* Single buffers only, sorry. */
-   request->num_bufs = 1;
-   request->src = src_bd;
-   src_bd->vaddr = rte_pktmbuf_mtod(op->sym->m_src, void *);
-   src_bd->paddr = rte_pktmbuf_iova(op->sym->m_src);
-   src_bd->len = rte_pktmbuf_data_len(op->sym->m_src);
-
-   /* Empty source. */
-   if (rte_pktmbuf_data_len(op->sym->m_src) == 0) {
-   /* EIP does not support 0 length buffers. */
-   MRVL_CRYPTO_LOG_ERR("Buffer length == 0 not supported!");
+   if (!rte_pktmbuf_is_contiguous(dst_mbuf)) {
+   MRVL_CRYPTO_LOG_ERR("Segmented destination buffer "
+   "not supported.\n");
return -1;
}
 
+   request->num_bufs = segments_nb;
+   for (i = 0; i < segments_nb; i++) {
+   /* Empty source. */
+   if (rte_pktmbuf_data_len(src_mbuf) == 0) {
+   /* EIP does not support 0 length buffers. */
+   MRVL_CRYPTO_LOG_ERR("Buffer length == 0 not supported!");
+   return -1;
+   }
+   src_bd[i].vaddr = rte_pktmbuf_mtod(src_mbuf, void *);
+   src_bd[i].paddr = rte_pktmbuf_iova(src_mbuf);
+   src_bd[i].len = rte_pktmbuf_data_len(src_mbuf);
+
+   src_mbuf = src_mbuf->next;
+   }
+   request->src = src_bd;
+
/* Empty destination. */
if (rte_pktmbuf_data_len(dst_mbuf) == 0) {
/* Make dst buffer fit at least source data. */
@@ -542,7 +562,7 @@ mrvl_request_prepare(struct sam_cio_op_params *request,
 
/*
 * EIP supports only scenarios where ICV(digest buffer) is placed at
-* auth_icv_offset. Any other placement means risking errors.
+* auth_icv_offset.
 */
if (sess->sam_sess_params.dir == SAM_DIR_ENCRYPT) {
/*
@@ -551,17 +571,36 @@ mrvl_request_prepare(struct sam_cio_op_params *request,
 */
if (rte_pktmbuf_mtod_offset(
dst_mbuf, uint8_t *,
-   request->auth_icv_offset) == digest) {
+   request->auth_icv_offset) == digest)
return 0;
-   }
} else {/* sess->sam_sess_params.dir == SAM_DIR_DECRYPT */
/*
 * EIP will look for digest at auth_icv_offset
-* offset in SRC buffer.
+* offset in SRC buffer. It must be placed in the last
+* segment and the offset must be set to 

[dpdk-dev] [PATCH 2/3] crypto/mvsam: use common initialization

2018-08-27 Thread Tomasz Duszynski
From: Dmitri Epshtein 

Use common initialization to reduce boilerplate code.

Signed-off-by: Dmitri Epshtein 
Signed-off-by: Tomasz Duszynski 
Reviewed-by: Natalie Samsonov 
---
 drivers/common/Makefile             |  4 +++-
 drivers/crypto/mvsam/Makefile       |  3 ++-
 drivers/crypto/mvsam/meson.build    |  2 +-
 drivers/crypto/mvsam/rte_mrvl_pmd.c | 30 +-
 mk/rte.app.mk                       |  4 +++-
 5 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index 5f72da0..5bcff17 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -8,7 +8,9 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_OCTEONTX_SSOVF)$(CONFIG_RTE_LIBRTE_OCTEONTX_MEMPOO
 DIRS-y += octeontx
 endif

-ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
+MVEP-y += $(CONFIG_RTE_LIBRTE_MVPP2_PMD)
+MVEP-y += $(CONFIG_RTE_LIBRTE_PMD_MVSAM_CRYPTO)
+ifneq (,$(filter y,$(MVEP-y)))
 DIRS-y += mvep
 endif

diff --git a/drivers/crypto/mvsam/Makefile b/drivers/crypto/mvsam/Makefile
index 3290147..2b4d036 100644
--- a/drivers/crypto/mvsam/Makefile
+++ b/drivers/crypto/mvsam/Makefile
@@ -19,6 +19,7 @@ LIB = librte_pmd_mvsam_crypto.a
 # build flags
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -I$(RTE_SDK)/drivers/common/mvep
 CFLAGS += -I$(LIBMUSDK_PATH)/include
 CFLAGS += -DMVCONF_TYPES_PUBLIC
 CFLAGS += -DMVCONF_DMA_PHYS_ADDR_T_PUBLIC
@@ -33,7 +34,7 @@ EXPORT_MAP := rte_pmd_mvsam_version.map
 LDLIBS += -L$(LIBMUSDK_PATH)/lib -lmusdk
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_kvargs
 LDLIBS += -lrte_cryptodev
-LDLIBS += -lrte_bus_vdev
+LDLIBS += -lrte_bus_vdev -lrte_common_mvep

 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_MVSAM_CRYPTO) += rte_mrvl_pmd.c
diff --git a/drivers/crypto/mvsam/meson.build b/drivers/crypto/mvsam/meson.build
index 3c8ea3c..f1c8796 100644
--- a/drivers/crypto/mvsam/meson.build
+++ b/drivers/crypto/mvsam/meson.build
@@ -18,4 +18,4 @@ endif

 sources = files('rte_mrvl_pmd.c', 'rte_mrvl_pmd_ops.c')

-deps += ['bus_vdev']
+deps += ['bus_vdev', 'common_mvep']
diff --git a/drivers/crypto/mvsam/rte_mrvl_pmd.c b/drivers/crypto/mvsam/rte_mrvl_pmd.c
index 001aa28..9c3bb91 100644
--- a/drivers/crypto/mvsam/rte_mrvl_pmd.c
+++ b/drivers/crypto/mvsam/rte_mrvl_pmd.c
@@ -12,11 +12,10 @@
 #include 
 #include 
 #include 
+#include 

 #include "rte_mrvl_pmd_private.h"

-#define MRVL_MUSDK_DMA_MEMSIZE 41943040
-
 #define MRVL_PMD_MAX_NB_SESS_ARG   ("max_nb_sessions")
 #define MRVL_PMD_DEFAULT_MAX_NB_SESSIONS   2048

@@ -767,7 +766,7 @@ cryptodev_mrvl_crypto_create(const char *name,
struct rte_cryptodev *dev;
struct mrvl_crypto_private *internals;
struct sam_init_params  sam_params;
-   int ret;
+   int ret = -EINVAL;

dev = rte_cryptodev_pmd_create(name, &vdev->device,
&init_params->common);
@@ -793,30 +792,26 @@ cryptodev_mrvl_crypto_create(const char *name,
internals->max_nb_qpairs = init_params->common.max_nb_queue_pairs;
internals->max_nb_sessions = init_params->max_nb_sessions;

-   /*
-* ret == -EEXIST is correct, it means DMA
-* has been already initialized.
-*/
-   ret = mv_sys_dma_mem_init(MRVL_MUSDK_DMA_MEMSIZE);
-   if (ret < 0) {
-   if (ret != -EEXIST)
-   return ret;
-
-   MRVL_CRYPTO_LOG_INFO(
-   "DMA memory has been already initialized by a different 
driver.");
-   }
+   ret = rte_mvep_init(MVEP_MOD_T_SAM, NULL);
+   if (ret)
+   goto init_error;

sam_params.max_num_sessions = internals->max_nb_sessions;

/* sam_set_debug_flags(3); */
-   return sam_init(&sam_params);
+
+   ret = sam_init(&sam_params);
+   if (ret)
+   goto init_error;
+
+   return 0;

 init_error:
MRVL_CRYPTO_LOG_ERR(
"driver %s: %s failed", init_params->common.name, __func__);

cryptodev_mrvl_crypto_uninit(vdev);
-   return -EFAULT;
+   return ret;
 }

 /** Parse integer from integer argument */
@@ -966,6 +961,7 @@ cryptodev_mrvl_crypto_uninit(struct rte_vdev_device *vdev)
name, rte_socket_id());

sam_deinit();
+   rte_mvep_deinit(MVEP_MOD_T_SAM);

cryptodev = rte_cryptodev_pmd_get_named_dev(name);
if (cryptodev == NULL)
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 899d51a..c8a261e 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -98,7 +98,9 @@ ifeq ($(CONFIG_RTE_LIBRTE_PMD_OCTEONTX_SSOVF)$(CONFIG_RTE_LIBRTE_OCTEONTX_MEMPOO
 _LDLIBS-y += -lrte_common_octeontx
 endif

-ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
+MVEP-y += $(CONFIG_RTE_LIBRTE_MVPP2_PMD)
+MVEP-y += $(CONFIG_RTE_LIBRTE_PMD_MVSAM_CRYPTO)
+ifneq (,$(filter y,$(MVEP-y)))
 _LDLIBS-y += -lrte_common_mvep -L$(LIBMUSDK_PATH)/lib -lmusdk
 endif

--
2.7.4



[dpdk-dev] [PATCH 3/3] crypto/mvsam: add dynamic logging support

2018-08-27 Thread Tomasz Duszynski
Add dynamic logging support to mvsam crypto PMD.
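
For readers unfamiliar with the idea, the sketch below shows in plain C what
moving from a compile-time debug switch to a runtime log level means. It is
an illustration under stated assumptions, not the PMD code: it does not use
the DPDK logging API, and sketch_level, sketch_set_level(), sketch_log() and
SKETCH_LOG() are hypothetical names.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

enum sketch_log_level {
	SKETCH_LOG_ERR = 1,
	SKETCH_LOG_INFO = 2,
	SKETCH_LOG_DEBUG = 3,
};

/* Runtime-adjustable threshold instead of a compile-time DEBUG option. */
static enum sketch_log_level sketch_level = SKETCH_LOG_INFO;

void
sketch_set_level(enum sketch_log_level level)
{
	sketch_level = level;
}

/* Print the message if its level passes the threshold; return 1 if printed. */
int
sketch_log(enum sketch_log_level level, const char *fmt, ...)
{
	va_list ap;

	if (level > sketch_level)
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);

	return 1;
}

/* Call sites name the level once, in the style of MRVL_LOG(ERR, ...). */
#define SKETCH_LOG(level, ...) sketch_log(SKETCH_LOG_##level, __VA_ARGS__)
```

Debug messages stay in the binary and are enabled by raising the level at
run time, which is why a separate debug build option is no longer needed.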

Signed-off-by: Tomasz Duszynski 
---
 config/common_base                          |  1 -
 drivers/crypto/mvsam/rte_mrvl_pmd.c         | 74 ++---
 drivers/crypto/mvsam/rte_mrvl_pmd_ops.c     | 10 ++--
 drivers/crypto/mvsam/rte_mrvl_pmd_private.h | 27 +++
 4 files changed, 49 insertions(+), 63 deletions(-)

diff --git a/config/common_base b/config/common_base
index 4bcbaf9..271f549 100644
--- a/config/common_base
+++ b/config/common_base
@@ -559,7 +559,6 @@ CONFIG_RTE_LIBRTE_PMD_CCP=n
 # Compile PMD for Marvell Crypto device
 #
 CONFIG_RTE_LIBRTE_PMD_MVSAM_CRYPTO=n
-CONFIG_RTE_LIBRTE_PMD_MVSAM_CRYPTO_DEBUG=n
 
 #
 # Compile generic security library
diff --git a/drivers/crypto/mvsam/rte_mrvl_pmd.c b/drivers/crypto/mvsam/rte_mrvl_pmd.c
index 9c3bb91..33dd018 100644
--- a/drivers/crypto/mvsam/rte_mrvl_pmd.c
+++ b/drivers/crypto/mvsam/rte_mrvl_pmd.c
@@ -224,7 +224,7 @@ mrvl_crypto_set_cipher_session_parameters(struct mrvl_crypto_session *sess,
 {
/* Make sure we've got proper struct */
if (cipher_xform->type != RTE_CRYPTO_SYM_XFORM_CIPHER) {
-   MRVL_CRYPTO_LOG_ERR("Wrong xform struct provided!");
+   MRVL_LOG(ERR, "Wrong xform struct provided!");
return -EINVAL;
}
 
@@ -232,7 +232,7 @@ mrvl_crypto_set_cipher_session_parameters(struct mrvl_crypto_session *sess,
if ((cipher_xform->cipher.algo > RTE_DIM(cipher_map)) ||
(cipher_map[cipher_xform->cipher.algo].supported
!= ALGO_SUPPORTED)) {
-   MRVL_CRYPTO_LOG_ERR("Cipher algorithm not supported!");
+   MRVL_LOG(ERR, "Cipher algorithm not supported!");
return -EINVAL;
}
 
@@ -252,7 +252,7 @@ mrvl_crypto_set_cipher_session_parameters(struct mrvl_crypto_session *sess,
/* Get max key length. */
if (cipher_xform->cipher.key.length >
cipher_map[cipher_xform->cipher.algo].max_key_len) {
-   MRVL_CRYPTO_LOG_ERR("Wrong key length!");
+   MRVL_LOG(ERR, "Wrong key length!");
return -EINVAL;
}
 
@@ -275,14 +275,14 @@ mrvl_crypto_set_auth_session_parameters(struct mrvl_crypto_session *sess,
 {
/* Make sure we've got proper struct */
if (auth_xform->type != RTE_CRYPTO_SYM_XFORM_AUTH) {
-   MRVL_CRYPTO_LOG_ERR("Wrong xform struct provided!");
+   MRVL_LOG(ERR, "Wrong xform struct provided!");
return -EINVAL;
}
 
/* See if map data is present and valid */
if ((auth_xform->auth.algo > RTE_DIM(auth_map)) ||
(auth_map[auth_xform->auth.algo].supported != ALGO_SUPPORTED)) {
-   MRVL_CRYPTO_LOG_ERR("Auth algorithm not supported!");
+   MRVL_LOG(ERR, "Auth algorithm not supported!");
return -EINVAL;
}
 
@@ -314,7 +314,7 @@ mrvl_crypto_set_aead_session_parameters(struct mrvl_crypto_session *sess,
 {
/* Make sure we've got proper struct */
if (aead_xform->type != RTE_CRYPTO_SYM_XFORM_AEAD) {
-   MRVL_CRYPTO_LOG_ERR("Wrong xform struct provided!");
+   MRVL_LOG(ERR, "Wrong xform struct provided!");
return -EINVAL;
}
 
@@ -322,7 +322,7 @@ mrvl_crypto_set_aead_session_parameters(struct mrvl_crypto_session *sess,
if ((aead_xform->aead.algo > RTE_DIM(aead_map)) ||
(aead_map[aead_xform->aead.algo].supported
!= ALGO_SUPPORTED)) {
-   MRVL_CRYPTO_LOG_ERR("AEAD algorithm not supported!");
+   MRVL_LOG(ERR, "AEAD algorithm not supported!");
return -EINVAL;
}
 
@@ -340,7 +340,7 @@ mrvl_crypto_set_aead_session_parameters(struct mrvl_crypto_session *sess,
/* Get max key length. */
if (aead_xform->aead.key.length >
aead_map[aead_xform->aead.algo].max_key_len) {
-   MRVL_CRYPTO_LOG_ERR("Wrong key length!");
+   MRVL_LOG(ERR, "Wrong key length!");
return -EINVAL;
}
 
@@ -405,21 +405,21 @@ mrvl_crypto_set_session_parameters(struct mrvl_crypto_session *sess,
if ((cipher_xform != NULL) &&
(mrvl_crypto_set_cipher_session_parameters(
sess, cipher_xform) < 0)) {
-   MRVL_CRYPTO_LOG_ERR("Invalid/unsupported cipher parameters");
+   MRVL_LOG(ERR, "Invalid/unsupported cipher parameters!");
return -EINVAL;
}
 
if ((auth_xform != NULL) &&
(mrvl_crypto_set_auth_session_parameters(
sess, auth_xform) < 0)) {
-   MRVL_CRYPTO_LOG_ERR("Invalid/unsupported auth parameters");
+   MRVL_LOG(ERR, "Invalid/unsupported auth parameters!");
return -EINVAL;
}
 
if ((aead_xform != NULL) &&
(mrvl_crypto_set_aead_session_

[dpdk-dev] [PATCH] mem: share legacy and single file segments mode with secondaries

2018-08-27 Thread Anatoly Burakov
Currently, command-line switches for legacy mem mode or single-file
segments mode are only stored in internal config. This leads to a
situation where these flags have to always match between primary
and secondary, which is bad for usability.

Fix this by storing these flags in the shared config as well, so
that secondary process can know if the primary was launched in
single-file segments or legacy mem mode.

This bumps the EAL ABI, however there's an EAL deprecation notice
already in place[1] for a different feature, so that's OK.

[1] http://patches.dpdk.org/patch/43502/

Signed-off-by: Anatoly Burakov 
---
 .../common/include/rte_eal_memconfig.h|  4 
 lib/librte_eal/linuxapp/eal/Makefile  |  2 +-
 lib/librte_eal/linuxapp/eal/eal.c | 20 +++
 lib/librte_eal/meson.build|  2 +-
 4 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_memconfig.h b/lib/librte_eal/common/include/rte_eal_memconfig.h
index aff0688dd..62a21c2dc 100644
--- a/lib/librte_eal/common/include/rte_eal_memconfig.h
+++ b/lib/librte_eal/common/include/rte_eal_memconfig.h
@@ -77,6 +77,10 @@ struct rte_mem_config {
 * exact same address the primary process maps it.
 */
uint64_t mem_cfg_addr;
+
+   /* legacy mem and single file segments options are shared */
+   uint32_t legacy_mem;
+   uint32_t single_file_segments;
 } __attribute__((__packed__));
 
 
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index fd92c75c2..5c16bc40f 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -10,7 +10,7 @@ ARCH_DIR ?= $(RTE_ARCH)
 EXPORT_MAP := ../../rte_eal_version.map
 VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)
 
-LIBABIVER := 8
+LIBABIVER := 9
 
 VPATH += $(RTE_SDK)/lib/librte_eal/common
 
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e59ac6577..4a55d3b69 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -352,6 +352,24 @@ eal_proc_type_detect(void)
return ptype;
 }
 
+/* copies data from internal config to shared config */
+static void
+eal_update_mem_config(void)
+{
+   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+   mcfg->legacy_mem = internal_config.legacy_mem;
+   mcfg->single_file_segments = internal_config.single_file_segments;
+}
+
+/* copies data from shared config to internal config */
+static void
+eal_update_internal_config(void)
+{
+   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+   internal_config.legacy_mem = mcfg->legacy_mem;
+   internal_config.single_file_segments = mcfg->single_file_segments;
+}
+
 /* Sets up rte_config structure with the pointer to shared memory config.*/
 static void
 rte_config_init(void)
@@ -361,11 +379,13 @@ rte_config_init(void)
switch (rte_config.process_type){
case RTE_PROC_PRIMARY:
rte_eal_config_create();
+   eal_update_mem_config();
break;
case RTE_PROC_SECONDARY:
rte_eal_config_attach();
rte_eal_mcfg_wait_complete(rte_config.mem_config);
rte_eal_config_reattach();
+   eal_update_internal_config();
break;
case RTE_PROC_AUTO:
case RTE_PROC_INVALID:
diff --git a/lib/librte_eal/meson.build b/lib/librte_eal/meson.build
index e1fde15d1..62ef985b9 100644
--- a/lib/librte_eal/meson.build
+++ b/lib/librte_eal/meson.build
@@ -21,7 +21,7 @@ else
error('unsupported system type "@0@"'.format(host_machine.system()))
 endif
 
-version = 8  # the version of the EAL API
+version = 9  # the version of the EAL API
 allow_experimental_apis = true
 deps += 'compat'
 deps += 'kvargs'
-- 
2.17.1
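
For illustration, the flag sharing described in this patch can be reduced to the sketch below. It is a minimal model, not the EAL code: shared_mem_config, local_config, proc_type and sync_mem_flags() are hypothetical stand-ins for struct rte_mem_config, internal_config and the two helpers added in the diff. The point it shows is the direction of the copy: the primary publishes its flags, the secondary adopts them regardless of its own command line.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared (mmap'ed) config; only the two
 * fields added by the patch are modelled. */
struct shared_mem_config {
	uint32_t legacy_mem;
	uint32_t single_file_segments;
};

/* Hypothetical stand-in for the per-process internal config. */
struct local_config {
	unsigned int legacy_mem;
	unsigned int single_file_segments;
};

enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Primary: copy local flags into the shared config.
 * Secondary: overwrite local flags with what the primary published. */
static void
sync_mem_flags(enum proc_type type, struct local_config *local,
	       struct shared_mem_config *shared)
{
	if (type == PROC_PRIMARY) {
		shared->legacy_mem = local->legacy_mem;
		shared->single_file_segments = local->single_file_segments;
	} else {
		local->legacy_mem = shared->legacy_mem;
		local->single_file_segments = shared->single_file_segments;
	}
}
```

With this ordering, a secondary started without --legacy-mem still ends up in legacy mode when the primary was started with it, which is the usability gain the commit message is after.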


[dpdk-dev] [PATCH] bus/vdev: fix wrong error log on secondary device scan

2018-08-27 Thread Qi Zhang
When a secondary process handles the VDEV_SCAN_ONE mp action, it is possible
that the device has already been inserted. This happens when we have multiple
secondary processes, which cause multiple broadcasts from the primary during
bus->scan. So we don't need to log any error for -EEXIST.

Bugzilla ID: 84
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Cc: sta...@dpdk.org

Reported-by: Eads Gage 
Signed-off-by: Qi Zhang 
---
 drivers/bus/vdev/vdev.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 6139dd551..af9526fe6 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -346,6 +346,7 @@ vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
const char *devname;
int num;
+   int ret;
 
strlcpy(mp_resp.name, VDEV_MP_KEY, sizeof(mp_resp.name));
mp_resp.len_param = sizeof(*ou);
@@ -380,7 +381,10 @@ vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
break;
case VDEV_SCAN_ONE:
VDEV_LOG(INFO, "receive vdev, %s", in->name);
-   if (insert_vdev(in->name, NULL, NULL) < 0)
+   ret = insert_vdev(in->name, NULL, NULL);
+   if (ret == -EEXIST)
+   VDEV_LOG(INFO, "device already exist, %s", in->name);
+   else if (ret < 0)
VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
break;
default:
-- 
2.13.6
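
The error handling in this patch follows a common pattern: an insert that reports "already present" is treated as success by the caller. A small self-contained sketch of that pattern follows. insert_dev(), handle_scan_one() and the fixed-size name table are hypothetical; they only mimic the -EEXIST contract the commit message relies on and are not the vdev bus code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEVS 8
#define NAME_LEN 32

static char dev_names[MAX_DEVS][NAME_LEN];
static int dev_count;

/* Hypothetical stand-in for insert_vdev(): -EEXIST on duplicates. */
static int
insert_dev(const char *name)
{
	int i;

	if (strlen(name) >= NAME_LEN)
		return -EINVAL;
	for (i = 0; i < dev_count; i++)
		if (strcmp(dev_names[i], name) == 0)
			return -EEXIST;
	if (dev_count == MAX_DEVS)
		return -ENOMEM;
	strcpy(dev_names[dev_count++], name);
	return 0;
}

/* Returns 0 if the device is present after the call, negative errno
 * otherwise. A duplicate is logged at INFO level, not as an error. */
static int
handle_scan_one(const char *name)
{
	int ret = insert_dev(name);

	if (ret == -EEXIST) {
		printf("INFO: device already exists, %s\n", name);
		return 0;
	}
	if (ret < 0)
		fprintf(stderr, "ERR: failed to add vdev, %s\n", name);
	return ret;
}
```

Calling handle_scan_one() twice with the same name leaves one entry in the table and produces no error output, which matches the repeated-broadcast case described above.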



Re: [dpdk-dev] [PATCH v3 0/2] support MAC changes when no live changes allowed

2018-08-27 Thread Ferruh Yigit
On 8/24/2018 3:25 PM, Alejandro Lucero wrote:
> This is a patch to fix functionality that came with the first public
> release: changing/setting the MAC address.
> 
> The original patch assumes all NICs can safely change or set the MAC
> in any case. However, this is not always true. NFP depends on the firmware
> capabilities and this is not always supported. There are other NICs with
> this same limitation, although, as far as I know, not in DPDK. The Linux
> kernel has an IFF_LIVE_ADDR_CHANGE flag and two NICs check this flag to
> allow or disallow live MAC changes.
> 
> The flag proposed in this patch is just the opposite: advertise if live
> change is not supported, and assume it is supported otherwise.
> 
> Although most NICs support rte_eth_dev_default_mac_addr_set and this
> function returns an error when live change is not supported, note that
> this function is invoked during port start but the value returned is not
> checked. It is likely this is good enough for most cases, but
> bonding relies on this start-then-MAC-set/change sequence, and a PMD port
> is then not properly configured for use as a slave port in some bonding
> modes.
> 
> v2:
>  - add RTE_ETH_DEV_NOLIVE_MAC_ADDR comment in rte_eth_dev_default_mac_addr_set doc
>  - add rte_eth_dev_start change in release API changes
> 
> v3:
>  - merge doc API changes with first patch
>  - comment behaviour change in rte_eth_dev_start
>  - remove comment on rte_eth_dev_default_mac_addr_set

Series applied to dpdk-next-net/master, thanks.


[dpdk-dev] [PATCH] mbuf: add IGMP packet type

2018-08-27 Thread Jerin Jacob
Add support for IGMP packet type.

Signed-off-by: Jerin Jacob 
---
 lib/librte_mbuf/rte_mbuf_ptype.c | 1 +
 lib/librte_mbuf/rte_mbuf_ptype.h | 8 
 2 files changed, 9 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_ptype.c b/lib/librte_mbuf/rte_mbuf_ptype.c
index d7835e283..b483a609d 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.c
+++ b/lib/librte_mbuf/rte_mbuf_ptype.c
@@ -47,6 +47,7 @@ const char *rte_get_ptype_l4_name(uint32_t ptype)
case RTE_PTYPE_L4_SCTP: return "L4_SCTP";
case RTE_PTYPE_L4_ICMP: return "L4_ICMP";
case RTE_PTYPE_L4_NONFRAG: return "L4_NONFRAG";
+   case RTE_PTYPE_L4_IGMP: return "L4_IGMP";
default: return "L4_UNKNOWN";
}
 }
diff --git a/lib/librte_mbuf/rte_mbuf_ptype.h b/lib/librte_mbuf/rte_mbuf_ptype.h
index 01acc66e2..00db3eeed 100644
--- a/lib/librte_mbuf/rte_mbuf_ptype.h
+++ b/lib/librte_mbuf/rte_mbuf_ptype.h
@@ -286,6 +286,14 @@ extern "C" {
  * | 'version'=6, 'next header'!=[6|17|44|132|1]>
  */
 #define RTE_PTYPE_L4_NONFRAG 0x0600
+/**
+ * IGMP (Internet Group Management Protocol) packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=2, 'MF'=0, 'frag_offset'=0>
+ */
+#define RTE_PTYPE_L4_IGMP   0x0700
 /**
  * Mask of layer 4 packet types.
  * It is used for outer packet for tunneling cases.
-- 
2.18.0
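
As a rough illustration of how a driver or parser might use the new type, the sketch below classifies an IPv4 header as IGMP using the conditions from the doc comment in the patch (protocol 2, MF clear, fragment offset 0) and maps the result to a name. The PTYPE_* names are local copies for the example; the L4 mask value 0x0f00 is an assumption, and l4_ptype_from_ipv4() is a hypothetical helper, not a DPDK function.

```c
#include <stdint.h>

#define PTYPE_L4_NONFRAG 0x0600u /* value as shown in the patch context */
#define PTYPE_L4_IGMP    0x0700u /* value as added by the patch */
#define PTYPE_L4_MASK    0x0f00u /* assumed layer 4 mask */

#define IP_PROTO_IGMP 2 /* 'protocol'=2 per the doc comment */

/* Hypothetical classifier: returns the IGMP ptype for an unfragmented
 * IPv4 packet carrying protocol 2, and 0 (unknown) for everything else. */
static uint32_t
l4_ptype_from_ipv4(uint8_t protocol, int more_fragments, uint16_t frag_offset)
{
	if (more_fragments || frag_offset != 0)
		return 0;
	if (protocol == IP_PROTO_IGMP)
		return PTYPE_L4_IGMP;
	return 0;
}

/* Mirrors the switch in rte_get_ptype_l4_name() for the two types above. */
static const char *
l4_name(uint32_t ptype)
{
	switch (ptype & PTYPE_L4_MASK) {
	case PTYPE_L4_NONFRAG: return "L4_NONFRAG";
	case PTYPE_L4_IGMP: return "L4_IGMP";
	default: return "L4_UNKNOWN";
	}
}
```

Because the name lookup masks the ptype first, any L2/L3 bits a real driver ORs into the same word would not affect the result.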



[dpdk-dev] [PATCH v2] net/mlx: add meson build support

2018-08-27 Thread Nelio Laranjeiro
Mellanox drivers remain uncompiled by default due to third-party
library dependencies.  They can be enabled through:
- enable_driver_mlx{4,5}=true or
- enable_driver_mlx{4,5}_glue=true
depending on the needs.

To avoid modifying all the sources and to keep compatibility with
current build systems (e.g. make), the mlx{4,5}_autoconf.h is still
generated by invoking DPDK scripts through meson's run_command() instead
of using has_types, has_members, ... commands.

Meson will try to find the required external libraries.  When they are
not installed system-wide, they can be provided through the CFLAGS, LDFLAGS
and LD_LIBRARY_PATH environment variables, for example (assuming
RDMA-Core is installed in /tmp/rdma-core):

 # CFLAGS=-I/tmp/rdma-core/build/include \
   LDFLAGS=-L/tmp/rdma-core/build/lib \
   LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   meson -Denable_driver_mlx4=true output

 # CFLAGS=-I/tmp/rdma-core/build/include \
   LDFLAGS=-L/tmp/rdma-core/build/lib \
   LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   ninja -C output install

Signed-off-by: Nelio Laranjeiro 

---

Changes in v2:

- dropped patch https://patches.dpdk.org/patch/43897/
- remove extra_{cflags,ldflags} as already honored by meson through
environment variables.
---
 drivers/net/meson.build  |   2 +
 drivers/net/mlx4/meson.build |  94 ++
 drivers/net/mlx5/meson.build | 545 +++
 meson_options.txt|   8 +
 4 files changed, 649 insertions(+)
 create mode 100644 drivers/net/mlx4/meson.build
 create mode 100644 drivers/net/mlx5/meson.build

diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index 9c28ed4da..c7a2d0e7d 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -18,6 +18,8 @@ drivers = ['af_packet',
'ixgbe',
'kni',
'liquidio',
+   'mlx4',
+   'mlx5',
'mvpp2',
'netvsc',
'nfp',
diff --git a/drivers/net/mlx4/meson.build b/drivers/net/mlx4/meson.build
new file mode 100644
index 0..debaca5b6
--- /dev/null
+++ b/drivers/net/mlx4/meson.build
@@ -0,0 +1,94 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2018 6WIND S.A.
+# Copyright 2018 Mellanox Technologies, Ltd
+
+# As there is no more configuration file to activate/configure the PMD it will
+# use some variables here to configure it.
+pmd_dlopen = get_option('enable_driver_mlx4_glue')
+build = get_option('enable_driver_mlx4') or pmd_dlopen
+# dpdk_conf.set('RTE_LIBRTE_MLX4_DEBUG', 1)
+# Glue configuration
+LIB_GLUE_BASE = 'librte_pmd_mlx4_glue.so'
+LIB_GLUE_VERSION = '18.02.0'
+LIB_GLUE = LIB_GLUE_BASE + '.' + LIB_GLUE_VERSION
+if pmd_dlopen
+dpdk_conf.set('RTE_LIBRTE_MLX4_DLOPEN_DEPS', 1)
+cflags += [
+'-DMLX4_GLUE="@0@"'.format(LIB_GLUE),
+'-DMLX4_GLUE_VERSION="@0@"'.format(LIB_GLUE_VERSION),
+'-ldl',
+]
+endif
+# Compile PMD
+if build
+allow_experimental_apis = true
+ext_deps += [
+cc.find_library('mnl'),
+cc.find_library('mlx4'),
+cc.find_library('ibverbs'),
+]
+sources = files(
+   'mlx4.c',
+   'mlx4_ethdev.c',
+   'mlx4_flow.c',
+   'mlx4_intr.c',
+   'mlx4_mr.c',
+   'mlx4_rxq.c',
+   'mlx4_rxtx.c',
+   'mlx4_txq.c',
+   'mlx4_utils.c',
+)
+if not pmd_dlopen
+sources += files('mlx4_glue.c')
+endif
+cflags += [
+   '-O3',
+   '-Wall',
+   '-Wextra',
+   '-g',
+   '-std=c11',
+   '-I.',
+   '-D_BSD_SOURCE',
+   '-D_DEFAULT_SOURCE',
+   '-D_XOPEN_SOURCE=600',
+   '-Wno-strict-prototypes',
+]
+if dpdk_conf.has('RTE_LIBRTE_MLX4_DEBUG')
+cflags += [ '-pedantic', '-UNDEBUG', '-DPEDANTIC' ]
+else
+cflags += [ '-DNDEBUG', '-UPEDANTIC' ]
+endif
+# To maintain the compatibility with the make build system
+# mlx4_autoconf.h file is still generated.
+r = run_command('sh', '../../../buildtools/auto-config-h.sh',
+'mlx4_autoconf.h',
+'HAVE_IBV_MLX4_WQE_LSO_SEG',
+'infiniband/mlx4dv.h',
+'type', 'struct mlx4_wqe_lso_seg')
+if r.returncode() != 0
+error('autoconfiguration fail')
+endif
+endif
+# Build Glue Library
+if pmd_dlopen
+dlopen_name = 'mlx4_glue'
+dlopen_lib_name = driver_name_fmt.format(dlopen_name)
+dlopen_so_version = LIB_GLUE_VERSION
+dlopen_sources = files('mlx4_glue.c')
+dlopen_install_dir = [ eal_pmd_path + '-glue' ]
+shared_lib = shared_library(
+   dlopen_lib_name,
+   dlopen_sources,
+   include_directories: global_in

[dpdk-dev] [PATCH 0/4] net/cxgbe: add destination MAC match and VLAN rewrite support for flow API

2018-08-27 Thread Rahul Lakkireddy
This series of patches adds support to offload flows with destination MAC
match item and VLAN push/pop/rewrite actions.

Patch 1 adds API to program and manage hardware Layer 2 Table (L2T).
L2T holds destination node information to be used for VLAN rewrite.

Patch 2 implements offloading VLAN push/pop/rewrite actions.

Patch 3 adds API to program and manage hardware Multi Port Switch (MPS)
table. MPS holds the destination MAC addresses to be matched against
incoming packets.

Patch 4 implements offloading destination MAC match item.

Thanks,
Rahul

Shagun Agrawal (4):
  net/cxgbe: add API to program hardware layer 2 table
  net/cxgbe: add flow operations to offload vlan actions
  net/cxgbe: add API to program hardware MPS table
  net/cxgbe: add flow operations to match based on destination MAC address

 doc/guides/rel_notes/release_18_11.rst  |   7 +
 drivers/net/cxgbe/Makefile  |   2 +
 drivers/net/cxgbe/base/adapter.h|   4 +
 drivers/net/cxgbe/base/common.h |   7 +
 drivers/net/cxgbe/base/t4_hw.c  | 108 ++
 drivers/net/cxgbe/base/t4_msg.h |  40 ++
 drivers/net/cxgbe/base/t4_regs.h|   8 ++
 drivers/net/cxgbe/base/t4_tcb.h |   5 +
 drivers/net/cxgbe/base/t4fw_interface.h |  26 
 drivers/net/cxgbe/cxgbe_ethdev.c|   4 +-
 drivers/net/cxgbe/cxgbe_filter.c|  71 +-
 drivers/net/cxgbe/cxgbe_filter.h|  11 ++
 drivers/net/cxgbe/cxgbe_flow.c  |  90 +++-
 drivers/net/cxgbe/cxgbe_flow.h  |   1 +
 drivers/net/cxgbe/cxgbe_main.c  |  43 --
 drivers/net/cxgbe/l2t.c | 227 +
 drivers/net/cxgbe/l2t.h |  57 
 drivers/net/cxgbe/meson.build   |   2 +
 drivers/net/cxgbe/mps_tcam.c| 243 
 drivers/net/cxgbe/mps_tcam.h|  52 +++
 20 files changed, 987 insertions(+), 21 deletions(-)
 create mode 100644 drivers/net/cxgbe/l2t.c
 create mode 100644 drivers/net/cxgbe/l2t.h
 create mode 100644 drivers/net/cxgbe/mps_tcam.c
 create mode 100644 drivers/net/cxgbe/mps_tcam.h

-- 
2.14.1



[dpdk-dev] [PATCH 4/4] net/cxgbe: add flow operations to match based on destination MAC address

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

Add flow operations to match packets based on destination MAC address.
Allocate and program hardware MPS table with the destination MAC
address to be matched against. The returned MPS index is then used while
offloading flows to LETCAM (maskfull) and HASH (maskless) filter regions.

Also update existing mac_addr_set() to use the new MPS table API.

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 doc/guides/rel_notes/release_18_11.rst |  1 +
 drivers/net/cxgbe/base/common.h|  1 +
 drivers/net/cxgbe/base/t4_hw.c |  2 ++
 drivers/net/cxgbe/cxgbe_ethdev.c   |  4 +--
 drivers/net/cxgbe/cxgbe_filter.c   |  9 +++--
 drivers/net/cxgbe/cxgbe_filter.h   |  1 +
 drivers/net/cxgbe/cxgbe_flow.c | 66 --
 drivers/net/cxgbe/cxgbe_flow.h |  1 +
 drivers/net/cxgbe/cxgbe_main.c |  6 ++--
 9 files changed, 80 insertions(+), 11 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index db99518f4..f7bef95e1 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -58,6 +58,7 @@ New Features
 
   Flow API support has been enhanced for CXGBE Poll Mode Driver to offload:
 
+  * Match items: destination MAC address.
   * Action items: push/pop/rewrite vlan header.
 
 
diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h
index 9f5756850..d9f74d995 100644
--- a/drivers/net/cxgbe/base/common.h
+++ b/drivers/net/cxgbe/base/common.h
@@ -157,6 +157,7 @@ struct tp_params {
int port_shift;
int protocol_shift;
int ethertype_shift;
+   int macmatch_shift;
 
u64 hash_filter_mask;
 };
diff --git a/drivers/net/cxgbe/base/t4_hw.c b/drivers/net/cxgbe/base/t4_hw.c
index d60894115..701e0b1fe 100644
--- a/drivers/net/cxgbe/base/t4_hw.c
+++ b/drivers/net/cxgbe/base/t4_hw.c
@@ -5251,6 +5251,8 @@ int t4_init_tp_params(struct adapter *adap)
   F_PROTOCOL);
adap->params.tp.ethertype_shift = t4_filter_field_shift(adap,
F_ETHERTYPE);
+   adap->params.tp.macmatch_shift = t4_filter_field_shift(adap,
+  F_MACMATCH);
 
/*
 * If TP_INGRESS_CONFIG.VNID == 0, then TP_VLAN_PRI_MAP.VNIC_ID
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 4dcad7a23..f253c2023 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -1075,11 +1075,9 @@ static int cxgbe_get_regs(struct rte_eth_dev *eth_dev,
 int cxgbe_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *addr)
 {
struct port_info *pi = (struct port_info *)(dev->data->dev_private);
-   struct adapter *adapter = pi->adapter;
int ret;
 
-   ret = t4_change_mac(adapter, adapter->mbox, pi->viid,
-   pi->xact_addr_filt, (u8 *)addr, true, true);
+   ret = cxgbe_mpstcam_modify(pi, (int)pi->xact_addr_filt, (u8 *)addr);
if (ret < 0) {
dev_err(adapter, "failed to set mac addr; err = %d\n",
ret);
diff --git a/drivers/net/cxgbe/cxgbe_filter.c b/drivers/net/cxgbe/cxgbe_filter.c
index 4d3f3ebee..dcb1dd03e 100644
--- a/drivers/net/cxgbe/cxgbe_filter.c
+++ b/drivers/net/cxgbe/cxgbe_filter.c
@@ -66,7 +66,8 @@ int validate_filter(struct adapter *adapter, struct ch_filter_specification *fs)
 #define U(_mask, _field) \
(!(fconf & (_mask)) && S(_field))
 
-   if (U(F_PORT, iport) || U(F_ETHERTYPE, ethtype) || U(F_PROTOCOL, proto))
+   if (U(F_PORT, iport) || U(F_ETHERTYPE, ethtype) ||
+   U(F_PROTOCOL, proto) || U(F_MACMATCH, macidx))
return -EOPNOTSUPP;
 
 #undef S
@@ -268,6 +269,8 @@ static u64 hash_filter_ntuple(const struct filter_entry *f)
 
if (tp->ethertype_shift >= 0 && f->fs.mask.ethtype)
ntuple |= (u64)(f->fs.val.ethtype) << tp->ethertype_shift;
+   if (tp->macmatch_shift >= 0 && f->fs.mask.macidx)
+   ntuple |= (u64)(f->fs.val.macidx) << tp->macmatch_shift;
 
if (ntuple != tp->hash_filter_mask)
return 0;
@@ -744,7 +747,9 @@ int set_filter_wr(struct rte_eth_dev *dev, unsigned int fidx)
V_FW_FILTER_WR_RX_RPL_IQ(adapter->sge.fw_evtq.abs_id
 ));
fwr->maci_to_matchtypem =
-   cpu_to_be32(V_FW_FILTER_WR_PORT(f->fs.val.iport) |
+   cpu_to_be32(V_FW_FILTER_WR_MACI(f->fs.val.macidx) |
+   V_FW_FILTER_WR_MACIM(f->fs.mask.macidx) |
+   V_FW_FILTER_WR_PORT(f->fs.val.iport) |
V_FW_FILTER_WR_PORTM(f->fs.mask.iport));
fwr->ptcl = f->fs.val.proto;
fwr->ptclm = f->fs.mask.proto;
dif
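
The hash_filter_ntuple() hunk above folds the MPS index into the filter tuple the same way as the existing ethertype field: each field is shifted to its configured position and ORed in, but only when the field is enabled (shift >= 0) and the flow actually matches on it (non-zero mask). A reduced sketch of that construction is shown here. struct tp_shifts, struct match and build_ntuple() are hypothetical simplifications, and the shift values used in any example are made up.

```c
#include <stdint.h>

/* Hypothetical subset of the tuple layout; a negative shift means the
 * field is not part of the compressed filter tuple. */
struct tp_shifts {
	int ethertype_shift;
	int macmatch_shift;
};

/* Hypothetical subset of a filter specification (value/mask pairs). */
struct match {
	uint16_t ethtype, ethtype_mask;
	uint16_t macidx, macidx_mask;
};

/* OR each enabled and requested field into the 64-bit tuple. */
static uint64_t
build_ntuple(const struct tp_shifts *tp, const struct match *m)
{
	uint64_t ntuple = 0;

	if (tp->ethertype_shift >= 0 && m->ethtype_mask)
		ntuple |= (uint64_t)m->ethtype << tp->ethertype_shift;
	if (tp->macmatch_shift >= 0 && m->macidx_mask)
		ntuple |= (uint64_t)m->macidx << tp->macmatch_shift;

	return ntuple;
}
```

For example, with ethertype at shift 0 and macmatch at shift 16, an IPv4 ethertype (0x0800) and MPS index 5 combine into 0x50800; with macmatch disabled the same flow yields 0x0800.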

[dpdk-dev] [PATCH 1/4] net/cxgbe: add API to program hardware layer 2 table

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

Add API to program and manage hardware Layer 2 Table. L2T holds
information necessary to rewrite specific fields in packet, such
as destination MAC address and vlan id.

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 drivers/net/cxgbe/Makefile  |   1 +
 drivers/net/cxgbe/base/adapter.h|   3 +
 drivers/net/cxgbe/base/t4_msg.h |  34 +
 drivers/net/cxgbe/base/t4fw_interface.h |   2 +
 drivers/net/cxgbe/cxgbe_filter.h|   1 +
 drivers/net/cxgbe/cxgbe_main.c  |  30 -
 drivers/net/cxgbe/l2t.c | 227 
 drivers/net/cxgbe/l2t.h |  57 
 drivers/net/cxgbe/meson.build   |   1 +
 9 files changed, 349 insertions(+), 7 deletions(-)
 create mode 100644 drivers/net/cxgbe/l2t.c
 create mode 100644 drivers/net/cxgbe/l2t.h

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 5d66c4b3a..d75b070f3 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4vf_hw.c
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index e98dd2182..9f4a9653c 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -324,7 +324,10 @@ struct adapter {
 
unsigned int clipt_start; /* CLIP table start */
unsigned int clipt_end;   /* CLIP table end */
+   unsigned int l2t_start;   /* Layer 2 table start */
+   unsigned int l2t_end; /* Layer 2 table end */
struct clip_tbl *clipt;   /* CLIP table */
+   struct l2t_data *l2t; /* Layer 2 table */
 
struct tid_info tids; /* Info used to access TID related tables */
 };
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index 5d433c91c..094a153f2 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -11,7 +11,9 @@ enum {
CPL_SET_TCB_FIELD = 0x5,
CPL_ABORT_REQ = 0xA,
CPL_ABORT_RPL = 0xB,
+   CPL_L2T_WRITE_REQ = 0x12,
CPL_TID_RELEASE   = 0x1A,
+   CPL_L2T_WRITE_RPL = 0x23,
CPL_ACT_OPEN_RPL  = 0x25,
CPL_ABORT_RPL_RSS = 0x2D,
CPL_SET_TCB_RPL   = 0x3A,
@@ -66,6 +68,9 @@ union opcode_tid {
 #define M_TID_TID 0x3fff
 #define G_TID_TID(x) (((x) >> S_TID_TID) & M_TID_TID)
 
+#define S_TID_QID 14
+#define V_TID_QID(x) ((x) << S_TID_QID)
+
 struct rss_header {
__u8 opcode;
 #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
@@ -421,6 +426,35 @@ struct cpl_rx_pkt {
__be16 err_vec;
 };
 
+struct cpl_l2t_write_req {
+   WR_HDR;
+   union opcode_tid ot;
+   __be16 params;
+   __be16 l2t_idx;
+   __be16 vlan;
+   __u8   dst_mac[6];
+};
+
+/* cpl_l2t_write_req.params fields */
+#define S_L2T_W_PORT 8
+#define V_L2T_W_PORT(x) ((x) << S_L2T_W_PORT)
+
+#define S_L2T_W_LPBK 10
+#define V_L2T_W_LPBK(x) ((x) << S_L2T_W_LPBK)
+
+#define S_L2T_W_ARPMISS 11
+#define V_L2T_W_ARPMISS(x)  ((x) << S_L2T_W_ARPMISS)
+
+#define S_L2T_W_NOREPLY 15
+#define V_L2T_W_NOREPLY(x) ((x) << S_L2T_W_NOREPLY)
+
+struct cpl_l2t_write_rpl {
+   RSS_HDR
+   union opcode_tid ot;
+   __u8 status;
+   __u8 rsvd[3];
+};
+
 /* rx_pkt.l2info fields */
 #define S_RXF_UDP 22
 #define V_RXF_UDP(x) ((x) << S_RXF_UDP)
diff --git a/drivers/net/cxgbe/base/t4fw_interface.h b/drivers/net/cxgbe/base/t4fw_interface.h
index e80b58a32..1c08637bb 100644
--- a/drivers/net/cxgbe/base/t4fw_interface.h
+++ b/drivers/net/cxgbe/base/t4fw_interface.h
@@ -665,6 +665,8 @@ enum fw_params_param_pfvf {
FW_PARAMS_PARAM_PFVF_CLIP_END = 0x04,
FW_PARAMS_PARAM_PFVF_FILTER_START = 0x05,
FW_PARAMS_PARAM_PFVF_FILTER_END = 0x06,
+   FW_PARAMS_PARAM_PFVF_L2T_START = 0x13,
+   FW_PARAMS_PARAM_PFVF_L2T_END = 0x14,
FW_PARAMS_PARAM_PFVF_CPLFW4MSG_ENCAP = 0x31,
FW_PARAMS_PARAM_PFVF_PORT_CAPS32 = 0x3A
 };
diff --git a/drivers/net/cxgbe/cxgbe_filter.h b/drivers/net/cxgbe/cxgbe_filter.h
index af8fa7529..be12e231a 100644
--- a/drivers/net/cxgbe/cxgbe_filter.h
+++ b/drivers/net/cxgbe/cxgbe_filter.h
@@ -145,6 +145,7 @@ struct filter_entry {
u32 pending:1;  /* filter action is pending FW reply */
struct filter_ctx *ctx; /* caller's completion hook */
struct clip_entry *clipt;   /* CLIP Table entry for IPv6 */
+   struct l2t_entry *l2t;  /* Layer Two Table entry for dmac */
struct rte_eth_dev *dev;/* Port's rte eth device */
void *private;  /* For use by apps using filter_entry */

[dpdk-dev] [PATCH 2/4] net/cxgbe: add flow operations to offload vlan actions

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

Add flow API operations to offload vlan push, pop, and rewrite actions.
For vlan push or rewrite actions, allocate and program an entry from
L2T table. Use the L2T index to program vlan actions for LETCAM
(maskfull) and HASH (maskless) filters.

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 doc/guides/rel_notes/release_18_11.rst |  6 
 drivers/net/cxgbe/base/t4_msg.h|  6 
 drivers/net/cxgbe/base/t4_tcb.h|  5 +++
 drivers/net/cxgbe/cxgbe_filter.c   | 62 --
 drivers/net/cxgbe/cxgbe_filter.h   |  9 +
 drivers/net/cxgbe/cxgbe_flow.c | 24 +
 6 files changed, 109 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
index 3ae6b3f58..db99518f4 100644
--- a/doc/guides/rel_notes/release_18_11.rst
+++ b/doc/guides/rel_notes/release_18_11.rst
@@ -54,6 +54,12 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
+* **Add support to offload more flow match and actions for CXGBE PMD**
+
+  Flow API support has been enhanced for CXGBE Poll Mode Driver to offload:
+
+  * Action items: push/pop/rewrite vlan header.
+
 
 API Changes
 ---
diff --git a/drivers/net/cxgbe/base/t4_msg.h b/drivers/net/cxgbe/base/t4_msg.h
index 094a153f2..2128da64f 100644
--- a/drivers/net/cxgbe/base/t4_msg.h
+++ b/drivers/net/cxgbe/base/t4_msg.h
@@ -138,6 +138,12 @@ struct work_request_hdr {
 #define V_TCAM_BYPASS(x) ((__u64)(x) << S_TCAM_BYPASS)
 #define F_TCAM_BYPASS V_TCAM_BYPASS(1ULL)
 
+#define S_L2T_IDX 36
+#define V_L2T_IDX(x) ((__u64)(x) << S_L2T_IDX)
+
+#define S_NAGLE 49
+#define V_NAGLE(x) ((__u64)(x) << S_NAGLE)
+
 /* option 2 fields */
 #define S_RSS_QUEUE 0
 #define V_RSS_QUEUE(x) ((x) << S_RSS_QUEUE)
diff --git a/drivers/net/cxgbe/base/t4_tcb.h b/drivers/net/cxgbe/base/t4_tcb.h
index 25435f9f4..68cda7730 100644
--- a/drivers/net/cxgbe/base/t4_tcb.h
+++ b/drivers/net/cxgbe/base/t4_tcb.h
@@ -6,6 +6,9 @@
 #ifndef _T4_TCB_DEFS_H
 #define _T4_TCB_DEFS_H
 
+/* 95:32 */
+#define W_TCB_T_FLAGS 1
+
 /* 105:96 */
 #define W_TCB_RSS_INFO 3
 #define S_TCB_RSS_INFO 0
@@ -23,4 +26,6 @@
 #define M_TCB_T_RTT_TS_RECENT_AGE0xULL
 #define V_TCB_T_RTT_TS_RECENT_AGE(x) ((x) << S_TCB_T_RTT_TS_RECENT_AGE)
 
+#define S_TF_CCTRL_RFR 62
+
 #endif /* _T4_TCB_DEFS_H */
diff --git a/drivers/net/cxgbe/cxgbe_filter.c b/drivers/net/cxgbe/cxgbe_filter.c
index 7f0d38001..4d3f3ebee 100644
--- a/drivers/net/cxgbe/cxgbe_filter.c
+++ b/drivers/net/cxgbe/cxgbe_filter.c
@@ -8,6 +8,7 @@
 #include "t4_regs.h"
 #include "cxgbe_filter.h"
 #include "clip_tbl.h"
+#include "l2t.h"
 
 /**
  * Initialize Hash Filters
@@ -164,6 +165,16 @@ static void set_tcb_field(struct adapter *adapter, unsigned int ftid,
t4_mgmt_tx(ctrlq, mbuf);
 }
 
+/**
+ * Set one of the t_flags bits in the TCB.
+ */
+static void set_tcb_tflag(struct adapter *adap, unsigned int ftid,
+ unsigned int bit_pos, unsigned int val, int no_reply)
+{
+   set_tcb_field(adap, ftid,  W_TCB_T_FLAGS, 1ULL << bit_pos,
+ (unsigned long long)val << bit_pos, no_reply);
+}
+
 /**
  * Build a CPL_SET_TCB_FIELD message as payload of a ULP_TX_PKT command.
  */
@@ -425,7 +436,10 @@ static void mk_act_open_req6(struct filter_entry *f, struct rte_mbuf *mbuf,
req->local_ip_lo = local_lo;
req->peer_ip_hi = peer_hi;
req->peer_ip_lo = peer_lo;
-   req->opt0 = cpu_to_be64(V_DELACK(f->fs.hitcnts) |
+   req->opt0 = cpu_to_be64(V_NAGLE(f->fs.newvlan == VLAN_REMOVE ||
+   f->fs.newvlan == VLAN_REWRITE) |
+   V_DELACK(f->fs.hitcnts) |
+   V_L2T_IDX(f->l2t ? f->l2t->idx : 0) |
V_SMAC_SEL((cxgbe_port_viid(f->dev) & 0x7F)
   << 1) |
V_TX_CHAN(f->fs.eport) |
@@ -468,7 +482,10 @@ static void mk_act_open_req(struct filter_entry *f, struct rte_mbuf *mbuf,
f->fs.val.lip[2] << 16 | f->fs.val.lip[3] << 24;
req->peer_ip = f->fs.val.fip[0] | f->fs.val.fip[1] << 8 |
f->fs.val.fip[2] << 16 | f->fs.val.fip[3] << 24;
-   req->opt0 = cpu_to_be64(V_DELACK(f->fs.hitcnts) |
+   req->opt0 = cpu_to_be64(V_NAGLE(f->fs.newvlan == VLAN_REMOVE ||
+   f->fs.newvlan == VLAN_REWRITE) |
+   V_DELACK(f->fs.hitcnts) |
+   V_L2T_IDX(f->l2t ? f->l2t->idx : 0) |
V_SMAC_SEL((cxgbe_port_viid(f->dev) & 0x7F)
   << 1) |
V_TX_CHAN(f->fs.eport) |
@@ -518,6 +535,22 @@ static int cxgbe_set_hash_filter(struct rte_eth
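
The set_tcb_tflag() helper added above passes a one-bit mask (1ULL << bit_pos) and the value shifted to the same position to set_tcb_field(). The sketch below models the effect of such a mask/value update on a 64-bit flags word. It assumes the update behaves as "clear the masked bits, then OR in the value", which is not stated in the patch; apply_field() and set_tflag() are hypothetical names, and no hardware message is built.

```c
#include <stdint.h>

#define S_TF_CCTRL_RFR 62 /* bit position defined in t4_tcb.h by the patch */

/* Assumed mask/value semantics: bits outside the mask are preserved. */
static uint64_t
apply_field(uint64_t word, uint64_t mask, uint64_t val)
{
	return (word & ~mask) | (val & mask);
}

/* Model of set_tcb_tflag(): update exactly one bit of the flags word. */
static uint64_t
set_tflag(uint64_t t_flags, unsigned int bit_pos, unsigned int val)
{
	return apply_field(t_flags, 1ULL << bit_pos,
			   (uint64_t)val << bit_pos);
}
```

The casts to a 64-bit type before shifting matter here: with bit positions such as 62, shifting a plain int would be undefined behaviour, which is presumably why the patch uses 1ULL and an unsigned long long cast.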

[dpdk-dev] [PATCH 3/4] net/cxgbe: add API to program hardware MPS table

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

Add API to program and manage hardware Multi Port Switch table. MPS
holds destination MAC addresses to be matched against incoming packets
for further rule processing. Packets not matching any entry in MPS table
will be dropped by default, unless the underlying port is in promiscuous
mode.

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 drivers/net/cxgbe/Makefile  |   1 +
 drivers/net/cxgbe/base/adapter.h|   1 +
 drivers/net/cxgbe/base/common.h |   6 +
 drivers/net/cxgbe/base/t4_hw.c  | 106 ++
 drivers/net/cxgbe/base/t4_regs.h|   8 ++
 drivers/net/cxgbe/base/t4fw_interface.h |  24 
 drivers/net/cxgbe/cxgbe_main.c  |   7 +
 drivers/net/cxgbe/meson.build   |   1 +
 drivers/net/cxgbe/mps_tcam.c| 243 
 drivers/net/cxgbe/mps_tcam.h|  52 +++
 10 files changed, 449 insertions(+)
 create mode 100644 drivers/net/cxgbe/mps_tcam.c
 create mode 100644 drivers/net/cxgbe/mps_tcam.h

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index d75b070f3..68466f13e 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += cxgbe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += clip_tbl.c
+SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += mps_tcam.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += l2t.c
 SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4vf_hw.c
 
diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 9f4a9653c..47cfc5f5f 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -328,6 +328,7 @@ struct adapter {
unsigned int l2t_end; /* Layer 2 table end */
struct clip_tbl *clipt;   /* CLIP table */
struct l2t_data *l2t; /* Layer 2 table */
+   struct mpstcam_table *mpstcam;
 
struct tid_info tids; /* Info used to access TID related tables */
 };
diff --git a/drivers/net/cxgbe/base/common.h b/drivers/net/cxgbe/base/common.h
index 157201da2..9f5756850 100644
--- a/drivers/net/cxgbe/base/common.h
+++ b/drivers/net/cxgbe/base/common.h
@@ -388,6 +388,12 @@ int t4_free_vi(struct adapter *adap, unsigned int mbox,
 int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid,
  int mtu, int promisc, int all_multi, int bcast, int vlanex,
  bool sleep_ok);
+int t4_free_raw_mac_filt(struct adapter *adap, unsigned int viid,
+const u8 *addr, const u8 *mask, unsigned int idx,
+u8 lookup_type, u8 port_id, bool sleep_ok);
+int t4_alloc_raw_mac_filt(struct adapter *adap, unsigned int viid,
+ const u8 *addr, const u8 *mask, unsigned int idx,
+ u8 lookup_type, u8 port_id, bool sleep_ok);
 int t4_change_mac(struct adapter *adap, unsigned int mbox, unsigned int viid,
  int idx, const u8 *addr, bool persist, bool add_smt);
 int t4_enable_vi_params(struct adapter *adap, unsigned int mbox,
diff --git a/drivers/net/cxgbe/base/t4_hw.c b/drivers/net/cxgbe/base/t4_hw.c
index 31762c9c5..d60894115 100644
--- a/drivers/net/cxgbe/base/t4_hw.c
+++ b/drivers/net/cxgbe/base/t4_hw.c
@@ -4161,6 +4161,112 @@ int t4_set_rxmode(struct adapter *adap, unsigned int mbox, unsigned int viid,
return t4vf_wr_mbox(adap, &c, sizeof(c), NULL);
 }
 
+/**
+ * t4_alloc_raw_mac_filt - Adds a raw mac entry in mps tcam
+ * @adap: the adapter
+ * @viid: the VI id
+ * @addr: the MAC address
+ * @mask: the mask
+ * @idx: index at which to add this entry
+ * @port_id: the port index
+ * @lookup_type: MAC address for inner (1) or outer (0) header
+ * @sleep_ok: call is allowed to sleep
+ *
+ * Adds the mac entry at the specified index using raw mac interface.
+ *
+ * Returns a negative error number or the allocated index for this mac.
+ */
+int t4_alloc_raw_mac_filt(struct adapter *adap, unsigned int viid,
+ const u8 *addr, const u8 *mask, unsigned int idx,
+ u8 lookup_type, u8 port_id, bool sleep_ok)
+{
+   int ret = 0;
+   struct fw_vi_mac_cmd c;
+   struct fw_vi_mac_raw *p = &c.u.raw;
+   u32 val;
+
+   memset(&c, 0, sizeof(c));
+   c.op_to_viid = cpu_to_be32(V_FW_CMD_OP(FW_VI_MAC_CMD) |
+  F_FW_CMD_REQUEST | F_FW_CMD_WRITE |
+  V_FW_VI_MAC_CMD_VIID(viid));
+   val = V_FW_CMD_LEN16(1) |
+ V_FW_VI_MAC_CMD_ENTRY_TYPE(FW_VI_MAC_TYPE_RAW);
+   c.freemacs_to_len16 = cpu_to_be32(val);
+
+   /* Specify that this is an inner mac address */
+   p->raw_idx_pkd = cpu_to_be32(V_FW_VI_MAC_CMD_RAW_IDX(idx));
+
+   /* Lookup Type. Outer header: 0, Inner head

[dpdk-dev] [PATCH] net/cxgbe: fix illegal memory access when parsing flow match items

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

Coverity issue: 293096
Fixes: ee61f511 ("net/cxgbe: parse and validate flows")
Cc: sta...@dpdk.org

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 drivers/net/cxgbe/cxgbe_flow.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/cxgbe/cxgbe_flow.c b/drivers/net/cxgbe/cxgbe_flow.c
index add4f0f95..bee3bd640 100644
--- a/drivers/net/cxgbe/cxgbe_flow.c
+++ b/drivers/net/cxgbe/cxgbe_flow.c
@@ -529,10 +529,10 @@ cxgbe_rtef_parse_items(struct rte_flow *flow,
char repeat[ARRAY_SIZE(parseitem)] = {0};
 
for (i = items; i->type != RTE_FLOW_ITEM_TYPE_END; i++) {
-   struct chrte_fparse *idx = &flow->item_parser[i->type];
+   struct chrte_fparse *idx;
int ret;
 
-   if (i->type > ARRAY_SIZE(parseitem))
+   if (i->type >= ARRAY_SIZE(parseitem))
return rte_flow_error_set(e, ENOTSUP,
  RTE_FLOW_ERROR_TYPE_ITEM,
  i, "Item not supported");
@@ -553,6 +553,7 @@ cxgbe_rtef_parse_items(struct rte_flow *flow,
if (ret)
return ret;
 
+   idx = &flow->item_parser[i->type];
if (!idx || !idx->fptr) {
return rte_flow_error_set(e, ENOTSUP,
RTE_FLOW_ERROR_TYPE_ITEM, i,
-- 
2.14.1
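
For illustration only (not part of the patch): the fix above boils down to
two rules, reject an index equal to the array size, and do not touch the
table until the range check has passed. A minimal standalone sketch follows;
the table, handler and function names are hypothetical and are not taken
from cxgbe_flow.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, same idea as the ARRAY_SIZE() used in the patch. */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*sketch_parse_fn)(int value);

/* Hypothetical item handler, only here so the table has one valid entry. */
static int sketch_parse_double(int value)
{
	return 2 * value;
}

/* Valid indexes are 0..2; only index 1 has a handler. */
static const sketch_parse_fn sketch_parsers[3] = {
	[1] = sketch_parse_double,
};

static int sketch_dispatch_item(unsigned int type, int value, int *out)
{
	sketch_parse_fn fn;

	assert(out != NULL);

	/* ">=" and not ">": type == array size is already out of range. */
	if (type >= SKETCH_ARRAY_SIZE(sketch_parsers))
		return -1;

	/* Index the table only after the range check has passed. */
	fn = sketch_parsers[type];
	if (fn == NULL)
		return -1;

	*out = fn(value);
	return 0;
}
```

With ">" instead of ">=", a type of 3 would pass the check here and read one
element past the end of the table.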



[dpdk-dev] [RFC] ethdev: add action to swap source and destination MAC to flow API

2018-08-27 Thread Rahul Lakkireddy
From: Shagun Agrawal 

This action is useful for offloading loopback mode, where the hardware
will swap source and destination MAC address before looping back the
packet. This action can be used in conjunction with other rewrite
actions to achieve MAC layer transparent NAT where the MAC addresses
are swapped before either the source or destination MAC address
is rewritten and NAT is performed.

Signed-off-by: Shagun Agrawal 
Signed-off-by: Rahul Lakkireddy 
---
 app/test-pmd/cmdline_flow.c |  9 +
 app/test-pmd/config.c   |  1 +
 doc/guides/prog_guide/rte_flow.rst  | 15 +++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  2 ++
 lib/librte_ethdev/rte_flow.c|  1 +
 lib/librte_ethdev/rte_flow.h|  7 +++
 6 files changed, 35 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index f9260600e..4b83b55c4 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -243,6 +243,7 @@ enum index {
ACTION_VXLAN_DECAP,
ACTION_NVGRE_ENCAP,
ACTION_NVGRE_DECAP,
+   ACTION_MAC_SWAP,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -816,6 +817,7 @@ static const enum index next_action[] = {
ACTION_VXLAN_DECAP,
ACTION_NVGRE_ENCAP,
ACTION_NVGRE_DECAP,
+   ACTION_MAC_SWAP,
ZERO,
 };
 
@@ -2470,6 +2472,13 @@ static const struct token token_list[] = {
.next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
.call = parse_vc,
},
+   [ACTION_MAC_SWAP] = {
+   .name = "mac_swap",
+   .help = "swap source and destination mac address",
+   .priv = PRIV_ACTION(MAC_SWAP, 0),
+   .next = NEXT(NEXT_ENTRY(ACTION_NEXT)),
+   .call = parse_vc,
+   },
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 14ccd6864..b7393967a 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1153,6 +1153,7 @@ static const struct {
   sizeof(struct rte_flow_action_of_pop_mpls)),
MK_FLOW_ACTION(OF_PUSH_MPLS,
   sizeof(struct rte_flow_action_of_push_mpls)),
+   MK_FLOW_ACTION(MAC_SWAP, 0),
 };
 
 /** Compute storage space needed by action configuration and copy it. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b305a72a5..530dbc504 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -2076,6 +2076,21 @@ RTE_FLOW_ERROR_TYPE_ACTION error should be returned.
 
 This action modifies the payload of matched flows.
 
+Action: ``MAC_SWAP``
+^^^^^^^^^^^^^^^^^^^^
+
+Swap source and destination mac address.
+
+.. _table_rte_flow_action_mac_swap:
+
+.. table:: MAC_SWAP
+
+   +---------------+
+   | Field         |
+   +===============+
+   | no properties |
+   +---------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index dde205a2b..4f0da4fb6 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3697,6 +3697,8 @@ This section lists supported actions and their attributes, if any.
 - ``nvgre_decap``: Performs a decapsulation action by stripping all headers of
   the NVGRE tunnel network overlay from the matched flow.
 
+- ``mac_swap``: Swap source and destination mac address.
+
 Destroying flow rules
 ~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index cff4b5209..04b0b40ea 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -109,6 +109,7 @@ static const struct rte_flow_desc_data rte_flow_desc_action[] = {
   sizeof(struct rte_flow_action_of_pop_mpls)),
MK_FLOW_ACTION(OF_PUSH_MPLS,
   sizeof(struct rte_flow_action_of_push_mpls)),
+   MK_FLOW_ACTION(MAC_SWAP, 0),
 };
 
 static int
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index f8ba71cdb..e1fa17b7e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -1505,6 +1505,13 @@ enum rte_flow_action_type {
 * error.
 */
RTE_FLOW_ACTION_TYPE_NVGRE_DECAP,
+
+   /**
+* swap the source and destination mac address in ethernet header
+*
+* No associated configuration structure.
+*/
+   RTE_FLOW_ACTION_TYPE_MAC_SWAP,
 };
 
 /**
-- 
2.14.1
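
For illustration only (not part of the RFC): the action is meant to be
offloaded to hardware, but its effect on a frame can be sketched in software
as below. The header struct and function name are hypothetical stand-ins and
are not the DPDK ether_hdr definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_ADDR_LEN 6

/* Hypothetical, simplified Ethernet header layout used only in this sketch. */
struct sketch_ether_hdr {
	uint8_t dst[SKETCH_ETHER_ADDR_LEN];
	uint8_t src[SKETCH_ETHER_ADDR_LEN];
	uint16_t ether_type;
};

/* Software equivalent of what MAC_SWAP asks the hardware to do. */
static void sketch_mac_swap(struct sketch_ether_hdr *hdr)
{
	uint8_t tmp[SKETCH_ETHER_ADDR_LEN];

	assert(hdr != NULL);

	memcpy(tmp, hdr->dst, SKETCH_ETHER_ADDR_LEN);
	memcpy(hdr->dst, hdr->src, SKETCH_ETHER_ADDR_LEN);
	memcpy(hdr->src, tmp, SKETCH_ETHER_ADDR_LEN);
}
```

Since the action carries no configuration structure, the swap takes no
parameter other than the packet itself.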



Re: [dpdk-dev] [PATCH v3] ethdev: silence error message on rte_eth_dev_owner_unset

2018-08-27 Thread Ferruh Yigit
On 8/21/2018 4:53 PM, Matan Azrad wrote:
> Hi
> 
> From: Stephen Hemminger
>> From 74ad4c60262b1451a5a2fabf79a2df89c6c5373d Mon Sep 17 00:00:00 2001
>> From: Stephen Hemminger 
>> Date: Thu, 16 Aug 2018 15:37:14 -0700
>> Subject: [PATCH 1/5] ethdev: silence error message on
>> rte_eth_dev_owner_unset
>>
>> The rte_eth_dev_owner_unset function always generates a log message
>> because the unset value for owner id is 0.
>>
>> Also, when rte_eth_dev_owner_delete is called with a valid owner id, the
>> log message should be at NOTICE not ERROR severity.
>>
>> Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
>> Signed-off-by: Stephen Hemminger 

<...>

> I think the title should be:
> ethdev: fix port ownership logs
> 
> while adding the fixes commits (at least 2 because of the NOTICE change)
> and updating stable.
> 
> Besides that,
> Acked-by: Matan Azrad 

Applied to dpdk-next-net/master, thanks.

(Used suggested title.)


Re: [dpdk-dev] [dpdk-stable] [PATCH v2] net/bonding: fix buf corruption in merging un-transmitted packets

2018-08-27 Thread Ferruh Yigit
On 8/20/2018 2:58 PM, Chas Williams wrote:
> On Mon, Aug 20, 2018 at 2:54 AM Jia Yu  wrote:
> 
>> When bond slave devices cannot transmit all packets in the bufs array,
>> the tx_burst callback shall merge the un-transmitted packets back into
>> the bufs array. Recent merge logic introduced a bug which causes
>> invalid mbuf addresses to be written to the bufs array.
>> When the caller frees the un-transmitted packets, the application
>> crashes because of the invalid addresses.
>>
>> The fix is to avoid shifting mbufs, and to write un-transmitted
>> packets directly back to the bufs array.
>>
>> Fixes: 09150784a776 ("net/bonding: burst mode hash calculation")
>> Cc: sta...@dpdk.org
>> Signed-off-by: Jia Yu 
>>
> 
> Acked-by: Chas Williams 

Applied to dpdk-next-net/master, thanks.
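
For illustration only (not the code from the patch): one way to "directly
write un-transmitted packets back to bufs" is to copy each slave's unsent
tail to the end of the caller's array, so that the first entries are the
sent count and the rest are the packets the caller still owns. The sketch
below assumes each slave has its own staging array; all names and the burst
size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_BURST 8 /* hypothetical per-slave staging size */

/*
 * slave_nb[i]   packets that were handed to slave i
 * slave_sent[i] packets slave i actually transmitted
 * Returns the number of packets sent; the unsent ones end up in
 * bufs[returned value .. nb_bufs - 1].
 */
static size_t sketch_merge_unsent(void **bufs, size_t nb_bufs,
				  void *slave_bufs[][SKETCH_MAX_BURST],
				  const size_t *slave_nb,
				  const size_t *slave_sent,
				  size_t nb_slaves)
{
	size_t total_fail = 0;
	size_t fail;
	size_t i;

	for (i = 0; i < nb_slaves; i++) {
		assert(slave_sent[i] <= slave_nb[i]);
		fail = slave_nb[i] - slave_sent[i];
		if (fail == 0)
			continue;

		total_fail += fail;
		assert(total_fail <= nb_bufs);

		/* No shifting: copy the unsent tail straight into place. */
		memcpy(&bufs[nb_bufs - total_fail],
		       &slave_bufs[i][slave_sent[i]],
		       fail * sizeof(bufs[0]));
	}

	return nb_bufs - total_fail;
}
```

Because the pointers are taken from the per-slave staging arrays and not
moved around inside bufs, every entry written back is a valid packet pointer.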


Re: [dpdk-dev] RTE-FLOW: PF vs PHY_PORT

2018-08-27 Thread Vivek Sharma
Ping.

On Wednesday 22 August 2018 05:16 PM, Vivek Sharma wrote:
> Hi Devs,
> 
> I am trying to enable RTE-FLOW support on one of our platforms and am having
> a hard time figuring out the PF vs PHY_PORT differences and the DPDK
> rationale for introducing these two distinct identities.
> 
> Rte-Flow distinguishes between RTE_FLOW_ITEM_TYPE_PF & 
> RTE_FLOW_ITEM_TYPE_PHY_PORT and
> 
>RTE_FLOW_ACTION_TYPE_PF & 
> RTE_FLOW_ACTION_TYPE_PHY_PORT.
> 
> 
> I am finding it difficult to justify the presence of both these types, when 
> functionality & implementation wise, these look quite similar. I would really 
> appreciate if you could illustrate the differences between above item & 
> action types by taking some hardware/platform as reference.
> 
> 
> Thanks in advance,
> Vivek Sharma
> 


Re: [dpdk-dev] [PATCH v2] examples: fix ip_reassembly not work with some NICs

2018-08-27 Thread Luca Boccassi
Hi Thomas and Yong,

This patch:

https://patches.dpdk.org/patch/19868/

Fixes an error in the Intel regression tests when backported to the
16.11.x LTS branch, but was never committed to master.
It's marked as superseded, and the error does not appear on master.

Do you remember what other change superseded this patch? I can't
find anything on the mailing list.

Thanks!

Kind regards,
Luca Boccassi


Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring

2018-08-27 Thread Chas Williams
On Sun, Aug 26, 2018 at 3:40 AM Matan Azrad  wrote:

>
> From: Chas Williams <3ch...@gmail.com>
> >On Thu, Aug 23, 2018 at 3:28 AM Matan Azrad 
> wrote:
> >Hi
> >
> >From: Eric Kinzie
> >> On Wed Aug 22 11:42:37 + 2018, Matan Azrad wrote:
> >> > Hi Luca
> >> >
> >> > From: Luca Boccassi
> >> > > On Wed, 2018-08-22 at 07:09 +, Matan Azrad wrote:
> >> > > > Hi Chas
> >> > > >
> >> > > > From: Chas Williams
> >> > > > > On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad
> >> > > > >  wrote:
> >> > > > > Hi Chas
> >> > > > >
> >> > > > > From: Chas Williams
> >> > > > > > On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad
> >> > > > > > wrote:
> >> > > > > > Hi
> >> > > > > >
> >> > > > > > From: Chas Williams
> >> > > > > > > This will need to be implemented for some of the other RX
> >> > > > > > > burst methods at some point for other modes to see this
> >> > > > > > > performance improvement (with the exception of
> active-backup).
> >> > > > > >
> >> > > > > > Yes, I think it should be done at least to
> >> > > > > > bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for
> >> now.
> >> > > > > >
> >> > > > > > There is some duplicated code between the various RX paths.
> >> > > > > > I would like to eliminate that as much as possible, so I was
> >> > > > > > going to give that some thought first.
> >> > > > >
> >> > > > > There is no reason to stay this function as is while its twin is
> >> > > > > changed.
> >> > > > >
> >> > > > > Unfortunately, this is all the patch I have at this time.
> >> > > > >
> >> > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > During bond 802.3ad receive, a burst of packets is fetched
> >> > > > > > > > from each slave into a local array and appended to
> >> > > > > > > > per-slave ring buffer.
> >> > > > > > > > Packets are taken from the head of the ring buffer and
> >> > > > > > > > returned to the caller.  The number of mbufs provided to
> >> > > > > > > > each slave is sufficient to meet the requirements of the
> >> > > > > > > > ixgbe vector receive.
> >> > > > > >
> >> > > > > > Luca,
> >> > > > > >
> >> > > > > > Can you explain these requirements of ixgbe?
> >> > > > > >
> >> > > > > > The ixgbe (and some other Intel PMDs) have vectorized RX
> >> > > > > > routines that are more efficient (if not faster) taking
> >> > > > > > advantage of some advanced CPU instructions.  I think you need
> >> > > > > > to be receiving at least 32 packets or more.
> >> > > > >
> >> > > > > So, why to do it in bond which is a generic driver for all the
> >> > > > > vendors PMDs, If for ixgbe and other Intel nics it is better you
> >> > > > > can force those PMDs to receive always 32 packets and to manage
> >> > > > > a ring by themselves.
> >> > > > >
> >> > > > > The drawback of the ring is some additional latency on the
> >> > > > > receive path.
> >> > > > > In testing, the additional latency hasn't been an issue for
> bonding.
> >> > > >
> >> > > > When bonding does processing slower it may be a bottleneck for the
> >> > > > packet processing for some application.
> >> > > >
> >> > > > > The bonding PMD has a fair bit of overhead associated with the
> >> > > > > RX and TX path calculations.  Most applications can just arrange
> >> > > > > to call the RX path with a sufficiently large receive.  Bonding
> >> > > > > can't do this.
> >> > > >
> >> > > > I didn't talk on application I talked on the slave PMDs, The slave
> >> > > > PMD can manage a ring by itself if it helps for its own
> performance.
> >> > > > The bonding should not be oriented to specific PMDs.
> >> > >
> >> > > The issue though is that the performance problem is not with the
> >> > > individual PMDs - it's with bonding. There were no reports regarding
> >> > > the individual PMDs.
> >> > > This comes from reports from customers from real world production
> >> > > deployments - the issue of bonding being too slow was raised
> multiple
> >> times.
> >> > > This patch addresses those issues, again in production deployments,
> >> > > where it's been used for years, to users and customers satisfaction.
> >> >
> >> > From Chas I understood that using burst of 32 helps for some slave
> PMDs
> >> performance which makes sense.
> >> > 

Re: [dpdk-dev] [PATCH 0/2] Some small changes to net/virtio

2018-08-27 Thread Ferruh Yigit
On 8/15/2018 2:51 PM, Luca Boccassi wrote:
> On Tue, 2017-07-18 at 07:52 -0400, Charles (Chas) Williams wrote:
>>
>> On 07/18/2017 07:50 AM, Ferruh Yigit wrote:
>>> On 7/18/2017 12:05 AM, Charles (Chas) Williams wrote:
>>>> Just a couple small changes to net/virtio that make it a little
>>>> more
>>>> well behaved.
>>>
>>> Same question here, is this patchset targets the 17.08-rc2? Can
>>> this be
>>> postponed?
>>
>> Yes, these can all be postponed to 17.11.  They are too late for a
>> -rc2.
> 
> Hi Ferruh and Maxime,
> 
> Looks like this series fell through the cracks. Any chance you folks
> could please have a look at it for 18.11? Thanks!
> 
> https://patches.dpdk.org/patch/26994/
> https://patches.dpdk.org/patch/26995/

Hi Maxime, Tiwei,

Those patches seem to have been missed for a while; they date back to July 2011.
Can you guys please look at them?

If there is no objection/comment in a little while, they will be automerged as
part of the process.

Thanks,
ferruh


Re: [dpdk-dev] RTE-FLOW: PF vs PHY_PORT

2018-08-27 Thread Adrien Mazarguil
Hi Vivek,

On Wed, Aug 22, 2018 at 05:16:52PM +0530, Vivek Sharma wrote:
> Hi Devs,
> 
> I am trying to enable RTE-FLOW support on one of our platforms and am having
> a hard time figuring out the PF vs PHY_PORT differences and the DPDK
> rationale for introducing these two distinct identities.
> 
> Rte-Flow distinguishes between RTE_FLOW_ITEM_TYPE_PF & 
> RTE_FLOW_ITEM_TYPE_PHY_PORT and
> 
>RTE_FLOW_ACTION_TYPE_PF & 
> RTE_FLOW_ACTION_TYPE_PHY_PORT.
> 
> 
> I am finding it difficult to justify the presence of both these types, when 
> functionality & implementation wise, these look quite similar. I would really 
> appreciate if you could illustrate the differences between above item & 
> action types by taking some hardware/platform as reference.

Some devices, typically those with a single PCI bus address shared for all
ports (e.g. Mellanox ConnectX-3) expose all their physical ports to each
PF/VF instance [1], not the other way around. With these, PHY_PORT item and
action give the ability to select a nondefault physical port in a flow rule.

PHY_PORT cannot be specified on most devices with PF/VF dedicated to
physical ports, although their drivers should at least recognize 0 as a
supported index and ignore it.

Since devices can expose any number of PF/VF instances and physical ports,
this gives applications the ability to use both as matching criteria and/or
action target.

A higher level alternative to PHY_PORT and PF/VF items/actions is PORT_ID to
match/target DPDK port IDs, which users may find more convenient. One
drawback is that it only works with devices instantiated within DPDK.

PF/VF and PHY_PORT should be reserved for corner cases where PORT_ID cannot
be used. My advice is to implement PORT_ID and not bother with the others
since port IDs are what applications are familiar with.

[1] Although with CX3, individual ports can be disabled per VF, they remain
"seen" by each instance.

-- 
Adrien Mazarguil
6WIND


Re: [dpdk-dev] [PATCH 0/2] Some small changes to net/virtio

2018-08-27 Thread Ferruh Yigit
On 8/27/2018 2:41 PM, Ferruh Yigit wrote:
> On 8/15/2018 2:51 PM, Luca Boccassi wrote:
>> On Tue, 2017-07-18 at 07:52 -0400, Charles (Chas) Williams wrote:
>>>
>>> On 07/18/2017 07:50 AM, Ferruh Yigit wrote:
>>>> On 7/18/2017 12:05 AM, Charles (Chas) Williams wrote:
>>>>> Just a couple small changes to net/virtio that make it a little
>>>>> more
>>>>> well behaved.
>>>>
>>>> Same question here, is this patchset targets the 17.08-rc2? Can
>>>> this be
>>>> postponed?
>>>
>>> Yes, these can all be postponed to 17.11.  They are too late for a
>>> -rc2.
>>
>> Hi Ferruh and Maxime,
>>
>> Looks like this series fell through the cracks. Any chance you folks
>> could please have a look at it for 18.11? Thanks!
>>
>> https://patches.dpdk.org/patch/26994/
>> https://patches.dpdk.org/patch/26995/
> 
> Hi Maxime, Tiwei,
> 
> Those patches seem to have been missed for a while; they date back to July 2011.
> Can you guys please look at them?

July 2017 :)

> 
> If there is no objection/comment in a little while, they will be automerged as
> part of the process.
> 
> Thanks,
> ferruh
> 



Re: [dpdk-dev] [PATCH v2] examples: fix ip_reassembly not work with some NICs

2018-08-27 Thread Liu, Yong
Boccassi,
Packet type is supported in both the vector and normal Rx functions of the
Fortville PMD, so I think this patch was superseded because it does not fix
the real issue.

Thanks,
Marvin

> -Original Message-
> From: Luca Boccassi [mailto:luca.bocca...@gmail.com]
> Sent: Monday, August 27, 2018 9:20 PM
> To: dev@dpdk.org
> Cc: Liu, Yong ; tho...@monjalon.net; Yigit, Ferruh
> 
> Subject: Re: [PATCH v2] examples: fix ip_reassembly not work with some NICs
> 
> Hi Thomas and Yong,
> 
> This patch:
> 
> https://patches.dpdk.org/patch/19868/
> 
> Fixes an error in the Intel regression tests when backported to the
> 16.11.x LTS branch, but was never committed to master.
> It's marked as superseded, and the error does not appear on master.
> 
> Do you remember what other change superseded this patch? I can't
> find anything on the mailing list.
> 
> Thanks!
> 
> Kind regards,
> Luca Boccassi


Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-27 Thread Ferruh Yigit
On 8/16/2018 10:55 AM, Kiran Kumar wrote:
> With the existing code in kni_fifo_put, the rx_q values are not guaranteed
> to be written before fifo_write is updated, so kni_net_rx_normal running on
> another core can read stale rx_q entries. Add a write barrier to make sure
> the values are visible before fifo_write is updated.
> 
> Fixes: 3fc5ca2f6352 ("kni: initial import")
> 
> Signed-off-by: Kiran Kumar 
> Acked-by: Jerin Jacob 

Acked-by: Ferruh Yigit 
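
For illustration only (not the code from the patch): the ordering problem
described above is the classic single-producer/single-consumer publish
pattern, where the slot contents must become visible before the new write
index does. The sketch below uses C11 atomics as a stand-in for the barrier
added by the patch; the struct, field and function names are hypothetical
and only loosely modelled on the KNI fifo.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_FIFO_LEN 8u /* hypothetical size, must be a power of two */

struct sketch_fifo {
	atomic_uint write; /* next slot the producer fills */
	atomic_uint read;  /* next slot the consumer drains */
	void *buffer[SKETCH_FIFO_LEN];
};

/* Producer side: fill the slots first, publish the index last. */
static unsigned int sketch_fifo_put(struct sketch_fifo *f, void **data,
				    unsigned int num)
{
	unsigned int i;
	unsigned int w;
	unsigned int r;

	assert(f != NULL && data != NULL);

	w = atomic_load_explicit(&f->write, memory_order_relaxed);
	r = atomic_load_explicit(&f->read, memory_order_acquire);

	for (i = 0; i < num; i++) {
		unsigned int next = (w + 1) & (SKETCH_FIFO_LEN - 1);

		if (next == r)
			break; /* fifo is full */
		f->buffer[w] = data[i];
		w = next;
	}

	/*
	 * Release store: the buffer[] writes above cannot be reordered
	 * after this update, which is the job of the write barrier.
	 */
	atomic_store_explicit(&f->write, w, memory_order_release);
	return i;
}

/* Consumer side: read the index with acquire, then read the slots. */
static unsigned int sketch_fifo_get(struct sketch_fifo *f, void **data,
				    unsigned int num)
{
	unsigned int i;
	unsigned int r;
	unsigned int w;

	assert(f != NULL && data != NULL);

	r = atomic_load_explicit(&f->read, memory_order_relaxed);
	w = atomic_load_explicit(&f->write, memory_order_acquire);

	for (i = 0; i < num && r != w; i++) {
		data[i] = f->buffer[r];
		r = (r + 1) & (SKETCH_FIFO_LEN - 1);
	}

	atomic_store_explicit(&f->read, r, memory_order_release);
	return i;
}
```

Without the release ordering on the index update, a weakly ordered CPU may
let the consumer observe the new index before the slot contents.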


Re: [dpdk-dev] [PATCH v2 0/7] ethdev: add flow API object converter

2018-08-27 Thread Adrien Mazarguil
On Fri, Aug 24, 2018 at 11:58:39AM +0100, Ferruh Yigit wrote:
> On 8/3/2018 2:36 PM, Adrien Mazarguil wrote:
> > This is a follow up to the "Flow API helpers enhancements" series submitted
> > almost a year ago [1]. The new title is due to the reduced scope of this
> > version.
> > 
> > rte_flow_conv() is a flexible replacement to rte_flow_copy(), itself a
> > temporary solution pending something better [2]. It replaces a lot of
> > duplicated code found in testpmd and removes some of the maintenance burden
> > that developers tend to forget (me included) when modifying pattern
> > item or actions (updating app/test-pmd/config.c to be clear).
> > 
> > This series was unearthed in order to complete the implementation of
> > RTE_FLOW_ACTION_TYPE_ENCAP_(VXLAN|NVGRE) in testpmd [3] without having to
> > duplicate existing code once again.
> > 
> > See individual patches for specific changes in this version.
> > 
> > v2 changes:
> > 
> > - rte_flow_copy() is kept, albeit deprecated, no API/ABI impact.
> > - Updated bonding PMD.
> > - No more automatic generation of rte_flow_conv.h.
> > 
> > [1] https://mails.dpdk.org/archives/dev/2017-October/077551.html
> > [2] https://mails.dpdk.org/archives/dev/2017-July/070492.html
> > [3] Currently the command-line parser (cmdline_flow.c) is aware of these
> > actions, however config.c isn't. Flow rules with such actions cannot
> > be created and cannot be validated with PMDs that implement them.
> > 
> > Adrien Mazarguil (7):
> >   ethdev: add flow API object converter
> >   ethdev: add flow API item/action name conversion
> >   app/testpmd: rely on flow API conversion function
> >   net/failsafe: switch to flow API object conversion function
> >   net/bonding: switch to flow API object conversion function
> >   ethdev: deprecate rte_flow_copy function
> >   ethdev: add missing item/actions to flow object converter
> 
> Causing build error for arm, it looks like related to rte_memcpy macro:
> 
> .../lib/librte_ethdev/rte_flow.c: In function ‘rte_flow_conv_item_spec’:
> .../lib/librte_ethdev/rte_flow.c:373:58: error: macro "rte_memcpy" passed 9
> arguments, but takes just 3
>(size > sizeof(*dst.raw) ? sizeof(*dst.raw) : size));

Thanks, noticed it after sending v2. I'll fix it for v3.

-- 
Adrien Mazarguil
6WIND
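
For illustration only: the offending call is not shown in full above, so the
following is an assumption about the likely cause and not a description of
the actual v3 fix. When a copy routine is a function-like macro, a compound
literal passed as an argument has its top-level commas treated as argument
separators, because braces do not protect commas from the preprocessor. The
macro and struct names below are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical three-argument macro standing in for a macro memcpy. */
#define sketch_memcpy(dst, src, n) memcpy((dst), (src), (n))

struct sketch_raw {
	int relative;
	int offset;
	int length;
};

static void sketch_fill_raw(struct sketch_raw *dst, int offset, int length)
{
	assert(dst != NULL);

	/*
	 * This form does not compile when sketch_memcpy is a macro; the
	 * preprocessor sees 5 arguments because of the commas in braces:
	 *
	 *   sketch_memcpy(dst,
	 *                 &(struct sketch_raw){ .relative = 1,
	 *                                       .offset = offset,
	 *                                       .length = length },
	 *                 sizeof(*dst));
	 *
	 * Wrapping the whole argument in parentheses makes it a single
	 * macro argument again.
	 */
	sketch_memcpy(dst,
		      (&(struct sketch_raw){ .relative = 1,
					     .offset = offset,
					     .length = length }),
		      sizeof(*dst));
}
```

Building the value in a named local variable and passing its address is an
alternative that avoids the extra parentheses.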


[dpdk-dev] [PATCH] test/hash: solve unit test hash compilation error

2018-08-27 Thread Dharmik Thakkar
Enable print_key_info() function compilation always.

Signed-off-by: Dharmik Thakkar 
Reviewed-by: Honnappa Nagarahalli 
Reviewed-by: Gavin Hu 
Suggested-by: Honnappa Nagarahalli 
---
 test/test/test_hash.c | 24 +---
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/test/test/test_hash.c b/test/test/test_hash.c
index b3db9fd..13239e1 100644
--- a/test/test/test_hash.c
+++ b/test/test/test_hash.c
@@ -80,29 +80,23 @@ static uint32_t pseudo_hash(__attribute__((unused)) const void *keys,
return 3;
 }
 
+#define UNIT_TEST_HASH_VERBOSE 0
 /*
  * Print out result of unit test hash operation.
  */
-#if defined(UNIT_TEST_HASH_VERBOSE)
 static void print_key_info(const char *msg, const struct flow_key *key,
int32_t pos)
 {
-   uint8_t *p = (uint8_t *)key;
-   unsigned i;
-
-   printf("%s key:0x", msg);
-   for (i = 0; i < sizeof(struct flow_key); i++) {
-   printf("%02X", p[i]);
+   if (UNIT_TEST_HASH_VERBOSE) {
+   const uint8_t *p = (const uint8_t *)key;
+   unsigned i;
+
+   printf("%s key:0x", msg);
+   for (i = 0; i < sizeof(struct flow_key); i++)
+   printf("%02X", p[i]);
+   printf(" @ pos %d\n", pos);
}
-   printf(" @ pos %d\n", pos);
-}
-#else
-static void print_key_info(__attribute__((unused)) const char *msg,
-   __attribute__((unused)) const struct flow_key *key,
-   __attribute__((unused)) int32_t pos)
-{
 }
-#endif
 
 /* Keys used by unit test functions */
 static struct flow_key keys[5] = { {
-- 
2.7.4



Re: [dpdk-dev] [PATCH] test/hash: solve unit test hash compilation error

2018-08-27 Thread Gavin Hu


> -Original Message-
> From: dev  On Behalf Of Dharmik Thakkar
> Sent: Monday, August 27, 2018 10:26 PM
> To: Bruce Richardson ; Pablo de Lara
> 
> Cc: dev@dpdk.org; Honnappa Nagarahalli
> ; Dharmik Thakkar
> 
> Subject: [dpdk-dev] [PATCH] test/hash: solve unit test hash compilation error
>
> Enable print_key_info() function compilation always.
>
> Signed-off-by: Dharmik Thakkar 
> Reviewed-by: Honnappa Nagarahalli 
> Reviewed-by: Gavin Hu 
> Suggested-by: Honnappa Nagarahalli 
Acked-by: Gavin Hu 
> ---
>  test/test/test_hash.c | 24 +---
>  1 file changed, 9 insertions(+), 15 deletions(-)
>
> diff --git a/test/test/test_hash.c b/test/test/test_hash.c index
> b3db9fd..13239e1 100644
> --- a/test/test/test_hash.c
> +++ b/test/test/test_hash.c
> @@ -80,29 +80,23 @@ static uint32_t pseudo_hash(__attribute__((unused))
> const void *keys,
>  return 3;
>  }
>
> +#define UNIT_TEST_HASH_VERBOSE0
>  /*
>   * Print out result of unit test hash operation.
>   */
> -#if defined(UNIT_TEST_HASH_VERBOSE)
>  static void print_key_info(const char *msg, const struct flow_key *key,
>  int32_t pos)
>  {
> -uint8_t *p = (uint8_t *)key;
> -unsigned i;
> -
> -printf("%s key:0x", msg);
> -for (i = 0; i < sizeof(struct flow_key); i++) {
> -printf("%02X", p[i]);
> +if (UNIT_TEST_HASH_VERBOSE) {
> +const uint8_t *p = (const uint8_t *)key;
> +unsigned i;
> +
> +printf("%s key:0x", msg);
> +for (i = 0; i < sizeof(struct flow_key); i++)
> +printf("%02X", p[i]);
> +printf(" @ pos %d\n", pos);
>  }
> -printf(" @ pos %d\n", pos);
> -}
> -#else
> -static void print_key_info(__attribute__((unused)) const char *msg,
> -__attribute__((unused)) const struct flow_key *key,
> -__attribute__((unused)) int32_t pos)
> -{
>  }
> -#endif
>
>  /* Keys used by unit test functions */
>  static struct flow_key keys[5] = { {
> --
> 2.7.4

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


Re: [dpdk-dev] [PATCH 0/2] Some small changes to net/virtio

2018-08-27 Thread Gavin Hu
Why not combine "started" and "opened" into a single "status" field, with one
bit representing each?
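
For illustration only, a minimal sketch of that suggestion. The struct, flag
and helper names are hypothetical and are not taken from the virtio patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per state, packed into a single status byte. */
#define SKETCH_DEV_OPENED  (UINT8_C(1) << 0)
#define SKETCH_DEV_STARTED (UINT8_C(1) << 1)

/* Hypothetical device structure holding only the status field. */
struct sketch_dev {
	uint8_t status;
};

static void sketch_status_set(struct sketch_dev *d, uint8_t flag)
{
	assert(d != NULL);
	d->status |= flag;
}

static void sketch_status_clear(struct sketch_dev *d, uint8_t flag)
{
	assert(d != NULL);
	d->status &= (uint8_t)~flag;
}

static bool sketch_status_test(const struct sketch_dev *d, uint8_t flag)
{
	assert(d != NULL);
	return (d->status & flag) != 0;
}
```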

> -Original Message-
> From: dev  On Behalf Of Ferruh Yigit
> Sent: Monday, August 27, 2018 9:48 PM
> To: maxime.coque...@redhat.com; Tiwei Bie 
> Cc: Luca Boccassi ; dev@dpdk.org; 3ch...@gmail.com
> Subject: Re: [dpdk-dev] [PATCH 0/2] Some small changes to net/virtio
>
> On 8/27/2018 2:41 PM, Ferruh Yigit wrote:
> > On 8/15/2018 2:51 PM, Luca Boccassi wrote:
> >> On Tue, 2017-07-18 at 07:52 -0400, Charles (Chas) Williams wrote:
> >>>
> >>> On 07/18/2017 07:50 AM, Ferruh Yigit wrote:
> >>>> On 7/18/2017 12:05 AM, Charles (Chas) Williams wrote:
> >>>>> Just a couple small changes to net/virtio that make it a little
> >>>>> more well behaved.
> >>>>
> >>>> Same question here, is this patchset targets the 17.08-rc2? Can
> >>>> this be postponed?
> >>>
> >>> Yes, these can all be postponed to 17.11.  They are too late for a
> >>> -rc2.
> >>
> >> Hi Ferruh and Maxime,
> >>
> >> Looks like this series fell through the cracks. Any chance you folks
> >> could please have a look at it for 18.11? Thanks!
> >>
> >> https://patches.dpdk.org/patch/26994/
> >> https://patches.dpdk.org/patch/26995/
> >
> > Hi Maxime, Tiwei,
> >
> > Those patches seem to have been missed for a while; they date back to
> > July 2011. Can you guys please look at them?
>
> July 2017 :)
>
> >
> > If there is no objection/comment in a little while, they will be
> > automerged as part of the process.
> >
> > Thanks,
> > ferruh
> >



Re: [dpdk-dev] [PATCH] ethdev: deprecate DEFERRED device state

2018-08-27 Thread Andrew Rybchenko

On 08/24/2018 05:51 PM, Ferruh Yigit wrote:

Add a deprecation notice to remove the RTE_ETH_DEV_DEFERRED state, but this
is mostly a reminder because of a missing target.
It is not worth breaking the ABI for this change; the removal
can be done when the ethdev ABI version is increased.

Signed-off-by: Ferruh Yigit 
---
Cc: Thomas Monjalon 
Cc: Andrew Rybchenko 
Cc: Matan Azrad 
---
  doc/guides/rel_notes/deprecation.rst | 4 
  1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index e2dbee317..9cd12ccd8 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -95,3 +95,7 @@ Deprecation Notices
  
   This is due to a lack of flexibility and reliance on a type unusable with
   C++ programs (struct rte_flow_desc).
+
+* ethdev: remove deprecated RTE_ETH_DEV_DEFERRED device state.
+  Since this is an enum field in the middle, removing this field will break
+  the ABI, so removal is postponed to the next ethdev ABI version increase.


Acked-by: Andrew Rybchenko 
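
For illustration only: the reason removing an enum value "in the middle"
breaks the ABI is that every later enumerator silently gets a new numeric
value. The sketch below uses hypothetical local names; the order of the
other states is an assumption made for the example and is not copied from
rte_ethdev.h.

```c
#include <assert.h>

/* Layout before the removal: DEFERRED sits in the middle. */
enum sketch_state_v1 {
	SKETCH_V1_UNUSED = 0,
	SKETCH_V1_ATTACHED,
	SKETCH_V1_DEFERRED,
	SKETCH_V1_REMOVED,
};

/* Layout after the removal: REMOVED shifts down by one. */
enum sketch_state_v2 {
	SKETCH_V2_UNUSED = 0,
	SKETCH_V2_ATTACHED,
	SKETCH_V2_REMOVED,
};

/* Returns 1 when the same state name maps to different values. */
static int sketch_removed_value_shifted(void)
{
	return (int)SKETCH_V1_REMOVED != (int)SKETCH_V2_REMOVED;
}
```

An application built against the old layout and a library built against the
new one would then disagree on what the stored value means.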



Re: [dpdk-dev] [PATCH v2 0/7] ethdev: add flow API object converter

2018-08-27 Thread Adrien Mazarguil
On Thu, Aug 23, 2018 at 02:48:37PM +0100, Ferruh Yigit wrote:
> On 8/3/2018 2:36 PM, Adrien Mazarguil wrote:
> > This is a follow up to the "Flow API helpers enhancements" series submitted
> > almost a year ago [1]. The new title is due to the reduced scope of this
> > version.
> > 
> > rte_flow_conv() is a flexible replacement to rte_flow_copy(), itself a
> > temporary solution pending something better [2]. It replaces a lot of
> > duplicated code found in testpmd and removes some of the maintenance burden
> > that developers tend to forget (me included) when modifying pattern
> > item or actions (updating app/test-pmd/config.c to be clear).
> > 
> > This series was unearthed in order to complete the implementation of
> > RTE_FLOW_ACTION_TYPE_ENCAP_(VXLAN|NVGRE) in testpmd [3] without having to
> > duplicate existing code once again.
> > 
> > See individual patches for specific changes in this version.
> > 
> > v2 changes:
> > 
> > - rte_flow_copy() is kept, albeit deprecated, no API/ABI impact.
> > - Updated bonding PMD.
> > - No more automatic generation of rte_flow_conv.h.
> > 
> > [1] https://mails.dpdk.org/archives/dev/2017-October/077551.html
> > [2] https://mails.dpdk.org/archives/dev/2017-July/070492.html
> > [3] Currently the command-line parser (cmdline_flow.c) is aware of these
> > actions, however config.c isn't. Flow rules with such actions cannot
> > be created and cannot be validated with PMDs that implement them.
> > 
> > Adrien Mazarguil (7):
> >   ethdev: add flow API object converter
> >   ethdev: add flow API item/action name conversion
> >   app/testpmd: rely on flow API conversion function
> >   net/failsafe: switch to flow API object conversion function
> >   net/bonding: switch to flow API object conversion function
> >   ethdev: deprecate rte_flow_copy function
> >   ethdev: add missing item/actions to flow object converter
> 
> Patch needs to be rebased to target v18.11 (in map file),

Right, will do it for v3.

> and indeed new APIs
> (rte_flow_conv) needs to be experimental.

This is what I did at first. Problem is that experimental APIs cannot be
used in internal code without triggering a compilation error unless
ALLOW_EXPERIMENTAL_API is defined (bonding cannot rely on an API marked as
experimental).

Since this series reimplements rte_flow_copy() as a wrapper to
rte_flow_conv(), I thought it didn't make sense for internal code to keep
using the former either.

Considering this, shall I add -DALLOW_EXPERIMENTAL_API to bonding PMD or
keep things not experimental?
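
For readers less familiar with the mechanism in question, here is a minimal
sketch of how a deprecation-style attribute can gate an API behind a build
flag. DEMO_EXPERIMENTAL and demo_experimental_answer() are hypothetical
stand-ins, not the actual DPDK definitions; only the ALLOW_EXPERIMENTAL_API
macro name comes from the discussion above. The sketch assumes a GCC or
clang style compiler.

```c
#include <assert.h>

/*
 * Assumption: this file plays the role of a consumer (e.g. a PMD) that
 * opts in. In a real build the macro would normally come from the
 * compiler command line (-DALLOW_EXPERIMENTAL_API) instead.
 */
#ifndef ALLOW_EXPERIMENTAL_API
#define ALLOW_EXPERIMENTAL_API
#endif

/*
 * Hypothetical marker: expands to nothing when the consumer opted in,
 * and to a deprecation attribute otherwise. With warnings treated as
 * errors, any call to a marked function then fails to compile.
 */
#ifdef ALLOW_EXPERIMENTAL_API
#define DEMO_EXPERIMENTAL
#else
#define DEMO_EXPERIMENTAL \
	__attribute__((deprecated("Symbol is not yet part of stable ABI")))
#endif

/* Hypothetical experimental function. */
DEMO_EXPERIMENTAL
static int demo_experimental_answer(void)
{
	return 42;
}

/* A consumer calling the marked function; fine here because of the opt-in. */
static void demo_consumer_selfcheck(void)
{
	assert(demo_experimental_answer() == 42);
}
```

Removing the opt-in at the top and building with -Werror is expected to turn
the call in demo_consumer_selfcheck() into a build failure, which is the
situation an in-tree user of an experimental symbol would be in.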

> And needs to remove deprecation notice in this patchset.

Doesn't it make sense to deprecate this function immediately after providing
a replacement on top of which it is reimplemented? Users end up using the
new function whether they want it or not. I don't think maintaining the
old duplicated code around is the right thing to do either.

> Also do you think does make sense to announce this change in release notes?

I'm not sure it's worth a release note. It's a rather obscure helper
function part of rte_flow. We didn't do it for rte_flow_copy() for
instance. Please confirm if you think it's needed.

> Apart from above, any volunteer for reviewing actual implementation?

I hope Gaetan will take a look, he added rte_flow_copy() after all :)

-- 
Adrien Mazarguil
6WIND


Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring

2018-08-27 Thread Matan Azrad
Hi Chas

From: Chas Williams <3ch...@gmail.com> 
>On Sun, Aug 26, 2018 at 3:40 AM Matan Azrad  wrote:
>
>From: Chas Williams  
>>On Thu, Aug 23, 2018 at 3:28 AM Matan Azrad 
>> wrote:
>>Hi
>>
>>From: Eric Kinzie
>>> On Wed Aug 22 11:42:37 + 2018, Matan Azrad wrote:
>>> > Hi Luca
>>> >
>>> > From: Luca Boccassi
>>> > > On Wed, 2018-08-22 at 07:09 +, Matan Azrad wrote:
>>> > > > Hi Chas
>>> > > >
>>> > > > From: Chas Williams
>>> > > > > On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad
>>> > > > >  wrote:
>>> > > > > Hi Chas
>>> > > > >
>>> > > > > From: Chas Williams
>>> > > > > > On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad wrote:
>>> > > > > > Hi
>>> > > > > >
>>> > > > > > From: Chas Williams
>>> > > > > > > This will need to be implemented for some of the other RX
>>> > > > > > > burst methods at some point for other modes to see this
>>> > > > > > > performance improvement (with the exception of active-backup).
>>> > > > > >
>>> > > > > > Yes, I think it should be done at least to
>>> > > > > > bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for
>>> now.
>>> > > > > >
>>> > > > > > There is some duplicated code between the various RX paths.
>>> > > > > > I would like to eliminate that as much as possible, so I was
>>> > > > > > going to give that some thought first.
>>> > > > >
>>> > > > > There is no reason to stay this function as is while its twin is
>>> > > > > changed.
>>> > > > >
>>> > > > > Unfortunately, this is all the patch I have at this time.
>>> > > > >
>>> > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > > > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi wrote:
>>> > > > > > >
>>> > > > > > > > During bond 802.3ad receive, a burst of packets is fetched
>>> > > > > > > > from each slave into a local array and appended to
>>> > > > > > > > per-slave ring buffer.
>>> > > > > > > > Packets are taken from the head of the ring buffer and
>>> > > > > > > > returned to the caller.  The number of mbufs provided to
>>> > > > > > > > each slave is sufficient to meet the requirements of the
>>> > > > > > > > ixgbe vector receive.
>>> > > > > >
>>> > > > > > Luca,
>>> > > > > >
>>> > > > > > Can you explain these requirements of ixgbe?
>>> > > > > >
>>> > > > > > The ixgbe (and some other Intel PMDs) have vectorized RX
>>> > > > > > routines that are more efficient (if not faster) taking
>>> > > > > > advantage of some advanced CPU instructions.  I think you need
>>> > > > > > to be receiving at least 32 packets or more.
>>> > > > >
>>> > > > > So, why to do it in bond which is a generic driver for all the
>>> > > > > vendors PMDs, If for ixgbe and other Intel nics it is better you
>>> > > > > can force those PMDs to receive always 32 packets and to manage
>>> > > > > a ring by themselves.
>>> > > > >
>>> > > > > The drawback of the ring is some additional latency on the
>>> > > > > receive path.
>>> > > > > In testing, the additional latency hasn't been an issue for bonding.
>>> > > >
>>> > > > When bonding does processing slower it may be a bottleneck for the
>>> > > > packet processing for some application.
>>> > > >
>>> > > > > The bonding PMD has a fair bit of overhead associated with the
>>> > > > > RX and TX path calculations.  Most applications can just arrange
>>> > > > > to call the RX path with a sufficiently large receive.  Bonding
>>> > > > > can't do this.
>>> > > >
>>> > > > I didn't talk on application I talked on the slave PMDs, The slave
>>> > > > PMD can manage a ring by itself if it helps for its own performance.
>>> > > > The bonding should not be oriented to specific PMDs.
>>> > >
>>> > > The issue though is that the performance problem is not with the
>>> > > individual PMDs - it's with bonding. There were no reports regarding
>>> > > the individual PMDs.
>>> > > This comes from reports from customers from real world production
>>> > > deployments - the issue of bonding being too slow was raised multiple
>>> times.
>>> > > This patch addresses those issues, again in production deployments,
>>> > > where it's been used for years, to users and customers satisfaction.
>>> >
>>> > From Chas I understood that using burst of 32 helps for some slave PM

Re: [dpdk-dev] Multi-thread mempool usage

2018-08-27 Thread Matteo Lanzuisi

Hi,

I apologize for the last email: it was a false positive. Sometimes it 
worked and sometimes it did not.
The real problem was a memory overflow in my code, where part of a 
memzone was overwritten by a memcpy. This never showed up on RedHat 6 
with dpdk-2.2.0. I think this is because of some hugepage management 
changes between dpdk versions 2.2.0 and 17.07.


Thank you for your time and patience,
Matteo

On 24/08/2018 18:47, Wiles, Keith wrote:



On Aug 24, 2018, at 9:44 AM, Matteo Lanzuisi  wrote:

Hi,

I ran valgrind again for a very long time, and it reported nothing strange 
in my code.
After that, I changed my code this way:

unsigned lcore_id_start = rte_lcore_id();
RTE_LCORE_FOREACH(lcore_id)
{
        if (lcore_id_start != lcore_id) // <- before this change, every lcore could use its own mempool and enqueue to its own ring

Something in the back of my head tells me this is correct, but I have no real 
reason :-(

If this works then I guess it is OK, but it would be nice to understand why it 
works with this fix. Unless you have another thread running on this lcore doing 
a get/put I do not see the problem.

        {
                new_work = NULL;
                result = rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P *) &new_work); // mempools are created one for each logical core
                if (result == 0)
                {
                        if (((uint64_t)(new_work)) < 0x7f00)
                                printf("Result %d, lcore di partenza %u, lcore di ricezione %u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work); // debug print, on my server it should never happen but with multi-thread happens always on the last logical core
                        new_work->command = command; // usage of the memory gotten from the mempool... <- here is where the application crashes
                        result = rte_ring_enqueue(cea_main_lcore_conf[lcore_id].de_conf.cmd_ring, (VOID_P) new_work); // enqueues the gotten buffer on the rings of all lcores
                        // check on result value ...
                }
                else
                {
                        // do something if result != 0 ...
                }
        }
        else
        {
                // don't use mempool but call a function instead
        }
}

and now it all goes well.
Is it possible that sending to itself could generate this issue?

Regards,
Matteo

On 21/08/2018 16:46, Matteo Lanzuisi wrote:

On 21/08/2018 14:51, Wiles, Keith wrote:

On Aug 21, 2018, at 7:44 AM, Matteo Lanzuisi  wrote:

On 21/08/2018 14:17, Wiles, Keith wrote:

On Aug 21, 2018, at 7:01 AM, Matteo Lanzuisi  wrote:

Hi

On 20/08/2018 18:03, Wiles, Keith wrote:

On Aug 20, 2018, at 9:47 AM, Matteo Lanzuisi 
   wrote:

Hello Olivier,

On 13/08/2018 23:54, Olivier Matz wrote:


Hello Matteo,

On Mon, Aug 13, 2018 at 03:20:44PM +0200, Matteo Lanzuisi wrote:


Any suggestion? any idea about this behaviour?

On 08/08/2018 11:56, Matteo Lanzuisi wrote:


Hi all,

recently I began using "dpdk-17.11-11.el7.x86_64" rpm (RedHat rpm) on
RedHat 7.5 kernel 3.10.0-862.6.3.el7.x86_64 as a porting of an
application from RH6 to RH7. On RH6 I used dpdk-2.2.0.

This application is made up by one or more threads (each one on a
different logical core) reading packets from i40e interfaces.

Each thread can call the following code lines when receiving a specific
packet:

RTE_LCORE_FOREACH(lcore_id)
{
  result =
rte_mempool_get(cea_main_lcore_conf[lcore_id].de_conf.cmd_pool, (VOID_P
*) &new_work);// mempools are created one for each logical core
  if (((uint64_t)(new_work)) < 0x7f00)
  printf("Result %d, lcore di partenza %u, lcore di ricezione
%u, pointer %p\n", result, rte_lcore_id(), lcore_id, new_work);//
debug print, on my server it should never happen but with multi-thread
happens always on the last logical core


Here, checking the value of new_work looks wrong to me, before
ensuring that result == 0. At least, new_work should be set to
NULL before calling rte_mempool_get().


I put the check after result == 0, and just before the rte_mempool_get() I set 
new_work to NULL, but nothing changed.
The first time something goes wrong the print is

Result 0, lcore di partenza 1, lcore di ricezione 2, counter 635, pointer 
0x880002

Sorry for the italian language print :) it means that application is sending a 
message from the logical core 1 to the logical core 2, it's the 635th time, the 
result is 0 and the pointer is 0x880002 while all pointers before were 
0x7ffxx.
One strange thing is that this behaviour happens always from the logical core 1 
to the logical core 2 when the counter is 635!!! (Sending messages from 2 to 1 
or 1 to 1 or 2 to 2 is all ok)
Another strange thing is that pointers from counter 636 to 640 are NULL, and from 641 
begin 

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-27 Thread Gavin Hu
This fix is not complete: kni_fifo_get also requires a read fence, otherwise it 
may read stale data on a weakly ordered platform.
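
To illustrate the pairing being asked for, here is a minimal
single-producer/single-consumer FIFO sketch. It is written with C11 atomics
rather than the barrier macros used by the real KNI code, and the demo_fifo
names are hypothetical; it is not the kni_fifo implementation, only a model
of why the consumer needs an acquire (read) side to match the producer's
release (write) side.

```c
#include <assert.h>
#include <stdatomic.h>

#define DEMO_FIFO_LEN 8	/* must be a power of two; one slot stays empty */

struct demo_fifo {
	atomic_uint write;	/* next slot the producer will fill */
	atomic_uint read;	/* next slot the consumer will drain */
	void *buffer[DEMO_FIFO_LEN];
};

static void demo_fifo_init(struct demo_fifo *f)
{
	atomic_init(&f->write, 0);
	atomic_init(&f->read, 0);
}

static unsigned demo_fifo_put(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned w = atomic_load_explicit(&f->write, memory_order_relaxed);
	unsigned r = atomic_load_explicit(&f->read, memory_order_acquire);
	unsigned i;

	for (i = 0; i < num; i++) {
		unsigned next = (w + 1) & (DEMO_FIFO_LEN - 1);

		if (next == r)
			break;	/* full */
		f->buffer[w] = data[i];
		w = next;
	}
	/* Release: slot contents become visible before the new write index. */
	atomic_store_explicit(&f->write, w, memory_order_release);
	return i;
}

static unsigned demo_fifo_get(struct demo_fifo *f, void **data, unsigned num)
{
	unsigned r = atomic_load_explicit(&f->read, memory_order_relaxed);
	/*
	 * Acquire: pairs with the producer's release. Without it, a weakly
	 * ordered CPU may read buffer[] slots before the write index.
	 */
	unsigned w = atomic_load_explicit(&f->write, memory_order_acquire);
	unsigned i;

	for (i = 0; i < num; i++) {
		if (r == w)
			break;	/* empty */
		data[i] = f->buffer[r];
		r = (r + 1) & (DEMO_FIFO_LEN - 1);
	}
	/* Release: slots are fully read before the producer may reuse them. */
	atomic_store_explicit(&f->read, r, memory_order_release);
	return i;
}

/* Single-threaded usage example; ordering only matters across cores. */
static void demo_fifo_selfcheck(void)
{
	struct demo_fifo f;
	int a = 1, b = 2, c = 3;
	void *in[3] = { &a, &b, &c };
	void *out[3] = { 0 };

	demo_fifo_init(&f);
	assert(demo_fifo_put(&f, in, 3) == 3);
	assert(demo_fifo_get(&f, out, 2) == 2);
	assert(out[0] == &a && out[1] == &b);
	assert(demo_fifo_get(&f, out, 2) == 1);
	assert(out[0] == &c);
	assert(demo_fifo_get(&f, out, 1) == 0);
}
```

The write barrier in the patch corresponds to the release store in
demo_fifo_put(); the missing piece referred to above corresponds to the
acquire load of the write index in demo_fifo_get().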

> -----Original Message-----
> From: dev  On Behalf Of Ferruh Yigit
> Sent: Monday, August 27, 2018 10:08 PM
> To: Kiran Kumar ;
> jerin.ja...@caviumnetworks.com
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer
> synchronization
>
> On 8/16/2018 10:55 AM, Kiran Kumar wrote:
> > With existing code in kni_fifo_put, rx_q values are not being updated
> > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > This is causing the sync issue on other core. So adding a write
> > barrier to make sure the values being synced before updating fifo_write.
> >
> > Fixes: 3fc5ca2f6352 ("kni: initial import")
> >
> > Signed-off-by: Kiran Kumar 
> > Acked-by: Jerin Jacob 
>
> Acked-by: Ferruh Yigit 


Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring

2018-08-27 Thread Chas Williams
On Mon, Aug 27, 2018 at 11:30 AM Matan Azrad  wrote:

> Hi Chas
>
> From: Chas Williams <3ch...@gmail.com>
> >On Sun, Aug 26, 2018 at 3:40 AM Matan Azrad 
> wrote:
> >
> >From: Chas Williams 
> >>On Thu, Aug 23, 2018 at 3:28 AM Matan Azrad  ma...@mellanox.com> wrote:
> >>Hi
> >>
> >>From: Eric Kinzie
> >>> On Wed Aug 22 11:42:37 + 2018, Matan Azrad wrote:
> >>> > Hi Luca
> >>> >
> >>> > From: Luca Boccassi
> >>> > > On Wed, 2018-08-22 at 07:09 +, Matan Azrad wrote:
> >>> > > > Hi Chas
> >>> > > >
> >>> > > > From: Chas Williams
> >>> > > > > On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad
> >>> > > > >  wrote:
> >>> > > > > Hi Chas
> >>> > > > >
> >>> > > > > From: Chas Williams
> >>> > > > > > On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad wrote:
> >>> > > > > > Hi
> >>> > > > > >
> >>> > > > > > From: Chas Williams
> >>> > > > > > > This will need to be implemented for some of the other RX
> >>> > > > > > > burst methods at some point for other modes to see this
> >>> > > > > > > performance improvement (with the exception of
> active-backup).
> >>> > > > > >
> >>> > > > > > Yes, I think it should be done at least to
> >>> > > > > > bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for
> >>> now.
> >>> > > > > >
> >>> > > > > > There is some duplicated code between the various RX paths.
> >>> > > > > > I would like to eliminate that as much as possible, so I was
> >>> > > > > > going to give that some thought first.
> >>> > > > >
> >>> > > > > There is no reason to stay this function as is while its twin
> is
> >>> > > > > changed.
> >>> > > > >
> >>> > > > > Unfortunately, this is all the patch I have at this time.
> >>> > > > >
> >>> > > > >
> >>> > > > > >
> >>> > > > > >
> >>> > > > > > > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi wrote:
> >>> > > > > > >
> >>> > > > > > > > During bond 802.3ad receive, a burst of packets is
> fetched
> >>> > > > > > > > from each slave into a local array and appended to
> >>> > > > > > > > per-slave ring buffer.
> >>> > > > > > > > Packets are taken from the head of the ring buffer and
> >>> > > > > > > > returned to the caller.  The number of mbufs provided to
> >>> > > > > > > > each slave is sufficient to meet the requirements of the
> >>> > > > > > > > ixgbe vector receive.
> >>> > > > > >
> >>> > > > > > Luca,
> >>> > > > > >
> >>> > > > > > Can you explain these requirements of ixgbe?
> >>> > > > > >
> >>> > > > > > The ixgbe (and some other Intel PMDs) have vectorized RX
> >>> > > > > > routines that are more efficient (if not faster) taking
> >>> > > > > > advantage of some advanced CPU instructions.  I think you
> need
> >>> > > > > > to be receiving at least 32 packets or more.
> >>> > > > >
> >>> > > > > So, why to do it in bond which is a generic driver for all the
> >>> > > > > vendors PMDs, If for ixgbe and other Intel nics it is better
> you
> >>> > > > > can force those PMDs to receive always 32 packets and to manage
> >>> > > > > a ring by themselves.
> >>> > > > >
> >>> > > > > The drawback of the ring is some additional latency on the
> >>> > > > > receive path.
> >>> > > > > In testing, the additional latency hasn't been an issue for
> bonding.
> >>> > > >
> >>> > > > When bonding does processing slower it may be a bottleneck for
> the
> >>> > > > packet processing for some application.
> >>> > > >
> >>> > > > > The bonding PMD has a fair bit of overhead associated with the
> >>> > > > > RX and TX path calculations.  Most applications can just
> arrange
> >>> > > > > to call the RX path with a sufficiently large receive.  Bonding
> >>> > > > > can't do this.
> >>> > > >
> >>> > > > I didn't talk on application I talked on the slave PMDs, The
> slave
> >>> > > > PMD can manage a ring by itself if it helps for its own
> performance.
> >>> > > > The bonding should not be oriented to specific PMDs.
> >>> > >
> >>> > > The issue though is that the performance problem is not with the
> >>> > > individual PMDs - it's with bonding. There were no reports
> regarding
> >>> > > the individual PMDs.
> >>> > > This comes from reports from customers from real world production
> >>> > > deployments - the issue of bonding being too slow was raised
> multiple
> >>> times.
> >>>

Re: [dpdk-dev] [PATCH] crypto/aesni_mb: fix possible array overrun

2018-08-27 Thread Ananyev, Konstantin
> 
> In order to process crypto operations in the AESNI MB PMD,
> they need to be sent to the buffer manager of the Multi-buffer library,
> through the "job" structure.
> 
> Currently, it is checked if there are outstanding operations to process
> in the ring, before getting a new job. However, if there are no available
> jobs in the manager, a flush operation needs to take place, freeing some
> of the jobs, so they can be used for the outstanding operation.
> 
> In order to avoid leaving the dequeued operation without being processed,
> the maximum number of operations that can be flushed is the remaining
> operations to return, which is the maximum number of operations that can
> be returned minus the number of operations ready to be returned
> (nb_ops - processed_jobs), minus 1 (for the new operation).
> 
> The problem comes when (nb_ops - processed_jobs) is 1 (last operation to
> dequeue). In that case, flush_mb_mgr is called with maximum number of
> operations equal to 0, which is wrong, causing a potential overrun in the
> "ops" array. Besides, the operation dequeued from the ring will be leaked,
> as no more operations can be returned.
> 
> The solution is to first check if there are jobs available in the manager.
> If there are not, flush operation gets called, and if enough operations
> are returned from the manager, then no more outstanding operations get
> dequeued from the ring, avoiding both the memory leak and the array overrun.
> If there are enough jobs, the PMD tries to dequeue an operation from the
> ring. If there are no operations in the ring, the new job pointer is not
> used, and it will be used in the next get_next_job call, so no memory leak
> happens.
> 
> Fixes: 0f548b50a160 ("crypto/aesni_mb: process crypto op on dequeue")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Pablo de Lara 
> ---
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> index 93dc7a443..e2dd834f0 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> @@ -833,22 +833,30 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
> 
>   uint8_t digest_idx = qp->digest_idx;
>   do {
> - /* Get next operation to process from ingress queue */
> - retval = rte_ring_dequeue(qp->ingress_queue, (void **)&op);
> - if (retval < 0)
> - break;
> -
>   /* Get next free mb job struct from mb manager */
>   job = (*qp->op_fns->job.get_next)(qp->mb_mgr);
>   if (unlikely(job == NULL)) {
>   /* if no free mb job structs we need to flush mb_mgr */
>   processed_jobs += flush_mb_mgr(qp,
>   &ops[processed_jobs],
> - (nb_ops - processed_jobs) - 1);
> + nb_ops - processed_jobs);
> +
> + if (nb_ops == processed_jobs)
> + break;
> 
>   job = (*qp->op_fns->job.get_next)(qp->mb_mgr);
>   }
> 
> + /*
> +  * Get next operation to process from ingress queue.
> +  * There is no need to return the job to the MB_MGR
> +  * if there are no more operations to process, since the MB_MGR
> +  * can use that pointer again in next get_next calls.
> +  */
> + retval = rte_ring_dequeue(qp->ingress_queue, (void **)&op);
> + if (retval < 0)
> + break;
> +
>   retval = set_mb_job_params(job, qp, op, &digest_idx);
>   if (unlikely(retval != 0)) {
>   qp->stats.dequeue_err_count++;
> --
Acked-by: Konstantin Ananyev 

> 2.17.1
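
The off-by-one described in the commit message can be checked with a little
arithmetic. The helpers below are hypothetical and only model the budget
passed to the flush before and after the change; they do not call into the
PMD or the Multi-buffer library.

```c
#include <assert.h>

/*
 * Budget passed to the flush before the fix: one slot is held back for
 * the operation that was already dequeued from the ring.
 */
static unsigned demo_flush_budget_old(unsigned nb_ops, unsigned processed_jobs)
{
	return (nb_ops - processed_jobs) - 1;
}

/*
 * Budget after the fix: nothing has been dequeued yet, so every remaining
 * slot of the output array may be used by the flush.
 */
static unsigned demo_flush_budget_new(unsigned nb_ops, unsigned processed_jobs)
{
	return nb_ops - processed_jobs;
}

/* After the flush, stop dequeuing if the output array is already full. */
static int demo_output_full(unsigned nb_ops, unsigned processed_jobs)
{
	return nb_ops == processed_jobs;
}

static void demo_budget_selfcheck(void)
{
	/* Last free output slot: the old code asked the flush for 0 ops. */
	assert(demo_flush_budget_old(32, 31) == 0);
	/* The new code may use that last slot for a flushed op... */
	assert(demo_flush_budget_new(32, 31) == 1);
	/* ...and then breaks out instead of dequeuing an op it cannot return. */
	assert(demo_output_full(32, 32));
	assert(!demo_output_full(32, 31));
}
```

With nb_ops = 32 and processed_jobs = 31, the old expression yields a budget
of 0, which is the case the commit message identifies as problematic.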



Re: [dpdk-dev] [PATCH] bus/vdev: fix wrong error log on secondary device scan

2018-08-27 Thread Burakov, Anatoly

On 27-Aug-18 1:27 PM, Qi Zhang wrote:

When a secondary process handles the VDEV_SCAN_ONE mp action, it is possible
that the device has already been inserted. This happens when multiple
secondary processes cause multiple broadcasts from the primary during
bus->scan. So we don't need to log any error for -EEXIST.

Bugzilla ID: 84
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Cc: sta...@dpdk.org

Reported-by: Eads Gage 
Signed-off-by: Qi Zhang 
---
  drivers/bus/vdev/vdev.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index 6139dd551..af9526fe6 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -346,6 +346,7 @@ vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
const struct vdev_param *in = (const struct vdev_param *)mp_msg->param;
const char *devname;
int num;
+   int ret;
  
  	strlcpy(mp_resp.name, VDEV_MP_KEY, sizeof(mp_resp.name));

mp_resp.len_param = sizeof(*ou);
@@ -380,7 +381,10 @@ vdev_action(const struct rte_mp_msg *mp_msg, const void *peer)
break;
case VDEV_SCAN_ONE:
VDEV_LOG(INFO, "receive vdev, %s", in->name);
-   if (insert_vdev(in->name, NULL, NULL) < 0)
+   ret = insert_vdev(in->name, NULL, NULL);
+   if (ret == -EEXIST)
+   VDEV_LOG(INFO, "device already exist, %s", in->name);


This is probably going to be printed a lot, and there's no real point in 
that. Maybe set log level to DEBUG instead?



+   else if (ret < 0)
VDEV_LOG(ERR, "failed to add vdev, %s", in->name);
break;
default:




--
Thanks,
Anatoly
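
A minimal sketch of the classification suggested above, assuming the
insert routine returns -EEXIST for an already known device and another
negative value for real failures. The enum and function names are
hypothetical and are not part of the vdev bus code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical log levels, standing in for the real logging macros. */
enum demo_log_level {
	DEMO_LOG_NONE,	/* success: nothing to report */
	DEMO_LOG_DEBUG,	/* expected duplicate: keep it quiet */
	DEMO_LOG_ERR	/* genuine failure */
};

/* Map the return value of a hypothetical insert routine to a log level. */
static enum demo_log_level demo_scan_one_log_level(int ret)
{
	if (ret == -EEXIST)
		return DEMO_LOG_DEBUG;
	if (ret < 0)
		return DEMO_LOG_ERR;
	return DEMO_LOG_NONE;
}

static void demo_log_level_selfcheck(void)
{
	assert(demo_scan_one_log_level(0) == DEMO_LOG_NONE);
	assert(demo_scan_one_log_level(-EEXIST) == DEMO_LOG_DEBUG);
	assert(demo_scan_one_log_level(-ENOMEM) == DEMO_LOG_ERR);
}
```

The duplicate case is checked first so that it never falls through to the
generic error branch.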


Re: [dpdk-dev] 16.11.8 (LTS) patches review and test

2018-08-27 Thread Luca Boccassi
On Thu, 2018-08-23 at 09:55 +0100, Luca Boccassi wrote:
> On Mon, 2018-08-13 at 19:21 +0100, luca.bocca...@gmail.com wrote:
> > Hi all,
> > 
> > Here is a list of patches targeted for LTS release 16.11.8. Please
> > help review and test. The planned date for the final release is
> > August the 23rd.
> > Before that, please shout if anyone has objections with these
> > patches being applied.
> > 
> > Also for the companies committed to running regression tests,
> > please run the tests and report any issue before the release date.
> > 
> > A release candidate tarball can be found at:
> > 
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v16.11.8-rc1
> > 
> > These patches are located at branch 16.11 of dpdk-stable repo:
> > https://dpdk.org/browse/dpdk-stable/
> > 
> > Thanks.
> > 
> > Luca Boccassi
> 
> Hi,
> 
> Regression tests from Intel have highlighted a possible issue with the
> changes (unidentified as of now), so while investigation is in progress
> we decided to postpone the release to Monday the 27th to be on the safe
> side.
> Apologies for any issues this might cause.

Hi,

Unfortunately triaging is still in progress, so it's better to postpone
again, to Wednesday the 29th of August.
Apologies again for any issues due to this delay.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v4 2/2] virtio: fix PCI config err handling

2018-08-27 Thread Luca Boccassi
On Mon, 2018-08-27 at 13:29 +0800, Tiwei Bie wrote:
> On Fri, Aug 24, 2018 at 06:14:20PM +0100, Luca Boccassi wrote:
> > From: Brian Russell 
> > 
> > In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config
> > returns the number of bytes read from PCI config or < 0 on error.
> > If less than the expected number of bytes are read then log the
> > failure and return rather than carrying on with garbage.
> > 
> > Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
> > 
> > Signed-off-by: Brian Russell 
> > Signed-off-by: Luca Boccassi 
> > ---
> > v2: handle additional rte_pci_read_config incomplete reads
> > v3: do not handle rte_pci_read_config of virtio cap, added in v2,
> > as it's less clear what the right thing to do there is
> > v4: do a more robust check - first check what the vendor is, and
> > skip the cap entirely if it's not what we are looking for.
> > 
> >  drivers/net/virtio/virtio_pci.c | 57 -
> > 
> >  1 file changed, 42 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/net/virtio/virtio_pci.c
> > b/drivers/net/virtio/virtio_pci.c
> > index 6bd22e54a6..cfefa9789b 100644
> > --- a/drivers/net/virtio/virtio_pci.c
> > +++ b/drivers/net/virtio/virtio_pci.c
> > @@ -567,16 +567,30 @@ virtio_read_caps(struct rte_pci_device *dev,
> > struct virtio_hw *hw)
> >     }
> >  
> >     ret = rte_pci_read_config(dev, &pos, 1,
> > PCI_CAPABILITY_LIST);
> > -   if (ret < 0) {
> > -   PMD_INIT_LOG(DEBUG, "failed to read pci capability
> > list");
> > +   if (ret != 1) {
> > +   PMD_INIT_LOG(DEBUG,
> > +    "failed to read pci capability list,
> > ret %d", ret);
> >     return -1;
> >     }
> >  
> >     while (pos) {
> > +   ret = rte_pci_read_config(dev, &cap, 2, pos);
> > +   if (ret != 2) {
> > +   PMD_INIT_LOG(DEBUG,
> > +    "failed to read pci cap at
> > pos: %x ret %d",
> > +    pos, ret);
> > +   break;
> > +   }
> > +   if (cap.cap_vndr != PCI_CAP_ID_MSIX &&
> > +   cap.cap_vndr != PCI_CAP_ID_VNDR) {
> > +   goto next;
> > +   }
> > +
> >     ret = rte_pci_read_config(dev, &cap, sizeof(cap),
> > pos);
> > -   if (ret < 0) {
> > -   PMD_INIT_LOG(ERR,
> > -   "failed to read pci cap at pos:
> > %x", pos);
> > +   if (ret != sizeof(cap)) {
> > +   PMD_INIT_LOG(DEBUG,
> > +    "failed to read pci cap at
> > pos: %x ret %d",
> > +    pos, ret);
> >     break;
> >     }
> >  
> 
> It seems that I didn't make myself clear in my previous
> comments. I mean it's better to handle MSIX cap and virtio
> cap respectively in this function. Currently we're always
> reading them as virtio caps. As we are strictly requiring
> that _read_config() should return the required number of
> bytes, it's not perfect to require it to return "virtio
> cap size" of bytes while we're trying to read a MSIX cap.
> So please change the code to something similar to this:

Sorry, I thought you meant in the vtpci_msix_detect function, which I
changed. Fixed in v5.
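
For reference, a rough sketch of the approach as I understand it: read
only the 2-byte capability header first, then read exactly what each
capability type needs. It walks a fake config space held in an array; the
DEMO_ names and helpers are hypothetical, the constants are the usual PCI
values as far as I know, and the 16-byte vendor capability length is an
assumption standing in for the size of the virtio cap structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CFG_SIZE		256
#define DEMO_CAPABILITY_LIST	0x34	/* offset of the first cap pointer */
#define DEMO_CAP_ID_VNDR	0x09
#define DEMO_CAP_ID_MSIX	0x11
#define DEMO_MSIX_ENABLE	0x8000
#define DEMO_VNDR_CAP_LEN	16	/* assumed size of a virtio cap */

struct demo_caps {
	int msix_present;
	int msix_enabled;
	int vendor_caps;
};

/* Fake reader: returns bytes read (possibly short), negative on error. */
static int demo_cfg_read(const uint8_t *cfg, void *buf, size_t len, size_t off)
{
	if (off >= DEMO_CFG_SIZE)
		return -1;
	if (len > DEMO_CFG_SIZE - off)
		len = DEMO_CFG_SIZE - off;
	memcpy(buf, cfg + off, len);
	return (int)len;
}

static int demo_walk_caps(const uint8_t *cfg, struct demo_caps *out)
{
	uint8_t pos;
	int guard = 48;	/* bound the walk in case the list is cyclic */

	memset(out, 0, sizeof(*out));
	if (demo_cfg_read(cfg, &pos, 1, DEMO_CAPABILITY_LIST) != 1)
		return -1;

	while (pos != 0 && guard-- > 0) {
		uint8_t hdr[2];	/* hdr[0] = cap ID, hdr[1] = next pointer */

		if (demo_cfg_read(cfg, hdr, sizeof(hdr), pos) != (int)sizeof(hdr))
			return -1;

		if (hdr[0] == DEMO_CAP_ID_MSIX) {
			uint8_t raw[2];
			uint16_t flags;

			/* Only the two flag bytes are needed for MSI-X. */
			if (demo_cfg_read(cfg, raw, sizeof(raw), pos + 2) !=
			    (int)sizeof(raw))
				return -1;
			/* Config space is little-endian. */
			flags = (uint16_t)(raw[0] | (raw[1] << 8));
			out->msix_present = 1;
			out->msix_enabled = (flags & DEMO_MSIX_ENABLE) != 0;
		} else if (hdr[0] == DEMO_CAP_ID_VNDR) {
			uint8_t full[DEMO_VNDR_CAP_LEN];

			/* Vendor caps need the whole structure. */
			if (demo_cfg_read(cfg, full, sizeof(full), pos) !=
			    (int)sizeof(full))
				return -1;
			out->vendor_caps++;
		}
		pos = hdr[1];
	}
	return 0;
}

static void demo_walk_caps_selfcheck(void)
{
	uint8_t cfg[DEMO_CFG_SIZE] = { 0 };
	struct demo_caps caps;

	cfg[DEMO_CAPABILITY_LIST] = 0x40;
	cfg[0x40] = DEMO_CAP_ID_MSIX;	/* MSI-X cap, enabled */
	cfg[0x41] = 0x50;
	cfg[0x43] = 0x80;
	cfg[0x50] = DEMO_CAP_ID_VNDR;	/* one vendor cap, end of list */
	cfg[0x51] = 0x00;

	assert(demo_walk_caps(cfg, &caps) == 0);
	assert(caps.msix_present && caps.msix_enabled);
	assert(caps.vendor_caps == 1);
}
```

Unlike the patch, which logs and breaks out of the loop, the sketch simply
returns -1 on any incomplete read to keep the example short.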

> > @@ -689,25 +703,38 @@ enum virtio_msix_status
> >  vtpci_msix_detect(struct rte_pci_device *dev)
> >  {
> >     uint8_t pos;
> > -   struct virtio_pci_cap cap;
> >     int ret;
> >  
> >     ret = rte_pci_read_config(dev, &pos, 1,
> > PCI_CAPABILITY_LIST);
> > -   if (ret < 0) {
> > -   PMD_INIT_LOG(DEBUG, "failed to read pci capability
> > list");
> > +   if (ret != 1) {
> > +   PMD_INIT_LOG(DEBUG,
> > +    "failed to read pci capability list,
> > ret %d", ret);
> >     return VIRTIO_MSIX_NONE;
> >     }
> >  
> >     while (pos) {
> > -   ret = rte_pci_read_config(dev, &cap, sizeof(cap),
> > pos);
> > -   if (ret < 0) {
> > -   PMD_INIT_LOG(ERR,
> > -   "failed to read pci cap at pos:
> > %x", pos);
> > +   uint8_t cap[2];
> > +
> > +   ret = rte_pci_read_config(dev, cap, sizeof(cap),
> > pos);
> > +   if (ret != sizeof(cap)) {
> > +   PMD_INIT_LOG(DEBUG,
> > +    "failed to read pci cap at
> > pos: %x ret %d",
> > +    pos, ret);
> >     break;
> >     }
> >  
> > -   if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
> > -   uint16_t flags = ((uint16_t *)&cap)[1];
> > +   if (cap[0] == PCI_CAP_ID_MSIX) {
> > +   uint16_t flags;
> > +
> > +   ret = rte_pci_read_config(dev, &flags,
> > sizeof(flags),
> > +   pos + sizeof(cap));
> > +   if (ret != sizeof(flags)) {
> > +   PMD_INIT_LOG(DEBUG,
> > +

[dpdk-dev] [PATCH v5 1/2] bus/pci: harmonize and document rte_pci_read_config return value

2018-08-27 Thread Luca Boccassi
On Linux, rte_pci_read_config on success returns the number of read
bytes, but on BSD it returns 0.
Document the return values, and have BSD behave as Linux does.

At least one case (bnx2x PMD) treats 0 as an error, so the change
makes sense also for that.

Signed-off-by: Luca Boccassi 
---
 drivers/bus/pci/bsd/pci.c | 4 +++-
 drivers/bus/pci/rte_bus_pci.h | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c
index 655b34b7e4..175d83cf1b 100644
--- a/drivers/bus/pci/bsd/pci.c
+++ b/drivers/bus/pci/bsd/pci.c
@@ -439,6 +439,8 @@ int rte_pci_read_config(const struct rte_pci_device *dev,
 {
int fd = -1;
int size;
+   /* Copy Linux implementation's behaviour */
+   const int return_len = len;
struct pci_io pi = {
.pi_sel = {
.pc_domain = dev->addr.domain,
@@ -469,7 +471,7 @@ int rte_pci_read_config(const struct rte_pci_device *dev,
}
close(fd);
 
-   return 0;
+   return return_len;
 
  error:
if (fd >= 0)
diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
index 0d1955ffe0..df8f64798d 100644
--- a/drivers/bus/pci/rte_bus_pci.h
+++ b/drivers/bus/pci/rte_bus_pci.h
@@ -219,6 +219,8 @@ void rte_pci_unregister(struct rte_pci_driver *driver);
  *   The length of the data buffer.
  * @param offset
  *   The offset into PCI config space
+ * @return
+ *  Number of bytes read on success, negative on error.
  */
 int rte_pci_read_config(const struct rte_pci_device *device,
void *buf, size_t len, off_t offset);
-- 
2.18.0
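
A sketch of how a caller might rely on the documented return value. The
reader below is a hypothetical stand-in operating on an in-memory array,
not rte_pci_read_config itself; it only follows the same convention (number
of bytes read on success, negative on error).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CFG_LEN 256

/* Hypothetical reader: bytes read (possibly short), negative on error. */
static int demo_cfg_read(const uint8_t *cfg, void *buf, size_t len,
			 size_t offset)
{
	if (offset >= DEMO_CFG_LEN)
		return -1;
	if (len > DEMO_CFG_LEN - offset)
		len = DEMO_CFG_LEN - offset;	/* short read at the end */
	memcpy(buf, cfg + offset, len);
	return (int)len;
}

/* Returns 0 only when exactly len bytes were read. */
static int demo_cfg_read_exact(const uint8_t *cfg, void *buf, size_t len,
			       size_t offset)
{
	int ret = demo_cfg_read(cfg, buf, len, offset);

	if (ret < 0)
		return ret;	/* hard error from the reader */
	if ((size_t)ret != len)
		return -1;	/* short read: do not trust buf */
	return 0;
}

static void demo_read_exact_selfcheck(void)
{
	uint8_t cfg[DEMO_CFG_LEN] = { 0 };
	uint8_t pos = 0;
	uint8_t tail[4];

	cfg[0x34] = 0x40;
	assert(demo_cfg_read_exact(cfg, &pos, 1, 0x34) == 0 && pos == 0x40);
	/* Only 2 bytes are left at offset 254, so a 4-byte read is short. */
	assert(demo_cfg_read_exact(cfg, tail, sizeof(tail), 254) == -1);
	/* Offset outside the space is a hard error. */
	assert(demo_cfg_read_exact(cfg, tail, sizeof(tail), 300) < 0);
}
```

A caller written this way would treat a return of 0 for a non-zero length
as a failed read, which is one reason for both platforms to agree on the
convention.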



[dpdk-dev] [PATCH v5 2/2] virtio: fix PCI config err handling

2018-08-27 Thread Luca Boccassi
From: Brian Russell 

In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config returns
the number of bytes read from PCI config or < 0 on error.
If less than the expected number of bytes are read then log the
failure and return rather than carrying on with garbage.

Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")

Signed-off-by: Brian Russell 
Signed-off-by: Luca Boccassi 
---
v2: handle additional rte_pci_read_config incomplete reads
v3: do not handle rte_pci_read_config of virtio cap, added in v2,
as it's less clear what the right thing to do there is
v4: do a more robust check - first check what the vendor is, and
skip the cap entirely if it's not what we are looking for.
v5: fetch only 2 flags bytes if the vndr is PCI_CAP_ID_MSIX

 drivers/net/virtio/virtio_pci.c | 66 -
 1 file changed, 49 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index 6bd22e54a6..e900254a12 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -567,16 +567,18 @@ virtio_read_caps(struct rte_pci_device *dev, struct 
virtio_hw *hw)
}
 
ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
-   if (ret < 0) {
-   PMD_INIT_LOG(DEBUG, "failed to read pci capability list");
+   if (ret != 1) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci capability list, ret %d", ret);
return -1;
}
 
while (pos) {
-   ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
-   if (ret < 0) {
-   PMD_INIT_LOG(ERR,
-   "failed to read pci cap at pos: %x", pos);
+   ret = rte_pci_read_config(dev, &cap, 2, pos);
+   if (ret != 2) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci cap at pos: %x ret %d",
+pos, ret);
break;
}
 
@@ -586,7 +588,16 @@ virtio_read_caps(struct rte_pci_device *dev, struct 
virtio_hw *hw)
 * 1st byte is cap ID; 2nd byte is the position of next
 * cap; next two bytes are the flags.
 */
-   uint16_t flags = ((uint16_t *)&cap)[1];
+   uint16_t flags;
+
+   ret = rte_pci_read_config(dev, &flags, sizeof(flags),
+   pos + 2);
+   if (ret != sizeof(flags)) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci cap at pos:"
+" %x ret %d", pos + 2, ret);
+   break;
+   }
 
if (flags & PCI_MSIX_ENABLE)
hw->use_msix = VIRTIO_MSIX_ENABLED;
@@ -601,6 +612,14 @@ virtio_read_caps(struct rte_pci_device *dev, struct 
virtio_hw *hw)
goto next;
}
 
+   ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
+   if (ret != sizeof(cap)) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci cap at pos: %x ret %d",
+pos, ret);
+   break;
+   }
+
PMD_INIT_LOG(DEBUG,
"[%2x] cfg type: %u, bar: %u, offset: %04x, len: %u",
pos, cap.cfg_type, cap.bar, cap.offset, cap.length);
@@ -689,25 +708,38 @@ enum virtio_msix_status
 vtpci_msix_detect(struct rte_pci_device *dev)
 {
uint8_t pos;
-   struct virtio_pci_cap cap;
int ret;
 
ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
-   if (ret < 0) {
-   PMD_INIT_LOG(DEBUG, "failed to read pci capability list");
+   if (ret != 1) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci capability list, ret %d", ret);
return VIRTIO_MSIX_NONE;
}
 
while (pos) {
-   ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
-   if (ret < 0) {
-   PMD_INIT_LOG(ERR,
-   "failed to read pci cap at pos: %x", pos);
+   uint8_t cap[2];
+
+   ret = rte_pci_read_config(dev, cap, sizeof(cap), pos);
+   if (ret != sizeof(cap)) {
+   PMD_INIT_LOG(DEBUG,
+"failed to read pci cap at pos: %x ret %d",
+pos, ret);
break;
}
 
-   if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
-   uint16_t flags = ((uint16_t *)&cap)[1];
+   if (cap[0] == 

Re: [dpdk-dev] [PATCH] kni: dynamically allocate memory for each KNI

2018-08-27 Thread Ferruh Yigit
On 8/2/2018 3:25 PM, Igor Ryzhov wrote:
> Long time ago preallocation of memory for KNI was introduced in commit
> 0c6bc8e. It was done because of lack of ability to free previously
> allocated memzones, which led to memzone exhaustion. Currently memzones
> can be freed and this patch uses this ability for dynamic KNI memory
> allocation.

Hi Igor,

It is good to be able to allocate memory dynamically and get rid of the
"max_kni_ifaces" and "kni_memzone_pool", thanks for the patch.

Overall looks good, a few comments below.

> 
> Signed-off-by: Igor Ryzhov 
> ---
>  lib/librte_kni/rte_kni.c | 392 ---
>  lib/librte_kni/rte_kni.h |   6 +-
>  test/test/test_kni.c |   6 -
>  3 files changed, 128 insertions(+), 276 deletions(-)
> 
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index 8a8f6c1cc..028b44bfd 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -36,24 +36,33 @@
>   * KNI context
>   */
>  struct rte_kni {
> + const struct rte_memzone *mz;   /**< KNI context memzone */

I was thinking of removing the context memzone and using rte_zmalloc() to create
kni objects, but the updated rte_kni_get() API seems to rely on it.
If you see any other way to get the kni object from its name in rte_kni_get(),
I am for removing the above *mz variable from the rte_kni struct.

<...>

> +static void
> +kni_ctx_release_mz(struct rte_kni *ctx)
> +{
> + rte_memzone_free(ctx->m_tx_q);
> + rte_memzone_free(ctx->m_rx_q);
> + rte_memzone_free(ctx->m_alloc_q);
> + rte_memzone_free(ctx->m_free_q);
> + rte_memzone_free(ctx->m_req_q);
> + rte_memzone_free(ctx->m_resp_q);
> + rte_memzone_free(ctx->m_sync_addr);


"ctx" sounds confusing to me, isn't this "rte_kni" object instance, why not just
call it "kni" or, if that is too generic, "kni_obj" or similar? The same goes for
the other APIs as well.

And this is just a detail, but regarding the order of the functions, would you
mind having the reserve() one first and the release() one after it?

<...>

> -/* Shall be called before any allocation happens */
> -void
> -rte_kni_init(unsigned int max_kni_ifaces)
> +static struct rte_kni *
> +kni_ctx_reserve(const char *name)
>  {
> - uint32_t i;
> - struct rte_kni_memzone_slot *it;
> + struct rte_kni *ctx;
>   const struct rte_memzone *mz;
> -#define OBJNAMSIZ 32
> - char obj_name[OBJNAMSIZ];
>   char mz_name[RTE_MEMZONE_NAMESIZE];
>  
> - /* Immediately return if KNI is already initialized */
> - if (kni_memzone_pool.initialized) {
> - RTE_LOG(WARNING, KNI, "Double call to rte_kni_init()");
> - return;
> - }
> + snprintf(mz_name, RTE_MEMZONE_NAMESIZE, "kni_info_%s", name);

Can you please convert memzone names, like "kni_info" to defines, for all of 
them?
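
Something along these lines, as a rough sketch only. The macro and helper names
below are made up for illustration, and KNI_MZ_NAMESIZE stands in for
RTE_MEMZONE_NAMESIZE; only the "kni_info_" prefix comes from the patch.

```c
#include <stdio.h>

/* Assumed size for this sketch; the real code would use RTE_MEMZONE_NAMESIZE. */
#define KNI_MZ_NAMESIZE 32

/* Hypothetical define; only the "kni_info_" string itself is from the patch. */
#define KNI_MZ_INFO_PREFIX "kni_info_"

/*
 * Build the memzone name from the prefix define.
 * Returns 0 on success, -1 if the name would be truncated.
 */
static int
kni_mz_name(char *buf, size_t len, const char *kni_name)
{
	int n = snprintf(buf, len, KNI_MZ_INFO_PREFIX "%s", kni_name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The same pattern would apply to the queue memzones, with one define per prefix.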

<...>

> @@ -81,8 +81,12 @@ struct rte_kni_conf {
>   *
>   * @param max_kni_ifaces
>   *  The maximum number of KNI interfaces that can coexist concurrently
> + *
> + * @return
> + *  - 0 indicates success.
> + *  - negative value indicates failure.
>   */
> -void rte_kni_init(unsigned int max_kni_ifaces);
> +int rte_kni_init(unsigned int max_kni_ifaces);

This changes the API: the return type changes from "void" to "int". I agree "int"
makes more sense since the API can fail, but this changes the ABI/API.

Since existing binaries don't check the return value at all, there may be no
issue from the ABI point of view, but from the API point of view some apps may
get "return value not checked" warnings, not sure though.

And the need for the API is questionable at this stage. It may be possible to
move its work into rte_kni_alloc(), which already has a "kni_fd" check.

What do you think about keeping the API signature the same for now, but adding a
deprecation notice to remove the API, and then removing rte_kni_init()
completely in the next release (v19.02)?
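
To illustrate the idea, a minimal sketch follows. All names in it are
hypothetical stand-ins and not the real rte_kni code; kni_open_device() just
pretends to open the KNI device so that the sketch is self-contained.

```c
#include <stddef.h>

/* Sketch only: hypothetical stand-ins, not the real rte_kni implementation. */
static int kni_fd = -1;     /* mirrors the existing "kni_fd" check */
static int kni_open_calls;  /* counts how often the device was opened */

/* Stand-in for opening the KNI device; always succeeds in this sketch. */
static int
kni_open_device(void)
{
	kni_open_calls++;
	return 3; /* pretend file descriptor */
}

/* Signature stays "void" for now; the body becomes a no-op until removal. */
static void
kni_init_sketch(unsigned int max_kni_ifaces)
{
	(void)max_kni_ifaces;
}

/* Lazy init on first allocation; returns 0 on success, -1 on failure. */
static int
kni_alloc_sketch(void)
{
	if (kni_fd < 0) {
		kni_fd = kni_open_device();
		if (kni_fd < 0)
			return -1;
	}
	/* ... reserve memzones and create the interface here ... */
	return 0;
}
```

With this shape, applications that still call the init function keep building
and running unchanged, and the failure path is reported by the allocation call,
which already returns a value.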

<...>

>  /**
> diff --git a/test/test/test_kni.c b/test/test/test_kni.c
> index 1b876719a..56c98513a 100644
> --- a/test/test/test_kni.c
> +++ b/test/test/test_kni.c
> @@ -429,12 +429,6 @@ test_kni_processing(uint16_t port_id, struct rte_mempool 
> *mp)
>   }
>   test_kni_ctx = NULL;
>  
> - /* test of releasing a released kni device */
> - if (rte_kni_release(kni) == 0) {
> - printf("should not release a released kni device\n");
> - return -1;
> - }

Why does this need to be removed?



[dpdk-dev] [PATCH v2] net/virtio-user: check negotiated features before set

2018-08-27 Thread eric zhang
This patch checks the negotiated features to see whether offloads are needed
before setting the tap device offload capabilities. It also checks whether the
kernel supports the TUNSETOFFLOAD operation.

Signed-off-by: eric zhang 

---
v2:
* don't return failure when failed to set offload to tap
* check if offloads available when handling VHOST_GET_FEATURES
---
 drivers/net/virtio/virtio_user/vhost_kernel.c |  8 ++--
 drivers/net/virtio/virtio_user/vhost_kernel_tap.c | 55 +--
 drivers/net/virtio/virtio_user/vhost_kernel_tap.h |  2 +-
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c 
b/drivers/net/virtio/virtio_user/vhost_kernel.c
index dd24b6b..5c39f26 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
@@ -278,9 +278,11 @@ struct vhost_memory_kernel {
if (!ret && req_kernel == VHOST_GET_FEATURES) {
/* with tap as the backend, all these features are supported
 * but not claimed by vhost-net, so we add them back when
-* reporting to upper layer.
+* reporting to upper layer. For guest offloads we check if
+* they are available in the negotiated features.
 */
-   *((uint64_t *)arg) |= VHOST_KERNEL_GUEST_OFFLOADS_MASK;
+   *((uint64_t *)arg) |=
+   (dev->features & VHOST_KERNEL_GUEST_OFFLOADS_MASK);
*((uint64_t *)arg) |= VHOST_KERNEL_HOST_OFFLOADS_MASK;
 
/* vhost_kernel will not declare this feature, but it does
@@ -381,7 +383,7 @@ struct vhost_memory_kernel {
hdr_size = sizeof(struct virtio_net_hdr);
 
tapfd = vhost_kernel_open_tap(&dev->ifname, hdr_size, req_mq,
-(char *)dev->mac_addr);
+(char *)dev->mac_addr, dev->features);
if (tapfd < 0) {
PMD_DRV_LOG(ERR, "fail to open tap for vhost kernel");
return -1;
diff --git a/drivers/net/virtio/virtio_user/vhost_kernel_tap.c 
b/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
index d036428..5e86404 100644
--- a/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
+++ b/drivers/net/virtio/virtio_user/vhost_kernel_tap.c
@@ -45,21 +45,54 @@
 
 #include "vhost_kernel_tap.h"
 #include "../virtio_logs.h"
+#include "../virtio_pci.h"
+
+static int
+vhost_kernel_tap_set_offload(int fd, uint64_t feature)
+{
+   unsigned int offload = 0;
+
+   if (feature & (1ULL << VIRTIO_NET_F_GUEST_CSUM))
+   offload |= TUN_F_CSUM;
+   if (feature & (1ULL << VIRTIO_NET_F_GUEST_TSO4))
+   offload |= TUN_F_TSO4;
+   if (feature & (1ULL << VIRTIO_NET_F_GUEST_TSO6))
+   offload |= TUN_F_TSO6;
+   if (feature & ((1ULL << VIRTIO_NET_F_GUEST_TSO4) |
+   (1ULL << VIRTIO_NET_F_GUEST_TSO6)) &&
+   (feature & (1ULL << VIRTIO_NET_F_GUEST_ECN)))
+   offload |= TUN_F_TSO_ECN;
+   if (feature & (1ULL << VIRTIO_NET_F_GUEST_UFO))
+   offload |= TUN_F_UFO;
+
+   if (offload != 0) {
+   /* Check if our kernel supports TUNSETOFFLOAD */
+   if (ioctl(fd, TUNSETOFFLOAD, 0) != 0 && errno == EINVAL) {
+   PMD_DRV_LOG(ERR, "Kernel doesn't support TUNSETOFFLOAD\n");
+   return -ENOTSUP;
+   }
+
+   if (ioctl(fd, TUNSETOFFLOAD, offload) != 0) {
+   offload &= ~TUN_F_UFO;
+   if (ioctl(fd, TUNSETOFFLOAD, offload) != 0) {
+   PMD_DRV_LOG(ERR, "TUNSETOFFLOAD ioctl() failed: 
%s\n",
+   strerror(errno));
+   return -1;
+   }
+   }
+   }
+
+   return 0;
+}
 
 int
 vhost_kernel_open_tap(char **p_ifname, int hdr_size, int req_mq,
-const char *mac)
+const char *mac, uint64_t features)
 {
unsigned int tap_features;
int sndbuf = INT_MAX;
struct ifreq ifr;
int tapfd;
-   unsigned int offload =
-   TUN_F_CSUM |
-   TUN_F_TSO4 |
-   TUN_F_TSO6 |
-   TUN_F_TSO_ECN |
-   TUN_F_UFO;
 
/* TODO:
 * 1. verify we can get/set vnet_hdr_len, tap_probe_vnet_hdr_len
@@ -119,13 +152,7 @@
goto error;
}
 
-   /* TODO: before set the offload capabilities, we'd better (1) check
-* negotiated features to see if necessary to offload; (2) query tap
-* to see if it supports the offload capabilities.
-*/
-   if (ioctl(tapfd, TUNSETOFFLOAD, offload) != 0)
-   PMD_DRV_LOG(ERR, "TUNSETOFFLOAD ioctl() failed: %s",
-  strerror(errno));
+   vhost_kernel_tap_set_offload(tapfd, features);
 
  

[dpdk-dev] [Bug 86] Requested device cannot be used

2018-08-27 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=86

Bug ID: 86
   Summary: Requested device cannot be used
   Product: DPDK
   Version: unspecified
  Hardware: x86
OS: Linux
Status: CONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: tcn...@iii.org.tw
  Target Milestone: ---

Hello all,
  I am trying to measure the performance of an Intel X520 10G NIC on Dell
R630/R730 servers, but I keep getting an unexpected error, please see below.

I followed the instructions at https://goo.gl/T7iTuk to compile the DPDK and
OVS code. I've successfully bound both my X520 NIC ports to DPDK, using either
igb_uio or vfio_pci:
~~
Network devices using DPDK-compatible driver

:82:00.0 'Ethernet 10G 2P X520 Adapter 154d' drv=igb_uio unused=vfio-pci
:82:00.1 'Ethernet 10G 2P X520 Adapter 154d' drv=igb_uio unused=vfio-pci

Network devices using kernel driver
===
:01:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno1 drv=tg3
unused=igb_uio,vfio-pci 
:01:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno2 drv=tg3
unused=igb_uio,vfio-pci 
:02:00.0 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno3 drv=tg3
unused=igb_uio,vfio-pci 
:02:00.1 'NetXtreme BCM5720 Gigabit Ethernet PCIe 165f' if=eno4 drv=tg3
unused=igb_uio,vfio-pci *Active*

Other Network devices
=

~~~

The hugepages were set to 2048 * 2 MB:
~~~
HugePages_Total:2048
HugePages_Free: 1024
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
~~~

Here is the problem: when I tried to start ovsdb-server and ovs-vswitchd,
I got the following error:
~~~
   2018-08-27T09:54:05.548Z|2|ovs_numa|INFO|Discovered 16 CPU cores on NUMA
node 0
   2018-08-27T09:54:05.548Z|3|ovs_numa|INFO|Discovered 16 CPU cores on NUMA
node 1
   2018-08-27T09:54:05.548Z|4|ovs_numa|INFO|Discovered 2 NUMA nodes and 32
CPU cores
   2018-08-27T09:54:05.548Z|5|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock:
connecting...
   2018-08-27T09:54:05.549Z|6|reconnect|INFO|unix:/usr/local/var/run/openvswitch/db.sock:
connected
   2018-08-27T09:54:05.552Z|7|dpdk|INFO|DPDK Enabled - initializing...
   2018-08-27T09:54:05.552Z|8|dpdk|INFO|No vhost-sock-dir provided -
defaulting to /usr/local/var/run/openvswitch
   2018-08-27T09:54:05.552Z|9|dpdk|INFO|EAL ARGS: ovs-vswitchd --socket-mem
1024,0 -c 0x0001
   2018-08-27T09:54:05.553Z|00010|dpdk|INFO|EAL: Detected 32 lcore(s)
   2018-08-27T09:54:05.558Z|00011|dpdk|WARN|EAL: No free hugepages reported in
hugepages-1048576kB
   2018-08-27T09:54:05.559Z|00012|dpdk|INFO|EAL: Probing VFIO support...
   2018-08-27T09:54:06.700Z|00013|dpdk|INFO|EAL: PCI device :82:00.0 on
NUMA socket 1
   2018-08-27T09:54:06.700Z|00014|dpdk|INFO|EAL:   probe driver: 8086:154d
net_ixgbe
2018-08-27T09:54:06.700Z|00015|dpdk|ERR|EAL: Requested device :82:00.0
cannot be used
   2018-08-27T09:54:06.700Z|00016|dpdk|INFO|EAL: PCI device :82:00.1 on
NUMA socket 1
   2018-08-27T09:54:06.700Z|00017|dpdk|INFO|EAL:   probe driver: 8086:154d
net_ixgbe
2018-08-27T09:54:06.700Z|00018|dpdk|ERR|EAL: Requested device :82:00.1
cannot be used
   2018-08-27T09:54:06.701Z|00019|dpdk|INFO|DPDK Enabled - initialized
   2018-08-27T09:54:06.705Z|00020|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath
supports recirculation
~~~

As a result, I got the same error when I added a DPDK port:
~~~
2018-08-27T09:54:06.709Z|00036|dpdk|INFO|EAL: PCI device :82:00.0 on NUMA
socket 1
2018-08-27T09:54:06.709Z|00037|dpdk|INFO|EAL:   probe driver: 8086:154d
net_ixgbe
2018-08-27T09:54:06.710Z|00038|dpdk|WARN|EAL: Requested device :82:00.0
cannot be used
2018-08-27T09:54:06.710Z|00039|dpdk|ERR|EAL: Driver cannot attach the device
(:82:00.0)
2018-08-27T09:54:06.710Z|00040|netdev_dpdk|WARN|Error attaching device
':82:00.0' to DPDK
2018-08-27T09:54:06.710Z|00041|netdev|WARN|dpdk0: could not set configuration
(Invalid argument)
~~~


I've tried the solution described in https://goo.gl/3opVRT, which is to use
"uio_pci_generic" and disable intel_iommu. It didn't work for me.

Here is detailed info about my test platform:
DPDK & OVS version: DPDK 16.11 & OVS 2.7.0, DPDK 17.05.1 & OVS 2.8.0, DPDK
17.11 & OVS 2.9.0, DPDK 17.11 & OVS 2.10.0
OS: ubuntu 16.04
Hardware: Dell R730/R630 server  with intel X520 10G NIC
128G Memory, 32 Cores.

Can anybody help or give me a hint on how to debug this? I'm totally lost here.


Re: [dpdk-dev] [PATCH v5 2/2] virtio: fix PCI config err handling

2018-08-27 Thread Tiwei Bie
I just noticed the title. It should be "net/virtio: xxx",
instead of "virtio: xxx".

On Mon, Aug 27, 2018 at 05:52:40PM +0100, Luca Boccassi wrote:
[...]
> + ret = rte_pci_read_config(dev, &flags, sizeof(flags),
> + pos + sizeof(cap));
> + if (ret != sizeof(flags)) {
> + PMD_INIT_LOG(DEBUG,
> +  "failed to read pci cap at pos:"
> +  " %lx ret %d", pos + sizeof(cap),
> +  ret);

In file included from drivers/net/virtio/virtio_pci.c:15:0:
drivers/net/virtio/virtio_pci.c: In function ‘vtpci_msix_detect’:
drivers/net/virtio/virtio_logs.h:13:3: error: format ‘%lx’ expects argument of 
type ‘long unsigned int’, but argument 5 has type ‘unsigned int’ 
[-Werror=format=]
   "%s(): " fmt "\n", __func__, ##args)
   ^
drivers/net/virtio/virtio_pci.c:737:5: note: in expansion of macro 
‘PMD_INIT_LOG’
 PMD_INIT_LOG(DEBUG,
 ^
cc1: all warnings being treated as errors

I got the above build errors in a 32-bit build.
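
The sum "pos + sizeof(cap)" has type size_t, which is "unsigned int" on 32-bit
and "unsigned long" on 64-bit, so "%lx" only matches on the latter. One possible
fix, shown here as a standalone sketch with a hypothetical helper name and
snprintf() standing in for PMD_INIT_LOG, is to use the "z" length modifier:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a config-space offset computed as "pos + <size>".
 * The sum is a size_t, so "%zx" is correct on both 32-bit and 64-bit builds.
 */
static int
format_cap_offset(char *buf, size_t len, uint8_t pos, size_t cap_size)
{
	return snprintf(buf, len, "failed to read pci cap at pos: %zx",
			pos + cap_size);
}
```

Another option is to keep the offset arithmetic in "int" (for example a plain
"pos + 2") and print it with "%x".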


Apart from that,

Reviewed-by: Tiwei Bie 

Thanks!


Re: [dpdk-dev] [PATCH v4 2/2] virtio: fix PCI config err handling

2018-08-27 Thread Tiwei Bie
On Mon, Aug 27, 2018 at 05:52:56PM +0100, Luca Boccassi wrote:
> On Mon, 2018-08-27 at 13:29 +0800, Tiwei Bie wrote:
> > On Fri, Aug 24, 2018 at 06:14:20PM +0100, Luca Boccassi wrote:
> > > From: Brian Russell 
> > > 
> > > In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config
> > > returns
> > > the number of bytes read from PCI config or < 0 on error.
> > > If less than the expected number of bytes are read then log the
> > > failure and return rather than carrying on with garbage.
> > > 
> > > Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
> > > 
> > > Signed-off-by: Brian Russell 
> > > Signed-off-by: Luca Boccassi 
> > > ---
> > > v2: handle additional rte_pci_read_config incomplete reads
> > > v3: do not handle rte_pci_read_config of virtio cap, added in v2,
> > > as it's less clear what the right thing to do there is
> > > v4: do a more robust check - first check what the vendor is, and
> > > skip the cap entirely if it's not what we are looking for.
> > > 
> > >  drivers/net/virtio/virtio_pci.c | 57 -
> > > 
> > >  1 file changed, 42 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/net/virtio/virtio_pci.c
> > > b/drivers/net/virtio/virtio_pci.c
> > > index 6bd22e54a6..cfefa9789b 100644
> > > --- a/drivers/net/virtio/virtio_pci.c
> > > +++ b/drivers/net/virtio/virtio_pci.c
> > > @@ -567,16 +567,30 @@ virtio_read_caps(struct rte_pci_device *dev,
> > > struct virtio_hw *hw)
> > >   }
> > >  
> > >   ret = rte_pci_read_config(dev, &pos, 1,
> > > PCI_CAPABILITY_LIST);
> > > - if (ret < 0) {
> > > - PMD_INIT_LOG(DEBUG, "failed to read pci capability
> > > list");
> > > + if (ret != 1) {
> > > + PMD_INIT_LOG(DEBUG,
> > > +  "failed to read pci capability list,
> > > ret %d", ret);
> > >   return -1;
> > >   }
> > >  
> > >   while (pos) {
> > > + ret = rte_pci_read_config(dev, &cap, 2, pos);
> > > + if (ret != 2) {
> > > + PMD_INIT_LOG(DEBUG,
> > > +  "failed to read pci cap at
> > > pos: %x ret %d",
> > > +  pos, ret);
> > > + break;
> > > + }
> > > + if (cap.cap_vndr != PCI_CAP_ID_MSIX &&
> > > + cap.cap_vndr != PCI_CAP_ID_VNDR) {
> > > + goto next;
> > > + }
> > > +
> > >   ret = rte_pci_read_config(dev, &cap, sizeof(cap),
> > > pos);
> > > - if (ret < 0) {
> > > - PMD_INIT_LOG(ERR,
> > > - "failed to read pci cap at pos:
> > > %x", pos);
> > > + if (ret != sizeof(cap)) {
> > > + PMD_INIT_LOG(DEBUG,
> > > +  "failed to read pci cap at
> > > pos: %x ret %d",
> > > +  pos, ret);
> > >   break;
> > >   }
> > >  
> > 
> > It seems that I didn't make myself clear in my previous
> > comments. I mean it's better to handle MSIX cap and virtio
> > cap respectively in this function. Currently we're always
> > reading them as virtio caps. As we are strictly requiring
> > that _read_config() should return the required number of
> > bytes, it's not perfect to require it to return "virtio
> > cap size" of bytes while we're trying to read a MSIX cap.
> > So please change the code to something similar to this:
> 
> Sorry, I thought you meant in the vtpci_msix_detect function, which I
> changed. Fixed in v5.

Thanks!
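
For reference, the pattern discussed here (read the 2-byte cap header first,
then read only as many bytes as that cap type needs, and insist on the exact
length each time) can be sketched as below. This is a self-contained
illustration, not the driver code: cfg_read() is a hypothetical stand-in for
rte_pci_read_config() working on an in-memory copy of config space, and the
constants mirror the usual PCI values.

```c
#include <stdint.h>
#include <string.h>

#define CFG_SPACE_SIZE  256
#define CAP_LIST_OFFSET 0x34   /* PCI capability list pointer */
#define CAP_ID_MSIX     0x11   /* same value as PCI_CAP_ID_MSIX */
#define MSIX_ENABLE     0x8000 /* same value as PCI_MSIX_ENABLE */

/* Stand-in for rte_pci_read_config(): returns bytes read, or -1 on error. */
static int
cfg_read(const uint8_t *cfg, void *buf, size_t len, size_t offset)
{
	if (offset >= CFG_SPACE_SIZE)
		return -1;
	if (len > CFG_SPACE_SIZE - offset)
		len = CFG_SPACE_SIZE - offset; /* short read at the end */
	memcpy(buf, cfg + offset, len);
	return (int)len;
}

/* Returns 1 if MSI-X is enabled, 0 if present but disabled, -1 otherwise. */
static int
msix_detect(const uint8_t *cfg)
{
	uint8_t pos;

	if (cfg_read(cfg, &pos, 1, CAP_LIST_OFFSET) != 1)
		return -1;

	while (pos) {
		uint8_t hdr[2]; /* [0] = cap ID, [1] = next cap pointer */

		if (cfg_read(cfg, hdr, sizeof(hdr), pos) != (int)sizeof(hdr))
			break;

		if (hdr[0] == CAP_ID_MSIX) {
			uint16_t flags;

			/* Only the 2 flag bytes are needed for an MSI-X cap. */
			if (cfg_read(cfg, &flags, sizeof(flags), pos + 2) !=
					(int)sizeof(flags))
				break;
			return (flags & MSIX_ENABLE) ? 1 : 0;
		}
		pos = hdr[1];
	}
	return -1;
}
```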

> 
> > > @@ -689,25 +703,38 @@ enum virtio_msix_status
> > >  vtpci_msix_detect(struct rte_pci_device *dev)
> > >  {
> > >   uint8_t pos;
> > > - struct virtio_pci_cap cap;
> > >   int ret;
> > >  
> > >   ret = rte_pci_read_config(dev, &pos, 1,
> > > PCI_CAPABILITY_LIST);
> > > - if (ret < 0) {
> > > - PMD_INIT_LOG(DEBUG, "failed to read pci capability
> > > list");
> > > + if (ret != 1) {
> > > + PMD_INIT_LOG(DEBUG,
> > > +  "failed to read pci capability list,
> > > ret %d", ret);
> > >   return VIRTIO_MSIX_NONE;
> > >   }
> > >  
> > >   while (pos) {
> > > - ret = rte_pci_read_config(dev, &cap, sizeof(cap),
> > > pos);
> > > - if (ret < 0) {
> > > - PMD_INIT_LOG(ERR,
> > > - "failed to read pci cap at pos:
> > > %x", pos);
> > > + uint8_t cap[2];
> > > +
> > > + ret = rte_pci_read_config(dev, cap, sizeof(cap),
> > > pos);
> > > + if (ret != sizeof(cap)) {
> > > + PMD_INIT_LOG(DEBUG,
> > > +  "failed to read pci cap at
> > > pos: %x ret %d",
> > > +  pos, ret);
> > >   break;
> > >   }
> > >  
> > > - if (cap.cap_vndr == PCI_CAP_ID_MSIX) {
> > > - uint16_t flags = ((uint16_t *)&cap)[1];
> > > + if (cap[0] == PCI_CAP_ID_MSIX) {
> > > + uint16_t flags;
> > > +
> > > + ret = rte_pci_read_c