Re: [dpdk-dev] [PATCH v1] app/pdump: add check for PCAP PMD

2018-03-06 Thread Varghese, Vipin
Hi Ferruh,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Monday, March 5, 2018 2:33 PM
> To: Varghese, Vipin ; dev@dpdk.org; Pattan,
> Reshma 
> Cc: Mcnamara, John 
> Subject: Re: [dpdk-dev] [PATCH v1] app/pdump: add check for PCAP PMD
> 
> On 3/5/2018 7:57 AM, Vipin Varghese wrote:
> > dpdk-pdump makes use of LIBRTE_PMD_PCAP for interfacing the ring to
> > the device-queue pair. Updating Makefile to check for the same.
> >
> > Signed-off-by: Vipin Varghese 
> > ---
> >  app/pdump/Makefile | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/app/pdump/Makefile b/app/pdump/Makefile
> > index bd3c208..038a34f 100644
> > --- a/app/pdump/Makefile
> > +++ b/app/pdump/Makefile
> > @@ -3,6 +3,10 @@
> >
> >  include $(RTE_SDK)/mk/rte.vars.mk
> >
> > +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),n)
> > +$(error "Please enable CONFIG_RTE_LIBRTE_PMD_PCAP")
> > +endif
> 
> pdump is enabled default, so won't this break the default build?

Yes, you are right, it will fail, which then forces the user to enable PCAP.

> 
> What about moving this to lib/librte_pdump, convert $(error ..) to
> $(warning ..) and disable CONFIG_RTE_LIBRTE_PDUMP there?

If we change it to a warning and there are no PCAP headers on the build
system, the application still gets built, but it will fail at run time
because the pcap API calls fail during execution.

> 
> > +
> >  ifeq ($(CONFIG_RTE_LIBRTE_PDUMP),y)
> >
> >  APP = dpdk-pdump
> >



Re: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately

2018-03-06 Thread Tan, Jianfeng


> -----Original Message-----
> From: Matan Azrad [mailto:ma...@mellanox.com]
> Sent: Tuesday, March 6, 2018 2:08 PM
> To: Tan, Jianfeng; Yigit, Ferruh
> Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> maxime.coque...@redhat.com; Burakov, Anatoly; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> rte_eth_dev_data privately
> 
> Hi Jianfeng
> 
> Please see a comment below.
> 
> > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> > We introduced private rte_eth_dev_data to allow a vdev to be created both
> > in the primary process and in secondary process(es). This is not friendly
> > to the multi-process model; for example, it leads to a port id contention
> > issue if two processes both find that the data entry is free.
> >
> > And to get stats of primary vdev in secondary, we must allocate from the
> > pre-defined array so that we can find it.
> >
> > Suggested-by: Bruce Richardson 
> > Signed-off-by: Jianfeng Tan 
> > ---
> >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++--
> >  drivers/net/kni/rte_eth_kni.c | 13 ++---
> >  drivers/net/null/rte_eth_null.c   | 17 +++--
> >  drivers/net/octeontx/octeontx_ethdev.c| 14 ++
> >  drivers/net/pcap/rte_eth_pcap.c   | 18 +++---
> >  drivers/net/tap/rte_eth_tap.c |  9 +
> >  drivers/net/vhost/rte_eth_vhost.c | 17 ++---
> >  7 files changed, 20 insertions(+), 93 deletions(-)
> >
> > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > b/drivers/net/af_packet/rte_eth_af_packet.c
> > index 57eccfd..2db692f 100644
> > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> > RTE_LOG(ERR, PMD,
> > "%s: no interface specified for AF_PACKET
> > ethdev\n",
> > name);
> > -   goto error_early;
> > +   return -1;
> > }
> >
> > RTE_LOG(INFO, PMD,
> > "%s: creating AF_PACKET-backed ethdev on numa socket
> > %u\n",
> > name, numa_node);
> >
> > -   /*
> > -* now do all data allocation - for eth_dev structure, dummy pci
> > driver
> > -* and internal (private) data
> > -*/
> > -   data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > -   if (data == NULL)
> > -   goto error_early;
> > -
> > *internals = rte_zmalloc_socket(name, sizeof(**internals),
> > 0, numa_node);
> > if (*internals == NULL)
> > -   goto error_early;
> > +   return -1;
> >
> > for (q = 0; q < nb_queues; q++) {
> > (*internals)->rx_queue[q].map = MAP_FAILED; @@ -604,24
> > +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> > RTE_LOG(ERR, PMD,
> > "%s: I/F name too long (%s)\n",
> > name, pair->value);
> > -   goto error_early;
> > +   return -1;
> > }
> > if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> > RTE_LOG(ERR, PMD,
> > "%s: ioctl failed (SIOCGIFINDEX)\n",
> > name);
> > -   goto error_early;
> > +   return -1;
> > }
> > (*internals)->if_name = strdup(pair->value);
> > if ((*internals)->if_name == NULL)
> > -   goto error_early;
> > +   return -1;
> > (*internals)->if_index = ifr.ifr_ifindex;
> >
> > if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> > RTE_LOG(ERR, PMD,
> > "%s: ioctl failed (SIOCGIFHWADDR)\n",
> > name);
> > -   goto error_early;
> > +   return -1;
> > }
> > memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data,
> > ETH_ALEN);
> >
> > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> >
> > (*internals)->nb_queues = nb_queues;
> >
> > -   rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > +   data = (*eth_dev)->data;
> > data->dev_private = *internals;
> > data->nb_rx_queues = (uint16_t)nb_queues;
> > data->nb_tx_queues = (uint16_t)nb_queues;
> > data->dev_link = pmd_link;
> > data->mac_addrs = &(*internals)->eth_addr;
> >
> > -   (*eth_dev)->data = data;
> > (*eth_dev)->dev_ops = &ops;
> >
> > return 0;
> > @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device
> *dev,
> > }
> > free((*internals)->if_name);
> > rte_free(*internals);
> > -error_early:
> > -   rte_free(data);
> > return -1;
> >  }
> >
> 
> I think you should remove the private rte_eth_dev_data freeing in
> rte_pmd_af_packet_remove().
> This is relevant to all the vdevs here.

Ah, yes, you are correct. I will fix that in v2.
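
To illustrate the ownership point raised above: once the ethdev data comes
from the pre-defined array instead of a private allocation, the remove path
may only release what the driver itself allocated. The sketch below is a toy
model and not DPDK code; every name in it (toy_dev, toy_dev_data,
toy_shared_data, toy_probe, toy_remove, TOY_MAX_PORTS) is hypothetical, and it
ignores the locking a real multi-process setup would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_PORTS 4 /* hypothetical size of the pre-defined array */

struct toy_dev_data {
	void *dev_private; /* driver-owned, allocated in probe */
	int in_use;
};

struct toy_dev {
	struct toy_dev_data *data; /* points into toy_shared_data[] */
};

/* Stand-in for the pre-defined data array that all processes can find. */
static struct toy_dev_data toy_shared_data[TOY_MAX_PORTS];

/* Take a free slot from the array; only the private area is allocated. */
static int
toy_probe(struct toy_dev *dev, size_t priv_size)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < TOY_MAX_PORTS; i++) {
		if (toy_shared_data[i].in_use)
			continue;
		toy_shared_data[i].dev_private = calloc(1, priv_size);
		if (toy_shared_data[i].dev_private == NULL)
			return -1;
		toy_shared_data[i].in_use = 1;
		dev->data = &toy_shared_data[i];
		return 0;
	}
	return -1; /* no free slot */
}

/* Release the private area and hand the slot back; never free dev->data. */
static void
toy_remove(struct toy_dev *dev)
{
	assert(dev != NULL && dev->data != NULL);
	free(dev->data->dev_private);
	memset(dev->data, 0, sizeof(*dev->data));
	dev->data = NULL;
}
```

In this model, calling free() on dev->data in toy_remove() would be a bug,
since the slot was never allocated by the driver.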

> 
> Question:
> Does the patch include all the vdevs which allocated private
> rte_eth_dev_data?

Yes, we are removing all privat

Re: [dpdk-dev] [PATCH v2] doc: add driver limitation for vhost dequeue zero copy

2018-03-06 Thread Maxime Coquelin



On 02/27/2018 10:21 AM, Junjie Chen wrote:

In vhost-switch example, when binding nic to vfio-pci, dequeue zero
copy cannot work in VM2NIC mode due to no iommu dma mapping is setup
for guest memory currently.

Signed-off-by: Junjie Chen 
---
Changes in V2:
  - add doc in vhost lib

  doc/guides/prog_guide/vhost_lib.rst | 3 +++
  doc/guides/sample_app_ug/vhost.rst  | 5 -
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst 
b/doc/guides/prog_guide/vhost_lib.rst
index 18227b6..bdf77d6 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -83,6 +83,9 @@ The following is an overview of some key Vhost API functions:
of those segments, thus the fewer the segments, the quicker we will get
the mapping. NOTE: we may speed it by using tree searching in future.
  
+* zero copy does not work when using vfio-pci driver currently, this is

+  because we don't setup iommu dma mapping for guest memory.
+


I guess that it should work with vfio-pci in noiommu mode? Maybe worth
clarifying.


- ``RTE_VHOST_USER_IOMMU_SUPPORT``
  
  IOMMU support will be enabled when this flag is set. It is disabled by

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index a4bdc6a..840c1fd 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -147,7 +147,10 @@ retries on an RX burst, it takes effect only when rx retry 
is enabled. The
  default value is 15.
  
  **--dequeue-zero-copy**

-Dequeue zero copy will be enabled when this option is given.
+Dequeue zero copy will be enabled when this option is given, it is worth to
+note that if NIC is binded to vfio-pci driver, dequeue zero copy cannot work
+at VM2NIC mode (vm2vm=0) due to currently we don't setup iommu dma mapping for
+guest memory.
  
  **--vlan-strip 0|1**

  VLAN strip option is removed, because different NICs have different behaviors



Re: [dpdk-dev] [PATCH] vhost: add note about sockets in server mode

2018-03-06 Thread Maxime Coquelin

Hi Ilya,

On 02/26/2018 09:39 AM, Ilya Maximets wrote:

 From time to time, someone sends patches about unlinking existing
sockets when registering a vhost user in server mode.

A recent example:
http://dpdk.org/ml/archives/dev/2018-February/090025.html

This problem has been discussed many times, and it was made clear that
the library should not unlink files given by the application in order
to avoid possible security problems, such as removing random files
used by other programs.

One of the first discussions:
http://dpdk.org/ml/archives/dev/2015-December/030326.html

To avoid such patches in the future, it was decided to add a comment
that explains what is happening and tries to describe the reasoning.

Signed-off-by: Ilya Maximets 
---

I'm open for suggestions. Wording/grammar fixes are also welcome.

  lib/librte_vhost/socket.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 83befdc..e8584f3 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -318,6 +318,16 @@ vhost_user_start_server(struct vhost_user_socket *vsocket)
int fd = vsocket->socket_fd;
const char *path = vsocket->path;
  
+	/*

+* bind () may fail if the socket file with the same name already
+* exists. But the library obviously should not delete the file
+* provided by the user, since we can not be sure that it is not
+* being used by other applications. Moreover, many applications form
+* socket names based on user input, which is prone to errors.
+*
+* The user must ensure that the socket does not exist before
+* registering the vhost driver in server mode.
+*/
ret = bind(fd, (struct sockaddr *)&vsocket->un, sizeof(vsocket->un));
if (ret < 0) {
RTE_LOG(ERR, VHOST_CONFIG,



Reviewed-by: Maxime Coquelin 

Thanks!
Maxime
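
Since the comment leaves the cleanup to the user, here is a minimal sketch of
what an application might do before registering the vhost driver in server
mode. It is only an illustration: the helper name app_remove_stale_socket is
hypothetical, and the policy (remove the path only if it is a socket) is an
assumption, not something the library prescribes.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Application-side helper (hypothetical): remove a leftover socket file.
 * Returns 0 if the path is absent or a socket file was removed,
 * -1 if the path exists but is not a socket, or on any other error.
 */
static int
app_remove_stale_socket(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (lstat(path, &st) < 0)
		return errno == ENOENT ? 0 : -1;

	if (!S_ISSOCK(st.st_mode)) {
		fprintf(stderr, "%s exists and is not a socket, not removing\n",
			path);
		return -1;
	}
	return unlink(path) < 0 ? -1 : 0;
}
```

Note that this check cannot tell whether another process is still using the
socket, which is exactly why the decision belongs to the application and not
to the library.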


Re: [dpdk-dev] [RFC 1/4] drivers/bus/ifpga:Intel FPGA Bus Lib Code

2018-03-06 Thread Xu, Rosen


-----Original Message-----
From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
Sent: Tuesday, March 06, 2018 14:10
To: Xu, Rosen 
Cc: dev@dpdk.org; Doherty, Declan ; Zhang, Tianfei 

Subject: Re: [dpdk-dev] [RFC 1/4] drivers/bus/ifpga:Intel FPGA Bus Lib Code

Hello Rosen,

I have some initial (and most of them trivial) comments inline...

On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> Signed-off-by: Rosen Xu 
> ---
>  drivers/bus/ifpga/Makefile  |  64 
>  drivers/bus/ifpga/ifpga_bus.c   | 527 
> 
>  drivers/bus/ifpga/ifpga_common.c| 168 +
>  drivers/bus/ifpga/ifpga_common.h|  46 +++
>  drivers/bus/ifpga/ifpga_logs.h  |  59 
>  drivers/bus/ifpga/rte_bus_ifpga.h   | 153 
>  drivers/bus/ifpga/rte_bus_ifpga_version.map |   8 +
>  7 files changed, 1025 insertions(+)
>  create mode 100644 drivers/bus/ifpga/Makefile
>  create mode 100644 drivers/bus/ifpga/ifpga_bus.c
>  create mode 100644 drivers/bus/ifpga/ifpga_common.c
>  create mode 100644 drivers/bus/ifpga/ifpga_common.h
>  create mode 100644 drivers/bus/ifpga/ifpga_logs.h
>  create mode 100644 drivers/bus/ifpga/rte_bus_ifpga.h
>  create mode 100644 drivers/bus/ifpga/rte_bus_ifpga_version.map
>
> diff --git a/drivers/bus/ifpga/Makefile b/drivers/bus/ifpga/Makefile
> new file mode 100644
> index 000..c71f186
> --- /dev/null
> +++ b/drivers/bus/ifpga/Makefile
> @@ -0,0 +1,64 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

As of 18.02, I think all license headers have moved to SPDX tags. In the
formal patch you should probably switch to that.
Rosen: I will modify it.

> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_bus_ifpga.a
> +LIBABIVER := 1
> +EXPORT_MAP := rte_bus_ifpga_version.map
> +
> +ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_DEBUG_INIT),y)

I think this is a copy-paste issue, isn't it?
Rosen: yes
(CONFIG_RTE_LIBRTE_DPAA2_DEBUG_INIT)
I see that you have already enabled dynamic logging, in which case you won't
need this anyway.
Rosen: ok

> +CFLAGS += -O0 -g
> +CFLAGS += "-Wno-error"
> +else
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +endif
> +
> +CFLAGS += -I$(RTE_SDK)/drivers/bus/ifpga
> +CFLAGS += -I$(RTE_SDK)/drivers/bus/pci
> +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
> +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> +#CFLAGS += -I$(RTE_SDK)/lib/librte_rawdev
> +#LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_rawdev
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> +#LDLIBS += -lrte_ethdev
> +
> +VPATH += $(SRCDIR)/base
> +
> +SRCS-y += \
> +ifpga_bus.c \
> +ifpga_common.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
> new file mode 100644
> index 000..382d550
> --- /dev/null
> +++ b/drivers/bus/ifpga/ifpga_bus.c
> @@ -0,0 +1,527 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   Copyright 2013-2014 6WIND S.A.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistri

[dpdk-dev] [PATCH] net/bonding: avoid wrong casting on primary_slave_port_id from input param

2018-03-06 Thread Gowrishankar
From: Gowrishankar Muthukrishnan 

primary_slave_port_id is a uint16_t and must be stored using the same data
type as the input parameter passed from bond_ethdev_configure.

Fixes: f8244c6399 ("ethdev: increase port id range")
Cc: sta...@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan 
---

On powerpc, creating a bond PMD results in the error below due to a wrong
cast on the input parameter. This is reproducible only when using shared
libraries.

sudo -E LD_LIBRARY_PATH=$PWD/$RTE_TARGET/lib $RTE_TARGET/app/testpmd \
  -l 0,8 --socket-mem=1024,1024 \
  --vdev 'net_tap0,iface=dpdktap0' --vdev 'net_tap1,iface=dpdktap1' \
  --vdev 'net_bonding0,mode=1,slave=0,slave=1,primary=0,socket_id=1' \
  -d $RTE_TARGET/lib/librte_pmd_tap.so \
  -d $RTE_TARGET/lib/librte_mempool_ring.so -- --forward-mode=rxonly

Configuring Port 0 (socket 0)
PMD: net_tap0: 0x70a854070280: TX configured queues number: 1
PMD: net_tap0: 0x70a854070280: RX configured queues number: 1
Port 0: 86:EA:6D:52:3E:DB
Configuring Port 1 (socket 0)
PMD: net_tap1: 0x70a854074300: TX configured queues number: 1
PMD: net_tap1: 0x70a854074300: RX configured queues number: 1
Port 1: 42:9A:B8:49:B6:00
Configuring Port 2 (socket 1)
EAL: Failed to set primary slave port 7424 on bonded device net_bonding0
Fail to configure port 2
EAL: Error - exiting with code: 1
  Cause: Start ports failed

 drivers/net/bonding/rte_eth_bond_args.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_args.c 
b/drivers/net/bonding/rte_eth_bond_args.c
index 27d3101..e99681e 100644
--- a/drivers/net/bonding/rte_eth_bond_args.c
+++ b/drivers/net/bonding/rte_eth_bond_args.c
@@ -244,7 +244,7 @@
if (primary_slave_port_id < 0)
return -1;
 
-   *(uint8_t *)extra_args = (uint8_t)primary_slave_port_id;
+   *(uint16_t *)extra_args = (uint16_t)primary_slave_port_id;
 
return 0;
 }
-- 
1.9.1
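
A minimal sketch of why the narrow store matters, assuming the caller passes
the address of a 16-bit variable that has not been initialised (an assumption;
the caller is not shown in this patch). The helper names are hypothetical. For
reference, 7424 is 0x1D00, which would be consistent with a one-byte store of 0
leaving a stale upper byte on a little-endian target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Old behaviour: writes only one byte of the caller's 16-bit variable. */
static void
store_port_u8(void *extra_args, int port_id)
{
	assert(extra_args != NULL);
	*(uint8_t *)extra_args = (uint8_t)port_id;
}

/* Fixed behaviour: writes the whole 16-bit variable. */
static void
store_port_u16(void *extra_args, int port_id)
{
	assert(extra_args != NULL);
	*(uint16_t *)extra_args = (uint16_t)port_id;
}

/* Fill a 16-bit variable with a stale pattern, then store through 'store'. */
static uint16_t
store_over_stale(void (*store)(void *, int), int port_id)
{
	uint16_t primary;

	memset(&primary, 0x1D, sizeof(primary)); /* simulated stack garbage */
	store(&primary, port_id);
	return primary;
}
```

With store_port_u8 one of the two bytes keeps the stale 0x1D, so the result
for port 0 is nonzero; which byte survives depends on the endianness of the
target. With store_port_u16 the result is the port id that was passed in.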



Re: [dpdk-dev] [PATCH] ethdev: return diagnostic when setting MAC address

2018-03-06 Thread Tomasz Duszynski
Hi,

Some comments inline.

On Tue, Feb 27, 2018 at 04:11:29PM +0100, Olivier Matz wrote:
> Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
> return code is added to notify the caller (librte_ether) if an error
> occurred in the PMD.
>
> The new default MAC address is now copied in dev->data->mac_addrs[0]
> only if the operation is successful.
>
> The patch also updates all the PMDs accordingly.
>
> Signed-off-by: Olivier Matz 
> ---
>
> Hi,
>
> This patch follows up on the discussion we had in this thread:
> https://dpdk.org/dev/patchwork/patch/32284/
>
> I did my best to keep the consistency inside the PMDs. The behavior
> of eth_mac_addr_set() is inspired by other functions in the same
> PMD, usually eth_mac_addr_add(). For instance:
> - dpaa and dpaa2 return 0 on error.
> - some PMDs (bnxt, mlx5, ...?) do not return a -errno code (-1 or
>   positive values).
> - some PMDs (avf, tap) check if the address is the same and return 0
>   in that case. This could go in generic code?
>
> I tried to use the following errors when relevant:
> - -EPERM when a VF is not allowed to do a change
> - -ENOTSUP if the function is not supported
> - -EIO if this is an unknown error from lower layer (hw or sdk)
> - -EINVAL for other unknown errors
>
> Please, PMD maintainers, feel free to comment if you have specific
> needs for your driver.
>
> Thanks
> Olivier
>
>
>  doc/guides/rel_notes/deprecation.rst|  8 
>  drivers/net/ark/ark_ethdev.c|  9 ++---
>  drivers/net/avf/avf_ethdev.c| 12 
>  drivers/net/bnxt/bnxt_ethdev.c  | 10 ++
>  drivers/net/bonding/rte_eth_bond_pmd.c  |  8 ++--
>  drivers/net/dpaa/dpaa_ethdev.c  |  4 +++-
>  drivers/net/dpaa2/dpaa2_ethdev.c|  6 --
>  drivers/net/e1000/igb_ethdev.c  | 12 +++-
>  drivers/net/failsafe/failsafe_ops.c | 16 +---
>  drivers/net/i40e/i40e_ethdev.c  | 24 ++-
>  drivers/net/i40e/i40e_ethdev_vf.c   | 12 +++-
>  drivers/net/ixgbe/ixgbe_ethdev.c| 13 -
>  drivers/net/mlx4/mlx4.h |  2 +-
>  drivers/net/mlx4/mlx4_ethdev.c  |  7 +--
>  drivers/net/mlx5/mlx5.h |  2 +-
>  drivers/net/mlx5/mlx5_mac.c |  7 +--
>  drivers/net/mrvl/mrvl_ethdev.c  |  7 ++-
>  drivers/net/null/rte_eth_null.c |  3 ++-
>  drivers/net/octeontx/octeontx_ethdev.c  |  4 +++-
>  drivers/net/qede/qede_ethdev.c  |  7 +++
>  drivers/net/sfc/sfc_ethdev.c| 14 +-
>  drivers/net/szedata2/rte_eth_szedata2.c |  3 ++-
>  drivers/net/tap/rte_eth_tap.c   | 34 
> +
>  drivers/net/virtio/virtio_ethdev.c  | 15 ++-
>  drivers/net/vmxnet3/vmxnet3_ethdev.c|  5 +++--
>  lib/librte_ether/rte_ethdev.c   |  7 +--
>  lib/librte_ether/rte_ethdev_core.h  |  2 +-
>  test/test/virtual_pmd.c |  3 ++-
>  28 files changed, 159 insertions(+), 97 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 74c18ed7c..2bf360f0d 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -134,14 +134,6 @@ Deprecation Notices
>between the VF representor and the VF or the parent PF. Those new fields
>are to be included in ``rte_eth_dev_info`` struct.
>
> -* ethdev: The prototype and the behavior of
> -  ``dev_ops->eth_mac_addr_set()`` will change in v18.05. A return code
> -  will be added to notify the caller if an error occurred in the PMD. In
> -  ``rte_eth_dev_default_mac_addr_set()``, the new default MAC address
> -  will be copied in ``dev->data->mac_addrs[0]`` only if the operation is
> -  successful. This modification will only impact the PMDs, not the
> -  applications.
> -
>  * ethdev: functions add rx/tx callback will return named opaque type
>``rte_eth_add_rx_callback()``, ``rte_eth_add_first_rx_callback()`` and
>``rte_eth_add_tx_callback()`` functions currently return callback object as
> diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
> index ff87c20e2..3fc40cd74 100644
> --- a/drivers/net/ark/ark_ethdev.c
> +++ b/drivers/net/ark/ark_ethdev.c
> @@ -69,7 +69,7 @@ static int eth_ark_dev_set_link_down(struct rte_eth_dev 
> *dev);
>  static int eth_ark_dev_stats_get(struct rte_eth_dev *dev,
> struct rte_eth_stats *stats);
>  static void eth_ark_dev_stats_reset(struct rte_eth_dev *dev);
> -static void eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
> +static int eth_ark_set_default_mac_addr(struct rte_eth_dev *dev,
>struct ether_addr *mac_addr);
>  static int eth_ark_macaddr_add(struct rte_eth_dev *dev,
>  struct ether_addr *mac_addr,
> @@ -887,16 +887,19 @@ eth_ark_macaddr_remove(struct 
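
The behaviour described in the commit message (the driver callback returns a
code, and the default MAC address is copied only on success) can be sketched
as follows. This is a toy model under stated assumptions, not the ethdev code:
toy_dev, toy_mac, toy_default_mac_addr_set and the two example callbacks are
hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct toy_mac {
	uint8_t bytes[6];
};

struct toy_dev;
typedef int (*toy_mac_addr_set_t)(struct toy_dev *dev,
				  const struct toy_mac *addr);

struct toy_dev {
	toy_mac_addr_set_t mac_addr_set; /* NULL if not supported */
	struct toy_mac mac_addrs[1];     /* [0] is the default address */
};

/* Generic layer: update the stored default MAC only if the driver succeeded. */
static int
toy_default_mac_addr_set(struct toy_dev *dev, const struct toy_mac *addr)
{
	int ret;

	assert(dev != NULL && addr != NULL);
	if (dev->mac_addr_set == NULL)
		return -ENOTSUP;

	ret = dev->mac_addr_set(dev, addr);
	if (ret < 0)
		return ret;

	memcpy(&dev->mac_addrs[0], addr, sizeof(*addr));
	return 0;
}

/* Example driver callback that always succeeds. */
static int
toy_set_ok(struct toy_dev *dev, const struct toy_mac *addr)
{
	(void)dev;
	(void)addr;
	return 0;
}

/* Example driver callback for a VF that is not allowed to change its MAC. */
static int
toy_set_vf_denied(struct toy_dev *dev, const struct toy_mac *addr)
{
	(void)dev;
	(void)addr;
	return -EPERM;
}
```

The point of the design is that a failed call leaves mac_addrs[0] untouched,
so the cached address cannot drift from what the hardware actually uses.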

Re: [dpdk-dev] [PATCH v2 1/6] vhost: export vhost feature definitions

2018-03-06 Thread Tan, Jianfeng


> -----Original Message-----
> From: Wang, Zhihong
> Sent: Tuesday, February 13, 2018 5:21 PM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng; Bie, Tiwei; maxime.coque...@redhat.com;
> y...@fridaylinux.org; Liang, Cunming; Wang, Xiao W; Daly, Dan; Wang,
> Zhihong
> Subject: [PATCH v2 1/6] vhost: export vhost feature definitions
> 
> This patch exports vhost-user protocol features to support device driver
> development.
> 
> Signed-off-by: Zhihong Wang 
> ---
>  lib/librte_vhost/rte_vhost.h  |  8 
>  lib/librte_vhost/vhost.h  |  4 +---
>  lib/librte_vhost/vhost_user.c |  9 +
>  lib/librte_vhost/vhost_user.h | 20 +++-
>  4 files changed, 21 insertions(+), 20 deletions(-)
> 
> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
> index d33206997..b05162366 100644
> --- a/lib/librte_vhost/rte_vhost.h
> +++ b/lib/librte_vhost/rte_vhost.h
> @@ -29,6 +29,14 @@ extern "C" {
>  #define RTE_VHOST_USER_DEQUEUE_ZERO_COPY (1ULL << 2)
>  #define RTE_VHOST_USER_IOMMU_SUPPORT (1ULL << 3)
> 
> +#define RTE_VHOST_USER_PROTOCOL_F_MQ 0

Instead of adding an "RTE_" prefix, I would prefer to define it like this:
#ifndef VHOST_USER_PROTOCOL_F_MQ
#define VHOST_USER_PROTOCOL_F_MQ   0
#endif

Similar to other macros.
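
A short sketch of how such guarded definitions would be used by a driver. The
bit numbers are the ones listed in the patch; the two helper functions are
hypothetical and only show the usual (1ULL << bit) test and clear pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Guarded definitions: an earlier definition of the same name wins. */
#ifndef VHOST_USER_PROTOCOL_F_MQ
#define VHOST_USER_PROTOCOL_F_MQ 0
#endif
#ifndef VHOST_USER_PROTOCOL_F_REPLY_ACK
#define VHOST_USER_PROTOCOL_F_REPLY_ACK 3
#endif

/* Hypothetical helper: test one protocol feature bit in a feature mask. */
static int
proto_feature_enabled(uint64_t protocol_features, unsigned int bit)
{
	assert(bit < 64);
	return (protocol_features & (1ULL << bit)) != 0;
}

/* Hypothetical helper: clear one protocol feature bit in a feature mask. */
static uint64_t
proto_feature_clear(uint64_t protocol_features, unsigned int bit)
{
	assert(bit < 64);
	return protocol_features & ~(1ULL << bit);
}
```

The guard keeps the unprefixed names usable from an exported header without
clashing with a header that already defines them, which is the trade-off
against the "RTE_" prefix.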

> +#define RTE_VHOST_USER_PROTOCOL_F_LOG_SHMFD  1
> +#define RTE_VHOST_USER_PROTOCOL_F_RARP   2
> +#define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK  3
> +#define RTE_VHOST_USER_PROTOCOL_F_NET_MTU4
> +#define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ  5
> +#define RTE_VHOST_USER_F_PROTOCOL_FEATURES   30
> +
>  /**
>   * Information relating to memory regions including offsets to
>   * addresses in QEMUs memory file.
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 58aec2e0d..a0b0520e2 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -174,8 +174,6 @@ struct vhost_msg {
>   #define VIRTIO_F_VERSION_1 32
>  #endif
> 
> -#define VHOST_USER_F_PROTOCOL_FEATURES   30
> -
>  /* Features supported by this builtin vhost-user net driver. */
>  #define VIRTIO_NET_SUPPORTED_FEATURES ((1ULL <<
> VIRTIO_NET_F_MRG_RXBUF) | \
>   (1ULL << VIRTIO_F_ANY_LAYOUT) | \
> @@ -185,7 +183,7 @@ struct vhost_msg {
>   (1ULL << VIRTIO_NET_F_MQ)  | \
>   (1ULL << VIRTIO_F_VERSION_1)   | \
>   (1ULL << VHOST_F_LOG_ALL)  | \
> - (1ULL <<
> VHOST_USER_F_PROTOCOL_FEATURES) | \
> + (1ULL <<
> RTE_VHOST_USER_F_PROTOCOL_FEATURES) | \
>   (1ULL << VIRTIO_NET_F_GSO) | \
>   (1ULL << VIRTIO_NET_F_HOST_TSO4) | \
>   (1ULL << VIRTIO_NET_F_HOST_TSO6) | \
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 5c5361066..c93e48e4d 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -527,7 +527,7 @@ vhost_user_set_vring_addr(struct virtio_net **pdev,
> VhostUserMsg *msg)
>   vring_invalidate(dev, vq);
> 
>   if (vq->enabled && (dev->features &
> - (1ULL <<
> VHOST_USER_F_PROTOCOL_FEATURES))) {
> + (1ULL <<
> RTE_VHOST_USER_F_PROTOCOL_FEATURES))) {
>   dev = translate_ring_addresses(dev, msg-
> >payload.addr.index);
>   if (!dev)
>   return -1;
> @@ -897,11 +897,11 @@ vhost_user_set_vring_kick(struct virtio_net
> **pdev, struct VhostUserMsg *pmsg)
>   vq = dev->virtqueue[file.index];
> 
>   /*
> -  * When VHOST_USER_F_PROTOCOL_FEATURES is not negotiated,
> +  * When RTE_VHOST_USER_F_PROTOCOL_FEATURES is not
> negotiated,
>* the ring starts already enabled. Otherwise, it is enabled via
>* the SET_VRING_ENABLE message.
>*/
> - if (!(dev->features & (1ULL <<
> VHOST_USER_F_PROTOCOL_FEATURES)))
> + if (!(dev->features & (1ULL <<
> RTE_VHOST_USER_F_PROTOCOL_FEATURES)))
>   vq->enabled = 1;
> 
>   if (vq->kickfd >= 0)
> @@ -1012,7 +1012,8 @@ vhost_user_get_protocol_features(struct
> virtio_net *dev,
>* Qemu versions (from v2.7.0 to v2.9.0).
>*/
>   if (!(features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
> - protocol_features &= ~(1ULL <<
> VHOST_USER_PROTOCOL_F_REPLY_ACK);
> + protocol_features &=
> + ~(1ULL <<
> RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK);
> 
>   msg->payload.u64 = protocol_features;
>   msg->size = sizeof(msg->payload.u64);
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index 0fafbe6e0..066e772dd 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -14,19 +14,13 @@
> 
>  #define VHOST_MEMORY_MAX_NREGIONS 8
> 
> -#define VHOST_USER_PROTOCOL_F_MQ 0
> -#define VHOST_USER_PROTOCOL_

Re: [dpdk-dev] [PATCH v2 6/6] vhost: export new apis

2018-03-06 Thread Tan, Jianfeng


> -----Original Message-----
> From: Wang, Zhihong
> Sent: Tuesday, February 13, 2018 5:21 PM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng; Bie, Tiwei; maxime.coque...@redhat.com;
> y...@fridaylinux.org; Liang, Cunming; Wang, Xiao W; Daly, Dan; Wang,
> Zhihong
> Subject: [PATCH v2 6/6] vhost: export new apis
> 
> This patch exports new APIs as experimental.

How about squashing this patch into patch 2, where the APIs are introduced,
together with the related doc update?

Thanks,
Jianfeng
 
> 
> Signed-off-by: Zhihong Wang 
> ---
>  lib/librte_vhost/rte_vdpa.h| 16 +++-
>  lib/librte_vhost/rte_vhost.h   | 33 ++---
>  lib/librte_vhost/rte_vhost_version.map | 19 +++
>  3 files changed, 52 insertions(+), 16 deletions(-)
> 
> diff --git a/lib/librte_vhost/rte_vdpa.h b/lib/librte_vhost/rte_vdpa.h
> index 1bde36f7f..23fb471be 100644
> --- a/lib/librte_vhost/rte_vdpa.h
> +++ b/lib/librte_vhost/rte_vdpa.h
> @@ -100,15 +100,21 @@ extern struct rte_vdpa_engine *vdpa_engines[];
>  extern uint32_t vdpa_engine_num;
> 
>  /* engine management */
> -int rte_vdpa_register_engine(const char *name, struct rte_vdpa_eng_addr
> *addr);
> -int rte_vdpa_unregister_engine(int eid);
> +int __rte_experimental
> +rte_vdpa_register_engine(const char *name, struct rte_vdpa_eng_addr
> *addr);
> 
> -int rte_vdpa_find_engine_id(struct rte_vdpa_eng_addr *addr);
> +int __rte_experimental
> +rte_vdpa_unregister_engine(int eid);
> 
> -int rte_vdpa_info_query(int eid, struct rte_vdpa_eng_attr *attr);
> +int __rte_experimental
> +rte_vdpa_find_engine_id(struct rte_vdpa_eng_addr *addr);
> +
> +int __rte_experimental
> +rte_vdpa_info_query(int eid, struct rte_vdpa_eng_attr *attr);
> 
>  /* driver register api */
> -void rte_vdpa_register_driver(struct rte_vdpa_eng_driver *drv);
> +void __rte_experimental
> +rte_vdpa_register_driver(struct rte_vdpa_eng_driver *drv);
> 
>  #define RTE_VDPA_REGISTER_DRIVER(nm, drv) \
>  RTE_INIT(vdpainitfn_ ##nm); \
> diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
> index 48005d9ff..d5589c543 100644
> --- a/lib/librte_vhost/rte_vhost.h
> +++ b/lib/librte_vhost/rte_vhost.h
> @@ -187,7 +187,8 @@ int rte_vhost_driver_unregister(const char *path);
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_driver_set_vdpa_eid(const char *path, int eid);
> +int __rte_experimental
> +rte_vhost_driver_set_vdpa_eid(const char *path, int eid);
> 
>  /**
>   * Set the device id, enforce single connection per socket
> @@ -199,7 +200,8 @@ int rte_vhost_driver_set_vdpa_eid(const char *path,
> int eid);
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_driver_set_vdpa_did(const char *path, int did);
> +int __rte_experimental
> +rte_vhost_driver_set_vdpa_did(const char *path, int did);
> 
>  /**
>   * Get the engine id
> @@ -209,7 +211,8 @@ int rte_vhost_driver_set_vdpa_did(const char *path,
> int did);
>   * @return
>   *  Engine id, -1 on failure
>   */
> -int rte_vhost_driver_get_vdpa_eid(const char *path);
> +int __rte_experimental
> +rte_vhost_driver_get_vdpa_eid(const char *path);
> 
>  /**
>   * Get the device id
> @@ -219,7 +222,8 @@ int rte_vhost_driver_get_vdpa_eid(const char *path);
>   * @return
>   *  Device id, -1 on failure
>   */
> -int rte_vhost_driver_get_vdpa_did(const char *path);
> +int __rte_experimental
> +rte_vhost_driver_get_vdpa_did(const char *path);
> 
>  /**
>   * Set the feature bits the vhost-user driver supports.
> @@ -286,7 +290,8 @@ int rte_vhost_driver_get_features(const char *path,
> uint64_t *features);
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_driver_get_protocol_features(const char *path,
> +int __rte_experimental
> +rte_vhost_driver_get_protocol_features(const char *path,
>   uint64_t *protocol_features);
> 
>  /**
> @@ -299,7 +304,8 @@ int rte_vhost_driver_get_protocol_features(const
> char *path,
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_driver_get_queue_num(const char *path, uint32_t
> *queue_num);
> +int __rte_experimental
> +rte_vhost_driver_get_queue_num(const char *path, uint32_t
> *queue_num);
> 
>  /**
>   * Get the feature bits after negotiation
> @@ -523,7 +529,8 @@ uint32_t rte_vhost_rx_queue_count(int vid, uint16_t
> qid);
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_get_log_base(int vid, uint64_t *log_base,
> +int __rte_experimental
> +rte_vhost_get_log_base(int vid, uint64_t *log_base,
>   uint64_t *log_size);
> 
>  /**
> @@ -540,7 +547,8 @@ int rte_vhost_get_log_base(int vid, uint64_t
> *log_base,
>   * @return
>   *  0 on success, -1 on failure
>   */
> -int rte_vhost_get_vring_base(int vid, uint16_t queue_id,
> +int __rte_experimental
> +rte_vhost_get_vring_base(int vid, uint16_t queue_id,
>   uint16_t *last_avail_idx, uint16_t *last_used_idx);
> 
>  /**
> @@ -557,7 +565,8 @@ int rte_vhost_get_vring_base(int vid, uint16_t

Re: [dpdk-dev] OPDL and 18.02 Release Notes

2018-03-06 Thread Mccarthy, Peter
The O stands for "Optimized"; we will make the necessary changes to remove
the inconsistencies.

Regards
Peter

-----Original Message-----
From: Yigit, Ferruh 
Sent: Monday, March 5, 2018 5:58 PM
To: Rosen, Rami ; dev@dpdk.org
Cc: tho...@monjalon.net; Ma, Liang J ; Mccarthy, Peter 

Subject: Re: [dpdk-dev] OPDL and 18.02 Release Notes

On 2/9/2018 12:08 AM, Rosen, Rami wrote:
> Hi all,
> Following the recent announcement of DPDK 18.02-RC4, I went over
> 18.02 release notes and I have this minor query which I am not sure about:
> In the release notes:
> http://dpdk.org/doc/guides/rel_notes/release_18_02.html
> we have the following:
> ...
> The OPDL (Ordered Packet Distribution Library) eventdev ...
> 
> While in http://dpdk.org/dev/roadmap
> We have:
> 
> eventdev optimized packet distribution library (OPDL) driver ...
> 
> So I am not sure about this inconsistency: should it be "optimized" or
> "ordered"?

According to the driver documentation (doc/guides/eventdevs/opdl.rst) it is
"Ordered Packet Distribution Library", so the release notes seem correct.

cc'ed maintainers.

> 
> Regards,
> Rami Rosen
> 
> 


Re: [dpdk-dev] [RFC 1/4] drivers/bus/ifpga:Intel FPGA Bus Lib Code

2018-03-06 Thread Gaëtan Rivet
Hi Rosen,

A few comments inline.
(I will skip elements already pointed out by Shreyansh.)

On Tue, Mar 06, 2018 at 09:43:55AM +0800, Rosen Xu wrote:
> Signed-off-by: Rosen Xu 
> ---
>  drivers/bus/ifpga/Makefile  |  64 
>  drivers/bus/ifpga/ifpga_bus.c   | 527 
> 
>  drivers/bus/ifpga/ifpga_common.c| 168 +
>  drivers/bus/ifpga/ifpga_common.h|  46 +++
>  drivers/bus/ifpga/ifpga_logs.h  |  59 
>  drivers/bus/ifpga/rte_bus_ifpga.h   | 153 
>  drivers/bus/ifpga/rte_bus_ifpga_version.map |   8 +
>  7 files changed, 1025 insertions(+)
>  create mode 100644 drivers/bus/ifpga/Makefile
>  create mode 100644 drivers/bus/ifpga/ifpga_bus.c
>  create mode 100644 drivers/bus/ifpga/ifpga_common.c
>  create mode 100644 drivers/bus/ifpga/ifpga_common.h
>  create mode 100644 drivers/bus/ifpga/ifpga_logs.h
>  create mode 100644 drivers/bus/ifpga/rte_bus_ifpga.h
>  create mode 100644 drivers/bus/ifpga/rte_bus_ifpga_version.map
> 
> diff --git a/drivers/bus/ifpga/Makefile b/drivers/bus/ifpga/Makefile
> new file mode 100644
> index 000..c71f186
> --- /dev/null
> +++ b/drivers/bus/ifpga/Makefile
> @@ -0,0 +1,64 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_bus_ifpga.a
> +LIBABIVER := 1
> +EXPORT_MAP := rte_bus_ifpga_version.map
> +
> +ifeq ($(CONFIG_RTE_LIBRTE_DPAA2_DEBUG_INIT),y)
> +CFLAGS += -O0 -g
> +CFLAGS += "-Wno-error"
> +else
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +endif
> +
> +CFLAGS += -I$(RTE_SDK)/drivers/bus/ifpga
> +CFLAGS += -I$(RTE_SDK)/drivers/bus/pci
> +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
> +CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> +#CFLAGS += -I$(RTE_SDK)/lib/librte_rawdev
> +#LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_rawdev
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> +#LDLIBS += -lrte_ethdev
> +
> +VPATH += $(SRCDIR)/base
> +
> +SRCS-y += \
> +ifpga_bus.c \
> +ifpga_common.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
> new file mode 100644
> index 000..382d550
> --- /dev/null
> +++ b/drivers/bus/ifpga/ifpga_bus.c
> @@ -0,0 +1,527 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   Copyright 2013-2014 6WIND S.A.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior wri

Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus Second Scan, it should be scanned after PCI Bus

2018-03-06 Thread Xu, Rosen


-----Original Message-----
From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
Sent: Tuesday, March 06, 2018 14:20
To: Xu, Rosen 
Cc: dev@dpdk.org; Doherty, Declan ; Zhang, Tianfei 

Subject: Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus 
Second Scan, it should be scanned after PCI Bus

On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> Signed-off-by: Rosen Xu 
> ---
>  lib/librte_eal/common/eal_common_bus.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/common/eal_common_bus.c 
> b/lib/librte_eal/common/eal_common_bus.c
> index 3e022d5..74bfa15 100644
> --- a/lib/librte_eal/common/eal_common_bus.c
> +++ b/lib/librte_eal/common/eal_common_bus.c
> @@ -70,15 +70,27 @@ struct rte_bus_list rte_bus_list =
>  rte_bus_scan(void)
>  {
> int ret;
> -   struct rte_bus *bus = NULL;
> +   struct rte_bus *bus = NULL, *ifpga_bus = NULL;
>
> TAILQ_FOREACH(bus, &rte_bus_list, next) {
> +   if (!strcmp(bus->name, "ifpga")) {
> +   ifpga_bus = bus;
> +   continue;
> +   }
> +
> ret = bus->scan();
> if (ret)
> RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> bus->name);
> }
>
> +   if (ifpga_bus) {
> +   ret = ifpga_bus->scan();
> +   if (ret)
> +   RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> +   ifpga_bus->name);
> +   }
> +

You are doing this just so that PCI scans are completed *before* ifpga scans?
Rosen: yes
Well, I understand that this certainly is an issue that we can't yet define a 
priority ordering of bus scans.

But I think what you require is something simpler:

In the file ifpga_bus.c:

+RTE_REGISTER_BUS(IFPGA_BUS_NAME, rte_ifpga_bus.bus); <== this
...
...
#define RTE_REGISTER_BUS(nm, bus) \
RTE_INIT_PRIO(businitfn_ ##nm, 110); \

If you define your own version of RTE_REGISTER_BUS with the priority number 
higher, it would be inserted later in the bus list.
rte_register_bus doesn't do any inherent ordering.
This would save the changes you are doing in the 
lib/librte_eal/common/eal_common_bus.c file.

But I think there has to be a better provision of defining priority of bus 
scans - I am sure when new devices come in, there would be possibility of 
dependencies as in your case.
Rosen: Is priority ordering of bus scans implemented?

> return 0;
>  }
>
> --
> 1.8.3.1
>


[dpdk-dev] [PATCH 1/3] vhost: do not generate signal when sendmsg fails

2018-03-06 Thread Tiwei Bie
Signed-off-by: Tiwei Bie 
---
 lib/librte_vhost/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 0354740fa..d703d2114 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -181,7 +181,7 @@ send_fd_message(int sockfd, char *buf, int buflen, int 
*fds, int fd_num)
}
 
do {
-   ret = sendmsg(sockfd, &msgh, 0);
+   ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
} while (ret < 0 && errno == EINTR);
 
if (ret < 0) {
-- 
2.11.0
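
For readers of the archive, a minimal sketch of what the flag changes, assuming a Linux AF_UNIX stream socket: without MSG_NOSIGNAL, sending to a peer that has closed its end raises SIGPIPE, whose default action terminates the process; with the flag the call simply fails with EPIPE. The helper name send_no_sigpipe() is hypothetical and is not part of librte_vhost.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not from the patch): send one buffer and report a
 * broken connection through the return value instead of through SIGPIPE.
 * Returns the number of bytes sent, or -1 with errno set.
 */
static int
send_no_sigpipe(int sockfd, const void *buf, size_t len)
{
	struct iovec iov;
	struct msghdr msgh;
	ssize_t ret;

	memset(&msgh, 0, sizeof(msgh));
	iov.iov_base = (void *)buf;
	iov.iov_len = len;
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	/* Retry only when the call was interrupted by a signal handler. */
	do {
		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
	} while (ret < 0 && errno == EINTR);

	return ret < 0 ? -1 : (int)ret;
}
```

A caller can then treat EPIPE like any other send error and close the connection, rather than installing a process-wide SIGPIPE handler.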



[dpdk-dev] [PATCH 2/3] vhost: support sending fds via send_vhost_message()

2018-03-06 Thread Tiwei Bie
This function will be used to send fds to QEMU via the slave channel.

Signed-off-by: Tiwei Bie 
---
 lib/librte_vhost/vhost_user.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 8b07b6c43..e3a1dfbfb 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1308,13 +1308,13 @@ read_vhost_message(int sockfd, struct VhostUserMsg *msg)
 }
 
 static int
-send_vhost_message(int sockfd, struct VhostUserMsg *msg)
+send_vhost_message(int sockfd, struct VhostUserMsg *msg, int *fds, int fd_num)
 {
if (!msg)
return 0;
 
return send_fd_message(sockfd, (char *)msg,
-   VHOST_USER_HDR_SIZE + msg->size, NULL, 0);
+   VHOST_USER_HDR_SIZE + msg->size, fds, fd_num);
 }
 
 static int
@@ -1328,7 +1328,7 @@ send_vhost_reply(int sockfd, struct VhostUserMsg *msg)
msg->flags |= VHOST_USER_VERSION;
msg->flags |= VHOST_USER_REPLY_MASK;
 
-   return send_vhost_message(sockfd, msg);
+   return send_vhost_message(sockfd, msg, NULL, 0);
 }
 
 /*
@@ -1643,7 +1643,7 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t 
iova, uint8_t perm)
},
};
 
-   ret = send_vhost_message(dev->slave_req_fd, &msg);
+   ret = send_vhost_message(dev->slave_req_fd, &msg, NULL, 0);
if (ret < 0) {
RTE_LOG(ERR, VHOST_CONFIG,
"Failed to send IOTLB miss message (%d)\n",
-- 
2.11.0
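
As background for the new fds/fd_num parameters, here is a rough sketch of how file descriptors are usually passed over a Unix domain socket with an SCM_RIGHTS control message. It is not the librte_vhost implementation: send_with_fds() and SKETCH_MAX_FDS are hypothetical names, and the limit of 8 descriptors is an assumption made for the example.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define SKETCH_MAX_FDS 8 /* hypothetical limit, chosen for this sketch */

/*
 * Hypothetical helper: send a payload and, optionally, fd_num file
 * descriptors as SCM_RIGHTS ancillary data.
 * Returns the number of payload bytes sent, or -1 on error.
 */
static int
send_with_fds(int sockfd, const void *buf, size_t len,
	      const int *fds, int fd_num)
{
	/* The union keeps the control buffer aligned for struct cmsghdr. */
	union {
		char buf[CMSG_SPACE(SKETCH_MAX_FDS * sizeof(int))];
		struct cmsghdr align;
	} ctl;
	struct iovec iov;
	struct msghdr msgh;
	struct cmsghdr *cmsg;
	ssize_t ret;

	if (fd_num < 0 || fd_num > SKETCH_MAX_FDS)
		return -1;

	memset(&msgh, 0, sizeof(msgh));
	iov.iov_base = (void *)buf;
	iov.iov_len = len;
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	if (fds != NULL && fd_num > 0) {
		size_t fdsize = (size_t)fd_num * sizeof(int);

		memset(&ctl, 0, sizeof(ctl));
		msgh.msg_control = ctl.buf;
		msgh.msg_controllen = CMSG_SPACE(fdsize);
		cmsg = CMSG_FIRSTHDR(&msgh);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		cmsg->cmsg_len = CMSG_LEN(fdsize);
		memcpy(CMSG_DATA(cmsg), fds, fdsize);
	}

	do {
		ret = sendmsg(sockfd, &msgh, MSG_NOSIGNAL);
	} while (ret < 0 && errno == EINTR);

	return ret < 0 ? -1 : (int)ret;
}
```

Passing NULL and 0 for the last two arguments sends a plain message, which mirrors how the existing callers in the patch are updated.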



[dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator

2018-03-06 Thread Tiwei Bie
This commit adds VFIO based accelerator support to
vhost. A new API is provided to ask QEMU to do further
setup so that notifications and interrupts can be
delivered directly between the driver in the guest
and the vDPA device in the host.

Signed-off-by: Tiwei Bie 
---
 lib/librte_vhost/rte_vhost.h   |  28 ++
 lib/librte_vhost/rte_vhost_version.map |   1 +
 lib/librte_vhost/vhost_user.c  | 166 +
 lib/librte_vhost/vhost_user.h  |   9 ++
 4 files changed, 204 insertions(+)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index d5589c543..68842e908 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -35,6 +35,7 @@ extern "C" {
 #define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK3
 #define RTE_VHOST_USER_PROTOCOL_F_NET_MTU  4
 #define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ5
+#define RTE_VHOST_USER_PROTOCOL_F_VFIO 8
 #define RTE_VHOST_USER_F_PROTOCOL_FEATURES 30
 
 /**
@@ -591,6 +592,33 @@ rte_vhost_get_vdpa_eid(int vid);
 int __rte_experimental
 rte_vhost_get_vdpa_did(int vid);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Enable or disable the VFIO based accelerator for vhost-user.
+ *
+ * This function is to ask QEMU to do further setup to better
+ * support the vDPA device at vhost user backend. With this
+ * setup, the notifications and interrupts will be delivered
+ * directly between the driver in guest and the vDPA device
+ * in host if platform supports e.g. EPT and Posted interrupt.
+ * It's nice to have, and not mandatory.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param enable
+ *  Enable or disable
+ *
+ * @return
+ *   0: success
+ *   -ENODEV: no such vhost device
+ *   -ENOTSUP: device does not support VFIO based accelerator feature
+ *   -EINVAL: there is no accelerator assigned to this vhost device
+ *   -EFAULT: failed to talk with QEMU
+ */
+int rte_vhost_vfio_accelerator_ctrl(int vid, int enable);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_vhost/rte_vhost_version.map 
b/lib/librte_vhost/rte_vhost_version.map
index 36257e51b..ca970170f 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -72,6 +72,7 @@ EXPERIMENTAL {
rte_vhost_set_vring_base;
rte_vhost_get_vdpa_eid;
rte_vhost_get_vdpa_did;
+   rte_vhost_vfio_accelerator_ctrl;
rte_vdpa_register_engine;
rte_vdpa_unregister_engine;
rte_vdpa_find_engine_id;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index e3a1dfbfb..a65598d80 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iotlb.h"
 #include "vhost.h"
@@ -1628,6 +1629,27 @@ vhost_user_msg_handler(int vid, int fd)
return 0;
 }
 
+static int process_slave_message_reply(struct virtio_net *dev,
+  const VhostUserMsg *msg)
+{
+   VhostUserMsg msg_reply;
+
+   if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
+   return 0;
+
+   if (read_vhost_message(dev->slave_req_fd, &msg_reply) < 0)
+   return -1;
+
+   if (msg_reply.request.slave != msg->request.slave) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "received unexpected msg type (%u), expected %u\n",
+   msg_reply.request.slave, msg->request.slave);
+   return -1;
+   }
+
+   return msg_reply.payload.u64;
+}
+
 int
 vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
 {
@@ -1653,3 +1675,147 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t 
iova, uint8_t perm)
 
return 0;
 }
+
+static int vhost_user_slave_set_vring_file(struct virtio_net *dev,
+  uint32_t request,
+  struct vhost_vring_file *file)
+{
+   int *fdp = NULL;
+   size_t fd_num = 0;
+   int ret;
+   struct VhostUserMsg msg = {
+   .request.slave = request,
+   .flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
+   .payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK,
+   .size = sizeof(msg.payload.u64),
+   };
+
+   if (file->fd < 0)
+   msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
+   else {
+   fdp = &file->fd;
+   fd_num = 1;
+   }
+
+   ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
+   if (ret < 0) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "Failed to send slave message %u (%d)\n",
+   request, ret);
+   return ret;
+   }
+
+   return process_slave_message_reply(dev, &msg);
+}
+
+static int vhost_user_slave_set_vring_notify_area(struct virtio_net *dev,
+   

[dpdk-dev] [PATCH 0/3] Extend vhost to support VFIO based accelerator

2018-03-06 Thread Tiwei Bie
This patch set introduces VFIO based accelerator support
for vhost. This is a new vhost-user protocol feature to better
support the vDPA device at the vhost-user backend. It allows
interrupts and notifications to be delivered directly between
the driver in the guest and the device in the host.

Dependencies:

1. This patch set depends on the below patch set for QEMU:

http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06028.html

Some of the enum definitions in this patch set have been
updated for the latest QEMU. A new patch set for QEMU will
be sent out later.

2. This patch set depends on Zhihong's "selective datapath"
   patch set:

http://dpdk.org/ml/archives/dev/2018-March/091858.html

This patch set is generated on the latest master branch of
dpdk-next-virtio with Zhihong's patches applied.

Best regards,
Tiwei Bie

Tiwei Bie (3):
  vhost: do not generate signal when sendmsg fails
  vhost: support sending fds via send_vhost_message()
  vhost: support VFIO based accelerator

 lib/librte_vhost/rte_vhost.h   |  28 ++
 lib/librte_vhost/rte_vhost_version.map |   1 +
 lib/librte_vhost/socket.c  |   2 +-
 lib/librte_vhost/vhost_user.c  | 174 -
 lib/librte_vhost/vhost_user.h  |   9 ++
 5 files changed, 209 insertions(+), 5 deletions(-)

-- 
2.11.0



Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus Second Scan, it should be scanned after PCI Bus

2018-03-06 Thread Gaëtan Rivet
On Tue, Mar 06, 2018 at 10:42:14AM +, Xu, Rosen wrote:
> 
> 
> -Original Message-
> From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
> Sent: Tuesday, March 06, 2018 14:20
> To: Xu, Rosen 
> Cc: dev@dpdk.org; Doherty, Declan ; Zhang, Tianfei 
> 
> Subject: Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus 
> Second Scan, it should be scanned after PCI Bus
> 
> On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> > Signed-off-by: Rosen Xu 
> > ---
> >  lib/librte_eal/common/eal_common_bus.c | 14 +-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_eal/common/eal_common_bus.c 
> > b/lib/librte_eal/common/eal_common_bus.c
> > index 3e022d5..74bfa15 100644
> > --- a/lib/librte_eal/common/eal_common_bus.c
> > +++ b/lib/librte_eal/common/eal_common_bus.c
> > @@ -70,15 +70,27 @@ struct rte_bus_list rte_bus_list =
> >  rte_bus_scan(void)
> >  {
> > int ret;
> > -   struct rte_bus *bus = NULL;
> > +   struct rte_bus *bus = NULL, *ifpga_bus = NULL;
> >
> > TAILQ_FOREACH(bus, &rte_bus_list, next) {
> > +   if (!strcmp(bus->name, "ifpga")) {
> > +   ifpga_bus = bus;
> > +   continue;
> > +   }
> > +
> > ret = bus->scan();
> > if (ret)
> > RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > bus->name);
> > }
> >
> > +   if (ifpga_bus) {
> > +   ret = ifpga_bus->scan();
> > +   if (ret)
> > +   RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > +   ifpga_bus->name);
> > +   }
> > +
> 
> You are doing this just so that PCI scans are completed *before* ifpga scans?
> Rosen: yes
> Well, I understand that this certainly is an issue that we can't yet define a 
> priority ordering of bus scans.
> 
> But I think what you require is something simpler:
> 
> In the file ifpga_bus.c:
> 
> +RTE_REGISTER_BUS(IFPGA_BUS_NAME, rte_ifpga_bus.bus); <== this
> ...
> ...
> #define RTE_REGISTER_BUS(nm, bus) \
> RTE_INIT_PRIO(businitfn_ ##nm, 110); \
> 
> If you define your own version of RTE_REGISTER_BUS with the priority number 
> higher, it would be inserted later in the bus list.
> rte_register_bus doesn't do any inherent ordering.
> This would save the changes you are doing in the 
> lib/librte_eal/common/eal_common_bus.c file.
> 
> But I think there has to be a better provision of defining priority of bus 
> scans - I am sure when new devices come in, there would be possibility of 
> dependencies as in your case.
> Rosen: Is priority ordering of bus scans implemented?

No, there is no priority set for scanning order.
However, the order in which buses are registered determines the order
in which scans are done.

Thus, if you change the priority of your registration, you should be
able to ensure that your scan comes last.
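
To illustrate the point, a small sketch under stated assumptions: it uses the GCC/Clang constructor attribute directly, on the assumption that this is what the RTE_INIT_PRIO macro quoted above expands to, and a hand-rolled list in place of rte_bus_list. All names (sketch_bus, sketch_register_bus, the priority 120) are hypothetical; only the 110 comes from the quoted macro. Constructors with a lower priority number run first, so the bus registered with the higher number lands later in the list and is scanned later.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct rte_bus and rte_bus_list. */
struct sketch_bus {
	const char *name;
	struct sketch_bus *next;
};

static struct sketch_bus *bus_head;
static struct sketch_bus **bus_tail = &bus_head;

/* Append at the tail: list order equals registration order. */
static void
sketch_register_bus(struct sketch_bus *bus)
{
	bus->next = NULL;
	*bus_tail = bus;
	bus_tail = &bus->next;
}

static struct sketch_bus sketch_ifpga_bus = { .name = "ifpga" };
static struct sketch_bus sketch_pci_bus = { .name = "pci" };

/* Defined first in the source, but priority 120 runs after 110. */
__attribute__((constructor(120))) static void
sketch_init_ifpga(void)
{
	sketch_register_bus(&sketch_ifpga_bus);
}

__attribute__((constructor(110))) static void
sketch_init_pci(void)
{
	sketch_register_bus(&sketch_pci_bus);
}

/* Return the name of the idx-th registered bus, or NULL if out of range. */
static const char *
sketch_bus_name_at(size_t idx)
{
	struct sketch_bus *bus = bus_head;

	while (bus != NULL && idx-- > 0)
		bus = bus->next;
	return bus != NULL ? bus->name : NULL;
}
```

With this ordering a loop over the list visits "pci" before "ifpga" without any special case in the scan function itself.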

> 
> > return 0;
> >  }
> >
> > --
> > 1.8.3.1
> >

-- 
Gaëtan Rivet
6WIND


[dpdk-dev] Anyone who can help?

2018-03-06 Thread wang.yong19
Hi,
I ran into a problem when using git to fetch code from dpdk.org. I have never
seen this before.
Does anyone know what is going on?

[root@localhost dpdk]# git pull
fatal: unable to access 'http://dpdk.org/git/dpdk/': The requested URL returned error: 502

[root@localhost wangyong]# git clone http://dpdk.org/git/dpdk
Cloning into 'dpdk'...
fatal: unable to access 'http://dpdk.org/git/dpdk/': The requested URL returned error: 502

Re: [dpdk-dev] [PATCH v4 2/5] eal: use file to check if secondary process is ready

2018-03-06 Thread Burakov, Anatoly

On 02-Mar-18 3:14 PM, Anatoly Burakov wrote:

Previously, IPC would remove sockets it considers to be "inactive"
based on whether they have responded. We also need to prevent
sending messages to processes that are active, but haven't yet
finished initialization.

This will create an "init file" per socket which will be removed
after initialization is complete, to prevent primary process from
sending messages to a process that hasn't finished its
initialization.

Signed-off-by: Anatoly Burakov 
---


Self-NACK on this patch. Secondary processes may initialize data 
structures, which means IPC has to be active during init. Each subsystem 
will therefore have to synchronize access to IPC on its own. (For 
example, memory hotplug will only block IPC for a short period between 
rte_config_init() and memory/heap init.)



--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH 00/41] Memory Hotplug for DPDK

2018-03-06 Thread Burakov, Anatoly

On 03-Mar-18 1:45 PM, Anatoly Burakov wrote:

This patchset introduces dynamic memory allocation for DPDK (aka memory
hotplug). Based upon RFC submitted in December [1].


For those testing this patch, there's a deadlock-at-startup issue when 
DPDK is started with no memory. This will be fixed in v2 (as well as 
dependent IPC patches), but for now the workaround is to start DPDK with 
-m/--socket-mem switches.


--
Thanks,
Anatoly


Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus Second Scan, it should be scanned after PCI Bus

2018-03-06 Thread Bruce Richardson
On Tue, Mar 06, 2018 at 11:46:22AM +0100, Gaëtan Rivet wrote:
> On Tue, Mar 06, 2018 at 10:42:14AM +, Xu, Rosen wrote:
> > 
> > 
> > -Original Message-
> > From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
> > Sent: Tuesday, March 06, 2018 14:20
> > To: Xu, Rosen 
> > Cc: dev@dpdk.org; Doherty, Declan ; Zhang, 
> > Tianfei 
> > Subject: Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus 
> > Second Scan, it should be scanned after PCI Bus
> > 
> > On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> > > Signed-off-by: Rosen Xu 
> > > ---
> > >  lib/librte_eal/common/eal_common_bus.c | 14 +-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_bus.c 
> > > b/lib/librte_eal/common/eal_common_bus.c
> > > index 3e022d5..74bfa15 100644
> > > --- a/lib/librte_eal/common/eal_common_bus.c
> > > +++ b/lib/librte_eal/common/eal_common_bus.c
> > > @@ -70,15 +70,27 @@ struct rte_bus_list rte_bus_list =
> > >  rte_bus_scan(void)
> > >  {
> > > int ret;
> > > -   struct rte_bus *bus = NULL;
> > > +   struct rte_bus *bus = NULL, *ifpga_bus = NULL;
> > >
> > > TAILQ_FOREACH(bus, &rte_bus_list, next) {
> > > +   if (!strcmp(bus->name, "ifpga")) {
> > > +   ifpga_bus = bus;
> > > +   continue;
> > > +   }
> > > +
> > > ret = bus->scan();
> > > if (ret)
> > > RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > > bus->name);
> > > }
> > >
> > > +   if (ifpga_bus) {
> > > +   ret = ifpga_bus->scan();
> > > +   if (ret)
> > > +   RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > > +   ifpga_bus->name);
> > > +   }
> > > +
> > 
> > You are doing this just so that PCI scans are completed *before* ifpga 
> > scans?
> > Rosen: yes
> > Well, I understand that this certainly is an issue that we can't yet define 
> > a priority ordering of bus scans.
> > 
> > But I think what you require is something simpler:
> > 
> > In the file ifpga_bus.c:
> > 
> > +RTE_REGISTER_BUS(IFPGA_BUS_NAME, rte_ifpga_bus.bus); <== this
> > ...
> > ...
> > #define RTE_REGISTER_BUS(nm, bus) \
> > RTE_INIT_PRIO(businitfn_ ##nm, 110); \
> > 
> > If you define your own version of RTE_REGISTER_BUS with the priority number 
> > higher, it would be inserted later in the bus list.
> > rte_register_bus doesn't do any inherent ordering.
> > This would save the changes you are doing in the 
> > lib/librte_eal/common/eal_common_bus.c file.
> > 
> > But I think there has to be a better provision of defining priority of bus 
> > scans - I am sure when new devices come in, there would be possibility of 
> > dependencies as in your case.
> > Rosen: Is priority ordering of bus scans implemented?
> 
> No, there is no priority set for scanning order.
> However, the order in which buses are registered determines the order
> in which scans are done.
> 
> Thus, if you change the priority of your registration, you should be
> able to ensure that your scan comes last.
> 

Can we register the bus only when a PCI device match is found at
runtime, e.g. as part of the PCI driver instance initialization?

/Bruce


Re: [dpdk-dev] [PATCH v2] net/null:Different mac address support

2018-03-06 Thread Ferruh Yigit
On 3/6/2018 3:35 AM, Mallesh Koujalagi wrote:
> After attaching two null devices to OVS, both show the MAC address
> "00.00.00.00.00.00". Fix this by assigning each null device a different
> MAC address.
> 
> Signed-off-by: Mallesh Koujalagi 

<...>

> @@ -514,12 +524,21 @@ eth_dev_null_create(struct rte_vdev_device *dev,
>   if (!data)
>   return -ENOMEM;
>  
> + eth_addr = rte_zmalloc_socket(rte_vdev_device_name(dev),
> + sizeof(*eth_addr), 0, dev->device.numa_node);
> + if (eth_addr == NULL) {
> + rte_free(data);
> + return -ENOMEM;
> + }
> +
>   eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internals));
>   if (!eth_dev) {
> + rte_free(eth_addr);
>   rte_free(data);
>   return -ENOMEM;
>   }

Same comment as on the previous version: why not put "eth_addr" inside "struct
pmd_internals"?

"struct pmd_internals" is already allocated and freed in the code, so if you
put "eth_addr" into "struct pmd_internals" you don't need to manage it
separately; it comes for free.
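
A rough sketch of the suggestion, using plain C allocation instead of the DPDK allocators. The struct layout, the field names other than eth_addr, and the way the address is derived from a device index are all assumptions made for the example; none of it is taken from the null PMD.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAC_LEN 6

struct sketch_ether_addr {
	uint8_t addr_bytes[SKETCH_MAC_LEN];
};

/* Hypothetical stand-in for struct pmd_internals. */
struct sketch_internals {
	unsigned int packet_size;
	/* Embedded by value: allocated and freed together with the struct. */
	struct sketch_ether_addr eth_addr;
};

/*
 * One allocation covers the private data and the MAC address, so the
 * error paths need no extra free for eth_addr.
 */
static struct sketch_internals *
sketch_internals_create(unsigned int packet_size, uint8_t dev_index)
{
	struct sketch_internals *internals;

	internals = calloc(1, sizeof(*internals));
	if (internals == NULL)
		return NULL;

	internals->packet_size = packet_size;
	/* Assumed scheme: locally administered unicast address per device. */
	internals->eth_addr.addr_bytes[0] = 0x02;
	internals->eth_addr.addr_bytes[SKETCH_MAC_LEN - 1] = dev_index;
	return internals;
}
```

Releasing the private data with a single free() then also releases the address, which is the "comes for free" part of the comment.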

<...>



Re: [dpdk-dev] [dpdk-stable] [PATCH] net/bonding: avoid wrong casting on primary_slave_port_id from input param

2018-03-06 Thread Ferruh Yigit
On 3/6/2018 9:37 AM, Gowrishankar wrote:
> From: Gowrishankar Muthukrishnan 
> 
> primary_slave_port_id is uint16_t which needs to be correctly stored
> with the same data type of input parameter in bond_ethdev_configure.
> 
> Fixes: f8244c6399 ("ethdev: increase port id range")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Gowrishankar Muthukrishnan 

Acked-by: Ferruh Yigit 
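
For readers who have not seen the original code, a sketch of the class of bug the commit message describes. It assumes the old code narrowed the value through an 8-bit type somewhere on the way into the configuration; the exact expression in the bonding driver is not shown in this thread, and the function names below are hypothetical.

```c
#include <stdint.h>

/* Narrowing through uint8_t keeps only the low 8 bits of the port id. */
static uint16_t
store_port_id_via_u8(uint16_t port_id)
{
	uint8_t narrowed = (uint8_t)port_id;

	return narrowed;
}

/* Keeping the 16-bit type end to end preserves the full value. */
static uint16_t
store_port_id_via_u16(uint16_t port_id)
{
	return port_id;
}
```

Port ids below 256 survive either path, which is why this kind of mistake can go unnoticed until larger port ids are used.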


Re: [dpdk-dev] [PATCH v1] app/pdump: add check for PCAP PMD

2018-03-06 Thread Ferruh Yigit
On 3/6/2018 8:45 AM, Varghese, Vipin wrote:
> Hi Ferruh,
> 
>> -Original Message-
>> From: Yigit, Ferruh
>> Sent: Monday, March 5, 2018 2:33 PM
>> To: Varghese, Vipin ; dev@dpdk.org; Pattan,
>> Reshma 
>> Cc: Mcnamara, John 
>> Subject: Re: [dpdk-dev] [PATCH v1] app/pdump: add check for PCAP PMD
>>
>> On 3/5/2018 7:57 AM, Vipin Varghese wrote:
>>> dpdk-pdump makes use of LIBRTE_PMD_PCAP for interfacing the ring to
>>> the device-queue pair. Updating Makefile to check for the same.
>>>
>>> Signed-off-by: Vipin Varghese 
>>> ---
>>>  app/pdump/Makefile | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/app/pdump/Makefile b/app/pdump/Makefile
>>> index bd3c208..038a34f 100644
>>> --- a/app/pdump/Makefile
>>> +++ b/app/pdump/Makefile
>>> @@ -3,6 +3,10 @@
>>>
>>>  include $(RTE_SDK)/mk/rte.vars.mk
>>>
>>> +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),n)
>>> +$(error "Please enable CONFIG_RTE_LIBRTE_PMD_PCAP")
>>> +endif
>>
>> pdump is enabled default, so won't this break the default build?
> 
> Yes, you are right, it will fail, which then forces the user to enable PCAP.

We shouldn't break the default build because of missing dependencies.

> 
>>
>> What about moving this to lib/librte_pdump, convert $(error ..) to
>> $(warning ..) and disable CONFIG_RTE_LIBRTE_PDUMP there?
> 
> If we change it to a warning and there are no PCAP headers on the build
> system, the application gets built but will fail internally, because the
> pcap API calls fail during execution.

If CONFIG_RTE_LIBRTE_PDUMP is disabled, the application won't be compiled.

> 
>>
>>> +
>>>  ifeq ($(CONFIG_RTE_LIBRTE_PDUMP),y)
>>>
>>>  APP = dpdk-pdump
>>>
> 



Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA Bus Second Scan, it should be scanned after PCI Bus

2018-03-06 Thread Gaëtan Rivet
On Tue, Mar 06, 2018 at 11:36:17AM +, Bruce Richardson wrote:
> On Tue, Mar 06, 2018 at 11:46:22AM +0100, Gaëtan Rivet wrote:
> > On Tue, Mar 06, 2018 at 10:42:14AM +, Xu, Rosen wrote:
> > > 
> > > 
> > > -Original Message-
> > > From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
> > > Sent: Tuesday, March 06, 2018 14:20
> > > To: Xu, Rosen 
> > > Cc: dev@dpdk.org; Doherty, Declan ; Zhang, 
> > > Tianfei 
> > > Subject: Re: [dpdk-dev] [RFC 3/4] lib/librte_eal/common: Add Intel FPGA 
> > > Bus Second Scan, it should be scanned after PCI Bus
> > > 
> > > On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> > > > Signed-off-by: Rosen Xu 
> > > > ---
> > > >  lib/librte_eal/common/eal_common_bus.c | 14 +-
> > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/lib/librte_eal/common/eal_common_bus.c 
> > > > b/lib/librte_eal/common/eal_common_bus.c
> > > > index 3e022d5..74bfa15 100644
> > > > --- a/lib/librte_eal/common/eal_common_bus.c
> > > > +++ b/lib/librte_eal/common/eal_common_bus.c
> > > > @@ -70,15 +70,27 @@ struct rte_bus_list rte_bus_list =
> > > >  rte_bus_scan(void)
> > > >  {
> > > > int ret;
> > > > -   struct rte_bus *bus = NULL;
> > > > +   struct rte_bus *bus = NULL, *ifpga_bus = NULL;
> > > >
> > > > TAILQ_FOREACH(bus, &rte_bus_list, next) {
> > > > +   if (!strcmp(bus->name, "ifpga")) {
> > > > +   ifpga_bus = bus;
> > > > +   continue;
> > > > +   }
> > > > +
> > > > ret = bus->scan();
> > > > if (ret)
> > > > RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > > > bus->name);
> > > > }
> > > >
> > > > +   if (ifpga_bus) {
> > > > +   ret = ifpga_bus->scan();
> > > > +   if (ret)
> > > > +   RTE_LOG(ERR, EAL, "Scan for (%s) bus failed.\n",
> > > > +   ifpga_bus->name);
> > > > +   }
> > > > +
> > > 
> > > You are doing this just so that PCI scans are completed *before* ifpga 
> > > scans?
> > > Rosen: yes
> > > Well, I understand that this certainly is an issue that we can't yet 
> > > define a priority ordering of bus scans.
> > > 
> > > But I think what you require is something simpler:
> > > 
> > > In the file ifpga_bus.c:
> > > 
> > > +RTE_REGISTER_BUS(IFPGA_BUS_NAME, rte_ifpga_bus.bus); <== this
> > > ...
> > > ...
> > > #define RTE_REGISTER_BUS(nm, bus) \
> > > RTE_INIT_PRIO(businitfn_ ##nm, 110); \
> > > 
> > > If you define your own version of RTE_REGISTER_BUS with the priority 
> > > number higher, it would be inserted later in the bus list.
> > > rte_register_bus doesn't do any inherent ordering.
> > > This would save the changes you are doing in the 
> > > lib/librte_eal/common/eal_common_bus.c file.
> > > 
> > > But I think there has to be a better provision of defining priority of 
> > > bus scans - I am sure when new devices come in, there would be 
> > > possibility of dependencies as in your case.
> > > Rosen: Is priority ordering of bus scans implemented?
> > 
> > No, there is no priority set for scanning order.
> > However, the order in which buses are registered determines the order
> > in which scans are done.
> > 
> > Thus, if you change the priority of your registration, you should be
> > able to ensure that your scan comes last.
> > 
> 
> Can we register the bus only when a PCI device match is found at
> runtime, e.g. as part of the PCI driver instance initialization?
> 
> /Bruce

Technically, yes. You would append a new bus during rte_bus_probe, so
the linked list would simply have a new node and you would then probe
it. You would need to make sure you scan your bus first, so you would
have some weird conditions (whether you are loaded during probe or
naturally, you'd have to do your scan or not).

However, this seems like a terrible idea. You introduce an edge case
that will need to be carried over in most of the bus API implementation.

This new bus seems like a specialization of the PCI bus. Why not directly
use the PCI bus and have your driver linked to either a rawdev or a vdev,
where you could store your metadata and expose a specialized interface?

-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH v1] app/pdump: add check for PCAP PMD

2018-03-06 Thread Pattan, Reshma
> 
> +ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),n)
> +$(error "Please enable CONFIG_RTE_LIBRTE_PMD_PCAP")
> +endif
> +

How about combining an ifeq ($(CONFIG_RTE_LIBRTE_PMD_PCAP),y) check with the
existing ifeq check below?
With this, dpdk-pdump will be compiled only when both flags are enabled.

>  ifeq ($(CONFIG_RTE_LIBRTE_PDUMP),y)
> 
>  APP = dpdk-pdump
> --
> 1.9.1



Re: [dpdk-dev] Anyone who can help?

2018-03-06 Thread Nélio Laranjeiro
+Thomas,

Hi,

On Tue, Mar 06, 2018 at 06:54:25PM +0800, wang.yon...@zte.com.cn wrote:
> Hi,
> I ran into a problem when using git to fetch code from dpdk.org. I have never
> seen this before.
> Does anyone know what is going on?
> 
> [root@localhost dpdk]# git pull
> fatal: unable to access 'http://dpdk.org/git/dpdk/': The requested URL 
> returned error: 502
> 
> [root@localhost wangyong]# git clone http://dpdk.org/git/dpdk
> Cloning into 'dpdk'...
> fatal: unable to access 'http://dpdk.org/git/dpdk/': The requested URL 
> returned error: 502

Did you try the https or git protocol instead?

 https://dpdk.org/git/dpdk/
 git://dpdk.org/git/dpdk/

Regards,

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH v2 1/6] vhost: export vhost feature definitions

2018-03-06 Thread Maxime Coquelin



On 03/06/2018 10:37 AM, Tan, Jianfeng wrote:




-Original Message-
From: Wang, Zhihong
Sent: Tuesday, February 13, 2018 5:21 PM
To: dev@dpdk.org
Cc: Tan, Jianfeng; Bie, Tiwei; maxime.coque...@redhat.com;
y...@fridaylinux.org; Liang, Cunming; Wang, Xiao W; Daly, Dan; Wang,
Zhihong
Subject: [PATCH v2 1/6] vhost: export vhost feature definitions

This patch exports vhost-user protocol features to support device driver
development.

Signed-off-by: Zhihong Wang 
---
  lib/librte_vhost/rte_vhost.h  |  8 
  lib/librte_vhost/vhost.h  |  4 +---
  lib/librte_vhost/vhost_user.c |  9 +
  lib/librte_vhost/vhost_user.h | 20 +++-
  4 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index d33206997..b05162366 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -29,6 +29,14 @@ extern "C" {
  #define RTE_VHOST_USER_DEQUEUE_ZERO_COPY  (1ULL << 2)
  #define RTE_VHOST_USER_IOMMU_SUPPORT  (1ULL << 3)

+#define RTE_VHOST_USER_PROTOCOL_F_MQ   0


Instead of adding a "RTE_" prefix, I would prefer to define it like this:
#ifndef VHOST_USER_PROTOCOL_F_MQ
#define VHOST_USER_PROTOCOL_F_MQ   0
#endif

Similar to other macros.


I agree, it is better to keep the same naming as in the spec IMHO.


+#define RTE_VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
+#define RTE_VHOST_USER_PROTOCOL_F_RARP 2
+#define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK 3
+#define RTE_VHOST_USER_PROTOCOL_F_NET_MTU 4
+#define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ 5
+#define RTE_VHOST_USER_F_PROTOCOL_FEATURES 30


Please put the above declaration separately. As it stands, it could be
misleading: it suggests a vhost-user protocol feature, whereas it is a
Virtio feature.


+
  /**
   * Information relating to memory regions including offsets to
   * addresses in QEMUs memory file.
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 58aec2e0d..a0b0520e2 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -174,8 +174,6 @@ struct vhost_msg {
   #define VIRTIO_F_VERSION_1 32
  #endif

-#define VHOST_USER_F_PROTOCOL_FEATURES 30
-
  /* Features supported by this builtin vhost-user net driver. */
  #define VIRTIO_NET_SUPPORTED_FEATURES ((1ULL <<
VIRTIO_NET_F_MRG_RXBUF) | \
(1ULL << VIRTIO_F_ANY_LAYOUT) | \
@@ -185,7 +183,7 @@ struct vhost_msg {
(1ULL << VIRTIO_NET_F_MQ)  | \
(1ULL << VIRTIO_F_VERSION_1)   | \
(1ULL << VHOST_F_LOG_ALL)  | \
-   (1ULL <<
VHOST_USER_F_PROTOCOL_FEATURES) | \
+   (1ULL <<
RTE_VHOST_USER_F_PROTOCOL_FEATURES) | \
(1ULL << VIRTIO_NET_F_GSO) | \
(1ULL << VIRTIO_NET_F_HOST_TSO4) | \
(1ULL << VIRTIO_NET_F_HOST_TSO6) | \
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 5c5361066..c93e48e4d 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -527,7 +527,7 @@ vhost_user_set_vring_addr(struct virtio_net **pdev,
VhostUserMsg *msg)
vring_invalidate(dev, vq);

if (vq->enabled && (dev->features &
-   (1ULL <<
VHOST_USER_F_PROTOCOL_FEATURES))) {
+   (1ULL <<
RTE_VHOST_USER_F_PROTOCOL_FEATURES))) {
dev = translate_ring_addresses(dev, msg-

payload.addr.index);

if (!dev)
return -1;
@@ -897,11 +897,11 @@ vhost_user_set_vring_kick(struct virtio_net
**pdev, struct VhostUserMsg *pmsg)
vq = dev->virtqueue[file.index];

/*
-* When VHOST_USER_F_PROTOCOL_FEATURES is not negotiated,
+* When RTE_VHOST_USER_F_PROTOCOL_FEATURES is not
negotiated,
 * the ring starts already enabled. Otherwise, it is enabled via
 * the SET_VRING_ENABLE message.
 */
-   if (!(dev->features & (1ULL <<
VHOST_USER_F_PROTOCOL_FEATURES)))
+   if (!(dev->features & (1ULL <<
RTE_VHOST_USER_F_PROTOCOL_FEATURES)))
vq->enabled = 1;

if (vq->kickfd >= 0)
@@ -1012,7 +1012,8 @@ vhost_user_get_protocol_features(struct
virtio_net *dev,
 * Qemu versions (from v2.7.0 to v2.9.0).
 */
if (!(features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)))
-   protocol_features &= ~(1ULL <<
VHOST_USER_PROTOCOL_F_REPLY_ACK);
+   protocol_features &=
+   ~(1ULL <<
RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK);

msg->payload.u64 = protocol_features;
msg->size = sizeof(msg->payload.u64);
diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 0fafbe6e0..066e772dd 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -14,19 +

Re: [dpdk-dev] [PATCH 3/3] vhost: support VFIO based accelerator

2018-03-06 Thread Maxime Coquelin



On 03/06/2018 11:43 AM, Tiwei Bie wrote:

This commit adds the VFIO based accelerator support to
vhost. A new API is provided to support asking QEMU to
do further setup to allow notifications and interrupts
being delivered directly between the driver in guest
and the vDPA device in host.

Signed-off-by: Tiwei Bie 
---
  lib/librte_vhost/rte_vhost.h   |  28 ++
  lib/librte_vhost/rte_vhost_version.map |   1 +
  lib/librte_vhost/vhost_user.c  | 166 +
  lib/librte_vhost/vhost_user.h  |   9 ++
  4 files changed, 204 insertions(+)

diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h
index d5589c543..68842e908 100644
--- a/lib/librte_vhost/rte_vhost.h
+++ b/lib/librte_vhost/rte_vhost.h
@@ -35,6 +35,7 @@ extern "C" {
  #define RTE_VHOST_USER_PROTOCOL_F_REPLY_ACK   3
  #define RTE_VHOST_USER_PROTOCOL_F_NET_MTU 4
  #define RTE_VHOST_USER_PROTOCOL_F_SLAVE_REQ   5
+#define RTE_VHOST_USER_PROTOCOL_F_VFIO 8
  #define RTE_VHOST_USER_F_PROTOCOL_FEATURES 30
  
  /**

@@ -591,6 +592,33 @@ rte_vhost_get_vdpa_eid(int vid);
  int __rte_experimental
  rte_vhost_get_vdpa_did(int vid);
  
+/**

+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Enable or disable the VFIO based accelerator for vhost-user.
+ *
+ * This function is to ask QEMU to do further setup to better
+ * support the vDPA device at vhost user backend. With this
+ * setup, the notifications and interrupts will be delivered
+ * directly between the driver in guest and the vDPA device
+ * in host if platform supports e.g. EPT and Posted interrupt.
+ * It's nice to have, and not mandatory.
+ *
+ * @param vid
+ *  vhost device ID
+ * @param int
+ *  Enable or disable
+ *
+ * @return
+ *   0: success
+ *   -ENODEV: no such vhost device
+ *   -ENOTSUP: device does not support VFIO based accelerator feature
+ *   -EINVAL: there is no accelerator assigned to this vhost device
+ *   -EFAULT: failed to talk with QEMU
+ */
+int rte_vhost_vfio_accelerator_ctrl(int vid, int enable);
+
  #ifdef __cplusplus
  }
  #endif
diff --git a/lib/librte_vhost/rte_vhost_version.map 
b/lib/librte_vhost/rte_vhost_version.map
index 36257e51b..ca970170f 100644
--- a/lib/librte_vhost/rte_vhost_version.map
+++ b/lib/librte_vhost/rte_vhost_version.map
@@ -72,6 +72,7 @@ EXPERIMENTAL {
rte_vhost_set_vring_base;
rte_vhost_get_vdpa_eid;
rte_vhost_get_vdpa_did;
+   rte_vhost_vfio_accelerator_ctrl;
rte_vdpa_register_engine;
rte_vdpa_unregister_engine;
rte_vdpa_find_engine_id;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index e3a1dfbfb..a65598d80 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -35,6 +35,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "iotlb.h"

  #include "vhost.h"
@@ -1628,6 +1629,27 @@ vhost_user_msg_handler(int vid, int fd)
return 0;
  }
  
+static int process_slave_message_reply(struct virtio_net *dev,

+  const VhostUserMsg *msg)
+{
+   VhostUserMsg msg_reply;
+
+   if ((msg->flags & VHOST_USER_NEED_REPLY) == 0)
+   return 0;
+
+   if (read_vhost_message(dev->slave_req_fd, &msg_reply) < 0)
+   return -1;
+
+   if (msg_reply.request.slave != msg->request.slave) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "received unexpected msg type (%u), expected %u\n",
+   msg_reply.request.slave, msg->request.slave);
+   return -1;
+   }
+
+   return msg_reply.payload.u64;
+}
+
  int
  vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t iova, uint8_t perm)
  {
@@ -1653,3 +1675,147 @@ vhost_user_iotlb_miss(struct virtio_net *dev, uint64_t 
iova, uint8_t perm)
  
  	return 0;

  }
+
+static int vhost_user_slave_set_vring_file(struct virtio_net *dev,
+  uint32_t request,
+  struct vhost_vring_file *file)

Why pass the request as an argument?
It seems to be called with the same request ID every time.


+{
+   int *fdp = NULL;
+   size_t fd_num = 0;
+   int ret;
+   struct VhostUserMsg msg = {
+   .request.slave = request,
+   .flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
+   .payload.u64 = file->index & VHOST_USER_VRING_IDX_MASK,
+   .size = sizeof(msg.payload.u64),
+   };
+
+   if (file->fd < 0)
+   msg.payload.u64 |= VHOST_USER_VRING_NOFD_MASK;
+   else {
+   fdp = &file->fd;
+   fd_num = 1;
+   }
+
+   ret = send_vhost_message(dev->slave_req_fd, &msg, fdp, fd_num);
+   if (ret < 0) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "Failed to send slave message %u (%d)\n",
+   request, ret);
+   return ret;
+   }
+
+   retu

Re: [dpdk-dev] [PATCH 2/6] net/sfc: add support for driver-wide dynamic logging

2018-03-06 Thread Andrew Rybchenko

On 03/05/2018 05:59 PM, Ferruh Yigit wrote:

On 1/25/2018 5:00 PM, Andrew Rybchenko wrote:

From: Ivan Malov 

Signed-off-by: Ivan Malov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 

<...>


@@ -2082,3 +2084,14 @@ RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
SFC_KVARG_STATS_UPDATE_PERIOD_MS "= "
SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
SFC_KVARG_DEBUG_INIT "=" SFC_KVARG_VALUES_BOOL);
+
+RTE_INIT(sfc_driver_register_logtype);
+static void
+sfc_driver_register_logtype(void)
+{
+   int ret;
+
+   ret = rte_log_register_type_and_pick_level(SFC_LOGTYPE_PREFIX "driver",
+  RTE_LOG_NOTICE);

There is no benefit in using rte_log_register_type_and_pick_level() here: at this
stage "opt_loglevel_list" will be empty, so this will be the same as rte_log_register().


That's true, except that "uniform approach is good", i.e. simply use
rte_log_register_type_and_pick_level() everywhere to make it safe against
code movements.
In fact, it was raised during internal review and we kept it as you see it.

The other option is to avoid using a constructor here at all and move it to
probe.
Yes, it will be tried many times, but there is no harm if it is already
registered.
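
A minimal sketch of the "register from probe" option, under the assumption stated above that registering an already-known name is harmless and yields the same identifier. The registry here is a hypothetical stand-in written for this example (demo_log_register(), demo_probe() and the log type name are not DPDK API).

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_LOGTYPES 8

static const char *demo_logtypes[DEMO_MAX_LOGTYPES];
static int demo_logtype_count;

/*
 * Return the existing id if the name is already registered, a new id
 * otherwise, or -1 if the table is full.
 */
static int
demo_log_register(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < demo_logtype_count; i++)
		if (strcmp(demo_logtypes[i], name) == 0)
			return i;
	if (demo_logtype_count == DEMO_MAX_LOGTYPES)
		return -1;
	demo_logtypes[demo_logtype_count] = name;
	return demo_logtype_count++;
}

/* Stand-in for a probe callback that runs once per device. */
static int
demo_probe(void)
{
	return demo_log_register("pmd.net.demo.driver");
}
```

Calling demo_probe() for every device costs a lookup each time, but the table keeps a single entry.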


Re: [dpdk-dev] [PATCH 2/6] net/sfc: add support for driver-wide dynamic logging

2018-03-06 Thread Andrew Rybchenko

On 03/06/2018 05:45 PM, Andrew Rybchenko wrote:

On 03/05/2018 05:59 PM, Ferruh Yigit wrote:

On 1/25/2018 5:00 PM, Andrew Rybchenko wrote:

From: Ivan Malov 

Signed-off-by: Ivan Malov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 

<...>


@@ -2082,3 +2084,14 @@ RTE_PMD_REGISTER_PARAM_STRING(net_sfc_efx,
  SFC_KVARG_STATS_UPDATE_PERIOD_MS "= "
  SFC_KVARG_MCDI_LOGGING "=" SFC_KVARG_VALUES_BOOL " "
  SFC_KVARG_DEBUG_INIT "=" SFC_KVARG_VALUES_BOOL);
+
+RTE_INIT(sfc_driver_register_logtype);
+static void
+sfc_driver_register_logtype(void)
+{
+    int ret;
+
+    ret = rte_log_register_type_and_pick_level(SFC_LOGTYPE_PREFIX 
"driver",

+   RTE_LOG_NOTICE);
There is no benefit in using rte_log_register_type_and_pick_level() here: at
this stage "opt_loglevel_list" will be empty, so this will be the same as
rte_log_register().


That's true, except that "uniform approach is good", i.e. simply use
rte_log_register_type_and_pick_level() everywhere to make it safe against
code movements.
In fact, it was raised during internal review and we kept it as you see it.


The other option is to avoid using a constructor here at all and move it
to probe.
Yes, it will be tried many times, but there is no harm if it is
already registered.


In fact, it could really be required if a dynamic library is used and is
loaded later using dlopen(). I don't know if there are any restrictions in
DPDK which prevent it.


Re: [dpdk-dev] [PATCH 01/14] net/sfc/base: support filters for encapsulated packets

2018-03-06 Thread Andrew Rybchenko

On 02/27/2018 03:45 PM, Andrew Rybchenko wrote:

From: Roman Zhukov 

This adds filters for encapsulated packets to the list
returned by ef10_filter_supported_filters().

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
  drivers/net/sfc/base/ef10_filter.c | 65 --
  1 file changed, 55 insertions(+), 10 deletions(-)


<...>


-   rc = efx_mcdi_get_parser_disp_info(enp, buffer, buffer_length,
-   &mcdi_list_length);
+   /*
+* Two calls to MC_CMD_GET_PARSER_DISP_INFO are needed: one to get the
+* list of supported filters for ordinary packets, and then another to
+* get the list of supported filters for encapsulated packets.
+*/
+   rc = efx_mcdi_get_parser_disp_info(enp, buffer, buffer_length, B_FALSE,
+   &mcdi_list_length);
if (rc != 0) {
-   if (rc == ENOSPC) {
-   /* Pass through mcdi_list_length for the list length */
-   *list_lengthp = mcdi_list_length;
+   if (rc == ENOSPC)
+   no_space = B_TRUE;
+   else
+   goto fail1;
+   }
+
+   if (no_space) {
+   next_buf_idx = 0;
+   next_buf_length = 0;
+   } else {
+   EFSYS_ASSERT(mcdi_list_length < buffer_length);


In fact, <= must be used here, since the above call may return 0 if the
returned array fits exactly in the provided buffer. I'll send v2.


+   next_buf_idx = mcdi_list_length;
+   next_buf_length = buffer_length - mcdi_list_length;
+   }
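
The boundary case raised in the comment above can be sketched as follows. demo_get_list() is a hypothetical stand-in for the MCDI query, assumed to fail with ENOSPC only when the list is larger than the buffer; it is not the libefx function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the query: report the full list length and
 * fail with ENOSPC only when the list does not fit into the buffer.
 */
static int
demo_get_list(const unsigned int *src, size_t src_len,
	      unsigned int *buf, size_t buf_len, size_t *list_lenp)
{
	size_t i;

	assert(list_lenp != NULL);
	*list_lenp = src_len;
	if (src_len > buf_len)
		return ENOSPC;
	for (i = 0; i < src_len; i++)
		buf[i] = src[i];
	return 0;
}

/*
 * Space left for the second list. Zero is a legal result when the first
 * list fills the buffer exactly, hence <= rather than < in the check.
 */
static size_t
demo_remaining(size_t buf_len, size_t list_len)
{
	assert(list_len <= buf_len);
	return buf_len - list_len;
}
```

With a 4-entry list and a 4-entry buffer the first call succeeds and the remaining length is 0, which a strict < assertion would reject.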





Re: [dpdk-dev] Anyone who can help?

2018-03-06 Thread Thomas Monjalon
06/03/2018 11:54, wang.yon...@zte.com.cn:
> Hi,
> I ran into a problem when using git to get code from dpdk.org. I have never
> seen this before.
> Does anyone know what happened?
> 
> [root@localhost dpdk]# git pull
> fatal: unable to access 'http://dpdk.org/git/dpdk/': The requested URL 
> returned error: 502

There was an outage with git on dpdk.org today.
It was fixed as soon as it was discovered.





Re: [dpdk-dev] [PATCH 2/6] net/sfc: add support for driver-wide dynamic logging

2018-03-06 Thread Ferruh Yigit
On 3/6/2018 2:56 PM, Andrew Rybchenko wrote:
> 
> In fact, it could really be required if a dynamic library is used and is
> loaded later using dlopen(). I don't know if there are any restrictions in
> DPDK which prevent it.

That function has the constructor attribute; I am not sure how it works in that case.
I am fine with it as long as this was not overlooked and you decided to implement it this way.
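
For reference, a minimal sketch of how a constructor-attributed function behaves, assuming GCC or Clang. The names and the id value are hypothetical. Such a function runs before main() in an executable, and when the object is loaded in the case of a shared library pulled in with dlopen().

```c
#include <assert.h>

static int demo_logtype = -1;

/* GCC/Clang extension: run at load time, before any explicit call. */
static void __attribute__((constructor))
demo_register_logtype(void)
{
	/* Hypothetical id; a real driver would call its registration here. */
	demo_logtype = 42;
}

/* By the time normal code runs, the constructor has already fired. */
static int
demo_get_logtype(void)
{
	assert(demo_logtype >= 0);
	return demo_logtype;
}
```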


[dpdk-dev] [PATCH v2 05/14] net/sfc: add VXLAN in flow API filters support

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Exact match of VXLAN network identifier is supported by parser.
IP protocol match is enforced to be UDP.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
Reviewed-by: Andy Moreton 
---
 doc/guides/nics/sfc_efx.rst |   2 +
 drivers/net/sfc/sfc_flow.c  | 165 
 2 files changed, 167 insertions(+)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index ccdf5ff..5a4b2a6 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -166,6 +166,8 @@ Supported pattern items:
 
 - UDP (exact match of source/destination ports)
 
+- VXLAN (exact match of VXLAN network identifier)
+
 Supported actions:
 
 - VOID
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 93cdf8f..20ba69d 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -57,6 +57,7 @@ static sfc_flow_item_parse sfc_flow_parse_ipv4;
 static sfc_flow_item_parse sfc_flow_parse_ipv6;
 static sfc_flow_item_parse sfc_flow_parse_tcp;
 static sfc_flow_item_parse sfc_flow_parse_udp;
+static sfc_flow_item_parse sfc_flow_parse_vxlan;
 
 static boolean_t
 sfc_flow_is_zero(const uint8_t *buf, unsigned int size)
@@ -696,6 +697,132 @@ sfc_flow_parse_udp(const struct rte_flow_item *item,
return -rte_errno;
 }
 
+/*
+ * Filters for encapsulated packets match based on the EtherType and IP
+ * protocol in the outer frame.
+ */
+static int
+sfc_flow_set_match_flags_for_encap_pkts(const struct rte_flow_item *item,
+   efx_filter_spec_t *efx_spec,
+   uint8_t ip_proto,
+   struct rte_flow_error *error)
+{
+   if (!(efx_spec->efs_match_flags & EFX_FILTER_MATCH_IP_PROTO)) {
+   efx_spec->efs_match_flags |= EFX_FILTER_MATCH_IP_PROTO;
+   efx_spec->efs_ip_proto = ip_proto;
+   } else if (efx_spec->efs_ip_proto != ip_proto) {
+   switch (ip_proto) {
+   case EFX_IPPROTO_UDP:
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Outer IP header protocol must be UDP "
+   "in VxLAN pattern");
+   return -rte_errno;
+
+   default:
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Only VxLAN tunneling patterns "
+   "are supported");
+   return -rte_errno;
+   }
+   }
+
+   if (!(efx_spec->efs_match_flags & EFX_FILTER_MATCH_ETHER_TYPE)) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Outer frame EtherType in pattern with tunneling "
+   "must be set");
+   return -rte_errno;
+   } else if (efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV4 &&
+  efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV6) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Outer frame EtherType in pattern with tunneling "
+   "must be IPv4 or IPv6");
+   return -rte_errno;
+   }
+
+   return 0;
+}
+
+static int
+sfc_flow_set_efx_spec_vni_or_vsid(efx_filter_spec_t *efx_spec,
+ const uint8_t *vni_or_vsid_val,
+ const uint8_t *vni_or_vsid_mask,
+ const struct rte_flow_item *item,
+ struct rte_flow_error *error)
+{
+   const uint8_t vni_or_vsid_full_mask[EFX_VNI_OR_VSID_LEN] = {
+   0xff, 0xff, 0xff
+   };
+
+   if (memcmp(vni_or_vsid_mask, vni_or_vsid_full_mask,
+  EFX_VNI_OR_VSID_LEN) == 0) {
+   efx_spec->efs_match_flags |= EFX_FILTER_MATCH_VNI_OR_VSID;
+   rte_memcpy(efx_spec->efs_vni_or_vsid, vni_or_vsid_val,
+  EFX_VNI_OR_VSID_LEN);
+   } else if (!sfc_flow_is_zero(vni_or_vsid_mask, EFX_VNI_OR_VSID_LEN)) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_ITEM, item,
+  "Unsupported VNI/VSID mask");
+   return -rte_errno;
+   }
+
+   return 0;
+}
+
+/**
+ * Convert VXLAN item to EFX filter specification.
+ *
+ * @param item[in]
+ *   Item specification. Only VXLAN network identifier field is supported.
+ *   If the mask is NULL, default mask will be used.
+ *   Ranging is not supported.
+ * @param efx_spec[in, out]
+ *   EFX filter specification to update.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_parse_vxla
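
The mask handling in the VNI/VSID helper quoted above can be tried out in isolation with a sketch like the one below. struct demo_spec and demo_set_vni() are hypothetical stand-ins for the EFX filter specification and the driver helper; only the full-mask / zero-mask / partial-mask logic is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_VNI_LEN 3

/* Hypothetical stand-in for the filter specification. */
struct demo_spec {
	int match_vni;
	uint8_t vni[DEMO_VNI_LEN];
};

/*
 * Full mask: match on the VNI. Zero mask: VNI is ignored.
 * Anything in between is rejected as unsupported.
 */
static int
demo_set_vni(struct demo_spec *spec, const uint8_t *val, const uint8_t *mask)
{
	static const uint8_t full[DEMO_VNI_LEN] = { 0xff, 0xff, 0xff };
	static const uint8_t zero[DEMO_VNI_LEN] = { 0, 0, 0 };

	assert(spec != NULL && val != NULL && mask != NULL);
	if (memcmp(mask, full, DEMO_VNI_LEN) == 0) {
		spec->match_vni = 1;
		memcpy(spec->vni, val, DEMO_VNI_LEN);
		return 0;
	}
	if (memcmp(mask, zero, DEMO_VNI_LEN) == 0)
		return 0;
	return -EINVAL;
}
```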

[dpdk-dev] [PATCH v2 13/14] net/sfc: avoid creation of ineffective flow rules

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Despite being versatile, the hardware support for filtering has a number
of special properties which must be taken into account. Namely, there is
a known set of valid filters which don't take any effect despite being
accepted by the hardware.

The combinations of match flags and field values which can describe the
exceptional filters are as follows:
- ETHER_TYPE or ETHER_TYPE | LOC_MAC with IPv4 or IPv6 EtherType
- ETHER_TYPE | IP_PROTO or ETHER_TYPE | IP_PROTO | LOC_MAC with UDP or
TCP IP protocol value
- The same combinations with OUTER_VID and/or INNER_VID

These exceptional filters can be expressed in terms of RTE flow rules.
If the user creates such a flow rule, no traffic will hit the underlying
filter, and no errors will be reported.

This patch adds a means to prevent such ineffective flow rules from
being created.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
---
 doc/guides/nics/sfc_efx.rst | 17 ++
 drivers/net/sfc/sfc_flow.c  | 78 +
 2 files changed, 95 insertions(+)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 539ce90..f41ccdb 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -193,6 +193,23 @@ in the mask of destination address. If destinaton address 
in the spec is
 multicast, it matches all multicast (and broadcast) packets, oherwise it
 matches unicast packets that are not filtered by other flow rules.
 
+Exceptions to flow rules
+
+
+There is a list of exceptional flow rule patterns which will not be
+accepted by the PMD. A pattern will be rejected if at least one of the
+conditions is met:
+
+- Filtering by IPv4 or IPv6 EtherType without pattern items of internet
+  layer and above.
+
+- The last item is IPV4 or IPV6, and it's empty.
+
+- Filtering by TCP or UDP IP transport protocol without pattern items of
+  transport layer and above.
+
+- The last item is TCP or UDP, and it's empty.
+
 
 Supported NICs
 --
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 7b26653..2b8bef8 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -1919,6 +1919,77 @@ sfc_flow_spec_filters_complete(struct sfc_adapter *sa,
return 0;
 }
 
+/**
+ * Check that set of match flags is referred to by a filter. Filter is
+ * described by match flags with the ability to add OUTER_VID and INNER_VID
+ * flags.
+ *
+ * @param match_flags[in]
+ *   Set of match flags.
+ * @param flags_pattern[in]
+ *   Pattern of filter match flags.
+ */
+static boolean_t
+sfc_flow_is_match_with_vids(efx_filter_match_flags_t match_flags,
+   efx_filter_match_flags_t flags_pattern)
+{
+   if ((match_flags & flags_pattern) != flags_pattern)
+   return B_FALSE;
+
+   switch (match_flags & ~flags_pattern) {
+   case 0:
+   case EFX_FILTER_MATCH_OUTER_VID:
+   case EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_INNER_VID:
+   return B_TRUE;
+   default:
+   return B_FALSE;
+   }
+}
+
+/**
+ * Check whether the spec maps to a hardware filter which is known to be
+ * ineffective despite being valid.
+ *
+ * @param spec[in]
+ *   SFC flow specification.
+ */
+static boolean_t
+sfc_flow_is_match_flags_exception(struct sfc_flow_spec *spec)
+{
+   unsigned int i;
+   uint16_t ether_type;
+   uint8_t ip_proto;
+   efx_filter_match_flags_t match_flags;
+
+   for (i = 0; i < spec->count; i++) {
+   match_flags = spec->filters[i].efs_match_flags;
+
+   if (sfc_flow_is_match_with_vids(match_flags,
+   EFX_FILTER_MATCH_ETHER_TYPE) ||
+   sfc_flow_is_match_with_vids(match_flags,
+   EFX_FILTER_MATCH_ETHER_TYPE |
+   EFX_FILTER_MATCH_LOC_MAC)) {
+   ether_type = spec->filters[i].efs_ether_type;
+   if (ether_type == EFX_ETHER_TYPE_IPV4 ||
+   ether_type == EFX_ETHER_TYPE_IPV6)
+   return B_TRUE;
+   } else if (sfc_flow_is_match_with_vids(match_flags,
+   EFX_FILTER_MATCH_ETHER_TYPE |
+   EFX_FILTER_MATCH_IP_PROTO) ||
+  sfc_flow_is_match_with_vids(match_flags,
+   EFX_FILTER_MATCH_ETHER_TYPE |
+   EFX_FILTER_MATCH_IP_PROTO |
+   EFX_FILTER_MATCH_LOC_MAC)) {
+   ip_proto = spec->filters[i].efs_ip_proto;
+   if (ip_proto == EFX_IPPROTO_TCP ||
+   ip_proto == EFX_IPPROTO_UDP)
+   return B_TRUE;
+   }
+   }
+
+   return B_FALSE;
+}
+
 static int
 sfc_flo
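
The flag test used by sfc_flow_is_match_with_vids() above can be exercised on its own with the sketch below. The DEMO_MATCH_* bit values are hypothetical and chosen only for the example; the real EFX_FILTER_MATCH_* values are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define DEMO_MATCH_ETHER_TYPE 0x01u
#define DEMO_MATCH_LOC_MAC    0x02u
#define DEMO_MATCH_IP_PROTO   0x04u
#define DEMO_MATCH_OUTER_VID  0x08u
#define DEMO_MATCH_INNER_VID  0x10u

/*
 * Return 1 if match_flags is exactly flags_pattern, optionally extended
 * with OUTER_VID or with OUTER_VID and INNER_VID together.
 */
static int
demo_is_match_with_vids(uint32_t match_flags, uint32_t flags_pattern)
{
	assert(flags_pattern != 0);
	if ((match_flags & flags_pattern) != flags_pattern)
		return 0;

	switch (match_flags & ~flags_pattern) {
	case 0:
	case DEMO_MATCH_OUTER_VID:
	case DEMO_MATCH_OUTER_VID | DEMO_MATCH_INNER_VID:
		return 1;
	default:
		return 0;
	}
}
```

Note that, as in the quoted function, INNER_VID on its own (without OUTER_VID) is not accepted by the switch.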

[dpdk-dev] [PATCH v2 07/14] net/sfc: add GENEVE in flow API filters support

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Exact match of virtual network identifier is supported by parser.
IP protocol match is enforced to be UDP.
Only Ethernet protocol type is supported.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
Reviewed-by: Andy Moreton 
---
 doc/guides/nics/sfc_efx.rst |  3 ++
 drivers/net/sfc/sfc_flow.c  | 80 +++--
 2 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 05dacb3..943fe55 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -168,6 +168,9 @@ Supported pattern items:
 
 - VXLAN (exact match of VXLAN network identifier)
 
+- GENEVE (exact match of virtual network identifier, only Ethernet (0x6558)
+  protocol type is supported)
+
 - NVGRE (exact match of virtual subnet ID)
 
 Supported actions:
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 126ec9b..efdc664 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -58,6 +58,7 @@ static sfc_flow_item_parse sfc_flow_parse_ipv6;
 static sfc_flow_item_parse sfc_flow_parse_tcp;
 static sfc_flow_item_parse sfc_flow_parse_udp;
 static sfc_flow_item_parse sfc_flow_parse_vxlan;
+static sfc_flow_item_parse sfc_flow_parse_geneve;
 static sfc_flow_item_parse sfc_flow_parse_nvgre;
 
 static boolean_t
@@ -717,7 +718,7 @@ sfc_flow_set_match_flags_for_encap_pkts(const struct 
rte_flow_item *item,
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
"Outer IP header protocol must be UDP "
-   "in VxLAN pattern");
+   "in VxLAN/GENEVE pattern");
return -rte_errno;
 
case EFX_IPPROTO_GRE:
@@ -730,7 +731,7 @@ sfc_flow_set_match_flags_for_encap_pkts(const struct 
rte_flow_item *item,
default:
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Only VxLAN/NVGRE tunneling patterns "
+   "Only VxLAN/GENEVE/NVGRE tunneling patterns "
"are supported");
return -rte_errno;
}
@@ -832,6 +833,74 @@ sfc_flow_parse_vxlan(const struct rte_flow_item *item,
 }
 
 /**
+ * Convert GENEVE item to EFX filter specification.
+ *
+ * @param item[in]
+ *   Item specification. Only Virtual Network Identifier and protocol type
+ *   fields are supported. But protocol type can be only Ethernet (0x6558).
+ *   If the mask is NULL, default mask will be used.
+ *   Ranging is not supported.
+ * @param efx_spec[in, out]
+ *   EFX filter specification to update.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_parse_geneve(const struct rte_flow_item *item,
+ efx_filter_spec_t *efx_spec,
+ struct rte_flow_error *error)
+{
+   int rc;
+   const struct rte_flow_item_geneve *spec = NULL;
+   const struct rte_flow_item_geneve *mask = NULL;
+   const struct rte_flow_item_geneve supp_mask = {
+   .protocol = RTE_BE16(0x),
+   .vni = { 0xff, 0xff, 0xff }
+   };
+
+   rc = sfc_flow_parse_init(item,
+(const void **)&spec,
+(const void **)&mask,
+&supp_mask,
+&rte_flow_item_geneve_mask,
+sizeof(struct rte_flow_item_geneve),
+error);
+   if (rc != 0)
+   return rc;
+
+   rc = sfc_flow_set_match_flags_for_encap_pkts(item, efx_spec,
+EFX_IPPROTO_UDP, error);
+   if (rc != 0)
+   return rc;
+
+   efx_spec->efs_encap_type = EFX_TUNNEL_PROTOCOL_GENEVE;
+   efx_spec->efs_match_flags |= EFX_FILTER_MATCH_ENCAP_TYPE;
+
+   if (spec == NULL)
+   return 0;
+
+   if (mask->protocol == supp_mask.protocol) {
+   if (spec->protocol != rte_cpu_to_be_16(ETHER_TYPE_TEB)) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "GENEVE encap. protocol must be Ethernet "
+   "(0x6558) in the GENEVE pattern item");
+   return -rte_errno;
+   }
+   } else if (mask->protocol != 0) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Unsupported mask for GENEVE encap. protocol");
+   return -rte_errno;
+   }
+
+   rc = sfc_flow_set_efx_spec_vni_or_vsid(efx_spec, spec->vni,
+  
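
The protocol type check in the GENEVE parser above can be sketched standalone as follows. The helper names are hypothetical, the values are kept in network byte order as in the flow item, and the "full" mask is assumed here to be all ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETHER_TYPE_TEB 0x6558 /* Transparent Ethernet Bridging */

/* Portable CPU-to-big-endian conversion for a 16-bit value. */
static uint16_t
demo_cpu_to_be16(uint16_t v)
{
	const uint8_t bytes[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) };
	uint16_t out;

	assert(sizeof(out) == sizeof(bytes));
	memcpy(&out, bytes, sizeof(out));
	return out;
}

/*
 * Zero mask: protocol is not matched at all.
 * Full mask (assumed all ones): protocol must be Ethernet (0x6558).
 * Any other mask is unsupported.
 */
static int
demo_check_geneve_protocol(uint16_t spec_be, uint16_t mask_be)
{
	if (mask_be == 0)
		return 0;
	if (mask_be != UINT16_MAX)
		return -EINVAL;
	if (spec_be != demo_cpu_to_be16(DEMO_ETHER_TYPE_TEB))
		return -EINVAL;
	return 0;
}
```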

[dpdk-dev] [PATCH v2 10/14] net/sfc: multiply of specs with an unknown EtherType

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Hardware filter specification for encapsulated traffic must contain
EtherType. In terms of RTE flow API, this would require L3 item to be
used in the flow rule. In the simplest case, if the user needs to filter
encapsulated traffic without knowledge of exact EtherType, they will
have to create multiple variants of the flow rule featuring all possible
L3 items (IPv4, IPv6), respectively. In order to hide the gory details
and avoid such a complication, this patch implements a mechanism to
auto-complete the filter specifications if need be.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
---
 drivers/net/sfc/sfc_flow.c | 306 +++--
 drivers/net/sfc/sfc_flow.h |   2 +-
 2 files changed, 266 insertions(+), 42 deletions(-)

diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index a432936..244fcdb 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -64,6 +64,21 @@ static sfc_flow_item_parse sfc_flow_parse_vxlan;
 static sfc_flow_item_parse sfc_flow_parse_geneve;
 static sfc_flow_item_parse sfc_flow_parse_nvgre;
 
+typedef int (sfc_flow_spec_set_vals)(struct sfc_flow_spec *spec,
+unsigned int filters_count_for_one_val,
+struct rte_flow_error *error);
+
+struct sfc_flow_copy_flag {
+   /* EFX filter specification match flag */
+   efx_filter_match_flags_t flag;
+   /* Number of values of corresponding field */
+   unsigned int vals_count;
+   /* Function to set values in specifications */
+   sfc_flow_spec_set_vals *set_vals;
+};
+
+static sfc_flow_spec_set_vals sfc_flow_set_ethertypes;
+
 static boolean_t
 sfc_flow_is_zero(const uint8_t *buf, unsigned int size)
 {
@@ -244,16 +259,9 @@ sfc_flow_parse_eth(const struct rte_flow_item *item,
if (rc != 0)
return rc;
 
-   /*
-* If "spec" is not set, could be any Ethernet, but for the inner frame
-* type of destination MAC must be set
-*/
-   if (spec == NULL) {
-   if (is_ifrm)
-   goto fail_bad_ifrm_dst_mac;
-   else
-   return 0;
-   }
+   /* If "spec" is not set, could be any Ethernet */
+   if (spec == NULL)
+   return 0;
 
if (is_same_ether_addr(&mask->dst, &supp_mask.dst)) {
efx_spec->efs_match_flags |= is_ifrm ?
@@ -273,8 +281,6 @@ sfc_flow_parse_eth(const struct rte_flow_item *item,
EFX_FILTER_MATCH_UNKNOWN_MCAST_DST;
} else if (!is_zero_ether_addr(&mask->dst)) {
goto fail_bad_mask;
-   } else if (is_ifrm) {
-   goto fail_bad_ifrm_dst_mac;
}
 
/*
@@ -308,13 +314,6 @@ sfc_flow_parse_eth(const struct rte_flow_item *item,
   RTE_FLOW_ERROR_TYPE_ITEM, item,
   "Bad mask in the ETH pattern item");
return -rte_errno;
-
-fail_bad_ifrm_dst_mac:
-   rte_flow_error_set(error, EINVAL,
-  RTE_FLOW_ERROR_TYPE_ITEM, item,
-  "Type of destination MAC address in inner frame "
-  "must be set");
-   return -rte_errno;
 }
 
 /**
@@ -782,14 +781,9 @@ sfc_flow_set_match_flags_for_encap_pkts(const struct 
rte_flow_item *item,
}
}
 
-   if (!(efx_spec->efs_match_flags & EFX_FILTER_MATCH_ETHER_TYPE)) {
-   rte_flow_error_set(error, EINVAL,
-   RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Outer frame EtherType in pattern with tunneling "
-   "must be set");
-   return -rte_errno;
-   } else if (efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV4 &&
-  efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV6) {
+   if (efx_spec->efs_match_flags & EFX_FILTER_MATCH_ETHER_TYPE &&
+   efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV4 &&
+   efx_spec->efs_ether_type != EFX_ETHER_TYPE_IPV6) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
"Outer frame EtherType in pattern with tunneling "
@@ -1508,6 +1502,246 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
return 0;
 }
 
+/**
+ * Set the EFX_FILTER_MATCH_ETHER_TYPE match flag and EFX_ETHER_TYPE_IPV4 and
+ * EFX_ETHER_TYPE_IPV6 values of the corresponding field in the same
+ * specifications after copying.
+ *
+ * @param spec[in, out]
+ *   SFC flow specification to update.
+ * @param filters_count_for_one_val[in]
+ *   How many specifications should have the same EtherType value, what is the
+ *   number of specifications before copying.
+ * @param error[out]
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_set_ethertypes(struct sfc_flow_spec *spec,
+   unsig
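
The auto-completion idea from the commit message above (multiplying the specifications so that each original one exists once per possible EtherType) can be sketched as follows. The structures and demo_multiply_by_ethertype() are hypothetical simplifications, not the driver's sfc_flow_spec or its helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_FILTERS 8
#define DEMO_ETHER_TYPE_IPV4 0x0800
#define DEMO_ETHER_TYPE_IPV6 0x86dd

/* Hypothetical, simplified filter and flow specification. */
struct demo_filter {
	int match_ether_type;
	uint16_t ether_type;
	uint32_t vni;
};

struct demo_flow_spec {
	unsigned int count;
	struct demo_filter filters[DEMO_MAX_FILTERS];
};

/*
 * Copy the original block of filters once per extra EtherType value, then
 * give each block its own value: the first block IPv4, the second IPv6.
 */
static int
demo_multiply_by_ethertype(struct demo_flow_spec *spec)
{
	static const uint16_t vals[] = {
		DEMO_ETHER_TYPE_IPV4, DEMO_ETHER_TYPE_IPV6
	};
	const unsigned int nvals = sizeof(vals) / sizeof(vals[0]);
	unsigned int orig;
	unsigned int i, j;

	assert(spec != NULL);
	orig = spec->count;
	if (orig * nvals > DEMO_MAX_FILTERS)
		return -1;

	for (i = 1; i < nvals; i++)
		for (j = 0; j < orig; j++)
			spec->filters[i * orig + j] = spec->filters[j];

	for (i = 0; i < nvals; i++) {
		for (j = 0; j < orig; j++) {
			spec->filters[i * orig + j].match_ether_type = 1;
			spec->filters[i * orig + j].ether_type = vals[i];
		}
	}

	spec->count = orig * nvals;
	return 0;
}
```

Starting from two specifications, the result is four: both originals with IPv4, followed by both copies with IPv6.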

[dpdk-dev] [PATCH v2 11/14] net/sfc: multiply of specs w/o inner frame destination MAC

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Knowledge of a network identifier is not sufficient to construct a
workable hardware filter for encapsulated traffic. It's obligatory to
specify one of the match flags associated with inner frame destination
MAC. If the address is unknown, then one needs to specify either unknown
unicast or unknown multicast destination match flag.

In terms of RTE flow API, this would require adding multiple flow rules
with corresponding ETH items besides the tunnel item. In order to avoid
such a complication, the patch implements a mechanism to auto-complete
an underlying filter representation of a flow rule in order to create
additional filter specifications featuring the missing match flags.
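
For readers following the thread, the auto-completion step can be sketched
roughly as below. This is a minimal, self-contained illustration and not the
driver code: the demo_* names, the flag values and the fixed-size array are
assumptions made for the example. The idea shown is that the existing block
of specifications is copied once per missing value, and specification i
receives vals[i / per_val].

```c
#include <stdint.h>

/* Illustrative flag values (assumed for this sketch, not taken from efx.h). */
#define DEMO_IFRM_UNKNOWN_UCAST_DST 0x1u
#define DEMO_IFRM_UNKNOWN_MCAST_DST 0x2u
#define DEMO_MAX_SPECS 8

/* Toy stand-ins for the filter spec and the per-flow list of specs. */
struct demo_spec {
	uint32_t match_flags;
};

struct demo_flow_spec {
	struct demo_spec filters[DEMO_MAX_SPECS];
	unsigned int count;
};

/*
 * Duplicate the existing specs so that there is one block per value,
 * then give every spec in block k the flag vals[k].
 * Returns 0 on success, -1 if the list is empty or there is no room.
 */
int
demo_multiply_by_unknown_dst(struct demo_flow_spec *fs)
{
	static const uint32_t vals[] = {
		DEMO_IFRM_UNKNOWN_UCAST_DST,
		DEMO_IFRM_UNKNOWN_MCAST_DST
	};
	const unsigned int nvals = sizeof(vals) / sizeof(vals[0]);
	const unsigned int per_val = fs->count;
	unsigned int i;

	if (per_val == 0 || per_val * nvals > DEMO_MAX_SPECS)
		return -1;

	/* Copy the original block behind itself. */
	for (i = per_val; i < per_val * nvals; i++)
		fs->filters[i] = fs->filters[i % per_val];
	fs->count = per_val * nvals;

	/* per_val is non-zero here, so the division is safe. */
	for (i = 0; i < fs->count; i++)
		fs->filters[i].match_flags |= vals[i / per_val];

	return 0;
}
```

Starting from two specifications, the sketch ends with four: the first two
carry the unknown unicast flag and the copies carry the unknown multicast one.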

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
---
 drivers/net/sfc/sfc_flow.c | 114 -
 drivers/net/sfc/sfc_flow.h |   2 +-
 2 files changed, 113 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 244fcdb..2d45827 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -68,6 +68,10 @@ typedef int (sfc_flow_spec_set_vals)(struct sfc_flow_spec 
*spec,
 unsigned int filters_count_for_one_val,
 struct rte_flow_error *error);
 
+typedef boolean_t (sfc_flow_spec_check)(efx_filter_match_flags_t match,
+   efx_filter_spec_t *spec,
+   struct sfc_filter *filter);
+
 struct sfc_flow_copy_flag {
/* EFX filter specification match flag */
efx_filter_match_flags_t flag;
@@ -75,9 +79,16 @@ struct sfc_flow_copy_flag {
unsigned int vals_count;
/* Function to set values in specifications */
sfc_flow_spec_set_vals *set_vals;
+   /*
+* Function to check that the specification is suitable
+* for adding this match flag
+*/
+   sfc_flow_spec_check *spec_check;
 };
 
 static sfc_flow_spec_set_vals sfc_flow_set_ethertypes;
+static sfc_flow_spec_set_vals sfc_flow_set_ifrm_unknown_dst_flags;
+static sfc_flow_spec_check sfc_flow_check_ifrm_unknown_dst_flags;
 
 static boolean_t
 sfc_flow_is_zero(const uint8_t *buf, unsigned int size)
@@ -1548,12 +1559,98 @@ sfc_flow_set_ethertypes(struct sfc_flow_spec *spec,
return 0;
 }
 
+/**
+ * Set the EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST and
+ * EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST match flags in the same
+ * specifications after copying.
+ *
+ * @param spec[in, out]
+ *   SFC flow specification to update.
+ * @param filters_count_for_one_val[in]
+ *   How many specifications should have the same match flag, what is the
+ *   number of specifications before copying.
+ * @param error[out]
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_set_ifrm_unknown_dst_flags(struct sfc_flow_spec *spec,
+   unsigned int filters_count_for_one_val,
+   struct rte_flow_error *error)
+{
+   unsigned int i;
+   static const efx_filter_match_flags_t vals[] = {
+   EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST,
+   EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST
+   };
+
+   if (filters_count_for_one_val * RTE_DIM(vals) != spec->count) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+   "Number of specifications is incorrect while copying "
+   "by inner frame unknown destination flags");
+   return -rte_errno;
+   }
+
+   for (i = 0; i < spec->count; i++) {
+   /* The check above ensures that divisor can't be zero here */
+   spec->filters[i].efs_match_flags |=
+   vals[i / filters_count_for_one_val];
+   }
+
+   return 0;
+}
+
+/**
+ * Check that the following conditions are met:
+ * - the specification corresponds to a filter for encapsulated traffic
+ * - the list of supported filters has a filter
+ *   with EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST flag instead of
+ *   EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST, since this filter will also
+ *   be inserted.
+ *
+ * @param match[in]
+ *   The match flags of filter.
+ * @param spec[in]
+ *   Specification to be supplemented.
+ * @param filter[in]
+ *   SFC filter with list of supported filters.
+ */
+static boolean_t
+sfc_flow_check_ifrm_unknown_dst_flags(efx_filter_match_flags_t match,
+ efx_filter_spec_t *spec,
+ struct sfc_filter *filter)
+{
+   unsigned int i;
+   efx_tunnel_protocol_t encap_type = spec->efs_encap_type;
+   efx_filter_match_flags_t match_mcast_dst;
+
+   if (encap_type == EFX_TUNNEL_PROTOCOL_NONE)
+   return B_FALSE;
+
+   match_mcast_dst =
+   (match & ~EFX_FI

[dpdk-dev] [PATCH v2 04/14] net/sfc/base: distinguish filters for encapsulated packets

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Add filter match flag to distinguish filters applied only to
encapsulated packets.

The set of match flags should make it possible to determine
whether a filter is supported. The problem is that if a
specification has a supported set of outer match flags and
specifies encapsulation without any inner flags, the check
reports it as supported and the filter is inserted. However,
the encapsulated traffic is not filtered. A new flag is added
to solve this problem and to keep the filters for
encapsulated packets separate.
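
A rough sketch of the idea, for illustration only: the demo_* names and the
flag values are assumptions, and the exact-equality lookup is a simplification
of how a caller might consult the supported list. The marker flag lives only
in the driver's copy of the match flags and is masked off before the flags
are handed to firmware.

```c
#include <stdint.h>

/* Illustrative values (assumed for this sketch). */
#define DEMO_MATCH_ETHER_TYPE 0x00000040u
#define DEMO_MATCH_ENCAP_TYPE 0x20000000u /* software-only marker */

/* Flags kept in the driver's spec once an encapsulation type is set. */
uint32_t
demo_spec_flags_for_encap(uint32_t outer_flags)
{
	return outer_flags | DEMO_MATCH_ENCAP_TYPE;
}

/* Flags sent to firmware: the software-only marker is removed. */
uint32_t
demo_fw_match_fields(uint32_t spec_flags)
{
	return spec_flags & ~DEMO_MATCH_ENCAP_TYPE;
}

/* A spec counts as supported only if its exact flag set is in the list. */
int
demo_is_supported(uint32_t spec_flags, const uint32_t *list, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (list[i] == spec_flags)
			return 1;
	}
	return 0;
}
```

With a supported list that contains only the outer EtherType match, a
specification that also names an encapsulation no longer passes the check,
because its flag set now differs by the marker bit.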

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
Reviewed-by: Mark Spender 
---
 drivers/net/sfc/base/ef10_filter.c | 19 +--
 drivers/net/sfc/base/efx.h |  5 +
 drivers/net/sfc/base/efx_filter.c  |  3 ++-
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sfc/base/ef10_filter.c 
b/drivers/net/sfc/base/ef10_filter.c
index e93dc13..a627cce 100644
--- a/drivers/net/sfc/base/ef10_filter.c
+++ b/drivers/net/sfc/base/ef10_filter.c
@@ -174,6 +174,7 @@ efx_mcdi_filter_op_add(
efx_mcdi_req_t req;
uint8_t payload[MAX(MC_CMD_FILTER_OP_EXT_IN_LEN,
MC_CMD_FILTER_OP_EXT_OUT_LEN)];
+   efx_filter_match_flags_t match_flags;
efx_rc_t rc;
 
memset(payload, 0, sizeof (payload));
@@ -183,6 +184,12 @@ efx_mcdi_filter_op_add(
req.emr_out_buf = payload;
req.emr_out_length = MC_CMD_FILTER_OP_EXT_OUT_LEN;
 
+   /*
+* Remove match flag for encapsulated filters that does not correspond
+* to the MCDI match flags
+*/
+   match_flags = spec->efs_match_flags & ~EFX_FILTER_MATCH_ENCAP_TYPE;
+
switch (filter_op) {
case MC_CMD_FILTER_OP_IN_OP_REPLACE:
MCDI_IN_SET_DWORD(req, FILTER_OP_EXT_IN_HANDLE_LO,
@@ -203,7 +210,7 @@ efx_mcdi_filter_op_add(
MCDI_IN_SET_DWORD(req, FILTER_OP_EXT_IN_PORT_ID,
EVB_PORT_ID_ASSIGNED);
MCDI_IN_SET_DWORD(req, FILTER_OP_EXT_IN_MATCH_FIELDS,
-   spec->efs_match_flags);
+   match_flags);
MCDI_IN_SET_DWORD(req, FILTER_OP_EXT_IN_RX_DEST,
MC_CMD_FILTER_OP_EXT_IN_RX_DEST_HOST);
MCDI_IN_SET_DWORD(req, FILTER_OP_EXT_IN_RX_QUEUE,
@@ -1008,13 +1015,17 @@ ef10_filter_supported_filters(
EFX_FILTER_MATCH_IFRM_LOC_MAC |
EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST |
EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST |
+   EFX_FILTER_MATCH_ENCAP_TYPE |
EFX_FILTER_MATCH_UNKNOWN_MCAST_DST |
EFX_FILTER_MATCH_UNKNOWN_UCAST_DST);
 
/*
 * Two calls to MC_CMD_GET_PARSER_DISP_INFO are needed: one to get the
 * list of supported filters for ordinary packets, and then another to
-* get the list of supported filters for encapsulated packets.
+* get the list of supported filters for encapsulated packets. To
+* distinguish the second list from the first, the
+* EFX_FILTER_MATCH_ENCAP_TYPE flag is added to each filter for
+* encapsulated packets.
 */
rc = efx_mcdi_get_parser_disp_info(enp, buffer, buffer_length, B_FALSE,
&mcdi_list_length);
@@ -1042,6 +1053,10 @@ ef10_filter_supported_filters(
no_space = B_TRUE;
else
goto fail2;
+   } else {
+   for (i = next_buf_idx;
+   i < next_buf_idx + mcdi_encap_list_length; i++)
+   buffer[i] |= EFX_FILTER_MATCH_ENCAP_TYPE;
}
} else {
mcdi_encap_list_length = 0;
diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h
index e2f49ec..bb903e5 100644
--- a/drivers/net/sfc/base/efx.h
+++ b/drivers/net/sfc/base/efx.h
@@ -2485,6 +2485,11 @@ typedef uint8_t efx_filter_flags_t;
 #define	EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST	0x01000000
 /* For encapsulated packets, match all unicast inner frames */
 #define	EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST	0x02000000
+/*
+ * Match by encap type, this flag does not correspond to
+ * the MCDI match flags and any unoccupied value may be used
+ */
+#define	EFX_FILTER_MATCH_ENCAP_TYPE		0x20000000
 /* Match otherwise-unmatched multicast and broadcast packets */
 #define	EFX_FILTER_MATCH_UNKNOWN_MCAST_DST	0x40000000
 /* Match otherwise-unmatched unicast packets */
diff --git a/drivers/net/sfc/base/efx_filter.c 
b/drivers/net/sfc/base/efx_filter.c
index 2e6628b..97c972c 100644
--- a/drivers/net/sfc/base/efx_filter.c
+++ b/drivers/net/sfc/base/efx_filter.c
@@ -418,7 +418,7 @@ efx_filter_spec_set_encap_type(
__inefx_tunnel_protocol_t encap_type,
__inefx_filter_inner_frame_match_t inner_frame_match)
 {
-   uint32_t match_flags = 0;
+   uint32_t match_flags = EFX_FILTER_MATCH_ENCAP_TYPE;

[dpdk-dev] [PATCH v2 02/14] net/sfc/base: support VNI/VSID and inner frame local MAC

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Support matching on the VNI/VSID and inner frame local MAC
fields in VXLAN, GENEVE, or NVGRE packets.
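
As a small illustration of the 3-byte VNI/VSID field: the sketch below packs
a 24-bit identifier into a byte array and compares two such arrays with
memcmp(), in the spirit of the equality check in the patch. The demo_* names
are hypothetical, and the most-significant-byte-first layout is an assumption
of this example rather than something the patch states.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_VNI_OR_VSID_LEN 3

/* Pack the low 24 bits of vni into three bytes, most significant first. */
void
demo_vni_to_bytes(uint32_t vni, uint8_t out[DEMO_VNI_OR_VSID_LEN])
{
	out[0] = (uint8_t)((vni >> 16) & 0xff);
	out[1] = (uint8_t)((vni >> 8) & 0xff);
	out[2] = (uint8_t)(vni & 0xff);
}

/* Two identifiers are equal when all three bytes are equal. */
int
demo_same_vni(const uint8_t *a, const uint8_t *b)
{
	return memcmp(a, b, DEMO_VNI_OR_VSID_LEN) == 0;
}
```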

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
 drivers/net/sfc/base/ef10_filter.c | 18 ++
 drivers/net/sfc/base/efx.h |  8 
 2 files changed, 26 insertions(+)

diff --git a/drivers/net/sfc/base/ef10_filter.c 
b/drivers/net/sfc/base/ef10_filter.c
index 8a6bc61..e93dc13 100644
--- a/drivers/net/sfc/base/ef10_filter.c
+++ b/drivers/net/sfc/base/ef10_filter.c
@@ -119,6 +119,10 @@ ef10_filter_init(
MATCH_MASK(MC_CMD_FILTER_OP_EXT_IN_MATCH_OUTER_VLAN));
EFX_STATIC_ASSERT(EFX_FILTER_MATCH_IP_PROTO ==
MATCH_MASK(MC_CMD_FILTER_OP_EXT_IN_MATCH_IP_PROTO));
+   EFX_STATIC_ASSERT(EFX_FILTER_MATCH_VNI_OR_VSID ==
+   MATCH_MASK(MC_CMD_FILTER_OP_EXT_IN_MATCH_VNI_OR_VSID));
+   EFX_STATIC_ASSERT(EFX_FILTER_MATCH_IFRM_LOC_MAC ==
+   MATCH_MASK(MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_DST_MAC));
EFX_STATIC_ASSERT(EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST ==
MATCH_MASK(MC_CMD_FILTER_OP_EXT_IN_MATCH_IFRM_UNKNOWN_MCAST_DST));
EFX_STATIC_ASSERT(EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST ==
@@ -292,6 +296,12 @@ efx_mcdi_filter_op_add(
rc = EINVAL;
goto fail2;
}
+
+   memcpy(MCDI_IN2(req, uint8_t, FILTER_OP_EXT_IN_VNI_OR_VSID),
+   spec->efs_vni_or_vsid, EFX_VNI_OR_VSID_LEN);
+
+   memcpy(MCDI_IN2(req, uint8_t, FILTER_OP_EXT_IN_IFRM_DST_MAC),
+   spec->efs_ifrm_loc_mac, EFX_MAC_ADDR_LEN);
}
 
efx_mcdi_execute(enp, &req);
@@ -415,6 +425,12 @@ ef10_filter_equal(
return (B_FALSE);
if (left->efs_encap_type != right->efs_encap_type)
return (B_FALSE);
+   if (memcmp(left->efs_vni_or_vsid, right->efs_vni_or_vsid,
+   EFX_VNI_OR_VSID_LEN))
+   return (B_FALSE);
+   if (memcmp(left->efs_ifrm_loc_mac, right->efs_ifrm_loc_mac,
+   EFX_MAC_ADDR_LEN))
+   return (B_FALSE);
 
return (B_TRUE);
 
@@ -988,6 +1004,8 @@ ef10_filter_supported_filters(
EFX_FILTER_MATCH_LOC_MAC | EFX_FILTER_MATCH_LOC_PORT |
EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_INNER_VID |
EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_IP_PROTO |
+   EFX_FILTER_MATCH_VNI_OR_VSID |
+   EFX_FILTER_MATCH_IFRM_LOC_MAC |
EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST |
EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST |
EFX_FILTER_MATCH_UNKNOWN_MCAST_DST |
diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h
index 088a896..8380d0a 100644
--- a/drivers/net/sfc/base/efx.h
+++ b/drivers/net/sfc/base/efx.h
@@ -454,6 +454,8 @@ typedef enum efx_link_mode_e {
 
 #defineEFX_MAC_ADDR_LEN 6
 
+#defineEFX_VNI_OR_VSID_LEN 3
+
 #defineEFX_MAC_ADDR_IS_MULTICAST(_address) (((uint8_t *)_address)[0] & 
0x01)
 
 #defineEFX_MAC_MULTICAST_LIST_MAX  256
@@ -2475,6 +2477,10 @@ typedef uint8_t efx_filter_flags_t;
 #define	EFX_FILTER_MATCH_OUTER_VID		0x00000100
 /* Match by IP transport protocol */
 #define	EFX_FILTER_MATCH_IP_PROTO		0x00000200
+/* Match by VNI or VSID */
+#define	EFX_FILTER_MATCH_VNI_OR_VSID		0x00000800
+/* For encapsulated packets, match by inner frame local MAC address */
+#define	EFX_FILTER_MATCH_IFRM_LOC_MAC		0x00010000
 /* For encapsulated packets, match all multicast inner frames */
 #define	EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST	0x01000000
 /* For encapsulated packets, match all unicast inner frames */
@@ -2521,6 +2527,8 @@ typedef struct efx_filter_spec_s {
uint16_tefs_rem_port;
efx_oword_t efs_rem_host;
efx_oword_t efs_loc_host;
+   uint8_t efs_vni_or_vsid[EFX_VNI_OR_VSID_LEN];
+   uint8_t efs_ifrm_loc_mac[EFX_MAC_ADDR_LEN];
 } efx_filter_spec_t;
 
 
-- 
2.7.4



[dpdk-dev] [PATCH v2 00/14] net/sfc: support flow API for tunnels

2018-03-06 Thread Andrew Rybchenko
Update base driver and the PMD itself to support flow API
patterns for tunnels: VXLAN, NVGRE and Geneve.

Applicable to SFN8xxx NICs with full-feature firmware variant running.

Andrew Rybchenko (1):
  doc: add net/sfc flow API support for tunnels

Roman Zhukov (12):
  net/sfc/base: support filters for encapsulated packets
  net/sfc/base: support VNI/VSID and inner frame local MAC
  net/sfc/base: distinguish filters for encapsulated packets
  net/sfc: add VXLAN in flow API filters support
  net/sfc: add NVGRE in flow API filters support
  net/sfc: add GENEVE in flow API filters support
  net/sfc: add inner frame ETH in flow API filters support
  net/sfc: add infrastructure to make many filters from flow
  net/sfc: multiply of specs with an unknown EtherType
  net/sfc: multiply of specs w/o inner frame destination MAC
  net/sfc: multiply of specs with an unknown destination MAC
  net/sfc: avoid creation of ineffective flow rules

Vijay Srivastava (1):
  net/sfc/base: support VXLAN filter creation

 doc/guides/nics/sfc_efx.rst|   28 +-
 doc/guides/rel_notes/release_18_05.rst |6 +
 drivers/net/sfc/base/ef10_filter.c |  100 +++-
 drivers/net/sfc/base/efx.h |   20 +
 drivers/net/sfc/base/efx_filter.c  |   39 +-
 drivers/net/sfc/sfc_flow.c | 1001 ++--
 drivers/net/sfc/sfc_flow.h |   19 +-
 7 files changed, 1161 insertions(+), 52 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH v2 01/14] net/sfc/base: support filters for encapsulated packets

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

This adds filters for encapsulated packets to the list
returned by ef10_filter_supported_filters().
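
The buffer handling can be sketched as follows. This is a simplified,
self-contained model and not the base driver code: demo_query_fn,
demo_fake_query and demo_get_both_lists are hypothetical names, and the fake
query stands in for the firmware call. It shows the second list being written
directly after the first in the same buffer, and ENOSPC being reported
together with the total length that would be needed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical query callback: stores the number of entries it has in
 * *countp, writes them to buf if they fit, and returns 0 or ENOSPC.
 */
typedef int (demo_query_fn)(int encap, uint32_t *buf, size_t len,
			    size_t *countp);

/* Stand-in for the firmware: 3 ordinary matches, 2 encapsulated ones. */
int
demo_fake_query(int encap, uint32_t *buf, size_t len, size_t *countp)
{
	size_t n = encap ? 2 : 3;
	size_t i;

	*countp = n;
	if (len < n)
		return ENOSPC;
	for (i = 0; i < n; i++)
		buf[i] = (encap ? 0x100u : 0x1u) + (uint32_t)i;
	return 0;
}

int
demo_get_both_lists(demo_query_fn *query, int encap_supported,
		    uint32_t *buf, size_t len, size_t *totalp)
{
	size_t n_plain = 0;
	size_t n_encap = 0;
	size_t next_idx;
	size_t next_len;
	int no_space = 0;
	int rc;

	rc = query(0, buf, len, &n_plain);
	if (rc == ENOSPC)
		no_space = 1;
	else if (rc != 0)
		return rc;

	if (no_space) {
		/* Nothing usable was written; only ask for the length. */
		next_idx = 0;
		next_len = 0;
	} else {
		/* Append the second list right after the first one. */
		next_idx = n_plain;
		next_len = len - n_plain;
	}

	if (encap_supported) {
		rc = query(1, &buf[next_idx], next_len, &n_encap);
		if (rc == ENOSPC)
			no_space = 1;
		else if (rc != 0)
			return rc;
	}

	*totalp = n_plain + n_encap;
	return no_space ? ENOSPC : 0;
}
```

A caller that receives ENOSPC can retry with a buffer of at least *totalp
entries.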

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
---
v2:
 - fix assertion

 drivers/net/sfc/base/ef10_filter.c | 65 --
 1 file changed, 55 insertions(+), 10 deletions(-)

diff --git a/drivers/net/sfc/base/ef10_filter.c 
b/drivers/net/sfc/base/ef10_filter.c
index 2b7a09c..8a6bc61 100644
--- a/drivers/net/sfc/base/ef10_filter.c
+++ b/drivers/net/sfc/base/ef10_filter.c
@@ -895,6 +895,7 @@ efx_mcdi_get_parser_disp_info(
__inefx_nic_t *enp,
__out_ecount(buffer_length) uint32_t *buffer,
__insize_t buffer_length,
+   __inboolean_t encap,
__out   size_t *list_lengthp)
 {
efx_mcdi_req_t req;
@@ -911,7 +912,8 @@ efx_mcdi_get_parser_disp_info(
req.emr_out_buf = payload;
req.emr_out_length = MC_CMD_GET_PARSER_DISP_INFO_OUT_LENMAX;
 
-   MCDI_IN_SET_DWORD(req, GET_PARSER_DISP_INFO_OUT_OP,
+   MCDI_IN_SET_DWORD(req, GET_PARSER_DISP_INFO_OUT_OP, encap ?
+   MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_ENCAP_RX_MATCHES :
MC_CMD_GET_PARSER_DISP_INFO_IN_OP_GET_SUPPORTED_RX_MATCHES);
 
efx_mcdi_execute(enp, &req);
@@ -971,28 +973,66 @@ ef10_filter_supported_filters(
__insize_t buffer_length,
__out   size_t *list_lengthp)
 {
-
+   efx_nic_cfg_t *encp = &(enp->en_nic_cfg);
size_t mcdi_list_length;
+   size_t mcdi_encap_list_length;
size_t list_length;
uint32_t i;
+   uint32_t next_buf_idx;
+   size_t next_buf_length;
efx_rc_t rc;
+   boolean_t no_space = B_FALSE;
efx_filter_match_flags_t all_filter_flags =
(EFX_FILTER_MATCH_REM_HOST | EFX_FILTER_MATCH_LOC_HOST |
EFX_FILTER_MATCH_REM_MAC | EFX_FILTER_MATCH_REM_PORT |
EFX_FILTER_MATCH_LOC_MAC | EFX_FILTER_MATCH_LOC_PORT |
EFX_FILTER_MATCH_ETHER_TYPE | EFX_FILTER_MATCH_INNER_VID |
EFX_FILTER_MATCH_OUTER_VID | EFX_FILTER_MATCH_IP_PROTO |
+   EFX_FILTER_MATCH_IFRM_UNKNOWN_MCAST_DST |
+   EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST |
EFX_FILTER_MATCH_UNKNOWN_MCAST_DST |
EFX_FILTER_MATCH_UNKNOWN_UCAST_DST);
 
-   rc = efx_mcdi_get_parser_disp_info(enp, buffer, buffer_length,
-   &mcdi_list_length);
+   /*
+* Two calls to MC_CMD_GET_PARSER_DISP_INFO are needed: one to get the
+* list of supported filters for ordinary packets, and then another to
+* get the list of supported filters for encapsulated packets.
+*/
+   rc = efx_mcdi_get_parser_disp_info(enp, buffer, buffer_length, B_FALSE,
+   &mcdi_list_length);
if (rc != 0) {
-   if (rc == ENOSPC) {
-   /* Pass through mcdi_list_length for the list length */
-   *list_lengthp = mcdi_list_length;
+   if (rc == ENOSPC)
+   no_space = B_TRUE;
+   else
+   goto fail1;
+   }
+
+   if (no_space) {
+   next_buf_idx = 0;
+   next_buf_length = 0;
+   } else {
+   EFSYS_ASSERT(mcdi_list_length <= buffer_length);
+   next_buf_idx = mcdi_list_length;
+   next_buf_length = buffer_length - mcdi_list_length;
+   }
+
+   if (encp->enc_tunnel_encapsulations_supported != 0) {
+   rc = efx_mcdi_get_parser_disp_info(enp, &buffer[next_buf_idx],
+   next_buf_length, B_TRUE, &mcdi_encap_list_length);
+   if (rc != 0) {
+   if (rc == ENOSPC)
+   no_space = B_TRUE;
+   else
+   goto fail2;
}
-   goto fail1;
+   } else {
+   mcdi_encap_list_length = 0;
+   }
+
+   if (no_space) {
+   *list_lengthp = mcdi_list_length + mcdi_encap_list_length;
+   rc = ENOSPC;
+   goto fail3;
}
 
/*
@@ -1005,9 +1045,10 @@ ef10_filter_supported_filters(
 * of the matches is preserved as they are ordered from highest to
 * lowest priority.
 */
-   EFSYS_ASSERT(mcdi_list_length <= buffer_length);
+   EFSYS_ASSERT(mcdi_list_length + mcdi_encap_list_length <=
+   buffer_length);
list_length = 0;
-   for (i = 0; i < mcdi_list_length; i++) {
+   for (i = 0; i < mcdi_list_length + mcdi_encap_list_length; i++) {
if ((buffer[i] & ~all_filter_flags) == 0) {
buffer[list_length] = buffer[i];
list_length++;
@@ -1018,6 +1059,10 @@ 

[dpdk-dev] [PATCH v2 03/14] net/sfc/base: support VXLAN filter creation

2018-03-06 Thread Andrew Rybchenko
From: Vijay Srivastava 

Signed-off-by: Vijay Srivastava 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/sfc/base/efx.h|  7 +++
 drivers/net/sfc/base/efx_filter.c | 36 
 2 files changed, 43 insertions(+)

diff --git a/drivers/net/sfc/base/efx.h b/drivers/net/sfc/base/efx.h
index 8380d0a..e2f49ec 100644
--- a/drivers/net/sfc/base/efx.h
+++ b/drivers/net/sfc/base/efx.h
@@ -2624,6 +2624,13 @@ efx_filter_spec_set_encap_type(
__inefx_tunnel_protocol_t encap_type,
__inefx_filter_inner_frame_match_t inner_frame_match);
 
+extern __checkReturn   efx_rc_t
+efx_filter_spec_set_vxlan_full(
+   __inout efx_filter_spec_t *spec,
+   __inconst uint8_t *vxlan_id,
+   __inconst uint8_t *inner_addr,
+   __inconst uint8_t *outer_addr);
+
 #if EFSYS_OPT_RX_SCALE
 extern __checkReturn   efx_rc_t
 efx_filter_spec_set_rss_context(
diff --git a/drivers/net/sfc/base/efx_filter.c 
b/drivers/net/sfc/base/efx_filter.c
index 8705369..2e6628b 100644
--- a/drivers/net/sfc/base/efx_filter.c
+++ b/drivers/net/sfc/base/efx_filter.c
@@ -468,6 +468,42 @@ efx_filter_spec_set_encap_type(
return (rc);
 }
 
+/*
+ * Specify inner and outer Ethernet address and VXLAN ID in filter
+ * specification.
+ */
+   __checkReturn   efx_rc_t
+efx_filter_spec_set_vxlan_full(
+   __inout efx_filter_spec_t *spec,
+   __inconst uint8_t *vxlan_id,
+   __inconst uint8_t *inner_addr,
+   __inconst uint8_t *outer_addr)
+{
+   EFSYS_ASSERT3P(spec, !=, NULL);
+   EFSYS_ASSERT3P(vxlan_id, !=, NULL);
+   EFSYS_ASSERT3P(inner_addr, !=, NULL);
+   EFSYS_ASSERT3P(outer_addr, !=, NULL);
+
+   if ((inner_addr == NULL) && (outer_addr == NULL))
+   return (EINVAL);
+
+   if (vxlan_id != NULL) {
+   spec->efs_match_flags |= EFX_FILTER_MATCH_VNI_OR_VSID;
+   memcpy(spec->efs_vni_or_vsid, vxlan_id, EFX_VNI_OR_VSID_LEN);
+   }
+   if (outer_addr != NULL) {
+   spec->efs_match_flags |= EFX_FILTER_MATCH_LOC_MAC;
+   memcpy(spec->efs_loc_mac, outer_addr, EFX_MAC_ADDR_LEN);
+   }
+   if (inner_addr != NULL) {
+   spec->efs_match_flags |= EFX_FILTER_MATCH_IFRM_LOC_MAC;
+   memcpy(spec->efs_ifrm_loc_mac, inner_addr, EFX_MAC_ADDR_LEN);
+   }
+   spec->efs_encap_type = EFX_TUNNEL_PROTOCOL_VXLAN;
+
+   return (0);
+}
+
 #if EFSYS_OPT_RX_SCALE
__checkReturn   efx_rc_t
 efx_filter_spec_set_rss_context(
-- 
2.7.4



[dpdk-dev] [PATCH v2 06/14] net/sfc: add NVGRE in flow API filters support

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Exact match of virtual subnet ID is supported by parser.
IP protocol match is enforced to GRE.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
Reviewed-by: Andy Moreton 
---
 doc/guides/nics/sfc_efx.rst |  2 ++
 drivers/net/sfc/sfc_flow.c  | 68 -
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 5a4b2a6..05dacb3 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -168,6 +168,8 @@ Supported pattern items:
 
 - VXLAN (exact match of VXLAN network identifier)
 
+- NVGRE (exact match of virtual subnet ID)
+
 Supported actions:
 
 - VOID
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 20ba69d..126ec9b 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -58,6 +58,7 @@ static sfc_flow_item_parse sfc_flow_parse_ipv6;
 static sfc_flow_item_parse sfc_flow_parse_tcp;
 static sfc_flow_item_parse sfc_flow_parse_udp;
 static sfc_flow_item_parse sfc_flow_parse_vxlan;
+static sfc_flow_item_parse sfc_flow_parse_nvgre;
 
 static boolean_t
 sfc_flow_is_zero(const uint8_t *buf, unsigned int size)
@@ -719,10 +720,17 @@ sfc_flow_set_match_flags_for_encap_pkts(const struct 
rte_flow_item *item,
"in VxLAN pattern");
return -rte_errno;
 
+   case EFX_IPPROTO_GRE:
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM, item,
+   "Outer IP header protocol must be GRE "
+   "in NVGRE pattern");
+   return -rte_errno;
+
default:
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM, item,
-   "Only VxLAN tunneling patterns "
+   "Only VxLAN/NVGRE tunneling patterns "
"are supported");
return -rte_errno;
}
@@ -823,6 +831,57 @@ sfc_flow_parse_vxlan(const struct rte_flow_item *item,
return rc;
 }
 
+/**
+ * Convert NVGRE item to EFX filter specification.
+ *
+ * @param item[in]
+ *   Item specification. Only virtual subnet ID field is supported.
+ *   If the mask is NULL, default mask will be used.
+ *   Ranging is not supported.
+ * @param efx_spec[in, out]
+ *   EFX filter specification to update.
+ * @param[out] error
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_parse_nvgre(const struct rte_flow_item *item,
+efx_filter_spec_t *efx_spec,
+struct rte_flow_error *error)
+{
+   int rc;
+   const struct rte_flow_item_nvgre *spec = NULL;
+   const struct rte_flow_item_nvgre *mask = NULL;
+   const struct rte_flow_item_nvgre supp_mask = {
+   .tni = { 0xff, 0xff, 0xff }
+   };
+
+   rc = sfc_flow_parse_init(item,
+(const void **)&spec,
+(const void **)&mask,
+&supp_mask,
+&rte_flow_item_nvgre_mask,
+sizeof(struct rte_flow_item_nvgre),
+error);
+   if (rc != 0)
+   return rc;
+
+   rc = sfc_flow_set_match_flags_for_encap_pkts(item, efx_spec,
+EFX_IPPROTO_GRE, error);
+   if (rc != 0)
+   return rc;
+
+   efx_spec->efs_encap_type = EFX_TUNNEL_PROTOCOL_NVGRE;
+   efx_spec->efs_match_flags |= EFX_FILTER_MATCH_ENCAP_TYPE;
+
+   if (spec == NULL)
+   return 0;
+
+   rc = sfc_flow_set_efx_spec_vni_or_vsid(efx_spec, spec->tni,
+  mask->tni, item, error);
+
+   return rc;
+}
+
 static const struct sfc_flow_item sfc_flow_items[] = {
{
.type = RTE_FLOW_ITEM_TYPE_VOID,
@@ -872,6 +931,12 @@ static const struct sfc_flow_item sfc_flow_items[] = {
.layer = SFC_FLOW_ITEM_START_LAYER,
.parse = sfc_flow_parse_vxlan,
},
+   {
+   .type = RTE_FLOW_ITEM_TYPE_NVGRE,
+   .prev_layer = SFC_FLOW_ITEM_L3,
+   .layer = SFC_FLOW_ITEM_START_LAYER,
+   .parse = sfc_flow_parse_nvgre,
+   },
 };
 
 /*
@@ -980,6 +1045,7 @@ sfc_flow_parse_pattern(const struct rte_flow_item 
pattern[],
break;
 
case RTE_FLOW_ITEM_TYPE_VXLAN:
+   case RTE_FLOW_ITEM_TYPE_NVGRE:
if (is_ifrm) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ITEM,
-- 
2.7.4



[dpdk-dev] [PATCH v2 09/14] net/sfc: add infrastructure to make many filters from flow

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Not all flow rules can be expressed in one hardware filter, so some flow
rules have to be expressed in terms of multiple hardware filters. This
patch provides a means to produce a filter spec template from the flow
rule which then can be used to produce a set of fully elaborated specs
to be inserted.
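
The insert-or-roll-back behaviour that a set of filters needs can be sketched
as below. This is a toy model under stated assumptions, not the PMD code: the
demo_hw table and all demo_* functions are hypothetical, and a negative id is
used to simulate a failed insertion. If filter i fails to insert, the i
filters inserted so far are removed again, so a flow rule is either fully
installed or not installed at all.

```c
#define DEMO_HW_SLOTS 8

/* Toy model of the hardware filter table. */
struct demo_hw {
	int installed[DEMO_HW_SLOTS];
	unsigned int n;
};

int
demo_hw_insert(struct demo_hw *hw, int id)
{
	if (id < 0 || hw->n == DEMO_HW_SLOTS)
		return -1;
	hw->installed[hw->n++] = id;
	return 0;
}

int
demo_hw_remove(struct demo_hw *hw, int id)
{
	unsigned int i;

	for (i = 0; i < hw->n; i++) {
		if (hw->installed[i] == id) {
			/* Fill the hole with the last entry. */
			hw->installed[i] = hw->installed[--hw->n];
			return 0;
		}
	}
	return -1;
}

/* Remove the first count filters; keep going, report the first error. */
int
demo_flush(struct demo_hw *hw, const int *ids, unsigned int count)
{
	unsigned int i;
	int ret = 0;

	for (i = 0; i < count; i++) {
		int rc = demo_hw_remove(hw, ids[i]);

		if (ret == 0 && rc != 0)
			ret = rc;
	}
	return ret;
}

/* Insert all filters, or none of them. */
int
demo_insert_all(struct demo_hw *hw, const int *ids, unsigned int count)
{
	unsigned int i;
	int rc = 0;

	for (i = 0; i < count; i++) {
		rc = demo_hw_insert(hw, ids[i]);
		if (rc != 0) {
			/* Undo the i filters that did go in. */
			demo_flush(hw, ids, i);
			break;
		}
	}
	return rc;
}
```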

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
---
 drivers/net/sfc/sfc_flow.c | 118 -
 drivers/net/sfc/sfc_flow.h |  19 +++-
 2 files changed, 114 insertions(+), 23 deletions(-)

diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index c942a36..a432936 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -25,10 +25,13 @@
 
 /*
  * At now flow API is implemented in such a manner that each
- * flow rule is converted to a hardware filter.
+ * flow rule is converted to one or more hardware filters.
  * All elements of flow rule (attributes, pattern items, actions)
  * correspond to one or more fields in the efx_filter_spec_s structure
  * that is responsible for the hardware filter.
+ * If some required field is unset in the flow rule, then a handful
+ * of filter copies will be created to cover all possible values
+ * of such a field.
  */
 
 enum sfc_flow_item_layers {
@@ -1095,8 +1098,8 @@ sfc_flow_parse_attr(const struct rte_flow_attr *attr,
return -rte_errno;
}
 
-   flow->spec.efs_flags |= EFX_FILTER_FLAG_RX;
-   flow->spec.efs_rss_context = EFX_RSS_CONTEXT_DEFAULT;
+   flow->spec.template.efs_flags |= EFX_FILTER_FLAG_RX;
+   flow->spec.template.efs_rss_context = EFX_RSS_CONTEXT_DEFAULT;
 
return 0;
 }
@@ -1187,7 +1190,7 @@ sfc_flow_parse_pattern(const struct rte_flow_item 
pattern[],
break;
}
 
-   rc = item->parse(pattern, &flow->spec, error);
+   rc = item->parse(pattern, &flow->spec.template, error);
if (rc != 0)
return rc;
 
@@ -1209,7 +1212,7 @@ sfc_flow_parse_queue(struct sfc_adapter *sa,
return -EINVAL;
 
rxq = sa->rxq_info[queue->index].rxq;
-   flow->spec.efs_dmaq_id = (uint16_t)rxq->hw_index;
+   flow->spec.template.efs_dmaq_id = (uint16_t)rxq->hw_index;
 
return 0;
 }
@@ -1285,13 +1288,57 @@ sfc_flow_parse_rss(struct sfc_adapter *sa,
 #endif /* EFSYS_OPT_RX_SCALE */
 
 static int
+sfc_flow_spec_flush(struct sfc_adapter *sa, struct sfc_flow_spec *spec,
+   unsigned int filters_count)
+{
+   unsigned int i;
+   int ret = 0;
+
+   for (i = 0; i < filters_count; i++) {
+   int rc;
+
+   rc = efx_filter_remove(sa->nic, &spec->filters[i]);
+   if (ret == 0 && rc != 0) {
+   sfc_err(sa, "failed to remove filter specification "
+   "(rc = %d)", rc);
+   ret = rc;
+   }
+   }
+
+   return ret;
+}
+
+static int
+sfc_flow_spec_insert(struct sfc_adapter *sa, struct sfc_flow_spec *spec)
+{
+   unsigned int i;
+   int rc = 0;
+
+   for (i = 0; i < spec->count; i++) {
+   rc = efx_filter_insert(sa->nic, &spec->filters[i]);
+   if (rc != 0) {
+   sfc_flow_spec_flush(sa, spec, i);
+   break;
+   }
+   }
+
+   return rc;
+}
+
+static int
+sfc_flow_spec_remove(struct sfc_adapter *sa, struct sfc_flow_spec *spec)
+{
+   return sfc_flow_spec_flush(sa, spec, spec->count);
+}
+
+static int
 sfc_flow_filter_insert(struct sfc_adapter *sa,
   struct rte_flow *flow)
 {
-   efx_filter_spec_t *spec = &flow->spec;
-
 #if EFSYS_OPT_RX_SCALE
struct sfc_flow_rss *rss = &flow->rss_conf;
+   uint32_t efs_rss_context = EFX_RSS_CONTEXT_DEFAULT;
+   unsigned int i;
int rc = 0;
 
if (flow->rss) {
@@ -1302,27 +1349,38 @@ sfc_flow_filter_insert(struct sfc_adapter *sa,
rc = efx_rx_scale_context_alloc(sa->nic,
EFX_RX_SCALE_EXCLUSIVE,
rss_spread,
-   &spec->efs_rss_context);
+   &efs_rss_context);
if (rc != 0)
goto fail_scale_context_alloc;
 
-   rc = efx_rx_scale_mode_set(sa->nic, spec->efs_rss_context,
+   rc = efx_rx_scale_mode_set(sa->nic, efs_rss_context,
   EFX_RX_HASHALG_TOEPLITZ,
   rss->rss_hash_types, B_TRUE);
if (rc != 0)
goto fail_scale_mode_set;
 
-   rc = efx_rx_scale_key_set(sa->nic, spec->efs_rss_context,
+   rc = efx_rx_scale_key_set(sa->nic, efs_rss_context,
 

[dpdk-dev] [PATCH v2 08/14] net/sfc: add inner frame ETH in flow API filters support

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

Support destination MAC address match in inner frames.

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
Reviewed-by: Andy Moreton 
---
 doc/guides/nics/sfc_efx.rst |  4 ++-
 drivers/net/sfc/sfc_flow.c  | 73 +++--
 2 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 943fe55..539ce90 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -152,7 +152,9 @@ Supported pattern items:
 - VOID
 
 - ETH (exact match of source/destination addresses, individual/group match
-  of destination address, EtherType)
+  of destination address, EtherType in the outer frame and exact match of
+  destination addresses, individual/group match of destination address in
+  the inner frame)
 
 - VLAN (exact match of VID, double-tagging is supported)
 
diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index efdc664..c942a36 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -187,11 +187,11 @@ sfc_flow_parse_void(__rte_unused const struct 
rte_flow_item *item,
  * Convert Ethernet item to EFX filter specification.
  *
  * @param item[in]
- *   Item specification. Only source and destination addresses and
- *   Ethernet type fields are supported. In addition to full and
- *   empty masks of destination address, individual/group mask is
- *   also supported. If the mask is NULL, default mask will be used.
- *   Ranging is not supported.
+ *   Item specification. Outer frame specification may only comprise
+ *   source/destination addresses and Ethertype field.
+ *   Inner frame specification may contain destination address only.
+ *   There is support for individual/group mask as well as for empty and full.
+ *   If the mask is NULL, default mask will be used. Ranging is not supported.
  * @param efx_spec[in, out]
  *   EFX filter specification to update.
  * @param[out] error
@@ -210,40 +210,75 @@ sfc_flow_parse_eth(const struct rte_flow_item *item,
.src.addr_bytes = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
.type = 0xffff,
};
+   const struct rte_flow_item_eth ifrm_supp_mask = {
+   .dst.addr_bytes = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+   };
const uint8_t ig_mask[EFX_MAC_ADDR_LEN] = {
0x01, 0x00, 0x00, 0x00, 0x00, 0x00
};
+   const struct rte_flow_item_eth *supp_mask_p;
+   const struct rte_flow_item_eth *def_mask_p;
+   uint8_t *loc_mac = NULL;
+   boolean_t is_ifrm = (efx_spec->efs_encap_type !=
+   EFX_TUNNEL_PROTOCOL_NONE);
+
+   if (is_ifrm) {
+   supp_mask_p = &ifrm_supp_mask;
+   def_mask_p = &ifrm_supp_mask;
+   loc_mac = efx_spec->efs_ifrm_loc_mac;
+   } else {
+   supp_mask_p = &supp_mask;
+   def_mask_p = &rte_flow_item_eth_mask;
+   loc_mac = efx_spec->efs_loc_mac;
+   }
 
rc = sfc_flow_parse_init(item,
 (const void **)&spec,
 (const void **)&mask,
-&supp_mask,
-&rte_flow_item_eth_mask,
+supp_mask_p, def_mask_p,
 sizeof(struct rte_flow_item_eth),
 error);
if (rc != 0)
return rc;
 
-   /* If "spec" is not set, could be any Ethernet */
-   if (spec == NULL)
-   return 0;
+   /*
+* If "spec" is not set, could be any Ethernet, but for the inner frame
+* type of destination MAC must be set
+*/
+   if (spec == NULL) {
+   if (is_ifrm)
+   goto fail_bad_ifrm_dst_mac;
+   else
+   return 0;
+   }
 
if (is_same_ether_addr(&mask->dst, &supp_mask.dst)) {
-   efx_spec->efs_match_flags |= EFX_FILTER_MATCH_LOC_MAC;
-   rte_memcpy(efx_spec->efs_loc_mac, spec->dst.addr_bytes,
+   efx_spec->efs_match_flags |= is_ifrm ?
+   EFX_FILTER_MATCH_IFRM_LOC_MAC :
+   EFX_FILTER_MATCH_LOC_MAC;
+   rte_memcpy(loc_mac, spec->dst.addr_bytes,
   EFX_MAC_ADDR_LEN);
} else if (memcmp(mask->dst.addr_bytes, ig_mask,
  EFX_MAC_ADDR_LEN) == 0) {
if (is_unicast_ether_addr(&spec->dst))
-   efx_spec->efs_match_flags |=
+   efx_spec->efs_match_flags |= is_ifrm ?
+   EFX_FILTER_MATCH_IFRM_UNKNOWN_UCAST_DST :
EFX_FILTER_MATCH_UNKNOWN_UCAST_DST;
else
-   efx_spec->efs_match_flags |=
+   efx_spec->efs_match_flags |= is_ifrm ?
+  

[dpdk-dev] [PATCH v2 14/14] doc: add net/sfc flow API support for tunnels

2018-03-06 Thread Andrew Rybchenko
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/rel_notes/release_18_05.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index 3923dc2..894f636 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -41,6 +41,12 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
+* **Updated Solarflare network PMD.**
+
+  Updated the sfc_efx driver including the following changes:
+
+  * Added support for NVGRE, VXLAN and GENEVE filters in flow API.
+
 
 API Changes
 ---
-- 
2.7.4



[dpdk-dev] [PATCH v2 12/14] net/sfc: multiply of specs with an unknown destination MAC

2018-03-06 Thread Andrew Rybchenko
From: Roman Zhukov 

To filter all traffic, two hardware filter specifications need to be created,
covering both the unknown unicast and the unknown multicast destination MAC
address match flags.

In terms of RTE flow API, this would require adding multiple flow rules
with corresponding ETH items. In order to avoid such a complication, the
patch implements a mechanism to auto-complete an underlying filter
representation of a flow rule in order to create additional filter
specifications featuring the missing match flags.
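
A minimal sketch of the auto-completion idea described above, assuming a
fixed-size filter array. The type names, flag values and MAX_FILTERS are
hypothetical stand-ins, not the sfc/efx definitions: the existing filters are
duplicated once per missing value, and each group of copies gets one flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EFX match flags; values are illustrative. */
#define MATCH_UNKNOWN_UCAST_DST (1u << 0)
#define MATCH_UNKNOWN_MCAST_DST (1u << 1)

#define MAX_FILTERS 8

struct filter {
    uint32_t match_flags;
};

struct flow_spec {
    struct filter filters[MAX_FILTERS];
    unsigned int count;
};

/*
 * Duplicate the existing filters once per value, then tag each group of
 * copies with one of the values. Returns 0 on success, -1 if the spec is
 * empty or the copies do not fit.
 */
static int
spec_expand_unknown_dst(struct flow_spec *spec)
{
    static const uint32_t vals[] = {
        MATCH_UNKNOWN_UCAST_DST,
        MATCH_UNKNOWN_MCAST_DST
    };
    const unsigned int nvals = sizeof(vals) / sizeof(vals[0]);
    unsigned int base;
    unsigned int i;

    assert(spec != NULL);
    base = spec->count;
    if (base == 0 || base * nvals > MAX_FILTERS)
        return -1;

    /* Copy the original block of filters (nvals - 1) more times. */
    for (i = base; i < base * nvals; i++)
        spec->filters[i] = spec->filters[i % base];
    spec->count = base * nvals;

    /* Filters [k * base, (k + 1) * base) receive vals[k]. */
    for (i = 0; i < spec->count; i++)
        spec->filters[i].match_flags |= vals[i / base];

    return 0;
}
```

With one input filter, the result is two filters that differ only in the
unicast/multicast flag, which is the pair the first paragraph asks for.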

Signed-off-by: Roman Zhukov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Ivan Malov 
---
 drivers/net/sfc/sfc_flow.c | 91 +-
 drivers/net/sfc/sfc_flow.h |  2 +-
 2 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/drivers/net/sfc/sfc_flow.c b/drivers/net/sfc/sfc_flow.c
index 2d45827..7b26653 100644
--- a/drivers/net/sfc/sfc_flow.c
+++ b/drivers/net/sfc/sfc_flow.c
@@ -86,6 +86,8 @@ struct sfc_flow_copy_flag {
sfc_flow_spec_check *spec_check;
 };
 
+static sfc_flow_spec_set_vals sfc_flow_set_unknown_dst_flags;
+static sfc_flow_spec_check sfc_flow_check_unknown_dst_flags;
 static sfc_flow_spec_set_vals sfc_flow_set_ethertypes;
 static sfc_flow_spec_set_vals sfc_flow_set_ifrm_unknown_dst_flags;
 static sfc_flow_spec_check sfc_flow_check_ifrm_unknown_dst_flags;
@@ -1514,6 +1516,80 @@ sfc_flow_parse_actions(struct sfc_adapter *sa,
 }
 
 /**
+ * Set the EFX_FILTER_MATCH_UNKNOWN_UCAST_DST
+ * and EFX_FILTER_MATCH_UNKNOWN_MCAST_DST match flags in the same
+ * specifications after copying.
+ *
+ * @param spec[in, out]
+ *   SFC flow specification to update.
+ * @param filters_count_for_one_val[in]
+ *   How many specifications should have the same match flag, what is the
+ *   number of specifications before copying.
+ * @param error[out]
+ *   Perform verbose error reporting if not NULL.
+ */
+static int
+sfc_flow_set_unknown_dst_flags(struct sfc_flow_spec *spec,
+  unsigned int filters_count_for_one_val,
+  struct rte_flow_error *error)
+{
+   unsigned int i;
+   static const efx_filter_match_flags_t vals[] = {
+   EFX_FILTER_MATCH_UNKNOWN_UCAST_DST,
+   EFX_FILTER_MATCH_UNKNOWN_MCAST_DST
+   };
+
+   if (filters_count_for_one_val * RTE_DIM(vals) != spec->count) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+   "Number of specifications is incorrect while copying "
+   "by unknown destination flags");
+   return -rte_errno;
+   }
+
+   for (i = 0; i < spec->count; i++) {
+   /* The check above ensures that divisor can't be zero here */
+   spec->filters[i].efs_match_flags |=
+   vals[i / filters_count_for_one_val];
+   }
+
+   return 0;
+}
+
+/**
+ * Check that the following conditions are met:
+ * - the list of supported filters has a filter
+ *   with EFX_FILTER_MATCH_UNKNOWN_MCAST_DST flag instead of
+ *   EFX_FILTER_MATCH_UNKNOWN_UCAST_DST, since this filter will also
+ *   be inserted.
+ *
+ * @param match[in]
+ *   The match flags of filter.
+ * @param spec[in]
+ *   Specification to be supplemented.
+ * @param filter[in]
+ *   SFC filter with list of supported filters.
+ */
+static boolean_t
+sfc_flow_check_unknown_dst_flags(efx_filter_match_flags_t match,
+__rte_unused efx_filter_spec_t *spec,
+struct sfc_filter *filter)
+{
+   unsigned int i;
+   efx_filter_match_flags_t match_mcast_dst;
+
+   match_mcast_dst =
+   (match & ~EFX_FILTER_MATCH_UNKNOWN_UCAST_DST) |
+   EFX_FILTER_MATCH_UNKNOWN_MCAST_DST;
+   for (i = 0; i < filter->supported_match_num; i++) {
+   if (match_mcast_dst == filter->supported_match[i])
+   return B_TRUE;
+   }
+
+   return B_FALSE;
+}
+
+/**
  * Set the EFX_FILTER_MATCH_ETHER_TYPE match flag and EFX_ETHER_TYPE_IPV4 and
  * EFX_ETHER_TYPE_IPV6 values of the corresponding field in the same
  * specifications after copying.
@@ -1638,9 +1714,22 @@ 
sfc_flow_check_ifrm_unknown_dst_flags(efx_filter_match_flags_t match,
return B_FALSE;
 }
 
-/* Match flags that can be automatically added to filters */
+/*
+ * Match flags that can be automatically added to filters.
+ * Selecting the last minimum when searching for the copy flag ensures that the
+ * EFX_FILTER_MATCH_UNKNOWN_UCAST_DST flag has a higher priority than
+ * EFX_FILTER_MATCH_ETHER_TYPE. This is because the filter
+ * EFX_FILTER_MATCH_UNKNOWN_UCAST_DST is at the end of the list of supported
+ * filters.
+ */
 static const struct sfc_flow_copy_flag sfc_flow_copy_flags[] = {
{
+   .flag = EFX_FILTER_MATCH_UNKNOWN_UCAST_DST,
+   .vals_count = 2,
+   .set_vals = sfc_flow_set_unknown_dst_flags,
+ 

[dpdk-dev] [RFC PATCH] net/bonding: add rte flow support

2018-03-06 Thread Matan Azrad
Ethernet devices which are grouped by bonding PMD, aka slaves, are
sharing the same queues and RSS configurations and their Rx burst
functions must be managed by the bonding PMD according to the bonding
architecture.

So, it makes sense to configure the same flow rules for all the bond
slaves to allow consistency in packet flow management.

Add rte flow support to the bonding PMD.

Signed-off-by: Matan Azrad 
---


Implementation details:

The following rte flow operations are supported: validate, create, destroy,
flush, query, isolate.

Validate:
Validation passes only if it passes on all existing slaves.

Create:
Create the flow in all slaves.
Save the flow objects created by the slaves in the bonding internal flow
structure.
Save each flow configuration so it can be applied to each new slave.
A flow creation failure on an existing slave rejects the flow.
A flow creation failure on a new slave at slave-adding time rejects the
slave.
Return the bonding flow structure pointer to the application.

Destroy:
Destroy the flow in all slaves and release the internal flow memory.

Flush:
Destroy all the bonding PMD flows in all the slaves. (Calling the slaves'
flush would destroy all the slave flows, which may include other flows from
the application or the bond internal LACP flow.)

Query:
Return the query result of the bonding primary slave. (Alternatively, we can
sum the query data for the COUNT action and return -ENOTSUP for other
queries.)

Isolate:
Call flow isolate for all slaves.
Isolate mode will be configured for new slaves too (the slave is rejected on
failure).


* This implementation still allows the application to configure flows directly
on the slaves and to manage a separate set of rte flows.
* The recommendation is to use rte flow through the bonding PMD and not
directly through the slave PMDs (for example, calling flow flush on a slave
directly may hurt the LACP mechanism).
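
The "Create" behaviour described above (install on every slave, reject the
flow if any slave refuses it) can be sketched roughly as follows. This is an
illustration only: the hook types, bond_flow_create_all() and the toy backend
are hypothetical and stand in for the per-slave rte_flow calls.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLAVES 4

/* Hypothetical per-slave hooks standing in for flow create/destroy. */
typedef void *(*slave_create_fn)(unsigned int slave, const void *rule);
typedef void (*slave_destroy_fn)(unsigned int slave, void *handle);

struct bond_flow {
    void *handles[MAX_SLAVES]; /* one flow handle per slave */
};

/* Create the rule on every slave; undo everything if one slave fails. */
static int
bond_flow_create_all(struct bond_flow *bf, unsigned int nslaves,
                     const void *rule,
                     slave_create_fn create, slave_destroy_fn destroy)
{
    unsigned int i;

    assert(nslaves <= MAX_SLAVES);
    for (i = 0; i < nslaves; i++) {
        bf->handles[i] = create(i, rule);
        if (bf->handles[i] == NULL)
            goto rollback;
    }
    return 0;

rollback:
    /* Destroy only the handles created before the failing slave. */
    while (i-- > 0) {
        destroy(i, bf->handles[i]);
        bf->handles[i] = NULL;
    }
    return -1;
}

/* Toy backend used only to exercise the sketch. */
static int toy_fail_slave = -1; /* slave that refuses rules, -1 for none */
static int toy_live;            /* rules currently installed */
static int toy_tokens[MAX_SLAVES];

static void *
toy_create(unsigned int slave, const void *rule)
{
    (void)rule;
    if ((int)slave == toy_fail_slave)
        return NULL;
    toy_live++;
    return &toy_tokens[slave];
}

static void
toy_destroy(unsigned int slave, void *handle)
{
    (void)slave;
    (void)handle;
    toy_live--;
}
```

Rolling back in reverse order keeps the slaves consistent: either every slave
holds the rule or none does.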

You can look at the code below for more details.

Thoughts?

 drivers/net/bonding/Makefile   |   1 +
 drivers/net/bonding/rte_eth_bond_api.c |  61 +
 drivers/net/bonding/rte_eth_bond_flow.c| 206 +
 drivers/net/bonding/rte_eth_bond_pmd.c |  28 +++-
 drivers/net/bonding/rte_eth_bond_private.h |  19 +++
 5 files changed, 312 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/bonding/rte_eth_bond_flow.c

diff --git a/drivers/net/bonding/Makefile b/drivers/net/bonding/Makefile
index 4a6633e..acad16a 100644
--- a/drivers/net/bonding/Makefile
+++ b/drivers/net/bonding/Makefile
@@ -27,6 +27,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_alb.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_flow.c
 
 #
 # Export include files
diff --git a/drivers/net/bonding/rte_eth_bond_api.c 
b/drivers/net/bonding/rte_eth_bond_api.c
index f854b73..350b46e 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -223,6 +223,47 @@
 }
 
 static int
+slave_rte_flow_prepare(uint16_t slave_id, struct bond_dev_private *internals)
+{
+   struct rte_flow *flow;
+   struct rte_flow_error ferror;
+   uint16_t slave_port_id = internals->slaves[slave_id].port_id;
+
+   if (internals->flow_isolated_valid != 0) {
+   if (rte_flow_isolate(slave_port_id, internals->flow_isolated,
+   &ferror)) {
+   RTE_BOND_LOG(ERR, "rte_flow_isolate failed for slave"
+" %d: %s", slave_id, ferror.message ?
+ferror.message : "(no stated reason)");
+   return -1;
+   }
+   }
+   TAILQ_FOREACH(flow, &internals->flow_list, next) {
+   flow->flows[slave_id] = rte_flow_create(slave_port_id,
+   &flow->fd->attr,
+   flow->fd->items,
+   flow->fd->actions,
+   &ferror);
+   if (flow->flows[slave_id] == NULL) {
+   RTE_BOND_LOG(ERR, "Cannot create flow for slave"
+" %d: %s", slave_id,
+ferror.message ? ferror.message :
+"(no stated reason)");
+   /* Destroy successful bond flows from the slave */
+   TAILQ_FOREACH(flow, &internals->flow_list, next) {
+   if (flow->flows[slave_id] != NULL) {
+   rte_flow_destroy(slave_port_id, flow,
+&ferror);
+   flow->flows[slave_id] = NUL

Re: [dpdk-dev] [dpdk-stable] [PATCH] net/bonding: avoid wrong casting on primary_slave_port_id from input param

2018-03-06 Thread Ferruh Yigit
On 3/6/2018 11:51 AM, Ferruh Yigit wrote:
> On 3/6/2018 9:37 AM, Gowrishankar wrote:
>> From: Gowrishankar Muthukrishnan 
>>
>> primary_slave_port_id is uint16_t which needs to be correctly stored
>> with the same data type of input parameter in bond_ethdev_configure.
>>
>> Fixes: f8244c6399 ("ethdev: increase port id range")
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: Gowrishankar Muthukrishnan 
> 
> Acked-by: Ferruh Yigit 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH] vhost: stop device before updating public vring data

2018-03-06 Thread Maxime Coquelin

Hi Tomasz,

On 03/05/2018 05:11 PM, Tomasz Kulasek wrote:

For now DPDK assumes that callfd, kickfd and last_idx are being set just
once during vring initialization and device cannot be running while DPDK
receives SET_VRING_KICK, SET_VRING_CALL and SET_VRING_BASE messages.
However, that assumption is wrong. For Vhost SCSI messages might arrive
at any point of time, possibly multiple times, one after another.

QEMU issues SET_VRING_CALL once during device initialization, then again
during device start. The second message will close previous callfd,
which is still being used by the user-implementation of vhost device.
This results in writing to invalid (closed) callfd.

Other messages like SET_FEATURES, SET_VRING_ADDR etc. will also change the
internal state of the VQ or device. To prevent a race condition, the device
should also be stopped before updating vring data.

Signed-off-by: Dariusz Stojaczyk
Signed-off-by: Pawel Wodkowski
Signed-off-by: Tomasz Kulasek
---
  lib/librte_vhost/vhost_user.c | 40 
  1 file changed, 40 insertions(+)


In the last release, we introduced a per-virtqueue lock to protect
vring handling against asynchronous device changes.

I think that would solve the issue you are facing, but you would need
to export the VQ locking functions to the vhost-user lib API to be
able to use it.

I don't think your current patch is the right solution anyway, because
it destroys the device in cases where we want it to remain alive, like
set_log_base, or set_features when only the logging feature gets
enabled.
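
To illustrate the locking idea (not the vhost library's actual API), here is
a toy sketch in which the control path swaps callfd and the data path uses
it, both under a per-virtqueue spinlock. All names (toy_vq, toy_vq_lock and
so on) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_vq {
    atomic_flag lock; /* per-virtqueue spinlock */
    int callfd;       /* descriptor used to notify the guest */
};

static void
toy_vq_init(struct toy_vq *vq, int callfd)
{
    assert(vq != NULL);
    atomic_flag_clear(&vq->lock);
    vq->callfd = callfd;
}

static void
toy_vq_lock(struct toy_vq *vq)
{
    while (atomic_flag_test_and_set_explicit(&vq->lock,
                                             memory_order_acquire))
        ; /* spin until the holder releases the lock */
}

static void
toy_vq_unlock(struct toy_vq *vq)
{
    atomic_flag_clear_explicit(&vq->lock, memory_order_release);
}

/*
 * Control path (e.g. a second SET_VRING_CALL): swap in the new descriptor
 * and hand back the old one, so the caller closes it only after the data
 * path can no longer be using it.
 */
static int
toy_vq_set_callfd(struct toy_vq *vq, int new_fd)
{
    int old_fd;

    toy_vq_lock(vq);
    old_fd = vq->callfd;
    vq->callfd = new_fd;
    toy_vq_unlock(vq);
    return old_fd;
}

/* Data path: touch callfd only while holding the lock. */
static int
toy_vq_notify(struct toy_vq *vq, int (*notify)(int fd))
{
    int rc;

    toy_vq_lock(vq);
    rc = notify(vq->callfd);
    toy_vq_unlock(vq);
    return rc;
}

/* Toy notifier that just records which descriptor it was given. */
static int toy_last_fd = -1;

static int
toy_record_fd(int fd)
{
    toy_last_fd = fd;
    return 0;
}
```

The device stays alive throughout; only the short critical sections are
serialized, which is the difference from stopping the device on every
message.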

Cheers,
Maxime


[dpdk-dev] [PATCH] app/testpmd: print Rx/Tx offload values

2018-03-06 Thread Ferruh Yigit
It is not clear which per-port offloads are enabled. Print the offload
values at forwarding start.

The CRC strip offload value was printed in a more verbose manner; it is
removed since the Rx/Tx offload values cover it and printing only the CRC
one can cause confusion.

Hexadecimal offload values are not very user-friendly, but they are
preferred so as not to create too much noise during forwarding start.
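
As a small standalone illustration of the output format (the helper name
format_offloads() is hypothetical and not part of testpmd), 64-bit offload
bitmasks can be printed portably with the PRIx64 macro:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format two 64-bit offload bitmasks as compact hexadecimal values. */
static int
format_offloads(char *buf, size_t len, uint64_t rx, uint64_t tx)
{
    return snprintf(buf, len,
                    "Rx offloads=0x%" PRIx64 " Tx offloads=0x%" PRIx64,
                    rx, tx);
}
```

A caller would pass the port's Rx and Tx offload masks and print the
resulting buffer on one line.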

Signed-off-by: Ferruh Yigit 
---
Cc: Shahaf Shuler 
---
 app/test-pmd/config.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 4bb255c62..47845d0cb 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1682,10 +1682,9 @@ rxtx_config_display(void)
struct rte_eth_txconf *tx_conf = &ports[pid].tx_conf;
 
printf("  port %d:\n", (unsigned int)pid);
-   printf("  CRC stripping %s\n",
-   (ports[pid].dev_conf.rxmode.offloads &
-DEV_RX_OFFLOAD_CRC_STRIP) ?
-   "enabled" : "disabled");
+   printf("  Rx offloads=0x%"PRIx64" Tx Offloads=0x%"PRIx64"\n",
+   ports[pid].dev_conf.rxmode.offloads,
+   ports[pid].dev_conf.txmode.offloads);
printf("  RX queues=%d - RX desc=%d - RX free threshold=%d\n",
nb_rxq, nb_rxd, rx_conf->rx_free_thresh);
printf("  RX threshold registers: pthresh=%d hthresh=%d "
-- 
2.13.6



Re: [dpdk-dev] [PATCH v1] net/tap: allow user MAC to be passed as args

2018-03-06 Thread Ferruh Yigit
On 2/12/2018 2:44 PM, Vipin Varghese wrote:
> Allow TAP PMD to pass user desired MAC address as argument.
> The argument value is processed as string delimited by  ':',
> is parsed and converted to HEX MAC address after validation.
> 
> Signed-off-by: Vipin Varghese 
> Signed-off-by: Pascal Mazon 

<...>

> @@ -1589,7 +1630,7 @@ enum ioctl_mode {
>   int speed;
>   char tap_name[RTE_ETH_NAME_MAX_LEN];
>   char remote_iface[RTE_ETH_NAME_MAX_LEN];
> - int fixed_mac_type = 0;
> + struct ether_addr user_mac;
>  
>   name = rte_vdev_device_name(dev);
>   params = rte_vdev_device_args(dev);
> @@ -1626,7 +1667,7 @@ enum ioctl_mode {
>   ret = rte_kvargs_process(kvlist,
>ETH_TAP_MAC_ARG,
>&set_mac_type,
> -  &fixed_mac_type);
> +  &user_mac);
>   if (ret == -1)
>   goto leave;
>   }
> @@ -1637,7 +1678,7 @@ enum ioctl_mode {
>   RTE_LOG(NOTICE, PMD, "Initializing pmd_tap for %s as %s\n",
>   name, tap_name);
>  
> - ret = eth_dev_tap_create(dev, tap_name, remote_iface, fixed_mac_type);
> + ret = eth_dev_tap_create(dev, tap_name, remote_iface, &user_mac);

"user_mac" without an initial value leads to an error when no "mac" argument
is provided. It should be zeroed out.
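
A minimal sketch of the point, assuming a handler that only writes the
address when an argument is actually present. struct mac_addr,
parse_mac_arg() and mac_is_unset() are hypothetical stand-ins, not the TAP
PMD code: with a zeroed variable, the "no argument" case is well defined.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

struct mac_addr {
    uint8_t addr_bytes[MAC_LEN];
};

/* Returns 1 when every byte is zero, i.e. no MAC was supplied. */
static int
mac_is_unset(const struct mac_addr *m)
{
    static const struct mac_addr zero;

    return memcmp(m, &zero, sizeof(zero)) == 0;
}

/*
 * Hypothetical argument handler: writes *out only for a well-formed
 * "xx:xx:xx:xx:xx:xx" string and leaves it untouched otherwise.
 */
static int
parse_mac_arg(const char *value, struct mac_addr *out)
{
    unsigned int b[MAC_LEN];
    char extra;
    int i;

    if (value == NULL)
        return 0; /* argument absent: nothing is written */

    if (sscanf(value, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != MAC_LEN)
        return -1;

    for (i = 0; i < MAC_LEN; i++)
        out->addr_bytes[i] = (uint8_t)b[i];
    return 0;
}
```

Declaring the variable as `struct mac_addr user_mac = { { 0 } };` (or
clearing it with memset) before parsing means the later code can test for
"unset" instead of reading stack garbage.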


Re: [dpdk-dev] 16.11.5 (LTS) patches review and test

2018-03-06 Thread Luca Boccassi
On Tue, 2018-03-06 at 11:07 +0530, gowrishankar muthukrishnan wrote:
> On Monday 05 March 2018 03:42 PM, Luca Boccassi wrote:
> > On Mon, 2018-03-05 at 11:31 +0530, gowrishankar muthukrishnan
> > wrote:
> > > Hi Luca,
> > > In powerpc to support i40e, we wish below patch be merged:
> > > 
> > > c3def6a8724 net/i40e: implement vector PMD for altivec
> > > 
> > > I have verified br-16.11 with the above commit (in cherry-pick, I
> > > needed
> > > to remove release
> > > notes which was meant for 17.05 release which hope is fine here).
> > > Could you please merge the above.
> > > 
> > > Thanks,
> > > Gowrishankar
> > 
> > Hi,
> > 
> > This introduced a new PMD for that architecture, right?
> > 
> > If so I can merge the patch, at the following conditions:
> > 
> > 1) It will be disabled by default
> > 2) Support and help in backporting will have to be provided by the
> > authors for the remaining lifetime of 16.11
> > 
> > Is this OK for you?
> 
> Yes, please go ahead.
> 
> Thanks,
> Gowrishankar

Applied and pushed to dpdk-stable/16.11.

> > > On Monday 26 February 2018 05:04 PM, Luca Boccassi wrote:
> > > > Hi all,
> > > > 
> > > > Here is a list of patches targeted for LTS release 16.11.5.
> > > > Please
> > > > help review and test. The planned date for the final release is
> > > > March
> > > > the 5th, pending results from regression tests.
> > > > Before that, please shout if anyone has objections with these
> > > > patches being applied.
> > > > 
> > > > These patches are located at branch 16.11 of dpdk-stable repo:
> > > >   http://dpdk.org/browse/dpdk-stable/
> > > > 
> > > > Thanks.
> > > > 
> > > > Luca Boccassi
> > > > 
> > > > ---
> > > > Ajit Khaparde (6):
> > > > net/bnxt: support new PCI IDs
> > > > net/bnxt: parse checksum offload flags
> > > > net/bnxt: fix group info usage
> > > > net/bnxt: fix broadcast cofiguration
> > > > net/bnxt: fix size of Tx ring in HW
> > > > net/bnxt: fix link speed setting with autoneg off
> > > > 
> > > > Akhil Goyal (1):
> > > > examples/ipsec-secgw: fix corner case for SPI value
> > > > 
> > > > Alejandro Lucero (3):
> > > > net/nfp: fix MTU settings
> > > > net/nfp: fix jumbo settings
> > > > net/nfp: fix CRC strip check behaviour
> > > > 
> > > > Anatoly Burakov (14):
> > > > memzone: fix leak on allocation error
> > > > malloc: protect stats with lock
> > > > malloc: fix end for bounded elements
> > > > vfio: fix enabled check on error
> > > > app/procinfo: add compilation option in config
> > > > test: register test as failed if setup failed
> > > > test/table: fix uninitialized parameter
> > > > test/memzone: fix wrong test
> > > > test/memzone: handle previously allocated memzones
> > > > usertools/devbind: remove unused function
> > > > test/reorder: fix memory leak
> > > > test/ring_perf: fix memory leak
> > > > test/table: fix memory leak
> > > > test/timer_perf: fix memory leak
> > > > 
> > > > Andriy Berestovskyy (1):
> > > > keepalive: fix state alignment
> > > > 
> > > > Bao-Long Tran (1):
> > > > examples/ip_pipeline: fix timer period unit
> > > > 
> > > > Beilei Xing (8):
> > > > net/i40e: fix flow director Rx resource defect
> > > > net/i40e: add warnings when writing global registers
> > > > net/i40e: add debug logs when writing global registers
> > > > net/i40e: fix multiple driver support issue
> > > > net/i40e: fix interrupt conflict when using multi-
> > > > driver
> > > > net/i40e: fix Rx interrupt
> > > > net/i40e: check multi-driver option parsing
> > > > app/testpmd: fix flow director filter
> > > > 
> > > > Chas Williams (1):
> > > > net/bonding: fix setting slave MAC addresses
> > > > 
> > > > David Harton (1):
> > > > net/i40e: fix VF reset stats crash
> > > > 
> > > > Didier Pallard (1):
> > > > net/virtio: fix incorrect cast
> > > > 
> > > > Dustin Lundquist (1):
> > > > examples/exception_path: align stats on cache line
> > > > 
> > > > Erez Ferber (1):
> > > > net/mlx5: fix MTU update
> > > > 
> > > > Ferruh Yigit (1):
> > > > kni: fix build with kernel 4.15
> > > > 
> > > > Fiona Trahe (1):
> > > > crypto/qat: fix null auth algo overwrite
> > > > 
> > > > Gowrishankar Muthukrishnan (2):
> > > > eal/ppc: remove the braces in memory barrier macros
> > > > eal/ppc: support sPAPR IOMMU for vfio-pci
> > > > 
> > > > Harish Patil (2):
> > > > net/qede: fix to reject config with no Rx queue
> > > > net/qede/base: fix VF LRO tunnel configuration
> > > > 
> > > > Hemant Agrawal (4):
> > > > pmdinfogen: fix cross compilation for ARM big endian
> > > > lpm: fix ARM big endian build
> > > > net/i40e: fix ARM big endian build
> > > > net/ixgbe: fix A

Re: [dpdk-dev] [PATCH v1 1/2] net/octeontx: fix null pointer dereference

2018-03-06 Thread Ferruh Yigit
On 2/20/2018 5:14 PM, Santosh Shukla wrote:
> Fixes: f18b146c498d ("net/octeontx: create ethdev ports")
> Coverity issue: 195040
> 
> Cc: sta...@dpdk.org
> Signed-off-by: Santosh Shukla 

Series applied to dpdk-next-net/master, thanks.

BTW, what is the plan for switching the PMD to the new offloading API? Support
for the old API is planned to be removed in this release.


[dpdk-dev] [PATCH] eal: register rte_panic user callback

2018-03-06 Thread Arnon Warshavsky
The use case addressed here is dpdk environment init
aborting the process due to panic,
preventing the calling process from running its own tear-down actions.
A preferred, though ABI breaking solution would be
to have the environment init always return a value
rather than abort upon distress.

This patch defines a couple of callback registration functions,
one for panic and one for exit
in case one wishes to distinguish between these events.
Once a callback is set and panic takes place,
it will be called prior to calling abort.
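
For illustration only, the registration pattern can be sketched outside of
EAL like this. The names are hypothetical, the sketch stores the function
pointer by value, and toy_panic() stands in for the panic path without
actually calling abort():

```c
#include <assert.h>
#include <stddef.h>

typedef void (*abort_callback_t)(void);

/* Currently registered callback; NULL means none. */
static abort_callback_t panic_cb;

/* Register a callback, or pass NULL to deregister. */
static void
panic_callback_register(abort_callback_t cb)
{
    panic_cb = cb;
}

/* Stand-in for the panic path: the real one would call abort() last. */
static void
toy_panic(void)
{
    if (panic_cb != NULL)
        panic_cb();
    /* abort() would follow here */
}

/* Example application tear-down hook. */
static int teardown_runs;

static void
app_teardown(void)
{
    teardown_runs++;
}
```

An application would register its tear-down function once, before
environment init, so that it runs even if init panics.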

Maiden voyage patch for Qwilt and myself.

Signed-off-by: Arnon Warshavsky 
---
 lib/librte_eal/bsdapp/eal/eal_debug.c | 37 ++
 lib/librte_eal/common/include/rte_debug.h | 24 +++
 lib/librte_eal/linuxapp/eal/eal_debug.c   | 38 +++
 lib/librte_eal/rte_eal_version.map|  2 ++
 4 files changed, 101 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/eal_debug.c 
b/lib/librte_eal/bsdapp/eal/eal_debug.c
index 5d92500..010859d 100644
--- a/lib/librte_eal/bsdapp/eal/eal_debug.c
+++ b/lib/librte_eal/bsdapp/eal/eal_debug.c
@@ -18,6 +18,39 @@
 
 #define BACKTRACE_SIZE 256
 
+/*
+ * user function pointers that when assigned, gets to be called
+ * during rte_exit()
+ */
+static rte_user_abort_callback_t *exit_user_callback;
+
+/*
+ * user function pointers that when assigned, gets to be called
+ * during rte_panic()
+ */
+static rte_user_abort_callback_t *panic_user_callback;
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb)
+{
+   panic_user_callback = cb;
+}
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb)
+{
+   exit_user_callback = cb;
+}
+
+
 /* dump the stack of the calling core */
 void rte_dump_stack(void)
 {
@@ -59,6 +92,8 @@ void __rte_panic(const char *funcname, const char *format, 
...)
va_end(ap);
rte_dump_stack();
rte_dump_registers();
+   if (panic_user_callback)
+   (*panic_user_callback)();
abort();
 }
 
@@ -78,6 +113,8 @@ rte_exit(int exit_code, const char *format, ...)
va_start(ap, format);
rte_vlog(RTE_LOG_CRIT, RTE_LOGTYPE_EAL, format, ap);
va_end(ap);
+   if (exit_user_callback)
+   (*exit_user_callback)();
 
 #ifndef RTE_EAL_ALWAYS_PANIC_ON_ERROR
if (rte_eal_cleanup() != 0)
diff --git a/lib/librte_eal/common/include/rte_debug.h 
b/lib/librte_eal/common/include/rte_debug.h
index 272df49..7e3d0a2 100644
--- a/lib/librte_eal/common/include/rte_debug.h
+++ b/lib/librte_eal/common/include/rte_debug.h
@@ -16,11 +16,35 @@
 
 #include "rte_log.h"
 #include "rte_branch_prediction.h"
+#include 
 
 #ifdef __cplusplus
 extern "C" {
 #endif
 
+
+/*
+ * Definition of user function pointer type to be called during
+ * the execution of rte_panic
+ */
+
+typedef void  (*rte_user_abort_callback_t)(void);
+/**< @internal Ethernet device configuration. */
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb);
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb);
+
 /**
  * Dump the stack of the calling core to the console.
  */
diff --git a/lib/librte_eal/linuxapp/eal/eal_debug.c 
b/lib/librte_eal/linuxapp/eal/eal_debug.c
index 5d92500..b1748b8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_debug.c
+++ b/lib/librte_eal/linuxapp/eal/eal_debug.c
@@ -16,8 +16,42 @@
 #include 
 #include 
 
+
 #define BACKTRACE_SIZE 256
 
+/*
+ * user function pointers that when assigned, gets to be called
+ * during rte_exit()
+ */
+static rte_user_abort_callback_t *exit_user_callback;
+
+/*
+ * user function pointers that when assigned, gets to be called
+ * during rte_panic()
+ */
+static rte_user_abort_callback_t *panic_user_callback;
+
+/**
+ * Register user callback function to be called during rte_panic()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_panic_user_callback_register(rte_user_abort_callback_t *cb)
+{
+   panic_user_callback = cb;
+}
+
+/**
+ * Register user callback function to be called during rte_exit()
+ * Deregistration is by passing NULL as the parameter
+ */
+void __rte_experimental
+rte_exit_user_callback_register(rte_user_abort_callback_t *cb)
+{
+   exit_user_callback = cb;
+}
+
+
 /* dump the stack of the calling core */
 void rte_dump_stack(void)
 {
@@ -59,6 +93,8 @

Re: [dpdk-dev] [PATCH 0/5] remove void pointer explicit cast

2018-03-06 Thread Ferruh Yigit
On 2/26/2018 8:10 AM, Zhiyong Yang wrote:
> The patch series cleanup void pointer explicit cast related to
> struct rte_flow_item fields in librte_flow_classify and make
> code more readable.
> 
> Zhiyong Yang (5):
>   flow_classify: remove void pointer cast
>   net/ixgbe: remove void pointer cast
>   net/e1000: remove void pointer cast
>   net/bnxt: remove void pointer cast
>   net/sfc: remove void pointer cast

Series applied to dpdk-next-net/master, thanks.


[dpdk-dev] [PATCH] net/bnxt: switch to the new offload API

2018-03-06 Thread Ajit Khaparde
Update bnxt PMD to new ethdev offloads API.
Signed-off-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 59 +-
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 21c46f833..cca4ef40c 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -146,6 +146,27 @@ static const struct rte_pci_id bnxt_pci_id_map[] = {
ETH_RSS_NONFRAG_IPV6_TCP |  \
ETH_RSS_NONFRAG_IPV6_UDP)
 
+#define BNXT_DEV_TX_OFFLOAD_SUPPORT (DEV_TX_OFFLOAD_VLAN_INSERT | \
+DEV_TX_OFFLOAD_IPV4_CKSUM | \
+DEV_TX_OFFLOAD_TCP_CKSUM | \
+DEV_TX_OFFLOAD_UDP_CKSUM | \
+DEV_TX_OFFLOAD_TCP_TSO | \
+DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM | \
+DEV_TX_OFFLOAD_VXLAN_TNL_TSO | \
+DEV_TX_OFFLOAD_GRE_TNL_TSO | \
+DEV_TX_OFFLOAD_IPIP_TNL_TSO | \
+DEV_TX_OFFLOAD_GENEVE_TNL_TSO | \
+DEV_TX_OFFLOAD_MULTI_SEGS)
+
+#define BNXT_DEV_RX_OFFLOAD_SUPPORT (DEV_RX_OFFLOAD_VLAN_FILTER | \
+DEV_RX_OFFLOAD_VLAN_STRIP | \
+DEV_RX_OFFLOAD_IPV4_CKSUM | \
+DEV_RX_OFFLOAD_UDP_CKSUM | \
+DEV_RX_OFFLOAD_TCP_CKSUM | \
+DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM | \
+DEV_RX_OFFLOAD_JUMBO_FRAME | \
+DEV_RX_OFFLOAD_CRC_STRIP)
+
 static int bnxt_vlan_offload_set_op(struct rte_eth_dev *dev, int mask);
 static void bnxt_print_link_info(struct rte_eth_dev *eth_dev);
 
@@ -430,21 +451,14 @@ static void bnxt_dev_info_get_op(struct rte_eth_dev 
*eth_dev,
dev_info->min_rx_bufsize = 1;
dev_info->max_rx_pktlen = BNXT_MAX_MTU + ETHER_HDR_LEN + ETHER_CRC_LEN
  + VLAN_TAG_SIZE;
-   dev_info->rx_offload_capa = DEV_RX_OFFLOAD_VLAN_STRIP |
-   DEV_RX_OFFLOAD_IPV4_CKSUM |
-   DEV_RX_OFFLOAD_UDP_CKSUM |
-   DEV_RX_OFFLOAD_TCP_CKSUM |
-   DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM;
-   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_VLAN_INSERT |
-   DEV_TX_OFFLOAD_IPV4_CKSUM |
-   DEV_TX_OFFLOAD_TCP_CKSUM |
-   DEV_TX_OFFLOAD_UDP_CKSUM |
-   DEV_TX_OFFLOAD_TCP_TSO |
-   DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM |
-   DEV_TX_OFFLOAD_VXLAN_TNL_TSO |
-   DEV_TX_OFFLOAD_GRE_TNL_TSO |
-   DEV_TX_OFFLOAD_IPIP_TNL_TSO |
-   DEV_TX_OFFLOAD_GENEVE_TNL_TSO;
+
+   dev_info->rx_queue_offload_capa = BNXT_DEV_RX_OFFLOAD_SUPPORT;
+   dev_info->rx_offload_capa = BNXT_DEV_RX_OFFLOAD_SUPPORT;
+   if (bp->flags & BNXT_FLAG_PTP_SUPPORTED)
+   dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_TIMESTAMP;
+   dev_info->tx_queue_offload_capa = BNXT_DEV_TX_OFFLOAD_SUPPORT;
+   dev_info->tx_offload_capa = BNXT_DEV_TX_OFFLOAD_SUPPORT;
+   dev_info->flow_type_rss_offloads = BNXT_ETH_RSS_SUPPORT;
 
/* *INDENT-OFF* */
dev_info->default_rxconf = (struct rte_eth_rxconf) {
@@ -454,7 +468,8 @@ static void bnxt_dev_info_get_op(struct rte_eth_dev 
*eth_dev,
.wthresh = 0,
},
.rx_free_thresh = 32,
-   .rx_drop_en = 0,
+   /* If no descriptors available, pkts are dropped by default */
+   .rx_drop_en = 1,
};
 
dev_info->default_txconf = (struct rte_eth_txconf) {
@@ -465,8 +480,6 @@ static void bnxt_dev_info_get_op(struct rte_eth_dev 
*eth_dev,
},
.tx_free_thresh = 32,
.tx_rs_thresh = 32,
-   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
-ETH_TXQ_FLAGS_NOOFFLOADS,
};
eth_dev->data->dev_conf.intr_conf.lsc = 1;
 
@@ -510,6 +523,16 @@ static void bnxt_dev_info_get_op(struct rte_eth_dev 
*eth_dev,
 static int bnxt_dev_configure_op(struct rte_eth_dev *eth_dev)
 {
struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   uint64_t tx_offloads = eth_dev->data->dev_conf.txmode.offloads;
+   uint64_t rx_offloads = eth_dev->data->dev_conf.rxmode.offloads;
+
+   if (tx_offloads != BNXT_DEV_TX_OFFLOAD_SUPPORT)
+   PMD_DRV_LO

Re: [dpdk-dev] [PATCH] compressdev: implement API

2018-03-06 Thread Ahmed Mansour
On 3/5/2018 9:32 AM, Verma, Shally wrote:
>
>> -Original Message-
>> From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>> Sent: 03 March 2018 01:19
>> To: Trahe, Fiona ; Verma, Shally 
>> ; dev@dpdk.org
>> Cc: De Lara Guarch, Pablo ; Athreya, 
>> Narayana Prasad ;
>> Gupta, Ashish ; Sahu, Sunila 
>> ; Challa, Mahipal
>> ; Jain, Deepak K ; 
>> Hemant Agrawal ; Roy
>> Pledge ; Youri Querry 
>> Subject: Re: [dpdk-dev] [PATCH] compressdev: implement API
>>
>> On 3/2/2018 4:53 AM, Trahe, Fiona wrote:
 On 3/1/2018 9:41 AM, Trahe, Fiona wrote:
> Hi Shally
>
> //snip//
>> [Shally] This looks better to me. So it mean app would always call 
>> xform_init() for stateless and attach
 an
>> updated priv_xform to ops (depending upon if there's shareable or not). 
>> So it does not need to have
>> NULL pointer on priv_xform. right?
>>
> [Fiona] yes. The PMD must return a valid priv_xform pointer.
 [Ahmed] What I understood is that the xform_init will be called once
 initially. if the @flag returned is NONE_SHAREABLE then the application
 must not attach two inflight ops to the same @priv_xform? Otherwise the
 application can attach many ops in flight to the @priv_xform?
>>> [Fiona Yes. App calls the xform_init() once on a device where it plans to 
>>> send stateless ops.
>>> If PMD returns shareable, then it doesn't need to call again and can attach 
>>> this to every stateless op going to that device.
>>> If PMD returns SINGLE_OP then it must call xform_init() before every other
>>> stateless op it wants to have inflight simultaneously. This does not mean 
>>> it must be called before every op,
>>> but probably will set up a batch of priv_xforms  - it can reuse each 
>>> priv_xform once the op finishes with it.
>> [Ahmed] @Shally Can this complexity of managing the NONE_SHAREABLE mode
>> be pushed into the PMD? A flexible stockpile can be kept and maintained
>> by the PMD and it can be increased or decreased based on
>> low-water/high-water thresholds
> [Shally] It is doable to manage within PMD but need to do hands on to 
> evaluate effectiveness. So far, we have never exercised this way and left it 
> to application to attach different session (or stream) to op for maximum 
> performance gain. So, I would say, may it be ok to have flag feature in first 
> place and deprecate later, if it not required?! Or just have API without any 
> flag option and add a feature flag to indicate PMD support for 
> SHAREABLE/NON-SHAREABLE xform_priv handle?!
[Ahmed] Either way looks ok to me. I see your point about performance.
If this is in the PMD it will have to constantly guess how much memory
the user needs and accommodate dynamically. The user can implement a
similar scheme or if the application is simple they can pre-allocate and
reduce CPU allocation de-allocation overhead.
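
A rough sketch of the low-water/high-water stockpile idea discussed above,
modelled with counters only. The structure and function names are
hypothetical, and a real pool would allocate and free priv_xform handles
where the counters change:

```c
#include <assert.h>
#include <stddef.h>

struct xform_pool {
    unsigned int nfree;      /* idle handles ready to attach to an op */
    unsigned int total;      /* handles currently allocated overall */
    unsigned int low_water;  /* refill when nfree drops below this */
    unsigned int high_water; /* trim when nfree rises above this */
    unsigned int batch;      /* handles allocated per refill */
};

static void
pool_init(struct xform_pool *p, unsigned int low, unsigned int high,
          unsigned int batch)
{
    assert(low >= 1 && batch >= 1 && high >= low);
    p->nfree = 0;
    p->total = 0;
    p->low_water = low;
    p->high_water = high;
    p->batch = batch;
}

/* Take one handle for an op going in flight, refilling if running low. */
static void
pool_get(struct xform_pool *p)
{
    if (p->nfree < p->low_water) {
        p->nfree += p->batch; /* a real pool would allocate here */
        p->total += p->batch;
    }
    p->nfree--;
}

/* Return a handle once its op completes, trimming any excess. */
static void
pool_put(struct xform_pool *p)
{
    p->nfree++;
    if (p->nfree > p->high_water) {
        unsigned int excess = p->nfree - p->high_water;

        p->nfree -= excess; /* a real pool would free here */
        p->total -= excess;
    }
}
```

The same scheme works on either side of the API: inside the PMD it hides the
bookkeeping, while in the application it lets the user pick thresholds that
match the expected number of ops in flight.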


[dpdk-dev] [PATCH v3] net/null:Different mac address support

2018-03-06 Thread Mallesh Koujalagi
After attaching two null devices to OVS, both report the MAC address
"00.00.00.00.00.00". Fix this by setting a different MAC address for each
device.

Signed-off-by: Mallesh Koujalagi 
---
 drivers/net/null/rte_eth_null.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 9385ffd..42e3a77 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -73,6 +73,7 @@ struct pmd_internals {
struct null_queue rx_null_queues[RTE_MAX_QUEUES_PER_PORT];
struct null_queue tx_null_queues[RTE_MAX_QUEUES_PER_PORT];
 
+   struct ether_addr eth_addr;
/** Bit mask of RSS offloads, the bit offset also means flow type */
uint64_t flow_type_rss_offloads;
 
@@ -84,9 +85,6 @@ struct pmd_internals {
 
uint8_t rss_key[40];/**< 40-byte hash key. */
 };
-
-
-static struct ether_addr eth_addr = { .addr_bytes = {0} };
 static struct rte_eth_link pmd_link = {
.link_speed = ETH_SPEED_NUM_10G,
.link_duplex = ETH_LINK_FULL_DUPLEX,
@@ -519,7 +517,6 @@ eth_dev_null_create(struct rte_vdev_device *dev,
rte_free(data);
return -ENOMEM;
}
-
/* now put it all together
 * - store queue data in internals,
 * - store numa_node info in ethdev data
@@ -533,6 +530,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
internals->packet_size = packet_size;
internals->packet_copy = packet_copy;
internals->port_id = eth_dev->data->port_id;
+   eth_random_addr(internals->eth_addr.addr_bytes);
 
internals->flow_type_rss_offloads =  ETH_RSS_PROTO_MASK;
internals->reta_size = RTE_DIM(internals->reta_conf) * 
RTE_RETA_GROUP_SIZE;
@@ -543,7 +541,7 @@ eth_dev_null_create(struct rte_vdev_device *dev,
data->nb_rx_queues = (uint16_t)nb_rx_queues;
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
-   data->mac_addrs = &eth_addr;
+   data->mac_addrs = &internals->eth_addr;
 
eth_dev->data = data;
eth_dev->dev_ops = &ops;
-- 
2.7.4
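
For readers unfamiliar with what a per-device random address looks like,
here is a small standalone sketch. It is not the DPDK implementation;
random_mac() and MAC_LEN are names made up for this example, and rand()
is used only to keep it self-contained. As far as I know, DPDK's
eth_random_addr() likewise yields a unicast, locally administered
address, which is the property illustrated here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAC_LEN 6	/* bytes in an Ethernet address */

/* Fill addr with a random unicast, locally administered MAC address. */
static void random_mac(uint8_t addr[MAC_LEN])
{
	assert(addr != NULL);

	for (int i = 0; i < MAC_LEN; i++)
		addr[i] = (uint8_t)(rand() & 0xff);

	addr[0] &= (uint8_t)~0x01;	/* clear the group (multicast) bit */
	addr[0] |= 0x02;		/* set the locally administered bit */
}
```

Storing the result in the per-device private data, as the patch does with
internals->eth_addr, is what keeps two null devices from sharing one
address.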



[dpdk-dev] Fwd: PMD for Broadcom/Emulex OCe14000 OCP Skyhawk-R

2018-03-06 Thread sujith sankar
Hi all,

Is a PMD for the Broadcom/Emulex OCe14000 OCP Skyhawk-R available?  There
are a few documents on Broadcom's site, but I could not find the source
code for it.

I believe the 6Wind team developed the PMD for Broadcom.  What is its
status?  Is it freely available?

I tried to get some help from the users alias, but could not.
Could someone please help me with info on this?

Thanks,
-Sujith


Re: [dpdk-dev] [RFC 4/4] drivers/raw/ifpga_rawdev: Rawdev for Intel FPGA Device, it's a PCI Driver of FPGA Device Manager

2018-03-06 Thread Zhang, Tianfei


-Original Message-
From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] 
Sent: Tuesday, March 6, 2018 2:48 PM
To: Xu, Rosen 
Cc: dev@dpdk.org; Doherty, Declan ; Zhang, Tianfei 

Subject: Re: [dpdk-dev] [RFC 4/4] drivers/raw/ifpga_rawdev: Rawdev for Intel 
FPGA Device, it's a PCI Driver of FPGA Device Manager

On Tue, Mar 6, 2018 at 7:13 AM, Rosen Xu  wrote:
> Signed-off-by: Rosen Xu 
> ---
>  drivers/raw/ifpga_rawdev/Makefile  |  59 
>  drivers/raw/ifpga_rawdev/ifpga_rawdev.c| 343 
> +
>  drivers/raw/ifpga_rawdev/ifpga_rawdev.h| 109 +++
>  drivers/raw/ifpga_rawdev/ifpga_rawdev_example.c| 121 

When the rawdev skeleton driver was integrated, Thomas raised this point about 
naming it 'skeleton_rawdev' rather than just 'skeleton'.
The same applies here: 'ifpga_rawdev' rather than 'ifpga'.
At that time I thought we could use  as model. But, 
frankly, to me it seems a bad choice now. The extra '_rawdev'
doesn't serve any purpose here.

So, feel free to change your naming to a more appropriate "drivers/raw/ifpga/" 
or "drivers/raw/ifpga_sample" etc.

Probably I too can change the skeleton_rawdev to skeleton.

>  .../ifpga_rawdev/rte_pmd_ifpga_rawdev_version.map  |   4 +
>  5 files changed, 636 insertions(+)
>  create mode 100644 drivers/raw/ifpga_rawdev/Makefile
>  create mode 100644 drivers/raw/ifpga_rawdev/ifpga_rawdev.c
>  create mode 100644 drivers/raw/ifpga_rawdev/ifpga_rawdev.h
>  create mode 100644 drivers/raw/ifpga_rawdev/ifpga_rawdev_example.c
>  create mode 100644 drivers/raw/ifpga_rawdev/rte_pmd_ifpga_rawdev_version.map
>
> diff --git a/drivers/raw/ifpga_rawdev/Makefile b/drivers/raw/ifpga_rawdev/Makefile
> new file mode 100644
> index 000..3166fe2
> --- /dev/null
> +++ b/drivers/raw/ifpga_rawdev/Makefile
> @@ -0,0 +1,59 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2010-2017 Intel Corporation. All rights reserved.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   notice, this list of conditions and the following disclaimer.
> +# * Redistributions in binary form must reproduce the above copyright
> +#   notice, this list of conditions and the following disclaimer in
> +#   the documentation and/or other materials provided with the
> +#   distribution.
> +# * Neither the name of Intel Corporation nor the names of its
> +#   contributors may be used to endorse or promote products derived
> +#   from this software without specific prior written permission.
> +#
> +#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +

Please use an SPDX identifier in place of the BSD boilerplate.

> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +#
> +# library name
> +#
> +LIB = librte_pmd_ifpga_rawdev.a
> +
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -I$(RTE_SDK)/drivers/bus/ifpga
> +CFLAGS += -I$(RTE_SDK)/drivers/raw/ifpga_rawdev
> +LDLIBS += -lrte_eal
> +LDLIBS += -lrte_rawdev
> +LDLIBS += -lrte_bus_vdev
> +LDLIBS += -lrte_kvargs
> +
> +EXPORT_MAP := rte_pmd_ifpga_rawdev_version.map
> +
> +LIBABIVER := 1
> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_SKELETON_RAWDEV) += ifpga_rawdev.c
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_SKELETON_RAWDEV) += ifpga_rawdev_example.c

This is a copy-paste issue: CONFIG_RTE_LIBRTE_PMD_SKELETON_RAWDEV belongs to
the skeleton driver.

> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/raw/ifpga_rawdev/ifpga_rawdev.c b/drivers/raw/ifpga_rawdev/ifpga_rawdev.c
> new file mode 100644
> index 000..6046711
> --- /dev/null
> +++ b/drivers/raw/ifpga_rawdev/ifpga_rawdev.c
> @@ -0,0 +1,343 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright 2016 NXP.

:) - should be Intel.
Even better - SPDX

> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright

Re: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately

2018-03-06 Thread Matan Azrad
Hi Jianfeng

From: Tan, Jianfeng, Sent: Tuesday, March 6, 2018 10:56 AM
> > -Original Message-
> > From: Matan Azrad [mailto:ma...@mellanox.com]
> > Sent: Tuesday, March 6, 2018 2:08 PM
> > To: Tan, Jianfeng; Yigit, Ferruh
> > Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> > maxime.coque...@redhat.com; Burakov, Anatoly; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> > rte_eth_dev_data privately
> >
> > Hi Jianfeng
> >
> > Please see a comment below.
> >
> > > > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> > > > We introduced private rte_eth_dev_data to allow vdev to be created both
> > > > in primary process and secondary process(es). This is not friendly to
> > > > multi-process model, for example, it leads to port id contention
> > > > issue if two processes both find the data entry is free.
> > >
> > > And to get stats of primary vdev in secondary, we must allocate from
> > > the pre-defined array so that we can find it.
> > >
> > > Suggested-by: Bruce Richardson 
> > > Signed-off-by: Jianfeng Tan 
> > > ---
> > >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++--
> > >  drivers/net/kni/rte_eth_kni.c | 13 ++---
> > >  drivers/net/null/rte_eth_null.c   | 17 +++--
> > >  drivers/net/octeontx/octeontx_ethdev.c| 14 ++
> > >  drivers/net/pcap/rte_eth_pcap.c   | 18 +++---
> > >  drivers/net/tap/rte_eth_tap.c |  9 +
> > >  drivers/net/vhost/rte_eth_vhost.c | 17 ++---
> > >  7 files changed, 20 insertions(+), 93 deletions(-)
> > >
> > > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > > b/drivers/net/af_packet/rte_eth_af_packet.c
> > > index 57eccfd..2db692f 100644
> > > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct rte_vdev_device
> > > *dev,
> > >   RTE_LOG(ERR, PMD,
> > >   "%s: no interface specified for AF_PACKET
> ethdev\n",
> > >   name);
> > > - goto error_early;
> > > + return -1;
> > >   }
> > >
> > >   RTE_LOG(INFO, PMD,
> > >   "%s: creating AF_PACKET-backed ethdev on numa socket
> %u\n",
> > >   name, numa_node);
> > >
> > > - /*
> > > > -  * now do all data allocation - for eth_dev structure, dummy pci driver
> > > -  * and internal (private) data
> > > -  */
> > > - data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > > - if (data == NULL)
> > > - goto error_early;
> > > -
> > >   *internals = rte_zmalloc_socket(name, sizeof(**internals),
> > >   0, numa_node);
> > >   if (*internals == NULL)
> > > - goto error_early;
> > > + return -1;
> > >
> > >   for (q = 0; q < nb_queues; q++) {
> > > >   (*internals)->rx_queue[q].map = MAP_FAILED;
> > > > @@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> > >   RTE_LOG(ERR, PMD,
> > >   "%s: I/F name too long (%s)\n",
> > >   name, pair->value);
> > > - goto error_early;
> > > + return -1;
> > >   }
> > >   if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> > >   RTE_LOG(ERR, PMD,
> > >   "%s: ioctl failed (SIOCGIFINDEX)\n",
> > >   name);
> > > - goto error_early;
> > > + return -1;
> > >   }
> > >   (*internals)->if_name = strdup(pair->value);
> > >   if ((*internals)->if_name == NULL)
> > > - goto error_early;
> > > + return -1;
> > >   (*internals)->if_index = ifr.ifr_ifindex;
> > >
> > >   if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> > >   RTE_LOG(ERR, PMD,
> > >   "%s: ioctl failed (SIOCGIFHWADDR)\n",
> > >   name);
> > > - goto error_early;
> > > + return -1;
> > >   }
> > > >   memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
> > >
> > > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct rte_vdev_device
> > > *dev,
> > >
> > >   (*internals)->nb_queues = nb_queues;
> > >
> > > - rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > > + data = (*eth_dev)->data;
> > >   data->dev_private = *internals;
> > >   data->nb_rx_queues = (uint16_t)nb_queues;
> > >   data->nb_tx_queues = (uint16_t)nb_queues;
> > >   data->dev_link = pmd_link;
> > >   data->mac_addrs = &(*internals)->eth_addr;
> > >
> > > - (*eth_dev)->data = data;
> > >   (*eth_dev)->dev_ops = &ops;
> > >
> > >   return 0;
> > > @@ -802,8 +793,6 @@ rte_pmd_init_internals(struct rte_vdev_device
> > *dev,
> > >   }
> > >   free((*internals)->if_name);
> > >   rte_free(*internals);
> > > -error_early:
> > > - rte_free(data);
> > >   return -1;
> > >  }
> > >
> >
> > I think you should remove the private rte_eth_dev_data freeing in
> > rte_pmd_af_packet_remove().
> > This is relevant to all the vde

Re: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately

2018-03-06 Thread Matan Azrad
Hi Jianfeng


From: Matan Azrad, Wednesday, March 7, 2018 8:01 AM
> Hi Jianfeng
> 
> From: Tan, Jianfeng, Sent: Tuesday, March 6, 2018 10:56 AM
> > > -Original Message-
> > > From: Matan Azrad [mailto:ma...@mellanox.com]
> > > Sent: Tuesday, March 6, 2018 2:08 PM
> > > To: Tan, Jianfeng; Yigit, Ferruh
> > > Cc: Richardson, Bruce; Ananyev, Konstantin; Thomas Monjalon;
> > > maxime.coque...@redhat.com; Burakov, Anatoly; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate
> > > rte_eth_dev_data privately
> > >
> > > Hi Jianfeng
> > >
> > > Please see a comment below.
> > >
> > > > > From: Jianfeng Tan, Sent: Sunday, March 4, 2018 5:30 PM
> > > > > We introduced private rte_eth_dev_data to allow vdev to be created
> > > > > both in primary process and secondary process(es). This is not
> > > > > friendly to multi-process model, for example, it leads to port id
> > > > > contention issue if two processes both find the data entry is free.
> > > >
> > > > And to get stats of primary vdev in secondary, we must allocate
> > > > from the pre-defined array so that we can find it.
> > > >
> > > > Suggested-by: Bruce Richardson 
> > > > Signed-off-by: Jianfeng Tan 
> > > > ---
> > > > >  drivers/net/af_packet/rte_eth_af_packet.c | 25 +++--
> > > >  drivers/net/kni/rte_eth_kni.c | 13 ++---
> > > >  drivers/net/null/rte_eth_null.c   | 17 +++--
> > > >  drivers/net/octeontx/octeontx_ethdev.c| 14 ++
> > > >  drivers/net/pcap/rte_eth_pcap.c   | 18 +++---
> > > >  drivers/net/tap/rte_eth_tap.c |  9 +
> > > >  drivers/net/vhost/rte_eth_vhost.c | 17 ++---
> > > >  7 files changed, 20 insertions(+), 93 deletions(-)
> > > >
> > > > diff --git a/drivers/net/af_packet/rte_eth_af_packet.c
> > > > b/drivers/net/af_packet/rte_eth_af_packet.c
> > > > index 57eccfd..2db692f 100644
> > > > --- a/drivers/net/af_packet/rte_eth_af_packet.c
> > > > +++ b/drivers/net/af_packet/rte_eth_af_packet.c
> > > > @@ -564,25 +564,17 @@ rte_pmd_init_internals(struct
> > > > rte_vdev_device *dev,
> > > > RTE_LOG(ERR, PMD,
> > > > "%s: no interface specified for AF_PACKET
> > ethdev\n",
> > > > name);
> > > > -   goto error_early;
> > > > +   return -1;
> > > > }
> > > >
> > > > RTE_LOG(INFO, PMD,
> > > > "%s: creating AF_PACKET-backed ethdev on numa socket
> > %u\n",
> > > > name, numa_node);
> > > >
> > > > -   /*
> > > > > -* now do all data allocation - for eth_dev structure, dummy pci driver
> > > > -* and internal (private) data
> > > > -*/
> > > > -   data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
> > > > -   if (data == NULL)
> > > > -   goto error_early;
> > > > -
> > > > *internals = rte_zmalloc_socket(name, sizeof(**internals),
> > > > 0, numa_node);
> > > > if (*internals == NULL)
> > > > -   goto error_early;
> > > > +   return -1;
> > > >
> > > > for (q = 0; q < nb_queues; q++) {
> > > > > (*internals)->rx_queue[q].map = MAP_FAILED;
> > > > > @@ -604,24 +596,24 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
> > > > RTE_LOG(ERR, PMD,
> > > > "%s: I/F name too long (%s)\n",
> > > > name, pair->value);
> > > > -   goto error_early;
> > > > +   return -1;
> > > > }
> > > > if (ioctl(sockfd, SIOCGIFINDEX, &ifr) == -1) {
> > > > RTE_LOG(ERR, PMD,
> > > > "%s: ioctl failed (SIOCGIFINDEX)\n",
> > > > name);
> > > > -   goto error_early;
> > > > +   return -1;
> > > > }
> > > > (*internals)->if_name = strdup(pair->value);
> > > > if ((*internals)->if_name == NULL)
> > > > -   goto error_early;
> > > > +   return -1;
> > > > (*internals)->if_index = ifr.ifr_ifindex;
> > > >
> > > > if (ioctl(sockfd, SIOCGIFHWADDR, &ifr) == -1) {
> > > > RTE_LOG(ERR, PMD,
> > > > "%s: ioctl failed (SIOCGIFHWADDR)\n",
> > > > name);
> > > > -   goto error_early;
> > > > +   return -1;
> > > > }
> > > > > memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
> > > >
> > > > @@ -775,14 +767,13 @@ rte_pmd_init_internals(struct
> > > > rte_vdev_device *dev,
> > > >
> > > > (*internals)->nb_queues = nb_queues;
> > > >
> > > > -   rte_memcpy(data, (*eth_dev)->data, sizeof(*data));
> > > > +   data = (*eth_dev)->data;
> > > > data->dev_private = *internals;
> > > > data->nb_rx_queues = (uint