Re: [dpdk-dev] [PATCH] test/crypto: replace wireless algos test vectors

2020-02-07 Thread Trahe, Fiona



> -Original Message-
> From: Dybkowski, AdamX 
> Sent: Thursday, February 6, 2020 11:25 AM
> To: dev@dpdk.org; Trahe, Fiona 
> Cc: Dybkowski, AdamX 
> Subject: [PATCH] test/crypto: replace wireless algos test vectors
> 
> This patch replaces all KASUMI, SNOW3G, ZUC and mixed
> crypto unit test vectors with randomly generated input arrays.
> All outputs were calculated and verified on both the QAT PMD
> and the appropriate software-only PMDs wherever possible.
> 
> Signed-off-by: Adam Dybkowski 
Acked-by: Fiona Trahe 


Re: [dpdk-dev] [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks

2020-02-07 Thread Cornelia Huck
On Thu, 6 Feb 2020 11:18:42 -0700
Alex Williamson  wrote:

> On Thu, 6 Feb 2020 12:14:19 +0100
> Cornelia Huck  wrote:
> 
> > On Tue, 04 Feb 2020 16:05:43 -0700
> > Alex Williamson  wrote:
> >   
> > > Allow bus drivers to provide their own callback to match a device to
> > > the user provided string.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > ---
> > >  drivers/vfio/vfio.c  |   19 +++
> > >  include/linux/vfio.h |3 +++

> I think with your first option we arrive at something like this:
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index dda1726adda8..b5609a411139 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -883,14 +883,15 @@ static struct vfio_device *vfio_device_get_from_name(struct vfio_group *group,
>  
>   if (it->ops->match) {
>   ret = it->ops->match(it->device_data, buf);
> - if (ret < 0 && ret != -ENODEV) {
> + if (ret < 0) {
>   device = ERR_PTR(ret);
>   break;
>   }
> - } else
> - ret = strcmp(dev_name(it->dev), buf);
> + } else {
> + ret = !strcmp(dev_name(it->dev), buf);
> + }
>  
> - if (!ret) {
> + if (ret) {
>   device = it;
>   vfio_device_get(device);
>   break;
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 755e0f0e2900..029694b977f2 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -26,8 +26,9 @@
>   * operations documented below
>   * @mmap: Perform mmap(2) on a region of the device file descriptor
>   * @request: Request for the bus driver to release the device
> - * @match: Optional device name match callback (return: 0 for match, -ENODEV
> - * (or >0) for no match and continue, other -errno: no match and stop)
> + * @match: Optional device name match callback (return: 0 for no-match, >0 for
> + * match, -errno for abort (ex. match with insufficient or incorrect
> + * additional args)
>   */
>  struct vfio_device_ops {
>   char*name;
> 
> 
> I like that.  Thanks,
> 
> Alex

Looks good to me.
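For readers following along, the agreed-upon semantics can be sketched standalone (hypothetical structs for illustration, not the actual vfio code):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the vfio structures discussed above.
 * Return convention for ->match as agreed in this thread:
 * 0 = no match, >0 = match, <0 = -errno, abort the lookup. */
struct fake_device {
	const char *name;
	int (*match)(struct fake_device *dev, const char *buf);
};

static struct fake_device *
find_device(struct fake_device *devs, size_t n, const char *buf)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret;

		if (devs[i].match)
			ret = devs[i].match(&devs[i], buf);
		else
			ret = !strcmp(devs[i].name, buf); /* default: exact name match */
		if (ret < 0)
			return NULL;     /* match callback asked to abort */
		if (ret)
			return &devs[i]; /* positive return means match */
	}
	return NULL;
}
```

With this shape, the default name comparison and a bus-driver-provided callback both feed the same positive-means-match test, which is what the diff above converges on.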



Re: [dpdk-dev] [PATCH v4 00/36] update for i40e base code

2020-02-07 Thread Kevin Traynor
On 04/02/2020 03:23, Ye Xiaolong wrote:
> Hi, Kevin
> 
> On 02/03, Kevin Traynor wrote:
>> On 13/01/2020 05:59, Ye, Xiaolong wrote:
> 
> [snip]
> 
 -Original Message-
>> Hi Xiaolong/Beilei/Qi Z,
>>
>> Some of these patches are tagged to go to 'S'table and are 'F'ixes,
>> while others are also fixes for older releases but are not tagged to go to
>> stable.
>>
>> No issue with not backporting patches (in fact I would prefer not to if
>> there are doubts), but I want to check if the stable tag was missed by accident?
>>
>> Also, note that the FW listed is v6.01 for 18.11 (v7.00 for 19.11), so
>> changes would need to be compatible with that.
> 
> Yes, some of the stable tags were omitted deliberately, since the new base code
> update is for FW 7.1/7.2 and is not guaranteed to be compatible with old FW.
> 
>>
>> 20.02 8f33cbcfa S F net/i40e/base: fix buffer address (16.04)
>> 20.02 4b3da9415 S F net/i40e/base: fix error message (1.7.0)
>> 20.02 1da546c39 - F net/i40e/base: fix missing link modes (17.08)
>> 20.02 79bfe7808 S F net/i40e/base: fix Tx descriptors number (1.7.0)
>> 20.02 50126939c - F net/i40e/base: fix retrying logic (18.02)
>> 20.02 b330a1c5c - F net/i40e/base: fix display of FEC settings (17.08)
>> 20.02 03ef7d47f - F net/i40e/base: add new link speed constants (17.08)
>>
> 
> I would suggest we only backport below obvious fixes to stable releases.
> 
> 20.02 8f33cbcfa S F net/i40e/base: fix buffer address (16.04)
> 20.02 4b3da9415 S F net/i40e/base: fix error message (1.7.0)
> 20.02 79bfe7808 S F net/i40e/base: fix Tx descriptors number (1.7.0)
> 20.02 50126939c - F net/i40e/base: fix retrying logic (18.02)
> 

Thanks for confirming, Xiaolong. I will backport the ones you suggest.

Kevin.

> Thanks,
> Xiaolong
> 
>> Please confirm on patches above missing stable tag. For time being I
>> will only consider ones with stable tag.
>>
>> thanks,
>> Kevin.
>>
> 



[dpdk-dev] [PATCH] doc: update thunderx guide

2020-02-07 Thread Harman Kalra
Updating the thunderx guide with loopback port support,
debugging options and newly added features, and correcting
some formatting issues.

Signed-off-by: Harman Kalra 
---
 doc/guides/nics/thunderx.rst | 153 ---
 1 file changed, 88 insertions(+), 65 deletions(-)

diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst
index 89d439def..f42133e54 100644
--- a/doc/guides/nics/thunderx.rst
+++ b/doc/guides/nics/thunderx.rst
@@ -25,6 +25,7 @@ Features of the ThunderX PMD are:
 - Port hardware statistics
 - Jumbo frames
 - Link state information
+- Setting up link state
 - Scattered and gather for TX and RX
 - VLAN stripping
 - SR-IOV VF
@@ -242,6 +243,12 @@ driver' list, secondary VFs are on the remaining part of the list
  Depending on the hardware used, the kernel driver sets a threshold ``vf_id``. VFs that try to attach with an id below or equal to
  this boundary are considered primary VFs. VFs that try to attach with an id above this boundary are considered secondary VFs.
 
+LBK HW Access
+~~~~~~~~~~~~~
+
+Loopback HW Unit (LBK) receives packets from NIC-RX and sends packets back to NIC-TX.
+The loopback block has N channels and contains data buffering that is shared across
+all channels. Four primary VFs are reserved as loopback ports.
 
 Example device binding
~~~~~~~~~~~~~~~~~~~~~~
@@ -263,36 +270,40 @@ on a non-NUMA machine.
 
   Network devices using kernel driver
  ===================================
-  :01:10.0 'Device a026' if= drv=thunder-BGX unused=vfio-pci,uio_pci_generic
-  :01:10.1 'Device a026' if= drv=thunder-BGX unused=vfio-pci,uio_pci_generic
-  0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci,uio_pci_generic
-  0002:01:00.1 'Device 0011' if=eth0 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.2 'Device 0011' if=eth1 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.3 'Device 0011' if=eth2 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.4 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.5 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.6 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:00.7 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.0 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.1 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.2 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.3 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.4 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.5 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.6 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:01.7 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:02.0 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:02.1 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
-  0002:01:02.2 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic
+  :01:10.0 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci
+  :01:10.1 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci
+  0001:01:00.0 'THUNDERX Network Interface Controller a01e' if= drv=thunder-nic unused=vfio-pci
+  0001:01:00.1 'Device a034' if=eth0 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.2 'Device a034' if=eth1 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.3 'Device a034' if=eth2 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.4 'Device a034' if=eth3 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.5 'Device a034' if=eth4 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.6 'Device a034' if=lbk0 drv=thunder-nicvf unused=vfio-pci
+  0001:01:00.7 'Device a034' if=lbk1 drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.0 'Device a034' if=lbk2 drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.1 'Device a034' if=lbk3 drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.2 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.3 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.4 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.5 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.6 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:01.7 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:02.0 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:02.1 'Device a034' if= drv=thunder-nicvf unused=vfio-pci
+  0001:01:02.2 'Device a034' if= drv=

[dpdk-dev] [Bug 393] rte_zmalloc_socket does not zero memory

2020-02-07 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=393

Bug ID: 393
   Summary: rte_zmalloc_socket does not zero memory
   Product: DPDK
   Version: 19.11
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: critical
  Priority: Normal
 Component: core
  Assignee: dev@dpdk.org
  Reporter: pavel.kr...@anritsu.com
  Target Milestone: ---

Hello,
It seems that rte_zmalloc_socket does not return zeroed memory in some cases.
It seems to work OK everywhere except on an HP G10 server, where I was testing
an application that created an rte fragment table and did IP defragmentation,
and it crashed on some of the structure members.

I put a check into rte_zmalloc_socket to see if the memory was really zeroed,
and it was not! Also, if I zero the memory myself, the defragmentation works
correctly and does not crash.

I see from Google that there were similar problems with non-zeroed
rte_zmalloc_socket in the past.

I have also tested the 19.02 version and it has the same problem.

This is on CentOS 7.3 with Linux kernel 3.10.0-514.el7.x86_64. I don't know
whether Linux itself can return non-zeroed memory to the DPDK library in some
cases.
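A minimal check of the kind the reporter describes could look like this (an illustrative sketch, not the actual DPDK code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Verify that every byte of an allocation is zero, which is the guarantee
 * rte_zmalloc_socket() documents. Returns 1 if fully zeroed, 0 otherwise. */
static int
is_zeroed(const void *ptr, size_t len)
{
	const unsigned char *p = ptr;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Dropping such a check right after the allocation call (and logging when it fails) is a cheap way to confirm whether the allocator or some later corruption is at fault.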

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev] [Bug 392] l3fwd fails to run with eventdev

2020-02-07 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=392

Vipin Varghese (vipin.vargh...@intel.com) changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|IN_PROGRESS |RESOLVED

--- Comment #2 from Vipin Varghese (vipin.vargh...@intel.com) ---
Thanks Pavan for the patch. I am OK with this correction, but will wait for
Jerin to confirm.

Note: we can close the Bugzilla

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [dpdk-dev] [PATCH v2 0/7] MinGW-w64 support

2020-02-07 Thread Dmitry Kozliuk
Hi William,

> I applied your v2 patch and did a native build on Windows 10.
> Hit an error showing
> ../lib/librte_eal/windows/eal/eal_lcore.c:54:2: error: 'for' loop
> initial declarations are only allowed in C99 mode
>

Thanks, will fix in v3.
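For context, the quoted error comes from building in GCC's default C89 mode; a sketch of the portable fix (assuming the loop body is otherwise unchanged):

```c
#include <assert.h>

/* GCC in its default C89/gnu90 mode rejects
 *     for (int i = 0; i < n; i++) ...
 * with exactly the error quoted above. The portable fix is to move the
 * declaration out of the for statement: */
static int
count_up_to(int n)
{
	int i;       /* declared outside the loop: valid in both C89 and C99 */
	int sum = 0;

	for (i = 0; i < n; i++)
		sum += i;
	return sum;
}
```

The alternative is to build with `-std=c99` (or later), which is what enables declarations in the `for` initializer.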

> However the output looks weird:
> C:\dpdk\build\examples>dpdk-helloworld.exe
> EAL: Detected 2 lcore(s)
> EAL: Detected 1 NUMA nodes
> hehello fllo frorom cm core 1
> ore 0
>

It looks like your stdout is unbuffered (default is line-buffered). What
terminal are you using (cmd, Power Shell, Terminal App, conemu, etc)?

> C compiler for the host machine: cc (gcc 4.8.3 "cc (GCC) 4.8.3")
>

GCC 4.8.3 is quite outdated, MinGW-w64 ships GCC 8 nowadays. Do we need to
support it for Windows (I doubt MinGW-w64 does)?

--
Dmitry Kozlyuk



[dpdk-dev] [PATCH v2] examples/ioat: fix invalid link status check

2020-02-07 Thread Ciara Power
The return value of the get link function call was not checked, and
could be negative, indicating a failure. This meant the link_status of
the link being checked could be invalid, because the link structure was
not filled with data. The return value is now checked, and if it is not
0 (success), the loop continues with the next port.

To avoid confusion between variable names, the existing retval variable
is renamed to link_status, to better represent its use.

Coverity issue: 350348
Fixes: c8e6ceecebc1 ("examples/ioat: add new sample app for ioat driver")
Cc: pawelx.mod...@intel.com
Cc: sta...@dpdk.org

Signed-off-by: Ciara Power 

---
v2:
  - Changed commit log subject and description
  - Rebased onto master
---
 examples/ioat/ioatfwd.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
index e9117718f..99ecc0fde 100644
--- a/examples/ioat/ioatfwd.c
+++ b/examples/ioat/ioatfwd.c
@@ -697,7 +697,7 @@ check_link_status(uint32_t port_mask)
 {
uint16_t portid;
struct rte_eth_link link;
-   int retval = 0;
+   int ret, link_status = 0;
 
printf("\nChecking link status\n");
RTE_ETH_FOREACH_DEV(portid) {
@@ -705,7 +705,12 @@ check_link_status(uint32_t port_mask)
continue;
 
memset(&link, 0, sizeof(link));
-   rte_eth_link_get(portid, &link);
+   ret = rte_eth_link_get(portid, &link);
+   if (ret < 0) {
+   printf("Port %u link get failed: err=%d\n",
+   portid, ret);
+   continue;
+   }
 
/* Print link status */
if (link.link_status) {
@@ -714,11 +719,11 @@ check_link_status(uint32_t port_mask)
portid, link.link_speed,
(link.link_duplex == ETH_LINK_FULL_DUPLEX) ?
("full-duplex") : ("half-duplex\n"));
-   retval = 1;
+   link_status = 1;
} else
printf("Port %d Link Down\n", portid);
}
-   return retval;
+   return link_status;
 }
 
 static void
-- 
2.17.1



[dpdk-dev] Query for support hw gro offload

2020-02-07 Thread fengchengwen
Hello Ferruh,

1. The current DPDK framework provides a software GRO feature, but does not
support hardware GRO offload, and it can currently only process tcp4 and
vxlan_tcp4 streams.
2. HNS3's hardware 1620 (ARM SoC) supports hardware GRO offload; it
can process tcp4 and tcp6 streams.
3. The Linux kernel driver already supports a hardware GRO feature (named
rx-gro-hw).

So my questions:
1. Does the DPDK framework plan to support hardware GRO offload?
2. If we make such a patch, could it be upstreamed?

Regards,
Chengwen



[dpdk-dev] [PATCH v3] eal/mem: preallocate VA space in no-huge mode

2020-02-07 Thread Anatoly Burakov
When --no-huge mode is used, the memory is currently allocated with
mmap(NULL, ...). This is fine in most cases, but can fail on machines
with an IOMMU whose address width is more limited than that of the VA
space, because we are not specifying an address hint for the mmap() call.

Fix it by preallocating VA space before mapping it.

Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---

Notes:
v3:
- Fix mmap flags used in place of offset
- Fix using internal_config.memory in place of mem_sz
- Add additional address sanity check

v2:
- Add unmap on unsuccessful mmap

I couldn't figure out which specific commit introduced
the issue, so there's no Fixes tag. The most likely candidate
is the one that introduced the DMA mask in the first place,
but I'm not sure.

 lib/librte_eal/linux/eal/eal_memory.c | 24 
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/linux/eal/eal_memory.c b/lib/librte_eal/linux/eal/eal_memory.c
index 5604c2a7c0..7a9c97ff88 100644
--- a/lib/librte_eal/linux/eal/eal_memory.c
+++ b/lib/librte_eal/linux/eal/eal_memory.c
@@ -1340,6 +1340,8 @@ eal_legacy_hugepage_init(void)
 
/* hugetlbfs can be disabled */
if (internal_config.no_hugetlbfs) {
+   void *prealloc_addr;
+   size_t mem_sz;
struct rte_memseg_list *msl;
int n_segs, cur_seg, fd, flags;
 #ifdef MEMFD_SUPPORTED
@@ -1395,17 +1397,31 @@ eal_legacy_hugepage_init(void)
}
}
 #endif
-   addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
-   flags, fd, 0);
-   if (addr == MAP_FAILED) {
+   /* preallocate address space for the memory, so that it can be
+* fit into the DMA mask.
+*/
+   mem_sz = internal_config.memory;
+   prealloc_addr = eal_get_virtual_area(
+   NULL, &mem_sz, page_sz, 0, 0);
+   if (prealloc_addr == NULL) {
+   RTE_LOG(ERR, EAL,
+   "%s: reserving memory area failed: "
+   "%s\n",
+   __func__, strerror(errno));
+   return -1;
+   }
+   addr = mmap(prealloc_addr, mem_sz, PROT_READ | PROT_WRITE,
+   flags | MAP_FIXED, fd, 0);
+   if (addr == MAP_FAILED || addr != prealloc_addr) {
RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
strerror(errno));
+   munmap(prealloc_addr, mem_sz);
return -1;
}
msl->base_va = addr;
msl->page_sz = page_sz;
msl->socket_id = 0;
-   msl->len = internal_config.memory;
+   msl->len = mem_sz;
msl->heap = 1;
 
/* we're in single-file segments mode, so only the segment list
-- 
2.17.1


[dpdk-dev] [PATCH v3 1/2] net/virtio: refactor devargs parsing

2020-02-07 Thread Ivan Dyukov
Refactor the vDPA-specific devargs parsing into a more generic function.

Signed-off-by: Ivan Dyukov 
---
 drivers/net/virtio/virtio_ethdev.c | 35 +-
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 044eb10a7..22323d9a5 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1955,16 +1955,18 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
 }
 
 static int vdpa_check_handler(__rte_unused const char *key,
-   const char *value, __rte_unused void *opaque)
+   const char *value, void *ret_val)
 {
-   if (strcmp(value, "1"))
-   return -1;
+   if (strcmp(value, "1") == 0)
+   *(int *)ret_val = 1;
+   else
+   *(int *)ret_val = 0;
 
return 0;
 }
 
 static int
-vdpa_mode_selected(struct rte_devargs *devargs)
+virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa)
 {
struct rte_kvargs *kvlist;
const char *key = "vdpa";
@@ -1980,12 +1982,16 @@ vdpa_mode_selected(struct rte_devargs *devargs)
if (!rte_kvargs_count(kvlist, key))
goto exit;
 
-   /* vdpa mode selected when there's a key-value pair: vdpa=1 */
-   if (rte_kvargs_process(kvlist, key,
-   vdpa_check_handler, NULL) < 0) {
-   goto exit;
+   if (vdpa) {
+   /* vdpa mode selected when there's a key-value pair:
+* vdpa=1
+*/
+   ret = rte_kvargs_process(kvlist, key,
+   vdpa_check_handler, vdpa);
+   if (ret  < 0)
+   goto exit;
}
-   ret = 1;
+
 
 exit:
rte_kvargs_free(kvlist);
@@ -1995,8 +2001,17 @@ vdpa_mode_selected(struct rte_devargs *devargs)
 static int eth_virtio_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
 {
+   int vdpa = 0;
+   int ret = 0;
+
+   ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa);
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR,
+   "devargs parsing is failed");
+   return ret;
+   }
/* virtio pmd skips probe if device needs to work in vdpa mode */
-   if (vdpa_mode_selected(pci_dev->device.devargs))
+   if (vdpa == 1)
return 1;
 
return rte_eth_dev_pci_generic_probe(pci_dev, sizeof(struct virtio_hw),
-- 
2.17.1



[dpdk-dev] [PATCH v3 2/2] net/virtio: add link speed devarg

2020-02-07 Thread Ivan Dyukov
Some applications like pktgen use link_speed to calculate the
transmit rate, which limits outgoing traffic to a hardcoded 10G.

This patch adds a link_speed devarg which allows configuring the
link_speed of a virtio device.

Signed-off-by: Ivan Dyukov 
---
 doc/guides/nics/virtio.rst |   7 ++
 drivers/net/virtio/virtio_ethdev.c | 101 -
 drivers/net/virtio/virtio_pci.h|   1 +
 3 files changed, 92 insertions(+), 17 deletions(-)

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index d1f5fb898..f190f2e4f 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -356,6 +356,13 @@ Below devargs are supported by the PCI virtio driver:
 a virtio device needs to work in vDPA mode.
 (Default: 0 (disabled))
 
+#.  ``link_speed``:
+
+It is used to specify link speed of virtio device. Link speed is a part of
+link status structure. It could be requested by application using
+rte_eth_link_get_nowait function.
+(Default: 1 (10G))
+
 Below devargs are supported by the virtio-user vdev:
 
 #.  ``path``:
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 22323d9a5..5ef3c11a7 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -45,6 +45,10 @@ static int virtio_dev_promiscuous_enable(struct rte_eth_dev *dev);
 static int virtio_dev_promiscuous_disable(struct rte_eth_dev *dev);
 static int virtio_dev_allmulticast_enable(struct rte_eth_dev *dev);
 static int virtio_dev_allmulticast_disable(struct rte_eth_dev *dev);
+static uint32_t virtio_dev_speed_capa_get(uint32_t link_speed);
+static int virtio_dev_devargs_parse(struct rte_devargs *devargs,
+   int *vdpa,
+   uint32_t *link_speed);
 static int virtio_dev_info_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
 static int virtio_dev_link_update(struct rte_eth_dev *dev,
@@ -1861,6 +1865,7 @@ int
 eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
+   uint32_t link_speed = ETH_SPEED_NUM_10G;
int ret;
 
if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) {
@@ -1886,7 +1891,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
 
return 0;
}
-
+   ret = virtio_dev_devargs_parse(eth_dev->device->devargs,
+NULL, &link_speed);
+   if (ret < 0)
+   return ret;
+   hw->link_speed = link_speed;
/*
 * Pass the information to the rte_eth_dev_close() that it should also
 * release the private port resources.
@@ -1953,6 +1962,14 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
 
return 0;
 }
+#define VIRTIO_ARG_LINK_SPEED "link_speed"
+#define VIRTIO_ARG_VDPA   "vdpa"
+
+static const char * const valid_args[] = {
+   VIRTIO_ARG_LINK_SPEED,
+   VIRTIO_ARG_VDPA,
+   NULL
+};
 
 static int vdpa_check_handler(__rte_unused const char *key,
const char *value, void *ret_val)
@@ -1965,33 +1982,84 @@ static int vdpa_check_handler(__rte_unused const char *key,
return 0;
 }
 
+
+static uint32_t
+virtio_dev_speed_capa_get(uint32_t link_speed)
+{
+   switch (link_speed) {
+   case ETH_SPEED_NUM_10G:
+   return ETH_LINK_SPEED_10G;
+   case ETH_SPEED_NUM_20G:
+   return ETH_LINK_SPEED_20G;
+   case ETH_SPEED_NUM_25G:
+   return ETH_LINK_SPEED_25G;
+   case ETH_SPEED_NUM_40G:
+   return ETH_LINK_SPEED_40G;
+   case ETH_SPEED_NUM_50G:
+   return ETH_LINK_SPEED_50G;
+   case ETH_SPEED_NUM_56G:
+   return ETH_LINK_SPEED_56G;
+   case ETH_SPEED_NUM_100G:
+   return ETH_LINK_SPEED_100G;
+   default:
+   return 0;
+   }
+}
+
+static int link_speed_handler(const char *key __rte_unused,
+   const char *value, void *ret_val)
+{
+   uint32_t val;
+   if (!value || !ret_val)
+   return -EINVAL;
+   val = strtoul(value, NULL, 0);
+   /* validate input */
+   if (virtio_dev_speed_capa_get(val) == 0)
+   return -EINVAL;
+   *(uint32_t *)ret_val = val;
+
+   return 0;
+}
+
+
 static int
-virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa)
+virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa,
+   uint32_t *link_speed)
 {
struct rte_kvargs *kvlist;
-   const char *key = "vdpa";
int ret = 0;
 
if (devargs == NULL)
return 0;
 
-   kvlist = rte_kvargs_parse(devargs->args, NULL);
-   if (kvlist == NULL)
+   kvlist = rte_kvargs_parse(devargs->args, valid_args);
+   if (kvlist == NULL) {
+   PMD_INIT_LOG(ERR, "error when parsing param");
return 0;
-
-   if (!rte_kvargs_count(kvlist, key))
-   goto exit;
-
-   if (vdpa) {
+   }
+   if 

Re: [dpdk-dev] [PATCH v2 2/2] net/virtio: add link speed devarg

2020-02-07 Thread Ivan Dyukov
Hi Maxime, Adrian,

Thank you for the comments! I have sent a new patch with updated documentation.
This is not the final version of the link speed work. I still plan to rework
it to use the link speed stored in QEMU.

Best regards,
Ivan
On 06.02.2020 17:26, Adrian Moreno wrote:
> On 2/6/20 3:22 PM, Maxime Coquelin wrote:
>> Adding back Ivan as you removed it from the To: list.
>> So he may not have seen your comment.
I really missed it. Thanks.

> Oops, sorry about that.
> Adrian
>> On 1/29/20 11:10 AM, Adrian Moreno wrote:
>>> On 1/20/20 6:05 PM, Ivan Dyukov wrote:
 Some applications like pktgen use link_speed to calculate
 transmit rate. It limits outcome traffic to hardcoded 10G.

 This patch adds link_speed devarg which allows to configure
 link_speed of virtio device.

 Signed-off-by: Ivan Dyukov 
 ---
   drivers/net/virtio/virtio_ethdev.c | 101 -
   drivers/net/virtio/virtio_pci.h|   1 +
   2 files changed, 85 insertions(+), 17 deletions(-)

>>> Hi Ivan,
>>>
>>> IMHO, this new option deserves being documented in 
>>> doc/guides/nics/virtio.rst.
>>>
>>> Otherwise it looks good to me.
>> I agree with Adrian here, the new option need to be documented.
>>
>> Thanks,
>> Maxime
>>
>>> Thank you.
>>> Adrian
 diff --git a/drivers/net/virtio/virtio_ethdev.c 
 b/drivers/net/virtio/virtio_ethdev.c
 index 22323d9a5..5ef3c11a7 100644
 --- a/drivers/net/virtio/virtio_ethdev.c
 +++ b/drivers/net/virtio/virtio_ethdev.c
 @@ -45,6 +45,10 @@ static int virtio_dev_promiscuous_enable(struct 
 rte_eth_dev *dev);
   static int virtio_dev_promiscuous_disable(struct rte_eth_dev *dev);
   static int virtio_dev_allmulticast_enable(struct rte_eth_dev *dev);
   static int virtio_dev_allmulticast_disable(struct rte_eth_dev *dev);
 +static uint32_t virtio_dev_speed_capa_get(uint32_t link_speed);
 +static int virtio_dev_devargs_parse(struct rte_devargs *devargs,
 +  int *vdpa,
 +  uint32_t *link_speed);
   static int virtio_dev_info_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
   static int virtio_dev_link_update(struct rte_eth_dev *dev,
 @@ -1861,6 +1865,7 @@ int
   eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
   {
struct virtio_hw *hw = eth_dev->data->dev_private;
 +  uint32_t link_speed = ETH_SPEED_NUM_10G;
int ret;
   
if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) {
 @@ -1886,7 +1891,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
   
return 0;
}
 -
 +  ret = virtio_dev_devargs_parse(eth_dev->device->devargs,
 +   NULL, &link_speed);
 +  if (ret < 0)
 +  return ret;
 +  hw->link_speed = link_speed;
/*
 * Pass the information to the rte_eth_dev_close() that it should also
 * release the private port resources.
 @@ -1953,6 +1962,14 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
   
return 0;
   }
 +#define VIRTIO_ARG_LINK_SPEED "link_speed"
 +#define VIRTIO_ARG_VDPA   "vdpa"
 +
 +static const char * const valid_args[] = {
 +  VIRTIO_ARG_LINK_SPEED,
 +  VIRTIO_ARG_VDPA,
 +  NULL
 +};
   
   static int vdpa_check_handler(__rte_unused const char *key,
const char *value, void *ret_val)
 @@ -1965,33 +1982,84 @@ static int vdpa_check_handler(__rte_unused const 
 char *key,
return 0;
   }
   
 +
 +static uint32_t
 +virtio_dev_speed_capa_get(uint32_t link_speed)
 +{
 +  switch (link_speed) {
 +  case ETH_SPEED_NUM_10G:
 +  return ETH_LINK_SPEED_10G;
 +  case ETH_SPEED_NUM_20G:
 +  return ETH_LINK_SPEED_20G;
 +  case ETH_SPEED_NUM_25G:
 +  return ETH_LINK_SPEED_25G;
 +  case ETH_SPEED_NUM_40G:
 +  return ETH_LINK_SPEED_40G;
 +  case ETH_SPEED_NUM_50G:
 +  return ETH_LINK_SPEED_50G;
 +  case ETH_SPEED_NUM_56G:
 +  return ETH_LINK_SPEED_56G;
 +  case ETH_SPEED_NUM_100G:
 +  return ETH_LINK_SPEED_100G;
 +  default:
 +  return 0;
 +  }
 +}
 +
 +static int link_speed_handler(const char *key __rte_unused,
 +  const char *value, void *ret_val)
 +{
 +  uint32_t val;
 +  if (!value || !ret_val)
 +  return -EINVAL;
 +  val = strtoul(value, NULL, 0);
 +  /* validate input */
 +  if (virtio_dev_speed_capa_get(val) == 0)
 +  return -EINVAL;
 +  *(uint32_t *)ret_val = val;
 +
 +  return 0;
 +}
 +
 +
   static int
 -virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa)
 +virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa,
 +  uint32_t *link_speed)
   {
struct rte_kvargs *kvlist;
 -  const

Re: [dpdk-dev] Query for support hw gro offload

2020-02-07 Thread Ferruh Yigit
On 2/7/2020 10:39 AM, fengchengwen wrote:
> Hello Ferruh,
> 
> 1. The current DPDK framework provides a software GRO feature, but does not
> support hardware GRO offload, and it can currently only process tcp4 and
> vxlan_tcp4 streams.
> 2. HNS3's hardware 1620 (ARM SoC) supports hardware GRO offload; it
> can process tcp4 and tcp6 streams.
> 3. The Linux kernel driver already supports a hardware GRO feature (named
> rx-gro-hw).
> 
> So my questions:
> 1. Does the DPDK framework plan to support hardware GRO offload?
> 2. If we make such a patch, could it be upstreamed?

Hi Chengwen,

cc'ed ethdev maintainers.

Can't you use LRO [1] for HW GRO offload?

[1]
http://lxr.dpdk.org/dpdk/v19.11/source/doc/guides/nics/features.rst#L193
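For reference, the core coalescing rule that GRO/LRO applies, whether in software or hardware, can be modeled minimally (hypothetical struct for illustration, not DPDK's rte_gro types):

```c
#include <assert.h>
#include <stdint.h>

/* Two TCP segments of the same flow can merge when the second one starts
 * exactly where the first one ends in TCP sequence space. */
struct seg {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint32_t seq;          /* TCP sequence number */
	uint16_t payload_len;  /* TCP payload bytes */
};

static int
can_coalesce(const struct seg *a, const struct seg *b)
{
	if (a->src_ip != b->src_ip || a->dst_ip != b->dst_ip ||
	    a->src_port != b->src_port || a->dst_port != b->dst_port)
		return 0; /* different flow */
	return b->seq == a->seq + a->payload_len; /* contiguous data */
}
```

Hardware GRO/LRO applies this per-flow merge in the NIC, so the application sees one large segment per burst instead of many MTU-sized ones.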



Re: [dpdk-dev] [PATCH v2] service: don't walk out of bounds when checking services

2020-02-07 Thread Kevin Traynor
On 20/12/2019 14:43, David Marchand wrote:
> On Wed, Dec 4, 2019 at 9:34 AM David Marchand  
> wrote:
>>
>> On Wed, Dec 4, 2019 at 9:33 AM David Marchand  
>> wrote:
>>>
>>> On Tue, Dec 3, 2019 at 10:15 PM Aaron Conole  wrote:

 The service_valid call is used without properly bounds checking the
 input parameter.  Almost all instances of the service_valid call are
 inside a for() loop that prevents excessive walks, but some of the
 public APIs don't bounds check and will pass invalid arguments.

 Prevent this by using SERVICE_GET_OR_ERR_RET where it makes sense,
 and adding a bounds check to one service_valid() use.

 Fixes: 8d39d3e237c2 ("service: fix race in service on app lcore function")
 Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
 Fixes: e30dd31847d2 ("service: add mechanism for quiescing")
>> Cc: sta...@dpdk.org
>>

With the commit below, this patch will apply cleanly on 18.11.

Seems ok to me to add below commit, wdyt?

commit e484ccddbe1b41886fef1e445ef2fdfa55086198
Author: Nikhil Rao 
Date:   Mon Sep 16 15:31:02 2019 +0530

service: avoid false sharing on core state


 Signed-off-by: Aaron Conole 
>>>
>>> Reviewed-by: David Marchand 
> 
> Applied, thanks.
> 
> 
> --
> David Marchand
> 
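The bounds-check pattern the patch describes (a SERVICE_GET_OR_ERR_RET-style guard) can be sketched generically (hypothetical stand-ins, not the actual service library code):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_SERVICES 8

struct service { int valid; int calls; };
static struct service services[MAX_SERVICES];

/* Validate the id before indexing, instead of letting a public API walk
 * out of bounds with an unchecked argument. */
#define SERVICE_GET_OR_ERR_RET(id, sptr) do {              \
		if ((id) >= MAX_SERVICES || !services[(id)].valid) \
			return -EINVAL;                                \
		(sptr) = &services[(id)];                          \
	} while (0)

static int
service_run(uint32_t id)
{
	struct service *s;

	SERVICE_GET_OR_ERR_RET(id, s); /* bounds + validity in one place */
	s->calls++;
	return 0;
}
```

Routing every public entry point through one guard macro is what keeps the fix small: the internal for() loops stay as they are, and only the unchecked user-facing paths gain the check.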



Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

2020-02-07 Thread Coyle, David
Hi Jerin, see below

> 
> On Thu, Feb 6, 2020 at 10:01 PM Coyle, David 
> wrote:
> 
> Hi David,
> 
> > >
> > >
> > > > > > - XGS-PON MAC: Crypto-CRC-BIP
> > > > > > - Order:
> > > > > > - Downstream: CRC, Encrypt, BIP
> > > > >
> > > > > I understand if the chain has two operations then it may
> > > > > possible to have handcrafted SW code to do both operations in one
> pass.
> > > > > I understand the spec is agnostic on a number of passes it does
> > > > > require to enable the xfrom but To understand the SW/HW
> > > > > capability, In the above case, "CRC, Encrypt, BIP", It is done
> > > > > in one pass in SW or three passes in SW or one pass using HW?
> > > >
> > > > [DC] The CRC, Encrypt, BIP is also currently done as 1 pass in
> > > > AESNI MB
> > > library SW.
> > > > However, this could also be performed as a single pass in a HW
> > > > accelerator
> > >
> > > As a specification, cascading the xform chains make sense.
> > > Do we have any HW that does support chaining the xforms more than
> "two"
> > > in one pass?
> > > i.e real chaining function where two blocks of HWs work hand in hand
> > > for chaining.
> > > If none, it may be better to abstract as synonymous API(No dequeue,
> > > no
> > > enqueue) for the CPU use case.
> >
> > [DC] I'm not aware of any HW that supports this at the moment, but that's
> not to say it couldn't in the future - if anyone else has any examples though,
> please feel free to share.
> > Regardless, I don't see why we would introduce a different API for SW
> devices and HW devices.
> 
> There is a risk in drafting API that meant for HW without any HW exists.
> Because there could be inefficiency on the metadata and fast path API for
> both models.
> For example, In the case of CPU based scheme, it will be pure overhead
> emulate the "queue"(the enqueue and dequeue) for the sake of abstraction
> where CPU works better in the synchronous model and I have doubt that the
> session-based scheme will work for HW or not as both difference  HW needs
> to work hand in hand(IOMMU aspects for two PCI device)

[DC] I understand what you are saying about the overhead of emulating the "sw 
queue" but this same model is already used in many of the existing device PMDs.
In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for 
compression, the enqueue/dequeue in the PMD is emulated through an rte_ring 
which is very efficient.
The accelerator API will use the existing device PMDs so keeping the same model 
seems like a sensible approach.

From an application's point of view, this abstraction of the underlying device 
type is important for usability and maintainability -  the application doesn't 
need to know
the device type as such and therefore doesn't need to make different API calls. 

The enqueue/dequeue type API was also used with QAT in mind. While QAT HW 
doesn't support these xform chains at the moment, it could potentially do so in 
the future.
As a side note, as part of the work of adding the accelerator API, the QAT PMD 
will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where 
the Crypto
is done on QAT HW and the CRC will be done in SW, most likely through a call to 
the optimized rte_net_crc library. This will give a consistent API for the 
DOCSIS-MAC data-plane
pipeline prototype we have developed, which uses both AESNI-MB and QAT for 
benchmarks.

We will take your feedback on the enqueue/dequeue approach for SW devices into 
consideration though during development.

Finally, I'm unsure what you mean by this line:

"I have doubt that the session-based scheme will work for HW or not as 
both difference  HW needs to work hand in hand(IOMMU aspects for two PCI 
device)"

What do you mean by different HW working "hand in hand" and "two PCI device"?
The intention is that 1 HW device (or its PMD) would have to support the accel 
xform chain.

> 
> Having said that, I agree with the need for use case and API for CPU case. 
> Till
> we find a HW spec, we need to make the solution as CPU specific and latter
> extend based on HW metadata required.
> Accelerator API sounds like HW accelerator and there is no HW support then
> it may not good. We can change the API that works for the use cases that we
> know how it works efficiently.
> 
> 
> 
> 
> 
> 
> 
> > It would be up to each underlying PMD to decide if/how it supports a
> > particular accelerator xform chain, but from an application's point of
> > view, the accelerator API is always the same
> >
> >


Re: [dpdk-dev] [RFT] net/netvsc: initialize link state

2020-02-07 Thread Mohammed Gamal
On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote:
> If application is using link state interrupt, the correct link state
> needs to be filled in when device is started. This is similar to
> how virtio updates link information.
> 
> Reported-by: Mohammed Gamal 
> Signed-off-by: Stephen Hemminger 
> ---
> This version marked RFT because am in airport without access to a
> machine to test it.
> 
>  drivers/net/netvsc/hn_ethdev.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/netvsc/hn_ethdev.c
> b/drivers/net/netvsc/hn_ethdev.c
> index c79f924379fe..564620748daf 100644
> --- a/drivers/net/netvsc/hn_ethdev.c
> +++ b/drivers/net/netvsc/hn_ethdev.c
> @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev)
>   if (error)
>   hn_rndis_set_rxfilter(hv, 0);
>  
> + /* Initialize Link state */
> + if (error == 0)
> + hn_dev_link_update(dev, 0);
> +
>   return error;
>  }
>  

I tested this and I always get the link status as UP, regardless of
whether I start the interface on the guest in UP or DOWN state. Looking
at hn_dev_link_update() code, I see that the link status depends on the
NDIS status that the driver gets from the host if my understanding is
correct.
The question is whether using 'ip li set dev $IF_NAME down' on the
guest affects the status the host sees, or whether the host would set the
state to NDIS_MEDIA_STATE_CONNECTED if the device is physically connected,
regardless of what the guest tries to do?



Re: [dpdk-dev] [PATCH] mbuf: fix to update documentation of QinQ stripped bit interpretation

2020-02-07 Thread Somnath Kotur
Olivier,

On Thu, Feb 6, 2020 at 10:55 PM Olivier Matz  wrote:
>
> Hi Somnath,
>
> Sorry for the delay, please find some comments below.
>
> I suggest the following title instead:
>
>   mbuf: extend meaning of QinQ stripped bit
>
> On Mon, Jan 06, 2020 at 02:04:23PM +0530, Somnath Kotur wrote:
> > Certain hardware may be able to strip and/or save only the outermost
> > VLAN instead of both the VLANs in the mbuf in a QinQ scenario.
> > To handle such cases, we could re-interpret setting of just
> > PKT_RX_QINQ_STRIPPED to indicate that only the outermost VLAN has
> > been stripped and saved in mbuf->vlan_tci_outer.
>
> To be sure we're on the same page: we are talking about case 7 of this
> link: http://inbox.dpdk.org/dev/20191227145041.GQ22738@platinum/

I'm not sure we are on the same page then, please see my response inline below
>
> So, even if the inner vlan_tci is not stripped from packet data, it has
> to be saved in m->vlan_tci, because it is implied by PKT_RX_VLAN.
>
> From the same link, the case where the driver only strips+saves the
> outer vlan without saving or stripping the inner one is case 3.
>
While this is how it works currently, I'm wondering how the application
will know if this was a double VLAN pkt, correct?
Also, when I look at options 5 and 7 I don't really see the difference
in semantics between them?
Both seem to store the outer-vlan and inner-vlan in m->vlan_tci_outer
and m->vlan_tci_inner respectively

7/ PKT_RX_VLAN | PKT_RX_QINQ | PKT_RX_QINQ_STRIPPED
   outer-vlan is removed from packet data
   m->vlan_tci_outer=outer-vlan
   m->vlan_tci=inner-vlan

I was hoping that with the new interpretation of PKT_RX_QINQ_STRIPPED,
it only meant that outer-vlan is removed from packet data
and m->vlan_tci_outer = outer_vlan, while PKT_RX_QINQ implies it is a
double-vlan pkt and PKT_RX_VLAN implies that pkt has VLAN
associated with it?   Not m->vlan_tci = inner-vlan

Thanks
Som


> > Only When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2
> > VLANs have been stripped by the hardware and their TCI are saved in
> > mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
> >
> > Signed-off-by: Somnath Kotur 
> > ---
> >  lib/librte_mbuf/rte_mbuf_core.h | 15 +++
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf_core.h 
> > b/lib/librte_mbuf/rte_mbuf_core.h
> > index 9a8557d..db1070b 100644
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -124,12 +124,19 @@
> >  #define PKT_RX_FDIR_FLX  (1ULL << 14)
> >
> >  /**
> > - * The 2 vlans have been stripped by the hardware and their tci are
> > - * saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
> > + * The outer vlan has been stripped by the hardware and their tci are
> > + * saved in mbuf->vlan_tci_outer (outer).
>
> their tci are -> its tci is
>
> >   * This can only happen if vlan stripping is enabled in the RX
> >   * configuration of the PMD.
> > - * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN |
> > - * PKT_RX_VLAN_STRIPPED | PKT_RX_QINQ) must also be set.
> > + * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN |  PKT_RX_QINQ)
> > + * must also be set.
>
> ok
>
> > + * When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2 
> > vlans
> > + * have been stripped by the hardware and their tci are saved in
> > + * mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
>
> This is correct, but I'd use a bullet list to add another sentence:
>
>  * - If both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2 
> vlans
>  *   have been stripped by the hardware and their tci are saved in
>  *   mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
>  * - If PKT_RX_QINQ_STRIPPED is set and PKT_RX_VLAN_STRIPPED is unset, only 
> the
>  *   outer vlan is removed from packet data, but both tci are saved in
>  *   mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
>
> > + * This can only happen if vlan stripping is enabled in the RX 
> > configuration
> > + * of the PMD.
>
> The same exact sentence is above, this one can be removed.
>
> > + * When PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set,
> > + * (PKT_RX_VLAN | PKT_RX_QINQ) must also be set.
>
> This can be removed too as it is redundant with above sentence:
>
>  * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN |  PKT_RX_QINQ)
>  * must also be set.
>
>
> Thanks,
> Olivier


[dpdk-dev] [PATCH] crypto/ccp: fix queue alignment

2020-02-07 Thread David Marchand
Caught by compiling with -fno-common.
A cacheline_aligned symbol can be found in the crypto/ccp driver
object files.

Looking at this driver source, the cacheline_aligned (kernel?)
alignment macro is undefined.
The compiler treats this as a symbol definition and generates a global
symbol.

Fixes: ef4b04f87fa6 ("crypto/ccp: support device init")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 drivers/crypto/ccp/ccp_dev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/ccp/ccp_dev.h b/drivers/crypto/ccp/ccp_dev.h
index f4ad9eafd..37e04218c 100644
--- a/drivers/crypto/ccp/ccp_dev.h
+++ b/drivers/crypto/ccp/ccp_dev.h
@@ -220,7 +220,7 @@ struct ccp_queue {
/**< lsb assigned for sha ctx */
uint32_t sb_hmac;
/**< lsb assigned for hmac ctx */
-} cacheline_aligned;
+} __rte_cache_aligned;
 
 /**
  * A structure describing a CCP device.
-- 
2.23.0



Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

2020-02-07 Thread Jerin Jacob
On Fri, Feb 7, 2020 at 6:08 PM Coyle, David  wrote:
>
> Hi Jerin, see below

Hi David,

> >
> > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David 
> > wrote:
> >

> >
> > There is a risk in drafting API that meant for HW without any HW exists.
> > Because there could be inefficiency on the metadata and fast path API for
> > both models.
> > For example, In the case of CPU based scheme, it will be pure overhead
> > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction
> > where CPU works better in the synchronous model and I have doubt that the
> > session-based scheme will work for HW or not as both difference  HW needs
> > to work hand in hand(IOMMU aspects for two PCI device)
>
> [DC] I understand what you are saying about the overhead of emulating the "sw 
> queue" but this same model is already used in many of the existing device 
> PMDs.
> In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for 
> compression, the enqueue/dequeue in the PMD is emulated through an rte_ring 
> which is very efficient.
> The accelerator API will use the existing device PMDs so keeping the same 
> model seems like a sensible approach.

In this release, we added CPU crypto support in cryptodev to support
the synchronous model to fix the overhead.

>
> From an application's point of view, this abstraction of the underlying 
> device type is important for usability and maintainability -  the application 
> doesn't need to know
> the device type as such and therefore doesn't need to make different API 
> calls.
>
> The enqueue/dequeue type API was also used with QAT in mind. While QAT HW 
> doesn't support these xform chains at the moment, it could potentially do so 
> in the future.
> As a side note, as part of the work of adding the accelerator API, the QAT 
> PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, 
> where the Crypto
> is done on QAT HW and the CRC will be done in SW, most likely through a call 
> to the optimized rte_net_crc library. This will give a consistent API for the 
> DOCSIS-MAC data-plane
> pipeline prototype we have developed, which uses both AESNI-MB and QAT for 
> benchmarks.
>
> We will take your feedback on the enqueue/dequeue approach for SW devices 
> into consideration though during development.
>
> Finally, I'm unsure what you mean by this line:
>
> "I have doubt that the session-based scheme will work for HW or not 
> as both difference  HW needs to work hand in hand(IOMMU aspects for two PCI 
> device)"
>
> What do mean by different HW working "hand in hand" and "two PCI device"?
> The intention is that 1 HW device (or it's PMD) would have to support the 
> accel xform chain

I was thinking, it will be N PCIe devices that create the chain. Each
distinct PCI device does the fixed-function and chains them together.

I do understand the usage of QAT HW and CRC in SW.
So If I understand it correctly, in rte_security, we are combining
rte_ethdev and rte_cryptodev. With this spec, we are trying to
combine,
rte_cryptodev and rte_compressdev. So it looks good to me. My only
remaining concern is the name of this API: "accelerator" is too generic
a name. IMO, like rte_security, we may need to give a more meaningful
name for the use case where cryptodev and compressdev can work together.


[dpdk-dev] [PATCH] test/crypto: add cpu crypto mode tests

2020-02-07 Thread Marcin Smoczynski
This patch adds ability to run unit tests in cpu crypto mode for AESNI
GCM cryptodev.

Signed-off-by: Marcin Smoczynski 
---
 app/test/test_cryptodev.c | 181 --
 1 file changed, 172 insertions(+), 9 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index e6abc22b6..7b1ef5c86 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2015-2019 Intel Corporation
+ * Copyright(c) 2015-2020 Intel Corporation
  */
 
 #include 
@@ -52,6 +52,9 @@
 
 static int gbl_driver_id;
 
+static enum rte_security_session_action_type gbl_action_type =
+   RTE_SECURITY_ACTION_TYPE_NONE;
+
 struct crypto_testsuite_params {
struct rte_mempool *mbuf_pool;
struct rte_mempool *large_mbuf_pool;
@@ -139,9 +142,97 @@ ceil_byte_length(uint32_t num_bits)
return (num_bits >> 3);
 }
 
+static void
+process_cpu_gmac_op(uint8_t dev_id, struct rte_crypto_op *op)
+{
+   int32_t n, st;
+   void *iv;
+   struct rte_crypto_sym_op *sop;
+   union rte_crypto_sym_ofs ofs;
+   struct rte_crypto_sgl sgl;
+   struct rte_crypto_sym_vec symvec;
+   struct rte_crypto_vec vec[UINT8_MAX];
+
+   sop = op->sym;
+
+   n = rte_crypto_mbuf_to_vec(sop->m_src, sop->auth.data.offset,
+   sop->auth.data.length, vec, RTE_DIM(vec));
+
+   if (n < 0 || n != sop->m_src->nb_segs) {
+   op->status = RTE_CRYPTO_OP_STATUS_ERROR;
+   return;
+   }
+
+   sgl.vec = vec;
+   sgl.num = n;
+   symvec.sgl = &sgl;
+   iv = rte_crypto_op_ctod_offset(op, void *, IV_OFFSET);
+   symvec.iv = &iv;
+   symvec.aad = NULL;
+   symvec.digest = (void **)&sop->auth.digest.data;
+   symvec.status = &st;
+   symvec.num = 1;
+
+   ofs.raw = 0;
+
+   n = rte_cryptodev_sym_cpu_crypto_process(dev_id, sop->session, ofs,
+   &symvec);
+
+   if (n != 1)
+   op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
+   else
+   op->status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+}
+
+
+static void
+process_cpu_aead_op(uint8_t dev_id, struct rte_crypto_op *op)
+{
+   int32_t n, st;
+   void *iv;
+   struct rte_crypto_sym_op *sop;
+   union rte_crypto_sym_ofs ofs;
+   struct rte_crypto_sgl sgl;
+   struct rte_crypto_sym_vec symvec;
+   struct rte_crypto_vec vec[UINT8_MAX];
+
+   sop = op->sym;
+
+   n = rte_crypto_mbuf_to_vec(sop->m_src, sop->aead.data.offset,
+   sop->aead.data.length, vec, RTE_DIM(vec));
+
+   if (n < 0 || n != sop->m_src->nb_segs) {
+   op->status = RTE_CRYPTO_OP_STATUS_ERROR;
+   return;
+   }
+
+   sgl.vec = vec;
+   sgl.num = n;
+   symvec.sgl = &sgl;
+   iv = rte_crypto_op_ctod_offset(op, void *, IV_OFFSET);
+   symvec.iv = &iv;
+   symvec.aad = (void **)&sop->aead.aad.data;
+   symvec.digest = (void **)&sop->aead.digest.data;
+   symvec.status = &st;
+   symvec.num = 1;
+
+   ofs.raw = 0;
+
+   n = rte_cryptodev_sym_cpu_crypto_process(dev_id, sop->session, ofs,
+   &symvec);
+
+   if (n != 1)
+   op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED;
+   else
+   op->status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+}
+
 static struct rte_crypto_op *
 process_crypto_request(uint8_t dev_id, struct rte_crypto_op *op)
 {
+
+   RTE_VERIFY(gbl_action_type != RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO);
+
if (rte_cryptodev_enqueue_burst(dev_id, 0, &op, 1) != 1) {
RTE_LOG(ERR, USER1, "Error sending packet for encryption\n");
return NULL;
@@ -6937,7 +7028,11 @@ test_authenticated_encryption(const struct 
aead_test_data *tdata)
ut_params->op->sym->m_src = ut_params->ibuf;
 
/* Process crypto operation */
-   TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params->valid_devs[0],
+   if (gbl_action_type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO)
+   process_cpu_aead_op(ts_params->valid_devs[0], ut_params->op);
+   else
+   TEST_ASSERT_NOT_NULL(
+   process_crypto_request(ts_params->valid_devs[0],
ut_params->op), "failed to process sym crypto op");
 
TEST_ASSERT_EQUAL(ut_params->op->status, RTE_CRYPTO_OP_STATUS_SUCCESS,
@@ -7868,7 +7963,11 @@ test_authenticated_decryption(const struct 
aead_test_data *tdata)
ut_params->op->sym->m_src = ut_params->ibuf;
 
/* Process crypto operation */
-   TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params->valid_devs[0],
+   if (gbl_action_type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO)
+   process_cpu_aead_op(ts_params->valid_devs[0], ut_params->op);
+   else
+   TEST_ASSERT_NOT_NULL(
+   process_crypto_request(ts_params->valid_devs[0],
ut_params-

Re: [dpdk-dev] [PATCH v2] service: don't walk out of bounds when checking services

2020-02-07 Thread Aaron Conole
Kevin Traynor  writes:

> On 20/12/2019 14:43, David Marchand wrote:
>> On Wed, Dec 4, 2019 at 9:34 AM David Marchand  
>> wrote:
>>>
>>> On Wed, Dec 4, 2019 at 9:33 AM David Marchand  
>>> wrote:

 On Tue, Dec 3, 2019 at 10:15 PM Aaron Conole  wrote:
>
> The service_valid call is used without properly bounds checking the
> input parameter.  Almost all instances of the service_valid call are
> inside a for() loop that prevents excessive walks, but some of the
> public APIs don't bounds check and will pass invalid arguments.
>
> Prevent this by using SERVICE_GET_OR_ERR_RET where it makes sense,
> and adding a bounds check to one service_valid() use.
>
> Fixes: 8d39d3e237c2 ("service: fix race in service on app lcore function")
> Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
> Fixes: e30dd31847d2 ("service: add mechanism for quiescing")
>>> Cc: sta...@dpdk.org
>>>
>
> With the commit below, this patch will apply cleanly on 18.11.
>
> Seems ok to me to add below commit, wdyt?

If I'm reading it correctly, the move is for an internal data structure
in librte_eal, so I think it shouldn't be an ABI breakage.

Looks safe to me as well.

> commit e484ccddbe1b41886fef1e445ef2fdfa55086198
> Author: Nikhil Rao 
> Date:   Mon Sep 16 15:31:02 2019 +0530
>
> service: avoid false sharing on core state
>
>
> Signed-off-by: Aaron Conole 

 Reviewed-by: David Marchand 
>> 
>> Applied, thanks.
>> 
>> 
>> --
>> David Marchand
>> 



Re: [dpdk-dev] [PATCH] mbuf: fix to update documentation of QinQ stripped bit interpretation

2020-02-07 Thread Olivier Matz
Hi Somnath,

On Fri, Feb 07, 2020 at 07:13:04PM +0530, Somnath Kotur wrote:
> Olivier,
> 
> On Thu, Feb 6, 2020 at 10:55 PM Olivier Matz  wrote:
> >
> > Hi Somnath,
> >
> > Sorry for the delay, please find some comments below.
> >
> > I suggest the following title instead:
> >
> >   mbuf: extend meaning of QinQ stripped bit
> >
> > On Mon, Jan 06, 2020 at 02:04:23PM +0530, Somnath Kotur wrote:
> > > Certain hardware may be able to strip and/or save only the outermost
> > > VLAN instead of both the VLANs in the mbuf in a QinQ scenario.
> > > To handle such cases, we could re-interpret setting of just
> > > PKT_RX_QINQ_STRIPPED to indicate that only the outermost VLAN has
> > > been stripped and saved in mbuf->vlan_tci_outer.
> >
> > To be sure we're on the same page: we are talking about case 7 of this
> > link: http://inbox.dpdk.org/dev/20191227145041.GQ22738@platinum/
> 
> I'm not sure we are on the same page then, please see my response inline below
> >
> > So, even if the inner vlan_tci is not stripped from packet data, it has
> > to be saved in m->vlan_tci, because it is implied by PKT_RX_VLAN.
> >
> > From the same link, the case where the driver only strips+saves the
> > outer vlan without saving or stripping the inner one is case 3.
> >
> While this is how it works currently, I'm wondering how will the
> application know if this was
> a double VLAN pkt, correct?

If the hardware supports it and configured for it, the flag PKT_RX_QINQ
is set when there are 2 vlans.

FYI, the m->packet_type field can also be useful. It describes what is
really present in the packet data, as described in rte_mbuf_core.h:

/*
 * The packet type, which is the combination of outer/inner L2, L3, L4
 * and tunnel types. The packet_type is about data really present in the
 * mbuf. Example: if vlan stripping is enabled, a received vlan packet
 * would have RTE_PTYPE_L2_ETHER and not RTE_PTYPE_L2_VLAN because the
 * vlan is stripped from the data.
 */

> Also when i look at options 5 and 7 I don't really see the difference
> in semantics between them ?
> Both seem to store the outer-vlan and inner-vlan in m->vlan_tci_outer
> and m->vlan_tci_inner respectively
> 
> 7/ PKT_RX_VLAN | PKT_RX_QINQ | PKT_RX_QINQ_STRIPPED
>outer-vlan is removed from packet data
>m->vlan_tci_outer=outer-vlan
>m->vlan_tci=inner-vlan

The difference is about what was stripped or not from mbuf data.

> I was hoping that with the new interpretation of PKT_RX_QINQ_STRIPPED,
> it only meant that outer-vlan is removed from packet data
> and m->vlan_tci_outer = outer_vlan, while PKT_RX_QINQ implies it is a
> double-vlan pkt and PKT_RX_VLAN implies that pkt has VLAN
> associated with it?   Not m->vlan_tci = inner-vlan

The meaning of each flag should be as simple as possible, I think we can
summarize them like this:

- PKT_RX_VLAN: the vlan is saved in vlan tci.
- PKT_RX_VLAN_STRIPPED: the vlan hdr is removed from packet data.
- PKT_RX_QINQ: the outer vlan is saved in vlan tci.
- PKT_RX_QINQ_STRIPPED: the outer vlan is stripped from packet data.
- When PKT_RX_QINQ is set, PKT_RX_VLAN* refer to the inner vlan
  of initial packet, else it refers to the first vlan of the packet.

There is a link between vlan flag and vlan_tci field, and qinq flag and
vlan_tci_outer field.

I'm still not sure to understand what you expect. Can you give an
example with flags (which are set), and the expected content of m->vlan_tci
and m->vlan_tci_outer?

By the way, the case 5/ is not very well described either; maybe we should
add something about it.

Thanks,
Olivier


> 
> Thanks
> Som
> 
> 
> > > Only When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 
> > > 2
> > > VLANs have been stripped by the hardware and their TCI are saved in
> > > mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
> > >
> > > Signed-off-by: Somnath Kotur 
> > > ---
> > >  lib/librte_mbuf/rte_mbuf_core.h | 15 +++
> > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h 
> > > b/lib/librte_mbuf/rte_mbuf_core.h
> > > index 9a8557d..db1070b 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -124,12 +124,19 @@
> > >  #define PKT_RX_FDIR_FLX  (1ULL << 14)
> > >
> > >  /**
> > > - * The 2 vlans have been stripped by the hardware and their tci are
> > > - * saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).
> > > + * The outer vlan has been stripped by the hardware and their tci are
> > > + * saved in mbuf->vlan_tci_outer (outer).
> >
> > their tci are -> its tci is
> >
> > >   * This can only happen if vlan stripping is enabled in the RX
> > >   * configuration of the PMD.
> > > - * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN |
> > > - * PKT_RX_VLAN_STRIPPED | PKT_RX_QINQ) must also be set.
> > > + * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN |  
> > > PKT_RX_

[dpdk-dev] [PATCH 0/3] AF_XDP PMD Fixes

2020-02-07 Thread Ciara Loftus
This series introduces some fixes for the zero copy path of the AF_XDP.
In zero copy, the mempool objects are mapped directly into the AF_XDP UMEM.
Below depicts the layout of an object in a mempool.

+-----+--------+------+------+-----+-------------+
| mp  | struct | mbuf | mbuf | XDP |             |
| hdr | rte_   | priv |  hr  | hr  |   payload   |
| obj | mbuf   |      |      |     |             |
+-----+--------+------+------+-----+-------------+
   64     128     *     128    256*

<------------------ frame size ------------------>
<------------ frame hr ------------>

1: net/af_xdp: fix umem frame size & headroom calculations
* The previous frame size calculation incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size. Instead
of performing a manual calculation, use the rte_mempool_calc_obj_size API
to determine the frame size.
* The previous frame headroom calculation also incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size or the
mbuf priv size.

2. net/af_xdp: use correct fill queue addresses
The fill queue addresses should start at the beginning of the mempool
object instead of the beginning of the mbuf. This is because the umem frame
headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP
doesn't write past the available room in the frame, in the case of larger
packets which are close to the size of the mbuf.

3. net/af_xdp: fix maximum MTU value
The maximum MTU for af_xdp zero copy is equal to the page size less the
frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame hr = 
320). The patch updates this value to reflect this, and removes some
unneeded constants for both zero-copy and copy mode.

Ciara Loftus (3):
  net/af_xdp: fix umem frame size & headroom calculations
  net/af_xdp: use correct fill queue addresses
  net/af_xdp: fix maximum MTU value

 drivers/net/af_xdp/rte_eth_af_xdp.c | 60 +
 1 file changed, 35 insertions(+), 25 deletions(-)

-- 
2.17.1



[dpdk-dev] [PATCH 2/3] net/af_xdp: use correct fill queue addresses

2020-02-07 Thread Ciara Loftus
The fill queue addresses should start at the beginning of the mempool
object instead of the beginning of the mbuf. This is because the umem frame
headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP
doesn't write past the available room in the frame, in the case of larger
packets which are close to the size of the mbuf.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 8b189119c..ebc64d4dd 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -172,7 +172,8 @@ reserve_fill_queue_zc(struct xsk_umem_info *umem, uint16_t 
reserve_size,
uint64_t addr;
 
fq_addr = xsk_ring_prod__fill_addr(fq, idx++);
-   addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer;
+   addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
*fq_addr = addr;
}
 
@@ -271,8 +272,10 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
addr = xsk_umem__extract_addr(addr);
 
bufs[i] = (struct rte_mbuf *)
-   xsk_umem__get_data(umem->buffer, addr);
-   bufs[i]->data_off = offset - sizeof(struct rte_mbuf);
+   xsk_umem__get_data(umem->buffer, addr +
+   umem->mb_pool->header_size);
+   bufs[i]->data_off = offset - sizeof(struct rte_mbuf) -
+   umem->mb_pool->header_size;
 
rte_pktmbuf_pkt_len(bufs[i]) = len;
rte_pktmbuf_data_len(bufs[i]) = len;
@@ -385,7 +388,8 @@ pull_umem_cq(struct xsk_umem_info *umem, int size)
 #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
addr = xsk_umem__extract_addr(addr);
rte_pktmbuf_free((struct rte_mbuf *)
-   xsk_umem__get_data(umem->buffer, addr));
+   xsk_umem__get_data(umem->buffer,
+   addr + umem->mb_pool->header_size));
 #else
rte_ring_enqueue(umem->buf_ring, (void *)addr);
 #endif
@@ -443,9 +447,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
}
desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
desc->len = mbuf->pkt_len;
-   addr = (uint64_t)mbuf - (uint64_t)umem->buffer;
+   addr = (uint64_t)mbuf - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
offset = rte_pktmbuf_mtod(mbuf, uint64_t) -
-   (uint64_t)mbuf;
+   (uint64_t)mbuf +
+   umem->mb_pool->header_size;
offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
desc->addr = addr | offset;
count++;
@@ -466,9 +472,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
desc->len = mbuf->pkt_len;
 
-   addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer;
+   addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) -
-   (uint64_t)local_mbuf;
+   (uint64_t)local_mbuf +
+   umem->mb_pool->header_size;
pkt = xsk_umem__get_data(umem->buffer, addr + offset);
offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
desc->addr = addr | offset;
-- 
2.17.1



[dpdk-dev] [PATCH 1/3] net/af_xdp: fix umem frame size & headroom calculations

2020-02-07 Thread Ciara Loftus
The previous frame size calculation incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size.
Instead of performing a manual calculation, use the
rte_mempool_calc_obj_size API to determine the frame size.

The previous frame headroom calculation also incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size or the
mbuf priv size. Fix this.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 683e2a559..8b189119c 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -755,11 +756,13 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals __rte_unused,
void *base_addr = NULL;
struct rte_mempool *mb_pool = rxq->mb_pool;
 
-   usr_config.frame_size = rte_pktmbuf_data_room_size(mb_pool) +
-   ETH_AF_XDP_MBUF_OVERHEAD +
-   mb_pool->private_data_size;
-   usr_config.frame_headroom = ETH_AF_XDP_DATA_HEADROOM +
-   mb_pool->private_data_size;
+   usr_config.frame_size = rte_mempool_calc_obj_size(mb_pool->elt_size,
+   mb_pool->flags,
+   NULL);
+   usr_config.frame_headroom = mb_pool->header_size +
+   sizeof(struct rte_mbuf) +
+   rte_pktmbuf_priv_size(mb_pool) +
+   RTE_PKTMBUF_HEADROOM;
 
umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id());
if (umem == NULL) {
-- 
2.17.1



[dpdk-dev] [PATCH 3/3] net/af_xdp: fix maximum MTU value

2020-02-07 Thread Ciara Loftus
The maximum MTU for af_xdp zero copy is equal to the page size less the
frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame headroom
= 320). The patch updates this value to reflect this.

This change also makes it possible to remove unneeded constants for both
zero-copy and copy mode.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index ebc64d4dd..bf18a571d 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -59,13 +59,6 @@ static int af_xdp_logtype;
 
 #define ETH_AF_XDP_FRAME_SIZE  2048
 #define ETH_AF_XDP_NUM_BUFFERS 4096
-#ifdef XDP_UMEM_UNALIGNED_CHUNK_FLAG
-#define ETH_AF_XDP_MBUF_OVERHEAD   128 /* sizeof(struct rte_mbuf) */
-#define ETH_AF_XDP_DATA_HEADROOM \
-   (ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)
-#else
-#define ETH_AF_XDP_DATA_HEADROOM   0
-#endif
 #define ETH_AF_XDP_DFLT_NUM_DESCS  XSK_RING_CONS__DEFAULT_NUM_DESCS
 #define ETH_AF_XDP_DFLT_START_QUEUE_IDX0
 #define ETH_AF_XDP_DFLT_QUEUE_COUNT1
@@ -601,7 +594,14 @@ eth_dev_info(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->max_tx_queues = internals->queue_cnt;
 
dev_info->min_mtu = RTE_ETHER_MIN_MTU;
-   dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM;
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+   dev_info->max_mtu = getpagesize() -
+   sizeof(struct rte_mempool_objhdr) -
+   sizeof(struct rte_mbuf) -
+   RTE_PKTMBUF_HEADROOM - XDP_PACKET_HEADROOM;
+#else
+   dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE;
+#endif
 
dev_info->default_rxportconf.nb_queues = 1;
dev_info->default_txportconf.nb_queues = 1;
@@ -803,7 +803,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals,
.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
.frame_size = ETH_AF_XDP_FRAME_SIZE,
-   .frame_headroom = ETH_AF_XDP_DATA_HEADROOM };
+   .frame_headroom = 0 };
char ring_name[RTE_RING_NAMESIZE];
char mz_name[RTE_MEMZONE_NAMESIZE];
int ret;
@@ -828,8 +828,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals,
 
for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++)
rte_ring_enqueue(umem->buf_ring,
-(void *)(i * ETH_AF_XDP_FRAME_SIZE +
- ETH_AF_XDP_DATA_HEADROOM));
+(void *)(i * ETH_AF_XDP_FRAME_SIZE));
 
snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u",
   internals->if_name, rxq->xsk_queue_idx);
@@ -938,7 +937,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
/* Now get the space available for data in the mbuf */
buf_size = rte_pktmbuf_data_room_size(mb_pool) -
RTE_PKTMBUF_HEADROOM;
-   data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM;
+   data_size = ETH_AF_XDP_FRAME_SIZE;
 
if (data_size > buf_size) {
AF_XDP_LOG(ERR, "%s: %d bytes will not fit in mbuf (%d 
bytes)\n",
-- 
2.17.1



Re: [dpdk-dev] [RFT] net/netvsc: initialize link state

2020-02-07 Thread Stephen Hemminger
On Fri, 07 Feb 2020 15:22:23 +0200
Mohammed Gamal  wrote:

> On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote:
> > If application is using link state interrupt, the correct link state
> > needs to be filled in when device is started. This is similar to
> > how virtio updates link information.
> > 
> > Reported-by: Mohammed Gamal 
> > Signed-off-by: Stephen Hemminger 
> > ---
> > This version marked RFT because am in airport without access to a
> > machine to test it.
> > 
> >  drivers/net/netvsc/hn_ethdev.c | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/netvsc/hn_ethdev.c
> > b/drivers/net/netvsc/hn_ethdev.c
> > index c79f924379fe..564620748daf 100644
> > --- a/drivers/net/netvsc/hn_ethdev.c
> > +++ b/drivers/net/netvsc/hn_ethdev.c
> > @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev)
> > if (error)
> > hn_rndis_set_rxfilter(hv, 0);
> >  
> > +   /* Initialize Link state */
> > +   if (error == 0)
> > +   hn_dev_link_update(dev, 0);
> > +
> > return error;
> >  }
> >
> 
> I tested this and I always get the link status as UP, regardless of
> whether I start the interface on the guest in UP or DOWN state. Looking
> at hn_dev_link_update() code, I see that the link status depends on the
> NDIS status that the driver gets from the host if my understanding is
> correct.
> The question is whether using 'ip li set dev $IF_NAME down' on the
> guest affects the status the host sees, or whether the host would set
> the state to NDIS_MEDIA_STATE_CONNECTED if the device is physically
> connected, regardless of what the guest tries to do?
> 

Are you confused about admin state vs link state?  Admin state is the
up/down state in software, and link state is the (virtual) hardware link
status.  In traditional Linux, admin state is controlled by ip link
set up/down; in DPDK the admin state is implied by whether the DPDK
device is started or stopped.  The link state for hardware devices is
determined by whether the hardware link has synchronized with the switch.
In virtual environments this is synchronized. In Linux link state
is reported as NO-CARRIER (IFF_RUNNING not set). In DPDK it is reported
via the link info get op.

The device visible to the kernel is the accelerated networking (Mellanox)
device and is not related directly to the netvsc device.

To test link up/down is not easy on Azure. You would have to use Azure CLI
to disconnect the NIC from VM.  On native Hyper-V you can test by
setting up a virtual switch with an external network device; then
unplug the network device.




Re: [dpdk-dev] [RFT] net/netvsc: initialize link state

2020-02-07 Thread Mohammed Gamal
On Fri, 2020-02-07 at 08:12 -0800, Stephen Hemminger wrote:
> On Fri, 07 Feb 2020 15:22:23 +0200
> Mohammed Gamal  wrote:
> 
> > On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote:
> > > If application is using link state interrupt, the correct link
> > > state
> > > needs to be filled in when device is started. This is similar to
> > > how virtio updates link information.
> > > 
> > > Reported-by: Mohammed Gamal 
> > > Signed-off-by: Stephen Hemminger 
> > > ---
> > > This version marked RFT because am in airport without access to a
> > > machine to test it.
> > > 
> > >  drivers/net/netvsc/hn_ethdev.c | 4 
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/net/netvsc/hn_ethdev.c
> > > b/drivers/net/netvsc/hn_ethdev.c
> > > index c79f924379fe..564620748daf 100644
> > > --- a/drivers/net/netvsc/hn_ethdev.c
> > > +++ b/drivers/net/netvsc/hn_ethdev.c
> > > @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev)
> > >   if (error)
> > >   hn_rndis_set_rxfilter(hv, 0);
> > >  
> > > + /* Initialize Link state */
> > > + if (error == 0)
> > > + hn_dev_link_update(dev, 0);
> > > +
> > >   return error;
> > >  }
> > >
> > 
> > I tested this and I always get the link status as UP, regardless of
> > whether I start the interface on the guest in UP or DOWN state.
> > Looking
> > at hn_dev_link_update() code, I see that the link status depends on
> > the
> > NDIS status that the driver gets from the host if my understanding
> > is
> > correct.
> > The question is whether using 'ip li set dev $IF_NAME down' on the
> > guest affects the status the host sees, or whether the host would set
> > the state to NDIS_MEDIA_STATE_CONNECTED if the device is physically
> > connected, regardless of what the guest tries to do?
> > 
> 
> Are you confused about admin state vs link state?  Admin state is the
> up/down state in software, and link state is the (virtual) hardware
> link
> status.  In traditional Linux, admin state is controlled by ip link
> set up/down; in DPDK the admin state is implied by whether the DPDK
> device is started or stopped.  The link state for hardware devices is
> determined by whether the hardware link has synchronized with the
> switch.
> In virtual environments this is synchronized. In Linux link state
> is reported as NO-CARRIER (IFF_RUNNING not set). In DPDK it is reported
> via the link info get op.
> 
> The device visible to the kernel is the accelerated networking
> (Mellanox)
> device and is not related directly to the netvsc device.
> 
> To test link up/down is not easy on Azure. You would have to use
> Azure CLI
> to disconnect the NIC from VM.  On native Hyper-V you can test by
> setting up a virtual switch with an external network device; then
> unplug the network device.
> 
> 

I see. Thanks for the explanation. In this case this does work as
expected.

Tested-by: Mohammed Gamal 



Re: [dpdk-dev] [PATCH v7 9/9] eal: add minimum viable code to support parsing

2020-02-07 Thread Ranjit Menon

On 2/6/2020 1:26 AM, Thomas Monjalon wrote:

As discussed in community meeting, the goal was to have core parsing
in Windows EAL 20.02.
Given that there is a crash and a doubt on the imported getopt library,
I think it's better to postpone this series to 20.05.




Thomas...
We have fixed the crash in the v8 series of this patch - Pallavi sent it 
out for review last night. (It was a simple fix)


Can we please get this merged into 20.02? It would be great to progress 
Windows support in this release. (The initial Windows support was merged 
in 19.02, so it's been a year since we moved it forward! :-))


thanks,

Ranjit


[dpdk-dev] [PATCH v2 0/3] AF_XDP PMD Fixes

2020-02-07 Thread Ciara Loftus
This series introduces some fixes for the zero copy path of the AF_XDP PMD.
In zero copy, the mempool objects are mapped directly into the AF_XDP UMEM.
Below depicts the layout of an object in a mempool.

+-----+--------+------+------+-----+-------------+
| mp  | struct | mbuf | mbuf | XDP |             |
| hdr | rte_   | priv |  hr  | hr  |   payload   |
| obj | mbuf   |      |      |     |             |
+-----+--------+------+------+-----+-------------+
   64    128      *     128    256        *

<----------------- frame size ------------------>
<------- frame headroom ---->

1: net/af_xdp: fix umem frame size & headroom calculations
* The previous frame size calculation incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size. Instead
of performing a manual calculation, use the rte_mempool_calc_obj_size API
to determine the frame size.
* The previous frame headroom calculation also incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size or the
mbuf priv size.

2. net/af_xdp: use correct fill queue addresses
The fill queue addresses should start at the beginning of the mempool
object instead of the beginning of the mbuf. This is because the umem frame
headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP
doesn't write past the available room in the frame, in the case of larger
packets which are close to the size of the mbuf.

3. net/af_xdp: fix maximum MTU value
The maximum MTU for af_xdp zero copy is equal to the page size less the
frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame hr = 
320). The patch updates this value to reflect this, and removes some
unneeded constants for both zero-copy and copy mode.

v2:
* Include mbuf priv size in rx mbuf data_off calculation

Ciara Loftus (3):
  net/af_xdp: fix umem frame size & headroom calculations
  net/af_xdp: use correct fill queue addresses
  net/af_xdp: fix maximum MTU value

 drivers/net/af_xdp/rte_eth_af_xdp.c | 60 +
 1 file changed, 35 insertions(+), 25 deletions(-)

-- 
2.17.1



[dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix umem frame size & headroom calculations

2020-02-07 Thread Ciara Loftus
The previous frame size calculation incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size.
Instead of performing a manual calculation, use the
rte_mempool_calc_obj_size API to determine the frame size.

The previous frame headroom calculation also incorrectly used
mb_pool->private_data_size and didn't include mb_pool->header_size or the
mbuf priv size. Fix this.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 683e2a559..8b189119c 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -755,11 +756,13 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals __rte_unused,
void *base_addr = NULL;
struct rte_mempool *mb_pool = rxq->mb_pool;
 
-   usr_config.frame_size = rte_pktmbuf_data_room_size(mb_pool) +
-   ETH_AF_XDP_MBUF_OVERHEAD +
-   mb_pool->private_data_size;
-   usr_config.frame_headroom = ETH_AF_XDP_DATA_HEADROOM +
-   mb_pool->private_data_size;
+   usr_config.frame_size = rte_mempool_calc_obj_size(mb_pool->elt_size,
+   mb_pool->flags,
+   NULL);
+   usr_config.frame_headroom = mb_pool->header_size +
+   sizeof(struct rte_mbuf) +
+   rte_pktmbuf_priv_size(mb_pool) +
+   RTE_PKTMBUF_HEADROOM;
 
umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id());
if (umem == NULL) {
-- 
2.17.1



[dpdk-dev] [PATCH v2 3/3] net/af_xdp: fix maximum MTU value

2020-02-07 Thread Ciara Loftus
The maximum MTU for af_xdp zero copy is equal to the page size less the
frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame headroom
= 320). The patch updates this value to reflect this.

This change also makes it possible to remove unneeded constants for both
zero-copy and copy mode.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 1e98cd44f..75f037c3e 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -59,13 +59,6 @@ static int af_xdp_logtype;
 
 #define ETH_AF_XDP_FRAME_SIZE  2048
 #define ETH_AF_XDP_NUM_BUFFERS 4096
-#ifdef XDP_UMEM_UNALIGNED_CHUNK_FLAG
-#define ETH_AF_XDP_MBUF_OVERHEAD   128 /* sizeof(struct rte_mbuf) */
-#define ETH_AF_XDP_DATA_HEADROOM \
-   (ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM)
-#else
-#define ETH_AF_XDP_DATA_HEADROOM   0
-#endif
 #define ETH_AF_XDP_DFLT_NUM_DESCS  XSK_RING_CONS__DEFAULT_NUM_DESCS
 #define ETH_AF_XDP_DFLT_START_QUEUE_IDX0
 #define ETH_AF_XDP_DFLT_QUEUE_COUNT1
@@ -602,7 +595,14 @@ eth_dev_info(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->max_tx_queues = internals->queue_cnt;
 
dev_info->min_mtu = RTE_ETHER_MIN_MTU;
-   dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM;
+#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
+   dev_info->max_mtu = getpagesize() -
+   sizeof(struct rte_mempool_objhdr) -
+   sizeof(struct rte_mbuf) -
+   RTE_PKTMBUF_HEADROOM - XDP_PACKET_HEADROOM;
+#else
+   dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE;
+#endif
 
dev_info->default_rxportconf.nb_queues = 1;
dev_info->default_txportconf.nb_queues = 1;
@@ -804,7 +804,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals,
.fill_size = ETH_AF_XDP_DFLT_NUM_DESCS,
.comp_size = ETH_AF_XDP_DFLT_NUM_DESCS,
.frame_size = ETH_AF_XDP_FRAME_SIZE,
-   .frame_headroom = ETH_AF_XDP_DATA_HEADROOM };
+   .frame_headroom = 0 };
char ring_name[RTE_RING_NAMESIZE];
char mz_name[RTE_MEMZONE_NAMESIZE];
int ret;
@@ -829,8 +829,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals,
 
for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++)
rte_ring_enqueue(umem->buf_ring,
-(void *)(i * ETH_AF_XDP_FRAME_SIZE +
- ETH_AF_XDP_DATA_HEADROOM));
+(void *)(i * ETH_AF_XDP_FRAME_SIZE));
 
snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u",
   internals->if_name, rxq->xsk_queue_idx);
@@ -939,7 +938,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev,
/* Now get the space available for data in the mbuf */
buf_size = rte_pktmbuf_data_room_size(mb_pool) -
RTE_PKTMBUF_HEADROOM;
-   data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM;
+   data_size = ETH_AF_XDP_FRAME_SIZE;
 
if (data_size > buf_size) {
AF_XDP_LOG(ERR, "%s: %d bytes will not fit in mbuf (%d 
bytes)\n",
-- 
2.17.1



[dpdk-dev] [PATCH v2 2/3] net/af_xdp: use correct fill queue addresses

2020-02-07 Thread Ciara Loftus
The fill queue addresses should start at the beginning of the mempool
object instead of the beginning of the mbuf. This is because the umem
frame headroom includes the mp hdrobj size. Starting at this point
ensures AF_XDP doesn't write past the available room in the frame, in
the case of larger packets which are close to the size of the mbuf.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: sta...@dpdk.org

Signed-off-by: Ciara Loftus 
---
v2:
* Include mbuf priv size in rx mbuf data_off calculation

 drivers/net/af_xdp/rte_eth_af_xdp.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 8b189119c..1e98cd44f 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -172,7 +172,8 @@ reserve_fill_queue_zc(struct xsk_umem_info *umem, uint16_t 
reserve_size,
uint64_t addr;
 
fq_addr = xsk_ring_prod__fill_addr(fq, idx++);
-   addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer;
+   addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
*fq_addr = addr;
}
 
@@ -271,8 +272,11 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
addr = xsk_umem__extract_addr(addr);
 
bufs[i] = (struct rte_mbuf *)
-   xsk_umem__get_data(umem->buffer, addr);
-   bufs[i]->data_off = offset - sizeof(struct rte_mbuf);
+   xsk_umem__get_data(umem->buffer, addr +
+   umem->mb_pool->header_size);
+   bufs[i]->data_off = offset - sizeof(struct rte_mbuf) -
+   rte_pktmbuf_priv_size(umem->mb_pool) -
+   umem->mb_pool->header_size;
 
rte_pktmbuf_pkt_len(bufs[i]) = len;
rte_pktmbuf_data_len(bufs[i]) = len;
@@ -385,7 +389,8 @@ pull_umem_cq(struct xsk_umem_info *umem, int size)
 #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
addr = xsk_umem__extract_addr(addr);
rte_pktmbuf_free((struct rte_mbuf *)
-   xsk_umem__get_data(umem->buffer, addr));
+   xsk_umem__get_data(umem->buffer,
+   addr + umem->mb_pool->header_size));
 #else
rte_ring_enqueue(umem->buf_ring, (void *)addr);
 #endif
@@ -443,9 +448,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
}
desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
desc->len = mbuf->pkt_len;
-   addr = (uint64_t)mbuf - (uint64_t)umem->buffer;
+   addr = (uint64_t)mbuf - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
offset = rte_pktmbuf_mtod(mbuf, uint64_t) -
-   (uint64_t)mbuf;
+   (uint64_t)mbuf +
+   umem->mb_pool->header_size;
offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
desc->addr = addr | offset;
count++;
@@ -466,9 +473,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx);
desc->len = mbuf->pkt_len;
 
-   addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer;
+   addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer -
+   umem->mb_pool->header_size;
offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) -
-   (uint64_t)local_mbuf;
+   (uint64_t)local_mbuf +
+   umem->mb_pool->header_size;
pkt = xsk_umem__get_data(umem->buffer, addr + offset);
offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT;
desc->addr = addr | offset;
-- 
2.17.1



Re: [dpdk-dev] [PATCH] test/crypto: add cpu crypto mode tests

2020-02-07 Thread Ananyev, Konstantin


> 
> This patch adds ability to run unit tests in cpu crypto mode for AESNI
> GCM cryptodev.
> 
> Signed-off-by: Marcin Smoczynski 
> ---

Tested-by: Konstantin Ananyev 
Acked-by: Konstantin Ananyev 

> 2.17.1



Re: [dpdk-dev] [PATCH v7 9/9] eal: add minimum viable code to support parsing

2020-02-07 Thread Ranjit Menon

On 2/7/2020 8:46 AM, Ranjit Menon wrote:

On 2/6/2020 1:26 AM, Thomas Monjalon wrote:
> As discussed in community meeting, the goal was to have core parsing
> in Windows EAL 20.02.
> Given that there is a crash and a doubt on the imported getopt library,
> I think it's better to postpone this series to 20.05.
>
>

Thomas...
We have fixed the crash in the v8 series of this patch - Pallavi sent it
out for review last night. (It was a simple fix)

Can we please get this merged into 20.02? It would be great to progress
Windows support in this release. (The initial Windows support was merged
in 19.02, so it's been a year since we moved it forward! :-))

thanks,

Ranjit

Sorry, I believe the initial patches were merged in 19.05, still a long time!

@Dmitry: If you could do some rudimentary tests and ACK the v8 series of this 
patch, it would be very helpful.
thanks,

Ranjit


Re: [dpdk-dev] [PATCH v2] devtools: add new SPDX license compliance checker

2020-02-07 Thread Stephen Hemminger
On Wed, 29 Jan 2020 07:59:07 -0800
Stephen Hemminger  wrote:

> Simple script to look for drivers and scripts that
> are missing the required SPDX header.
> 
> Signed-off-by: Stephen Hemminger 

Ping. This should have been merged by now.


Re: [dpdk-dev] [PATCH v2 0/7] MinGW-w64 support

2020-02-07 Thread William Tu
Hi Dmitry,

Thanks for your reply.

On Fri, Feb 7, 2020 at 2:24 AM Dmitry Kozliuk  wrote:
>
> Hi William,
>
>> I applied your v2 patch and I did a native build on windows 10.
>> Hit an error showing
>> ../lib/librte_eal/windows/eal/eal_lcore.c:54:2: error: 'for' loop
>> initial declarations are only allowed in C99 mode
>
>
> Thanks, will fix in v3.
>
>> However the output looks weird:
>> C:\dpdk\build\examples>dpdk-helloworld.exe
>> EAL: Detected 2 lcore(s)
>> EAL: Detected 1 NUMA nodes
>> hehello fllo frorom cm core 1
>> ore 0
>
>
> It looks like your stdout is unbuffered (default is line-buffered). What 
> terminal are you using (cmd, Power Shell, Terminal App, conemu, etc)?
>
I'm using the "Command Prompt" from windows 10

>> C compiler for the host machine: cc (gcc 4.8.3 "cc (GCC) 4.8.3")
>
>
> GCC 4.8.3 is quite outdated, MinGW-w64 ships GCC 8 nowadays. Do we need to 
> support it for Windows (I doubt MinGW-w64 does)?
>
Oh, I downloaded a pretty old version (mingw-w64 3.3.0).
Let me update.

BTW, I also tested cross-compile using my Ubuntu Box and everything works!
William


Re: [dpdk-dev] SPDX license nag

2020-02-07 Thread Stephen Hemminger
On Mon, 27 Jan 2020 11:15:44 +
"Mcnamara, John"  wrote:

> > -Original Message-
> > From: dev  On Behalf Of Stephen Hemminger
> > Sent: Wednesday, January 22, 2020 4:19 PM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] SPDX license nag
> > 
> > Files without SPDX License
> > --
> > app/test-pmd/flowgen.c   - Tilera/Mellanox
> > app/test-pmd/macswap.c   - Tilera/Mellanox
> > app/test/test_compressdev_test_buffer.h  - Intel
> > app/test/test_timer_racecond.c   - Akamai
> > devtools/cocci.sh- EZchip/Mellanox
> > devtools/load-devel-config   - Canonical
> > examples/ipsec-secgw/test/trs_aesgcm_inline_crypto_fallback_defs.sh - Intel
> > examples/ipsec-secgw/test/tun_aesgcm_inline_crypto_fallback_defs.sh - Intel
> > examples/performance-thread/l3fwd-thread/test.sh - Intel
> > lib/librte_ethdev/rte_ethdev_pci.h   - Brocade/Jan 
> > Blunck/Broadcom
> > lib/librte_ethdev/rte_ethdev_vdev.h  - Brocade/Jan 
> > Blunck/Broadcom
> > 
> > Some patches for these are outstanding but haven't been merged  
> 
> 
> I've looked at the headers/logs for these files and added the owners (or 
> potential owners after mergers) above.
> 
> Can everyone on the CC list please take care of these for your company/files. 
> It is a simple patch to fix and Stephen has already provided some.
> 
> Thanks Stephen for keeping the focus on this.
> 
> John
>  

20.02-rc2 SPDX license nag.


Files without SPDX License
--
app/test-pmd/flowgen.c
app/test-pmd/macswap.c
app/test/test_compressdev_test_buffer.h
app/test/test_timer_racecond.c
devtools/cocci.sh
devtools/load-devel-config
examples/ipsec-secgw/test/trs_aesgcm_inline_crypto_fallback_defs.sh
examples/ipsec-secgw/test/tun_aesgcm_inline_crypto_fallback_defs.sh
examples/performance-thread/l3fwd-thread/test.sh
lib/librte_ethdev/rte_ethdev_pci.h
lib/librte_ethdev/rte_ethdev_vdev.h

Files with additional license text
--
app/test-pmd/flowgen.c
app/test-pmd/macswap.c
app/test/test_timer_racecond.c
devtools/cocci.sh
lib/librte_ethdev/rte_ethdev_pci.h
lib/librte_ethdev/rte_ethdev_vdev.h


[dpdk-dev] [PATCH v3] devtools: add new SPDX license compliance checker

2020-02-07 Thread Stephen Hemminger
Simple script to look for drivers and scripts that
are missing the required SPDX header.

Signed-off-by: Stephen Hemminger 
---
v3 - pick up more places with boilerplate text
 avoid false positives for cocci scripts or abignore

 devtools/spdx-check.sh | 24 
 1 file changed, 24 insertions(+)
 create mode 100755 devtools/spdx-check.sh

diff --git a/devtools/spdx-check.sh b/devtools/spdx-check.sh
new file mode 100755
index ..a5be10b26f44
--- /dev/null
+++ b/devtools/spdx-check.sh
@@ -0,0 +1,24 @@
+#! /bin/sh
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2019 Microsoft Corporation
+#
+# Produce a list of files with incorrect license
+# information
+
+echo "Files without SPDX License"
+echo "--"
+
+git grep -L SPDX-License-Identifier -- \
+':^.git*' ':^.ci/*' ':^.travis.yml' \
+':^README' ':^MAINTAINERS' ':^VERSION' ':^ABI_VERSION' \
+':^*/Kbuild' ':^*/README' \
+':^license/' ':^doc/' ':^config/' ':^buildtools/' \
+':^*.cocci' ':^*.abignore' \
+':^*.def' ':^*.map' ':^*.ini' ':^*.data' ':^*.cfg' ':^*.txt'
+
+echo
+echo "Files with additional license text"
+echo "--"
+
+git grep -l Redistribution -- \
+':^license/' ':^/devtools/spdx-check.sh'
-- 
2.20.1



[dpdk-dev] [PATCH v2] net/netvsc: initialize link state

2020-02-07 Thread Stephen Hemminger
If application is using link state interrupt, the correct link state
needs to be filled in when device is started. This is similar to
how virtio updates link information.

Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: sta...@dpdk.org
Reported-by: Mohammed Gamal 
Tested-by: Mohammed Gamal 
Signed-off-by: Stephen Hemminger 
---
Putting on same email thread as original submission
v2 - new patch that does initialization at start
 added tested-by and fixes tag

 drivers/net/netvsc/hn_ethdev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
index c79f924379fe..564620748daf 100644
--- a/drivers/net/netvsc/hn_ethdev.c
+++ b/drivers/net/netvsc/hn_ethdev.c
@@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev)
if (error)
hn_rndis_set_rxfilter(hv, 0);
 
+   /* Initialize Link state */
+   if (error == 0)
+   hn_dev_link_update(dev, 0);
+
return error;
 }
 
-- 
2.20.1



Re: [dpdk-dev] [PATCH 00/14] cleanup resources on shutdown

2020-02-07 Thread Stephen Hemminger
On Thu, 6 Feb 2020 15:06:56 +0100
David Marchand  wrote:

> On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger
>  wrote:
> >
> > Recently started using valgrind with DPDK, and the results
> > are not clean.
> >
> > The DPDK has a function that applications can use to tell it
> > to cleanup resources on shutdown (rte_eal_cleanup). But the
> > current coverage of that API is spotty. Many internal parts of
> > DPDK leave files and allocated memory behind.
> >
> > This patch set is a start at getting the sub-parts of
> > DPDK to cleanup after themselves. These are the easier ones,
> > the harder and more critical ones are in the drivers
> > and the memory subsystem.  
> 
> I am too short on time to check and integrate this big series in rc2,
> and from now it would be too risky to take in 20.02.
> Can you respin it for 20.05 with FreeBSD fixes too?

OK, but if this kind of patch can't be reviewed then
the DPDK process is still broken.

I don't see how FreeBSD matters here. It can be leaky but that
is ok.  I split it out to get review, then you complain it
is too big :-(



Re: [dpdk-dev] [PATCH v8 0/9] Windows patchset with additional EAL functionalities

2020-02-07 Thread Dmitry Kozlyuk
> This patchset includes additional functionalities for Windows EAL
> to support command-line parsing feature and some EAL common code
> on Windows.
> 
> This patchset can be applied to windpdk-next-dev branch in the draft repo.
> 
> v8 changes:
>   Fixed the naming conventions.
>   Fixed a crash encountered due to getopt function.
>   Removed "--syslog" from help options for Windows.
> 

Observing no issues on Windows 10, Meson 0.53.999, Clang 9.0.1.

Reviewed-by: Dmitry Kozlyuk 

-- 
Dmitry Kozlyuk


Re: [dpdk-dev] [PATCH 4/6] build: MinGW-w64 support for Meson

2020-02-07 Thread Dmitry Kozlyuk



> On Wed, Feb 05, 2020 at 11:41:05PM +0300, Dmitry Kozlyuk wrote:
> > > > > > > +if is_windows
> > > > > > > + # Require platform SDK for Windows 7 and above.
> > > > > > > + add_project_arguments('-D_WIN32_WINNT=0x0601', language: 'c')   
> > > > > > >
> > > > > > 
> > > > > > Please explain. Why Windows 7 is needed? What this define is doing? 
> > > > > >
> > > > > 
> > > > > Yes, Windows 7 and above is need for already existing code in 
> > > > > eal_lcore.c,
> > > > > specifically for GetLogicalProcessorInformation() call.
> > > > > 
> > > > > When including , one must define minimum API version the
> > > > > application is compiled against [0]. MSVC and Clang default to the 
> > > > > version of
> > > > > platform SDK (that is, maximum supported). MinGW defaults to Windows 
> > > > > XP, so
> > > > > this definition must be either in  before #include 
> > > > >  or
> > > > > here. Because other files may include , I'd prefer to have 
> > > > > a
> > > > > global definition via compiler command-line.
> > > > > 
> > > > > [0]:
> > > > > https://docs.microsoft.com/en-us/windows/win32/WinProg/using-the-windows-headers
> > > > > 
> > > > 
> > > > OK, thanks.
> > > > Please reword the comment with something like
> > > > "Minimum supported API is Windows 7."
> > > > 
> > > For this, as an alternative to putting it as a project argument, you can 
> > > just
> > > add it to dpdk_conf which means it will end up as a define in the global
> > > rte_build_config.h and so be directly included in each compilation unit
> > > ahead of any other headers. (rte_config.h includes rte_build_config.h)  
> > 
> > Can you please explain why using dpdk_conf is a better alternative? In
> > lib/meson.build I can see add_project_arguments('-D_GNU_SOURCE', ...), which
> > serves a similar purpose on POSIX systems. Compiler option also makes it
> > impossible to forget or redefine this constant in code by mistake.
> >   
> I'm not necessarily saying it's better, it's just an alternative to
> consider. :-) Having it in rte_config.h makes the define available to any
> external apps using DPDK, which may or may not be desirable.

It's certainly *not* desirable: apps usually have this value
consciously and explicitly defined on their own, and it may differ from the
one required to compile DPDK. Thanks for the tip, but as I hope you can see,
it should stay a project argument.
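For illustration, the two alternatives discussed in this thread would look
roughly like the sketch below (meson syntax, not the final patch; the
commented-out dpdk_conf line is the rejected variant that would leak the
define into rte_build_config.h and on to external apps):

```meson
# Option kept: a project-wide compiler argument, invisible to apps.
# Minimum supported API is Windows 7.
add_project_arguments('-D_WIN32_WINNT=0x0601', language: 'c')

# Rejected alternative: a dpdk_conf entry would end up in
# rte_build_config.h and be inherited by every app using DPDK.
# dpdk_conf.set('_WIN32_WINNT', 0x0601)
```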

-- 
Dmitry Kozlyuk


Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

2020-02-07 Thread Stephen Hemminger
On Fri, 7 Feb 2020 19:48:17 +0530
Jerin Jacob  wrote:

> On Fri, Feb 7, 2020 at 6:08 PM Coyle, David  wrote:
> >
> > Hi Jerin, see below  
> 
> Hi David,
> 
> > >
> > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David 
> > > wrote:
> > >  
> 
> > >
> > > There is a risk in drafting API that meant for HW without any HW exists.
> > > Because there could be inefficiency on the metadata and fast path API for
> > > both models.
> > > For example, In the case of CPU based scheme, it will be pure overhead
> > > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction
> > > where CPU works better in the synchronous model and I have doubt that the
> > > session-based scheme will work for HW or not as both difference  HW needs
> > > to work hand in hand(IOMMU aspects for two PCI device)  
> >
> > [DC] I understand what you are saying about the overhead of emulating the 
> > "sw queue" but this same model is already used in many of the existing 
> > device PMDs.
> > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for 
> > compression, the enqueue/dequeue in the PMD is emulated through an rte_ring 
> > which is very efficient.
> > The accelerator API will use the existing device PMDs so keeping the same 
> > model seems like a sensible approach.  
> 
> In this release, we added CPU crypto support in cryptodev to support
> the synchronous model to fix the overhead.
> 
> >
> > From an application's point of view, this abstraction of the underlying 
> > device type is important for usability and maintainability -  the 
> > application doesn't need to know
> > the device type as such and therefore doesn't need to make different API 
> > calls.
> >
> > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW 
> > doesn't support these xform chains at the moment, it could potentially do 
> > so in the future.
> > As a side note, as part of the work of adding the accelerator API, the QAT 
> > PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform 
> > chain, where the Crypto
> > is done on QAT HW and the CRC will be done in SW, most likely through a 
> > call to the optimized rte_net_crc library. This will give a consistent API 
> > for the DOCSIS-MAC data-plane
> > pipeline prototype we have developed, which uses both AESNI-MB and QAT for 
> > benchmarks.
> >
> > We will take your feedback on the enqueue/dequeue approach for SW devices 
> > into consideration though during development.
> >
> > Finally, I'm unsure what you mean by this line:
> >
> > "I have doubt that the session-based scheme will work for HW or not 
> > as both different HW needs to work hand in hand(IOMMU aspects for two PCI 
> > device)"
> >
> > What do you mean by different HW working "hand in hand" and "two PCI device"?
> > The intention is that 1 HW device (or it's PMD) would have to support the 
> > accel xform chain  
> 
> I was thinking, it will be N PCIe devices that create the chain. Each
> distinct PCI device does the fixed-function and chains them together.
> 
> I do understand the usage of QAT HW and CRC in SW.
> So If I understand it correctly, in rte_security, we are combining
> rte_ethdev and rte_cryptodev. With this spec, we are trying to
> combine,
> rte_cryptodev and rte_compressdev. So it looks good to me. My only
> remaining concern is the name of this API, accelerator too generic
> name. IMO, like rte_security, we may need to give more meaningful name
> for the use case where crytodev and compressdev can work together.

Having an API that could be used by parallel hardware does make sense,
but the DPDK already has multiple packet processing infrastructure pieces.

I would rather the DPDK converge on one widely used, robust and tested packet
method. Rather than the current "choose your poison or roll your own" which is
what we have now. The proposed graph seems to be the best so far.


[dpdk-dev] BUG BPF examples broken

2020-02-07 Thread Stephen Hemminger
The BPF examples do not compile on Debian.

The issue is that rte_mbuf_core.h includes rte_atomic.h, which includes
rte_common.h, which pulls in errno.h and bits/errno.h, and then asm/errno.h.

/usr/include/linux/errno.h:1:10: fatal error: 'asm/errno.h' file not found
#include <asm/errno.h>
 ^
1 error generated.

asm/errno.h is architecture specific and BPF is an architecture.


[dpdk-dev] [PATCH] event/dpaa2: set number of order sequences

2020-02-07 Thread Nipun Gupta
This patch sets the number of atomic ordered sequences
supported by the driver.

Fixes: dbf63bd43afa ("event/dpaa2: support ordered queue")
Cc: sta...@dpdk.org

Signed-off-by: Nipun Gupta 
---
 drivers/event/dpaa2/dpaa2_eventdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/event/dpaa2/dpaa2_eventdev.c b/drivers/event/dpaa2/dpaa2_eventdev.c
index d71361666..1833d659d 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev.c
@@ -479,6 +479,8 @@ dpaa2_eventdev_queue_def_conf(struct rte_eventdev *dev, uint8_t queue_id,
 	RTE_SET_USED(queue_id);
 
 	queue_conf->nb_atomic_flows = DPAA2_EVENT_QUEUE_ATOMIC_FLOWS;
+	queue_conf->nb_atomic_order_sequences =
+			DPAA2_EVENT_QUEUE_ORDER_SEQUENCES;
 	queue_conf->schedule_type = RTE_SCHED_TYPE_PARALLEL;
 	queue_conf->priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
 }
-- 
2.17.1



Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

2020-02-07 Thread Jerin Jacob
On Sat, Feb 8, 2020 at 2:04 AM Stephen Hemminger
 wrote:
>
> On Fri, 7 Feb 2020 19:48:17 +0530
> Jerin Jacob  wrote:
>
> > On Fri, Feb 7, 2020 at 6:08 PM Coyle, David  wrote:
> > >
> > > Hi Jerin, see below
> >
> > Hi David,
> >
> > > >
> > > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David 
> > > > wrote:
> > > >
> >
> > > >
> > > > There is a risk in drafting API that meant for HW without any HW exists.
> > > > Because there could be inefficiency on the metadata and fast path API 
> > > > for
> > > > both models.
> > > > For example, In the case of CPU based scheme, it will be pure overhead
> > > > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction
> > > > where CPU works better in the synchronous model and I have doubt that 
> > > > the
> > > > session-based scheme will work for HW or not as both different HW 
> > > > needs
> > > > to work hand in hand(IOMMU aspects for two PCI device)
> > >
> > > [DC] I understand what you are saying about the overhead of emulating the 
> > > "sw queue" but this same model is already used in many of the existing 
> > > device PMDs.
> > > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib 
> > > for compression, the enqueue/dequeue in the PMD is emulated through an 
> > > rte_ring which is very efficient.
> > > The accelerator API will use the existing device PMDs so keeping the same 
> > > model seems like a sensible approach.
> >
> > In this release, we added CPU crypto support in cryptodev to support
> > the synchronous model to fix the overhead.
> >
> > >
> > > From an application's point of view, this abstraction of the underlying 
> > > device type is important for usability and maintainability -  the 
> > > application doesn't need to know
> > > the device type as such and therefore doesn't need to make different API 
> > > calls.
> > >
> > > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW 
> > > doesn't support these xform chains at the moment, it could potentially do 
> > > so in the future.
> > > As a side note, as part of the work of adding the accelerator API, the 
> > > QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator 
> > > xform chain, where the Crypto
> > > is done on QAT HW and the CRC will be done in SW, most likely through a 
> > > call to the optimized rte_net_crc library. This will give a consistent 
> > > API for the DOCSIS-MAC data-plane
> > > pipeline prototype we have developed, which uses both AESNI-MB and QAT 
> > > for benchmarks.
> > >
> > > We will take your feedback on the enqueue/dequeue approach for SW devices 
> > > into consideration though during development.
> > >
> > > Finally, I'm unsure what you mean by this line:
> > >
> > > "I have doubt that the session-based scheme will work for HW or 
> > > not as both different HW needs to work hand in hand(IOMMU aspects for 
> > > two PCI device)"
> > >
> > > What do you mean by different HW working "hand in hand" and "two PCI device"?
> > > The intention is that 1 HW device (or it's PMD) would have to support the 
> > > accel xform chain
> >
> > I was thinking, it will be N PCIe devices that create the chain. Each
> > distinct PCI device does the fixed-function and chains them together.
> >
> > I do understand the usage of QAT HW and CRC in SW.
> > So If I understand it correctly, in rte_security, we are combining
> > rte_ethdev and rte_cryptodev. With this spec, we are trying to
> > combine,
> > rte_cryptodev and rte_compressdev. So it looks good to me. My only
> > remaining concern is the name of this API, accelerator too generic
> > name. IMO, like rte_security, we may need to give more meaningful name
> > for the use case where crytodev and compressdev can work together.
>
> Having an API that could be used by parallel hardware does make sense,
> but the DPDK already has multiple packet processing infrastructure pieces.
>
> I would rather the DPDK converge on one widely used, robust and tested packet
> method. Rather than the current "choose your poison or roll your own" which is
> what we have now. The proposed graph seems to be the best so far.

I agree. I even thought of suggesting that graph can do this, as it has
higher abstraction and runtime chaining support, but then I thought it
would be self-marketing.
David, could you check https://www.mail-archive.com/dev@dpdk.org/msg156318.html?
This one focuses only on cryptodev + compressdev; what if we have
ethdev + compressdev + a security device in the future?
Graph has a higher abstraction, so it can accommodate any chaining
requirement, i.e. AESNI-MB + QAT would go as separate nodes.