Re: [dpdk-dev] [PATCH] test/crypto: replace wireless algos test vectors
> -Original Message- > From: Dybkowski, AdamX > Sent: Thursday, February 6, 2020 11:25 AM > To: dev@dpdk.org; Trahe, Fiona > Cc: Dybkowski, AdamX > Subject: [PATCH] test/crypto: replace wireless algos test vectors > > This patch replaces all KASUMI, SNOW3G, ZUC and all mixed > crypto unit test vectors with randomly generated input arrays. > All outputs were calculated and verified on both QAT PMD > and appropriate software-only PMDs wherever possible. > > Signed-off-by: Adam Dybkowski Acked-by: Fiona Trahe
Re: [dpdk-dev] [RFC PATCH 1/7] vfio: Include optional device match in vfio_device_ops callbacks
On Thu, 6 Feb 2020 11:18:42 -0700 Alex Williamson wrote: > On Thu, 6 Feb 2020 12:14:19 +0100 > Cornelia Huck wrote: > > > On Tue, 04 Feb 2020 16:05:43 -0700 > > Alex Williamson wrote: > > > > > Allow bus drivers to provide their own callback to match a device to > > > the user provided string. > > > > > > Signed-off-by: Alex Williamson > > > --- > > > drivers/vfio/vfio.c | 19 +++ > > > include/linux/vfio.h |3 +++ > I think with your first option we arrive at something like this: > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index dda1726adda8..b5609a411139 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -883,14 +883,15 @@ static struct vfio_device > *vfio_device_get_from_name(struct vfio_group *group, > > if (it->ops->match) { > ret = it->ops->match(it->device_data, buf); > - if (ret < 0 && ret != -ENODEV) { > + if (ret < 0) { > device = ERR_PTR(ret); > break; > } > - } else > - ret = strcmp(dev_name(it->dev), buf); > + } else { > + ret = !strcmp(dev_name(it->dev), buf); > + } > > - if (!ret) { > + if (ret) { > device = it; > vfio_device_get(device); > break; > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index 755e0f0e2900..029694b977f2 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -26,8 +26,9 @@ > * operations documented below > * @mmap: Perform mmap(2) on a region of the device file descriptor > * @request: Request for the bus driver to release the device > - * @match: Optional device name match callback (return: 0 for match, -ENODEV > - * (or >0) for no match and continue, other -errno: no match and > stop) > + * @match: Optional device name match callback (return: 0 for no-match, >0 > for > + * match, -errno for abort (ex. match with insufficient or incorrect > + * additional args) > */ > struct vfio_device_ops { > char*name; > > > I like that. Thanks, > > Alex Looks good to me.
Re: [dpdk-dev] [PATCH v4 00/36] update for i40e base code
On 04/02/2020 03:23, Ye Xiaolong wrote: > Hi, Kevin > > On 02/03, Kevin Traynor wrote: >> On 13/01/2020 05:59, Ye, Xiaolong wrote: > > [snip] > -Original Message- >> Hi Xiaolong/Beilei/Qi Z, >> >> Some of these patches are tagged to go to 'S'table and are 'F'ixes, >> while others are also fixes for older releases but not tagged to go to >> stable. >> >> No issue with not backporting patches (in fact i would prefer not if >> some doubts), but want to check if the stable tag was missed by accident? >> >> Also, note that Fw listed is v6.01 for 18.11, (v7.00 for 19.11) so >> changes would need to be compatible with that. > > Yes, some of the stable tags were missed deliberately since the new base code > update is for FW 7.1/7.2, it's not guaranteed to be compatible with old FW. > >> >> 20.02 8f33cbcfa S F net/i40e/base: fix buffer address (16.04) >> 20.02 4b3da9415 S F net/i40e/base: fix error message (1.7.0) >> 20.02 1da546c39 - F net/i40e/base: fix missing link modes (17.08) >> 20.02 79bfe7808 S F net/i40e/base: fix Tx descriptors number (1.7.0) >> 20.02 50126939c - F net/i40e/base: fix retrying logic (18.02) >> 20.02 b330a1c5c - F net/i40e/base: fix display of FEC settings (17.08) >> 20.02 03ef7d47f - F net/i40e/base: add new link speed constants (17.08) >> > > I would suggest we only backport below obvious fixes to stable releases. > > 20.02 8f33cbcfa S F net/i40e/base: fix buffer address (16.04) > 20.02 4b3da9415 S F net/i40e/base: fix error message (1.7.0) > 20.02 79bfe7808 S F net/i40e/base: fix Tx descriptors number (1.7.0) > 20.02 50126939c - F net/i40e/base: fix retrying logic (18.02) > Thanks for confirming Xialong, I will backport the ones you suggest. Kevin. > Thanks, > Xiaolong > >> Please confirm on patches above missing stable tag. For time being I >> will only consider ones with stable tag. >> >> thanks, >> Kevin. >> >
[dpdk-dev] [PATCH] doc: update thunderx guide
Updating the thunderx with loopback port support, debugging options, new features added and correcting some formating issues. Signed-off-by: Harman Kalra --- doc/guides/nics/thunderx.rst | 153 --- 1 file changed, 88 insertions(+), 65 deletions(-) diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst index 89d439def..f42133e54 100644 --- a/doc/guides/nics/thunderx.rst +++ b/doc/guides/nics/thunderx.rst @@ -25,6 +25,7 @@ Features of the ThunderX PMD are: - Port hardware statistics - Jumbo frames - Link state information +- Setting up link state. - Scattered and gather for TX and RX - VLAN stripping - SR-IOV VF @@ -242,6 +243,12 @@ driver' list, secondary VFs are on the remaining on the remaining part of the li Depending on the hardware used, the kernel driver sets a threshold ``vf_id``. VFs that try to attached with an id below or equal to this boundary are considered primary VFs. VFs that try to attach with an id above this boundary are considered secondary VFs. +LBK HW Access +~ + +Loopback HW Unit (LBK) receives packets from NIC-RX and sends packets back to NIC-TX. +The loopback block has N channels and contains data buffering that is shared across +all channels. Four primary VFs are reserved as loopback ports. Example device binding ~~ @@ -263,36 +270,40 @@ on a non-NUMA machine. Network devices using kernel driver === - :01:10.0 'Device a026' if= drv=thunder-BGX unused=vfio-pci,uio_pci_generic - :01:10.1 'Device a026' if= drv=thunder-BGX unused=vfio-pci,uio_pci_generic - 0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci,uio_pci_generic - 0002:01:00.1 'Device 0011' if=eth0 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.2 'Device 0011' if=eth1 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.3 'Device 0011' if=eth2 drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.4 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.5 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.6 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:00.7 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.0 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.1 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.2 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.3 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.4 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.5 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.6 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:01.7 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:02.0 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:02.1 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic - 0002:01:02.2 'Device 0011' if= drv=thunder-nicvf unused=vfio-pci,uio_pci_generic + :01:10.0 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci + :01:10.1 'THUNDERX BGX (Common Ethernet Interface) a026' if= drv=thunder-BGX unused=vfio-pci + 0001:01:00.0 'THUNDERX Network Interface Controller a01e' if= drv=thunder-nic unused=vfio-pci + 0001:01:00.1 'Device a034' if=eth0 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.2 'Device a034' if=eth1 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.3 'Device a034' if=eth2 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.4 'Device a034' if=eth3 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.5 'Device a034' if=eth4 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.6 'Device a034' if=lbk0 drv=thunder-nicvf unused=vfio-pci + 0001:01:00.7 'Device a034' if=lbk1 drv=thunder-nicvf unused=vfio-pci + 0001:01:01.0 'Device a034' if=lbk2 drv=thunder-nicvf unused=vfio-pci + 0001:01:01.1 'Device a034' if=lbk3 drv=thunder-nicvf unused=vfio-pci + 0001:01:01.2 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:01.3 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:01.4 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:01.5 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:01.6 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:01.7 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:02.0 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:02.1 'Device a034' if= drv=thunder-nicvf unused=vfio-pci + 0001:01:02.2 'Device a034' if= drv=
[dpdk-dev] [Bug 393] rte_zmalloc_socket does not zero memory
https://bugs.dpdk.org/show_bug.cgi?id=393 Bug ID: 393 Summary: rte_zmalloc_socket does not zero memory Product: DPDK Version: 19.11 Hardware: x86 OS: Linux Status: UNCONFIRMED Severity: critical Priority: Normal Component: core Assignee: dev@dpdk.org Reporter: pavel.kr...@anritsu.com Target Milestone: --- Hello, It seems that rte_zmalloc_socket does not return zeroed memory on some cases. It seems to work OK on all but HP G10 server where I was testing application that tried to create rte fragment table and do IP defragementation where it crashed on some of the structure members. I put check into rte_zmalloc_socket to see if the memory was really zeroed and it was not! Also if I zero the memory then the defragmentation works correctly and does not crash. I see from google that there were problems like this in the past with non zeroed rte_zmalloc_socket. I have tested also the 19.02 version and it has same problem. this is on Centos 7.3 with linux kernel 3.10.0-514.el7.x86_64. Don't know if the linux itself can return non-zeroed memory in some case to dpdk lib. -- You are receiving this mail because: You are the assignee for the bug.
[dpdk-dev] [Bug 392] l3fwd fails to run with eventdev
https://bugs.dpdk.org/show_bug.cgi?id=392 Vipin Varghese (vipin.vargh...@intel.com) changed: What|Removed |Added Resolution|--- |FIXED Status|IN_PROGRESS |RESOLVED --- Comment #2 from Vipin Varghese (vipin.vargh...@intel.com) --- Thanks pavan for the patch. I am ok with this correction. But will wait for Jerin to confirm. Note: we can close the Bugzilla -- You are receiving this mail because: You are the assignee for the bug.
Re: [dpdk-dev] [PATCH v2 0/7] MinGW-w64 support
Hi William, I applied your v2 patch and I did a native build on windows 10. > Hit an error showing > ../lib/librte_eal/windows/eal/eal_lcore.c:54:2: error: 'for' loop > initial declarations are only allowed in C99 mode > Thanks, will fix in v3. However the output looks weird: > C:\dpdk\build\examples>dpdk-helloworld.exe > EAL: Detected 2 lcore(s) > EAL: Detected 1 NUMA nodes > hehello fllo frorom cm core 1 > ore 0 > It looks like your stdout is unbuffered (default is line-buffered). What terminal are you using (cmd, Power Shell, Terminal App, conemu, etc)? C compiler for the host machine: cc (gcc 4.8.3 "cc (GCC) 4.8.3") > GCC 4.8.3 is quite outdated, MinGW-w64 ships GCC 8 nowadays. Do we need to support it for Windows (I doubt MinGW-w64 does)? -- Dmitry Kozlyuk >
[dpdk-dev] [PATCH v2] examples/ioat: fix invalid link status check
The return value of the get link function call was not checked, and could return a negative value indicating a failure. This meant the link_status of the link being checked is invalid, because the link was not filled with data. The return value is now checked, and if the return value is not 0 for success, the loop continues with the next port. To avoid confusion between variable names, the existing retval variable is renamed to link_status, to better represent its use. Coverity issue: 350348 Fixes: c8e6ceecebc1 ("examples/ioat: add new sample app for ioat driver") Cc: pawelx.mod...@intel.com Cc: sta...@dpdk.org Signed-off-by: Ciara Power --- v2: - Changed commit log subject and description - Rebased onto master --- examples/ioat/ioatfwd.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c index e9117718f..99ecc0fde 100644 --- a/examples/ioat/ioatfwd.c +++ b/examples/ioat/ioatfwd.c @@ -697,7 +697,7 @@ check_link_status(uint32_t port_mask) { uint16_t portid; struct rte_eth_link link; - int retval = 0; + int ret, link_status = 0; printf("\nChecking link status\n"); RTE_ETH_FOREACH_DEV(portid) { @@ -705,7 +705,12 @@ check_link_status(uint32_t port_mask) continue; memset(&link, 0, sizeof(link)); - rte_eth_link_get(portid, &link); + ret = rte_eth_link_get(portid, &link); + if (ret < 0) { + printf("Port %u link get failed: err=%d\n", + portid, ret); + continue; + } /* Print link status */ if (link.link_status) { @@ -714,11 +719,11 @@ check_link_status(uint32_t port_mask) portid, link.link_speed, (link.link_duplex == ETH_LINK_FULL_DUPLEX) ? ("full-duplex") : ("half-duplex\n")); - retval = 1; + link_status = 1; } else printf("Port %d Link Down\n", portid); } - return retval; + return link_status; } static void -- 2.17.1
[dpdk-dev] Query for support hw gro offload
Hello Ferruh, 1. Current dpdk framework provide software GRO feature, but do not support hardware GRO offload. and it now can only process tcp4 and vxland_tcp4 streams. 2. HNS3's hardware 1620(ARM SoC) support hardware GRO offload, it can process tcp4 and tcp6 streams. 3. linux kernerl driver support hardware GRO feature already (named rx-gro-hw). so my questions: 1. Can dpdk framework have plan to support hardware GRO offload? 2. If we make this patch, could it upstream? Regards, Chengwen
[dpdk-dev] [PATCH v3] eal/mem: preallocate VA space in no-huge mode
When --no-huge mode is used, the memory is currently allocated with mmap(NULL, ...). This is fine in most cases, but can fail in cases where DPDK is run on a machine with an IOMMU that is of more limited address width than that of a VA, because we're not specifying the address hint for mmap() call. Fix it by preallocating VA space before mapping it. Cc: sta...@dpdk.org Signed-off-by: Anatoly Burakov --- Notes: v3: - Fix mmap flags used in place of offset - Fix using internal_config.memory in place of mem_sz - Add additional address sanity check v2: - Add unmap on unsuccessful mmap I couldn't figure out which specific commit has introduced the issue, so there's no fix tag. The most likely candidate is one that introduced the DMA mask thing in the first place but i'm not sure. lib/librte_eal/linux/eal/eal_memory.c | 24 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/lib/librte_eal/linux/eal/eal_memory.c b/lib/librte_eal/linux/eal/eal_memory.c index 5604c2a7c0..7a9c97ff88 100644 --- a/lib/librte_eal/linux/eal/eal_memory.c +++ b/lib/librte_eal/linux/eal/eal_memory.c @@ -1340,6 +1340,8 @@ eal_legacy_hugepage_init(void) /* hugetlbfs can be disabled */ if (internal_config.no_hugetlbfs) { + void *prealloc_addr; + size_t mem_sz; struct rte_memseg_list *msl; int n_segs, cur_seg, fd, flags; #ifdef MEMFD_SUPPORTED @@ -1395,17 +1397,31 @@ eal_legacy_hugepage_init(void) } } #endif - addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE, - flags, fd, 0); - if (addr == MAP_FAILED) { + /* preallocate address space for the memory, so that it can be +* fit into the DMA mask. +*/ + mem_sz = internal_config.memory; + prealloc_addr = eal_get_virtual_area( + NULL, &mem_sz, page_sz, 0, 0); + if (prealloc_addr == NULL) { + RTE_LOG(ERR, EAL, + "%s: reserving memory area failed: " + "%s\n", + __func__, strerror(errno)); + return -1; + } + addr = mmap(prealloc_addr, mem_sz, PROT_READ | PROT_WRITE, + flags | MAP_FIXED, fd, 0); + if (addr == MAP_FAILED || addr != prealloc_addr) { RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__, strerror(errno)); + munmap(prealloc_addr, mem_sz); return -1; } msl->base_va = addr; msl->page_sz = page_sz; msl->socket_id = 0; - msl->len = internal_config.memory; + msl->len = mem_sz; msl->heap = 1; /* we're in single-file segments mode, so only the segment list -- 2.17.1
[dpdk-dev] [PATCH v3 1/2] net/virtio: refactor devargs parsing
refactor vdpa specific devargs parsing to more generic way Signed-off-by: Ivan Dyukov --- drivers/net/virtio/virtio_ethdev.c | 35 +- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 044eb10a7..22323d9a5 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -1955,16 +1955,18 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev) } static int vdpa_check_handler(__rte_unused const char *key, - const char *value, __rte_unused void *opaque) + const char *value, void *ret_val) { - if (strcmp(value, "1")) - return -1; + if (strcmp(value, "1") == 0) + *(int *)ret_val = 1; + else + *(int *)ret_val = 0; return 0; } static int -vdpa_mode_selected(struct rte_devargs *devargs) +virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa) { struct rte_kvargs *kvlist; const char *key = "vdpa"; @@ -1980,12 +1982,16 @@ vdpa_mode_selected(struct rte_devargs *devargs) if (!rte_kvargs_count(kvlist, key)) goto exit; - /* vdpa mode selected when there's a key-value pair: vdpa=1 */ - if (rte_kvargs_process(kvlist, key, - vdpa_check_handler, NULL) < 0) { - goto exit; + if (vdpa) { + /* vdpa mode selected when there's a key-value pair: +* vdpa=1 +*/ + ret = rte_kvargs_process(kvlist, key, + vdpa_check_handler, vdpa); + if (ret < 0) + goto exit; } - ret = 1; + exit: rte_kvargs_free(kvlist); @@ -1995,8 +2001,17 @@ vdpa_mode_selected(struct rte_devargs *devargs) static int eth_virtio_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_device *pci_dev) { + int vdpa = 0; + int ret = 0; + + ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa); + if (ret < 0) { + PMD_DRV_LOG(ERR, + "devargs parsing is failed"); + return ret; + } /* virtio pmd skips probe if device needs to work in vdpa mode */ - if (vdpa_mode_selected(pci_dev->device.devargs)) + if (vdpa == 1) return 1; return rte_eth_dev_pci_generic_probe(pci_dev, sizeof(struct virtio_hw), -- 2.17.1
[dpdk-dev] [PATCH v3 2/2] net/virtio: add link speed devarg
Some applications like pktgen use link_speed to calculate transmit rate. It limits outcome traffic to hardcoded 10G. This patch adds link_speed devarg which allows to configure link_speed of virtio device. Signed-off-by: Ivan Dyukov --- doc/guides/nics/virtio.rst | 7 ++ drivers/net/virtio/virtio_ethdev.c | 101 - drivers/net/virtio/virtio_pci.h| 1 + 3 files changed, 92 insertions(+), 17 deletions(-) diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst index d1f5fb898..f190f2e4f 100644 --- a/doc/guides/nics/virtio.rst +++ b/doc/guides/nics/virtio.rst @@ -356,6 +356,13 @@ Below devargs are supported by the PCI virtio driver: a virtio device needs to work in vDPA mode. (Default: 0 (disabled)) +#. ``link_speed``: + +It is used to specify link speed of virtio device. Link speed is a part of +link status structure. It could be requested by application using +rte_eth_link_get_nowait function. +(Default: 1 (10G)) + Below devargs are supported by the virtio-user vdev: #. ``path``: diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 22323d9a5..5ef3c11a7 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -45,6 +45,10 @@ static int virtio_dev_promiscuous_enable(struct rte_eth_dev *dev); static int virtio_dev_promiscuous_disable(struct rte_eth_dev *dev); static int virtio_dev_allmulticast_enable(struct rte_eth_dev *dev); static int virtio_dev_allmulticast_disable(struct rte_eth_dev *dev); +static uint32_t virtio_dev_speed_capa_get(uint32_t link_speed); +static int virtio_dev_devargs_parse(struct rte_devargs *devargs, + int *vdpa, + uint32_t *link_speed); static int virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info); static int virtio_dev_link_update(struct rte_eth_dev *dev, @@ -1861,6 +1865,7 @@ int eth_virtio_dev_init(struct rte_eth_dev *eth_dev) { struct virtio_hw *hw = eth_dev->data->dev_private; + uint32_t link_speed = ETH_SPEED_NUM_10G; int ret; if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) { @@ -1886,7 +1891,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) return 0; } - + ret = virtio_dev_devargs_parse(eth_dev->device->devargs, +NULL, &link_speed); + if (ret < 0) + return ret; + hw->link_speed = link_speed; /* * Pass the information to the rte_eth_dev_close() that it should also * release the private port resources. @@ -1953,6 +1962,14 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev) return 0; } +#define VIRTIO_ARG_LINK_SPEED "link_speed" +#define VIRTIO_ARG_VDPA "vdpa" + +static const char * const valid_args[] = { + VIRTIO_ARG_LINK_SPEED, + VIRTIO_ARG_VDPA, + NULL +}; static int vdpa_check_handler(__rte_unused const char *key, const char *value, void *ret_val) @@ -1965,33 +1982,84 @@ static int vdpa_check_handler(__rte_unused const char *key, return 0; } + +static uint32_t +virtio_dev_speed_capa_get(uint32_t link_speed) +{ + switch (link_speed) { + case ETH_SPEED_NUM_10G: + return ETH_LINK_SPEED_10G; + case ETH_SPEED_NUM_20G: + return ETH_LINK_SPEED_20G; + case ETH_SPEED_NUM_25G: + return ETH_LINK_SPEED_25G; + case ETH_SPEED_NUM_40G: + return ETH_LINK_SPEED_40G; + case ETH_SPEED_NUM_50G: + return ETH_LINK_SPEED_50G; + case ETH_SPEED_NUM_56G: + return ETH_LINK_SPEED_56G; + case ETH_SPEED_NUM_100G: + return ETH_LINK_SPEED_100G; + default: + return 0; + } +} + +static int link_speed_handler(const char *key __rte_unused, + const char *value, void *ret_val) +{ + uint32_t val; + if (!value || !ret_val) + return -EINVAL; + val = strtoul(value, NULL, 0); + /* validate input */ + if (virtio_dev_speed_capa_get(val) == 0) + return -EINVAL; + *(uint32_t *)ret_val = val; + + return 0; +} + + static int -virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa) +virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa, + uint32_t *link_speed) { struct rte_kvargs *kvlist; - const char *key = "vdpa"; int ret = 0; if (devargs == NULL) return 0; - kvlist = rte_kvargs_parse(devargs->args, NULL); - if (kvlist == NULL) + kvlist = rte_kvargs_parse(devargs->args, valid_args); + if (kvlist == NULL) { + PMD_INIT_LOG(ERR, "error when parsing param"); return 0; - - if (!rte_kvargs_count(kvlist, key)) - goto exit; - - if (vdpa) { + } + if
Re: [dpdk-dev] [PATCH v2 2/2] net/virtio: add link speed devarg
Hi Maxime, Adrian, Thank you for comments! I have sent new patch with updated documentation. This is not final version of the link speed. I still have plan to rework it and use link speed which is stored in qemu. Best regards, Ivan 06.02.2020 17:26, Adrian Moreno пишет: > On 2/6/20 3:22 PM, Maxime Coquelin wrote: >> Adding back Ivan as you removed it from the To: list. >> So he may not have seen your comment. I really missed it. Thanks. > Oops, sorry about that. > Adrian >> On 1/29/20 11:10 AM, Adrian Moreno wrote: >>> On 1/20/20 6:05 PM, Ivan Dyukov wrote: Some applications like pktgen use link_speed to calculate transmit rate. It limits outcome traffic to hardcoded 10G. This patch adds link_speed devarg which allows to configure link_speed of virtio device. Signed-off-by: Ivan Dyukov --- drivers/net/virtio/virtio_ethdev.c | 101 - drivers/net/virtio/virtio_pci.h| 1 + 2 files changed, 85 insertions(+), 17 deletions(-) >>> Hi Ivan, >>> >>> IMHO, this new option deserves being documented in >>> doc/guides/nics/virtio.rst. >>> >>> Otherwise it looks good to me. >> I agree with Adrian here, the new option need to be documented. >> >> Thanks, >> Maxime >> >>> Thank you. >>> Adrian diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 22323d9a5..5ef3c11a7 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -45,6 +45,10 @@ static int virtio_dev_promiscuous_enable(struct rte_eth_dev *dev); static int virtio_dev_promiscuous_disable(struct rte_eth_dev *dev); static int virtio_dev_allmulticast_enable(struct rte_eth_dev *dev); static int virtio_dev_allmulticast_disable(struct rte_eth_dev *dev); +static uint32_t virtio_dev_speed_capa_get(uint32_t link_speed); +static int virtio_dev_devargs_parse(struct rte_devargs *devargs, + int *vdpa, + uint32_t *link_speed); static int virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info); static int virtio_dev_link_update(struct rte_eth_dev *dev, @@ -1861,6 +1865,7 @@ int eth_virtio_dev_init(struct rte_eth_dev *eth_dev) { struct virtio_hw *hw = eth_dev->data->dev_private; + uint32_t link_speed = ETH_SPEED_NUM_10G; int ret; if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) { @@ -1886,7 +1891,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) return 0; } - + ret = virtio_dev_devargs_parse(eth_dev->device->devargs, + NULL, &link_speed); + if (ret < 0) + return ret; + hw->link_speed = link_speed; /* * Pass the information to the rte_eth_dev_close() that it should also * release the private port resources. @@ -1953,6 +1962,14 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev) return 0; } +#define VIRTIO_ARG_LINK_SPEED "link_speed" +#define VIRTIO_ARG_VDPA "vdpa" + +static const char * const valid_args[] = { + VIRTIO_ARG_LINK_SPEED, + VIRTIO_ARG_VDPA, + NULL +}; static int vdpa_check_handler(__rte_unused const char *key, const char *value, void *ret_val) @@ -1965,33 +1982,84 @@ static int vdpa_check_handler(__rte_unused const char *key, return 0; } + +static uint32_t +virtio_dev_speed_capa_get(uint32_t link_speed) +{ + switch (link_speed) { + case ETH_SPEED_NUM_10G: + return ETH_LINK_SPEED_10G; + case ETH_SPEED_NUM_20G: + return ETH_LINK_SPEED_20G; + case ETH_SPEED_NUM_25G: + return ETH_LINK_SPEED_25G; + case ETH_SPEED_NUM_40G: + return ETH_LINK_SPEED_40G; + case ETH_SPEED_NUM_50G: + return ETH_LINK_SPEED_50G; + case ETH_SPEED_NUM_56G: + return ETH_LINK_SPEED_56G; + case ETH_SPEED_NUM_100G: + return ETH_LINK_SPEED_100G; + default: + return 0; + } +} + +static int link_speed_handler(const char *key __rte_unused, + const char *value, void *ret_val) +{ + uint32_t val; + if (!value || !ret_val) + return -EINVAL; + val = strtoul(value, NULL, 0); + /* validate input */ + if (virtio_dev_speed_capa_get(val) == 0) + return -EINVAL; + *(uint32_t *)ret_val = val; + + return 0; +} + + static int -virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa) +virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa, + uint32_t *link_speed) { struct rte_kvargs *kvlist; - const
Re: [dpdk-dev] Query for support hw gro offload
On 2/7/2020 10:39 AM, fengchengwen wrote: > Hello Ferruh, > > 1. Current dpdk framework provide software GRO feature, but do not > support hardware GRO offload. and it now can only process tcp4 and > vxland_tcp4 streams. > 2. HNS3's hardware 1620(ARM SoC) support hardware GRO offload, it > can process tcp4 and tcp6 streams. > 3. linux kernerl driver support hardware GRO feature already (named > rx-gro-hw). > > so my questions: > 1. Can dpdk framework have plan to support hardware GRO offload? > 2. If we make this patch, could it upstream? Hi Chengwen, cc'ed ethdev maintainers. Can't you use LRO [1] for HW GRO offload? [1] http://lxr.dpdk.org/dpdk/v19.11/source/doc/guides/nics/features.rst#L193
Re: [dpdk-dev] [PATCH v2] service: don't walk out of bounds when checking services
On 20/12/2019 14:43, David Marchand wrote: > On Wed, Dec 4, 2019 at 9:34 AM David Marchand > wrote: >> >> On Wed, Dec 4, 2019 at 9:33 AM David Marchand >> wrote: >>> >>> On Tue, Dec 3, 2019 at 10:15 PM Aaron Conole wrote: The service_valid call is used without properly bounds checking the input parameter. Almost all instances of the service_valid call are inside a for() loop that prevents excessive walks, but some of the public APIs don't bounds check and will pass invalid arguments. Prevent this by using SERVICE_GET_OR_ERR_RET where it makes sense, and adding a bounds check to one service_valid() use. Fixes: 8d39d3e237c2 ("service: fix race in service on app lcore function") Fixes: e9139a32f6e8 ("service: add function to run on app lcore") Fixes: e30dd31847d2 ("service: add mechanism for quiescing") >> Cc: sta...@dpdk.org >> With the commit below, this patch will apply cleanly on 18.11. Seems ok to me to add below commit, wdyt? commit e484ccddbe1b41886fef1e445ef2fdfa55086198 Author: Nikhil Rao Date: Mon Sep 16 15:31:02 2019 +0530 service: avoid false sharing on core state Signed-off-by: Aaron Conole >>> >>> Reviewed-by: David Marchand > > Applied, thanks. > > > -- > David Marchand >
Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions
Hi Jerin, see below > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David > wrote: > > Hi David, > > > > > > > > > > > > > - XGS-PON MAC: Crypto-CRC-BIP > > > > > > - Order: > > > > > > - Downstream: CRC, Encrypt, BIP > > > > > > > > > > I understand if the chain has two operations then it may > > > > > possible to have handcrafted SW code to do both operations in one > pass. > > > > > I understand the spec is agnostic on a number of passes it does > > > > > require to enable the xfrom but To understand the SW/HW > > > > > capability, In the above case, "CRC, Encrypt, BIP", It is done > > > > > in one pass in SW or three passes in SW or one pass using HW? > > > > > > > > [DC] The CRC, Encrypt, BIP is also currently done as 1 pass in > > > > AESNI MB > > > library SW. > > > > However, this could also be performed as a single pass in a HW > > > > accelerator > > > > > > As a specification, cascading the xform chains make sense. > > > Do we have any HW that does support chaining the xforms more than > "two" > > > in one pass? > > > i.e real chaining function where two blocks of HWs work hand in hand > > > for chaining. > > > If none, it may be better to abstract as synonymous API(No dequeue, > > > no > > > enqueue) for the CPU use case. > > > > [DC] I'm not aware of any HW that supports this at the moment, but that's > not to say it couldn't in the future - if anyone else has any examples though, > please feel free to share. > > Regardless, I don't see why we would introduce a different API for SW > devices and HW devices. > > There is a risk in drafting API that meant for HW without any HW exists. > Because there could be inefficiency on the metadata and fast path API for > both models. > For example, In the case of CPU based scheme, it will be pure overhead > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction > where CPU works better in the synchronous model and I have doubt that the > session-based scheme will work for HW or not as both difference HW needs > to work hand in hand(IOMMU aspects for two PCI device) [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs. In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient. The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach. From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability - the application doesn't need to know the device type as such and therefore doesn't need to make different API calls. The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future. As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks. We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development. Finally, I'm unsure what you mean by this line: "I have doubt that the session-based scheme will work for HW or not as both difference HW needs to work hand in hand(IOMMU aspects for two PCI device)" What do mean by different HW working "hand in hand" and "two PCI device"? The intention is that 1 HW device (or it's PMD) would have to support the accel xform chain > > Having said that, I agree with the need for use case and API for CPU case. > Till > we find a HW spec, we need to make the solution as CPU specific and latter > extend based on HW metadata required. > Accelerator API sounds like HW accelerator and there is no HW support then > it may not good. We can change the API that works for the use cases that we > know how it works efficiently. > > > > > > > > > It would be up to each underlying PMD to decide if/how it supports a > > particular accelerator xform chain, but from an application's point of > > view, the accelerator API is always the same > > > >
Re: [dpdk-dev] [RFT] net/netvsc: initialize link state
On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote: > If application is using link state interrupt, the correct link state > needs to be filled in when device is started. This is similar to > how virtio updates link information. > > Reported-by: Mohammed Gamal > Signed-off-by: Stephen Hemminger > --- > This version marked RFT because am in airport without access to a > machine to test it. > > drivers/net/netvsc/hn_ethdev.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/net/netvsc/hn_ethdev.c > b/drivers/net/netvsc/hn_ethdev.c > index c79f924379fe..564620748daf 100644 > --- a/drivers/net/netvsc/hn_ethdev.c > +++ b/drivers/net/netvsc/hn_ethdev.c > @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev) > if (error) > hn_rndis_set_rxfilter(hv, 0); > > + /* Initialize Link state */ > + if (error == 0) > + hn_dev_link_update(dev, 0); > + > return error; > } > I tested this and I always get the link status as UP, regardless of whether I start the interface on the guest in UP or DOWN state. Looking at hn_dev_link_update() code, I see that the link status depends on the NDIS status that the driver gets from the host if my understanding is correct. The question is whether if I use 'ip li set dev $IF_NAME down' on the guest affects the status the host sees, or would the host set the state to NDIS_MEDIA_STATE_CONNECTED of the device is physcially connected regardless of what the guest tries to do?
Re: [dpdk-dev] [PATCH] mbuf: fix to update documentation of QinQ stripped bit interpretation
Olivier, On Thu, Feb 6, 2020 at 10:55 PM Olivier Matz wrote: > > Hi Somnath, > > Sorry for the delay, please find some comments below. > > I suggest the following title instead: > > mbuf: extend meaning of QinQ stripped bit > > On Mon, Jan 06, 2020 at 02:04:23PM +0530, Somnath Kotur wrote: > > Certain hardware may be able to strip and/or save only the outermost > > VLAN instead of both the VLANs in the mbuf in a QinQ scenario. > > To handle such cases, we could re-interpret setting of just > > PKT_RX_QINQ_STRIPPED to indicate that only the outermost VLAN has > > been stripped and saved in mbuf->vlan_tci_outer. > > To be sure we're on the same page: we are talking about case 7 of this > link: http://inbox.dpdk.org/dev/20191227145041.GQ22738@platinum/ I'm not sure we are on the same page then, please see my response inline below > > So, even if the inner vlan_tci is not stripped from packet data, it has > to be saved in m->vlan_tci, because it is implied by PKT_RX_VLAN. > > From the same link, the case where the driver only strips+saves the > outer vlan without saving or stripping the inner one is case 3. > While this is how it works currently, I'm wondering how will the application know if this was a double VLAN pkt, correct? Also when i look at options 5 and 7 I don't really see the difference in semantics between them ? Both seem to store the outer-vlan and inner-vlan in m->vlan_tci_outer and m->vlan_tci_inner respectively 7/ PKT_RX_VLAN | PKT_RX_QINQ | PKT_RX_QINQ_STRIPPED outer-vlan is removed from packet data m->vlan_tci_outer=outer-vlan m->vlan_tci=inner-vlan I was hoping that with the new interpretation of PKT_RX_QINQ_STRIPPED, it only meant that outer-vlan is removed from packet data and m->vlan_tci_outer = outer_vlan, while PKT_RX_QINQ implies it is a double-vlan pkt and PKT_RX_VLAN implies that pkt has VLAN associated with it? Not m->vlan_tci = inner-vlan Thanks Som > > Only When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2 > > VLANs have been stripped by the hardware and their TCI are saved in > > mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > > > Signed-off-by: Somnath Kotur > > --- > > lib/librte_mbuf/rte_mbuf_core.h | 15 +++ > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h > > b/lib/librte_mbuf/rte_mbuf_core.h > > index 9a8557d..db1070b 100644 > > --- a/lib/librte_mbuf/rte_mbuf_core.h > > +++ b/lib/librte_mbuf/rte_mbuf_core.h > > @@ -124,12 +124,19 @@ > > #define PKT_RX_FDIR_FLX (1ULL << 14) > > > > /** > > - * The 2 vlans have been stripped by the hardware and their tci are > > - * saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > + * The outer vlan has been stripped by the hardware and their tci are > > + * saved in mbuf->vlan_tci_outer (outer). > > their tci are -> its tci is > > > * This can only happen if vlan stripping is enabled in the RX > > * configuration of the PMD. > > - * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN | > > - * PKT_RX_VLAN_STRIPPED | PKT_RX_QINQ) must also be set. > > + * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN | PKT_RX_QINQ) > > + * must also be set. > > ok > > > + * When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2 > > vlans > > + * have been stripped by the hardware and their tci are saved in > > + * mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > This is correct, but I'd use a bullet list to add another sentence: > > * - If both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the 2 > vlans > * have been stripped by the hardware and their tci are saved in > * mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > * - If PKT_RX_QINQ_STRIPPED is set and PKT_RX_VLAN_STRIPPED is unset, only > the > * outer vlan is removed from packet data, but both tci are saved in > * mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > > + * This can only happen if vlan stripping is enabled in the RX > > configuration > > + * of the PMD. > > The same exact sentence is above, this one can be removed. > > > + * When PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, > > + * (PKT_RX_VLAN | PKT_RX_QINQ) must also be set. > > This can be removed too as it is redundant with above sentence: > > * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN | PKT_RX_QINQ) > * must also be set. > > > Thanks, > Olivier
[dpdk-dev] [PATCH] crypto/ccp: fix queue alignment
Caught by compiling with -fno-common. A cacheline_aligned symbol can be found in the crypto/ccp driver object files. Looking at this driver source, the cacheline_aligned (kernel?) alignment macro is undefined. The compiler treats this as a symbol definition and generates a global symbol. Fixes: ef4b04f87fa6 ("crypto/ccp: support device init") Cc: sta...@dpdk.org Signed-off-by: David Marchand --- drivers/crypto/ccp/ccp_dev.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/ccp/ccp_dev.h b/drivers/crypto/ccp/ccp_dev.h index f4ad9eafd..37e04218c 100644 --- a/drivers/crypto/ccp/ccp_dev.h +++ b/drivers/crypto/ccp/ccp_dev.h @@ -220,7 +220,7 @@ struct ccp_queue { /**< lsb assigned for sha ctx */ uint32_t sb_hmac; /**< lsb assigned for hmac ctx */ -} cacheline_aligned; +} __rte_cache_aligned; /** * A structure describing a CCP device. -- 2.23.0
Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions
On Fri, Feb 7, 2020 at 6:08 PM Coyle, David wrote: > > Hi Jerin, see below Hi David, > > > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David > > wrote: > > > > > > There is a risk in drafting API that meant for HW without any HW exists. > > Because there could be inefficiency on the metadata and fast path API for > > both models. > > For example, In the case of CPU based scheme, it will be pure overhead > > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction > > where CPU works better in the synchronous model and I have doubt that the > > session-based scheme will work for HW or not as both difference HW needs > > to work hand in hand(IOMMU aspects for two PCI device) > > [DC] I understand what you are saying about the overhead of emulating the "sw > queue" but this same model is already used in many of the existing device > PMDs. > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for > compression, the enqueue/dequeue in the PMD is emulated through an rte_ring > which is very efficient. > The accelerator API will use the existing device PMDs so keeping the same > model seems like a sensible approach. In this release, we added CPU crypto support in cryptodev to support the synchronous model to fix the overhead. > > From an application's point of view, this abstraction of the underlying > device type is important for usability and maintainability - the application > doesn't need to know > the device type as such and therefore doesn't need to make different API > calls. > > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW > doesn't support these xform chains at the moment, it could potentially do so > in the future. > As a side note, as part of the work of adding the accelerator API, the QAT > PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, > where the Crypto > is done on QAT HW and the CRC will be done in SW, most likely through a call > to the optimized rte_net_crc library. This will give a consistent API for the > DOCSIS-MAC data-plane > pipeline prototype we have developed, which uses both AESNI-MB and QAT for > benchmarks. > > We will take your feedback on the enqueue/dequeue approach for SW devices > into consideration though during development. > > Finally, I'm unsure what you mean by this line: > > "I have doubt that the session-based scheme will work for HW or not > as both difference HW needs to work hand in hand(IOMMU aspects for two PCI > device)" > > What do mean by different HW working "hand in hand" and "two PCI device"? > The intention is that 1 HW device (or it's PMD) would have to support the > accel xform chain I was thinking, it will be N PCIe devices that create the chain. Each distinct PCI device does the fixed-function and chains them together. I do understand the usage of QAT HW and CRC in SW. So If I understand it correctly, in rte_security, we are combining rte_ethdev and rte_cryptodev. With this spec, we are trying to combine, rte_cryptodev and rte_compressdev. So it looks good to me. My only remaining concern is the name of this API, accelerator too generic name. IMO, like rte_security, we may need to give more meaningful name for the use case where crytodev and compressdev can work together.
[dpdk-dev] [PATCH] test/crypto: add cpu crypto mode tests
This patch adds ability to run unit tests in cpu crypto mode for AESNI GCM cryptodev. Signed-off-by: Marcin Smoczynski --- app/test/test_cryptodev.c | 181 -- 1 file changed, 172 insertions(+), 9 deletions(-) diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c index e6abc22b6..7b1ef5c86 100644 --- a/app/test/test_cryptodev.c +++ b/app/test/test_cryptodev.c @@ -1,5 +1,5 @@ /* SPDX-License-Identifier: BSD-3-Clause - * Copyright(c) 2015-2019 Intel Corporation + * Copyright(c) 2015-2020 Intel Corporation */ #include @@ -52,6 +52,9 @@ static int gbl_driver_id; +static enum rte_security_session_action_type gbl_action_type = + RTE_SECURITY_ACTION_TYPE_NONE; + struct crypto_testsuite_params { struct rte_mempool *mbuf_pool; struct rte_mempool *large_mbuf_pool; @@ -139,9 +142,97 @@ ceil_byte_length(uint32_t num_bits) return (num_bits >> 3); } +static void +process_cpu_gmac_op(uint8_t dev_id, struct rte_crypto_op *op) +{ + int32_t n, st; + void *iv; + struct rte_crypto_sym_op *sop; + union rte_crypto_sym_ofs ofs; + struct rte_crypto_sgl sgl; + struct rte_crypto_sym_vec symvec; + struct rte_crypto_vec vec[UINT8_MAX]; + + sop = op->sym; + + n = rte_crypto_mbuf_to_vec(sop->m_src, sop->auth.data.offset, + sop->auth.data.length, vec, RTE_DIM(vec)); + + if (n < 0 || n != sop->m_src->nb_segs) { + op->status = RTE_CRYPTO_OP_STATUS_ERROR; + return; + } + + sgl.vec = vec; + sgl.num = n; + symvec.sgl = &sgl; + iv = rte_crypto_op_ctod_offset(op, void *, IV_OFFSET); + symvec.iv = &iv; + symvec.aad = NULL; + symvec.digest = (void **)&sop->auth.digest.data; + symvec.status = &st; + symvec.num = 1; + + ofs.raw = 0; + + n = rte_cryptodev_sym_cpu_crypto_process(dev_id, sop->session, ofs, + &symvec); + + if (n != 1) + op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED; + else + op->status = RTE_CRYPTO_OP_STATUS_SUCCESS; +} + + +static void +process_cpu_aead_op(uint8_t dev_id, struct rte_crypto_op *op) +{ + int32_t n, st; + void *iv; + struct rte_crypto_sym_op *sop; + union rte_crypto_sym_ofs ofs; + struct rte_crypto_sgl sgl; + struct rte_crypto_sym_vec symvec; + struct rte_crypto_vec vec[UINT8_MAX]; + + sop = op->sym; + + n = rte_crypto_mbuf_to_vec(sop->m_src, sop->aead.data.offset, + sop->aead.data.length, vec, RTE_DIM(vec)); + + if (n < 0 || n != sop->m_src->nb_segs) { + op->status = RTE_CRYPTO_OP_STATUS_ERROR; + return; + } + + sgl.vec = vec; + sgl.num = n; + symvec.sgl = &sgl; + iv = rte_crypto_op_ctod_offset(op, void *, IV_OFFSET); + symvec.iv = &iv; + symvec.aad = (void **)&sop->aead.aad.data; + symvec.digest = (void **)&sop->aead.digest.data; + symvec.status = &st; + symvec.num = 1; + + ofs.raw = 0; + + n = rte_cryptodev_sym_cpu_crypto_process(dev_id, sop->session, ofs, + &symvec); + + if (n != 1) + op->status = RTE_CRYPTO_OP_STATUS_AUTH_FAILED; + else + op->status = RTE_CRYPTO_OP_STATUS_SUCCESS; +} + static struct rte_crypto_op * process_crypto_request(uint8_t dev_id, struct rte_crypto_op *op) { + + RTE_VERIFY(gbl_action_type != RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO); + if (rte_cryptodev_enqueue_burst(dev_id, 0, &op, 1) != 1) { RTE_LOG(ERR, USER1, "Error sending packet for encryption\n"); return NULL; @@ -6937,7 +7028,11 @@ test_authenticated_encryption(const struct aead_test_data *tdata) ut_params->op->sym->m_src = ut_params->ibuf; /* Process crypto operation */ - TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params->valid_devs[0], + if (gbl_action_type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) + process_cpu_aead_op(ts_params->valid_devs[0], ut_params->op); + else + TEST_ASSERT_NOT_NULL( + process_crypto_request(ts_params->valid_devs[0], ut_params->op), "failed to process sym crypto op"); TEST_ASSERT_EQUAL(ut_params->op->status, RTE_CRYPTO_OP_STATUS_SUCCESS, @@ -7868,7 +7963,11 @@ test_authenticated_decryption(const struct aead_test_data *tdata) ut_params->op->sym->m_src = ut_params->ibuf; /* Process crypto operation */ - TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params->valid_devs[0], + if (gbl_action_type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) + process_cpu_aead_op(ts_params->valid_devs[0], ut_params->op); + else + TEST_ASSERT_NOT_NULL( + process_crypto_request(ts_params->valid_devs[0], ut_params-
Re: [dpdk-dev] [PATCH v2] service: don't walk out of bounds when checking services
Kevin Traynor writes: > On 20/12/2019 14:43, David Marchand wrote: >> On Wed, Dec 4, 2019 at 9:34 AM David Marchand >> wrote: >>> >>> On Wed, Dec 4, 2019 at 9:33 AM David Marchand >>> wrote: On Tue, Dec 3, 2019 at 10:15 PM Aaron Conole wrote: > > The service_valid call is used without properly bounds checking the > input parameter. Almost all instances of the service_valid call are > inside a for() loop that prevents excessive walks, but some of the > public APIs don't bounds check and will pass invalid arguments. > > Prevent this by using SERVICE_GET_OR_ERR_RET where it makes sense, > and adding a bounds check to one service_valid() use. > > Fixes: 8d39d3e237c2 ("service: fix race in service on app lcore function") > Fixes: e9139a32f6e8 ("service: add function to run on app lcore") > Fixes: e30dd31847d2 ("service: add mechanism for quiescing") >>> Cc: sta...@dpdk.org >>> > > With the commit below, this patch will apply cleanly on 18.11. > > Seems ok to me to add below commit, wdyt? If I'm reading it correctly, the move is for an internal data structure in librte_eal, so I think it shouldn't be an ABI breakage. Looks safe to me as well. > commit e484ccddbe1b41886fef1e445ef2fdfa55086198 > Author: Nikhil Rao > Date: Mon Sep 16 15:31:02 2019 +0530 > > service: avoid false sharing on core state > > > Signed-off-by: Aaron Conole Reviewed-by: David Marchand >> >> Applied, thanks. >> >> >> -- >> David Marchand >>
Re: [dpdk-dev] [PATCH] mbuf: fix to update documentation of QinQ stripped bit interpretation
Hi Somnath, On Fri, Feb 07, 2020 at 07:13:04PM +0530, Somnath Kotur wrote: > Olivier, > > On Thu, Feb 6, 2020 at 10:55 PM Olivier Matz wrote: > > > > Hi Somnath, > > > > Sorry for the delay, please find some comments below. > > > > I suggest the following title instead: > > > > mbuf: extend meaning of QinQ stripped bit > > > > On Mon, Jan 06, 2020 at 02:04:23PM +0530, Somnath Kotur wrote: > > > Certain hardware may be able to strip and/or save only the outermost > > > VLAN instead of both the VLANs in the mbuf in a QinQ scenario. > > > To handle such cases, we could re-interpret setting of just > > > PKT_RX_QINQ_STRIPPED to indicate that only the outermost VLAN has > > > been stripped and saved in mbuf->vlan_tci_outer. > > > > To be sure we're on the same page: we are talking about case 7 of this > > link: http://inbox.dpdk.org/dev/20191227145041.GQ22738@platinum/ > > I'm not sure we are on the same page then, please see my response inline below > > > > So, even if the inner vlan_tci is not stripped from packet data, it has > > to be saved in m->vlan_tci, because it is implied by PKT_RX_VLAN. > > > > From the same link, the case where the driver only strips+saves the > > outer vlan without saving or stripping the inner one is case 3. > > > While this is how it works currently, I'm wondering how will the > application know if this was > a double VLAN pkt, correct? If the hardware supports it and configured for it, the flag PKT_RX_QINQ is set when there are 2 vlans. FYI, the m->packet_type field can also be useful. It describes what is really present in the packet data, as described in rte_mbuf_core.h: /* * The packet type, which is the combination of outer/inner L2, L3, L4 * and tunnel types. The packet_type is about data really present in the * mbuf. Example: if vlan stripping is enabled, a received vlan packet * would have RTE_PTYPE_L2_ETHER and not RTE_PTYPE_L2_VLAN because the * vlan is stripped from the data. */ > Also when i look at options 5 and 7 I don't really see the difference > in semantics between them ? > Both seem to store the outer-vlan and inner-vlan in m->vlan_tci_outer > and m->vlan_tci_inner respectively > > 7/ PKT_RX_VLAN | PKT_RX_QINQ | PKT_RX_QINQ_STRIPPED >outer-vlan is removed from packet data >m->vlan_tci_outer=outer-vlan >m->vlan_tci=inner-vlan The difference is about what was stripped or not from mbuf data. > I was hoping that with the new interpretation of PKT_RX_QINQ_STRIPPED, > it only meant that outer-vlan is removed from packet data > and m->vlan_tci_outer = outer_vlan, while PKT_RX_QINQ implies it is a > double-vlan pkt and PKT_RX_VLAN implies that pkt has VLAN > associated with it? Not m->vlan_tci = inner-vlan The meaning of each flag should be as simple as possible, I think we can summarize them like this: - PKT_RX_VLAN: the vlan is saved in vlan tci. - PKT_RX_VLAN_STRIPPED: the vlan hdr is removed from packet data. - PKT_RX_QINQ: the outer vlan is saved in vlan tci. - PKT_RX_QINQ_STRIPPED: the inner vlan is stripped from packet data. - When PKT_RX_QINQ is set, PKT_RX_VLAN* refer to the inner vlan of initial packet, else it refers to the first vlan of the packet. There is a link between vlan flag and vlan_tci field, and qinq flag and vlan_tci_outer field. I'm still not sure to understand what you expect. Can you give an example with flags (which are set), and the expected content of m->vlan_tci and m->vlan_tci_outer? By the way, the case 5/ is not very well described too, maybe we should add something about it. Thanks, Olivier > > Thanks > Som > > > > > Only When both PKT_RX_QINQ_STRIPPED and PKT_RX_VLAN_STRIPPED are set, the > > > 2 > > > VLANs have been stripped by the hardware and their TCI are saved in > > > mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > > > > > Signed-off-by: Somnath Kotur > > > --- > > > lib/librte_mbuf/rte_mbuf_core.h | 15 +++ > > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h > > > b/lib/librte_mbuf/rte_mbuf_core.h > > > index 9a8557d..db1070b 100644 > > > --- a/lib/librte_mbuf/rte_mbuf_core.h > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h > > > @@ -124,12 +124,19 @@ > > > #define PKT_RX_FDIR_FLX (1ULL << 14) > > > > > > /** > > > - * The 2 vlans have been stripped by the hardware and their tci are > > > - * saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). > > > + * The outer vlan has been stripped by the hardware and their tci are > > > + * saved in mbuf->vlan_tci_outer (outer). > > > > their tci are -> its tci is > > > > > * This can only happen if vlan stripping is enabled in the RX > > > * configuration of the PMD. > > > - * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN | > > > - * PKT_RX_VLAN_STRIPPED | PKT_RX_QINQ) must also be set. > > > + * When PKT_RX_QINQ_STRIPPED is set, the flags (PKT_RX_VLAN | > > > PKT_RX_
[dpdk-dev] [PATCH 0/3] AF_XDP PMD Fixes
This series introduces some fixes for the zero copy path of the AF_XDP. In zero copy, the mempool objects are mapped directly into the AF_XDP UMEM. Below depicts the layout of an object in a mempool. +-++--+--+-+-+ | mp | struct | mbuf | mbuf | XDP | | | hdr | rte_ | priv | hr | hr | payload | | obj | mbuf | | | | | +-++--+--+-+-+ 64 128 * 128 256* < frame size > < frame hr -> 1: net/af_xdp: fix umem frame size & headroom calculations * The previous frame size calculation incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size. Instead of performing a manual calculation, use the rte_mempool_calc_obj_size API to determine the frame size. * The previous frame headroom calculation also incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size or the mbuf priv size. 2. net/af_xdp: use correct fill queue addresses The fill queue addresses should start at the beginning of the mempool object instead of the beginning of the mbuf. This is because the umem frame headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP doesn't write past the available room in the frame, in the case of larger packets which are close to the size of the mbuf. 3. net/af_xdp: fix maximum MTU value The maximum MTU for af_xdp zero copy is equal to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame hr = 320). The patch updates this value to reflect this, and removes some unneeded constants for both zero-copy and copy mode. Ciara Loftus (3): net/af_xdp: fix umem frame size & headroom calculations net/af_xdp: use correct fill queue addresses net/af_xdp: fix maximum MTU value drivers/net/af_xdp/rte_eth_af_xdp.c | 60 + 1 file changed, 35 insertions(+), 25 deletions(-) -- 2.17.1
[dpdk-dev] [PATCH 2/3] net/af_xdp: use correct fill queue addresses
The fill queue addresses should start at the beginning of the mempool object instead of the beginning of the mbuf. This is because the umem frame headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP doesn't write past the available room in the frame, in the case of larger packets which are close to the size of the mbuf. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- drivers/net/af_xdp/rte_eth_af_xdp.c | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 8b189119c..ebc64d4dd 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -172,7 +172,8 @@ reserve_fill_queue_zc(struct xsk_umem_info *umem, uint16_t reserve_size, uint64_t addr; fq_addr = xsk_ring_prod__fill_addr(fq, idx++); - addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer; + addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer - + umem->mb_pool->header_size; *fq_addr = addr; } @@ -271,8 +272,10 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) addr = xsk_umem__extract_addr(addr); bufs[i] = (struct rte_mbuf *) - xsk_umem__get_data(umem->buffer, addr); - bufs[i]->data_off = offset - sizeof(struct rte_mbuf); + xsk_umem__get_data(umem->buffer, addr + + umem->mb_pool->header_size); + bufs[i]->data_off = offset - sizeof(struct rte_mbuf) - + umem->mb_pool->header_size; rte_pktmbuf_pkt_len(bufs[i]) = len; rte_pktmbuf_data_len(bufs[i]) = len; @@ -385,7 +388,8 @@ pull_umem_cq(struct xsk_umem_info *umem, int size) #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG) addr = xsk_umem__extract_addr(addr); rte_pktmbuf_free((struct rte_mbuf *) - xsk_umem__get_data(umem->buffer, addr)); + xsk_umem__get_data(umem->buffer, + addr + umem->mb_pool->header_size)); #else rte_ring_enqueue(umem->buf_ring, (void *)addr); #endif @@ -443,9 +447,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) } desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx); desc->len = mbuf->pkt_len; - addr = (uint64_t)mbuf - (uint64_t)umem->buffer; + addr = (uint64_t)mbuf - (uint64_t)umem->buffer - + umem->mb_pool->header_size; offset = rte_pktmbuf_mtod(mbuf, uint64_t) - - (uint64_t)mbuf; + (uint64_t)mbuf + + umem->mb_pool->header_size; offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT; desc->addr = addr | offset; count++; @@ -466,9 +472,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx); desc->len = mbuf->pkt_len; - addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer; + addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer - + umem->mb_pool->header_size; offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) - - (uint64_t)local_mbuf; + (uint64_t)local_mbuf + + umem->mb_pool->header_size; pkt = xsk_umem__get_data(umem->buffer, addr + offset); offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT; desc->addr = addr | offset; -- 2.17.1
[dpdk-dev] [PATCH 1/3] net/af_xdp: fix umem frame size & headroom calculations
The previous frame size calculation incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size. Instead of performing a manual calculation, use the rte_mempool_calc_obj_size API to determine the frame size. The previous frame headroom calculation also incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size or the mbuf priv size. Fix this. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- drivers/net/af_xdp/rte_eth_af_xdp.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 683e2a559..8b189119c 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -755,11 +756,13 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused, void *base_addr = NULL; struct rte_mempool *mb_pool = rxq->mb_pool; - usr_config.frame_size = rte_pktmbuf_data_room_size(mb_pool) + - ETH_AF_XDP_MBUF_OVERHEAD + - mb_pool->private_data_size; - usr_config.frame_headroom = ETH_AF_XDP_DATA_HEADROOM + - mb_pool->private_data_size; + usr_config.frame_size = rte_mempool_calc_obj_size(mb_pool->elt_size, + mb_pool->flags, + NULL); + usr_config.frame_headroom = mb_pool->header_size + + sizeof(struct rte_mbuf) + + rte_pktmbuf_priv_size(mb_pool) + + RTE_PKTMBUF_HEADROOM; umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id()); if (umem == NULL) { -- 2.17.1
[dpdk-dev] [PATCH 3/3] net/af_xdp: fix maximum MTU value
The maximum MTU for af_xdp zero copy is equal to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame headroom = 320). The patch updates this value to reflect this. This change also makes it possible to remove unneeded constants for both zero-copy and copy mode. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- drivers/net/af_xdp/rte_eth_af_xdp.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index ebc64d4dd..bf18a571d 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -59,13 +59,6 @@ static int af_xdp_logtype; #define ETH_AF_XDP_FRAME_SIZE 2048 #define ETH_AF_XDP_NUM_BUFFERS 4096 -#ifdef XDP_UMEM_UNALIGNED_CHUNK_FLAG -#define ETH_AF_XDP_MBUF_OVERHEAD 128 /* sizeof(struct rte_mbuf) */ -#define ETH_AF_XDP_DATA_HEADROOM \ - (ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM) -#else -#define ETH_AF_XDP_DATA_HEADROOM 0 -#endif #define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS #define ETH_AF_XDP_DFLT_START_QUEUE_IDX0 #define ETH_AF_XDP_DFLT_QUEUE_COUNT1 @@ -601,7 +594,14 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) dev_info->max_tx_queues = internals->queue_cnt; dev_info->min_mtu = RTE_ETHER_MIN_MTU; - dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; +#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG) + dev_info->max_mtu = getpagesize() - + sizeof(struct rte_mempool_objhdr) - + sizeof(struct rte_mbuf) - + RTE_PKTMBUF_HEADROOM - XDP_PACKET_HEADROOM; +#else + dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE; +#endif dev_info->default_rxportconf.nb_queues = 1; dev_info->default_txportconf.nb_queues = 1; @@ -803,7 +803,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, .fill_size = ETH_AF_XDP_DFLT_NUM_DESCS, .comp_size = ETH_AF_XDP_DFLT_NUM_DESCS, .frame_size = ETH_AF_XDP_FRAME_SIZE, - .frame_headroom = ETH_AF_XDP_DATA_HEADROOM }; + .frame_headroom = 0 }; char ring_name[RTE_RING_NAMESIZE]; char mz_name[RTE_MEMZONE_NAMESIZE]; int ret; @@ -828,8 +828,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++) rte_ring_enqueue(umem->buf_ring, -(void *)(i * ETH_AF_XDP_FRAME_SIZE + - ETH_AF_XDP_DATA_HEADROOM)); +(void *)(i * ETH_AF_XDP_FRAME_SIZE)); snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u", internals->if_name, rxq->xsk_queue_idx); @@ -938,7 +937,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, /* Now get the space available for data in the mbuf */ buf_size = rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM; - data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; + data_size = ETH_AF_XDP_FRAME_SIZE; if (data_size > buf_size) { AF_XDP_LOG(ERR, "%s: %d bytes will not fit in mbuf (%d bytes)\n", -- 2.17.1
Re: [dpdk-dev] [RFT] net/netvsc: initialize link state
On Fri, 07 Feb 2020 15:22:23 +0200 Mohammed Gamal wrote: > On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote: > > If application is using link state interrupt, the correct link state > > needs to be filled in when device is started. This is similar to > > how virtio updates link information. > > > > Reported-by: Mohammed Gamal > > Signed-off-by: Stephen Hemminger > > --- > > This version marked RFT because am in airport without access to a > > machine to test it. > > > > drivers/net/netvsc/hn_ethdev.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/net/netvsc/hn_ethdev.c > > b/drivers/net/netvsc/hn_ethdev.c > > index c79f924379fe..564620748daf 100644 > > --- a/drivers/net/netvsc/hn_ethdev.c > > +++ b/drivers/net/netvsc/hn_ethdev.c > > @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev) > > if (error) > > hn_rndis_set_rxfilter(hv, 0); > > > > + /* Initialize Link state */ > > + if (error == 0) > > + hn_dev_link_update(dev, 0); > > + > > return error; > > } > > > > I tested this and I always get the link status as UP, regardless of > whether I start the interface on the guest in UP or DOWN state. Looking > at hn_dev_link_update() code, I see that the link status depends on the > NDIS status that the driver gets from the host if my understanding is > correct. > The question is whether if I use 'ip li set dev $IF_NAME down' on the > guest affects the status the host sees, or would the host set the state > to NDIS_MEDIA_STATE_CONNECTED of the device is physcially connected > regardless of what the guest tries to do? > Are you confused about admin state vs link state? Admin state is the up/down state in software, and link state is the (virtual) hardware link status. In traditional Linux, admin state is controlled by ip link set up/down; in DPDK the admin state is implied by whether the DPDK device is started or stopped. The link state for hardware devices is determined by whether the hardware link has synchronized with the switch. In virtual environments this is synchronized. In Linux link state is reported as NOCARRIER (IFF_RUNNING). In DPDK it is reported in via the link info get. The device visible to the kernel is the accelerated networking (Mellanox) device and is not related directly to the netvsc device. To test link up/down is not easy on Azure. You would have to use Azure CLI to disconnect the NIC from VM. On native Hyper-V you can test by setting up a virtual switch with an external network device; then unplug the network device.
Re: [dpdk-dev] [RFT] net/netvsc: initialize link state
On Fri, 2020-02-07 at 08:12 -0800, Stephen Hemminger wrote: > On Fri, 07 Feb 2020 15:22:23 +0200 > Mohammed Gamal wrote: > > > On Thu, 2020-02-06 at 16:10 -0800, Stephen Hemminger wrote: > > > If application is using link state interrupt, the correct link > > > state > > > needs to be filled in when device is started. This is similar to > > > how virtio updates link information. > > > > > > Reported-by: Mohammed Gamal > > > Signed-off-by: Stephen Hemminger > > > --- > > > This version marked RFT because am in airport without access to a > > > machine to test it. > > > > > > drivers/net/netvsc/hn_ethdev.c | 4 > > > 1 file changed, 4 insertions(+) > > > > > > diff --git a/drivers/net/netvsc/hn_ethdev.c > > > b/drivers/net/netvsc/hn_ethdev.c > > > index c79f924379fe..564620748daf 100644 > > > --- a/drivers/net/netvsc/hn_ethdev.c > > > +++ b/drivers/net/netvsc/hn_ethdev.c > > > @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev) > > > if (error) > > > hn_rndis_set_rxfilter(hv, 0); > > > > > > + /* Initialize Link state */ > > > + if (error == 0) > > > + hn_dev_link_update(dev, 0); > > > + > > > return error; > > > } > > > > > > > I tested this and I always get the link status as UP, regardless of > > whether I start the interface on the guest in UP or DOWN state. > > Looking > > at hn_dev_link_update() code, I see that the link status depends on > > the > > NDIS status that the driver gets from the host if my understanding > > is > > correct. > > The question is whether if I use 'ip li set dev $IF_NAME down' on > > the > > guest affects the status the host sees, or would the host set the > > state > > to NDIS_MEDIA_STATE_CONNECTED of the device is physcially connected > > regardless of what the guest tries to do? > > > > Are you confused about admin state vs link state? Admin state is the > up/down state in software, and link state is the (virtual) hardware > link > status. In traditional Linux, admin state is controlled by ip link > set up/down; in DPDK the admin state is implied by whether the DPDK > device is started or stopped. The link state for hardware devices is > determined by whether the hardware link has synchronized with the > switch. > In virtual environments this is synchronized. In Linux link state > is reported as NOCARRIER (IFF_RUNNING). In DPDK it is reported in > via the link info get. > > The device visible to the kernel is the accelerated networking > (Mellanox) > device and is not related directly to the netvsc device. > > To test link up/down is not easy on Azure. You would have to use > Azure CLI > to disconnect the NIC from VM. On native Hyper-V you can test by > setting up a virtual switch with an external network device; then > unplug the network device. > > I see. Thanks for the explanation. In this case this does work as expected. Tested-by: Mohammed Gamal
Re: [dpdk-dev] [PATCH v7 9/9] eal: add minimum viable code to support parsing
On 2/6/2020 1:26 AM, Thomas Monjalon wrote: As discussed in community meeting, the goal was to have core parsing in Windows EAL 20.02. Given that there is a crash and a doubt on the imported getopt library, I think it's better to postpone this series to 20.05. Thomas... We have fixed the crash in the v8 series of this patch - Pallavi sent it out for review last night. (It was a simple fix) Can we please get this merged into 20.02? It would be great to progress Windows support in this release. (The initial Windows support was merged in 19.02, so it's been a year since we moved it forward! :-)) thanks, Ranjit
[dpdk-dev] [PATCH v2 0/3] AF_XDP PMD Fixes
This series introduces some fixes for the zero copy path of the AF_XDP. In zero copy, the mempool objects are mapped directly into the AF_XDP UMEM. Below depicts the layout of an object in a mempool. +-++--+--+-+-+ | mp | struct | mbuf | mbuf | XDP | | | hdr | rte_ | priv | hr | hr | payload | | obj | mbuf | | | | | +-++--+--+-+-+ 64 128 * 128 256* < frame size > < frame hr -> 1: net/af_xdp: fix umem frame size & headroom calculations * The previous frame size calculation incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size. Instead of performing a manual calculation, use the rte_mempool_calc_obj_size API to determine the frame size. * The previous frame headroom calculation also incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size or the mbuf priv size. 2. net/af_xdp: use correct fill queue addresses The fill queue addresses should start at the beginning of the mempool object instead of the beginning of the mbuf. This is because the umem frame headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP doesn't write past the available room in the frame, in the case of larger packets which are close to the size of the mbuf. 3. net/af_xdp: fix maximum MTU value The maximum MTU for af_xdp zero copy is equal to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame hr = 320). The patch updates this value to reflect this, and removes some unneeded constants for both zero-copy and copy mode. v2: * Include mbuf priv size in rx mbuf data_off calculation Ciara Loftus (3): net/af_xdp: fix umem frame size & headroom calculations net/af_xdp: use correct fill queue addresses net/af_xdp: fix maximum MTU value drivers/net/af_xdp/rte_eth_af_xdp.c | 60 + 1 file changed, 35 insertions(+), 25 deletions(-) -- 2.17.1
[dpdk-dev] [PATCH v2 1/3] net/af_xdp: fix umem frame size & headroom calculations
The previous frame size calculation incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size. Instead of performing a manual calculation, use the rte_mempool_calc_obj_size API to determine the frame size. The previous frame headroom calculation also incorrectly used mb_pool->private_data_size and didn't include mb_pool->header_size or the mbuf priv size. Fix this. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- drivers/net/af_xdp/rte_eth_af_xdp.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 683e2a559..8b189119c 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -755,11 +756,13 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals __rte_unused, void *base_addr = NULL; struct rte_mempool *mb_pool = rxq->mb_pool; - usr_config.frame_size = rte_pktmbuf_data_room_size(mb_pool) + - ETH_AF_XDP_MBUF_OVERHEAD + - mb_pool->private_data_size; - usr_config.frame_headroom = ETH_AF_XDP_DATA_HEADROOM + - mb_pool->private_data_size; + usr_config.frame_size = rte_mempool_calc_obj_size(mb_pool->elt_size, + mb_pool->flags, + NULL); + usr_config.frame_headroom = mb_pool->header_size + + sizeof(struct rte_mbuf) + + rte_pktmbuf_priv_size(mb_pool) + + RTE_PKTMBUF_HEADROOM; umem = rte_zmalloc_socket("umem", sizeof(*umem), 0, rte_socket_id()); if (umem == NULL) { -- 2.17.1
[dpdk-dev] [PATCH v2 3/3] net/af_xdp: fix maximum MTU value
The maximum MTU for af_xdp zero copy is equal to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256) and DPDK (frame headroom = 320). The patch updates this value to reflect this. This change also makes it possible to remove unneeded constants for both zero-copy and copy mode. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- drivers/net/af_xdp/rte_eth_af_xdp.c | 23 +++ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 1e98cd44f..75f037c3e 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -59,13 +59,6 @@ static int af_xdp_logtype; #define ETH_AF_XDP_FRAME_SIZE 2048 #define ETH_AF_XDP_NUM_BUFFERS 4096 -#ifdef XDP_UMEM_UNALIGNED_CHUNK_FLAG -#define ETH_AF_XDP_MBUF_OVERHEAD 128 /* sizeof(struct rte_mbuf) */ -#define ETH_AF_XDP_DATA_HEADROOM \ - (ETH_AF_XDP_MBUF_OVERHEAD + RTE_PKTMBUF_HEADROOM) -#else -#define ETH_AF_XDP_DATA_HEADROOM 0 -#endif #define ETH_AF_XDP_DFLT_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS #define ETH_AF_XDP_DFLT_START_QUEUE_IDX0 #define ETH_AF_XDP_DFLT_QUEUE_COUNT1 @@ -602,7 +595,14 @@ eth_dev_info(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) dev_info->max_tx_queues = internals->queue_cnt; dev_info->min_mtu = RTE_ETHER_MIN_MTU; - dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; +#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG) + dev_info->max_mtu = getpagesize() - + sizeof(struct rte_mempool_objhdr) - + sizeof(struct rte_mbuf) - + RTE_PKTMBUF_HEADROOM - XDP_PACKET_HEADROOM; +#else + dev_info->max_mtu = ETH_AF_XDP_FRAME_SIZE; +#endif dev_info->default_rxportconf.nb_queues = 1; dev_info->default_txportconf.nb_queues = 1; @@ -804,7 +804,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, .fill_size = ETH_AF_XDP_DFLT_NUM_DESCS, .comp_size = ETH_AF_XDP_DFLT_NUM_DESCS, .frame_size = ETH_AF_XDP_FRAME_SIZE, - .frame_headroom = ETH_AF_XDP_DATA_HEADROOM }; + .frame_headroom = 0 }; char ring_name[RTE_RING_NAMESIZE]; char mz_name[RTE_MEMZONE_NAMESIZE]; int ret; @@ -829,8 +829,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals, for (i = 0; i < ETH_AF_XDP_NUM_BUFFERS; i++) rte_ring_enqueue(umem->buf_ring, -(void *)(i * ETH_AF_XDP_FRAME_SIZE + - ETH_AF_XDP_DATA_HEADROOM)); +(void *)(i * ETH_AF_XDP_FRAME_SIZE)); snprintf(mz_name, sizeof(mz_name), "af_xdp_umem_%s_%u", internals->if_name, rxq->xsk_queue_idx); @@ -939,7 +938,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, /* Now get the space available for data in the mbuf */ buf_size = rte_pktmbuf_data_room_size(mb_pool) - RTE_PKTMBUF_HEADROOM; - data_size = ETH_AF_XDP_FRAME_SIZE - ETH_AF_XDP_DATA_HEADROOM; + data_size = ETH_AF_XDP_FRAME_SIZE; if (data_size > buf_size) { AF_XDP_LOG(ERR, "%s: %d bytes will not fit in mbuf (%d bytes)\n", -- 2.17.1
[dpdk-dev] [PATCH v2 2/3] net/af_xdp: use correct fill queue addresses
The fill queue addresses should start at the beginning of the mempool object instead of the beginning of the mbuf. This is because the umem frame headroom includes the mp hdrobj size. Starting at this point ensures AF_XDP doesn't write past the available room in the frame, in the case of larger packets which are close to the size of the mbuf. Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks") Cc: sta...@dpdk.org Signed-off-by: Ciara Loftus --- v2: * Include mbuf priv size in rx mbuf data_off calculation Ciara Loftus (3): net/af_xdp: fix umem frame size & headroom calculations net/af_xdp: use correct fill queue addresses net/af_xdp: fix maximum MTU value drivers/net/af_xdp/rte_eth_af_xdp.c | 60 + 1 file changed, 35 insertions(+), 25 deletions(-) -- 2.17.1 drivers/net/af_xdp/rte_eth_af_xdp.c | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 8b189119c..1e98cd44f 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -172,7 +172,8 @@ reserve_fill_queue_zc(struct xsk_umem_info *umem, uint16_t reserve_size, uint64_t addr; fq_addr = xsk_ring_prod__fill_addr(fq, idx++); - addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer; + addr = (uint64_t)bufs[i] - (uint64_t)umem->buffer - + umem->mb_pool->header_size; *fq_addr = addr; } @@ -271,8 +272,11 @@ af_xdp_rx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) addr = xsk_umem__extract_addr(addr); bufs[i] = (struct rte_mbuf *) - xsk_umem__get_data(umem->buffer, addr); - bufs[i]->data_off = offset - sizeof(struct rte_mbuf); + xsk_umem__get_data(umem->buffer, addr + + umem->mb_pool->header_size); + bufs[i]->data_off = offset - sizeof(struct rte_mbuf) - + rte_pktmbuf_priv_size(umem->mb_pool) - + umem->mb_pool->header_size; rte_pktmbuf_pkt_len(bufs[i]) = len; rte_pktmbuf_data_len(bufs[i]) = len; @@ -385,7 +389,8 @@ pull_umem_cq(struct xsk_umem_info *umem, int size) #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG) addr = xsk_umem__extract_addr(addr); rte_pktmbuf_free((struct rte_mbuf *) - xsk_umem__get_data(umem->buffer, addr)); + xsk_umem__get_data(umem->buffer, + addr + umem->mb_pool->header_size)); #else rte_ring_enqueue(umem->buf_ring, (void *)addr); #endif @@ -443,9 +448,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) } desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx); desc->len = mbuf->pkt_len; - addr = (uint64_t)mbuf - (uint64_t)umem->buffer; + addr = (uint64_t)mbuf - (uint64_t)umem->buffer - + umem->mb_pool->header_size; offset = rte_pktmbuf_mtod(mbuf, uint64_t) - - (uint64_t)mbuf; + (uint64_t)mbuf + + umem->mb_pool->header_size; offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT; desc->addr = addr | offset; count++; @@ -466,9 +473,11 @@ af_xdp_tx_zc(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) desc = xsk_ring_prod__tx_desc(&txq->tx, idx_tx); desc->len = mbuf->pkt_len; - addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer; + addr = (uint64_t)local_mbuf - (uint64_t)umem->buffer - + umem->mb_pool->header_size; offset = rte_pktmbuf_mtod(local_mbuf, uint64_t) - - (uint64_t)local_mbuf; + (uint64_t)local_mbuf + + umem->mb_pool->header_size; pkt = xsk_umem__get_data(umem->buffer, addr + offset); offset = offset << XSK_UNALIGNED_BUF_OFFSET_SHIFT; desc->addr = addr | offset; -- 2.17.1
Re: [dpdk-dev] [PATCH] test/crypto: add cpu crypto mode tests
> > This patch adds ability to run unit tests in cpu crypto mode for AESNI > GCM cryptodev. > > Signed-off-by: Marcin Smoczynski > --- Tested-by: Konstantin Ananyev Acked-by: Konstantin Ananyev > 2.17.1
Re: [dpdk-dev] [PATCH v7 9/9] eal: add minimum viable code to support parsing
On 2/7/2020 8:46 AM, Ranjit Menon wrote: On 2/6/2020 1:26 AM, Thomas Monjalon wrote: > As discussed in community meeting, the goal was to have core parsing > in Windows EAL 20.02. > Given that there is a crash and a doubt on the imported getopt library, > I think it's better to postpone this series to 20.05. > > Thomas... We have fixed the crash in the v8 series of this patch - Pallavi sent it out for review last night. (It was a simple fix) Can we please get this merged into 20.02? It would be great to progress Windows support in this release. (The initial Windows support was merged in 19.02, so it's been a year since we moved it forward! :-)) thanks, Ranjit Sorry, I believe the initial patches were merged in 19.05, still a long time! @Dmitry: If you could do some rudimentary tests and ACK the v8 series of this patch, it would be very helpful. thanks, Ranjit
Re: [dpdk-dev] [PATCH v2] devtools: add new SPDX license compliance checker
On Wed, 29 Jan 2020 07:59:07 -0800 Stephen Hemminger wrote: > Simple script to look for drivers and scripts that > are missing requires SPDX header. > > Signed-off-by: Stephen Hemminger Ping. This should have been merged by now.
Re: [dpdk-dev] [PATCH v2 0/7] MinGW-w64 support
Hi Dmitry, Thanks for your reply. On Fri, Feb 7, 2020 at 2:24 AM Dmitry Kozliuk wrote: > > Hi William, > >> I applied your v2 patch and I did a native build on windows 10. >> Hit an error showing >> ../lib/librte_eal/windows/eal/eal_lcore.c:54:2: error: 'for' loop >> initial declarations are only allowed in C99 mode > > > Thanks, will fix in v3. > >> However the output looks weird: >> C:\dpdk\build\examples>dpdk-helloworld.exe >> EAL: Detected 2 lcore(s) >> EAL: Detected 1 NUMA nodes >> hehello fllo frorom cm core 1 >> ore 0 > > > It looks like your stdout is unbuffered (default is line-buffered). What > terminal are you using (cmd, Power Shell, Terminal App, conemu, etc)? > I'm using the "Command Prompt" from windows 10 >> C compiler for the host machine: cc (gcc 4.8.3 "cc (GCC) 4.8.3") > > > GCC 4.8.3 is quite outdated, MinGW-w64 ships GCC 8 nowadays. Do we need to > support it for Windows (I doubt MinGW-w64 does)? > Oh, I download a pretty old version (mingw-w64 3.3.0). Let me update. BTW, I also tested cross-compile using my Ubuntu Box and everything works! William
Re: [dpdk-dev] SPDX license nag
On Mon, 27 Jan 2020 11:15:44 + "Mcnamara, John" wrote: > > -Original Message- > > From: dev On Behalf Of Stephen Hemminger > > Sent: Wednesday, January 22, 2020 4:19 PM > > To: dev@dpdk.org > > Subject: [dpdk-dev] SPDX license nag > > > > Files without SPDX License > > -- > > app/test-pmd/flowgen.c - Tilera/Mellanox > > app/test-pmd/macswap.c - Tilera/Mellanox > > app/test/test_compressdev_test_buffer.h - Intel > > app/test/test_timer_racecond.c - Akamai > > devtools/cocci.sh- EZchip/Mellanox > > devtools/load-devel-config - Canonical > > examples/ipsec-secgw/test/trs_aesgcm_inline_crypto_fallback_defs.sh - Intel > > examples/ipsec-secgw/test/tun_aesgcm_inline_crypto_fallback_defs.sh - Intel > > examples/performance-thread/l3fwd-thread/test.sh - Intel > > lib/librte_ethdev/rte_ethdev_pci.h - Brocade/Jan > > Blunck/Broadcom > > lib/librte_ethdev/rte_ethdev_vdev.h - Brocade/Jan > > Blunck/Broadcom > > > > Some patches for these are outstanding but haven't been merged > > > I've looked at the headers/logs for these files and added the owners (or > potential owners after mergers) above. > > Can everyone on the CC list please take care of these for your company/files. > It is a simple patch to fix and Stephen has already provided some. > > Thanks Stephen for keeping the focus on this. > > John > 20.02-rc2 SPDX license nag. Files without SPDX License -- app/test-pmd/flowgen.c app/test-pmd/macswap.c app/test/test_compressdev_test_buffer.h app/test/test_timer_racecond.c devtools/cocci.sh devtools/load-devel-config examples/ipsec-secgw/test/trs_aesgcm_inline_crypto_fallback_defs.sh examples/ipsec-secgw/test/tun_aesgcm_inline_crypto_fallback_defs.sh examples/performance-thread/l3fwd-thread/test.sh lib/librte_ethdev/rte_ethdev_pci.h lib/librte_ethdev/rte_ethdev_vdev.h Files with additional license text -- app/test-pmd/flowgen.c app/test-pmd/macswap.c app/test/test_timer_racecond.c devtools/cocci.sh lib/librte_ethdev/rte_ethdev_pci.h lib/librte_ethdev/rte_ethdev_vdev.h
[dpdk-dev] [PATCH v3] devtools: add new SPDX license compliance checker
Simple script to look for drivers and scripts that are missing requires SPDX header. Signed-off-by: Stephen Hemminger --- v3 - pickup more places with boilerplate text avoid false positive for cocci scripts or abignore devtools/spdx-check.sh | 24 1 file changed, 24 insertions(+) create mode 100755 devtools/spdx-check.sh diff --git a/devtools/spdx-check.sh b/devtools/spdx-check.sh new file mode 100755 index ..a5be10b26f44 --- /dev/null +++ b/devtools/spdx-check.sh @@ -0,0 +1,24 @@ +#! /bin/sh +# SPDX-License-Identifier: BSD-3-Clause +# Copyright (c) 2019 Microsoft Corporation +# +# Produce a list of files with incorrect license +# information + +echo "Files without SPDX License" +echo "--" + +git grep -L SPDX-License-Identifier -- \ +':^.git*' ':^.ci/*' ':^.travis.yml' \ +':^README' ':^MAINTAINERS' ':^VERSION' ':^ABI_VERSION' \ +':^*/Kbuild' ':^*/README' \ +':^license/' ':^doc/' ':^config/' ':^buildtools/' \ +':^*.cocci' ':^*.abignore' \ +':^*.def' ':^*.map' ':^*.ini' ':^*.data' ':^*.cfg' ':^*.txt' + +echo +echo "Files with additional license text" +echo "--" + +git grep -l Redistribution -- \ +':^license/' ':^/devtools/spdx-check.sh' -- 2.20.1
[dpdk-dev] [PATCH v2] net/netvsc: initialize link state
If application is using link state interrupt, the correct link state needs to be filled in when device is started. This is similar to how virtio updates link information. Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device") Cc: sta...@dpdk.org Reported-by: Mohammed Gamal Tested-by: Mohammed Gamal Signed-off-by: Stephen Hemminger --- Putting on same email thread as original submission v2 - new patch that does initialization at start added tested-by and fixes tag drivers/net/netvsc/hn_ethdev.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c index c79f924379fe..564620748daf 100644 --- a/drivers/net/netvsc/hn_ethdev.c +++ b/drivers/net/netvsc/hn_ethdev.c @@ -823,6 +823,10 @@ hn_dev_start(struct rte_eth_dev *dev) if (error) hn_rndis_set_rxfilter(hv, 0); + /* Initialize Link state */ + if (error == 0) + hn_dev_link_update(dev, 0); + return error; } -- 2.20.1
Re: [dpdk-dev] [PATCH 00/14] cleanup resources on shutdown
On Thu, 6 Feb 2020 15:06:56 +0100 David Marchand wrote: > On Sat, Jan 4, 2020 at 2:34 AM Stephen Hemminger > wrote: > > > > Recently started using valgrind with DPDK, and the results > > are not clean. > > > > The DPDK has a function that applications can use to tell it > > to cleanup resources on shutdown (rte_eal_cleanup). But the > > current coverage of that API is spotty. Many internal parts of > > DPDK leave files and allocated memory behind. > > > > This patch set is a start at getting the sub-parts of > > DPDK to cleanup after themselves. These are the easier ones, > > the harder and more critical ones are in the drivers > > and the memory subsystem. > > I am too short on time to check and integrate this big series in rc2, > and from now it would be too risky to take in 20.02. > Can you respin it for 20.05 with FreeBSD fixes too? OK, but if this kind of patch can't be reviewed then the DPDK process is still broken. I don't see how FreeBSD matters here. It can be leaky but that is ok. I split it out to get review, then you complain it is too big :-(
Re: [dpdk-dev] [PATCH v8 0/9] Windows patchset with additional EAL functionalities
> This patchset includes additional functionalities for Windows EAL > to support command-line parsing feature and some EAL common code > on Windows. > > This patchset can be applied to windpdk-next-dev branch in the draft repo. > > v8 changes: > Fixed the naming conventions. > Fixed a crash encountered due to getopt function. > Removed "--syslog" from help options for Windows. > Observing no issues on Windows 10, Meson 0.53.999, Clang 9.0.1. Reviewed-by: Dmitry Kozlyuk -- Dmitry Kozlyuk
Re: [dpdk-dev] [PATCH 4/6] build: MinGW-w64 support for Meson
> On Wed, Feb 05, 2020 at 11:41:05PM +0300, Dmitry Kozlyuk wrote: > > > > > > > +if is_windows > > > > > > > + # Require platform SDK for Windows 7 and above. > > > > > > > + add_project_arguments('-D_WIN32_WINNT=0x0601', language: 'c') > > > > > > > > > > > > > > > > > > > Please explain. Why Windows 7 is needed? What this define is doing? > > > > > > > > > > > > > > > > Yes, Windows 7 and above is need for already existing code in > > > > > eal_lcore.c, > > > > > specifically for GetLogicalProcessorInformation() call. > > > > > > > > > > When including , one must define minimum API version the > > > > > application is compiled against [0]. MSVC and Clang default to the > > > > > version of > > > > > platform SDK (that is, maximum supported). MinGW defaults to Windows > > > > > XP, so > > > > > this definition must be either in before #include > > > > > or > > > > > here. Because other files may include , I'd prefer to have > > > > > a > > > > > global definition via compiler command-line. > > > > > > > > > > [0]: > > > > > https://docs.microsoft.com/en-us/windows/win32/WinProg/using-the-windows-headers > > > > > > > > > > > > > OK, thanks. > > > > Please reword the comment with something like > > > > "Minimum supported API is Windows 7." > > > > > > > For this, as an alternative to putting it as a project argument, you can > > > just > > > add it to dpdk_conf which means it will end up as a define in the global > > > rte_build_config.h and so be directly included in each compilation unit > > > ahead of any other headers. (rte_config.h includes rte_build_config.h) > > > > Can you please explain why using dpdk_conf is a better alternative? In > > lib/meson.build I can see add_project_arguments('_D_GNU_SOURCE', ...), which > > serves a similar purpose on POSIX systems. Compiler option also makes it > > impossible to forget or redefine this constant in code by mistake. > > > I'm not necessarily saying it's better, it's just an alternative to > consider. :-) Having it in rte_config.h makes the define available to any > external apps using DPDK, which may or may not be desirable. It's certainly *not* desirable, apps usually have this value consciously and explicitly defined on their own, and it may differ from the one required to compile DPDK. Thanks for the tip, but as I hope you can see, it should stay a project argument. -- Dmitry Kozlyuk
Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions
On Fri, 7 Feb 2020 19:48:17 +0530 Jerin Jacob wrote: > On Fri, Feb 7, 2020 at 6:08 PM Coyle, David wrote: > > > > Hi Jerin, see below > > Hi David, > > > > > > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David > > > wrote: > > > > > > > > > > There is a risk in drafting API that meant for HW without any HW exists. > > > Because there could be inefficiency on the metadata and fast path API for > > > both models. > > > For example, In the case of CPU based scheme, it will be pure overhead > > > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction > > > where CPU works better in the synchronous model and I have doubt that the > > > session-based scheme will work for HW or not as both difference HW needs > > > to work hand in hand(IOMMU aspects for two PCI device) > > > > [DC] I understand what you are saying about the overhead of emulating the > > "sw queue" but this same model is already used in many of the existing > > device PMDs. > > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for > > compression, the enqueue/dequeue in the PMD is emulated through an rte_ring > > which is very efficient. > > The accelerator API will use the existing device PMDs so keeping the same > > model seems like a sensible approach. > > In this release, we added CPU crypto support in cryptodev to support > the synchronous model to fix the overhead. > > > > > From an application's point of view, this abstraction of the underlying > > device type is important for usability and maintainability - the > > application doesn't need to know > > the device type as such and therefore doesn't need to make different API > > calls. > > > > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW > > doesn't support these xform chains at the moment, it could potentially do > > so in the future. > > As a side note, as part of the work of adding the accelerator API, the QAT > > PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform > > chain, where the Crypto > > is done on QAT HW and the CRC will be done in SW, most likely through a > > call to the optimized rte_net_crc library. This will give a consistent API > > for the DOCSIS-MAC data-plane > > pipeline prototype we have developed, which uses both AESNI-MB and QAT for > > benchmarks. > > > > We will take your feedback on the enqueue/dequeue approach for SW devices > > into consideration though during development. > > > > Finally, I'm unsure what you mean by this line: > > > > "I have doubt that the session-based scheme will work for HW or not > > as both difference HW needs to work hand in hand(IOMMU aspects for two PCI > > device)" > > > > What do mean by different HW working "hand in hand" and "two PCI device"? > > The intention is that 1 HW device (or it's PMD) would have to support the > > accel xform chain > > I was thinking, it will be N PCIe devices that create the chain. Each > distinct PCI device does the fixed-function and chains them together. > > I do understand the usage of QAT HW and CRC in SW. > So If I understand it correctly, in rte_security, we are combining > rte_ethdev and rte_cryptodev. With this spec, we are trying to > combine, > rte_cryptodev and rte_compressdev. So it looks good to me. My only > remaining concern is the name of this API, accelerator too generic > name. IMO, like rte_security, we may need to give more meaningful name > for the use case where crytodev and compressdev can work together. Having an API that could be used by parallel hardware does make sense, but the DPDK already has multiple packet processing infrastructure pieces. I would rather the DPDK converge on one widely used, robust and tested packet method. Rather than the current "choose your poison or roll your own" which is what we have now. The proposed graph seems to be the best so far.
[dpdk-dev] BUG BPF examples broken
The BPF examples do not compile on Debian. The issue is that ret_mbuf_core.h includes rte_atomic which include rte_common which pulls in errno.h and bits/errno.h. And then asm/errno.h /usr/include/linux/errno.h:1:10: fatal error: 'asm/errno.h' file not found #include ^ 1 error generated. asm/errno.h is architecture specific and BPF is an architecture.
[dpdk-dev] [PATCH] event/dpaa2: set number of order sequeuences
This patch sets the number of atomic ordered sequeuences supported by the driver Fixes: dbf63bd43afa ("event/dpaa2: support ordered queue") Cc: sta...@dpdk.org Signed-off-by: Nipun Gupta --- drivers/event/dpaa2/dpaa2_eventdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/event/dpaa2/dpaa2_eventdev.c b/drivers/event/dpaa2/dpaa2_eventdev.c index d71361666..1833d659d 100644 --- a/drivers/event/dpaa2/dpaa2_eventdev.c +++ b/drivers/event/dpaa2/dpaa2_eventdev.c @@ -479,6 +479,8 @@ dpaa2_eventdev_queue_def_conf(struct rte_eventdev *dev, uint8_t queue_id, RTE_SET_USED(queue_id); queue_conf->nb_atomic_flows = DPAA2_EVENT_QUEUE_ATOMIC_FLOWS; + queue_conf->nb_atomic_order_sequences = + DPAA2_EVENT_QUEUE_ORDER_SEQUENCES; queue_conf->schedule_type = RTE_SCHED_TYPE_PARALLEL; queue_conf->priority = RTE_EVENT_DEV_PRIORITY_NORMAL; } -- 2.17.1
Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions
On Sat, Feb 8, 2020 at 2:04 AM Stephen Hemminger wrote: > > On Fri, 7 Feb 2020 19:48:17 +0530 > Jerin Jacob wrote: > > > On Fri, Feb 7, 2020 at 6:08 PM Coyle, David wrote: > > > > > > Hi Jerin, see below > > > > Hi David, > > > > > > > > > > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David > > > > wrote: > > > > > > > > > > > > > > There is a risk in drafting API that meant for HW without any HW exists. > > > > Because there could be inefficiency on the metadata and fast path API > > > > for > > > > both models. > > > > For example, In the case of CPU based scheme, it will be pure overhead > > > > emulate the "queue"(the enqueue and dequeue) for the sake of abstraction > > > > where CPU works better in the synchronous model and I have doubt that > > > > the > > > > session-based scheme will work for HW or not as both difference HW > > > > needs > > > > to work hand in hand(IOMMU aspects for two PCI device) > > > > > > [DC] I understand what you are saying about the overhead of emulating the > > > "sw queue" but this same model is already used in many of the existing > > > device PMDs. > > > In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib > > > for compression, the enqueue/dequeue in the PMD is emulated through an > > > rte_ring which is very efficient. > > > The accelerator API will use the existing device PMDs so keeping the same > > > model seems like a sensible approach. > > > > In this release, we added CPU crypto support in cryptodev to support > > the synchronous model to fix the overhead. > > > > > > > > From an application's point of view, this abstraction of the underlying > > > device type is important for usability and maintainability - the > > > application doesn't need to know > > > the device type as such and therefore doesn't need to make different API > > > calls. > > > > > > The enqueue/dequeue type API was also used with QAT in mind. While QAT HW > > > doesn't support these xform chains at the moment, it could potentially do > > > so in the future. > > > As a side note, as part of the work of adding the accelerator API, the > > > QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator > > > xform chain, where the Crypto > > > is done on QAT HW and the CRC will be done in SW, most likely through a > > > call to the optimized rte_net_crc library. This will give a consistent > > > API for the DOCSIS-MAC data-plane > > > pipeline prototype we have developed, which uses both AESNI-MB and QAT > > > for benchmarks. > > > > > > We will take your feedback on the enqueue/dequeue approach for SW devices > > > into consideration though during development. > > > > > > Finally, I'm unsure what you mean by this line: > > > > > > "I have doubt that the session-based scheme will work for HW or > > > not as both difference HW needs to work hand in hand(IOMMU aspects for > > > two PCI device)" > > > > > > What do mean by different HW working "hand in hand" and "two PCI device"? > > > The intention is that 1 HW device (or it's PMD) would have to support the > > > accel xform chain > > > > I was thinking, it will be N PCIe devices that create the chain. Each > > distinct PCI device does the fixed-function and chains them together. > > > > I do understand the usage of QAT HW and CRC in SW. > > So If I understand it correctly, in rte_security, we are combining > > rte_ethdev and rte_cryptodev. With this spec, we are trying to > > combine, > > rte_cryptodev and rte_compressdev. So it looks good to me. My only > > remaining concern is the name of this API, accelerator too generic > > name. IMO, like rte_security, we may need to give more meaningful name > > for the use case where crytodev and compressdev can work together. > > Having an API that could be used by parallel hardware does make sense, > but the DPDK already has multiple packet processing infrastructure pieces. > > I would rather the DPDK converge on one widely used, robust and tested packet > method. Rather than the current "choose your poison or roll your own" which is > what we have now. The proposed graph seems to be the best so far. I agree. Even I thought of saying graph can do this, as, it has higher abstraction and runtime chaining support, but then I thought it will be self markering. David could you check https://www.mail-archive.com/dev@dpdk.org/msg156318.html If this one only focusing crypto dev + compressdev, What if we have ethdev + compressdev + security device in the future. graph has higher abstraction so it can accommodate ANY chaining requirements. i.e AESNI-MB + QAT will go as a separate node