[dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev
Support for security operations is planned to be added in ethdev and cryptodev for the 17.11 release. For this following changes are required. - rte_cryptodev and rte_eth_dev structures need to be added new parameter rte_security_ops which extend support for security ops to the corresponding driver. - rte_cryptodev_info and rte_eth_dev_info need to be added with rte_security_capabilities to identify the capabilities of the corresponding driver. - rte_security_session needs to be added in the union inside structure rte_crypto_sym_op to decide between various types of sessions Signed-off-by: Akhil Goyal Acked-by: Hemant Agrawal Acked-by: Boris Pismenny Acked-by: Shahaf Shuler Acked-by: Pablo de Lara --- changes in v2: Added one more ABI change wrt to security session, This patch is not split into 3 patches as these all will be implemented simultaneously with rte_security support. doc/guides/rel_notes/deprecation.rst | 15 +++ 1 file changed, 15 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index 72d1f35..e34a809 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -78,3 +78,18 @@ Deprecation Notices be removed in 17.11: - ``rte_cryptodev_create_vdev`` + +* cryptodev: new parameters - ``rte_security_capabilities`` and + ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and + ``rte_cryptodev`` respectively to support security protocol offloaded + operations. + +* cryptodev: new parameter ``rte_security_session`` will be added in the union + of the structure ``rte_crypto_sym_op``, so that the user can choose either to + use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or + ``rte_security_session``. + +* ethdev: new parameters - ``rte_security_capabilities`` and + ``rte_security_ops`` will be added to ``rte_eth_dev_info`` and + ``rte_eth_dev`` respectively to support security operations like + ipsec inline. -- 2.9.3
[dpdk-dev] Why IVSHMEM was removed since 16.11 ?
The release notes of dpdk-16.11 had shown that IVSHMEM was removed due to some design issues. So, what are these issues? Thanks!
Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
On 08/08/17 01:11, Jonas Pfefferle wrote: > DMA window size needs to be big enough to span all memory segment's > physical addresses. We do not need multiple levels of IOMMU tables > as we already span ~70TB of physical memory with 16MB hugepages. > > Signed-off-by: Jonas Pfefferle > --- > lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++--- > 1 file changed, 22 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > index 946df7e..8502216 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd) > return 0; > } > > +static uint64_t > +roundup_next_pow2(uint64_t n) > +{ > + uint32_t i; > + > + n--; > + for (i = 1; i < sizeof(n) * CHAR_BIT; i += i) > + n |= n >> i; > + > + return ++n; > +} > + wow :) QEMU does it using __builtin_ctzll() (used below for the page_shift) without a loop: https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host-utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382 Anyway, seems working. Reviewed-by: Alexey Kardashevskiy > static int > vfio_spapr_dma_map(int vfio_container_fd) > { > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > - /* calculate window size based on number of hugepages configured */ > - create.window_size = rte_eal_get_physmem_size(); > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */ > + /* create DMA window from 0 to max(phys_addr + len) */ > + /* sPAPR requires window size to be a power of 2 */ > + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len); > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > - create.levels = 2; > + create.levels = 1; > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); > if (ret) { > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > + if (create.start_addr != 0) { > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > + return -1; > + } > + > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > struct vfio_iommu_type1_dma_map dma_map; > -- Alexey
Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
Alexey Kardashevskiy wrote on 08/08/2017 09:38:00 AM: > From: Alexey Kardashevskiy > To: Jonas Pfefferle , anatoly.bura...@intel.com > Cc: dev@dpdk.org > Date: 08/08/2017 09:38 AM > Subject: Re: [PATCH] vfio: fix sPAPR IOMMU DMA window size > > On 08/08/17 01:11, Jonas Pfefferle wrote: > > DMA window size needs to be big enough to span all memory segment's > > physical addresses. We do not need multiple levels of IOMMU tables > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > Signed-off-by: Jonas Pfefferle > > --- > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++--- > > 1 file changed, 22 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/ > librte_eal/linuxapp/eal/eal_vfio.c > > index 946df7e..8502216 100644 > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd) > > return 0; > > } > > > > +static uint64_t > > +roundup_next_pow2(uint64_t n) > > +{ > > + uint32_t i; > > + > > + n--; > > + for (i = 1; i < sizeof(n) * CHAR_BIT; i += i) > > + n |= n >> i; > > + > > + return ++n; > > +} > > + > > wow :) > > QEMU does it using __builtin_ctzll() (used below for the page_shift) > without a loop: > > https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/ > host- > utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382 > > > Anyway, seems working. Ok let me fix that :) > > > Reviewed-by: Alexey Kardashevskiy > > > > > > static int > > vfio_spapr_dma_map(int vfio_container_fd) > > { > > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > >return -1; > > } > > > > - /* calculate window size based on number of hugepages configured */ > > - create.window_size = rte_eal_get_physmem_size(); > > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */ > > + /* create DMA window from 0 to max(phys_addr + len) */ > > + /* sPAPR requires window size to be a power of 2 */ > > + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms [0].len); > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > - create.levels = 2; > > + create.levels = 1; > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); > > if (ret) { > > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > >return -1; > > } > > > > + if (create.start_addr != 0) { > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > + return -1; > > + } > > + > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > >struct vfio_iommu_type1_dma_map dma_map; > > > > > -- > Alexey >
Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev
Hi Akhil, > -Original Message- > From: Akhil Goyal [mailto:akhil.go...@nxp.com] > Sent: Tuesday, August 8, 2017 8:10 AM > To: dev@dpdk.org; Doherty, Declan ; > tho...@monjalon.net; Nicolau, Radu ; > avia...@mellanox.com; bor...@mellanox.com; > hemant.agra...@nxp.com; De Lara Guarch, Pablo > > Cc: shah...@mellanox.com; Akhil Goyal > Subject: [PATCH v2] doc: announce ABI change for cryptodev and ethdev > > Support for security operations is planned to be added in ethdev and > cryptodev for the 17.11 release. > > For this following changes are required. > - rte_cryptodev and rte_eth_dev structures need to be added new > parameter rte_security_ops which extend support for security ops to the > corresponding driver. > - rte_cryptodev_info and rte_eth_dev_info need to be added with > rte_security_capabilities to identify the capabilities of the corresponding > driver. > - rte_security_session needs to be added in the union inside structure > rte_crypto_sym_op to decide between various types of sessions > > Signed-off-by: Akhil Goyal > Acked-by: Hemant Agrawal > Acked-by: Boris Pismenny > Acked-by: Shahaf Shuler > Acked-by: Pablo de Lara > --- > changes in v2: > Added one more ABI change wrt to security session, This patch is not split > into 3 patches as these all will be implemented simultaneously with > rte_security support. > > doc/guides/rel_notes/deprecation.rst | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/doc/guides/rel_notes/deprecation.rst > b/doc/guides/rel_notes/deprecation.rst > index 72d1f35..e34a809 100644 > --- a/doc/guides/rel_notes/deprecation.rst > +++ b/doc/guides/rel_notes/deprecation.rst > @@ -78,3 +78,18 @@ Deprecation Notices >be removed in 17.11: > >- ``rte_cryptodev_create_vdev`` > + > +* cryptodev: new parameters - ``rte_security_capabilities`` and > + ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and > + ``rte_cryptodev`` respectively to support security protocol offloaded > + operations. > + > +* cryptodev: new parameter ``rte_security_session`` will be added in > +the union > + of the structure ``rte_crypto_sym_op``, so that the user can choose > +either to > + use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or > + ``rte_security_session``. I don't think it is required to have a deprecation notice for this, as it is adding another element in a union, and it is not changing its size. However, this would require an addition in rte_crypto_op_sess_type, which, as long as it is added at the end, should not require a deprecation notice. About splitting this into two patches, I would do it, as I believe deprecation notices should target a specific library.
Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
> -Original Message- > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexey Kardashevskiy > Sent: Tuesday, August 8, 2017 10:38 AM > To: Jonas Pfefferle ; Burakov, Anatoly > > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size > > On 08/08/17 01:11, Jonas Pfefferle wrote: > > DMA window size needs to be big enough to span all memory segment's > > physical addresses. We do not need multiple levels of IOMMU tables > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > Signed-off-by: Jonas Pfefferle > > --- > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++--- > > 1 file changed, 22 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > index 946df7e..8502216 100644 > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd) > > return 0; > > } > > > > +static uint64_t > > +roundup_next_pow2(uint64_t n) > > +{ > > + uint32_t i; > > + > > + n--; > > + for (i = 1; i < sizeof(n) * CHAR_BIT; i += i) > > + n |= n >> i; > > + > > + return ++n; > > +} > > + > > wow :) > > QEMU does it using __builtin_ctzll() (used below for the page_shift) > without a loop: > > https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host- > utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382 > > > Anyway, seems working. As I remember, there already exists rte_align64pow2(). Konstantin > > > Reviewed-by: Alexey Kardashevskiy > > > > > > static int > > vfio_spapr_dma_map(int vfio_container_fd) > > { > > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > > return -1; > > } > > > > - /* calculate window size based on number of hugepages configured */ > > - create.window_size = rte_eal_get_physmem_size(); > > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */ > > + /* create DMA window from 0 to max(phys_addr + len) */ > > + /* sPAPR requires window size to be a power of 2 */ > > + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len); > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > - create.levels = 2; > > + create.levels = 1; > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); > > if (ret) { > > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > > return -1; > > } > > > > + if (create.start_addr != 0) { > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > + return -1; > > + } > > + > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > > struct vfio_iommu_type1_dma_map dma_map; > > > > > -- > Alexey
[dpdk-dev] [PATCH v2] vfio: fix sPAPR IOMMU DMA window size
DMA window size needs to be big enough to span all memory segment's physical addresses. We do not need multiple levels of IOMMU tables as we already span ~70TB of physical memory with 16MB hugepages. Signed-off-by: Jonas Pfefferle --- v2: * roundup to next power 2 function without loop. lib/librte_eal/linuxapp/eal/eal_vfio.c | 42 +++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c index 946df7e..a3f9977 100644 --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c @@ -722,6 +722,35 @@ vfio_type1_dma_map(int vfio_container_fd) return 0; } +static inline int +clz64(uint64_t val) +{ + return val ? __builtin_clzll(val) : 64; +} + +static inline bool +is_power_of_2(uint64_t value) +{ + if (!value) + return false; + + return !(value & (value - 1)); +} + +static inline uint64_t +roundup_next_pow2(uint64_t value) +{ + uint8_t nlz = clz64(value); + + if (is_power_of_2(value)) + return value; + + if (!nlz) + return 0; + + return 1ULL << (64 - nlz); +} + static int vfio_spapr_dma_map(int vfio_container_fd) { @@ -759,10 +788,12 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } - /* calculate window size based on number of hugepages configured */ - create.window_size = rte_eal_get_physmem_size(); + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */ + /* create DMA window from 0 to max(phys_addr + len) */ + /* sPAPR requires window size to be a power of 2 */ + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len); create.page_shift = __builtin_ctzll(ms->hugepage_sz); - create.levels = 2; + create.levels = 1; ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); if (ret) { @@ -771,6 +802,11 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } + if (create.start_addr != 0) { + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); + return -1; + } + /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ for (i = 0; i < RTE_MAX_MEMSEG; i++) { struct vfio_iommu_type1_dma_map dma_map; -- 2.7.4
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
On 8/4/2017 1:56 PM, Bruce Richardson wrote: > On Fri, Aug 04, 2017 at 12:58:01PM +0100, Ferruh Yigit wrote: >> On 8/3/2017 8:53 PM, Thomas Monjalon wrote: >>> 03/08/2017 18:15, Stephen Hemminger: On Thu, 3 Aug 2017 14:21:38 +0100 Bruce Richardson wrote: > On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: >> To control some device-specific features public device-specific functions >> rte_pmd_*.h are used. >> >> But this solution requires applications to distinguish devices at runtime >> and, depending on the device type, call corresponding device-specific >> functions even if functions' parameters are the same. >> >> IOCTL-like API can be added to ethdev instead of public device-specific >> functions to address the following: >> >> * allow more usable support of features across a range of NIC from >> one vendor, but not others >> * allow features to be implemented by multiple NIC drivers without >> relying on a critical mass to get the functionality in ethdev >> * there are a large number of possible device specific functions, and >> creating individual APIs for each one is not a good solution >> * IOCTLs are a proven method for solving this problem in other areas, >> i.e. OS kernels. >> >> Control requests for this API will be globally defined at ethdev level, >> so >> an application will use single API call to control different devices from >> one/multiple vendors. >> >> API call may look like as a classic ioctl with an extra parameter for >> argument length for better sanity checks: >> >> int >> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, >> unsigned arg_length); >> >> Regards, >> Andrey > > I think we need to start putting in IOCTLs for ethdevs, much as I hate > to admit it, since I dislike IOCTLs and other functions with opaque > arguments! Having driver specific functions I don't think will scale > well as each vendor tries to expose as much of their driver specific > functionality as possible. > > One other additional example: I discovered just this week another issue > with driver specific functions and testpmd, when I was working on the > meson build rework. > > * With shared libraries, when we do "ninja install" we want our DPDK > libs moved to e.g. /usr/local/lib, but the drivers moved to a separate > driver folder, so that they can be automatically loaded from that > single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. > * However, testpmd, as well as using the drivers as plugins, uses > driver-specific functions, which means that it explicitly links > against the pmd .so files. > * Those driver .so files are not in with the other libraries, so ld.so > does not find the pmd, and the installed testpmd fails to run due to > missing library dependencies. > * The workaround is to add the drivers path to the ld load path, but we > should not require ld library path changes just to get DPDK apps to > work. > > Using ioctls instead of driver-specific functions would solve this. > > My 2c. My 2c. No. Short answer: Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now despised by Linux kernel developers. They provide an unstructured, unsecured, back door for device driver abuse. Try to get a new driver in Linux with a unique ioctl, and it will be hard to get accepted. Long answer: So far every device specific feature has fit into ethdev model. Doing ioctl is admitting "it is too hard to be general, we need need an out". For something that is a flag, it should fit into existing config model; ignoring silly ABI constraints. For a real feature (think flow direction), we want a first class API for that. For a wart, then devargs will do. Give a good example of something that should be an ioctl. Don't build the API first and then let it get cluttered. >>> >>> I agree with Stephen. >>> >>> And please do not forget that ioctl still requires an API: >>> the argument that you put in ioctl is the API of the feature. >>> So it is the same thing as defining a new function. >> >> I am also not fan of the ioctl usage. I believe it hides APIs behind ids >> and prevent argument check by compiler. >> >> BUT, the number of the increasing PMD specific APIs are also worrying, >> it is becoming harder to maintain, and I believe this is something NOT >> sustainable in long run. >> >> >> What about having *eth_dev_extended_ops* ? >> >> >> As a part of the rte_eth_dev. This can be in the librte_ether library >> but in a separated file. >> >> And the APIs for these ops can be less strict on compatibility, and >> easier to add. >> >> Benefits of having this new dev_ops:
Re: [dpdk-dev] [PATCH v3] doc: announce ethdev API change for detach flag
31/07/2017 11:40, Gaetan Rivet: > The flag RTE_ETH_DEV_DETACHABLE will disappear. > > This flag is not needed anymore following the hotplug work done for > v17.08. It can be removed, its function is now implicitly made available > by the relevant EAL and rte_bus implementations. > > Signed-off-by: Gaetan Rivet > Acked-by: John McNamara > Acked-by: Adrien Mazarguil Acked-by: Thomas Monjalon Applied, thanks
[dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
DMA window size needs to be big enough to span all memory segment's physical addresses. We do not need multiple levels of IOMMU tables as we already span ~70TB of physical memory with 16MB hugepages. Signed-off-by: Jonas Pfefferle --- v2: * roundup to next power 2 function without loop. v3: * Replace roundup_next_pow2 with rte_align64pow2 lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c index 946df7e..550c41c 100644 --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } - /* calculate window size based on number of hugepages configured */ - create.window_size = rte_eal_get_physmem_size(); + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */ + /* create DMA window from 0 to max(phys_addr + len) */ + /* sPAPR requires window size to be a power of 2 */ + create.window_size = rte_align64pow2(ms[0].phys_addr + ms[0].len); create.page_shift = __builtin_ctzll(ms->hugepage_sz); - create.levels = 2; + create.levels = 1; ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); if (ret) { @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } + if (create.start_addr != 0) { + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); + return -1; + } + /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ for (i = 0; i < RTE_MAX_MEMSEG; i++) { struct vfio_iommu_type1_dma_map dma_map; -- 2.7.4
Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev
Hi Pablo, On 8/8/2017 1:38 PM, De Lara Guarch, Pablo wrote: Hi Akhil, -Original Message- From: Akhil Goyal [mailto:akhil.go...@nxp.com] Sent: Tuesday, August 8, 2017 8:10 AM To: dev@dpdk.org; Doherty, Declan ; tho...@monjalon.net; Nicolau, Radu ; avia...@mellanox.com; bor...@mellanox.com; hemant.agra...@nxp.com; De Lara Guarch, Pablo Cc: shah...@mellanox.com; Akhil Goyal Subject: [PATCH v2] doc: announce ABI change for cryptodev and ethdev Support for security operations is planned to be added in ethdev and cryptodev for the 17.11 release. For this following changes are required. - rte_cryptodev and rte_eth_dev structures need to be added new parameter rte_security_ops which extend support for security ops to the corresponding driver. - rte_cryptodev_info and rte_eth_dev_info need to be added with rte_security_capabilities to identify the capabilities of the corresponding driver. - rte_security_session needs to be added in the union inside structure rte_crypto_sym_op to decide between various types of sessions Signed-off-by: Akhil Goyal Acked-by: Hemant Agrawal Acked-by: Boris Pismenny Acked-by: Shahaf Shuler Acked-by: Pablo de Lara --- changes in v2: Added one more ABI change wrt to security session, This patch is not split into 3 patches as these all will be implemented simultaneously with rte_security support. doc/guides/rel_notes/deprecation.rst | 15 +++ 1 file changed, 15 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index 72d1f35..e34a809 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -78,3 +78,18 @@ Deprecation Notices be removed in 17.11: - ``rte_cryptodev_create_vdev`` + +* cryptodev: new parameters - ``rte_security_capabilities`` and + ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and + ``rte_cryptodev`` respectively to support security protocol offloaded + operations. + +* cryptodev: new parameter ``rte_security_session`` will be added in +the union + of the structure ``rte_crypto_sym_op``, so that the user can choose +either to + use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or + ``rte_security_session``. I don't think it is required to have a deprecation notice for this, as it is adding another element in a union, and it is not changing its size. However, this would require an addition in rte_crypto_op_sess_type, which, as long as it is added at the end, should not require a deprecation notice. About splitting this into two patches, I would do it, as I believe deprecation notices should target a specific library. Thanks for the comments. I have marked the v2 as not applicable and marked v1 as in review in the patchwork. -Akhil
Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
"Ananyev, Konstantin" wrote on 08/08/2017 10:27:28 AM: > From: "Ananyev, Konstantin" > To: Alexey Kardashevskiy , Jonas Pfefferle > , "Burakov, Anatoly" > Cc: "dev@dpdk.org" > Date: 08/08/2017 10:27 AM > Subject: RE: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size > > > > > -Original Message- > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexey Kardashevskiy > > Sent: Tuesday, August 8, 2017 10:38 AM > > To: Jonas Pfefferle ; Burakov, Anatoly > > > Cc: dev@dpdk.org > > Subject: Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size > > > > On 08/08/17 01:11, Jonas Pfefferle wrote: > > > DMA window size needs to be big enough to span all memory segment's > > > physical addresses. We do not need multiple levels of IOMMU tables > > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > > > Signed-off-by: Jonas Pfefferle > > > --- > > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 + +--- > > > 1 file changed, 22 insertions(+), 3 deletions(-) > > > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/ > librte_eal/linuxapp/eal/eal_vfio.c > > > index 946df7e..8502216 100644 > > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd) > > > return 0; > > > } > > > > > > +static uint64_t > > > +roundup_next_pow2(uint64_t n) > > > +{ > > > + uint32_t i; > > > + > > > + n--; > > > + for (i = 1; i < sizeof(n) * CHAR_BIT; i += i) > > > + n |= n >> i; > > > + > > > + return ++n; > > > +} > > > + > > > > wow :) > > > > QEMU does it using __builtin_ctzll() (used below for the page_shift) > > without a loop: > > > > https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host- > > > utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382 > > > > > > Anyway, seems working. > > As I remember, there already exists rte_align64pow2(). > Konstantin Thanks. Fixed it. > > > > > > > Reviewed-by: Alexey Kardashevskiy > > > > > > > > > > > static int > > > vfio_spapr_dma_map(int vfio_container_fd) > > > { > > > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > > >return -1; > > > } > > > > > > - /* calculate window size based on number of hugepages configured */ > > > - create.window_size = rte_eal_get_physmem_size(); > > > + /* physicaly pages are sorted descending i.e. ms > [0].phys_addr is max */ > > > + /* create DMA window from 0 to max(phys_addr + len) */ > > > + /* sPAPR requires window size to be a power of 2 */ > > > + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms [0].len); > > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > > - create.levels = 2; > > > + create.levels = 1; > > > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); > > > if (ret) { > > > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > > >return -1; > > > } > > > > > > + if (create.start_addr != 0) { > > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > > + return -1; > > > + } > > > + > > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > > >struct vfio_iommu_type1_dma_map dma_map; > > > > > > > > > -- > > Alexey
Re: [dpdk-dev] using rte_flow via tespmd with Intel X552
On 8/2/2017 2:41 PM, Adrien Mazarguil wrote: > Hi Wenzhuo, > > On Wed, Aug 02, 2017 at 01:00:47PM +, Lu, Wenzhuo wrote: >> Hi, >> >>> -Original Message- >>> From: Zhao1, Wei >>> Sent: Wednesday, August 2, 2017 3:28 AM >>> To: TETSURO NAKAMURA >>> Cc: TAKADA Naoki ; Satoshi NISHIYAMA >>> ; Lu, Wenzhuo ; >>> dev@dpdk.org >>> Subject: RE: using rte_flow via tespmd with Intel X552 >>> >>> Hi, NAKAMURA >>> >>> I have upload 2 documents about command format when creating flow for >>> NIC igb and ixgbe in the email attachment. >>> I have decided to commit the context in the doc to doc\guides\nics\ixgbe.rst >>> before, BUT by now it seems the content of the document is too much To >>> store there. So, is there any other doc more suitable in DPDK to record >>> this. >>> And welcome for any other suggestion. >>> AND, Wenzhuo, what is your opinion for this? >> I also have the same feeling that it may make the ixgbe.rst too big. >> + John and Adrien. Hi John, Adrien, I'm think about creating a specific doc >> for rte flow, we can include the examples for every NIC here. Is it good? If >> so, where's the good place to put it? Thanks. > > Depends on the kind of examples. Code samples or testpmd flow command? > Unless you mean a document that describes individual rte_flow features > supported by each NIC? > > Code samples could be added to a new "Programming flow rules" (or something) > section in doc/guide/prog_guide/rte_flow.rst. > > Another section about NICs could be added in that documentation as well but > should only contain a short summary for each of them and a link to the > relevant PMD documentation in nics/$PMD.rst (or elsewhere in the DPDK > tree), if any. > > If you want to describe individual rte_flow features supported by each NIC, > I think there's no other choice but to create a bunch of > doc/guides/nics/features/rte_flow/*.ini files. Indeed I was thinking doing a change in .ini files related rte flow in 17.11 rte_flow is a method to implement some filtering features, rte_flow itself is not a device feature. And filtering features already listed in .ini file. So I was thinking removing rte_flow from feature table, and update existing filter features as something like: "Ethertype filter = L" --> Feature exist and implemented using Legacy method. "Ethertype filter = Y" --> Feature exist and implemented using rte flow Does it make sense? > -Original Message- From: TETSURO NAKAMURA [mailto:nakamura.tets...@lab.ntt.co.jp] Sent: Wednesday, August 2, 2017 12:22 PM To: Zhao1, Wei Cc: TAKADA Naoki ; Satoshi NISHIYAMA ; Lu, Wenzhuo ; dev@dpdk.org Subject: Re: using rte_flow via tespmd with Intel X552 Wei Zhao - san (Cc'ing Wenzhuo - san) According to the e-mail thread [1] in April, you have a document about command format when creating flows on ixgbe NIC. I would appreciate your uploading that document to community or could you just share it with me via an e-mail ?? [1] http://dpdk.org/ml/archives/users/2017-April/001786.html Thanks, Tetsuro # re-sending the e-mail because I failed to submit the e-mail to us...@dpdk.org. -- Tetsuro Nakamura NTT Network Service Systems Laboratories TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11, Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan >> >
Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton > Sent: Monday, August 7, 2017 6:39 PM > To: tho...@monjalon.net > Cc: dev@dpdk.org; David Harton > Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to > rte_eth_stats_reset() > > Some devices do not support reset of eth stats. An application may > need to know not to clear shadow stats if the device cannot. > > rte_eth_stats_reset is updated to provide a return code to share > whether the device supports reset or not. > > Signed-off-by: David Harton > --- Hi, As far as I know changing the return type (void to int) of a function does *not* break ABI, but does "break" API as the application code should now check the return value. In theory the application could ignore the return value and current behavior is maintained. The validate-abi.sh script says "Compatible" with the following item flagged: Problems with Symbols High 0 Medium 0 Low 1 Change>> Type of return value has been changed from void to int (4 bytes). Effect>> Replacement of return type may indicate a change in its semantic meaning. Perhaps somebody with more ABI expertise than I would double check the return value change? Some smaller things inline below. > v3: > * overcame noob errors and figured out patch challenges > * this release should finally be clean :) > > v2: > * fixed soft tab issue inserted while moving changes > > lib/librte_ether/rte_ethdev.c | 8 +--- > lib/librte_ether/rte_ethdev.h | 4 +++- > 2 files changed, 8 insertions(+), 4 deletions(-) > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index 0597641..f72cc5a 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -1341,17 +1341,19 @@ rte_eth_stats_get(uint8_t port_id, struct > rte_eth_stats *stats) > return 0; > } > > -void > +int > rte_eth_stats_reset(uint8_t port_id) > { > struct rte_eth_dev *dev; > > - RTE_ETH_VALID_PORTID_OR_RET(port_id); > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); > dev = &rte_eth_devices[port_id]; > > - RTE_FUNC_PTR_OR_RET(*dev->dev_ops->stats_reset); > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP); > (*dev->dev_ops->stats_reset)(dev); > dev->data->rx_mbuf_alloc_failed = 0; > + > + return 0; > } > > static int > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index 0adf327..d8ccd60 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -2246,8 +2246,10 @@ int rte_eth_stats_get(uint8_t port_id, struct > rte_eth_stats > *stats); > * > * @param port_id > * The port identifier of the Ethernet device. > + * @return > + * Zero if successful. Non-zero otherwise. We should document all return values: @retval 0 Success @retval -EINVAL Invalid port_id provided @retval -ENOTSUP Stats reset functionality not supported by PMD The API change itself should probably be added to release notes, as applications may wish to be aware this function has changed. > */ > -void rte_eth_stats_reset(uint8_t port_id); > +int rte_eth_stats_reset(uint8_t port_id); > > /** > * Retrieve names of extended statistics of an Ethernet device. > -- > 2.10.3.dirty I'm happy to Ack the code/release-notes with above suggestions, but I'd like a second opinion to Ack ABI. -Harry
Re: [dpdk-dev] Why IVSHMEM was removed since 16.11 ?
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Furong > Sent: Tuesday, August 8, 2017 8:26 AM > To: dev@dpdk.org > Subject: [dpdk-dev] Why IVSHMEM was removed since 16.11 ? > > The release notes of dpdk-16.11 had shown that IVSHMEM was removed > due to some design issues. > > So, what are these issues? > > Thanks! > Hi Furong, There were multiple issues involved. Biggest of all, it required a patch to QEMU that wasn't maintained and wasn't upstream (i.e. vanilla QEMU didn't work with DPDK's implementation of IVSHMEM support). Second, it was basically hacked in to EAL in order to support what it does [1], and the engineering effort to fix all of that isn't worth the benefit it would provide, as no one appeared to be using it heavily enough to object to its deprecation. Hope this clears things up. [1] http://dpdk.org/ml/archives/dev/2016-June/040844.html Thanks, Anatoly
Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
From: Jonas Pfefferle [mailto:j...@zurich.ibm.com] > Sent: Tuesday, August 8, 2017 9:41 AM > To: Burakov, Anatoly > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > DMA window size needs to be big enough to span all memory segment's > physical addresses. We do not need multiple levels of IOMMU tables > as we already span ~70TB of physical memory with 16MB hugepages. > > Signed-off-by: Jonas Pfefferle > --- > v2: > * roundup to next power 2 function without loop. > > v3: > * Replace roundup_next_pow2 with rte_align64pow2 > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > index 946df7e..550c41c 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > - /* calculate window size based on number of hugepages configured > */ > - create.window_size = rte_eal_get_physmem_size(); > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max > */ Do we always expect that to be the case in the future? Maybe it would be safer to walk the memsegs list. Thanks, Anatoly > + /* create DMA window from 0 to max(phys_addr + len) */ > + /* sPAPR requires window size to be a power of 2 */ > + create.window_size = rte_align64pow2(ms[0].phys_addr + > ms[0].len); > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > - create.levels = 2; > + create.levels = 1; > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, > &create); > if (ret) { > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > + if (create.start_addr != 0) { > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > + return -1; > + } > + > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > struct vfio_iommu_type1_dma_map dma_map; > -- > 2.7.4
Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
"Burakov, Anatoly" wrote on 08/08/2017 11:15:24 AM: > From: "Burakov, Anatoly" > To: Jonas Pfefferle > Cc: "dev@dpdk.org" , "a...@ozlabs.ru" > Date: 08/08/2017 11:18 AM > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > From: Jonas Pfefferle [mailto:j...@zurich.ibm.com] > > Sent: Tuesday, August 8, 2017 9:41 AM > > To: Burakov, Anatoly > > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > DMA window size needs to be big enough to span all memory segment's > > physical addresses. We do not need multiple levels of IOMMU tables > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > Signed-off-by: Jonas Pfefferle > > --- > > v2: > > * roundup to next power 2 function without loop. > > > > v3: > > * Replace roundup_next_pow2 with rte_align64pow2 > > > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++--- > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > index 946df7e..550c41c 100644 > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > >return -1; > > } > > > > - /* calculate window size based on number of hugepages configured > > */ > > - create.window_size = rte_eal_get_physmem_size(); > > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max > > */ > > Do we always expect that to be the case in the future? Maybe it > would be safer to walk the memsegs list. > > Thanks, > Anatoly I had this loop in before but removed it in favor of simplicity. If we believe that the ordering is going to change in the future I'm happy to bring back the loop. Is there other code which is relying on the fact that the memsegs are sorted by their physical addresses? > > > + /* create DMA window from 0 to max(phys_addr + len) */ > > + /* sPAPR requires window size to be a power of 2 */ > > + create.window_size = rte_align64pow2(ms[0].phys_addr + > > ms[0].len); > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > - create.levels = 2; > > + create.levels = 1; > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, > > &create); > > if (ret) { > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > >return -1; > > } > > > > + if (create.start_addr != 0) { > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > + return -1; > > + } > > + > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > >struct vfio_iommu_type1_dma_map dma_map; > > -- > > 2.7.4 >
Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> From: Jonas Pfefferle1 [mailto:j...@zurich.ibm.com] > Sent: Tuesday, August 8, 2017 10:30 AM > To: Burakov, Anatoly > Cc: a...@ozlabs.ru; dev@dpdk.org > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > "Burakov, Anatoly" wrote on 08/08/2017 > 11:15:24 AM: > > > From: "Burakov, Anatoly" > > To: Jonas Pfefferle > > Cc: "dev@dpdk.org" , "a...@ozlabs.ru" > > Date: 08/08/2017 11:18 AM > > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > From: Jonas Pfefferle [mailto:j...@zurich.ibm.com] > > > Sent: Tuesday, August 8, 2017 9:41 AM > > > To: Burakov, Anatoly > > > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle > > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > > > DMA window size needs to be big enough to span all memory segment's > > > physical addresses. We do not need multiple levels of IOMMU tables > > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > > > Signed-off-by: Jonas Pfefferle > > > --- > > > v2: > > > * roundup to next power 2 function without loop. > > > > > > v3: > > > * Replace roundup_next_pow2 with rte_align64pow2 > > > > > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++--- > > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > index 946df7e..550c41c 100644 > > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > > > return -1; > > > } > > > > > > - /* calculate window size based on number of hugepages configured > > > */ > > > - create.window_size = rte_eal_get_physmem_size(); > > > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max > > > */ > > > > Do we always expect that to be the case in the future? Maybe it > > would be safer to walk the memsegs list. > > > > Thanks, > > Anatoly > > I had this loop in before but removed it in favor of simplicity. > If we believe that the ordering is going to change in the future > I'm happy to bring back the loop. Is there other code which is > relying on the fact that the memsegs are sorted by their physical > addresses? I don't think there is. In any case, I think making assumptions about particulars of memseg organization is not a very good practice. I seem to recall us doing similar things in other places, so maybe down the line we could introduce a new API (or internal-only) function to get a memseg with min/max address. For now I think a loop will do. > > > > > > + /* create DMA window from 0 to max(phys_addr + len) */ > > > + /* sPAPR requires window size to be a power of 2 */ > > > + create.window_size = rte_align64pow2(ms[0].phys_addr + > > > ms[0].len); > > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > > - create.levels = 2; > > > + create.levels = 1; > > > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, > > > &create); > > > if (ret) { > > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > > > return -1; > > > } > > > > > > + if (create.start_addr != 0) { > > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > > + return -1; > > > + } > > > + > > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > > > struct vfio_iommu_type1_dma_map dma_map; > > > -- > > > 2.7.4 > >
Re: [dpdk-dev] [PATCH] ethdev: add notice for planned name size change
> > > VMBUS support will use GUID in eth_dev_data name field which requires > > > at least 37 characters. Plan for increase in size now. > > > > > > Signed-off-by: Stephen Hemminger > > > --- > > > +* ethdev: An ABI change is planned for 17.11 for the structure > > eth_dev_data. > > Isn't it better to use "``rte_eth_dev_data``"? > > > > > + The size of the unique name will increase RTE_ETH_NAME_MAX_LEN > > from > > > + 32 to > > > + 64 characters to allow using a globally unique identifier (GUID) in > > > this > > field. > > > > Except for the small comment above, > > Acked-by: Yongseok Koh > > Acked-by: Shahaf Shuler Acked-by: Thomas Monjalon Applied, thanks
Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev and ethdev
08/08/2017 07:03, Shahaf Shuler: > Monday, August 7, 2017 9:07 PM, Boris Pismenny: > > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > 04/08/2017 07:26, Hemant Agrawal: > > > > On 8/3/2017 9:02 PM, Akhil Goyal wrote: > > > > > Support for security operations is planned to be added in ethdev > > > > > and cryptodev for the 17.11 release. > > > > > > > > > > For this following changes are required. > > > > > - rte_cryptodev and rte_eth_dev structures need to be added new > > > > > parameter rte_security_ops which extend support for security ops > > > > > to the corresponding driver. > > > > > - rte_cryptodev_info and rte_ethd_dev_info need to be added with > > > > > rte_security_capabilities to identify the capabilities of the > > > > > corresponding driver. > > > > > > It is not explained what is the fundamental difference between > > > rte_security and rte_crypto? > > > It looks to be just a technical workaround. > > > > rte_security is a layer between crypto and NIC. > > > > Today crypto sessions are created exclusively against crypto devices, but > > they don't use network related fields, while the network namespace doesn't > > use crypto related fields. We expect this API to represent crypto sessions > > that combine network fields and allow to add/delete them for all devices. > > > > For NICs we will use rte_flow with rte_security for inline/full crypto > > protocol > > offload such as ESP. > > > > > > > > Why the ABI would be changed by rte_security additions? > > > > > > > > Signed-off-by: Akhil Goyal > > > > > > > > > Acked-by: Hemant Agrawal > > > > > > No more opinions outside of NXP? > > > It seems there is not yet a consensus on how to manage IPsec offloading. > > > I heard there were some phone calls about these stuff but nothing > > > clear appears publicly on the mailing list. > > > Looks to be a community failure. > > > > We agreed to go ahead with this approach on one such phone call. I hope we > > could use the dpdk github for development. > > > > Acked-by: Boris Pismenny > > Acked-by: Shahaf Shuler Applied It means you have a chance to do this change in 17.11. It does not mean you can be sure that the patches will be accepted. This is introducing a new complexity. It must be discussed with the technical board before approving the final design in 17.11.
Re: [dpdk-dev] [PATCH v3] doc: announce API and ABI change for ethdev
> > This is an API/ABI change notice for DPDK 17.11 announcing the redefinition > > of > > port_id. port_id is currently defined as uint8_t, which is limited to the > > range 0 to > > 255. A larger range is required for vdev scalability. > > > > It is necessary for a redefinition of port_id to extend it from 1 bytes to > > 2 bytes. All ethdev APIs and usages related to port_id will be changed at > > the > > same time. > > > > cc: jianfeng@intel.com > > cc: y...@fridaylinux.org > > cc: jerin.ja...@caviumnetworks.com > > cc: tho...@monjalon.net > > cc: remy.hor...@intel.com > > > > Signed-off-by: Zhiyong Yang > > Acked-by: Jianfeng Tan > > Acked-by: Yuanhan Liu > > Acked-by: Jerin Jacob > > Acked-by: Remy Horton > > This ack in v2 is added in v3. Thanks, Shahaf. > > Acked-by: Shahaf Shuler Applied, thanks
Re: [dpdk-dev] [PATCH] doc: announce crypto vdev init removal
03/08/2017 12:42, Akhil Goyal: > On 8/3/2017 7:42 AM, Pablo de Lara wrote: > > In order to remove all dependencies on vdev for cryptodev, > > the implementation of rte_cryptodev_vdev_pmd_init() function > > needs to be moved to rte_cryptodev_vdev.h, and all crypto > > vdevs will include it, and therefore, this function will > > be removed as a public API. > > > > Signed-off-by: Pablo de Lara > > Acked-by: Akhil Goyal Applied, thanks
Re: [dpdk-dev] [PATCH] doc: announce API change in crypto driver allocation
03/08/2017 12:43, Akhil Goyal: > On 8/3/2017 6:32 AM, Pablo de Lara wrote: > > rte_cryptodev_allocate_driver() function gets one parameter > > (rte_driver), as the cryptodev_driver structure is > > allocated inside the function with rte_malloc. > > > > This function is called from a constructor function, > > when crypto PMDs are registered. > > If malloc fails, there is no way to recover from it, > > so it is better to allocate this structure > > statically, in each PMD. > > > > Therefore, it is required to add an extra parameter in > > this function, to also get a pointer to this structure. > > > > Signed-off-by: Pablo de Lara > > > Acked-by: Akhil Goyal Applied, thanks
Re: [dpdk-dev] [PATCH] doc: API change notice for librte_meter
04/08/2017 15:19, Cristian Dumitrescu: > Signed-off-by: Cristian Dumitrescu > Acked-by: John McNamara > Acked-by: Jasvinder Singh > Acked-by: Radu Nicolau > Acked-by: David Hunt Applied, thanks
Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()
> -Original Message- > From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com] > Sent: Tuesday, August 08, 2017 5:03 AM > To: David Harton (dharton) ; tho...@monjalon.net > Cc: dev@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v3] ethdev: add return code to > rte_eth_stats_reset() > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton > > Sent: Monday, August 7, 2017 6:39 PM > > To: tho...@monjalon.net > > Cc: dev@dpdk.org; David Harton > > Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to > > rte_eth_stats_reset() > > > > Some devices do not support reset of eth stats. An application may > > need to know not to clear shadow stats if the device cannot. > > > > rte_eth_stats_reset is updated to provide a return code to share > > whether the device supports reset or not. > > > > Signed-off-by: David Harton > > --- > > Hi, > > As far as I know changing the return type (void to int) of a function does > *not* break ABI, but does "break" API as the application code should now > check the return value. In theory the application could ignore the return > value and current behavior is maintained. > > The validate-abi.sh script says "Compatible" with the following item > flagged: > > Problems with Symbols > High 0 > Medium 0 > Low 1 > > Change>> Type of return value has been changed from void to int (4 bytes). > Effect>> Replacement of return type may indicate a change in its semantic > meaning. > > Perhaps somebody with more ABI expertise than I would double check the > return value change? > > > Some smaller things inline below. > > > v3: > > * overcame noob errors and figured out patch challenges > > * this release should finally be clean :) > > > > v2: > > * fixed soft tab issue inserted while moving changes > > > > lib/librte_ether/rte_ethdev.c | 8 +--- > > lib/librte_ether/rte_ethdev.h | 4 +++- > > 2 files changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/lib/librte_ether/rte_ethdev.c > > b/lib/librte_ether/rte_ethdev.c index 0597641..f72cc5a 100644 > > --- a/lib/librte_ether/rte_ethdev.c > > +++ b/lib/librte_ether/rte_ethdev.c > > @@ -1341,17 +1341,19 @@ rte_eth_stats_get(uint8_t port_id, struct > rte_eth_stats *stats) > > return 0; > > } > > > > -void > > +int > > rte_eth_stats_reset(uint8_t port_id) > > { > > struct rte_eth_dev *dev; > > > > - RTE_ETH_VALID_PORTID_OR_RET(port_id); > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); > > dev = &rte_eth_devices[port_id]; > > > > - RTE_FUNC_PTR_OR_RET(*dev->dev_ops->stats_reset); > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP); > > (*dev->dev_ops->stats_reset)(dev); > > dev->data->rx_mbuf_alloc_failed = 0; > > + > > + return 0; > > } > > > > static int > > diff --git a/lib/librte_ether/rte_ethdev.h > > b/lib/librte_ether/rte_ethdev.h index 0adf327..d8ccd60 100644 > > --- a/lib/librte_ether/rte_ethdev.h > > +++ b/lib/librte_ether/rte_ethdev.h > > @@ -2246,8 +2246,10 @@ int rte_eth_stats_get(uint8_t port_id, struct > > rte_eth_stats *stats); > > * > > * @param port_id > > * The port identifier of the Ethernet device. > > + * @return > > + * Zero if successful. Non-zero otherwise. > > We should document all return values: > > @retval 0 Success > @retval -EINVAL Invalid port_id provided @retval -ENOTSUP Stats reset > functionality not supported by PMD Sure. I was following the convention of function above it rte_eth_stats_get() but I agree better to advertise externally. I'll also modify the port number check errval to -ENODEV. > > The API change itself should probably be added to release notes, as > applications may wish to be aware this function has changed. Sounds good. > > > */ > > -void rte_eth_stats_reset(uint8_t port_id); > > +int rte_eth_stats_reset(uint8_t port_id); > > > > /** > > * Retrieve names of extended statistics of an Ethernet device. > > -- > > 2.10.3.dirty > > > I'm happy to Ack the code/release-notes with above suggestions, but I'd > like a second opinion to Ack ABI. Thanks for the review, Dave > > -Harry
Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
"Burakov, Anatoly" wrote on 08/08/2017 11:43:43 AM: > From: "Burakov, Anatoly" > To: Jonas Pfefferle1 > Cc: "a...@ozlabs.ru" , "dev@dpdk.org" > Date: 08/08/2017 11:43 AM > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > From: Jonas Pfefferle1 [mailto:j...@zurich.ibm.com] > > Sent: Tuesday, August 8, 2017 10:30 AM > > To: Burakov, Anatoly > > Cc: a...@ozlabs.ru; dev@dpdk.org > > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > "Burakov, Anatoly" wrote on 08/08/2017 > > 11:15:24 AM: > > > > > From: "Burakov, Anatoly" > > > To: Jonas Pfefferle > > > Cc: "dev@dpdk.org" , "a...@ozlabs.ru" > > > Date: 08/08/2017 11:18 AM > > > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > > > From: Jonas Pfefferle [mailto:j...@zurich.ibm.com] > > > > Sent: Tuesday, August 8, 2017 9:41 AM > > > > To: Burakov, Anatoly > > > > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle > > > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size > > > > > > > > DMA window size needs to be big enough to span all memory segment's > > > > physical addresses. We do not need multiple levels of IOMMU tables > > > > as we already span ~70TB of physical memory with 16MB hugepages. > > > > > > > > Signed-off-by: Jonas Pfefferle > > > > --- > > > > v2: > > > > * roundup to next power 2 function without loop. > > > > > > > > v3: > > > > * Replace roundup_next_pow2 with rte_align64pow2 > > > > > > > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++--- > > > > 1 file changed, 10 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > > index 946df7e..550c41c 100644 > > > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > > > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd) > > > > return -1; > > > > } > > > > > > > > - /* calculate window size based on number of hugepages configured > > > > */ > > > > - create.window_size = rte_eal_get_physmem_size(); > > > > + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max > > > > */ > > > > > > Do we always expect that to be the case in the future? Maybe it > > > would be safer to walk the memsegs list. > > > > > > Thanks, > > > Anatoly > > > > I had this loop in before but removed it in favor of simplicity. > > If we believe that the ordering is going to change in the future > > I'm happy to bring back the loop. Is there other code which is > > relying on the fact that the memsegs are sorted by their physical > > addresses? > > I don't think there is. In any case, I think making assumptions > about particulars of memseg organization is not a very good practice. > > I seem to recall us doing similar things in other places, so maybe > down the line we could introduce a new API (or internal-only) > function to get a memseg with min/max address. For now I think a > loop will do. Ok. Makes sense to me. Let me resubmit a new version with the loop. > > > > > > > > > > + /* create DMA window from 0 to max(phys_addr + len) */ > > > > + /* sPAPR requires window size to be a power of 2 */ > > > > + create.window_size = rte_align64pow2(ms[0].phys_addr + > > > > ms[0].len); > > > > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > > > > - create.levels = 2; > > > > + create.levels = 1; > > > > > > > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, > > > > &create); > > > > if (ret) { > > > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > > > > return -1; > > > > } > > > > > > > > + if (create.start_addr != 0) { > > > > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > > > > + return -1; > > > > + } > > > > + > > > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > > > > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > > > > struct vfio_iommu_type1_dma_map dma_map; > > > > -- > > > > 2.7.4 > > > >
Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()
On Tue, Aug 8, 2017 at 11:02 AM, Van Haaren, Harry < harry.van.haa...@intel.com> wrote: > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton > > Sent: Monday, August 7, 2017 6:39 PM > > To: tho...@monjalon.net > > Cc: dev@dpdk.org; David Harton > > Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to > rte_eth_stats_reset() > > > > Some devices do not support reset of eth stats. An application may > > need to know not to clear shadow stats if the device cannot. > > > > rte_eth_stats_reset is updated to provide a return code to share > > whether the device supports reset or not. > > > > Signed-off-by: David Harton > > --- > > Hi, > > As far as I know changing the return type (void to int) of a function does > *not* break ABI, but does "break" API as the application code should now > check the return value. In theory the application could ignore the return > value and current behavior is maintained. > After discussing with Harry on IRC it turns out we both ended up checking the same online sources to verify our thoughts, like [1]. Given this and several other sources it seems to be as outlined above an API but not ABI break. I'm not an expert and this is mostly opinion, but my personal rule mostly is: "if in doubt bump it". Running similar issues I was the one providing [2] for a reason, with this here being a case that appears safe but there eventually always seems to come up an architecture or alternative compiler which does some arcane register juggling differently and makes those "safe" changes breaking people after the fact. [1]: https://stackoverflow.com/questions/15626579/c-abi-is-it-safe-to-change-void-function-to-return-int [2]: http://dpdk.org/doc/guides/contributing/versioning.html#setting-a-major-abi-version
[dpdk-dev] [PATCH v4] vfio: fix sPAPR IOMMU DMA window size
DMA window size needs to be big enough to span all memory segment's physical addresses. We do not need multiple levels of IOMMU tables as we already span ~70TB of physical memory with 16MB hugepages. Signed-off-by: Jonas Pfefferle --- v2: * roundup to next power 2 function without loop. v3: * Replace roundup_next_pow2 with rte_align64pow2 v4: * do not assume ordering of physical addresses of memsegs lib/librte_eal/linuxapp/eal/eal_vfio.c | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c index 946df7e..7d5d61d 100644 --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c @@ -759,10 +759,19 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } - /* calculate window size based on number of hugepages configured */ - create.window_size = rte_eal_get_physmem_size(); + /* create DMA window from 0 to max(phys_addr + len) */ + for (i = 0; i < RTE_MAX_MEMSEG; i++) { + if (ms[i].addr == NULL) + break; + + create.window_size = RTE_MAX(create.window_size, + ms[i].phys_addr + ms[i].len); + } + + /* sPAPR requires window size to be a power of 2 */ + create.window_size = rte_align64pow2(create.window_size); create.page_shift = __builtin_ctzll(ms->hugepage_sz); - create.levels = 2; + create.levels = 1; ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create); if (ret) { @@ -771,6 +780,11 @@ vfio_spapr_dma_map(int vfio_container_fd) return -1; } + if (create.start_addr != 0) { + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); + return -1; + } + /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ for (i = 0; i < RTE_MAX_MEMSEG; i++) { struct vfio_iommu_type1_dma_map dma_map; -- 2.7.4
Re: [dpdk-dev] [PATCH v4] vfio: fix sPAPR IOMMU DMA window size
> From: Jonas Pfefferle [mailto:j...@zurich.ibm.com] > Sent: Tuesday, August 8, 2017 12:17 PM > To: Burakov, Anatoly > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle > Subject: [PATCH v4] vfio: fix sPAPR IOMMU DMA window size > > DMA window size needs to be big enough to span all memory segment's > physical addresses. We do not need multiple levels of IOMMU tables as we > already span ~70TB of physical memory with 16MB hugepages. > > Signed-off-by: Jonas Pfefferle > --- > v2: > * roundup to next power 2 function without loop. > > v3: > * Replace roundup_next_pow2 with rte_align64pow2 > > v4: > * do not assume ordering of physical addresses of memsegs > > lib/librte_eal/linuxapp/eal/eal_vfio.c | 20 +--- > 1 file changed, 17 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c > b/lib/librte_eal/linuxapp/eal/eal_vfio.c > index 946df7e..7d5d61d 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c > @@ -759,10 +759,19 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > - /* calculate window size based on number of hugepages configured > */ > - create.window_size = rte_eal_get_physmem_size(); > + /* create DMA window from 0 to max(phys_addr + len) */ > + for (i = 0; i < RTE_MAX_MEMSEG; i++) { > + if (ms[i].addr == NULL) > + break; > + > + create.window_size = RTE_MAX(create.window_size, > + ms[i].phys_addr + ms[i].len); > + } > + > + /* sPAPR requires window size to be a power of 2 */ > + create.window_size = rte_align64pow2(create.window_size); > create.page_shift = __builtin_ctzll(ms->hugepage_sz); > - create.levels = 2; > + create.levels = 1; > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, > &create); > if (ret) { > @@ -771,6 +780,11 @@ vfio_spapr_dma_map(int vfio_container_fd) > return -1; > } > > + if (create.start_addr != 0) { > + RTE_LOG(ERR, EAL, " DMA window start address != 0\n"); > + return -1; > + } > + > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */ > for (i = 0; i < RTE_MAX_MEMSEG; i++) { > struct vfio_iommu_type1_dma_map dma_map; > -- > 2.7.4 Acked by: Anatoly Burakov Thanks, Anatoly
[dpdk-dev] [PATCH v1] doc: add template release notes for 17.11
Add template release notes for DPDK 17.11 with inline comments and explanations of the various sections. Signed-off-by: John McNamara --- doc/guides/rel_notes/index.rst | 1 + doc/guides/rel_notes/release_17_11.rst | 199 + 2 files changed, 200 insertions(+) create mode 100644 doc/guides/rel_notes/release_17_11.rst diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst index b3c8090..35659d8 100644 --- a/doc/guides/rel_notes/index.rst +++ b/doc/guides/rel_notes/index.rst @@ -36,6 +36,7 @@ Release Notes :numbered: rel_description +release_17_11 release_17_08 release_17_05 release_17_02 diff --git a/doc/guides/rel_notes/release_17_11.rst b/doc/guides/rel_notes/release_17_11.rst new file mode 100644 index 000..6e5768f --- /dev/null +++ b/doc/guides/rel_notes/release_17_11.rst @@ -0,0 +1,199 @@ +DPDK Release 17.11 +== + +.. **Read this first.** + + The text in the sections below explains how to update the release notes. + + Use proper spelling, capitalization and punctuation in all sections. + + Variable and config names should be quoted as fixed width text: + ``LIKE_THIS``. + + Build the docs and view the output file to ensure the changes are correct:: + + make doc-guides-html + + xdg-open build/doc/html/guides/rel_notes/release_17_11.html + + +New Features + + +.. This section should contain new features added in this release. Sample + format: + + * **Add a title in the past tense with a full stop.** + + Add a short 1-2 sentence description in the past tense. The description + should be enough to allow someone scanning the release notes to + understand the new feature. + + If the feature adds a lot of sub-features you can use a bullet list like + this: + + * Added feature foo to do something. + * Enhanced feature bar to do something else. + + Refer to the previous release notes for examples. + + This section is a comment. do not overwrite or remove it. + Also, make sure to start the actual text at the margin. + = + + +Resolved Issues +--- + +.. This section should contain bug fixes added to the relevant + sections. Sample format: + + * **code/section Fixed issue in the past tense with a full stop.** + + Add a short 1-2 sentence description of the resolved issue in the past + tense. + + The title should contain the code/lib section like a commit message. + + Add the entries in alphabetic order in the relevant sections below. + + This section is a comment. do not overwrite or remove it. + Also, make sure to start the actual text at the margin. + = + + +EAL +~~~ + + +Drivers +~~~ + + +Libraries +~ + + +Examples + + + +Other +~ + + +Known Issues + + +.. This section should contain new known issues in this release. Sample format: + + * **Add title in present tense with full stop.** + + Add a short 1-2 sentence description of the known issue in the present + tense. Add information on any known workarounds. + + This section is a comment. do not overwrite or remove it. + Also, make sure to start the actual text at the margin. + = + + +API Changes +--- + +.. This section should contain API changes. Sample format: + + * Add a short 1-2 sentence description of the API change. Use fixed width + quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past + tense. + + This section is a comment. do not overwrite or remove it. + Also, make sure to start the actual text at the margin. + = + + +ABI Changes +--- + +.. This section should contain ABI changes. Sample format: + + * Add a short 1-2 sentence description of the ABI change that was announced + in the previous releases and made in this release. Use fixed width quotes + for ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + + This section is a comment. do not overwrite or remove it. + Also, make sure to start the actual text at the margin. + = + + + +Shared Library Versions +--- + +.. Update any library version updated in this release and prepend with a ``+`` + sign, like this: + + librte_acl.so.2 + + librte_cfgfile.so.2 + librte_cmdline.so.2 + + This section is a comment. do not overwrite or remove it. + = + + +The libraries prepended with a plus sign were incremented in this version. + +.. code-block:: diff + + librte_acl.so.2 + librte_bitratestats.so.1 + librte_cfgfile.so.2 + librte_cmdline.so.2 + librte_cryptodev.so.3
Re: [dpdk-dev] [PATCH v1] doc: update release notes for 17.08
> -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Monday, August 7, 2017 12:17 PM > To: Mcnamara, John > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v1] doc: update release notes for 17.08 > > 04/08/2017 16:20, John McNamara: > > Fix grammar, spelling and formatting of DPDK 17.08 release notes. > > > > Signed-off-by: John McNamara > > Applied, thanks > > Please, would you like to prepare a blank release notes for 17.11? Done. John
[dpdk-dev] [PATCH] net/bonding: fix bonding in 8023ad mode
This patch blocks possibility to set master bonding by rte_eth_bond_mode_set() in 802.3ad mode, as the API doesn't prevent this. Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes") Cc: danielx.t.mrzyg...@intel.com Signed-off-by: Jacek Piasecki --- drivers/net/bonding/rte_eth_bond_api.c | 29 - drivers/net/bonding/rte_eth_bond_private.h | 3 +++ 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c index de1d9e0..d6aaf07 100644 --- a/drivers/net/bonding/rte_eth_bond_api.c +++ b/drivers/net/bonding/rte_eth_bond_api.c @@ -63,6 +63,25 @@ valid_bonded_port_id(uint8_t port_id) } int +check_for_master_bonded_ethdev(const struct rte_eth_dev *eth_dev) +{ + int i; + struct bond_dev_private *internals; + + if (check_for_bonded_ethdev(eth_dev) != 0) + return 0; + + internals = eth_dev->data->dev_private; + + /* Check if any of slave devices is a bonded device */ + for (i = 0; i < internals->slave_count; i++) + if (valid_bonded_port_id(internals->slaves[i].port_id) == 0) + return 1; + + return 0; +} + +int valid_slave_port_id(uint8_t port_id, uint8_t mode) { RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1); @@ -490,10 +509,18 @@ rte_eth_bond_slave_remove(uint8_t bonded_port_id, uint8_t slave_port_id) int rte_eth_bond_mode_set(uint8_t bonded_port_id, uint8_t mode) { + struct rte_eth_dev *bonded_eth_dev; + if (valid_bonded_port_id(bonded_port_id) != 0) return -1; - return bond_ethdev_mode_set(&rte_eth_devices[bonded_port_id], mode); + bonded_eth_dev = &rte_eth_devices[bonded_port_id]; + + if (check_for_master_bonded_ethdev(bonded_eth_dev) != 0 && + mode == BONDING_MODE_8023AD) + return -1; + + return bond_ethdev_mode_set(bonded_eth_dev, mode); } int diff --git a/drivers/net/bonding/rte_eth_bond_private.h b/drivers/net/bonding/rte_eth_bond_private.h index 1fe6ff8..4dd7e5f 100644 --- a/drivers/net/bonding/rte_eth_bond_private.h +++ b/drivers/net/bonding/rte_eth_bond_private.h @@ -184,6 +184,9 @@ extern const struct eth_dev_ops default_dev_ops; int check_for_bonded_ethdev(const struct rte_eth_dev *eth_dev); +int +check_for_master_bonded_ethdev(const struct rte_eth_dev *eth_dev); + /* Search given slave array to find position of given id. * Return slave pos or slaves_count if not found. */ static inline uint8_t -- 2.7.4
Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()
08/08/2017 13:03, Christian Ehrhardt: > On Tue, Aug 8, 2017 at 11:02 AM, Van Haaren, Harry < > harry.van.haa...@intel.com> wrote: > > > > > > Some devices do not support reset of eth stats. An application may > > > need to know not to clear shadow stats if the device cannot. > > > > > > rte_eth_stats_reset is updated to provide a return code to share > > > whether the device supports reset or not. > > > > > > Signed-off-by: David Harton > > > --- > > > > Hi, > > > > As far as I know changing the return type (void to int) of a function does > > *not* break ABI, but does "break" API as the application code should now > > check the return value. In theory the application could ignore the return > > value and current behavior is maintained. > > > > After discussing with Harry on IRC it turns out we both ended up checking > the same online sources > to verify our thoughts, like [1]. > > Given this and several other sources it seems to be as outlined above an > API but not ABI break. > I'm not an expert and this is mostly opinion, but my personal rule mostly > is: "if in doubt bump it". Anyway, the ABI will be broken (and bumped) again in 17.11. This patch will be accepted in 17.11.
Re: [dpdk-dev] [PATCH v1] doc: add template release notes for 17.11
Hi, Thanks for sending the next release notes so quickly. Unfortunately... 08/08/2017 14:31, John McNamara: > +The libraries prepended with a plus sign were incremented in this version. > + > +.. code-block:: diff > + > + librte_acl.so.2 > + librte_bitratestats.so.1 > + librte_cfgfile.so.2 > + librte_cmdline.so.2 > + librte_cryptodev.so.3 > + librte_distributor.so.1 > + librte_eal.so.4 > + librte_ethdev.so.7 > + librte_gro.so.1 > + librte_hash.so.2 > + librte_ip_frag.so.1 > + librte_jobstats.so.1 > + librte_kni.so.2 > + librte_kvargs.so.1 > + librte_latencystats.so.1 > + librte_lpm.so.2 > + librte_mbuf.so.3 > + librte_mempool.so.2 > + librte_meter.so.1 > + librte_metrics.so.1 > + librte_net.so.1 > + librte_pdump.so.1 > + librte_pipeline.so.3 > + librte_pmd_bond.so.1 > + librte_pmd_ring.so.2 > + librte_port.so.3 > + librte_power.so.1 > + librte_reorder.so.1 > + librte_ring.so.1 > + librte_sched.so.1 > + librte_table.so.2 > + librte_timer.so.1 > + librte_vhost.so.3 ... we are going to bump some ABIVER which were forgotten during 17.08. You will probably need to update after 17.08 release.
[dpdk-dev] [PATCH] eal: bump ABI version for PCI, bus and devargs work
1. PCI domain field in PCI address structure grew from 16 to 32 bits. From: 463ced957c3f ("pci: increase domain storage to 32 bits") 2. rte_bus structure gaining new ops. From: 3a8f0bc68a90 ("bus: add method to find device") From: 7c8810f43f6e ("bus: introduce device plug/unplug") From: 609eb7dde6d0 ("bus: introduce parsing functionality") From: 98eb4b845c1a ("bus: introduce scan policies") 3. rte_devargs structure having been mostly rewritten. From: f3a1188cee4a ("devargs: make device representation generic") From: 47828c5f3bc3 ("devargs: parse bus info") From: 39f403e0d5bb ("devargs: restore device type API") Signed-off-by: Gaetan Rivet --- lib/librte_eal/bsdapp/eal/Makefile | 2 +- lib/librte_eal/linuxapp/eal/Makefile | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile index 05517a2..005019e 100644 --- a/lib/librte_eal/bsdapp/eal/Makefile +++ b/lib/librte_eal/bsdapp/eal/Makefile @@ -48,7 +48,7 @@ LDLIBS += -lgcc_s EXPORT_MAP := rte_eal_version.map -LIBABIVER := 4 +LIBABIVER := 5 # specific to bsdapp exec-env SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index e6ab6c3..90bca4d 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -37,7 +37,7 @@ ARCH_DIR ?= $(RTE_ARCH) EXPORT_MAP := rte_eal_version.map VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR) -LIBABIVER := 4 +LIBABIVER := 5 VPATH += $(RTE_SDK)/lib/librte_eal/common -- 2.1.4
[dpdk-dev] [PATCH] doc: notify EAL ABI change
Signed-off-by: Gaetan Rivet --- doc/guides/rel_notes/release_17_08.rst | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_17_08.rst b/doc/guides/rel_notes/release_17_08.rst index a137aa7..a723252 100644 --- a/doc/guides/rel_notes/release_17_08.rst +++ b/doc/guides/rel_notes/release_17_08.rst @@ -343,6 +343,14 @@ ABI Changes * Removed ``session_mp`` from ``rte_cryptodev_config``. +* Changed type of ``domain`` field in ``rte_pci_addr`` to ``uint32_t`` + to follow the PCI standard. + +* Added new ``rte_bus`` experimental APIs available as operators within the + ``rte_bus`` structure. + +* Made ``rte_devargs`` structure internal device representation generic to + prepare for a bus-agnostic EAL. Shared Library Versions --- @@ -368,7 +376,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_cmdline.so.2 + librte_cryptodev.so.3 librte_distributor.so.1 - librte_eal.so.4 + + librte_eal.so.5 + librte_ethdev.so.7 + librte_gro.so.1 librte_hash.so.2 -- 2.1.4
[dpdk-dev] [PATCH 1/2] doc: fix v17.05 release note
librte_eventdev.so.1 was introduced in v17.05, and it missed to update in release_17_05 release notes. Fixes: 222555480a7f9 ("version: 17.05.0") Cc: sta...@dpdk.org Reported-by: Thomas Monjalon Signed-off-by: Jerin Jacob --- doc/guides/rel_notes/release_17_05.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/guides/rel_notes/release_17_05.rst b/doc/guides/rel_notes/release_17_05.rst index cbdb40657..87b3f9279 100644 --- a/doc/guides/rel_notes/release_17_05.rst +++ b/doc/guides/rel_notes/release_17_05.rst @@ -518,6 +518,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_distributor.so.1 + librte_eal.so.4 librte_ethdev.so.6 + + librte_eventdev.so.1 librte_hash.so.2 librte_ip_frag.so.1 librte_jobstats.so.1 -- 2.14.0
[dpdk-dev] [PATCH 2/2] eventdev: bump library version
Bumping the library version to reflect the ABI change, where rte_event_pmd_pci_probe(), rte_event_pmd_pci_remove(), rte_event_pmd_vdev_init(), rte_event_pmd_vdev_uninit() functions removed from the library. Fixes: b1b3d9f90502 ("eventdev: make vdev init and uninit functions optional") Fixes: 9a8269d56942 ("eventdev: make PCI probe and remove functions optional") Reported-by: Thomas Monjalon Signed-off-by: Jerin Jacob --- doc/guides/rel_notes/release_17_08.rst | 1 + lib/librte_eventdev/Makefile | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_17_08.rst b/doc/guides/rel_notes/release_17_08.rst index a137aa79f..1c2bc6de9 100644 --- a/doc/guides/rel_notes/release_17_08.rst +++ b/doc/guides/rel_notes/release_17_08.rst @@ -370,6 +370,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_distributor.so.1 librte_eal.so.4 + librte_ethdev.so.7 + + librte_eventdev.so.2 + librte_gro.so.1 librte_hash.so.2 librte_ip_frag.so.1 diff --git a/lib/librte_eventdev/Makefile b/lib/librte_eventdev/Makefile index 64165020a..410578a14 100644 --- a/lib/librte_eventdev/Makefile +++ b/lib/librte_eventdev/Makefile @@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk LIB = librte_eventdev.a # library version -LIBABIVER := 1 +LIBABIVER := 2 # build flags CFLAGS += -O3 -- 2.14.0
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
On Tue, 8 Aug 2017 09:32:07 +0100 Ferruh Yigit wrote: > On 8/4/2017 1:56 PM, Bruce Richardson wrote: > > On Fri, Aug 04, 2017 at 12:58:01PM +0100, Ferruh Yigit wrote: > >> On 8/3/2017 8:53 PM, Thomas Monjalon wrote: > >>> 03/08/2017 18:15, Stephen Hemminger: > On Thu, 3 Aug 2017 14:21:38 +0100 > Bruce Richardson wrote: > > > On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: > >> To control some device-specific features public device-specific > >> functions > >> rte_pmd_*.h are used. > >> > >> But this solution requires applications to distinguish devices at > >> runtime > >> and, depending on the device type, call corresponding device-specific > >> functions even if functions' parameters are the same. > >> > >> IOCTL-like API can be added to ethdev instead of public device-specific > >> functions to address the following: > >> > >> * allow more usable support of features across a range of NIC from > >> one vendor, but not others > >> * allow features to be implemented by multiple NIC drivers without > >> relying on a critical mass to get the functionality in ethdev > >> * there are a large number of possible device specific functions, and > >> creating individual APIs for each one is not a good solution > >> * IOCTLs are a proven method for solving this problem in other areas, > >> i.e. OS kernels. > >> > >> Control requests for this API will be globally defined at ethdev > >> level, so > >> an application will use single API call to control different devices > >> from > >> one/multiple vendors. > >> > >> API call may look like as a classic ioctl with an extra parameter for > >> argument length for better sanity checks: > >> > >> int > >> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, > >> unsigned arg_length); > >> > >> Regards, > >> Andrey > > > > I think we need to start putting in IOCTLs for ethdevs, much as I hate > > to admit it, since I dislike IOCTLs and other functions with opaque > > arguments! Having driver specific functions I don't think will scale > > well as each vendor tries to expose as much of their driver specific > > functionality as possible. > > > > One other additional example: I discovered just this week another issue > > with driver specific functions and testpmd, when I was working on the > > meson build rework. > > > > * With shared libraries, when we do "ninja install" we want our DPDK > > libs moved to e.g. /usr/local/lib, but the drivers moved to a separate > > driver folder, so that they can be automatically loaded from that > > single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. > > * However, testpmd, as well as using the drivers as plugins, uses > > driver-specific functions, which means that it explicitly links > > against the pmd .so files. > > * Those driver .so files are not in with the other libraries, so ld.so > > does not find the pmd, and the installed testpmd fails to run due to > > missing library dependencies. > > * The workaround is to add the drivers path to the ld load path, but we > > should not require ld library path changes just to get DPDK apps to > > work. > > > > Using ioctls instead of driver-specific functions would solve this. > > > > My 2c. > > My 2c. No. > > Short answer: > Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now > despised by Linux kernel developers. They provide an unstructured, > unsecured, > back door for device driver abuse. Try to get a new driver in Linux with > a unique ioctl, and it will be hard to get accepted. > > Long answer: > So far every device specific feature has fit into ethdev model. Doing > ioctl > is admitting "it is too hard to be general, we need need an out". For > something > that is a flag, it should fit into existing config model; ignoring silly > ABI constraints. > For a real feature (think flow direction), we want a first class API for > that. > For a wart, then devargs will do. > > Give a good example of something that should be an ioctl. Don't build the > API first and then let it get cluttered. > >>> > >>> I agree with Stephen. > >>> > >>> And please do not forget that ioctl still requires an API: > >>> the argument that you put in ioctl is the API of the feature. > >>> So it is the same thing as defining a new function. > >> > >> I am also not fan of the ioctl usage. I believe it hides APIs behind ids > >> and prevent argument check by compiler. > >> > >> BUT, the number of the increasing PMD specific APIs are also worrying, > >> it is becoming harder to maintain, and I believe this is something NOT > >
Re: [dpdk-dev] [PATCH v1] doc: add template release notes for 17.11
> -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Tuesday, August 8, 2017 2:50 PM > To: Mcnamara, John > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v1] doc: add template release notes for > 17.11 > > Hi, > > + librte_timer.so.1 > > + librte_vhost.so.3 > > ... we are going to bump some ABIVER which were forgotten during 17.08. > You will probably need to update after 17.08 release. No problem. I'll send a v2 right after the 17.08 release. John
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
On Aug 4, 2017, at 6:58 AM, Ferruh Yigit mailto:ferruh.yi...@intel.com>> wrote: On 8/3/2017 8:53 PM, Thomas Monjalon wrote: 03/08/2017 18:15, Stephen Hemminger: On Thu, 3 Aug 2017 14:21:38 +0100 Bruce Richardson mailto:bruce.richard...@intel.com>> wrote: On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: To control some device-specific features public device-specific functions rte_pmd_*.h are used. But this solution requires applications to distinguish devices at runtime and, depending on the device type, call corresponding device-specific functions even if functions' parameters are the same. IOCTL-like API can be added to ethdev instead of public device-specific functions to address the following: * allow more usable support of features across a range of NIC from one vendor, but not others * allow features to be implemented by multiple NIC drivers without relying on a critical mass to get the functionality in ethdev * there are a large number of possible device specific functions, and creating individual APIs for each one is not a good solution * IOCTLs are a proven method for solving this problem in other areas, i.e. OS kernels. Control requests for this API will be globally defined at ethdev level, so an application will use single API call to control different devices from one/multiple vendors. API call may look like as a classic ioctl with an extra parameter for argument length for better sanity checks: int rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, unsigned arg_length); Regards, Andrey I think we need to start putting in IOCTLs for ethdevs, much as I hate to admit it, since I dislike IOCTLs and other functions with opaque arguments! Having driver specific functions I don't think will scale well as each vendor tries to expose as much of their driver specific functionality as possible. One other additional example: I discovered just this week another issue with driver specific functions and testpmd, when I was working on the meson build rework. * With shared libraries, when we do "ninja install" we want our DPDK libs moved to e.g. /usr/local/lib, but the drivers moved to a separate driver folder, so that they can be automatically loaded from that single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. * However, testpmd, as well as using the drivers as plugins, uses driver-specific functions, which means that it explicitly links against the pmd .so files. * Those driver .so files are not in with the other libraries, so ld.so does not find the pmd, and the installed testpmd fails to run due to missing library dependencies. * The workaround is to add the drivers path to the ld load path, but we should not require ld library path changes just to get DPDK apps to work. Using ioctls instead of driver-specific functions would solve this. My 2c. My 2c. No. Short answer: Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now despised by Linux kernel developers. They provide an unstructured, unsecured, back door for device driver abuse. Try to get a new driver in Linux with a unique ioctl, and it will be hard to get accepted. Long answer: So far every device specific feature has fit into ethdev model. Doing ioctl is admitting "it is too hard to be general, we need need an out". For something that is a flag, it should fit into existing config model; ignoring silly ABI constraints. For a real feature (think flow direction), we want a first class API for that. For a wart, then devargs will do. Give a good example of something that should be an ioctl. Don't build the API first and then let it get cluttered. I agree with Stephen. And please do not forget that ioctl still requires an API: the argument that you put in ioctl is the API of the feature. So it is the same thing as defining a new function. I am also not fan of the ioctl usage. I believe it hides APIs behind ids and prevent argument check by compiler. BUT, the number of the increasing PMD specific APIs are also worrying, it is becoming harder to maintain, and I believe this is something NOT sustainable in long run. What about having *eth_dev_extended_ops* ? We had talk about adding something like device specific APIs to DPDK in the past, which to me are just IOCTL like APIs. The big problem with IOCTLs is trying to cram a bunch of specific requests into a very generic API and I do not like ioctl as defined in Linux/Unix today. The old IOCTLs calls are too opaque and difficult for compilers to test args and many other issues. We talked about having a single API in rte_eth_dev that would allow a user to ask for and possible get a list of function pointers in a given structure for the requested type. If a user calls this API to get some feature from a given NIC he would get NULL or a pointer to a set of functions. The generic API in rte_eth would allow the user to request what structures or types of APIs it supports. Using a specific API to get the lis
Re: [dpdk-dev] [PATCH] eal: bump ABI version for PCI, bus and devargs work
08/08/2017 16:26, Gaetan Rivet: > 1. PCI domain field in PCI address structure grew from 16 to 32 bits. > >From: 463ced957c3f ("pci: increase domain storage to 32 bits") > > 2. rte_bus structure gaining new ops. > >From: 3a8f0bc68a90 ("bus: add method to find device") >From: 7c8810f43f6e ("bus: introduce device plug/unplug") >From: 609eb7dde6d0 ("bus: introduce parsing functionality") >From: 98eb4b845c1a ("bus: introduce scan policies") > > 3. rte_devargs structure having been mostly rewritten. > >From: f3a1188cee4a ("devargs: make device representation generic") >From: 47828c5f3bc3 ("devargs: parse bus info") >From: 39f403e0d5bb ("devargs: restore device type API") > > Signed-off-by: Gaetan Rivet Applied and merged with release notes update, thanks
Re: [dpdk-dev] [PATCH] doc: notify EAL ABI change
08/08/2017 16:26, Gaetan Rivet: > Signed-off-by: Gaetan Rivet > --- > doc/guides/rel_notes/release_17_08.rst | 10 +- > 1 file changed, 9 insertions(+), 1 deletion(-) Applied and merged with ABIVER bump, thanks
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
Fix format. > On Aug 4, 2017, at 6:58 AM, Ferruh Yigit wrote: > > On 8/3/2017 8:53 PM, Thomas Monjalon wrote: >> 03/08/2017 18:15, Stephen Hemminger: >>> On Thu, 3 Aug 2017 14:21:38 +0100 >>> Bruce Richardson wrote: >>> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: > To control some device-specific features public device-specific functions > rte_pmd_*.h are used. > > But this solution requires applications to distinguish devices at runtime > and, depending on the device type, call corresponding device-specific > functions even if functions' parameters are the same. > > IOCTL-like API can be added to ethdev instead of public device-specific > functions to address the following: > > * allow more usable support of features across a range of NIC from > one vendor, but not others > * allow features to be implemented by multiple NIC drivers without > relying on a critical mass to get the functionality in ethdev > * there are a large number of possible device specific functions, and > creating individual APIs for each one is not a good solution > * IOCTLs are a proven method for solving this problem in other areas, > i.e. OS kernels. > > Control requests for this API will be globally defined at ethdev level, so > an application will use single API call to control different devices from > one/multiple vendors. > > API call may look like as a classic ioctl with an extra parameter for > argument length for better sanity checks: > > int > rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, >unsigned arg_length); > > Regards, > Andrey I think we need to start putting in IOCTLs for ethdevs, much as I hate to admit it, since I dislike IOCTLs and other functions with opaque arguments! Having driver specific functions I don't think will scale well as each vendor tries to expose as much of their driver specific functionality as possible. One other additional example: I discovered just this week another issue with driver specific functions and testpmd, when I was working on the meson build rework. * With shared libraries, when we do "ninja install" we want our DPDK libs moved to e.g. /usr/local/lib, but the drivers moved to a separate driver folder, so that they can be automatically loaded from that single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. * However, testpmd, as well as using the drivers as plugins, uses driver-specific functions, which means that it explicitly links against the pmd .so files. * Those driver .so files are not in with the other libraries, so ld.so does not find the pmd, and the installed testpmd fails to run due to missing library dependencies. * The workaround is to add the drivers path to the ld load path, but we should not require ld library path changes just to get DPDK apps to work. Using ioctls instead of driver-specific functions would solve this. My 2c. >>> >>> My 2c. No. >>> >>> Short answer: >>> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now >>> despised by Linux kernel developers. They provide an unstructured, >>> unsecured, >>> back door for device driver abuse. Try to get a new driver in Linux with >>> a unique ioctl, and it will be hard to get accepted. >>> >>> Long answer: >>> So far every device specific feature has fit into ethdev model. Doing ioctl >>> is admitting "it is too hard to be general, we need need an out". For >>> something >>> that is a flag, it should fit into existing config model; ignoring silly >>> ABI constraints. >>> For a real feature (think flow direction), we want a first class API for >>> that. >>> For a wart, then devargs will do. >>> >>> Give a good example of something that should be an ioctl. Don't build the >>> API first and then let it get cluttered. >> >> I agree with Stephen. >> >> And please do not forget that ioctl still requires an API: >> the argument that you put in ioctl is the API of the feature. >> So it is the same thing as defining a new function. > > I am also not fan of the ioctl usage. I believe it hides APIs behind ids > and prevent argument check by compiler. > > BUT, the number of the increasing PMD specific APIs are also worrying, > it is becoming harder to maintain, and I believe this is something NOT > sustainable in long run. > > > What about having *eth_dev_extended_ops* ? We had talk about adding something like device specific APIs to DPDK in the past, which to me are just IOCTL like APIs. The big problem with IOCTLs is trying to cram a bunch of specific requests into a very generic API and I do not like ioctl as defined in Linux/Unix today. The old IOCTLs calls are too opaque and difficult for compilers to test args and many other issues. We t
Re: [dpdk-dev] [PATCH 2/2] eventdev: bump library version
08/08/2017 16:48, Jerin Jacob: > Bumping the library version to reflect the ABI change, where > rte_event_pmd_pci_probe(), rte_event_pmd_pci_remove(), > rte_event_pmd_vdev_init(), rte_event_pmd_vdev_uninit() > functions removed from the library. > > Fixes: b1b3d9f90502 ("eventdev: make vdev init and uninit functions optional") > Fixes: 9a8269d56942 ("eventdev: make PCI probe and remove functions optional") > > Reported-by: Thomas Monjalon > Signed-off-by: Jerin Jacob Series applied, thanks
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
On Tue, 8 Aug 2017 17:28:19 + "Wiles, Keith" wrote: > Fix format. > > > On Aug 4, 2017, at 6:58 AM, Ferruh Yigit wrote: > > > > On 8/3/2017 8:53 PM, Thomas Monjalon wrote: > >> 03/08/2017 18:15, Stephen Hemminger: > >>> On Thu, 3 Aug 2017 14:21:38 +0100 > >>> Bruce Richardson wrote: > >>> > On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: > > To control some device-specific features public device-specific > > functions > > rte_pmd_*.h are used. > > > > But this solution requires applications to distinguish devices at > > runtime > > and, depending on the device type, call corresponding device-specific > > functions even if functions' parameters are the same. > > > > IOCTL-like API can be added to ethdev instead of public device-specific > > functions to address the following: > > > > * allow more usable support of features across a range of NIC from > > one vendor, but not others > > * allow features to be implemented by multiple NIC drivers without > > relying on a critical mass to get the functionality in ethdev > > * there are a large number of possible device specific functions, and > > creating individual APIs for each one is not a good solution > > * IOCTLs are a proven method for solving this problem in other areas, > > i.e. OS kernels. > > > > Control requests for this API will be globally defined at ethdev level, > > so > > an application will use single API call to control different devices > > from > > one/multiple vendors. > > > > API call may look like as a classic ioctl with an extra parameter for > > argument length for better sanity checks: > > > > int > > rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, > >unsigned arg_length); > > > > Regards, > > Andrey > > I think we need to start putting in IOCTLs for ethdevs, much as I hate > to admit it, since I dislike IOCTLs and other functions with opaque > arguments! Having driver specific functions I don't think will scale > well as each vendor tries to expose as much of their driver specific > functionality as possible. > > One other additional example: I discovered just this week another issue > with driver specific functions and testpmd, when I was working on the > meson build rework. > > * With shared libraries, when we do "ninja install" we want our DPDK > libs moved to e.g. /usr/local/lib, but the drivers moved to a separate > driver folder, so that they can be automatically loaded from that > single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. > * However, testpmd, as well as using the drivers as plugins, uses > driver-specific functions, which means that it explicitly links > against the pmd .so files. > * Those driver .so files are not in with the other libraries, so ld.so > does not find the pmd, and the installed testpmd fails to run due to > missing library dependencies. > * The workaround is to add the drivers path to the ld load path, but we > should not require ld library path changes just to get DPDK apps to > work. > > Using ioctls instead of driver-specific functions would solve this. > > My 2c. > >>> > >>> My 2c. No. > >>> > >>> Short answer: > >>> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now > >>> despised by Linux kernel developers. They provide an unstructured, > >>> unsecured, > >>> back door for device driver abuse. Try to get a new driver in Linux with > >>> a unique ioctl, and it will be hard to get accepted. > >>> > >>> Long answer: > >>> So far every device specific feature has fit into ethdev model. Doing > >>> ioctl > >>> is admitting "it is too hard to be general, we need need an out". For > >>> something > >>> that is a flag, it should fit into existing config model; ignoring silly > >>> ABI constraints. > >>> For a real feature (think flow direction), we want a first class API for > >>> that. > >>> For a wart, then devargs will do. > >>> > >>> Give a good example of something that should be an ioctl. Don't build the > >>> API first and then let it get cluttered. > >> > >> I agree with Stephen. > >> > >> And please do not forget that ioctl still requires an API: > >> the argument that you put in ioctl is the API of the feature. > >> So it is the same thing as defining a new function. > > > > I am also not fan of the ioctl usage. I believe it hides APIs behind ids > > and prevent argument check by compiler. > > > > BUT, the number of the increasing PMD specific APIs are also worrying, > > it is becoming harder to maintain, and I believe this is something NOT > > sustainable in long run. > > > > > > What about having *eth_dev_extended_ops* ? > > > We had talk about adding something like dev
Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
> On Aug 8, 2017, at 1:02 PM, Stephen Hemminger > wrote: > > On Tue, 8 Aug 2017 17:28:19 + > "Wiles, Keith" wrote: > >> Fix format. >> >>> On Aug 4, 2017, at 6:58 AM, Ferruh Yigit wrote: >>> >>> On 8/3/2017 8:53 PM, Thomas Monjalon wrote: 03/08/2017 18:15, Stephen Hemminger: > On Thu, 3 Aug 2017 14:21:38 +0100 > Bruce Richardson wrote: > >> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote: >>> To control some device-specific features public device-specific >>> functions >>> rte_pmd_*.h are used. >>> >>> But this solution requires applications to distinguish devices at >>> runtime >>> and, depending on the device type, call corresponding device-specific >>> functions even if functions' parameters are the same. >>> >>> IOCTL-like API can be added to ethdev instead of public device-specific >>> functions to address the following: >>> >>> * allow more usable support of features across a range of NIC from >>> one vendor, but not others >>> * allow features to be implemented by multiple NIC drivers without >>> relying on a critical mass to get the functionality in ethdev >>> * there are a large number of possible device specific functions, and >>> creating individual APIs for each one is not a good solution >>> * IOCTLs are a proven method for solving this problem in other areas, >>> i.e. OS kernels. >>> >>> Control requests for this API will be globally defined at ethdev level, >>> so >>> an application will use single API call to control different devices >>> from >>> one/multiple vendors. >>> >>> API call may look like as a classic ioctl with an extra parameter for >>> argument length for better sanity checks: >>> >>> int >>> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp, >>> unsigned arg_length); >>> >>> Regards, >>> Andrey >> >> I think we need to start putting in IOCTLs for ethdevs, much as I hate >> to admit it, since I dislike IOCTLs and other functions with opaque >> arguments! Having driver specific functions I don't think will scale >> well as each vendor tries to expose as much of their driver specific >> functionality as possible. >> >> One other additional example: I discovered just this week another issue >> with driver specific functions and testpmd, when I was working on the >> meson build rework. >> >> * With shared libraries, when we do "ninja install" we want our DPDK >> libs moved to e.g. /usr/local/lib, but the drivers moved to a separate >> driver folder, so that they can be automatically loaded from that >> single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH]. >> * However, testpmd, as well as using the drivers as plugins, uses >> driver-specific functions, which means that it explicitly links >> against the pmd .so files. >> * Those driver .so files are not in with the other libraries, so ld.so >> does not find the pmd, and the installed testpmd fails to run due to >> missing library dependencies. >> * The workaround is to add the drivers path to the ld load path, but we >> should not require ld library path changes just to get DPDK apps to >> work. >> >> Using ioctls instead of driver-specific functions would solve this. >> >> My 2c. > > My 2c. No. > > Short answer: > Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now > despised by Linux kernel developers. They provide an unstructured, > unsecured, > back door for device driver abuse. Try to get a new driver in Linux with > a unique ioctl, and it will be hard to get accepted. > > Long answer: > So far every device specific feature has fit into ethdev model. Doing > ioctl > is admitting "it is too hard to be general, we need need an out". For > something > that is a flag, it should fit into existing config model; ignoring silly > ABI constraints. > For a real feature (think flow direction), we want a first class API for > that. > For a wart, then devargs will do. > > Give a good example of something that should be an ioctl. Don't build the > API first and then let it get cluttered. I agree with Stephen. And please do not forget that ioctl still requires an API: the argument that you put in ioctl is the API of the feature. So it is the same thing as defining a new function. >>> >>> I am also not fan of the ioctl usage. I believe it hides APIs behind ids >>> and prevent argument check by compiler. >>> >>> BUT, the number of the increasing PMD specific APIs are also worrying, >>> it is becoming harder to maintain, and I believe this is something NOT >>> sustainable in long run. >>> >>> >>> What about having *eth_dev_extended_ops*
[dpdk-dev] [dpdk-announce] DPDK 17.08 released
A new major release is available: http://fast.dpdk.org/rel/dpdk-17.08.tar.xz Some highlights: - x86 requires SSE4.2 - more ARM optimizations using NEON - service cores API - GRO library - traffic management API (QoS) - ethdev flow fuzzy match - ethdev flow isolate mode - lock-free Tx queue capability API - failsafe driver - cryptodev rework - eventdev burst mode - NXP DPAA2 eventdev driver - dpdk-test-eventdev application - eventdev_pipeline_sw_pmd example More details in the release notes: http://dpdk.org/doc/guides/rel_notes/release_17_08.html A lot of work was done during 3 months: 1023 patches from 125 authors 1018 files changed, 90842 insertions(+), 22245 deletions(-) There are 38 new contributors (including authors, reviewers and testers): Thanks to Ashwin Sekhar T K, Balasubramanian Manoharan, Boris Pismenny, Changpeng Liu, Cian Ferriter, Dahir Osman, Dariusz Stojaczyk, Dirk-Holger Lenz, Gabriel Carrillo, George Wilkie, Harrison McCullough, Herbert Guan, Ivan Dyukov, Jamie Lavigne, Jens Freimann, Jesse Bruni, Kirill Rybalchenko, Leonid Myravjev, Mandeep Rohilla, Markus Theil, Matan Azrad, Michael Lilja, Moti Haimovsky, Radu Nicolau, Ray Kinsella, Robert Shearman, RongQiang Xie, Sangjin Han, Sha Zhang, Shachar Beiser, Shahed Shaikh, Steeven Li, Tom Barbette, Tonghao Zhang, Vincent S. Cojot, Wen Chiu, Xiaoyun Li and Xingyou Chen. Below is the number of patches per authors grouped per company (accuracy may be not perfect): 423 Intel (53) 130 Cavium (9) 100 6WIND (6) 89 NXP (4) 75 Mellanox (9) 49 Broadcom (1) 31 unknown (8) 20 Chelsio (1) 13 Microsoft (1) 13 Brocade (6) 12 Solarflare (2) 8 Cisco (1) 7 Linaro (1) 7 CESNET (1) 7 Atomic Rules (1) 6 RedHat (4) 5 Samsung (2) 5 OKTET Labs (2) 3 Wind River (2) 3 IBM (1) 3 ARM (1) 2 Amazon (2) 1 Semihalf (1) 1 Netronome (1) 1 Napatech (1) 1 Huawei (1) 1 HP (1) 1 Big Switch (1) 1 AT&T (1) The new features for the 17.11 cycle must be submitted before August 25, in order to be reviewed and integrated during September. The next release is expected to happen at the very beginning of November. Let's continue the good work! Thanks everyone
Re: [dpdk-dev] [PATCH 1/2] eventdev: add event adapter for ethernet Rx queues
> -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Thursday, August 3, 2017 1:23 AM > To: Eads, Gage > Cc: Rao, Nikhil ; dev@dpdk.org; tho...@monjalon.net; > Richardson, Bruce ; Van Haaren, Harry > ; hemant.agra...@nxp.com; > nipun.gu...@nxp.com; Vangati, Narender ; > Gujjar, Abhinandan S > Subject: Re: [PATCH 1/2] eventdev: add event adapter for ethernet Rx queues > > -Original Message- > > Date: Wed, 2 Aug 2017 19:19:32 + > > From: "Eads, Gage" > > To: Jerin Jacob , "Rao, Nikhil" > > > > CC: "dev@dpdk.org" , "tho...@monjalon.net" > > , "Richardson, Bruce" > > , "Van Haaren, Harry" > , "hemant.agra...@nxp.com" > > , "nipun.gu...@nxp.com" > > , "Vangati, Narender" > , "Gujjar, Abhinandan S" > > > > Subject: RE: [PATCH 1/2] eventdev: add event adapter for ethernet Rx > > queues > > > > > > > > > > > > > > > > > > 5) specifying rte_event_eth_rx_adapter_conf.rx_event_port_id on > > > > > rte_event_eth_rx_adapter_create() would waste one HW eventdev > > > > > port if its happen to be used RX_ADAPTER_CAP_INBUILT_PORT on > > > rte_event_eth_rx_adapter_queue_add(). > > > > > unlike SW eventdev port, HW eventdev ports are costly so I > > > > > think, We need to have another eventdev PMD ops to create > service/producer ports. > > > > > Or any other scheme that creates > > > > > rte_event_eth_rx_adapter_conf.rx_event_port_id > > > > > on demand by common code. > > > > > > > > > > > > > One solution is: > > > > > > > > struct rte_event_eth_rx_adapter_conf { > > > > uint8_t dev_id; > > > > > > > > int (*conf_cb)(uint8_t id, uint8_t port_id, uint32_t flags, > > > > struct rte_event_eth_rx_adapter_conf *conf); > > > > > > > > unsigned int max_nb_rx; > > > > > > > > int event_port_id; > > > > > > > > char service_name[]; > > > > } > > > > > > > > Where dev_id and conf_cb have to be specified in the create call, > > > > but event_port_id and service_name will be filled in when > > > > conf_cb() is invoked > > > > > > I was thinking like event_port_id will be rte_event_port_count() + 1. > > > ie When adapter needs the additional port, It can > > > - stop the eventdev > > > - reconfigure with rte_event_queue_count() , rte_event_port_count() > > > + 1 > > > - start the eventdev. > > > > > > The only problem with callback is that all the application needs to > > > implement > it. > > > If you think, application need more control then we can expose > > > callback and if it is NULL then default handler can be called in common > > > code. > > > > > > > I don't think we can rely on there being another port available -- a user > > may > have configured the sw eventdev with all 64 ports, for instance. > > On that case, irrespective any scheme(callback vs non callback) the adapter > creation would fail. Right? > > > What if the user is required to calculate cfg.nb_event_ports as a function > > of > the RX_ADAPTER_CAP_INBUILT_PORT capability (i.e. add a port if the capability > is not set), such that a reconfigure is not required? > > We have only one NON INBUILT eventdev port per adapter. Right? i.e in the v1 > spec it was rte_event_eth_rx_adapter_conf.event_port_id, > How about it can be rte_event_port_count() + 1 ? Since we are NOT linking this > port, the context call be kept in adapter itself. Right? It could be. Thinking on it some more, I'm a little concerned about doing configuration without the application's knowledge. Possible issues that could arise: - The user later reconfigures the event device with fewer ports and the adapter's port becomes invalid, or reconfigures it with more ports and begins using the port the adapter is using - rte_event_port_count() + 1 extends beyond the PMD's capabilities (the sw PMD is hard-coded to support a max of 64 ports, for example) Having the user be responsible for the port configuration could avoid these problems. Since the user needs to check the pair's capabilities for the CAP_ADD_QUEUE anyway, they could also check for INBUILT_PORT and decide whether or not to request an additional port at eventdev configure time -- thereby ensuring they don't waste a port when using hardware with inbuilt ports. And this keeps the configuration code in one place (the app), rather than spread across the app, adapter, and potentially the conf_cb. Besides these concerns, I think the transparent configuration approach (plus conf_cb when necessary to override) would work, but could have issues in the aforementioned edge cases. > > > > As for application control: that would be a useful option in the conf_cb > scheme. Some apps will want to configure the adapter's port (its > new_event_threshold, its queue depths) differently from the default. > > struct rte_event_port_conf * can be passed on the adapter create if > application > needs more control. > > > > > Thanks, > > Gage