[dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev

2017-08-08 Thread Akhil Goyal
Support for security operations is planned to be added
in ethdev and cryptodev for the 17.11 release.

For this following changes are required.
- rte_cryptodev and rte_eth_dev structures need to be added
new parameter rte_security_ops which extend support for
security ops to the corresponding driver.
- rte_cryptodev_info and rte_eth_dev_info need to be added
with rte_security_capabilities to identify the capabilities of
the corresponding driver.
- rte_security_session needs to be added in the union inside
structure rte_crypto_sym_op to decide between various types of
sessions

Signed-off-by: Akhil Goyal 
Acked-by: Hemant Agrawal 
Acked-by: Boris Pismenny 
Acked-by: Shahaf Shuler 
Acked-by: Pablo de Lara 
---
changes in v2:
Added one more ABI change wrt to security session,
This patch is not split into 3 patches as these all will be implemented
simultaneously with rte_security support.

 doc/guides/rel_notes/deprecation.rst | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 72d1f35..e34a809 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -78,3 +78,18 @@ Deprecation Notices
   be removed in 17.11:
 
   - ``rte_cryptodev_create_vdev``
+
+* cryptodev: new parameters - ``rte_security_capabilities`` and
+  ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and
+  ``rte_cryptodev`` respectively to support security protocol offloaded
+  operations.
+
+* cryptodev: new parameter ``rte_security_session`` will be added in the union
+  of the structure ``rte_crypto_sym_op``, so that the user can choose either to
+  use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or
+  ``rte_security_session``.
+
+* ethdev: new parameters - ``rte_security_capabilities`` and
+  ``rte_security_ops`` will be added to ``rte_eth_dev_info`` and
+  ``rte_eth_dev`` respectively  to support security operations like
+  ipsec inline.
-- 
2.9.3



[dpdk-dev] Why IVSHMEM was removed since 16.11 ?

2017-08-08 Thread Furong
The release notes of dpdk-16.11 had shown that IVSHMEM was removed due 
to some design issues.


So, what are these issues?

Thanks!




Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Alexey Kardashevskiy
On 08/08/17 01:11, Jonas Pfefferle wrote:
> DMA window size needs to be big enough to span all memory segment's
> physical addresses. We do not need multiple levels of IOMMU tables
> as we already span ~70TB of physical memory with 16MB hugepages.
> 
> Signed-off-by: Jonas Pfefferle 
> ---
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index 946df7e..8502216 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd)
>   return 0;
>  }
>  
> +static uint64_t
> +roundup_next_pow2(uint64_t n)
> +{
> + uint32_t i;
> +
> + n--;
> + for (i = 1; i < sizeof(n) * CHAR_BIT; i += i)
> + n |= n >> i;
> +
> + return ++n;
> +}
> +

wow :)

QEMU does it using __builtin_ctzll() (used below for the page_shift)
without a loop:

https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host-utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382


Anyway, seems working.


Reviewed-by: Alexey Kardashevskiy 




>  static int
>  vfio_spapr_dma_map(int vfio_container_fd)
>  {
> @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
>  
> - /* calculate window size based on number of hugepages configured */
> - create.window_size = rte_eal_get_physmem_size();
> + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */
> + /* create DMA window from 0 to max(phys_addr + len) */
> + /* sPAPR requires window size to be a power of 2 */
> + create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len);
>   create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> - create.levels = 2;
> + create.levels = 1;
>  
>   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
>   if (ret) {
> @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
>  
> + if (create.start_addr != 0) {
> + RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> + return -1;
> + }
> +
>   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
>   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>   struct vfio_iommu_type1_dma_map dma_map;
> 


-- 
Alexey


Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle1
Alexey Kardashevskiy  wrote on 08/08/2017 09:38:00 AM:

> From: Alexey Kardashevskiy 
> To: Jonas Pfefferle , anatoly.bura...@intel.com
> Cc: dev@dpdk.org
> Date: 08/08/2017 09:38 AM
> Subject: Re: [PATCH] vfio: fix sPAPR IOMMU DMA window size
>
> On 08/08/17 01:11, Jonas Pfefferle wrote:
> > DMA window size needs to be big enough to span all memory segment's
> > physical addresses. We do not need multiple levels of IOMMU tables
> > as we already span ~70TB of physical memory with 16MB hugepages.
> >
> > Signed-off-by: Jonas Pfefferle 
> > ---
> >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/
> librte_eal/linuxapp/eal/eal_vfio.c
> > index 946df7e..8502216 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd)
> > return 0;
> >  }
> >
> > +static uint64_t
> > +roundup_next_pow2(uint64_t n)
> > +{
> > +   uint32_t i;
> > +
> > +   n--;
> > +   for (i = 1; i < sizeof(n) * CHAR_BIT; i += i)
> > +  n |= n >> i;
> > +
> > +   return ++n;
> > +}
> > +
>
> wow :)
>
> QEMU does it using __builtin_ctzll() (used below for the page_shift)
> without a loop:
>
> https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/
> host-
>
utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382

>
>
> Anyway, seems working.

Ok let me fix that :)

>
>
> Reviewed-by: Alexey Kardashevskiy 
>
>
>
>
> >  static int
> >  vfio_spapr_dma_map(int vfio_container_fd)
> >  {
> > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> >return -1;
> > }
> >
> > -   /* calculate window size based on number of hugepages configured */
> > -   create.window_size = rte_eal_get_physmem_size();
> > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is
max */
> > +   /* create DMA window from 0 to max(phys_addr + len) */
> > +   /* sPAPR requires window size to be a power of 2 */
> > +   create.window_size = roundup_next_pow2(ms[0].phys_addr + ms
[0].len);
> > create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > -   create.levels = 2;
> > +   create.levels = 1;
> >
> > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
&create);
> > if (ret) {
> > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> >return -1;
> > }
> >
> > +   if (create.start_addr != 0) {
> > +  RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > +  return -1;
> > +   }
> > +
> > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> >struct vfio_iommu_type1_dma_map dma_map;
> >
>
>
> --
> Alexey
>


Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev

2017-08-08 Thread De Lara Guarch, Pablo
Hi Akhil,

> -Original Message-
> From: Akhil Goyal [mailto:akhil.go...@nxp.com]
> Sent: Tuesday, August 8, 2017 8:10 AM
> To: dev@dpdk.org; Doherty, Declan ;
> tho...@monjalon.net; Nicolau, Radu ;
> avia...@mellanox.com; bor...@mellanox.com;
> hemant.agra...@nxp.com; De Lara Guarch, Pablo
> 
> Cc: shah...@mellanox.com; Akhil Goyal 
> Subject: [PATCH v2] doc: announce ABI change for cryptodev and ethdev
> 
> Support for security operations is planned to be added in ethdev and
> cryptodev for the 17.11 release.
> 
> For this following changes are required.
> - rte_cryptodev and rte_eth_dev structures need to be added new
> parameter rte_security_ops which extend support for security ops to the
> corresponding driver.
> - rte_cryptodev_info and rte_eth_dev_info need to be added with
> rte_security_capabilities to identify the capabilities of the corresponding
> driver.
> - rte_security_session needs to be added in the union inside structure
> rte_crypto_sym_op to decide between various types of sessions
> 
> Signed-off-by: Akhil Goyal 
> Acked-by: Hemant Agrawal 
> Acked-by: Boris Pismenny 
> Acked-by: Shahaf Shuler 
> Acked-by: Pablo de Lara 
> ---
> changes in v2:
> Added one more ABI change wrt to security session, This patch is not split
> into 3 patches as these all will be implemented simultaneously with
> rte_security support.
> 
>  doc/guides/rel_notes/deprecation.rst | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 72d1f35..e34a809 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -78,3 +78,18 @@ Deprecation Notices
>be removed in 17.11:
> 
>- ``rte_cryptodev_create_vdev``
> +
> +* cryptodev: new parameters - ``rte_security_capabilities`` and
> +  ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and
> +  ``rte_cryptodev`` respectively to support security protocol offloaded
> +  operations.
> +
> +* cryptodev: new parameter ``rte_security_session`` will be added in
> +the union
> +  of the structure ``rte_crypto_sym_op``, so that the user can choose
> +either to
> +  use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or
> +  ``rte_security_session``.

I don't think it is required to have a deprecation notice for this, as it is 
adding
another element in a union, and it is not changing its size.
However, this would require an addition in rte_crypto_op_sess_type, which,
as long as it is added at the end, should not require a deprecation notice.

About splitting this into two patches, I would do it, as I believe deprecation 
notices
should target a specific library.




Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexey Kardashevskiy
> Sent: Tuesday, August 8, 2017 10:38 AM
> To: Jonas Pfefferle ; Burakov, Anatoly 
> 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
> 
> On 08/08/17 01:11, Jonas Pfefferle wrote:
> > DMA window size needs to be big enough to span all memory segment's
> > physical addresses. We do not need multiple levels of IOMMU tables
> > as we already span ~70TB of physical memory with 16MB hugepages.
> >
> > Signed-off-by: Jonas Pfefferle 
> > ---
> >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 ++---
> >  1 file changed, 22 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
> > b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > index 946df7e..8502216 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd)
> > return 0;
> >  }
> >
> > +static uint64_t
> > +roundup_next_pow2(uint64_t n)
> > +{
> > +   uint32_t i;
> > +
> > +   n--;
> > +   for (i = 1; i < sizeof(n) * CHAR_BIT; i += i)
> > +   n |= n >> i;
> > +
> > +   return ++n;
> > +}
> > +
> 
> wow :)
> 
> QEMU does it using __builtin_ctzll() (used below for the page_shift)
> without a loop:
> 
> https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host-
> utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382
> 
> 
> Anyway, seems working.

As I remember, there already exists rte_align64pow2().
Konstantin

> 
> 
> Reviewed-by: Alexey Kardashevskiy 
> 
> 
> 
> 
> >  static int
> >  vfio_spapr_dma_map(int vfio_container_fd)
> >  {
> > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > return -1;
> > }
> >
> > -   /* calculate window size based on number of hugepages configured */
> > -   create.window_size = rte_eal_get_physmem_size();
> > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */
> > +   /* create DMA window from 0 to max(phys_addr + len) */
> > +   /* sPAPR requires window size to be a power of 2 */
> > +   create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len);
> > create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > -   create.levels = 2;
> > +   create.levels = 1;
> >
> > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
> > if (ret) {
> > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > return -1;
> > }
> >
> > +   if (create.start_addr != 0) {
> > +   RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > +   return -1;
> > +   }
> > +
> > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> > struct vfio_iommu_type1_dma_map dma_map;
> >
> 
> 
> --
> Alexey


[dpdk-dev] [PATCH v2] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle
DMA window size needs to be big enough to span all memory segment's
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.

Signed-off-by: Jonas Pfefferle 
---
v2:
* roundup to next power 2 function without loop.

lib/librte_eal/linuxapp/eal/eal_vfio.c | 42 +++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e..a3f9977 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -722,6 +722,35 @@ vfio_type1_dma_map(int vfio_container_fd)
return 0;
 }
 
+static inline int
+clz64(uint64_t val)
+{
+   return val ? __builtin_clzll(val) : 64;
+}
+
+static inline bool
+is_power_of_2(uint64_t value)
+{
+   if (!value)
+   return false;
+
+   return !(value & (value - 1));
+}
+
+static inline uint64_t
+roundup_next_pow2(uint64_t value)
+{
+   uint8_t nlz = clz64(value);
+
+   if (is_power_of_2(value))
+   return value;
+
+   if (!nlz)
+   return 0;
+
+   return 1ULL << (64 - nlz);
+}
+
 static int
 vfio_spapr_dma_map(int vfio_container_fd)
 {
@@ -759,10 +788,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
-   /* calculate window size based on number of hugepages configured */
-   create.window_size = rte_eal_get_physmem_size();
+   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */
+   /* create DMA window from 0 to max(phys_addr + len) */
+   /* sPAPR requires window size to be a power of 2 */
+   create.window_size = roundup_next_pow2(ms[0].phys_addr + ms[0].len);
create.page_shift = __builtin_ctzll(ms->hugepage_sz);
-   create.levels = 2;
+   create.levels = 1;
 
ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
if (ret) {
@@ -771,6 +802,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
+   if (create.start_addr != 0) {
+   RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
+   return -1;
+   }
+
/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
for (i = 0; i < RTE_MAX_MEMSEG; i++) {
struct vfio_iommu_type1_dma_map dma_map;
-- 
2.7.4



Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Ferruh Yigit
On 8/4/2017 1:56 PM, Bruce Richardson wrote:
> On Fri, Aug 04, 2017 at 12:58:01PM +0100, Ferruh Yigit wrote:
>> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:
>>> 03/08/2017 18:15, Stephen Hemminger:
 On Thu, 3 Aug 2017 14:21:38 +0100
 Bruce Richardson  wrote:

> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
>> To control some device-specific features public device-specific functions
>> rte_pmd_*.h are used.
>>
>> But this solution requires applications to distinguish devices at runtime
>> and, depending on the device type, call corresponding device-specific
>> functions even if functions' parameters are the same.
>>
>> IOCTL-like API can be added to ethdev instead of public device-specific
>> functions to address the following:
>>
>> * allow more usable support of features across a range of NIC from
>>   one vendor, but not others
>> * allow features to be implemented by multiple NIC drivers without
>>   relying on a critical mass to get the functionality in ethdev
>> * there are a large number of possible device specific functions, and
>>   creating individual APIs for each one is not a good solution
>> * IOCTLs are a proven method for solving this problem in other areas,
>>   i.e. OS kernels.
>>
>> Control requests for this API will be globally defined at ethdev level, 
>> so
>> an application will use single API call to control different devices from
>> one/multiple vendors.
>>
>> API call may look like as a classic ioctl with an extra parameter for
>> argument length for better sanity checks:
>>
>> int
>> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
>> unsigned arg_length);
>>
>> Regards,
>> Andrey  
>
> I think we need to start putting in IOCTLs for ethdevs, much as I hate
> to admit it, since I dislike IOCTLs and other functions with opaque
> arguments! Having driver specific functions I don't think will scale
> well as each vendor tries to expose as much of their driver specific
> functionality as possible.
>
> One other additional example: I discovered just this week another issue
> with driver specific functions and testpmd, when I was working on the
> meson build rework.
>
> * With shared libraries, when we do "ninja install" we want our DPDK
>   libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
>   driver folder, so that they can be automatically loaded from that
>   single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
> * However, testpmd, as well as using the drivers as plugins, uses
>   driver-specific functions, which means that it explicitly links
>   against the pmd .so files.
> * Those driver .so files are not in with the other libraries, so ld.so
>   does not find the pmd, and the installed testpmd fails to run due to
>   missing library dependencies.
> * The workaround is to add the drivers path to the ld load path, but we
>   should not require ld library path changes just to get DPDK apps to
>   work.
>
> Using ioctls instead of driver-specific functions would solve this.
>
> My 2c.

 My 2c. No.

 Short answer:
 Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
 despised by Linux kernel developers. They provide an unstructured, 
 unsecured,
 back door for device driver abuse. Try to get a new driver in Linux with
 a unique ioctl, and it will be hard to get accepted.

 Long answer:
 So far every device specific feature has fit into ethdev model. Doing ioctl
 is admitting "it is too hard to be general, we need need an out". For 
 something
 that is a flag, it should fit into existing config model; ignoring silly 
 ABI constraints.
 For a real feature (think flow direction), we want a first class API for 
 that.
 For a wart, then devargs will do.

 Give a good example of something that should be an ioctl. Don't build the
 API first and then let it get cluttered.
>>>
>>> I agree with Stephen.
>>>
>>> And please do not forget that ioctl still requires an API:
>>> the argument that you put in ioctl is the API of the feature.
>>> So it is the same thing as defining a new function.
>>
>> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
>> and prevent argument check by compiler.
>>
>> BUT, the number of the increasing PMD specific APIs are also worrying,
>> it is becoming harder to maintain, and I believe this is something NOT
>> sustainable in long run.
>>
>>
>> What about having *eth_dev_extended_ops* ?
>>
>>
>> As a part of the rte_eth_dev. This can be in the librte_ether library
>> but in a separated file.
>>
>> And the APIs for these ops can be less strict on compatibility, and
>> easier to add.
>>
>> Benefits of having this new dev_ops:

Re: [dpdk-dev] [PATCH v3] doc: announce ethdev API change for detach flag

2017-08-08 Thread Thomas Monjalon
31/07/2017 11:40, Gaetan Rivet:
> The flag RTE_ETH_DEV_DETACHABLE will disappear.
> 
> This flag is not needed anymore following the hotplug work done for
> v17.08. It can be removed, its function is now implicitly made available
> by the relevant EAL and rte_bus implementations.
> 
> Signed-off-by: Gaetan Rivet 
> Acked-by: John McNamara 
> Acked-by: Adrien Mazarguil 

Acked-by: Thomas Monjalon 

Applied, thanks



[dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle
DMA window size needs to be big enough to span all memory segment's
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.

Signed-off-by: Jonas Pfefferle 
---
v2:
* roundup to next power 2 function without loop.

v3:
* Replace roundup_next_pow2 with rte_align64pow2

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e..550c41c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
-   /* calculate window size based on number of hugepages configured */
-   create.window_size = rte_eal_get_physmem_size();
+   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max */
+   /* create DMA window from 0 to max(phys_addr + len) */
+   /* sPAPR requires window size to be a power of 2 */
+   create.window_size = rte_align64pow2(ms[0].phys_addr + ms[0].len);
create.page_shift = __builtin_ctzll(ms->hugepage_sz);
-   create.levels = 2;
+   create.levels = 1;
 
ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
if (ret) {
@@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
+   if (create.start_addr != 0) {
+   RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
+   return -1;
+   }
+
/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
for (i = 0; i < RTE_MAX_MEMSEG; i++) {
struct vfio_iommu_type1_dma_map dma_map;
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for cryptodev and ethdev

2017-08-08 Thread Akhil Goyal

Hi Pablo,

On 8/8/2017 1:38 PM, De Lara Guarch, Pablo wrote:

Hi Akhil,


-Original Message-
From: Akhil Goyal [mailto:akhil.go...@nxp.com]
Sent: Tuesday, August 8, 2017 8:10 AM
To: dev@dpdk.org; Doherty, Declan ;
tho...@monjalon.net; Nicolau, Radu ;
avia...@mellanox.com; bor...@mellanox.com;
hemant.agra...@nxp.com; De Lara Guarch, Pablo

Cc: shah...@mellanox.com; Akhil Goyal 
Subject: [PATCH v2] doc: announce ABI change for cryptodev and ethdev

Support for security operations is planned to be added in ethdev and
cryptodev for the 17.11 release.

For this following changes are required.
- rte_cryptodev and rte_eth_dev structures need to be added new
parameter rte_security_ops which extend support for security ops to the
corresponding driver.
- rte_cryptodev_info and rte_eth_dev_info need to be added with
rte_security_capabilities to identify the capabilities of the corresponding
driver.
- rte_security_session needs to be added in the union inside structure
rte_crypto_sym_op to decide between various types of sessions

Signed-off-by: Akhil Goyal 
Acked-by: Hemant Agrawal 
Acked-by: Boris Pismenny 
Acked-by: Shahaf Shuler 
Acked-by: Pablo de Lara 
---
changes in v2:
Added one more ABI change wrt to security session, This patch is not split
into 3 patches as these all will be implemented simultaneously with
rte_security support.

  doc/guides/rel_notes/deprecation.rst | 15 +++
  1 file changed, 15 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst
b/doc/guides/rel_notes/deprecation.rst
index 72d1f35..e34a809 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -78,3 +78,18 @@ Deprecation Notices
be removed in 17.11:

- ``rte_cryptodev_create_vdev``
+
+* cryptodev: new parameters - ``rte_security_capabilities`` and
+  ``rte_security_ops`` will be added to ``rte_cryptodev_info`` and
+  ``rte_cryptodev`` respectively to support security protocol offloaded
+  operations.
+
+* cryptodev: new parameter ``rte_security_session`` will be added in
+the union
+  of the structure ``rte_crypto_sym_op``, so that the user can choose
+either to
+  use ``rte_cryptodev_sym_session`` or ``rte_crypto_sym_xform`` or
+  ``rte_security_session``.


I don't think it is required to have a deprecation notice for this, as it is 
adding
another element in a union, and it is not changing its size.
However, this would require an addition in rte_crypto_op_sess_type, which,
as long as it is added at the end, should not require a deprecation notice.

About splitting this into two patches, I would do it, as I believe deprecation 
notices
should target a specific library.





Thanks for the comments. I have marked the v2 as not applicable and 
marked v1 as in review in the patchwork.


-Akhil



Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle1
"Ananyev, Konstantin"  wrote on 08/08/2017
10:27:28 AM:

> From: "Ananyev, Konstantin" 
> To: Alexey Kardashevskiy , Jonas Pfefferle
> , "Burakov, Anatoly" 
> Cc: "dev@dpdk.org" 
> Date: 08/08/2017 10:27 AM
> Subject: RE: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
>
>
>
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Alexey
Kardashevskiy
> > Sent: Tuesday, August 8, 2017 10:38 AM
> > To: Jonas Pfefferle ; Burakov, Anatoly
> 
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] vfio: fix sPAPR IOMMU DMA window size
> >
> > On 08/08/17 01:11, Jonas Pfefferle wrote:
> > > DMA window size needs to be big enough to span all memory segment's
> > > physical addresses. We do not need multiple levels of IOMMU tables
> > > as we already span ~70TB of physical memory with 16MB hugepages.
> > >
> > > Signed-off-by: Jonas Pfefferle 
> > > ---
> > >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 25 +
+---
> > >  1 file changed, 22 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/
> librte_eal/linuxapp/eal/eal_vfio.c
> > > index 946df7e..8502216 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > @@ -722,6 +722,18 @@ vfio_type1_dma_map(int vfio_container_fd)
> > > return 0;
> > >  }
> > >
> > > +static uint64_t
> > > +roundup_next_pow2(uint64_t n)
> > > +{
> > > +   uint32_t i;
> > > +
> > > +   n--;
> > > +   for (i = 1; i < sizeof(n) * CHAR_BIT; i += i)
> > > +  n |= n >> i;
> > > +
> > > +   return ++n;
> > > +}
> > > +
> >
> > wow :)
> >
> > QEMU does it using __builtin_ctzll() (used below for the page_shift)
> > without a loop:
> >
> > https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=include/qemu/host-
> >
>
utils.h;h=95cf4f4163e50457cdf808263065ca5ef3f935da;hb=f22ab6cb0c47bd2a2785b7d58130949bd7d8d9af#l382

> >
> >
> > Anyway, seems working.
>
> As I remember, there already exists rte_align64pow2().
> Konstantin

Thanks. Fixed it.

>
> >
> >
> > Reviewed-by: Alexey Kardashevskiy 
> >
> >
> >
> >
> > >  static int
> > >  vfio_spapr_dma_map(int vfio_container_fd)
> > >  {
> > > @@ -759,10 +771,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > >return -1;
> > > }
> > >
> > > -   /* calculate window size based on number of hugepages configured
*/
> > > -   create.window_size = rte_eal_get_physmem_size();
> > > +   /* physicaly pages are sorted descending i.e. ms
> [0].phys_addr is max */
> > > +   /* create DMA window from 0 to max(phys_addr + len) */
> > > +   /* sPAPR requires window size to be a power of 2 */
> > > +   create.window_size = roundup_next_pow2(ms[0].phys_addr + ms
[0].len);
> > > create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > > -   create.levels = 2;
> > > +   create.levels = 1;
> > >
> > > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
&create);
> > > if (ret) {
> > > @@ -771,6 +785,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > >return -1;
> > > }
> > >
> > > +   if (create.start_addr != 0) {
> > > +  RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > > +  return -1;
> > > +   }
> > > +
> > > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > > for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> > >struct vfio_iommu_type1_dma_map dma_map;
> > >
> >
> >
> > --
> > Alexey


Re: [dpdk-dev] using rte_flow via tespmd with Intel X552

2017-08-08 Thread Ferruh Yigit
On 8/2/2017 2:41 PM, Adrien Mazarguil wrote:
> Hi Wenzhuo,
> 
> On Wed, Aug 02, 2017 at 01:00:47PM +, Lu, Wenzhuo wrote:
>> Hi,
>>
>>> -Original Message-
>>> From: Zhao1, Wei
>>> Sent: Wednesday, August 2, 2017 3:28 AM
>>> To: TETSURO NAKAMURA 
>>> Cc: TAKADA Naoki ; Satoshi NISHIYAMA
>>> ; Lu, Wenzhuo ;
>>> dev@dpdk.org
>>> Subject: RE: using rte_flow via tespmd with Intel X552
>>>
>>> Hi, NAKAMURA
>>>
>>> I have upload 2 documents about command format when creating flow for
>>> NIC igb and ixgbe in the email attachment.
>>> I have decided to commit the context in the doc to doc\guides\nics\ixgbe.rst
>>> before, BUT by now it seems the content of the document is too much To
>>> store there. So, is there any other doc more suitable in DPDK to record 
>>> this.
>>> And welcome for any other suggestion.
>>> AND, Wenzhuo, what is your opinion for this?
>> I also have the same feeling that it may make the ixgbe.rst too big. 
>> + John and Adrien. Hi John,  Adrien, I'm think about creating a specific doc 
>> for rte flow, we can include the examples for every NIC here. Is it good? If 
>> so, where's the good place to put it? Thanks.
> 
> Depends on the kind of examples. Code samples or testpmd flow command?
> Unless you mean a document that describes individual rte_flow features
> supported by each NIC?
> 
> Code samples could be added to a new "Programming flow rules" (or something)
> section in doc/guide/prog_guide/rte_flow.rst.
> 
> Another section about NICs could be added in that documentation as well but
> should only contain a short summary for each of them and a link to the
> relevant PMD documentation in nics/$PMD.rst (or elsewhere in the DPDK
> tree), if any.
> 
> If you want to describe individual rte_flow features supported by each NIC,
> I think there's no other choice but to create a bunch of
> doc/guides/nics/features/rte_flow/*.ini files.

Indeed I was thinking doing a change in .ini files related rte flow in 17.11

rte_flow is a method to implement some filtering features, rte_flow
itself is not a device feature. And filtering features already listed in
.ini file.

So I was thinking removing rte_flow from feature table, and update
existing filter features as something like:
"Ethertype filter = L"  --> Feature exist and implemented using Legacy
method.
"Ethertype filter = Y"  --> Feature exist and implemented using rte flow

Does it make sense?

> 
 -Original Message-
 From: TETSURO NAKAMURA [mailto:nakamura.tets...@lab.ntt.co.jp]
 Sent: Wednesday, August 2, 2017 12:22 PM
 To: Zhao1, Wei 
 Cc: TAKADA Naoki ; Satoshi NISHIYAMA
 ; Lu, Wenzhuo ;
 dev@dpdk.org
 Subject: Re: using rte_flow via tespmd with Intel X552

 Wei Zhao - san
 (Cc'ing Wenzhuo - san)

 According to the e-mail thread [1] in April, you have a document about
 command format when creating flows on ixgbe NIC.

 I would appreciate your uploading that document to community or could
 you just share it with me via an e-mail ??

 [1] http://dpdk.org/ml/archives/users/2017-April/001786.html

 Thanks,

 Tetsuro

 # re-sending the e-mail because I failed to submit the e-mail to
 us...@dpdk.org.

 --
 Tetsuro Nakamura  NTT Network Service
 Systems Laboratories
 TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11,
 Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan

>>
> 



Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()

2017-08-08 Thread Van Haaren, Harry
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton
> Sent: Monday, August 7, 2017 6:39 PM
> To: tho...@monjalon.net
> Cc: dev@dpdk.org; David Harton 
> Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to 
> rte_eth_stats_reset()
> 
> Some devices do not support reset of eth stats.  An application may
> need to know not to clear shadow stats if the device cannot.
> 
> rte_eth_stats_reset is updated to provide a return code to share
> whether the device supports reset or not.
> 
> Signed-off-by: David Harton 
> ---

Hi,

As far as I know changing the return type (void to int) of a function does 
*not* break ABI, but does "break" API as the application code should now check 
the return value. In theory the application could ignore the return value and 
current behavior is maintained.

The validate-abi.sh script says "Compatible" with the following item flagged:

Problems with Symbols
High 0
Medium 0
Low 1

Change>> Type of return value has been changed from void to int (4 bytes).
Effect>> Replacement of return type may indicate a change in its semantic 
meaning.

Perhaps somebody with more ABI expertise than I would double check the return 
value change?


Some smaller things inline below.

> v3:
> * overcame noob errors and figured out patch challenges
> * this release should finally be clean :)
> 
> v2:
> * fixed soft tab issue inserted while moving changes
> 
>  lib/librte_ether/rte_ethdev.c | 8 +---
>  lib/librte_ether/rte_ethdev.h | 4 +++-
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 0597641..f72cc5a 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1341,17 +1341,19 @@ rte_eth_stats_get(uint8_t port_id, struct 
> rte_eth_stats *stats)
>   return 0;
>  }
> 
> -void
> +int
>  rte_eth_stats_reset(uint8_t port_id)
>  {
>   struct rte_eth_dev *dev;
> 
> - RTE_ETH_VALID_PORTID_OR_RET(port_id);
> + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
>   dev = &rte_eth_devices[port_id];
> 
> - RTE_FUNC_PTR_OR_RET(*dev->dev_ops->stats_reset);
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP);
>   (*dev->dev_ops->stats_reset)(dev);
>   dev->data->rx_mbuf_alloc_failed = 0;
> +
> + return 0;
>  }
> 
>  static int
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 0adf327..d8ccd60 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2246,8 +2246,10 @@ int rte_eth_stats_get(uint8_t port_id, struct 
> rte_eth_stats
> *stats);
>   *
>   * @param port_id
>   *   The port identifier of the Ethernet device.
> + * @return
> + *   Zero if successful. Non-zero otherwise.

We should document all return values:

@retval 0 Success
@retval -EINVAL Invalid port_id provided
@retval -ENOTSUP Stats reset functionality not supported by PMD

The API change itself should probably be added to release notes, as
applications may wish to be aware this function has changed.

>   */
> -void rte_eth_stats_reset(uint8_t port_id);
> +int rte_eth_stats_reset(uint8_t port_id);
> 
>  /**
>   * Retrieve names of extended statistics of an Ethernet device.
> --
> 2.10.3.dirty


I'm happy to Ack the code/release-notes with above suggestions, but I'd like a 
second opinion to Ack ABI.

-Harry


Re: [dpdk-dev] Why IVSHMEM was removed since 16.11 ?

2017-08-08 Thread Burakov, Anatoly
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Furong
> Sent: Tuesday, August 8, 2017 8:26 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] Why IVSHMEM was removed since 16.11 ?
> 
> The release notes of dpdk-16.11 had shown that IVSHMEM was removed
> due to some design issues.
> 
> So, what are these issues?
> 
> Thanks!
> 

Hi Furong,

There were multiple issues involved. Biggest of all, it required a patch to 
QEMU that wasn't maintained and wasn't upstream (i.e. vanilla QEMU didn't work 
with DPDK's implementation of IVSHMEM support). Second, it was basically hacked 
in to EAL in order to support what it does [1], and the engineering effort to 
fix all of that isn't worth the benefit it would provide, as no one appeared to 
be using it heavily enough to object to its deprecation.

Hope this clears things up.

[1] http://dpdk.org/ml/archives/dev/2016-June/040844.html

Thanks,
Anatoly 


Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Burakov, Anatoly
From: Jonas Pfefferle [mailto:j...@zurich.ibm.com]
> Sent: Tuesday, August 8, 2017 9:41 AM
> To: Burakov, Anatoly 
> Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle 
> Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> 
> DMA window size needs to be big enough to span all memory segment's
> physical addresses. We do not need multiple levels of IOMMU tables
> as we already span ~70TB of physical memory with 16MB hugepages.
> 
> Signed-off-by: Jonas Pfefferle 
> ---
> v2:
> * roundup to next power 2 function without loop.
> 
> v3:
> * Replace roundup_next_pow2 with rte_align64pow2
> 
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index 946df7e..550c41c 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
> 
> - /* calculate window size based on number of hugepages configured
> */
> - create.window_size = rte_eal_get_physmem_size();
> + /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max
> */

Do we always expect that to be the case in the future? Maybe it would be safer 
to walk the memsegs list.

Thanks,
Anatoly

> + /* create DMA window from 0 to max(phys_addr + len) */
> + /* sPAPR requires window size to be a power of 2 */
> + create.window_size = rte_align64pow2(ms[0].phys_addr +
> ms[0].len);
>   create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> - create.levels = 2;
> + create.levels = 1;
> 
>   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> &create);
>   if (ret) {
> @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
> 
> + if (create.start_addr != 0) {
> + RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> + return -1;
> + }
> +
>   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
>   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>   struct vfio_iommu_type1_dma_map dma_map;
> --
> 2.7.4



Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle1
"Burakov, Anatoly"  wrote on 08/08/2017 11:15:24
AM:

> From: "Burakov, Anatoly" 
> To: Jonas Pfefferle 
> Cc: "dev@dpdk.org" , "a...@ozlabs.ru" 
> Date: 08/08/2017 11:18 AM
> Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
>
> From: Jonas Pfefferle [mailto:j...@zurich.ibm.com]
> > Sent: Tuesday, August 8, 2017 9:41 AM
> > To: Burakov, Anatoly 
> > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle 
> > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> >
> > DMA window size needs to be big enough to span all memory segment's
> > physical addresses. We do not need multiple levels of IOMMU tables
> > as we already span ~70TB of physical memory with 16MB hugepages.
> >
> > Signed-off-by: Jonas Pfefferle 
> > ---
> > v2:
> > * roundup to next power 2 function without loop.
> >
> > v3:
> > * Replace roundup_next_pow2 with rte_align64pow2
> >
> >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > index 946df7e..550c41c 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> >return -1;
> > }
> >
> > -   /* calculate window size based on number of hugepages configured
> > */
> > -   create.window_size = rte_eal_get_physmem_size();
> > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is
max
> > */
>
> Do we always expect that to be the case in the future? Maybe it
> would be safer to walk the memsegs list.
>
> Thanks,
> Anatoly

I had this loop in before but removed it in favor of simplicity.
If we believe that the ordering is going to change in the future
I'm happy to bring back the loop. Is there other code which is
relying on the fact that the memsegs are sorted by their physical
addresses?

>
> > +   /* create DMA window from 0 to max(phys_addr + len) */
> > +   /* sPAPR requires window size to be a power of 2 */
> > +   create.window_size = rte_align64pow2(ms[0].phys_addr +
> > ms[0].len);
> > create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > -   create.levels = 2;
> > +   create.levels = 1;
> >
> > ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> > &create);
> > if (ret) {
> > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> >return -1;
> > }
> >
> > +   if (create.start_addr != 0) {
> > +  RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > +  return -1;
> > +   }
> > +
> > /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> >struct vfio_iommu_type1_dma_map dma_map;
> > --
> > 2.7.4
>


Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Burakov, Anatoly
> From: Jonas Pfefferle1 [mailto:j...@zurich.ibm.com]
> Sent: Tuesday, August 8, 2017 10:30 AM
> To: Burakov, Anatoly 
> Cc: a...@ozlabs.ru; dev@dpdk.org
> Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> 
> "Burakov, Anatoly"  wrote on 08/08/2017
> 11:15:24 AM:
> 
> > From: "Burakov, Anatoly" 
> > To: Jonas Pfefferle 
> > Cc: "dev@dpdk.org" , "a...@ozlabs.ru" 
> > Date: 08/08/2017 11:18 AM
> > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> >
> > From: Jonas Pfefferle [mailto:j...@zurich.ibm.com]
> > > Sent: Tuesday, August 8, 2017 9:41 AM
> > > To: Burakov, Anatoly 
> > > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle 
> > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> > >
> > > DMA window size needs to be big enough to span all memory segment's
> > > physical addresses. We do not need multiple levels of IOMMU tables
> > > as we already span ~70TB of physical memory with 16MB hugepages.
> > >
> > > Signed-off-by: Jonas Pfefferle 
> > > ---
> > > v2:
> > > * roundup to next power 2 function without loop.
> > >
> > > v3:
> > > * Replace roundup_next_pow2 with rte_align64pow2
> > >
> > >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++---
> > >  1 file changed, 10 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > index 946df7e..550c41c 100644
> > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > >        return -1;
> > >     }
> > >
> > > -   /* calculate window size based on number of hugepages configured
> > > */
> > > -   create.window_size = rte_eal_get_physmem_size();
> > > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr is max
> > > */
> >
> > Do we always expect that to be the case in the future? Maybe it
> > would be safer to walk the memsegs list.
> >
> > Thanks,
> > Anatoly
> 
> I had this loop in before but removed it in favor of simplicity.
> If we believe that the ordering is going to change in the future
> I'm happy to bring back the loop. Is there other code which is
> relying on the fact that the memsegs are sorted by their physical
> addresses?

I don't think there is. In any case, I think making assumptions about 
particulars of memseg organization is not a very good practice.

I seem to recall us doing similar things in other places, so maybe down the 
line we could introduce a new API (or internal-only) function to get a memseg 
with min/max address. For now I think a loop will do. 

> 
> >
> > > +   /* create DMA window from 0 to max(phys_addr + len) */
> > > +   /* sPAPR requires window size to be a power of 2 */
> > > +   create.window_size = rte_align64pow2(ms[0].phys_addr +
> > > ms[0].len);
> > >     create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > > -   create.levels = 2;
> > > +   create.levels = 1;
> > >
> > >     ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> > > &create);
> > >     if (ret) {
> > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > >        return -1;
> > >     }
> > >
> > > +   if (create.start_addr != 0) {
> > > +      RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > > +      return -1;
> > > +   }
> > > +
> > >     /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > >     for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> > >        struct vfio_iommu_type1_dma_map dma_map;
> > > --
> > > 2.7.4
> >


Re: [dpdk-dev] [PATCH] ethdev: add notice for planned name size change

2017-08-08 Thread Thomas Monjalon
> > > VMBUS support will use GUID in eth_dev_data name field which requires
> > > at least 37 characters. Plan for increase in size now.
> > >
> > > Signed-off-by: Stephen Hemminger 
> > > ---
> > > +* ethdev: An ABI change is planned for 17.11 for the structure
> > eth_dev_data.
> > Isn't it better to use "``rte_eth_dev_data``"?
> > 
> > > +  The size of the unique name will increase RTE_ETH_NAME_MAX_LEN
> > from
> > > + 32 to
> > > +  64 characters to allow using a globally unique identifier (GUID) in 
> > > this
> > field.
> > 
> > Except for the small comment above,
> > Acked-by: Yongseok Koh 
> 
> Acked-by: Shahaf Shuler 

Acked-by: Thomas Monjalon 

Applied, thanks


Re: [dpdk-dev] [PATCH] doc: announce ABI change for cryptodev and ethdev

2017-08-08 Thread Thomas Monjalon
08/08/2017 07:03, Shahaf Shuler:
> Monday, August 7, 2017 9:07 PM, Boris Pismenny:
> > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > 04/08/2017 07:26, Hemant Agrawal:
> > > > On 8/3/2017 9:02 PM, Akhil Goyal wrote:
> > > > > Support for security operations is planned to be added in ethdev
> > > > > and cryptodev for the 17.11 release.
> > > > >
> > > > > For this following changes are required.
> > > > > - rte_cryptodev and rte_eth_dev structures need to be added new
> > > > > parameter rte_security_ops which extend support for security ops
> > > > > to the corresponding driver.
> > > > > - rte_cryptodev_info and rte_ethd_dev_info need to be added with
> > > > > rte_security_capabilities to identify the capabilities of the
> > > > > corresponding driver.
> > >
> > > It is not explained what is the fundamental difference between
> > > rte_security and rte_crypto?
> > > It looks to be just a technical workaround.
> > 
> > rte_security is a layer between crypto and NIC.
> > 
> > Today crypto sessions are created exclusively against crypto devices, but
> > they don't use network related fields, while the network namespace doesn't
> > use crypto related fields. We expect this API to represent crypto sessions
> > that combine network fields and allow to add/delete them for all devices.
> > 
> > For NICs we will use rte_flow with rte_security for inline/full crypto 
> > protocol
> > offload such as ESP.
> > 
> > >
> > > Why the ABI would be changed by rte_security additions?
> > >
> > > > > Signed-off-by: Akhil Goyal 
> > > > >
> > > > Acked-by: Hemant Agrawal 
> > >
> > > No more opinions outside of NXP?
> > > It seems there is not yet a consensus on how to manage IPsec offloading.
> > > I heard there were some phone calls about these stuff but nothing
> > > clear appears publicly on the mailing list.
> > > Looks to be a community failure.
> > 
> > We agreed to go ahead with this approach on one such phone call. I hope we
> > could use the dpdk github for development.
> > 
> > Acked-by: Boris Pismenny 
> 
> Acked-by: Shahaf Shuler 

Applied
It means you have a chance to do this change in 17.11.
It does not mean you can be sure that the patches will be accepted.

This is introducing a new complexity.
It must be discussed with the technical board before approving
the final design in 17.11.


Re: [dpdk-dev] [PATCH v3] doc: announce API and ABI change for ethdev

2017-08-08 Thread Thomas Monjalon
> > This is an API/ABI change notice for DPDK 17.11 announcing the redefinition 
> > of
> > port_id. port_id is currently defined as uint8_t, which is limited to the 
> > range 0 to
> > 255. A larger range is required for vdev scalability.
> > 
> > It is necessary for a redefinition of port_id to extend it from 1 bytes to
> > 2 bytes. All ethdev APIs and usages related to port_id will be changed at 
> > the
> > same time.
> > 
> > cc: jianfeng@intel.com
> > cc: y...@fridaylinux.org
> > cc: jerin.ja...@caviumnetworks.com
> > cc: tho...@monjalon.net
> > cc: remy.hor...@intel.com
> > 
> > Signed-off-by: Zhiyong Yang 
> > Acked-by: Jianfeng Tan 
> > Acked-by: Yuanhan Liu 
> > Acked-by: Jerin Jacob 
> > Acked-by: Remy Horton 
> 
> This ack in v2 is added in v3. Thanks, Shahaf.
> 
> Acked-by: Shahaf Shuler 

Applied, thanks



Re: [dpdk-dev] [PATCH] doc: announce crypto vdev init removal

2017-08-08 Thread Thomas Monjalon
03/08/2017 12:42, Akhil Goyal:
> On 8/3/2017 7:42 AM, Pablo de Lara wrote:
> > In order to remove all dependencies on vdev for cryptodev,
> > the implementation of rte_cryptodev_vdev_pmd_init() function
> > needs to be moved to rte_cryptodev_vdev.h, and all crypto
> > vdevs will include it, and therefore, this function will
> > be removed as a public API.
> > 
> > Signed-off-by: Pablo de Lara 
> 
> Acked-by: Akhil Goyal 

Applied, thanks



Re: [dpdk-dev] [PATCH] doc: announce API change in crypto driver allocation

2017-08-08 Thread Thomas Monjalon
03/08/2017 12:43, Akhil Goyal:
> On 8/3/2017 6:32 AM, Pablo de Lara wrote:
> > rte_cryptodev_allocate_driver() function gets one parameter
> > (rte_driver), as the cryptodev_driver structure is
> > allocated inside the function with rte_malloc.
> > 
> > This function is called from a constructor function,
> > when crypto PMDs are registered.
> > If malloc fails, there is no way to recover from it,
> > so it is better to allocate this structure
> > statically, in each PMD.
> > 
> > Therefore, it is required to add an extra parameter in
> > this function, to also get a pointer to this structure.
> > 
> > Signed-off-by: Pablo de Lara 
> > 
> Acked-by: Akhil Goyal 

Applied, thanks



Re: [dpdk-dev] [PATCH] doc: API change notice for librte_meter

2017-08-08 Thread Thomas Monjalon
04/08/2017 15:19, Cristian Dumitrescu:
> Signed-off-by: Cristian Dumitrescu 
> Acked-by: John McNamara 
> Acked-by: Jasvinder Singh 
> Acked-by: Radu Nicolau 
> Acked-by: David Hunt 

Applied, thanks



Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()

2017-08-08 Thread David Harton (dharton)
> -Original Message-
> From: Van Haaren, Harry [mailto:harry.van.haa...@intel.com]
> Sent: Tuesday, August 08, 2017 5:03 AM
> To: David Harton (dharton) ; tho...@monjalon.net
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v3] ethdev: add return code to
> rte_eth_stats_reset()
> 
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton
> > Sent: Monday, August 7, 2017 6:39 PM
> > To: tho...@monjalon.net
> > Cc: dev@dpdk.org; David Harton 
> > Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to
> > rte_eth_stats_reset()
> >
> > Some devices do not support reset of eth stats.  An application may
> > need to know not to clear shadow stats if the device cannot.
> >
> > rte_eth_stats_reset is updated to provide a return code to share
> > whether the device supports reset or not.
> >
> > Signed-off-by: David Harton 
> > ---
> 
> Hi,
> 
> As far as I know changing the return type (void to int) of a function does
> *not* break ABI, but does "break" API as the application code should now
> check the return value. In theory the application could ignore the return
> value and current behavior is maintained.
> 
> The validate-abi.sh script says "Compatible" with the following item
> flagged:
> 
> Problems with Symbols
> High 0
> Medium 0
> Low   1
> 
> Change>> Type of return value has been changed from void to int (4 bytes).
> Effect>> Replacement of return type may indicate a change in its semantic
> meaning.
> 
> Perhaps somebody with more ABI expertise than I would double check the
> return value change?
> 
> 
> Some smaller things inline below.
> 
> > v3:
> > * overcame noob errors and figured out patch challenges
> > * this release should finally be clean :)
> >
> > v2:
> > * fixed soft tab issue inserted while moving changes
> >
> >  lib/librte_ether/rte_ethdev.c | 8 +---
> > lib/librte_ether/rte_ethdev.h | 4 +++-
> >  2 files changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 0597641..f72cc5a 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -1341,17 +1341,19 @@ rte_eth_stats_get(uint8_t port_id, struct
> rte_eth_stats *stats)
> > return 0;
> >  }
> >
> > -void
> > +int
> >  rte_eth_stats_reset(uint8_t port_id)
> >  {
> > struct rte_eth_dev *dev;
> >
> > -   RTE_ETH_VALID_PORTID_OR_RET(port_id);
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> > dev = &rte_eth_devices[port_id];
> >
> > -   RTE_FUNC_PTR_OR_RET(*dev->dev_ops->stats_reset);
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->stats_reset, -ENOTSUP);
> > (*dev->dev_ops->stats_reset)(dev);
> > dev->data->rx_mbuf_alloc_failed = 0;
> > +
> > +   return 0;
> >  }
> >
> >  static int
> > diff --git a/lib/librte_ether/rte_ethdev.h
> > b/lib/librte_ether/rte_ethdev.h index 0adf327..d8ccd60 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -2246,8 +2246,10 @@ int rte_eth_stats_get(uint8_t port_id, struct
> > rte_eth_stats *stats);
> >   *
> >   * @param port_id
> >   *   The port identifier of the Ethernet device.
> > + * @return
> > + *   Zero if successful. Non-zero otherwise.
> 
> We should document all return values:
> 
> @retval 0 Success
> @retval -EINVAL Invalid port_id provided @retval -ENOTSUP Stats reset
> functionality not supported by PMD

Sure.  I was following the convention of function above it
rte_eth_stats_get() but I agree better to advertise externally.
I'll also modify the port number check errval to -ENODEV.

> 
> The API change itself should probably be added to release notes, as
> applications may wish to be aware this function has changed.

Sounds good.

> 
> >   */
> > -void rte_eth_stats_reset(uint8_t port_id);
> > +int rte_eth_stats_reset(uint8_t port_id);
> >
> >  /**
> >   * Retrieve names of extended statistics of an Ethernet device.
> > --
> > 2.10.3.dirty
> 
> 
> I'm happy to Ack the code/release-notes with above suggestions, but I'd
> like a second opinion to Ack ABI.

Thanks for the review,
Dave

> 
> -Harry


Re: [dpdk-dev] [PATCH v3] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle1

"Burakov, Anatoly"  wrote on 08/08/2017 11:43:43
AM:

> From: "Burakov, Anatoly" 
> To: Jonas Pfefferle1 
> Cc: "a...@ozlabs.ru" , "dev@dpdk.org" 
> Date: 08/08/2017 11:43 AM
> Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
>
> > From: Jonas Pfefferle1 [mailto:j...@zurich.ibm.com]
> > Sent: Tuesday, August 8, 2017 10:30 AM
> > To: Burakov, Anatoly 
> > Cc: a...@ozlabs.ru; dev@dpdk.org
> > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> >
> > "Burakov, Anatoly"  wrote on 08/08/2017
> > 11:15:24 AM:
> >
> > > From: "Burakov, Anatoly" 
> > > To: Jonas Pfefferle 
> > > Cc: "dev@dpdk.org" , "a...@ozlabs.ru" 
> > > Date: 08/08/2017 11:18 AM
> > > Subject: RE: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> > >
> > > From: Jonas Pfefferle [mailto:j...@zurich.ibm.com]
> > > > Sent: Tuesday, August 8, 2017 9:41 AM
> > > > To: Burakov, Anatoly 
> > > > Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle

> > > > Subject: [PATCH v3] vfio: fix sPAPR IOMMU DMA window size
> > > >
> > > > DMA window size needs to be big enough to span all memory segment's
> > > > physical addresses. We do not need multiple levels of IOMMU tables
> > > > as we already span ~70TB of physical memory with 16MB hugepages.
> > > >
> > > > Signed-off-by: Jonas Pfefferle 
> > > > ---
> > > > v2:
> > > > * roundup to next power 2 function without loop.
> > > >
> > > > v3:
> > > > * Replace roundup_next_pow2 with rte_align64pow2
> > > >
> > > >  lib/librte_eal/linuxapp/eal/eal_vfio.c | 13 ++---
> > > >  1 file changed, 10 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > index 946df7e..550c41c 100644
> > > > --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> > > > @@ -759,10 +759,12 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > > >        return -1;
> > > >     }
> > > >
> > > > -   /* calculate window size based on number of hugepages
configured
> > > > */
> > > > -   create.window_size = rte_eal_get_physmem_size();
> > > > +   /* physicaly pages are sorted descending i.e. ms[0].phys_addr
is max
> > > > */
> > >
> > > Do we always expect that to be the case in the future? Maybe it
> > > would be safer to walk the memsegs list.
> > >
> > > Thanks,
> > > Anatoly
> >
> > I had this loop in before but removed it in favor of simplicity.
> > If we believe that the ordering is going to change in the future
> > I'm happy to bring back the loop. Is there other code which is
> > relying on the fact that the memsegs are sorted by their physical
> > addresses?
>
> I don't think there is. In any case, I think making assumptions
> about particulars of memseg organization is not a very good practice.
>
> I seem to recall us doing similar things in other places, so maybe
> down the line we could introduce a new API (or internal-only)
> function to get a memseg with min/max address. For now I think a
> loop will do.

Ok. Makes sense to me. Let me resubmit a new version with the loop.

>
> >
> > >
> > > > +   /* create DMA window from 0 to max(phys_addr + len) */
> > > > +   /* sPAPR requires window size to be a power of 2 */
> > > > +   create.window_size = rte_align64pow2(ms[0].phys_addr +
> > > > ms[0].len);
> > > >     create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> > > > -   create.levels = 2;
> > > > +   create.levels = 1;
> > > >
> > > >     ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> > > > &create);
> > > >     if (ret) {
> > > > @@ -771,6 +773,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
> > > >        return -1;
> > > >     }
> > > >
> > > > +   if (create.start_addr != 0) {
> > > > +      RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> > > > +      return -1;
> > > > +   }
> > > > +
> > > >     /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
> > > >     for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> > > >        struct vfio_iommu_type1_dma_map dma_map;
> > > > --
> > > > 2.7.4
> > >
>


Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()

2017-08-08 Thread Christian Ehrhardt
On Tue, Aug 8, 2017 at 11:02 AM, Van Haaren, Harry <
harry.van.haa...@intel.com> wrote:

> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Harton
> > Sent: Monday, August 7, 2017 6:39 PM
> > To: tho...@monjalon.net
> > Cc: dev@dpdk.org; David Harton 
> > Subject: [dpdk-dev] [PATCH v3] ethdev: add return code to
> rte_eth_stats_reset()
> >
> > Some devices do not support reset of eth stats.  An application may
> > need to know not to clear shadow stats if the device cannot.
> >
> > rte_eth_stats_reset is updated to provide a return code to share
> > whether the device supports reset or not.
> >
> > Signed-off-by: David Harton 
> > ---
>
> Hi,
>
> As far as I know changing the return type (void to int) of a function does
> *not* break ABI, but does "break" API as the application code should now
> check the return value. In theory the application could ignore the return
> value and current behavior is maintained.
>

After discussing with Harry on IRC it turns out we both ended up checking
the same online sources
to verify our thoughts, like [1].

Given this and several other sources it seems to be as outlined above an
API but not ABI break.
I'm not an expert and this is mostly opinion, but my personal rule mostly
is: "if in doubt bump it".
Running similar issues I was the one providing [2] for a reason, with this
here being a case that
appears safe but there eventually always seems to come up an architecture
or alternative compiler
which does some arcane register juggling differently and makes those "safe"
changes breaking people after the fact.

[1]:
https://stackoverflow.com/questions/15626579/c-abi-is-it-safe-to-change-void-function-to-return-int
[2]:
http://dpdk.org/doc/guides/contributing/versioning.html#setting-a-major-abi-version


[dpdk-dev] [PATCH v4] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Jonas Pfefferle
DMA window size needs to be big enough to span all memory segment's
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.

Signed-off-by: Jonas Pfefferle 
---
v2:
* roundup to next power 2 function without loop.

v3:
* Replace roundup_next_pow2 with rte_align64pow2

v4:
* do not assume ordering of physical addresses of memsegs

 lib/librte_eal/linuxapp/eal/eal_vfio.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c 
b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index 946df7e..7d5d61d 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -759,10 +759,19 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
-   /* calculate window size based on number of hugepages configured */
-   create.window_size = rte_eal_get_physmem_size();
+   /* create DMA window from 0 to max(phys_addr + len) */
+   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+   if (ms[i].addr == NULL)
+   break;
+
+   create.window_size = RTE_MAX(create.window_size,
+   ms[i].phys_addr + ms[i].len);
+   }
+
+   /* sPAPR requires window size to be a power of 2 */
+   create.window_size = rte_align64pow2(create.window_size);
create.page_shift = __builtin_ctzll(ms->hugepage_sz);
-   create.levels = 2;
+   create.levels = 1;
 
ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE, &create);
if (ret) {
@@ -771,6 +780,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
return -1;
}
 
+   if (create.start_addr != 0) {
+   RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
+   return -1;
+   }
+
/* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
for (i = 0; i < RTE_MAX_MEMSEG; i++) {
struct vfio_iommu_type1_dma_map dma_map;
-- 
2.7.4



Re: [dpdk-dev] [PATCH v4] vfio: fix sPAPR IOMMU DMA window size

2017-08-08 Thread Burakov, Anatoly
> From: Jonas Pfefferle [mailto:j...@zurich.ibm.com]
> Sent: Tuesday, August 8, 2017 12:17 PM
> To: Burakov, Anatoly 
> Cc: dev@dpdk.org; a...@ozlabs.ru; Jonas Pfefferle 
> Subject: [PATCH v4] vfio: fix sPAPR IOMMU DMA window size
> 
> DMA window size needs to be big enough to span all memory segment's
> physical addresses. We do not need multiple levels of IOMMU tables as we
> already span ~70TB of physical memory with 16MB hugepages.
> 
> Signed-off-by: Jonas Pfefferle 
> ---
> v2:
> * roundup to next power 2 function without loop.
> 
> v3:
> * Replace roundup_next_pow2 with rte_align64pow2
> 
> v4:
> * do not assume ordering of physical addresses of memsegs
> 
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> index 946df7e..7d5d61d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
> @@ -759,10 +759,19 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
> 
> - /* calculate window size based on number of hugepages configured
> */
> - create.window_size = rte_eal_get_physmem_size();
> + /* create DMA window from 0 to max(phys_addr + len) */
> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> + if (ms[i].addr == NULL)
> + break;
> +
> + create.window_size = RTE_MAX(create.window_size,
> + ms[i].phys_addr + ms[i].len);
> + }
> +
> + /* sPAPR requires window size to be a power of 2 */
> + create.window_size = rte_align64pow2(create.window_size);
>   create.page_shift = __builtin_ctzll(ms->hugepage_sz);
> - create.levels = 2;
> + create.levels = 1;
> 
>   ret = ioctl(vfio_container_fd, VFIO_IOMMU_SPAPR_TCE_CREATE,
> &create);
>   if (ret) {
> @@ -771,6 +780,11 @@ vfio_spapr_dma_map(int vfio_container_fd)
>   return -1;
>   }
> 
> + if (create.start_addr != 0) {
> + RTE_LOG(ERR, EAL, "  DMA window start address != 0\n");
> + return -1;
> + }
> +
>   /* map all DPDK segments for DMA. use 1:1 PA to IOVA mapping */
>   for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>   struct vfio_iommu_type1_dma_map dma_map;
> --
> 2.7.4

Acked by: Anatoly Burakov 

Thanks,
Anatoly


[dpdk-dev] [PATCH v1] doc: add template release notes for 17.11

2017-08-08 Thread John McNamara
Add template release notes for DPDK 17.11 with inline
comments and explanations of the various sections.

Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/index.rst |   1 +
 doc/guides/rel_notes/release_17_11.rst | 199 +
 2 files changed, 200 insertions(+)
 create mode 100644 doc/guides/rel_notes/release_17_11.rst

diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst
index b3c8090..35659d8 100644
--- a/doc/guides/rel_notes/index.rst
+++ b/doc/guides/rel_notes/index.rst
@@ -36,6 +36,7 @@ Release Notes
 :numbered:
 
 rel_description
+release_17_11
 release_17_08
 release_17_05
 release_17_02
diff --git a/doc/guides/rel_notes/release_17_11.rst 
b/doc/guides/rel_notes/release_17_11.rst
new file mode 100644
index 000..6e5768f
--- /dev/null
+++ b/doc/guides/rel_notes/release_17_11.rst
@@ -0,0 +1,199 @@
+DPDK Release 17.11
+==
+
+.. **Read this first.**
+
+   The text in the sections below explains how to update the release notes.
+
+   Use proper spelling, capitalization and punctuation in all sections.
+
+   Variable and config names should be quoted as fixed width text:
+   ``LIKE_THIS``.
+
+   Build the docs and view the output file to ensure the changes are correct::
+
+  make doc-guides-html
+
+  xdg-open build/doc/html/guides/rel_notes/release_17_11.html
+
+
+New Features
+
+
+.. This section should contain new features added in this release. Sample
+   format:
+
+   * **Add a title in the past tense with a full stop.**
+
+ Add a short 1-2 sentence description in the past tense. The description
+ should be enough to allow someone scanning the release notes to
+ understand the new feature.
+
+ If the feature adds a lot of sub-features you can use a bullet list like
+ this:
+
+ * Added feature foo to do something.
+ * Enhanced feature bar to do something else.
+
+ Refer to the previous release notes for examples.
+
+ This section is a comment. do not overwrite or remove it.
+ Also, make sure to start the actual text at the margin.
+ =
+
+
+Resolved Issues
+---
+
+.. This section should contain bug fixes added to the relevant
+   sections. Sample format:
+
+   * **code/section Fixed issue in the past tense with a full stop.**
+
+ Add a short 1-2 sentence description of the resolved issue in the past
+ tense.
+
+ The title should contain the code/lib section like a commit message.
+
+ Add the entries in alphabetic order in the relevant sections below.
+
+   This section is a comment. do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =
+
+
+EAL
+~~~
+
+
+Drivers
+~~~
+
+
+Libraries
+~
+
+
+Examples
+
+
+
+Other
+~
+
+
+Known Issues
+
+
+.. This section should contain new known issues in this release. Sample format:
+
+   * **Add title in present tense with full stop.**
+
+ Add a short 1-2 sentence description of the known issue in the present
+ tense. Add information on any known workarounds.
+
+   This section is a comment. do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =
+
+
+API Changes
+---
+
+.. This section should contain API changes. Sample format:
+
+   * Add a short 1-2 sentence description of the API change. Use fixed width
+ quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past
+ tense.
+
+   This section is a comment. do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =
+
+
+ABI Changes
+---
+
+.. This section should contain ABI changes. Sample format:
+
+   * Add a short 1-2 sentence description of the ABI change that was announced
+ in the previous releases and made in this release. Use fixed width quotes
+ for ``rte_function_names`` or ``rte_struct_names``. Use the past tense.
+
+   This section is a comment. do not overwrite or remove it.
+   Also, make sure to start the actual text at the margin.
+   =
+
+
+
+Shared Library Versions
+---
+
+.. Update any library version updated in this release and prepend with a ``+``
+   sign, like this:
+
+ librte_acl.so.2
+   + librte_cfgfile.so.2
+ librte_cmdline.so.2
+
+   This section is a comment. do not overwrite or remove it.
+   =
+
+
+The libraries prepended with a plus sign were incremented in this version.
+
+.. code-block:: diff
+
+ librte_acl.so.2
+ librte_bitratestats.so.1
+ librte_cfgfile.so.2
+ librte_cmdline.so.2
+ librte_cryptodev.so.3

Re: [dpdk-dev] [PATCH v1] doc: update release notes for 17.08

2017-08-08 Thread Mcnamara, John
> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Monday, August 7, 2017 12:17 PM
> To: Mcnamara, John 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] doc: update release notes for 17.08
> 
> 04/08/2017 16:20, John McNamara:
> > Fix grammar, spelling and formatting of DPDK 17.08 release notes.
> >
> > Signed-off-by: John McNamara 
> 
> Applied, thanks
> 
> Please, would you like to prepare a blank release notes for 17.11?


Done.

John





[dpdk-dev] [PATCH] net/bonding: fix bonding in 8023ad mode

2017-08-08 Thread Jacek Piasecki
This patch blocks possibility to set master bonding by
rte_eth_bond_mode_set() in 802.3ad mode, as the API
doesn't prevent this.

Fixes: 6d72657ce379 ("net/bonding: add other aggregator modes")
Cc: danielx.t.mrzyg...@intel.com

Signed-off-by: Jacek Piasecki 
---
 drivers/net/bonding/rte_eth_bond_api.c | 29 -
 drivers/net/bonding/rte_eth_bond_private.h |  3 +++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/rte_eth_bond_api.c 
b/drivers/net/bonding/rte_eth_bond_api.c
index de1d9e0..d6aaf07 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -63,6 +63,25 @@ valid_bonded_port_id(uint8_t port_id)
 }
 
 int
+check_for_master_bonded_ethdev(const struct rte_eth_dev *eth_dev)
+{
+   int i;
+   struct bond_dev_private *internals;
+
+   if (check_for_bonded_ethdev(eth_dev) != 0)
+   return 0;
+
+   internals = eth_dev->data->dev_private;
+
+   /* Check if any of slave devices is a bonded device */
+   for (i = 0; i < internals->slave_count; i++)
+   if (valid_bonded_port_id(internals->slaves[i].port_id) == 0)
+   return 1;
+
+   return 0;
+}
+
+int
 valid_slave_port_id(uint8_t port_id, uint8_t mode)
 {
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
@@ -490,10 +509,18 @@ rte_eth_bond_slave_remove(uint8_t bonded_port_id, uint8_t 
slave_port_id)
 int
 rte_eth_bond_mode_set(uint8_t bonded_port_id, uint8_t mode)
 {
+   struct rte_eth_dev *bonded_eth_dev;
+
if (valid_bonded_port_id(bonded_port_id) != 0)
return -1;
 
-   return bond_ethdev_mode_set(&rte_eth_devices[bonded_port_id], mode);
+   bonded_eth_dev = &rte_eth_devices[bonded_port_id];
+
+   if (check_for_master_bonded_ethdev(bonded_eth_dev) != 0 &&
+   mode == BONDING_MODE_8023AD)
+   return -1;
+
+   return bond_ethdev_mode_set(bonded_eth_dev, mode);
 }
 
 int
diff --git a/drivers/net/bonding/rte_eth_bond_private.h 
b/drivers/net/bonding/rte_eth_bond_private.h
index 1fe6ff8..4dd7e5f 100644
--- a/drivers/net/bonding/rte_eth_bond_private.h
+++ b/drivers/net/bonding/rte_eth_bond_private.h
@@ -184,6 +184,9 @@ extern const struct eth_dev_ops default_dev_ops;
 int
 check_for_bonded_ethdev(const struct rte_eth_dev *eth_dev);
 
+int
+check_for_master_bonded_ethdev(const struct rte_eth_dev *eth_dev);
+
 /* Search given slave array to find position of given id.
  * Return slave pos or slaves_count if not found. */
 static inline uint8_t
-- 
2.7.4



Re: [dpdk-dev] [PATCH v3] ethdev: add return code to rte_eth_stats_reset()

2017-08-08 Thread Thomas Monjalon
08/08/2017 13:03, Christian Ehrhardt:
> On Tue, Aug 8, 2017 at 11:02 AM, Van Haaren, Harry <
> harry.van.haa...@intel.com> wrote:
> > >
> > > Some devices do not support reset of eth stats.  An application may
> > > need to know not to clear shadow stats if the device cannot.
> > >
> > > rte_eth_stats_reset is updated to provide a return code to share
> > > whether the device supports reset or not.
> > >
> > > Signed-off-by: David Harton 
> > > ---
> >
> > Hi,
> >
> > As far as I know changing the return type (void to int) of a function does
> > *not* break ABI, but does "break" API as the application code should now
> > check the return value. In theory the application could ignore the return
> > value and current behavior is maintained.
> >
> 
> After discussing with Harry on IRC it turns out we both ended up checking
> the same online sources
> to verify our thoughts, like [1].
> 
> Given this and several other sources it seems to be as outlined above an
> API but not ABI break.
> I'm not an expert and this is mostly opinion, but my personal rule mostly
> is: "if in doubt bump it".

Anyway, the ABI will be broken (and bumped) again in 17.11.
This patch will be accepted in 17.11.



Re: [dpdk-dev] [PATCH v1] doc: add template release notes for 17.11

2017-08-08 Thread Thomas Monjalon
Hi,

Thanks for sending the next release notes so quickly.
Unfortunately...

08/08/2017 14:31, John McNamara:
> +The libraries prepended with a plus sign were incremented in this version.
> +
> +.. code-block:: diff
> +
> + librte_acl.so.2
> + librte_bitratestats.so.1
> + librte_cfgfile.so.2
> + librte_cmdline.so.2
> + librte_cryptodev.so.3
> + librte_distributor.so.1
> + librte_eal.so.4
> + librte_ethdev.so.7
> + librte_gro.so.1
> + librte_hash.so.2
> + librte_ip_frag.so.1
> + librte_jobstats.so.1
> + librte_kni.so.2
> + librte_kvargs.so.1
> + librte_latencystats.so.1
> + librte_lpm.so.2
> + librte_mbuf.so.3
> + librte_mempool.so.2
> + librte_meter.so.1
> + librte_metrics.so.1
> + librte_net.so.1
> + librte_pdump.so.1
> + librte_pipeline.so.3
> + librte_pmd_bond.so.1
> + librte_pmd_ring.so.2
> + librte_port.so.3
> + librte_power.so.1
> + librte_reorder.so.1
> + librte_ring.so.1
> + librte_sched.so.1
> + librte_table.so.2
> + librte_timer.so.1
> + librte_vhost.so.3

... we are going to bump some ABIVER which were forgotten during 17.08.
You will probably need to update after 17.08 release.


[dpdk-dev] [PATCH] eal: bump ABI version for PCI, bus and devargs work

2017-08-08 Thread Gaetan Rivet
1. PCI domain field in PCI address structure grew from 16 to 32 bits.

   From: 463ced957c3f ("pci: increase domain storage to 32 bits")

2. rte_bus structure gaining new ops.

   From: 3a8f0bc68a90 ("bus: add method to find device")
   From: 7c8810f43f6e ("bus: introduce device plug/unplug")
   From: 609eb7dde6d0 ("bus: introduce parsing functionality")
   From: 98eb4b845c1a ("bus: introduce scan policies")

3. rte_devargs structure having been mostly rewritten.

   From: f3a1188cee4a ("devargs: make device representation generic")
   From: 47828c5f3bc3 ("devargs: parse bus info")
   From: 39f403e0d5bb ("devargs: restore device type API")

Signed-off-by: Gaetan Rivet 
---
 lib/librte_eal/bsdapp/eal/Makefile   | 2 +-
 lib/librte_eal/linuxapp/eal/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index 05517a2..005019e 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -48,7 +48,7 @@ LDLIBS += -lgcc_s
 
 EXPORT_MAP := rte_eal_version.map
 
-LIBABIVER := 4
+LIBABIVER := 5
 
 # specific to bsdapp exec-env
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c
diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
b/lib/librte_eal/linuxapp/eal/Makefile
index e6ab6c3..90bca4d 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -37,7 +37,7 @@ ARCH_DIR ?= $(RTE_ARCH)
 EXPORT_MAP := rte_eal_version.map
 VPATH += $(RTE_SDK)/lib/librte_eal/common/arch/$(ARCH_DIR)
 
-LIBABIVER := 4
+LIBABIVER := 5
 
 VPATH += $(RTE_SDK)/lib/librte_eal/common
 
-- 
2.1.4



[dpdk-dev] [PATCH] doc: notify EAL ABI change

2017-08-08 Thread Gaetan Rivet
Signed-off-by: Gaetan Rivet 
---
 doc/guides/rel_notes/release_17_08.rst | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_17_08.rst 
b/doc/guides/rel_notes/release_17_08.rst
index a137aa7..a723252 100644
--- a/doc/guides/rel_notes/release_17_08.rst
+++ b/doc/guides/rel_notes/release_17_08.rst
@@ -343,6 +343,14 @@ ABI Changes
 
 * Removed ``session_mp`` from ``rte_cryptodev_config``.
 
+* Changed type of ``domain`` field in ``rte_pci_addr`` to ``uint32_t``
+  to follow the PCI standard.
+
+* Added new ``rte_bus`` experimental APIs available as operators within the
+  ``rte_bus`` structure.
+
+* Made ``rte_devargs`` structure internal device representation generic to
+  prepare for a bus-agnostic EAL.
 
 Shared Library Versions
 ---
@@ -368,7 +376,7 @@ The libraries prepended with a plus sign were incremented 
in this version.
  librte_cmdline.so.2
+ librte_cryptodev.so.3
  librte_distributor.so.1
- librte_eal.so.4
+   + librte_eal.so.5
+ librte_ethdev.so.7
+ librte_gro.so.1
  librte_hash.so.2
-- 
2.1.4



[dpdk-dev] [PATCH 1/2] doc: fix v17.05 release note

2017-08-08 Thread Jerin Jacob
librte_eventdev.so.1 was introduced in v17.05, and it missed to
update in release_17_05 release notes.

Fixes: 222555480a7f9 ("version: 17.05.0")

Cc: sta...@dpdk.org
Reported-by: Thomas Monjalon 
Signed-off-by: Jerin Jacob 
---
 doc/guides/rel_notes/release_17_05.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/guides/rel_notes/release_17_05.rst 
b/doc/guides/rel_notes/release_17_05.rst
index cbdb40657..87b3f9279 100644
--- a/doc/guides/rel_notes/release_17_05.rst
+++ b/doc/guides/rel_notes/release_17_05.rst
@@ -518,6 +518,7 @@ The libraries prepended with a plus sign were incremented 
in this version.
  librte_distributor.so.1
+ librte_eal.so.4
  librte_ethdev.so.6
+   + librte_eventdev.so.1
  librte_hash.so.2
  librte_ip_frag.so.1
  librte_jobstats.so.1
-- 
2.14.0



[dpdk-dev] [PATCH 2/2] eventdev: bump library version

2017-08-08 Thread Jerin Jacob
Bumping the library version to reflect the ABI change, where
rte_event_pmd_pci_probe(), rte_event_pmd_pci_remove(),
rte_event_pmd_vdev_init(), rte_event_pmd_vdev_uninit()
functions removed from the library.

Fixes: b1b3d9f90502 ("eventdev: make vdev init and uninit functions optional")
Fixes: 9a8269d56942 ("eventdev: make PCI probe and remove functions optional")

Reported-by: Thomas Monjalon 
Signed-off-by: Jerin Jacob 
---
 doc/guides/rel_notes/release_17_08.rst | 1 +
 lib/librte_eventdev/Makefile   | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_17_08.rst 
b/doc/guides/rel_notes/release_17_08.rst
index a137aa79f..1c2bc6de9 100644
--- a/doc/guides/rel_notes/release_17_08.rst
+++ b/doc/guides/rel_notes/release_17_08.rst
@@ -370,6 +370,7 @@ The libraries prepended with a plus sign were incremented 
in this version.
  librte_distributor.so.1
  librte_eal.so.4
+ librte_ethdev.so.7
+   + librte_eventdev.so.2
+ librte_gro.so.1
  librte_hash.so.2
  librte_ip_frag.so.1
diff --git a/lib/librte_eventdev/Makefile b/lib/librte_eventdev/Makefile
index 64165020a..410578a14 100644
--- a/lib/librte_eventdev/Makefile
+++ b/lib/librte_eventdev/Makefile
@@ -34,7 +34,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 LIB = librte_eventdev.a
 
 # library version
-LIBABIVER := 1
+LIBABIVER := 2
 
 # build flags
 CFLAGS += -O3
-- 
2.14.0



Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Stephen Hemminger
On Tue, 8 Aug 2017 09:32:07 +0100
Ferruh Yigit  wrote:

> On 8/4/2017 1:56 PM, Bruce Richardson wrote:
> > On Fri, Aug 04, 2017 at 12:58:01PM +0100, Ferruh Yigit wrote:  
> >> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:  
> >>> 03/08/2017 18:15, Stephen Hemminger:  
>  On Thu, 3 Aug 2017 14:21:38 +0100
>  Bruce Richardson  wrote:
>   
> > On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:  
> >> To control some device-specific features public device-specific 
> >> functions
> >> rte_pmd_*.h are used.
> >>
> >> But this solution requires applications to distinguish devices at 
> >> runtime
> >> and, depending on the device type, call corresponding device-specific
> >> functions even if functions' parameters are the same.
> >>
> >> IOCTL-like API can be added to ethdev instead of public device-specific
> >> functions to address the following:
> >>
> >> * allow more usable support of features across a range of NIC from
> >>   one vendor, but not others
> >> * allow features to be implemented by multiple NIC drivers without
> >>   relying on a critical mass to get the functionality in ethdev
> >> * there are a large number of possible device specific functions, and
> >>   creating individual APIs for each one is not a good solution
> >> * IOCTLs are a proven method for solving this problem in other areas,
> >>   i.e. OS kernels.
> >>
> >> Control requests for this API will be globally defined at ethdev 
> >> level, so
> >> an application will use single API call to control different devices 
> >> from
> >> one/multiple vendors.
> >>
> >> API call may look like as a classic ioctl with an extra parameter for
> >> argument length for better sanity checks:
> >>
> >> int
> >> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
> >> unsigned arg_length);
> >>
> >> Regards,
> >> Andrey
> >
> > I think we need to start putting in IOCTLs for ethdevs, much as I hate
> > to admit it, since I dislike IOCTLs and other functions with opaque
> > arguments! Having driver specific functions I don't think will scale
> > well as each vendor tries to expose as much of their driver specific
> > functionality as possible.
> >
> > One other additional example: I discovered just this week another issue
> > with driver specific functions and testpmd, when I was working on the
> > meson build rework.
> >
> > * With shared libraries, when we do "ninja install" we want our DPDK
> >   libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
> >   driver folder, so that they can be automatically loaded from that
> >   single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
> > * However, testpmd, as well as using the drivers as plugins, uses
> >   driver-specific functions, which means that it explicitly links
> >   against the pmd .so files.
> > * Those driver .so files are not in with the other libraries, so ld.so
> >   does not find the pmd, and the installed testpmd fails to run due to
> >   missing library dependencies.
> > * The workaround is to add the drivers path to the ld load path, but we
> >   should not require ld library path changes just to get DPDK apps to
> >   work.
> >
> > Using ioctls instead of driver-specific functions would solve this.
> >
> > My 2c.  
> 
>  My 2c. No.
> 
>  Short answer:
>  Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
>  despised by Linux kernel developers. They provide an unstructured, 
>  unsecured,
>  back door for device driver abuse. Try to get a new driver in Linux with
>  a unique ioctl, and it will be hard to get accepted.
> 
>  Long answer:
>  So far every device specific feature has fit into ethdev model. Doing 
>  ioctl
>  is admitting "it is too hard to be general, we need need an out". For 
>  something
>  that is a flag, it should fit into existing config model; ignoring silly 
>  ABI constraints.
>  For a real feature (think flow direction), we want a first class API for 
>  that.
>  For a wart, then devargs will do.
> 
>  Give a good example of something that should be an ioctl. Don't build the
>  API first and then let it get cluttered.  
> >>>
> >>> I agree with Stephen.
> >>>
> >>> And please do not forget that ioctl still requires an API:
> >>> the argument that you put in ioctl is the API of the feature.
> >>> So it is the same thing as defining a new function.  
> >>
> >> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
> >> and prevent argument check by compiler.
> >>
> >> BUT, the number of the increasing PMD specific APIs are also worrying,
> >> it is becoming harder to maintain, and I believe this is something NOT
> >

Re: [dpdk-dev] [PATCH v1] doc: add template release notes for 17.11

2017-08-08 Thread Mcnamara, John


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Tuesday, August 8, 2017 2:50 PM
> To: Mcnamara, John 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] doc: add template release notes for
> 17.11
> 
> Hi,
> > + librte_timer.so.1
> > + librte_vhost.so.3
> 
> ... we are going to bump some ABIVER which were forgotten during 17.08.
> You will probably need to update after 17.08 release.

No problem. I'll send a v2 right after the 17.08 release.

John


Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Wiles, Keith

On Aug 4, 2017, at 6:58 AM, Ferruh Yigit 
mailto:ferruh.yi...@intel.com>> wrote:

On 8/3/2017 8:53 PM, Thomas Monjalon wrote:
03/08/2017 18:15, Stephen Hemminger:
On Thu, 3 Aug 2017 14:21:38 +0100
Bruce Richardson 
mailto:bruce.richard...@intel.com>> wrote:

On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
To control some device-specific features public device-specific functions
rte_pmd_*.h are used.

But this solution requires applications to distinguish devices at runtime
and, depending on the device type, call corresponding device-specific
functions even if functions' parameters are the same.

IOCTL-like API can be added to ethdev instead of public device-specific
functions to address the following:

* allow more usable support of features across a range of NIC from
 one vendor, but not others
* allow features to be implemented by multiple NIC drivers without
 relying on a critical mass to get the functionality in ethdev
* there are a large number of possible device specific functions, and
 creating individual APIs for each one is not a good solution
* IOCTLs are a proven method for solving this problem in other areas,
 i.e. OS kernels.

Control requests for this API will be globally defined at ethdev level, so
an application will use single API call to control different devices from
one/multiple vendors.

API call may look like as a classic ioctl with an extra parameter for
argument length for better sanity checks:

int
rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
   unsigned arg_length);

Regards,
Andrey

I think we need to start putting in IOCTLs for ethdevs, much as I hate
to admit it, since I dislike IOCTLs and other functions with opaque
arguments! Having driver specific functions I don't think will scale
well as each vendor tries to expose as much of their driver specific
functionality as possible.

One other additional example: I discovered just this week another issue
with driver specific functions and testpmd, when I was working on the
meson build rework.

* With shared libraries, when we do "ninja install" we want our DPDK
 libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
 driver folder, so that they can be automatically loaded from that
 single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
* However, testpmd, as well as using the drivers as plugins, uses
 driver-specific functions, which means that it explicitly links
 against the pmd .so files.
* Those driver .so files are not in with the other libraries, so ld.so
 does not find the pmd, and the installed testpmd fails to run due to
 missing library dependencies.
* The workaround is to add the drivers path to the ld load path, but we
 should not require ld library path changes just to get DPDK apps to
 work.

Using ioctls instead of driver-specific functions would solve this.

My 2c.

My 2c. No.

Short answer:
Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
despised by Linux kernel developers. They provide an unstructured, unsecured,
back door for device driver abuse. Try to get a new driver in Linux with
a unique ioctl, and it will be hard to get accepted.

Long answer:
So far every device specific feature has fit into ethdev model. Doing ioctl
is admitting "it is too hard to be general, we need need an out". For something
that is a flag, it should fit into existing config model; ignoring silly ABI 
constraints.
For a real feature (think flow direction), we want a first class API for that.
For a wart, then devargs will do.

Give a good example of something that should be an ioctl. Don't build the
API first and then let it get cluttered.

I agree with Stephen.

And please do not forget that ioctl still requires an API:
the argument that you put in ioctl is the API of the feature.
So it is the same thing as defining a new function.

I am also not fan of the ioctl usage. I believe it hides APIs behind ids
and prevent argument check by compiler.

BUT, the number of the increasing PMD specific APIs are also worrying,
it is becoming harder to maintain, and I believe this is something NOT
sustainable in long run.


What about having *eth_dev_extended_ops* ?

We had talk about adding something like device specific APIs to DPDK in the 
past, which to me are just IOCTL like APIs. The big problem with IOCTLs is 
trying to cram a bunch of specific requests into a very generic API and I do 
not like ioctl as defined in Linux/Unix today. The old IOCTLs calls are too 
opaque and difficult for compilers to test args and many other issues.

We talked about having a single API in rte_eth_dev that would allow a user to 
ask for and possible get a list of function pointers in a given structure for 
the requested type. If a user calls this API to get some feature from a given 
NIC he would get NULL or a pointer to a set of functions. The generic API in 
rte_eth would allow the user to request what structures or types of APIs it 
supports.

Using a specific API to get the lis

Re: [dpdk-dev] [PATCH] eal: bump ABI version for PCI, bus and devargs work

2017-08-08 Thread Thomas Monjalon
08/08/2017 16:26, Gaetan Rivet:
> 1. PCI domain field in PCI address structure grew from 16 to 32 bits.
> 
>From: 463ced957c3f ("pci: increase domain storage to 32 bits")
> 
> 2. rte_bus structure gaining new ops.
> 
>From: 3a8f0bc68a90 ("bus: add method to find device")
>From: 7c8810f43f6e ("bus: introduce device plug/unplug")
>From: 609eb7dde6d0 ("bus: introduce parsing functionality")
>From: 98eb4b845c1a ("bus: introduce scan policies")
> 
> 3. rte_devargs structure having been mostly rewritten.
> 
>From: f3a1188cee4a ("devargs: make device representation generic")
>From: 47828c5f3bc3 ("devargs: parse bus info")
>From: 39f403e0d5bb ("devargs: restore device type API")
> 
> Signed-off-by: Gaetan Rivet 

Applied and merged with release notes update, thanks



Re: [dpdk-dev] [PATCH] doc: notify EAL ABI change

2017-08-08 Thread Thomas Monjalon
08/08/2017 16:26, Gaetan Rivet:
> Signed-off-by: Gaetan Rivet 
> ---
>  doc/guides/rel_notes/release_17_08.rst | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Applied and merged with ABIVER bump, thanks



Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Wiles, Keith
Fix format.

> On Aug 4, 2017, at 6:58 AM, Ferruh Yigit  wrote:
> 
> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:
>> 03/08/2017 18:15, Stephen Hemminger:
>>> On Thu, 3 Aug 2017 14:21:38 +0100
>>> Bruce Richardson  wrote:
>>> 
 On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
> To control some device-specific features public device-specific functions
> rte_pmd_*.h are used.
> 
> But this solution requires applications to distinguish devices at runtime
> and, depending on the device type, call corresponding device-specific
> functions even if functions' parameters are the same.
> 
> IOCTL-like API can be added to ethdev instead of public device-specific
> functions to address the following:
> 
> * allow more usable support of features across a range of NIC from
>  one vendor, but not others
> * allow features to be implemented by multiple NIC drivers without
>  relying on a critical mass to get the functionality in ethdev
> * there are a large number of possible device specific functions, and
>  creating individual APIs for each one is not a good solution
> * IOCTLs are a proven method for solving this problem in other areas,
>  i.e. OS kernels.
> 
> Control requests for this API will be globally defined at ethdev level, so
> an application will use single API call to control different devices from
> one/multiple vendors.
> 
> API call may look like as a classic ioctl with an extra parameter for
> argument length for better sanity checks:
> 
> int
> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
>unsigned arg_length);
> 
> Regards,
> Andrey  
 
 I think we need to start putting in IOCTLs for ethdevs, much as I hate
 to admit it, since I dislike IOCTLs and other functions with opaque
 arguments! Having driver specific functions I don't think will scale
 well as each vendor tries to expose as much of their driver specific
 functionality as possible.
 
 One other additional example: I discovered just this week another issue
 with driver specific functions and testpmd, when I was working on the
 meson build rework.
 
 * With shared libraries, when we do "ninja install" we want our DPDK
  libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
  driver folder, so that they can be automatically loaded from that
  single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
 * However, testpmd, as well as using the drivers as plugins, uses
  driver-specific functions, which means that it explicitly links
  against the pmd .so files.
 * Those driver .so files are not in with the other libraries, so ld.so
  does not find the pmd, and the installed testpmd fails to run due to
  missing library dependencies.
 * The workaround is to add the drivers path to the ld load path, but we
  should not require ld library path changes just to get DPDK apps to
  work.
 
 Using ioctls instead of driver-specific functions would solve this.
 
 My 2c.
>>> 
>>> My 2c. No.
>>> 
>>> Short answer:
>>> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
>>> despised by Linux kernel developers. They provide an unstructured, 
>>> unsecured,
>>> back door for device driver abuse. Try to get a new driver in Linux with
>>> a unique ioctl, and it will be hard to get accepted.
>>> 
>>> Long answer:
>>> So far every device specific feature has fit into ethdev model. Doing ioctl
>>> is admitting "it is too hard to be general, we need need an out". For 
>>> something
>>> that is a flag, it should fit into existing config model; ignoring silly 
>>> ABI constraints.
>>> For a real feature (think flow direction), we want a first class API for 
>>> that.
>>> For a wart, then devargs will do.
>>> 
>>> Give a good example of something that should be an ioctl. Don't build the
>>> API first and then let it get cluttered.
>> 
>> I agree with Stephen.
>> 
>> And please do not forget that ioctl still requires an API:
>> the argument that you put in ioctl is the API of the feature.
>> So it is the same thing as defining a new function.
> 
> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
> and prevent argument check by compiler.
> 
> BUT, the number of the increasing PMD specific APIs are also worrying,
> it is becoming harder to maintain, and I believe this is something NOT
> sustainable in long run.
> 
> 
> What about having *eth_dev_extended_ops* ?


We had talk about adding something like device specific APIs to DPDK in the 
past, which to me are just IOCTL like APIs. The big problem with IOCTLs is 
trying to cram a bunch of specific requests into a very generic API and I do 
not like ioctl as defined in Linux/Unix today. The old IOCTLs calls are too 
opaque and difficult for compilers to test args and many other issues.

We t

Re: [dpdk-dev] [PATCH 2/2] eventdev: bump library version

2017-08-08 Thread Thomas Monjalon
08/08/2017 16:48, Jerin Jacob:
> Bumping the library version to reflect the ABI change, where
> rte_event_pmd_pci_probe(), rte_event_pmd_pci_remove(),
> rte_event_pmd_vdev_init(), rte_event_pmd_vdev_uninit()
> functions removed from the library.
> 
> Fixes: b1b3d9f90502 ("eventdev: make vdev init and uninit functions optional")
> Fixes: 9a8269d56942 ("eventdev: make PCI probe and remove functions optional")
> 
> Reported-by: Thomas Monjalon 
> Signed-off-by: Jerin Jacob 

Series applied, thanks



Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Stephen Hemminger
On Tue, 8 Aug 2017 17:28:19 +
"Wiles, Keith"  wrote:

> Fix format.
> 
> > On Aug 4, 2017, at 6:58 AM, Ferruh Yigit  wrote:
> > 
> > On 8/3/2017 8:53 PM, Thomas Monjalon wrote:  
> >> 03/08/2017 18:15, Stephen Hemminger:  
> >>> On Thu, 3 Aug 2017 14:21:38 +0100
> >>> Bruce Richardson  wrote:
> >>>   
>  On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:  
> > To control some device-specific features public device-specific 
> > functions
> > rte_pmd_*.h are used.
> > 
> > But this solution requires applications to distinguish devices at 
> > runtime
> > and, depending on the device type, call corresponding device-specific
> > functions even if functions' parameters are the same.
> > 
> > IOCTL-like API can be added to ethdev instead of public device-specific
> > functions to address the following:
> > 
> > * allow more usable support of features across a range of NIC from
> >  one vendor, but not others
> > * allow features to be implemented by multiple NIC drivers without
> >  relying on a critical mass to get the functionality in ethdev
> > * there are a large number of possible device specific functions, and
> >  creating individual APIs for each one is not a good solution
> > * IOCTLs are a proven method for solving this problem in other areas,
> >  i.e. OS kernels.
> > 
> > Control requests for this API will be globally defined at ethdev level, 
> > so
> > an application will use single API call to control different devices 
> > from
> > one/multiple vendors.
> > 
> > API call may look like as a classic ioctl with an extra parameter for
> > argument length for better sanity checks:
> > 
> > int
> > rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
> >unsigned arg_length);
> > 
> > Regards,
> > Andrey
>  
>  I think we need to start putting in IOCTLs for ethdevs, much as I hate
>  to admit it, since I dislike IOCTLs and other functions with opaque
>  arguments! Having driver specific functions I don't think will scale
>  well as each vendor tries to expose as much of their driver specific
>  functionality as possible.
>  
>  One other additional example: I discovered just this week another issue
>  with driver specific functions and testpmd, when I was working on the
>  meson build rework.
>  
>  * With shared libraries, when we do "ninja install" we want our DPDK
>   libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
>   driver folder, so that they can be automatically loaded from that
>   single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
>  * However, testpmd, as well as using the drivers as plugins, uses
>   driver-specific functions, which means that it explicitly links
>   against the pmd .so files.
>  * Those driver .so files are not in with the other libraries, so ld.so
>   does not find the pmd, and the installed testpmd fails to run due to
>   missing library dependencies.
>  * The workaround is to add the drivers path to the ld load path, but we
>   should not require ld library path changes just to get DPDK apps to
>   work.
>  
>  Using ioctls instead of driver-specific functions would solve this.
>  
>  My 2c.  
> >>> 
> >>> My 2c. No.
> >>> 
> >>> Short answer:
> >>> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
> >>> despised by Linux kernel developers. They provide an unstructured, 
> >>> unsecured,
> >>> back door for device driver abuse. Try to get a new driver in Linux with
> >>> a unique ioctl, and it will be hard to get accepted.
> >>> 
> >>> Long answer:
> >>> So far every device specific feature has fit into ethdev model. Doing 
> >>> ioctl
> >>> is admitting "it is too hard to be general, we need need an out". For 
> >>> something
> >>> that is a flag, it should fit into existing config model; ignoring silly 
> >>> ABI constraints.
> >>> For a real feature (think flow direction), we want a first class API for 
> >>> that.
> >>> For a wart, then devargs will do.
> >>> 
> >>> Give a good example of something that should be an ioctl. Don't build the
> >>> API first and then let it get cluttered.  
> >> 
> >> I agree with Stephen.
> >> 
> >> And please do not forget that ioctl still requires an API:
> >> the argument that you put in ioctl is the API of the feature.
> >> So it is the same thing as defining a new function.  
> > 
> > I am also not fan of the ioctl usage. I believe it hides APIs behind ids
> > and prevent argument check by compiler.
> > 
> > BUT, the number of the increasing PMD specific APIs are also worrying,
> > it is becoming harder to maintain, and I believe this is something NOT
> > sustainable in long run.
> > 
> > 
> > What about having *eth_dev_extended_ops* ?  
> 
> 
> We had talk about adding something like dev

Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features

2017-08-08 Thread Wiles, Keith

> On Aug 8, 2017, at 1:02 PM, Stephen Hemminger  
> wrote:
> 
> On Tue, 8 Aug 2017 17:28:19 +
> "Wiles, Keith"  wrote:
> 
>> Fix format.
>> 
>>> On Aug 4, 2017, at 6:58 AM, Ferruh Yigit  wrote:
>>> 
>>> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:  
 03/08/2017 18:15, Stephen Hemminger:  
> On Thu, 3 Aug 2017 14:21:38 +0100
> Bruce Richardson  wrote:
> 
>> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:  
>>> To control some device-specific features public device-specific 
>>> functions
>>> rte_pmd_*.h are used.
>>> 
>>> But this solution requires applications to distinguish devices at 
>>> runtime
>>> and, depending on the device type, call corresponding device-specific
>>> functions even if functions' parameters are the same.
>>> 
>>> IOCTL-like API can be added to ethdev instead of public device-specific
>>> functions to address the following:
>>> 
>>> * allow more usable support of features across a range of NIC from
>>> one vendor, but not others
>>> * allow features to be implemented by multiple NIC drivers without
>>> relying on a critical mass to get the functionality in ethdev
>>> * there are a large number of possible device specific functions, and
>>> creating individual APIs for each one is not a good solution
>>> * IOCTLs are a proven method for solving this problem in other areas,
>>> i.e. OS kernels.
>>> 
>>> Control requests for this API will be globally defined at ethdev level, 
>>> so
>>> an application will use single API call to control different devices 
>>> from
>>> one/multiple vendors.
>>> 
>>> API call may look like as a classic ioctl with an extra parameter for
>>> argument length for better sanity checks:
>>> 
>>> int
>>> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
>>>   unsigned arg_length);
>>> 
>>> Regards,
>>> Andrey
>> 
>> I think we need to start putting in IOCTLs for ethdevs, much as I hate
>> to admit it, since I dislike IOCTLs and other functions with opaque
>> arguments! Having driver specific functions I don't think will scale
>> well as each vendor tries to expose as much of their driver specific
>> functionality as possible.
>> 
>> One other additional example: I discovered just this week another issue
>> with driver specific functions and testpmd, when I was working on the
>> meson build rework.
>> 
>> * With shared libraries, when we do "ninja install" we want our DPDK
>> libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
>> driver folder, so that they can be automatically loaded from that
>> single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
>> * However, testpmd, as well as using the drivers as plugins, uses
>> driver-specific functions, which means that it explicitly links
>> against the pmd .so files.
>> * Those driver .so files are not in with the other libraries, so ld.so
>> does not find the pmd, and the installed testpmd fails to run due to
>> missing library dependencies.
>> * The workaround is to add the drivers path to the ld load path, but we
>> should not require ld library path changes just to get DPDK apps to
>> work.
>> 
>> Using ioctls instead of driver-specific functions would solve this.
>> 
>> My 2c.  
> 
> My 2c. No.
> 
> Short answer:
> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
> despised by Linux kernel developers. They provide an unstructured, 
> unsecured,
> back door for device driver abuse. Try to get a new driver in Linux with
> a unique ioctl, and it will be hard to get accepted.
> 
> Long answer:
> So far every device specific feature has fit into ethdev model. Doing 
> ioctl
> is admitting "it is too hard to be general, we need need an out". For 
> something
> that is a flag, it should fit into existing config model; ignoring silly 
> ABI constraints.
> For a real feature (think flow direction), we want a first class API for 
> that.
> For a wart, then devargs will do.
> 
> Give a good example of something that should be an ioctl. Don't build the
> API first and then let it get cluttered.  
 
 I agree with Stephen.
 
 And please do not forget that ioctl still requires an API:
 the argument that you put in ioctl is the API of the feature.
 So it is the same thing as defining a new function.  
>>> 
>>> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
>>> and prevent argument check by compiler.
>>> 
>>> BUT, the number of the increasing PMD specific APIs are also worrying,
>>> it is becoming harder to maintain, and I believe this is something NOT
>>> sustainable in long run.
>>> 
>>> 
>>> What about having *eth_dev_extended_ops* 

[dpdk-dev] [dpdk-announce] DPDK 17.08 released

2017-08-08 Thread Thomas Monjalon
A new major release is available:
http://fast.dpdk.org/rel/dpdk-17.08.tar.xz

Some highlights:
- x86 requires SSE4.2
- more ARM optimizations using NEON
- service cores API
- GRO library
- traffic management API (QoS)
- ethdev flow fuzzy match
- ethdev flow isolate mode
- lock-free Tx queue capability API
- failsafe driver
- cryptodev rework
- eventdev burst mode
- NXP DPAA2 eventdev driver
- dpdk-test-eventdev application
- eventdev_pipeline_sw_pmd example

More details in the release notes:
http://dpdk.org/doc/guides/rel_notes/release_17_08.html

A lot of work was done during 3 months:
1023 patches from 125 authors
1018 files changed, 90842 insertions(+), 22245 deletions(-)

There are 38 new contributors
(including authors, reviewers and testers):
Thanks to Ashwin Sekhar T K, Balasubramanian Manoharan, Boris Pismenny,
Changpeng Liu, Cian Ferriter, Dahir Osman, Dariusz Stojaczyk,
Dirk-Holger Lenz, Gabriel Carrillo, George Wilkie, Harrison McCullough,
Herbert Guan, Ivan Dyukov, Jamie Lavigne, Jens Freimann, Jesse Bruni,
Kirill Rybalchenko, Leonid Myravjev, Mandeep Rohilla, Markus Theil,
Matan Azrad, Michael Lilja, Moti Haimovsky, Radu Nicolau, Ray Kinsella,
Robert Shearman, RongQiang Xie, Sangjin Han, Sha Zhang, Shachar Beiser,
Shahed Shaikh, Steeven Li, Tom Barbette, Tonghao Zhang, Vincent S. Cojot,
Wen Chiu, Xiaoyun Li and Xingyou Chen.

Below is the number of patches per authors grouped per company
(accuracy may be not perfect):
423 Intel (53)
130 Cavium (9)
100 6WIND (6)
 89 NXP (4)
 75 Mellanox (9)
 49 Broadcom (1)
 31 unknown (8)
 20 Chelsio (1)
 13 Microsoft (1)
 13 Brocade (6)
 12 Solarflare (2)
  8 Cisco (1)
  7 Linaro (1)
  7 CESNET (1)
  7 Atomic Rules (1)
  6 RedHat (4)
  5 Samsung (2)
  5 OKTET Labs (2)
  3 Wind River (2)
  3 IBM (1)
  3 ARM (1)
  2 Amazon (2)
  1 Semihalf (1)
  1 Netronome (1)
  1 Napatech (1)
  1 Huawei (1)
  1 HP (1)
  1 Big Switch (1)
  1 AT&T (1)

The new features for the 17.11 cycle must be submitted before August 25,
in order to be reviewed and integrated during September.
The next release is expected to happen at the very beginning of November.

Let's continue the good work!
Thanks everyone


Re: [dpdk-dev] [PATCH 1/2] eventdev: add event adapter for ethernet Rx queues

2017-08-08 Thread Eads, Gage


> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Thursday, August 3, 2017 1:23 AM
> To: Eads, Gage 
> Cc: Rao, Nikhil ; dev@dpdk.org; tho...@monjalon.net;
> Richardson, Bruce ; Van Haaren, Harry
> ; hemant.agra...@nxp.com;
> nipun.gu...@nxp.com; Vangati, Narender ;
> Gujjar, Abhinandan S 
> Subject: Re: [PATCH 1/2] eventdev: add event adapter for ethernet Rx queues
> 
> -Original Message-
> > Date: Wed, 2 Aug 2017 19:19:32 +
> > From: "Eads, Gage" 
> > To: Jerin Jacob , "Rao, Nikhil"
> >  
> > CC: "dev@dpdk.org" , "tho...@monjalon.net"
> >  , "Richardson, Bruce"
> > ,  "Van Haaren, Harry"
> , "hemant.agra...@nxp.com"
> >  , "nipun.gu...@nxp.com"
> > ,  "Vangati, Narender"
> , "Gujjar, Abhinandan S"
> >  
> > Subject: RE: [PATCH 1/2] eventdev: add event adapter for ethernet Rx
> > queues
> >
> >
> > 
> >
> > > > >
> > > > > 5) specifying rte_event_eth_rx_adapter_conf.rx_event_port_id on
> > > > > rte_event_eth_rx_adapter_create() would waste one HW eventdev
> > > > > port if its happen to be used RX_ADAPTER_CAP_INBUILT_PORT on
> > > rte_event_eth_rx_adapter_queue_add().
> > > > > unlike SW eventdev port, HW eventdev ports are costly so I
> > > > > think, We need to have another eventdev PMD ops to create
> service/producer ports.
> > > > > Or any other scheme that creates
> > > > > rte_event_eth_rx_adapter_conf.rx_event_port_id
> > > > > on demand by common code.
> > > > >
> > > >
> > > > One solution is:
> > > >
> > > > struct rte_event_eth_rx_adapter_conf {
> > > > uint8_t dev_id;
> > > >
> > > > int (*conf_cb)(uint8_t id, uint8_t port_id, uint32_t flags,
> > > > struct rte_event_eth_rx_adapter_conf *conf);
> > > >
> > > > unsigned int max_nb_rx;
> > > >
> > > > int event_port_id;
> > > >
> > > > char service_name[];
> > > > }
> > > >
> > > > Where dev_id and conf_cb have to be specified in the create call,
> > > > but event_port_id and service_name will be filled in when
> > > > conf_cb() is invoked
> > >
> > > I was thinking like event_port_id will be rte_event_port_count() + 1.
> > > ie When adapter needs the additional port, It can
> > > - stop the eventdev
> > > - reconfigure with rte_event_queue_count() , rte_event_port_count()
> > > + 1
> > > - start the eventdev.
> > >
> > > The only problem with callback is that all the application needs to 
> > > implement
> it.
> > > If you think, application need more control then we can expose
> > > callback and if it is NULL then default handler can be called in common 
> > > code.
> > >
> >
> > I don't think we can rely on there being another port available -- a user 
> > may
> have configured the sw eventdev with all 64 ports, for instance.
> 
> On that case, irrespective any scheme(callback vs non callback) the adapter
> creation would fail. Right?
> 
> > What if the user is required to calculate cfg.nb_event_ports as a function 
> > of
> the RX_ADAPTER_CAP_INBUILT_PORT capability (i.e. add a port if the capability
> is not set), such that a reconfigure is not required?
> 
> We have only one NON INBUILT eventdev port per adapter. Right? i.e in the v1
> spec it was rte_event_eth_rx_adapter_conf.event_port_id,
> How about it can be rte_event_port_count() + 1 ? Since we are NOT linking this
> port, the context call be kept in adapter itself. Right?

It could be. Thinking on it some more, I'm a little concerned about doing 
configuration without the application's knowledge. Possible issues that could 
arise:
- The user later reconfigures the event device with fewer ports and the 
adapter's port becomes invalid, or reconfigures it with more ports and begins 
using the port the adapter is using
- rte_event_port_count() + 1 extends beyond the PMD's capabilities (the sw PMD 
is hard-coded to support a max of 64 ports, for example)

Having the user be responsible for the port configuration could avoid these 
problems. Since the user needs to check the  pair's 
capabilities for the CAP_ADD_QUEUE anyway, they could also check for 
INBUILT_PORT and decide whether or not to request an additional port at 
eventdev configure time -- thereby ensuring they don't waste a port when using 
hardware with inbuilt ports. And this keeps the configuration code in one place 
(the app), rather than spread across the app, adapter, and potentially the 
conf_cb.

Besides these concerns, I think the transparent configuration approach (plus 
conf_cb when necessary to override) would work, but could have issues in the 
aforementioned edge cases.

> >
> > As for application control: that would be a useful option in the conf_cb
> scheme. Some apps will want to configure the adapter's port (its
> new_event_threshold, its queue depths) differently from the default.
> 
> struct rte_event_port_conf * can be passed on the adapter create if 
> application
> needs more control.
> 
> >
> > Thanks,
> > Gage