Re: [dpdk-dev] [PATCH 0/2] support to clear in-flight packets for async

2021-09-15 Thread Xia, Chenbo
Hi Yuan,

> -Original Message-
> From: Wang, YuanX 
> Sent: Thursday, September 9, 2021 2:58 PM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; Xia, Chenbo ; Pai G,
> Sunil ; Hu, Jiayu ; Ding, Xuan
> ; Jiang, Cheng1 ; Ma, WenwuX
> ; Yang, YvonneX ; Wang, YuanX
> 
> Subject: [PATCH 0/2] support to clear in-flight packets for async
> 
> This patch adds support for clearing in-flight packets for async dequeue and
> introduces a thread-safe version of this function.

It would be better to list the patchset this one depends on; otherwise it
adds overhead for reviewers. Luckily I know the dependency, but please add
it in the next version.

Thanks,
Chenbo

> 
> Yuan Wang (2):
>   vhost: support to clear in-flight packets for async dequeue
>   vhost: support thread-safe API for clearing in-flight packets in async
> vhost
> 
>  lib/vhost/rte_vhost_async.h | 21 +
>  lib/vhost/version.map   |  1 +
>  lib/vhost/virtio_net.c  | 46 ++---
>  3 files changed, 65 insertions(+), 3 deletions(-)
> 
> --
> 2.25.1



Re: [dpdk-dev] [PATCH 1/2] vhost: support to clear in-flight packets for async dequeue

2021-09-15 Thread Xia, Chenbo
Hi Yuan,

> -Original Message-
> From: Wang, YuanX 
> Sent: Thursday, September 9, 2021 2:58 PM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; Xia, Chenbo ; Pai G,
> Sunil ; Hu, Jiayu ; Ding, Xuan
> ; Jiang, Cheng1 ; Ma, WenwuX
> ; Yang, YvonneX ; Wang, YuanX
> 
> Subject: [PATCH 1/2] vhost: support to clear in-flight packets for async
> dequeue
> 
> rte_vhost_clear_queue_thread_unsafe() currently clears in-flight
> packets for async enqueue only. Now that async dequeue is
> supported, this API should handle async dequeue too.
> 
> Signed-off-by: Yuan Wang 
> ---
>  lib/vhost/virtio_net.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index e0159b53e3..7f6183a929 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -27,6 +27,11 @@
> 
>  #define VHOST_ASYNC_BATCH_THRESHOLD 32
> 
> +static __rte_always_inline uint16_t
> +async_poll_dequeue_completed_split(struct virtio_net *dev,
> + struct vhost_virtqueue *vq, uint16_t queue_id,
> + struct rte_mbuf **pkts, uint16_t count, bool legacy_ol_flags);
> +
>  static  __rte_always_inline bool
>  rxvq_is_mergeable(struct virtio_net *dev)
>  {
> @@ -2119,11 +2124,6 @@ rte_vhost_clear_queue_thread_unsafe(int vid, uint16_t
> queue_id,
>   return 0;
> 
>   VHOST_LOG_DATA(DEBUG, "(%d) %s\n", dev->vid, __func__);
> - if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->nr_vring))) {
> - VHOST_LOG_DATA(ERR, "(%d) %s: invalid virtqueue idx %d.\n",
> - dev->vid, __func__, queue_id);
> - return 0;
> - }
> 
>   vq = dev->virtqueue[queue_id];
> 
> @@ -2133,7 +2133,11 @@ rte_vhost_clear_queue_thread_unsafe(int vid, uint16_t
> queue_id,
>   return 0;
>   }
> 
> - n_pkts_cpl = vhost_poll_enqueue_completed(dev, queue_id, pkts, count);
> + if ((queue_id % 2) == 0)

You can remove the internal '()'.

> + n_pkts_cpl = vhost_poll_enqueue_completed(dev, queue_id, pkts,
> count);
> + else
> + n_pkts_cpl = async_poll_dequeue_completed_split(dev, vq, 
> queue_id,
> pkts, count,

You should check that we are using a split queue before entering this
split-queue function.

Thanks,
Chenbo

> + dev->flags & 
> VIRTIO_DEV_LEGACY_OL_FLAGS);
> 
>   return n_pkts_cpl;
>  }
> --
> 2.25.1
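
A minimal sketch of the two review comments above (drop the inner
parentheses, and guard the split-ring path before calling it). It uses
hypothetical stand-in types, not the real vhost structures: toy_dev,
toy_clear_queue and the toy_poll_* helpers are illustrative names only.
The even/odd queue convention and the split-ring requirement come from the
patch under review; returning 0 for a packed ring is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device; not the real struct virtio_net. */
struct toy_dev {
	bool packed_ring;      /* true when a packed ring was negotiated */
	uint16_t enq_inflight; /* pretend in-flight count, enqueue side */
	uint16_t deq_inflight; /* pretend in-flight count, dequeue side */
};

static uint16_t
toy_min(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/* Stand-in for the enqueue completion poll. */
static uint16_t
toy_poll_enqueue_completed(struct toy_dev *dev, uint16_t count)
{
	uint16_t n = toy_min(dev->enq_inflight, count);

	dev->enq_inflight -= n;
	return n;
}

/* Stand-in for the split-ring dequeue completion poll. */
static uint16_t
toy_poll_dequeue_completed_split(struct toy_dev *dev, uint16_t count)
{
	uint16_t n = toy_min(dev->deq_inflight, count);

	dev->deq_inflight -= n;
	return n;
}

uint16_t
toy_clear_queue(struct toy_dev *dev, uint16_t queue_id, uint16_t count)
{
	/* Even queue index: enqueue side. No inner parentheses needed. */
	if (queue_id % 2 == 0)
		return toy_poll_enqueue_completed(dev, count);

	/* Odd queue index: dequeue side, only valid for a split ring here. */
	if (dev->packed_ring)
		return 0;

	return toy_poll_dequeue_completed_split(dev, count);
}
```

With a packed ring the dequeue branch leaves the in-flight count untouched,
so the caller can tell that nothing was cleared.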



Re: [dpdk-dev] [PATCH] vhost: add unsafe API to check inflight packets

2021-09-15 Thread Ding, Xuan
Hi Chenbo,

> -Original Message-
> From: Xia, Chenbo 
> Sent: Wednesday, September 15, 2021 2:49 PM
> To: Ding, Xuan ; dev@dpdk.org;
> maxime.coque...@redhat.com
> Cc: Hu, Jiayu ; cheng.ji...@intel.com; Richardson, Bruce
> ; Pai G, Sunil 
> Subject: RE: [PATCH] vhost: add unsafe API to check inflight packets
> 
> Hi Xuan,
> 
> > -Original Message-
> > From: Ding, Xuan 
> > Sent: Thursday, September 9, 2021 1:58 PM
> > To: dev@dpdk.org; maxime.coque...@redhat.com; Xia, Chenbo
> > 
> > Cc: Hu, Jiayu ; cheng.ji...@intel.com; Richardson, Bruce
> > ; Pai G, Sunil ; Ding,
> Xuan
> > 
> > Subject: [PATCH] vhost: add unsafe API to check inflight packets
> >
> > In async data path, when vring state changes, it is necessary to
> > know the number of inflight packets in DMA engine. This patch
> > provides a thread unsafe API to return the number of inflight
> > packets without using any lock.
> >
> > Signed-off-by: Xuan Ding 
> > ---
> >  doc/guides/prog_guide/vhost_lib.rst|  5 +
> >  doc/guides/rel_notes/release_21_11.rst |  5 +
> >  lib/vhost/rte_vhost_async.h| 14 ++
> >  lib/vhost/version.map  |  3 +++
> >  lib/vhost/vhost.c  | 25 +
> >  5 files changed, 52 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/vhost_lib.rst
> > b/doc/guides/prog_guide/vhost_lib.rst
> > index 8874033165..b4b1134f54 100644
> > --- a/doc/guides/prog_guide/vhost_lib.rst
> > +++ b/doc/guides/prog_guide/vhost_lib.rst
> > @@ -305,6 +305,11 @@ The following is an overview of some key Vhost API
> > functions:
> >This function returns the amount of in-flight packets for the vhost
> >queue using async acceleration.
> >
> > +``rte_vhost_async_get_inflight_thread_unsafe(vid, queue_id)``
> > +
> > +  Get the number of inflight packets for a vhost queue without
> > +  performing any locking.
> > +
> >  * ``rte_vhost_clear_queue_thread_unsafe(vid, queue_id, **pkts, count)``
> 
> This does not align with others. Please check.

Thanks, will fix it in the next version.

> 
> >
> >Clear inflight packets which are submitted to DMA engine in vhost async
> > data
> > diff --git a/doc/guides/rel_notes/release_21_11.rst
> > b/doc/guides/rel_notes/release_21_11.rst
> > index 675b573834..db080e9490 100644
> > --- a/doc/guides/rel_notes/release_21_11.rst
> > +++ b/doc/guides/rel_notes/release_21_11.rst
> > @@ -55,6 +55,11 @@ New Features
> >   Also, make sure to start the actual text at the margin.
> >   ===
> >
> > +* **Added vhost API to get the number of inflight packets.**
> > +
> > +  Added an API which can get the number of inflight packets in
> > +  vhost async data path.
> > +
> 
> Please add 'without lock' or something similar as we already have a lock 
> version.

You are right, adding "without lock" is more accurate.

> 
> >  * **Enabled new devargs parser.**
> >
> >* Enabled devargs syntax
> > diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> > index b25ff446f7..0af414bf78 100644
> > --- a/lib/vhost/rte_vhost_async.h
> > +++ b/lib/vhost/rte_vhost_async.h
> > @@ -246,6 +246,20 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid,
> > uint16_t queue_id,
> >  __rte_experimental
> >  int rte_vhost_async_get_inflight(int vid, uint16_t queue_id);
> >
> > +/**
> > + * This function is lock-free version to return the amount of in-flight
> > + * packets for the vhost queue which uses async channel acceleration.
> > + *
> > + * @param vid
> > + *  id of vhost device to enqueue data
> > + * @param queue_id
> > + *  queue id to enqueue data
> 
> You can also check dequeue inflight packets, right?

Yes, this API applies to both enqueue and dequeue directions.

> 
> > + * @return
> > + *  the amount of in-flight packets on success; -1 on failure
> > + */
> > +__rte_experimental
> > +int rte_vhost_async_get_inflight_thread_unsafe(int vid, uint16_t queue_id);
> > +
> >  /**
> >   * This function checks async completion status and clear packets for
> >   * a specific vhost device queue. Packets which are inflight will be
> > diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> > index c92a9d4962..b150dc408d 100644
> > --- a/lib/vhost/version.map
> > +++ b/lib/vhost/version.map
> > @@ -85,4 +85,7 @@ EXPERIMENTAL {
> > rte_vhost_async_channel_register_thread_unsafe;
> > rte_vhost_async_channel_unregister_thread_unsafe;
> > rte_vhost_clear_queue_thread_unsafe;
> > +
> > +   #added in 21.11
> > +   rte_vhost_async_get_inflight_thread_unsafe;
> >  };
> > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> > index 355ff37651..df96f84873 100644
> > --- a/lib/vhost/vhost.c
> > +++ b/lib/vhost/vhost.c
> > @@ -1886,5 +1886,30 @@ int rte_vhost_async_get_inflight(int vid, uint16_t
> > queue_id)
> > return ret;
> >  }
> >
> > +int rte_vhost_async_get_inflight_thread_unsafe(int vid, uint16_t queue_id)
> 
> According to DPDK coding style, return 

Re: [dpdk-dev] [PATCH] net/bonding: fix memory leak on closing device

2021-09-15 Thread Yu, DapengX



> -Original Message-
> From: Min Hu (Connor) 
> Sent: Wednesday, September 15, 2021 2:59 PM
> To: Yu, DapengX ; Chas Williams 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: Re: [PATCH] net/bonding: fix memory leak on closing device
> 
> Hi, dapengx,
>   Why not free internals->kvlist at the end of
> "bond_ethdev_configure"?
> Does it cause some bugs?

I just tried not to deviate too much from the previous fix, 144dc4739975
("net/bonding: fix leak on remove"), since it is reasonable.
Releasing the port resources in bond_ethdev_close() also keeps memory leak
detection tools from reporting a leak after the device is closed.

Freeing internals->kvlist at the end of "bond_ethdev_configure" is also OK.

> 
> On 2021/9/15 13:08, dapengx...@intel.com wrote:
> > From: Dapeng Yu 
> >
> > If the bond device was created by vdev mode, the kvlist was not freed
> > after the bond device was closed.
> >
> > This patch fixes it.
> >
> > Fixes: 144dc4739975 ("net/bonding: fix leak on remove")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Dapeng Yu 
> > ---
> >   drivers/net/bonding/rte_eth_bond_pmd.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
> > b/drivers/net/bonding/rte_eth_bond_pmd.c
> > index a6755661c4..2e96b850fb 100644
> > --- a/drivers/net/bonding/rte_eth_bond_pmd.c
> > +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
> > @@ -2163,6 +2163,9 @@ bond_ethdev_close(struct rte_eth_dev *dev)
> >  */
> > rte_mempool_free(internals->mode6.mempool);
> >
> > +   if (internals->kvlist != NULL)
> > +   rte_kvargs_free(internals->kvlist);
> > +
> > return 0;
> >   }
> >
> > @@ -3475,8 +3478,6 @@ bond_remove(struct rte_vdev_device *dev)
> > ret = bond_ethdev_stop(eth_dev);
> > bond_ethdev_close(eth_dev);
> > }
> > -   if (internals->kvlist != NULL)
> > -   rte_kvargs_free(internals->kvlist);
> > rte_eth_dev_release_port(eth_dev);
> >
> > return ret;
> >
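
A minimal sketch of the ownership move discussed above: the close path
releases the key/value list, and the remove path relies on close instead of
freeing it again. All names here (toy_kvlist, toy_internals, toy_close,
toy_remove) are hypothetical stand-ins, not the bonding PMD's real types.
Clearing the pointer after the free is an extra precaution assumed for this
sketch; it is not part of the patch.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's private data. */
struct toy_kvlist {
	int count;
};

struct toy_internals {
	struct toy_kvlist *kvlist;
};

/* Close path: owns the release of the parsed arguments. */
int
toy_close(struct toy_internals *in)
{
	free(in->kvlist);  /* free(NULL) is a no-op, so no NULL check needed */
	in->kvlist = NULL; /* a second close cannot double free */
	return 0;
}

/* Remove path: delegates to close instead of freeing kvlist itself. */
int
toy_remove(struct toy_internals *in)
{
	return toy_close(in);
}
```

Because the pointer is reset, calling toy_close() followed by toy_remove()
stays safe, which is the situation a leak checker exercises after close.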


Re: [dpdk-dev] [PATCH 2/2] vhost: support thread-safe API for clearing in-flight packets in async vhost

2021-09-15 Thread Xia, Chenbo
Hi Yuan,

> -Original Message-
> From: Wang, YuanX 
> Sent: Thursday, September 9, 2021 2:58 PM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; Xia, Chenbo ; Pai G,
> Sunil ; Hu, Jiayu ; Ding, Xuan
> ; Jiang, Cheng1 ; Ma, WenwuX
> ; Yang, YvonneX ; Wang, YuanX
> 
> Subject: [PATCH 2/2] vhost: support thread-safe API for clearing in-flight
> packets in async vhost

support -> add

> 
> This patch adds thread-safe version for
> clearing in-flight packets function.
> 
> Signed-off-by: Yuan Wang 
> ---
>  lib/vhost/rte_vhost_async.h | 21 +
>  lib/vhost/version.map   |  1 +
>  lib/vhost/virtio_net.c  | 36 
>  3 files changed, 58 insertions(+)

The release note update is missing.

> 
> diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> index 5e2429ab70..a418e0a03d 100644
> --- a/lib/vhost/rte_vhost_async.h
> +++ b/lib/vhost/rte_vhost_async.h
> @@ -261,6 +261,27 @@ int rte_vhost_async_get_inflight(int vid, uint16_t
> queue_id);
>  __rte_experimental
>  uint16_t rte_vhost_clear_queue_thread_unsafe(int vid, uint16_t queue_id,
>   struct rte_mbuf **pkts, uint16_t count);
> +
> +/**
> + * This function checks async completion status and clear packets for
> + * a specific vhost device queue. Packets which are inflight will be
> + * returned in an array.
> + *
> + * @param vid
> + *  ID of vhost device to clear data
> + * @param queue_id
> + *  Queue id to clear data
> + * @param pkts
> + *  Blank array to get return packet pointer
> + * @param count
> + *  Size of the packet array
> + * @return
> + *  Number of packets returned
> + */
> +__rte_experimental
> +uint16_t rte_vhost_clear_queue(int vid, uint16_t queue_id,
> + struct rte_mbuf **pkts, uint16_t count);
> +
>  /**
>   * This function tries to receive packets from the guest with offloading
>   * copies to the async channel. The packets that are transfer completed
> diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> index 3d566a6d5f..f78cc89b58 100644
> --- a/lib/vhost/version.map
> +++ b/lib/vhost/version.map
> @@ -88,4 +88,5 @@ EXPERIMENTAL {
> 
>   # added in 21.11
>   rte_vhost_async_try_dequeue_burst;
> + rte_vhost_clear_queue;
>  };
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 7f6183a929..51693a7c35 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -2142,6 +2142,42 @@ rte_vhost_clear_queue_thread_unsafe(int vid, uint16_t
> queue_id,
>   return n_pkts_cpl;
>  }
> 
> +uint16_t
> +rte_vhost_clear_queue(int vid, uint16_t queue_id, struct rte_mbuf **pkts,
> uint16_t count)
> +{
> + struct virtio_net *dev = get_device(vid);
> + struct vhost_virtqueue *vq;
> + uint16_t n_pkts_cpl;
> +
> + if (!dev)
> + return 0;
> +
> + VHOST_LOG_DATA(DEBUG, "(%d) %s\n", dev->vid, __func__);
> +

Should check queue id here.

> + vq = dev->virtqueue[queue_id];
> +
> + if (unlikely(!vq->async_registered)) {
> + VHOST_LOG_DATA(ERR, "(%d) %s: async not registered for queue
> id %d.\n",
> + dev->vid, __func__, queue_id);
> + return 0;
> + }
> +
> + if (!rte_spinlock_trylock(&vq->access_lock)) {
> + VHOST_LOG_CONFIG(ERR, "Failed to clear async queue, virt queue
> busy.\n");

Should be VHOST_LOG_DATA. And please add vid and qid info in the log.

> + return 0;
> + }
> +
> + if ((queue_id % 2) == 0)

You can remove internal '()'

> + n_pkts_cpl = vhost_poll_enqueue_completed(dev, queue_id, pkts,
> count);
> + else
> + n_pkts_cpl = async_poll_dequeue_completed_split(dev, vq, 
> queue_id,
> pkts, count,

Add check to make sure it's split queue.

Thanks,
Chenbo

> + dev->flags & 
> VIRTIO_DEV_LEGACY_OL_FLAGS);
> +
> + rte_spinlock_unlock(&vq->access_lock);
> +
> + return n_pkts_cpl;
> +}
> +
>  static __rte_always_inline uint32_t
>  virtio_dev_rx_async_submit(struct virtio_net *dev, uint16_t queue_id,
>   struct rte_mbuf **pkts, uint32_t count)
> --
> 2.25.1
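
A minimal sketch of the shape the review asks for: validate the queue index
first, then take the per-queue lock with a try-lock and give up if the queue
is busy. It uses C11 atomic_flag as a stand-in for the spinlock, and
toy_dev, toy_vq, toy_dev_init and toy_clear_queue_safe are hypothetical
names, not the vhost library's API. Logging is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_QUEUES 4

/* Hypothetical stand-in for a virtqueue. */
struct toy_vq {
	atomic_flag busy;       /* set while a thread owns the queue */
	bool async_registered;
	uint16_t inflight;
};

/* Hypothetical stand-in for the device. */
struct toy_dev {
	uint32_t nr_vring;
	struct toy_vq vq[TOY_MAX_QUEUES];
};

void
toy_dev_init(struct toy_dev *dev, uint32_t nr_vring)
{
	uint32_t i;

	dev->nr_vring = nr_vring < TOY_MAX_QUEUES ? nr_vring : TOY_MAX_QUEUES;
	for (i = 0; i < TOY_MAX_QUEUES; i++) {
		atomic_flag_clear(&dev->vq[i].busy);
		dev->vq[i].async_registered = false;
		dev->vq[i].inflight = 0;
	}
}

uint16_t
toy_clear_queue_safe(struct toy_dev *dev, uint16_t queue_id, uint16_t count)
{
	struct toy_vq *vq;
	uint16_t n;

	/* Check the queue index before touching the queue array. */
	if (dev == NULL || queue_id >= dev->nr_vring)
		return 0;

	vq = &dev->vq[queue_id];
	if (!vq->async_registered)
		return 0;

	/* Try-lock: a set flag means another thread holds the queue. */
	if (atomic_flag_test_and_set_explicit(&vq->busy, memory_order_acquire))
		return 0;

	n = vq->inflight < count ? vq->inflight : count;
	vq->inflight -= n;

	atomic_flag_clear_explicit(&vq->busy, memory_order_release);
	return n;
}
```

Returning 0 on a busy queue mirrors the patch's behaviour; the caller is
expected to retry later.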



Re: [dpdk-dev] [PATCH] net/softnic: remove experimental table from API

2021-09-15 Thread Ferruh Yigit
On 9/1/2021 2:48 PM, Kinsella, Ray wrote:
> 
> 
> On 01/09/2021 13:20, Jasvinder Singh wrote:
>> This API was introduced in 18.08, therefore removing
>> experimental tag to promote it to stable state.
>>
>> Signed-off-by: Jasvinder Singh 
>> ---
>>  drivers/net/softnic/rte_eth_softnic.h | 1 -
>>  drivers/net/softnic/version.map   | 7 +--
>>  2 files changed, 1 insertion(+), 7 deletions(-)
>>
> Acked-by: Ray Kinsella 
> 

Applied to dpdk-next-net/main, thanks.


Re: [dpdk-dev] [PATCH] net/bnxt: fix Rx queue startup state

2021-09-15 Thread Ajit Khaparde
On Tue, Sep 14, 2021 at 5:51 AM Lance Richardson
 wrote:
>
> Since the addition of support for runtime queue setup,
> receive queues that are started by default no longer
> have the correct state. Fix this by setting the state
> when a port is started.
>
> Fixes: 0105ea1296c9 ("net/bnxt: support runtime queue setup")
> Signed-off-by: Lance Richardson 
> Reviewed-by: Ajit Khaparde 
> Reviewed-by: Somnath Kotur 
> Reviewed-by: Kalesh Anakkur Purayil 
Patch applied to dpdk-next-net-brcm. Thanks

> ---
>  drivers/net/bnxt/bnxt_ethdev.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
> index d6e3847963..097dd10de9 100644
> --- a/drivers/net/bnxt/bnxt_ethdev.c
> +++ b/drivers/net/bnxt/bnxt_ethdev.c
> @@ -482,6 +482,12 @@ static int bnxt_setup_one_vnic(struct bnxt *bp, uint16_t 
> vnic_id)
> rxq->vnic->fw_grp_ids[j] = INVALID_HW_RING_ID;
> else
> vnic->rx_queue_cnt++;
> +
> +   if (!rxq->rx_deferred_start) {
> +   bp->eth_dev->data->rx_queue_state[j] =
> +   RTE_ETH_QUEUE_STATE_STARTED;
> +   rxq->rx_started = true;
> +   }
> }
>
> PMD_DRV_LOG(DEBUG, "vnic->rx_queue_cnt = %d\n", vnic->rx_queue_cnt);
> --
> 2.25.1
>


Re: [dpdk-dev] [PATCH] config/ppc: fix build with GCC >= 10

2021-09-15 Thread Ferruh Yigit
On 9/15/2021 6:08 AM, David Marchand wrote:
> Like for python, multiline statements in meson must either use a
> backslash character (explicit continuation) or be enclosed in ()
> (implicit continuation).
> 
> python PEP8 recommends the latter [1], and it looks like meson had
> an issue with backslash before 0.50 [2].
> 
> 1: https://www.python.org/dev/peps/pep-0008/#multiline-if-statements
> 2: https://github.com/mesonbuild/meson/commit/90c9b868b20b
> 
> Fixes: 394407f50c90 ("config/ppc: ignore GCC 11 psabi warnings")
> 
> Reported-by: Ferruh Yigit 
> Signed-off-by: David Marchand 
> ---
>  config/ppc/meson.build | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/config/ppc/meson.build b/config/ppc/meson.build
> index 0b1948fc7c..aa1327a595 100644
> --- a/config/ppc/meson.build
> +++ b/config/ppc/meson.build
> @@ -20,8 +20,8 @@ endif
>  
>  # Suppress the gcc warning "note: the layout of aggregates containing
>  # vectors with 4-byte alignment has changed in GCC 5".
> -if cc.get_id() == 'gcc' and cc.version().version_compare('>=10.0') and
> -cc.version().version_compare('<12.0') and 
> cc.has_argument('-Wno-psabi')
> +if (cc.get_id() == 'gcc' and cc.version().version_compare('>=10.0') and
> +cc.version().version_compare('<12.0') and 
> cc.has_argument('-Wno-psabi'))
>  add_project_arguments('-Wno-psabi', language: 'c')
>  endif
>  
> 

Tested-by: Ferruh Yigit 


Re: [dpdk-dev] [PATCH] net/bonding: fix memory leak on closing device

2021-09-15 Thread Min Hu (Connor)

Acked-by: Min Hu (Connor) 

On 2021/9/15 15:18, Yu, DapengX wrote:




-Original Message-
From: Min Hu (Connor) 
Sent: Wednesday, September 15, 2021 2:59 PM
To: Yu, DapengX ; Chas Williams 
Cc: dev@dpdk.org; sta...@dpdk.org
Subject: Re: [PATCH] net/bonding: fix memory leak on closing device

Hi, dapengx,
Why not free internals->kvlist at the end of
"bond_ethdev_configure"?
Does it cause some bugs?


I just tried not to deviate too much from the previous fix, 144dc4739975
("net/bonding: fix leak on remove"), since it is reasonable.
Releasing the port resources in bond_ethdev_close() also keeps memory leak
detection tools from reporting a leak after the device is closed.

Freeing internals->kvlist at the end of "bond_ethdev_configure" is also OK.



On 2021/9/15 13:08, dapengx...@intel.com wrote:

From: Dapeng Yu 

If the bond device was created by vdev mode, the kvlist was not freed
after the bond device was closed.

This patch fixes it.

Fixes: 144dc4739975 ("net/bonding: fix leak on remove")
Cc: sta...@dpdk.org

Signed-off-by: Dapeng Yu 
---
   drivers/net/bonding/rte_eth_bond_pmd.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
b/drivers/net/bonding/rte_eth_bond_pmd.c
index a6755661c4..2e96b850fb 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -2163,6 +2163,9 @@ bond_ethdev_close(struct rte_eth_dev *dev)
 */
rte_mempool_free(internals->mode6.mempool);

+   if (internals->kvlist != NULL)
+   rte_kvargs_free(internals->kvlist);
+
return 0;
   }

@@ -3475,8 +3478,6 @@ bond_remove(struct rte_vdev_device *dev)
ret = bond_ethdev_stop(eth_dev);
bond_ethdev_close(eth_dev);
}
-   if (internals->kvlist != NULL)
-   rte_kvargs_free(internals->kvlist);
rte_eth_dev_release_port(eth_dev);

return ret;





Re: [dpdk-dev] [PATCH] ethdev: promote set ptypes API to stable

2021-09-15 Thread Ferruh Yigit
On 9/2/2021 4:56 PM, Kinsella, Ray wrote:
> 
> 
> On 02/09/2021 09:17, pbhagavat...@marvell.com wrote:
>> From: Pavan Nikhilesh 
>>
>> Remove experimental tag from rte_eth_dev_set_ptypes().
>>
>> Signed-off-by: Pavan Nikhilesh 
> Acked-by: Ray Kinsella 
> 

Applied to dpdk-next-net/main, thanks.


Re: [dpdk-dev] [dpdk-stable] [PATCH] net/af_xdp: fix support of secondary process

2021-09-15 Thread Ferruh Yigit
On 9/3/2021 5:15 PM, Stephen Hemminger wrote:
> Doing basic operations like info_get or get_stats was broken
> in af_xdp PMD. The info_get would crash because dev->device
> was NULL in secondary process. Fix this by doing same initialization
> as af_packet and tap devices.
> 
> The get_stats would crash because the XDP socket is not open in
> primary process. As a workaround don't query kernel for dropped
> packets when called from secondary process.
> 
> Note: this does not address the other bug which is that transmitting
> in secondary process is broken because the send() in tx_kick
> will fail because XDP socket fd is not valid in secondary process.
> 
> Bugzilla ID: 805
> Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
> Cc: sta...@dpdk.org
> Cc: xiaolong...@intel.com
> Ciara Loftus 
> Qi Zhang 
> Anatoly Burakov 
> 

+cc Ciara, Qi & Anatoly

> Signed-off-by: Stephen Hemminger 
> ---
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
> b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 74ffa4511284..70abc14fa753 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -860,7 +860,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct 
> rte_eth_stats *stats)
>   struct pkt_rx_queue *rxq;
>   struct pkt_tx_queue *txq;
>   socklen_t optlen;
> - int i, ret;
> + int i;
>  
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {
>   optlen = sizeof(struct xdp_statistics);
> @@ -876,13 +876,12 @@ eth_stats_get(struct rte_eth_dev *dev, struct 
> rte_eth_stats *stats)
>   stats->ibytes += stats->q_ibytes[i];
>   stats->imissed += rxq->stats.rx_dropped;
>   stats->oerrors += txq->stats.tx_dropped;
> - ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP,
> - XDP_STATISTICS, &xdp_stats, &optlen);
> - if (ret != 0) {
> - AF_XDP_LOG(ERR, "getsockopt() failed for 
> XDP_STATISTICS.\n");
> - return -1;
> - }
> - stats->imissed += xdp_stats.rx_dropped;
> +
> + /* The socket fd is not valid in secondary process */
> + if (rte_eal_process_type() != RTE_PROC_SECONDARY &&
> + getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP,
> +XDP_STATISTICS, &xdp_stats, &optlen) == 0)
> + stats->imissed += xdp_stats.rx_dropped;
>  
>   stats->opackets += stats->q_opackets[i];
>   stats->obytes += stats->q_obytes[i];
> @@ -1799,7 +1798,9 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
>   AF_XDP_LOG(ERR, "Failed to probe %s\n", name);
>   return -EINVAL;
>   }
> + /* TODO: reconnect socket from primary */
>   eth_dev->dev_ops = &ops;
> + eth_dev->device = &dev->device;
>   rte_eth_dev_probing_finish(eth_dev);
>   return 0;
>   }
> 
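
A minimal sketch of the workaround described in the commit message above:
software counters are always reported, and the kernel is only queried for
dropped packets when the caller is not a secondary process. Every name here
(toy_proc_type, toy_stats, toy_stats_get, the toy_query_* helpers) is a
hypothetical stand-in; the kernel query is modelled as a callback rather
than a real getsockopt() call.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_proc_type { TOY_PROC_PRIMARY, TOY_PROC_SECONDARY };

struct toy_stats {
	uint64_t imissed;
};

/* Returns 0 and fills *dropped on success, non-zero on failure. */
typedef int (*toy_kernel_query_fn)(uint64_t *dropped);

/* Example query that succeeds and reports 7 kernel-side drops. */
int
toy_query_ok(uint64_t *dropped)
{
	*dropped = 7;
	return 0;
}

/* Example query that fails, as it would with an invalid socket fd. */
int
toy_query_fail(uint64_t *dropped)
{
	(void)dropped;
	return -1;
}

void
toy_stats_get(enum toy_proc_type type, uint64_t sw_dropped,
	      toy_kernel_query_fn query, struct toy_stats *stats)
{
	uint64_t kernel_dropped = 0;

	/* Software-side drops are valid in every process. */
	stats->imissed = sw_dropped;

	/* Skip the kernel query in a secondary process; ignore failures. */
	if (type != TOY_PROC_SECONDARY && query != NULL &&
	    query(&kernel_dropped) == 0)
		stats->imissed += kernel_dropped;
}
```

A failed query no longer aborts the whole stats call, which matches the
patch's change from returning -1 to simply skipping the kernel counter.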



Re: [dpdk-dev] [PATCH] net/i40e/base: fix the resource leakage problem

2021-09-15 Thread Zhang, Qi Z



> -Original Message-
> From: dev  On Behalf Of
> chenqiming_hua...@163.com
> Sent: Saturday, August 21, 2021 2:30 PM
> To: dev@dpdk.org
> Cc: Xing, Beilei ; Qiming Chen
> ; sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH] net/i40e/base: fix the resource leakage problem
> 
> From: Qiming Chen 
> 
> In the i40e_init_arq function, when the i40e_config_arq_regs function returns
> from processing failure, the previously applied arq_bufs resource is not
> released, which leads to leakage.
> The patch is processed in the same way as the i40e_init_asq function,
> maintaining a unified coding style.
> 
> Fixes: 49ea51605be4 ("net/i40e/base: gracefully clean the resources")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Qiming Chen 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi
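
A minimal sketch of the error-path pattern the commit message describes:
buffers are allocated first, and if the later register configuration step
fails they are released before returning. The names toy_ring,
toy_config_regs and toy_init_ring are hypothetical, and the failure is
injected through a parameter purely for illustration; this is not the i40e
base code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the admin receive queue state. */
struct toy_ring {
	void *bufs;
	int configured;
};

/* Stand-in for the register configuration step; fails on request. */
static int
toy_config_regs(int inject_failure)
{
	return inject_failure ? -1 : 0;
}

int
toy_init_ring(struct toy_ring *r, size_t nb_bufs, int inject_failure)
{
	int ret;

	r->configured = 0;
	r->bufs = calloc(nb_bufs, 64);
	if (r->bufs == NULL)
		return -1;

	ret = toy_config_regs(inject_failure);
	if (ret != 0)
		goto free_bufs;

	r->configured = 1;
	return 0;

free_bufs:
	/* Release what was allocated before the failing step. */
	free(r->bufs);
	r->bufs = NULL;
	return ret;
}
```

Keeping a single cleanup label per acquired resource makes the failure path
easy to extend when more setup steps are added.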



Re: [dpdk-dev] [PATCH] net/i40e: fix vf resource leakage problem

2021-09-15 Thread Zhang, Qi Z



> -Original Message-
> From: Zhang, Qi Z
> Sent: Wednesday, September 15, 2021 9:27 AM
> To: 'chenqiming_hua...@163.com' ;
> dev@dpdk.org
> Cc: Xing, Beilei ; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] net/i40e: fix vf resource leakage problem
> 
> 
> 
> > -Original Message-
> > From: dev  On Behalf Of
> > chenqiming_hua...@163.com
> > Sent: Saturday, August 21, 2021 4:14 PM
> > To: dev@dpdk.org
> > Cc: Xing, Beilei ; Qiming Chen
> > ; sta...@dpdk.org
> > Subject: [dpdk-dev] [PATCH] net/i40e: fix vf resource leakage problem
> >
> > From: Qiming Chen 
> >
> > In the i40evf_dev_init function, when the MAC memory alloc fails, the
> > previously initialized vf resource is not released, resulting in leakage.
> > The patch calls the i40evf_uninit_vf function in the abnormal branch
> > to release resources.
> >
> > Fixes: 5c9222058df7 ("i40e: move to drivers/net/")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Qiming Chen 
> 
> Acked-by: Qi Zhang 
> 
> Applied to dpdk-next-net-intel.

Sorry, I acked the wrong patch. This patch will not be applied, as i40evf
will be removed in this release; it should be submitted to LTS directly.

> 
> Thanks
> Qi



Re: [dpdk-dev] [PATCH v3] telemetry: add support for dicts of dicts

2021-09-15 Thread Power, Ciara
Hi Radu,

>-Original Message-
>From: Nicolau, Radu 
>Sent: Tuesday 14 September 2021 17:05
>To: Power, Ciara 
>Cc: dev@dpdk.org; Richardson, Bruce ; Nicolau,
>Radu ; Doherty, Declan
>
>Subject: [PATCH v3] telemetry: add support for dicts of dicts
>
>Add support for dicts of dicts to telemetry library.
>Increase the max string size to 128.
>
>Signed-off-by: Declan Doherty 
>Signed-off-by: Radu Nicolau 
>---
> app/test/test_telemetry_data.c | 29 
>lib/telemetry/rte_telemetry.h  |  2 +-
> lib/telemetry/telemetry.c  | 48 +-
> lib/telemetry/telemetry_data.c |  3 ++-
> 4 files changed, 74 insertions(+), 8 deletions(-)
>

Thanks,

Acked-by: Ciara Power 




Re: [dpdk-dev] [PATCH 2/2] net/cnxk: callback for getting link status

2021-09-15 Thread Nithin Kumar Dabilpuram



Acked-by: Nithin Dabilpuram 

On 7/30/21 9:38 PM, Harman Kalra wrote:

Adding a new callback for reading the link status. PF can read its
link status and can forward the same to VF once it comes up.

Signed-off-by: Harman Kalra 
---
  drivers/net/cnxk/cnxk_ethdev.c |  9 +
  drivers/net/cnxk/cnxk_ethdev.h |  2 ++
  drivers/net/cnxk/cnxk_link.c   | 23 +++
  3 files changed, 34 insertions(+)

diff --git a/drivers/net/cnxk/cnxk_ethdev.c b/drivers/net/cnxk/cnxk_ethdev.c
index 0e3652ed51..7152dcd002 100644
--- a/drivers/net/cnxk/cnxk_ethdev.c
+++ b/drivers/net/cnxk/cnxk_ethdev.c
@@ -1314,6 +1314,10 @@ cnxk_eth_dev_init(struct rte_eth_dev *eth_dev)
/* Register up msg callbacks */
roc_nix_mac_link_cb_register(nix, cnxk_eth_dev_link_status_cb);
  
+   /* Register up msg callbacks */
+   roc_nix_mac_link_info_get_cb_register(nix,
+ cnxk_eth_dev_link_status_get_cb);
+
dev->eth_dev = eth_dev;
dev->configured = 0;
dev->ptype_disable = 0;
@@ -1415,6 +1419,11 @@ cnxk_eth_dev_uninit(struct rte_eth_dev *eth_dev, bool 
reset)
/* Disable link status events */
roc_nix_mac_link_event_start_stop(nix, false);
  
+   /* Unregister the link update op, this is required to stop VFs from
+* receiving link status updates on exit path.
+*/
+   roc_nix_mac_link_cb_unregister(nix);
+
/* Free up SQs */
for (i = 0; i < eth_dev->data->nb_tx_queues; i++) {
dev_ops->tx_queue_release(eth_dev->data->tx_queues[i]);
diff --git a/drivers/net/cnxk/cnxk_ethdev.h b/drivers/net/cnxk/cnxk_ethdev.h
index 4eead03905..4caf26303f 100644
--- a/drivers/net/cnxk/cnxk_ethdev.h
+++ b/drivers/net/cnxk/cnxk_ethdev.h
@@ -349,6 +349,8 @@ int cnxk_nix_rss_hash_conf_get(struct rte_eth_dev *eth_dev,
  void cnxk_nix_toggle_flag_link_cfg(struct cnxk_eth_dev *dev, bool set);
  void cnxk_eth_dev_link_status_cb(struct roc_nix *nix,
 struct roc_nix_link_info *link);
+void cnxk_eth_dev_link_status_get_cb(struct roc_nix *nix,
+struct roc_nix_link_info *link);
  int cnxk_nix_link_update(struct rte_eth_dev *eth_dev, int wait_to_complete);
  int cnxk_nix_queue_stats_mapping(struct rte_eth_dev *dev, uint16_t queue_id,
 uint8_t stat_idx, uint8_t is_rx);
diff --git a/drivers/net/cnxk/cnxk_link.c b/drivers/net/cnxk/cnxk_link.c
index 3fdbdba495..6a70801675 100644
--- a/drivers/net/cnxk/cnxk_link.c
+++ b/drivers/net/cnxk/cnxk_link.c
@@ -45,6 +45,29 @@ nix_link_status_print(struct rte_eth_dev *eth_dev, struct 
rte_eth_link *link)
plt_info("Port %d: Link Down", (int)(eth_dev->data->port_id));
  }
  
+void
+cnxk_eth_dev_link_status_get_cb(struct roc_nix *nix,
+   struct roc_nix_link_info *link)
+{
+   struct cnxk_eth_dev *dev = (struct cnxk_eth_dev *)nix;
+   struct rte_eth_link eth_link;
+   struct rte_eth_dev *eth_dev;
+
+   if (!link || !nix)
+   return;
+
+   eth_dev = dev->eth_dev;
+   if (!eth_dev)
+   return;
+
+   rte_eth_linkstatus_get(eth_dev, &eth_link);
+
+   link->status = eth_link.link_status;
+   link->speed = eth_link.link_speed;
+   link->autoneg = eth_link.link_autoneg;
+   link->full_duplex = eth_link.link_duplex;
+}
+
  void
  cnxk_eth_dev_link_status_cb(struct roc_nix *nix, struct roc_nix_link_info 
*link)
  {



[dpdk-dev] [PATCH v2 0/2] i40e Rx descriptor loads ordering

2021-09-15 Thread Ruifeng Wang
On the Rx path, the NIC fills the Rx descriptor with data pertaining to the
received packet.

A single descriptor consists of multiple words. Word1 has the bit that
indicates readiness of the descriptor for software to use, so word1 should
be loaded before the other words.

On architectures with weaker memory ordering, a barrier is needed to ensure
the ordering of loads.

This patch set fixes the risk on both the scalar path and the aarch64 vector
path.

v2:
Updated commit message. Performance impact added. (Honnappa)

Ruifeng Wang (2):
  net/i40e: fix risk in Rx descriptor read in NEON vector path
  net/i40e: fix risk in Rx descriptor read in scalar path

 drivers/net/i40e/i40e_rxtx.c  | 12 
 drivers/net/i40e/i40e_rxtx_vec_neon.c |  8 
 2 files changed, 20 insertions(+)

-- 
2.25.1



[dpdk-dev] [PATCH v2 1/2] net/i40e: fix risk in Rx descriptor read in NEON vector path

2021-09-15 Thread Ruifeng Wang
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.

In the NEON vector PMD, a vector load reads two contiguous 8B words of
descriptor data into a vector register. Since a vector load does not
guarantee 16B atomicity, the read of the word that includes the DD field
could be reordered after the reads of other words. In that case, some words
could contain invalid data.

A read barrier is added after the read of qword1, which includes the DD
field, and qword0 is reloaded to update the vector register. This ensures
that the fetched data is correct.

Testpmd single core test on N1SDP/ThunderX2 showed no performance drop.

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: sta...@dpdk.org

Signed-off-by: Ruifeng Wang 
Reviewed-by: Honnappa Nagarahalli 
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c 
b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index b2683fda60..71191c7cc8 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -286,6 +286,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict 
rxq,
descs[1] =  vld1q_u64((uint64_t *)(rxdp + 1));
descs[0] =  vld1q_u64((uint64_t *)(rxdp));
 
+   /* Use acquire fence to order loads of descriptor qwords */
+   rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+   /* A.2 reload qword0 to make it ordered after qword1 load */
+   descs[3] = vld1q_lane_u64((uint64_t *)(rxdp + 3), descs[3], 0);
+   descs[2] = vld1q_lane_u64((uint64_t *)(rxdp + 2), descs[2], 0);
+   descs[1] = vld1q_lane_u64((uint64_t *)(rxdp + 1), descs[1], 0);
+   descs[0] = vld1q_lane_u64((uint64_t *)(rxdp), descs[0], 0);
+
/* B.1 load 4 mbuf point */
mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);
-- 
2.25.1



[dpdk-dev] [PATCH v2 2/2] net/i40e: fix risk in Rx descriptor read in scalar path

2021-09-15 Thread Ruifeng Wang
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.

Since the entire descriptor is not read atomically, on systems with a
relaxed memory ordering model such as AArch64, the read of the word
containing the DD field could be reordered after the reads of other words.

A read barrier is inserted between the read of the word with the DD field
and the reads of the other words. The barrier ensures that the fetched
data is correct.

Testpmd single core test showed no performance drop on x86 or N1SDP.
On ThunderX2, 22% performance regression was observed.

Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
Cc: sta...@dpdk.org

Signed-off-by: Ruifeng Wang 
Reviewed-by: Honnappa Nagarahalli 
---
 drivers/net/i40e/i40e_rxtx.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 8329cbdd4e..c4cd6b6b60 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -746,6 +746,12 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
break;
}
 
+   /**
+* Use acquire fence to ensure that qword1 which includes DD
+* bit is loaded before loading of other descriptor words.
+*/
+   rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
nb_hold++;
rxe = &sw_ring[rx_id];
@@ -862,6 +868,12 @@ i40e_recv_scattered_pkts(void *rx_queue,
break;
}
 
+   /**
+* Use acquire fence to ensure that qword1 which includes DD
+* bit is loaded before loading of other descriptor words.
+*/
+   rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
nb_hold++;
rxe = &sw_ring[rx_id];
-- 
2.25.1
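
The ordering described in the commit message above can be illustrated with
a minimal sketch in portable C11. It assumes a hypothetical two-word
descriptor (struct sketch_rx_desc, with bit 0 of qword1 standing in for DD)
rather than the real i40e layout, and it uses atomic_thread_fence() from
<stdatomic.h> in place of rte_atomic_thread_fence(__ATOMIC_ACQUIRE). All
names prefixed with sketch_ or SKETCH_ are illustrative only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16B descriptor; not the real i40e layout. */
struct sketch_rx_desc {
    volatile uint64_t qword0;   /* payload word written by the device */
    _Atomic uint64_t qword1;    /* status word; bit 0 stands in for DD */
};

#define SKETCH_DD_BIT UINT64_C(1)

/*
 * Copy both descriptor words only if the DD bit is set.
 * Returns 1 when a completed descriptor was copied, 0 otherwise.
 */
static int
sketch_read_desc(struct sketch_rx_desc *desc, uint64_t *qw0_out,
                 uint64_t *qw1_out)
{
    uint64_t qw1;

    assert(desc != NULL && qw0_out != NULL && qw1_out != NULL);

    /* Step 1: read the word that carries the DD bit. */
    qw1 = atomic_load_explicit(&desc->qword1, memory_order_relaxed);
    if ((qw1 & SKETCH_DD_BIT) == 0)
        return 0;

    /*
     * Step 2: the acquire fence keeps the load of qword0 below from
     * being reordered before the load of qword1 above.
     */
    atomic_thread_fence(memory_order_acquire);

    /* Step 3: only now read the remaining descriptor word. */
    *qw0_out = desc->qword0;
    *qw1_out = qw1;
    return 1;
}
```

The scalar path in the patch follows the same shape: check DD, fence, then
copy the whole descriptor with "rxd = *rxdp".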



Re: [dpdk-dev] [PATCH v1] lib/ethdev: fix a typo in ethdev comment

2021-09-15 Thread Ferruh Yigit
On 9/6/2021 3:13 AM, Joyce Kong wrote:
> Fix a typo that mb_pool was misspelt as mp_pool.
> 
> Fixes: 4ff702b5dfa9 ("ethdev: introduce Rx buffer split")
> Cc: Viacheslav Ovsiienko 
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Joyce Kong 
> Reviewed-by: Ruifeng Wang 

Acked-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.


Re: [dpdk-dev] [PATCH v2] Warns if IPv4, UDP or TCP checksum offload not available

2021-09-15 Thread Ananyev, Konstantin



> -Original Message-
> From: Stephen Hemminger 
> Sent: Wednesday, September 15, 2021 12:44 AM
> To: Ananyev, Konstantin 
> Cc: Usama Nadeem ; tho...@monjalon.net; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] Warns if IPv4, UDP or TCP checksum offload 
> not available
> 
> On Tue, 14 Sep 2021 22:22:04 +
> "Ananyev, Konstantin"  wrote:
> 
> > >
> > > From: usamanadeem321 
> > >
> > > Checks if IPV4, UDP and TCP Checksum offloads are available.
> > > If not available, prints a warning message.
> > >
> > > Bugzilla ID: 545
> > > Signed-off-by: usamanadeem321 
> > > ---
> > >  examples/l3fwd/main.c | 22 +-
> > >  1 file changed, 21 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> > > index 00ac267af1..ae62bc570d 100644
> > > --- a/examples/l3fwd/main.c
> > > +++ b/examples/l3fwd/main.c
> > > @@ -123,7 +123,6 @@ static struct rte_eth_conf port_conf = {
> > >   .mq_mode = ETH_MQ_RX_RSS,
> > >   .max_rx_pkt_len = RTE_ETHER_MAX_LEN,
> > >   .split_hdr_size = 0,
> > > - .offloads = DEV_RX_OFFLOAD_CHECKSUM,
> > >   },
> > >   .rx_adv_conf = {
> > >   .rss_conf = {
> > > @@ -1039,6 +1038,27 @@ l3fwd_poll_resource_setup(void)
> > >   local_port_conf.txmode.offloads |=
> > >   DEV_TX_OFFLOAD_MBUF_FAST_FREE;
> > >
> > > + if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_IPV4_CKSUM)
> > > + local_port_conf.rxmode.offloads |=
> > > +  DEV_RX_OFFLOAD_IPV4_CKSUM;
> > > + else {
> > > + printf("WARNING: IPV4 Checksum offload not 
> > > available.\n");
> > > + }
> > > +
> > > + if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_UDP_CKSUM)
> > > + local_port_conf.rxmode.offloads |=
> > > + DEV_RX_OFFLOAD_UDP_CKSUM;
> > > +
> > > + else
> > > + printf("WARNING: UDP Checksum offload not 
> > > available.\n");
> > > +
> > > + if (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_TCP_CKSUM)
> > > + local_port_conf.rxmode.offloads |=
> > > + DEV_RX_OFFLOAD_TCP_CKSUM;
> > > +
> > > + else
> > > + printf("WARNING: TCP Checksum offload not 
> > > available.\n");
> > > +
> >
> > Sorry, but I didn't get the logic:
> > Application expects some offloads to be supported by HW.
> 
> The application is expecting more offloads than is necessary for basic
> IP level forwarding which is all the example is documented to do.
> 
>   "The application performs L3 forwarding."
> 
> > You add the code that checks for offloads, but if they are not supported 
> > just prints warning
> > and continues, as if everything is ok. Doesn't look like correct behaviour 
> > to me.
> > I think, it should either terminate with error message or be prepared to 
> > work properly
> > on HW without these offloads (check cksums in SW if necessary).
> > In fact I don't see what was wrong with original behaviour, one thing that 
> > probably
> > was missing - more descriptive error message.
> 
> It is not a problem with your patch, it is fine.
> 
> It is a problem in how l3fwd has grown and changed and no longer really what
> was intended in the original version. There is no reason that the application
> should be looking at L4 data. In fact, it shouldn't care if it gets TCP, UDP,
> SCTP or DCCP;
> but the application now depends on ptype.
> 
> It should be possible to do L3 forwarding independent of packet type.
> The application only needs to look at Ether type and do IPv4 or IPv6 based on 
> that.
> 

As I remember, l3fwd cares about L4 headers (and cksums) because it can make
forwarding decisions based on the 5-tuple (exact-match mode).
I presume that's the reason the L4 cksum offloads were enabled in the first place.
For LPM/FIB I believe an IPv4 cksum check should be sufficient.
If we believe that some offloads are excessive,
then I think the right way is to simply remove them
(updating the docs and source accordingly).
Just printing warnings and continuing seems wrong to me.
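
To make the "terminate with an error message" alternative concrete, here is
a minimal sketch. It does not use the real ethdev flags or structures: the
SKETCH_RX_* bits and sketch_check_offloads() are hypothetical stand-ins for
the DEV_RX_OFFLOAD_* capability bits and for the check l3fwd would perform
against dev_info.rx_offload_capa.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical capability bits, standing in for DEV_RX_OFFLOAD_*. */
#define SKETCH_RX_IPV4_CKSUM (UINT64_C(1) << 0)
#define SKETCH_RX_UDP_CKSUM  (UINT64_C(1) << 1)
#define SKETCH_RX_TCP_CKSUM  (UINT64_C(1) << 2)

/*
 * Compare the offloads the application requires against what the
 * device reports. Stores the unsupported bits in *missing and
 * returns 0 if everything is supported, -1 otherwise.
 */
static int
sketch_check_offloads(uint64_t capa, uint64_t required, uint64_t *missing)
{
    assert(missing != NULL);

    *missing = required & ~capa;
    if (*missing != 0) {
        /* Descriptive error instead of a warning that is then ignored. */
        fprintf(stderr,
                "required Rx offloads not supported: 0x%" PRIx64 "\n",
                *missing);
        return -1;
    }
    return 0;
}
```

A caller would stop port setup on a -1 return, or trim the required mask
per lookup mode (for example IPv4 cksum only for LPM/FIB) before calling it.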
  





Re: [dpdk-dev] [PATCH 1/2] net/i40e: fix risk in Rx descriptor read in NEON vector path

2021-09-15 Thread Ruifeng Wang
> -Original Message-
> From: Honnappa Nagarahalli 
> Sent: Wednesday, September 15, 2021 2:33 AM
> To: Ruifeng Wang ; dev@dpdk.org
> Cc: beilei.x...@intel.com; qi.z.zh...@intel.com;
> bruce.richard...@intel.com; jer...@marvell.com;
> hemant.agra...@nxp.com; d...@linux.vnet.ibm.com; sta...@dpdk.org; nd
> ; Ruifeng Wang ; Honnappa
> Nagarahalli ; nd 
> Subject: RE: [PATCH 1/2] net/i40e: fix risk in Rx descriptor read in NEON
> vector path
> 
> 
> Similar comments that I have to patch 2/2
> 
> >
> > Rx descriptor is 16B/32B in size and consists of multiple words.
> > The word that includes DD field should be read first. Read result with
> > DD bit set indicates the rest part in a descriptor is valid.
> Suggest rewording as follows:
> Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates that the 
> rest of
> the descriptor words have valid values. Hence, the word containing DD bit
> must be read first before reading the rest of the descriptor words.
> 
> >
> > In NEON vector PMD, vector load loads two contiguous 8B of descriptor
> > data into vector register. Given vector load ensures no 16B atomicity,
> > read of the word that includes DD field could be reordered after read
> > of other words. In this case, some words could be invalid data.
> "some words could contain invalid data"
> 
> >
> > Read barrier is added after read of qword1 that includes DD field.
> > And qword0 is reloaded to update vector register. This ensures what
> > fetched is correct descriptor data.
> "This ensures that the fetched data is correct".
> 
> Suggest capturing the performance impact, so it is clearly documented.

Added performance impact to commit message in v2.
> >
> > Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Ruifeng Wang 
> With the above comments,
> Reviewed-by: Honnappa Nagarahalli 
> 

Thanks for your review.
Comments are addressed in v2.
> > ---
> >  drivers/net/i40e/i40e_rxtx_vec_neon.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > index b2683fda60..71191c7cc8 100644
> > --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> > @@ -286,6 +286,14 @@ _recv_raw_pkts_vec(struct i40e_rx_queue
> > *__rte_restrict rxq,
> > descs[1] =  vld1q_u64((uint64_t *)(rxdp + 1));
> > descs[0] =  vld1q_u64((uint64_t *)(rxdp));
> >
> > +   /* Use acquire fence to order loads of descriptor qwords */
> > +   rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> > +   /* A.2 reload qword0 to make it ordered after qword1 load
> */
> > +   descs[3] = vld1q_lane_u64((uint64_t *)(rxdp + 3), descs[3],
> > 0);
> > +   descs[2] = vld1q_lane_u64((uint64_t *)(rxdp + 2), descs[2],
> > 0);
> > +   descs[1] = vld1q_lane_u64((uint64_t *)(rxdp + 1), descs[1],
> > 0);
> > +   descs[0] = vld1q_lane_u64((uint64_t *)(rxdp), descs[0], 0);
> > +
> > /* B.1 load 4 mbuf point */
> > mbp1 = vld1q_u64((uint64_t *)&sw_ring[pos]);
> > mbp2 = vld1q_u64((uint64_t *)&sw_ring[pos + 2]);
> > --
> > 2.25.1



[dpdk-dev] [Bug 810] driver i40e: pci can't been probed。

2021-09-15 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=810

Bug ID: 810
   Summary: driver i40e:  pci can't been probed。
   Product: DPDK
   Version: 21.11
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: 1031265...@qq.com
  Target Milestone: ---

First, I have to say that I used the latest DPDK and igb_uio versions.
I bound the network card to the igb_uio driver, but the card can't be probed
when I start the DPDK process.
This bug was found only on kernel "3.10.0-1160.41.1.el7.x86_64".

All the software and hardware configuration details are listed below.

netcard drivers:
filename:   /lib/modules/3.10.0-1160.41.1.el7.x86_64/kernel/drivers/net/ethernet/intel/i40e/i40e.ko.xz
version:2.8.20-k
license:GPL v2
description:Intel(R) Ethernet Connection XL710 Network Driver
author: Intel Corporation, 
retpoline:  Y
rhelversion:7.9
srcversion: F7480CBA290FAAF2FAF95C7
alias:  pci:v8086d158Bsv*sd*bc*sc*i*
alias:  pci:v8086d158Asv*sd*bc*sc*i*
alias:  pci:v8086d0D58sv*sd*bc*sc*i*
alias:  pci:v8086d0CF8sv*sd*bc*sc*i*
alias:  pci:v8086d1588sv*sd*bc*sc*i*
alias:  pci:v8086d1587sv*sd*bc*sc*i*
alias:  pci:v8086d37D3sv*sd*bc*sc*i*
alias:  pci:v8086d37D2sv*sd*bc*sc*i*
alias:  pci:v8086d37D1sv*sd*bc*sc*i*
alias:  pci:v8086d37D0sv*sd*bc*sc*i*
alias:  pci:v8086d37CFsv*sd*bc*sc*i*
alias:  pci:v8086d37CEsv*sd*bc*sc*i*
alias:  pci:v8086d104Fsv*sd*bc*sc*i*
alias:  pci:v8086d104Esv*sd*bc*sc*i*
alias:  pci:v8086d15FFsv*sd*bc*sc*i*
alias:  pci:v8086d1589sv*sd*bc*sc*i*
alias:  pci:v8086d1586sv*sd*bc*sc*i*
alias:  pci:v8086d1585sv*sd*bc*sc*i*
alias:  pci:v8086d1584sv*sd*bc*sc*i*
alias:  pci:v8086d1583sv*sd*bc*sc*i*
alias:  pci:v8086d1581sv*sd*bc*sc*i*
alias:  pci:v8086d1580sv*sd*bc*sc*i*
alias:  pci:v8086d1574sv*sd*bc*sc*i*
alias:  pci:v8086d1572sv*sd*bc*sc*i*
depends:ptp
intree: Y
vermagic:   3.10.0-1160.41.1.el7.x86_64 SMP mod_unload modversions 
signer: CentOS Linux kernel signing key
sig_key:4B:E3:B8:E9:52:F4:81:B2:62:51:AC:E4:66:9B:A7:99:71:D1:F1:AF
sig_hashalgo:   sha256
parm:   debug:Debug level (0=none,...,16=all), Debug mask (0x8XXX) (uint)


centos-ver and linux source ver:
3.10.0-1160.41.1.el7.x86_64 #1 SMP Tue Aug 31 14:52:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

CentOS Linux release 7.9.2009 (Core)

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [dpdk-dev] [PATCH 1/5] net/virtio: implement rte_power_monitor API

2021-09-15 Thread Xia, Chenbo
Hi Miao,

> -Original Message-
> From: Li, Miao 
> Sent: Friday, September 10, 2021 9:06 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; maxime.coque...@redhat.com; Li, Miao
> 
> Subject: [PATCH 1/5] net/virtio: implement rte_power_monitor API
> 
> This patch implements rte_power_monitor API in virtio PMD to reduce
> power consumption when no packet come in. According to current semantics
> of power monitor, this commit adds a callback function to decide whether
> aborts the sleep by checking current value against the expected value and
> virtio_get_monitor_addr to provide address to monitor. When no packet come
> in, the value of address will not be changed and the running core will
> sleep. Once packets arrive, the value of address will be changed and the
> running core will wakeup.
> 
> Signed-off-by: Miao Li 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 57 ++
>  1 file changed, 57 insertions(+)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index e58085a2c9..4ce49936f5 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -73,6 +73,8 @@ static int virtio_mac_addr_set(struct rte_eth_dev *dev,
>   struct rte_ether_addr *mac_addr);
> 
>  static int virtio_intr_disable(struct rte_eth_dev *dev);
> +static int virtio_get_monitor_addr(void *rx_queue,
> + struct rte_power_monitor_cond *pmc);
> 
>  static int virtio_dev_queue_stats_mapping_set(
>   struct rte_eth_dev *eth_dev,
> @@ -975,6 +977,7 @@ static const struct eth_dev_ops virtio_eth_dev_ops = {
>   .mac_addr_add= virtio_mac_addr_add,
>   .mac_addr_remove = virtio_mac_addr_remove,
>   .mac_addr_set= virtio_mac_addr_set,
> + .get_monitor_addr= virtio_get_monitor_addr,
>  };
> 
>  /*
> @@ -1306,6 +1309,60 @@ virtio_mac_addr_set(struct rte_eth_dev *dev, struct
> rte_ether_addr *mac_addr)
>   return 0;
>  }
> 
> +#define CLB_VAL_IDX 0
> +#define CLB_MSK_IDX 1
> +static int
> +virtio_packed_monitor_callback(const uint64_t value,
> + const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
> +{
> + const uint64_t m = opaque[CLB_MSK_IDX];
> + const uint64_t v = opaque[CLB_VAL_IDX];
> +
> + return (value & m) == v ? -1 : 0;
> +}
> +
> +static int
> +virtio_split_monitor_callback(const uint64_t value,
> + const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
> +{
> + const uint64_t m = opaque[CLB_MSK_IDX];
> + const uint64_t v = opaque[CLB_VAL_IDX];
> +
> + return (value & m) == v ? 0 : -1;
> +}
> +
> +static int
> +virtio_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
> +{
> + struct virtnet_rx *rxvq = rx_queue;
> + struct virtqueue *vq = virtnet_rxq_to_vq(rxvq);
> + struct virtio_hw *hw = vq->hw;
> + if (vq == NULL)
> + return -EINVAL;
> + if (virtio_with_packed_queue(hw)) {
> + struct vring_packed_desc *desc;
> + desc = vq->vq_packed.ring.desc;
> + pmc->addr = &desc[vq->vq_used_cons_idx].flags;
> + if (vq->vq_packed.used_wrap_counter)
> + pmc->opaque[CLB_VAL_IDX] =
> + VRING_PACKED_DESC_F_AVAIL_USED;
> + else
> + pmc->opaque[CLB_VAL_IDX] = 0;
> + pmc->opaque[CLB_MSK_IDX] = VRING_PACKED_DESC_F_AVAIL_USED;
> + pmc->fn = virtio_packed_monitor_callback;
> + pmc->size = sizeof(uint16_t);

I suggest using sizeof(desc[vq->vq_used_cons_idx].flags) or sizeof(desc->flags)
in case the type of the flags field changes.

> + } else {
> + pmc->addr = &vq->vq_split.ring.used->idx;
> + pmc->opaque[CLB_VAL_IDX] = vq->vq_used_cons_idx
> + & (vq->vq_nentries - 1);
> + pmc->opaque[CLB_MSK_IDX] = vq->vq_nentries - 1;
> + pmc->fn = virtio_split_monitor_callback;
> + pmc->size = sizeof(uint16_t);

Same here.

Thanks,
Chenbo

> + }
> +
> + return 0;
> +}
> +
>  static int
>  virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  {
> --
> 2.25.1
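
For readers following the callback semantics in the quoted patch, here is a
minimal standalone sketch. It assumes, as the patch does, that a return of
-1 aborts the sleep and 0 lets the core sleep. The SKETCH_* names and
struct sketch_packed_desc are hypothetical stand-ins, not the real virtio
or rte_power definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_VAL_IDX   0
#define SKETCH_MSK_IDX   1
#define SKETCH_OPAQUE_SZ 4  /* stand-in for RTE_POWER_MONITOR_OPAQUE_SZ */

/* Hypothetical packed descriptor with a 16-bit flags field. */
struct sketch_packed_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Packed ring: abort the sleep once the masked flags match the expected value. */
static int
sketch_packed_cb(uint64_t value, const uint64_t opaque[SKETCH_OPAQUE_SZ])
{
    assert(opaque != NULL);
    return (value & opaque[SKETCH_MSK_IDX]) == opaque[SKETCH_VAL_IDX] ? -1 : 0;
}

/* Split ring: keep sleeping while the masked index still matches. */
static int
sketch_split_cb(uint64_t value, const uint64_t opaque[SKETCH_OPAQUE_SZ])
{
    assert(opaque != NULL);
    return (value & opaque[SKETCH_MSK_IDX]) == opaque[SKETCH_VAL_IDX] ? 0 : -1;
}

/* Monitored width derived from the field itself, as suggested in the review. */
static size_t
sketch_monitor_size(const struct sketch_packed_desc *desc)
{
    return sizeof(desc->flags);
}
```

Deriving the size from the field means the monitored width follows the
structure definition if the flags type is ever changed.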



Re: [dpdk-dev] [PATCH v2] ethdev: promote burst mode API to stable

2021-09-15 Thread Ferruh Yigit
On 9/6/2021 7:36 AM, Andrew Rybchenko wrote:
> On 9/6/21 8:56 AM, Haiyue Wang wrote:
>> The DPDK Symbol Bot reports:
>> Please note the symbols listed below have expired. In line with the
>> DPDK ABI policy, they should be scheduled for removal, in the next
>> DPDK release.
>>
>> Symbol
>> rte_eth_rx_burst_mode_get
>> rte_eth_tx_burst_mode_get
>>
>> Signed-off-by: Haiyue Wang 
>> Acked-by: Ferruh Yigit 
>> Acked-by: Ray Kinsella 
> 
> Acked-by: Andrew Rybchenko 
> 

Applied to dpdk-next-net/main, thanks.


Re: [dpdk-dev] [PATCH 2/5] lib/vhost: implement rte_power_monitor API

2021-09-15 Thread Xia, Chenbo
Hi Miao,

> -Original Message-
> From: Li, Miao 
> Sent: Friday, September 10, 2021 9:06 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; maxime.coque...@redhat.com; Li, Miao
> 
> Subject: [PATCH 2/5] lib/vhost: implement rte_power_monitor API

Should be 'vhost: implement rte_power_monitor API'

> 
> This patch defines rte_vhost_power_monitor_cond which is used to pass
> some information to vhost driver. The information is including the address
> to monitor, the expected value, the mask to extract value read from 'addr',
> the flag used to distinguish packed ring or split ring. Vhost driver can
> use these information to fill rte_power_monitor_cond.
> 
> Signed-off-by: Miao Li 
> ---
>  lib/vhost/rte_vhost.h | 33 +
>  lib/vhost/version.map |  3 +++
>  lib/vhost/vhost.c | 30 ++
>  3 files changed, 66 insertions(+)
> 
> diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
> index 8d875e9322..f58643b0a3 100644
> --- a/lib/vhost/rte_vhost.h
> +++ b/lib/vhost/rte_vhost.h
> @@ -38,6 +38,8 @@ extern "C" {
>  #define RTE_VHOST_USER_ASYNC_COPY(1ULL << 7)
>  #define RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS(1ULL << 8)
> 
> +#define VHOST_POWER_MONITOR_RING_PACKED (1ULL << 0)

I don't quite like introducing this flag, since it forces an application
using the vhost lib to know whether the vring is split or packed. I have
another suggestion that achieves the same thing; please check the comment below.

> +
>  /* Features. */
>  #ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
>   #define VIRTIO_NET_F_GUEST_ANNOUNCE 21
> @@ -292,6 +294,20 @@ struct vhost_device_ops {
>   void *reserved[1]; /**< Reserved for future extension */
>  };
> 
> +/**
> + * Power monitor condition.
> + */
> +struct rte_vhost_power_monitor_cond {
> + volatile void *addr;  /**< Address to monitor for changes */
> + /**< If the `mask` is non-zero, location pointed
> +  *   to by `addr` will be read and compared
> +  *   against this value.
> +  */
> + uint64_t val;
> + uint64_t mask; /**< 64-bit mask to extract value read from `addr` */
> + uint8_t flag;  /**< if 1, vhost packed ring, otherwise split ring */

What about defining two values instead of the flag: one value for when
'(value & m) == v' is true, and another for when it is false?

> +};
> +
>  /**
>   * Convert guest physical address to host virtual address
>   *
> @@ -914,6 +930,23 @@ int rte_vhost_vring_call(int vid, uint16_t vring_idx);
>   */
>  uint32_t rte_vhost_rx_queue_count(int vid, uint16_t qid);
> 
> +/**
> + * Get power monitor address of the vhost device
> + *
> + * @param vid
> + *  vhost device ID
> + * @param queue_id
> + *  vhost queue ID
> + * @param pmc
> + *  power monitor condition
> + * @return
> + *  0 on success, -1 on failure
> + */
> +__rte_experimental
> +int
> +rte_vhost_get_monitor_addr(int vid, uint16_t queue_id,
> + struct rte_vhost_power_monitor_cond *pmc);
> +
>  /**
>   * Get log base and log size of the vhost device
>   *
> diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> index c92a9d4962..0a9667ef1e 100644
> --- a/lib/vhost/version.map
> +++ b/lib/vhost/version.map
> @@ -85,4 +85,7 @@ EXPERIMENTAL {
>   rte_vhost_async_channel_register_thread_unsafe;
>   rte_vhost_async_channel_unregister_thread_unsafe;
>   rte_vhost_clear_queue_thread_unsafe;
> +
> + # added in 21.11
> + rte_vhost_get_monitor_addr;
>  };
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 355ff37651..f7374d3f94 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -1886,5 +1886,35 @@ int rte_vhost_async_get_inflight(int vid, uint16_t
> queue_id)
>   return ret;
>  }
> 
> +int
> +rte_vhost_get_monitor_addr(int vid, uint16_t queue_id,
> + struct rte_vhost_power_monitor_cond *pmc)
> +{
> + struct virtio_net *dev = get_device(vid);

Check that dev is not NULL before accessing its members.

> + struct vhost_virtqueue *vq = dev->virtqueue[queue_id];
> + if (vq == NULL)
> + return -1;
> + if (vq_is_packed(dev)) {
> + struct vring_packed_desc *desc;
> + desc = vq->desc_packed;
> + pmc->addr = &desc[vq->last_avail_idx].flags;
> + if (vq->avail_wrap_counter)
> + pmc->val = VRING_DESC_F_AVAIL;
> + else
> + pmc->val = VRING_DESC_F_USED;
> + pmc->mask = VRING_DESC_F_AVAIL | VRING_DESC_F_USED;
> + pmc->flag = VHOST_POWER_MONITOR_RING_PACKED;
> + } else {
> + pmc->addr = &vq->avail->idx;
> + pmc->val = vq->last_avail_idx & (vq->size - 1);
> + pmc->mask = vq->size - 1;
> + pmc->flag = 0;
> + }
> + if (pmc->addr == NULL)
> + return -1;

Is it possible that addr == NULL?

Thanks,
Chenbo

> +
> + return 0;
> +}
> +
>  RTE_LOG_REGISTER_SUFFIX(vhost_config_log_level, config, INFO);
>  RTE_LOG_REGISTER_SUFFIX(vhost_data_log_level, data, WARNING);
> --
> 2.25.1
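
The NULL check requested above could look roughly like the following
sketch. It uses hypothetical stand-in types (struct sketch_dev, struct
sketch_vq) instead of the real vhost structures. The queue_id bounds check
is an extra assumption that was not raised in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 4

/* Queue ids are 16-bit in this sketch, so the table must fit that range. */
static_assert(SKETCH_MAX_QUEUES <= UINT16_MAX, "queue ids are 16-bit");

/* Hypothetical stand-ins for vhost_virtqueue and virtio_net. */
struct sketch_vq {
    uint16_t avail_idx;
};

struct sketch_dev {
    uint32_t nr_vring;
    struct sketch_vq *virtqueue[SKETCH_MAX_QUEUES];
};

/* Returns 0 and stores the address to monitor, or -1 on any invalid input. */
static int
sketch_get_monitor_addr(struct sketch_dev *dev, uint16_t queue_id,
                        volatile void **addr)
{
    struct sketch_vq *vq;

    /* Validate dev before touching any of its members. */
    if (dev == NULL || addr == NULL)
        return -1;
    /* Assumption: also reject queue ids outside the table. */
    if (queue_id >= dev->nr_vring || queue_id >= SKETCH_MAX_QUEUES)
        return -1;

    vq = dev->virtqueue[queue_id];
    if (vq == NULL)
        return -1;

    /* The address of a member of a valid vq is never NULL. */
    *addr = &vq->avail_idx;
    return 0;
}
```

Because the address is taken from a member of an already validated
structure, a separate "addr == NULL" test afterwards is redundant in this
sketch.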



[dpdk-dev] [PATCH v2 v2] vhost: add unsafe API to check inflight packets

2021-09-15 Thread Xuan Ding
In the async data path, when the vring state changes, it is necessary to
know the number of inflight packets in the DMA engine. This patch
provides a thread-unsafe API that returns the number of inflight
packets without taking any lock.

Signed-off-by: Xuan Ding 
---

v2:
* Fixed some format issues.
---
 doc/guides/prog_guide/vhost_lib.rst|  5 
 doc/guides/rel_notes/release_21_11.rst |  5 
 lib/vhost/rte_vhost_async.h| 14 +
 lib/vhost/version.map  |  3 ++
 lib/vhost/vhost.c  | 41 ++
 5 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 8874033165..0c4fb9ea91 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -305,6 +305,11 @@ The following is an overview of some key Vhost API functions:
   This function returns the amount of in-flight packets for the vhost
   queue using async acceleration.
 
+* ``rte_vhost_async_get_inflight_thread_unsafe(vid, queue_id)``
+
+  Get the number of inflight packets for a vhost queue without
+  performing any locking.
+
 * ``rte_vhost_clear_queue_thread_unsafe(vid, queue_id, **pkts, count)``
 
   Clear inflight packets which are submitted to DMA engine in vhost async data
diff --git a/doc/guides/rel_notes/release_21_11.rst b/doc/guides/rel_notes/release_21_11.rst
index 675b573834..c08e2b80c4 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -55,6 +55,11 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Added vhost API to get the number of inflight packets.**
+
+  Added an API which can get the number of inflight packets in
+  vhost async data path without lock.
+
 * **Enabled new devargs parser.**
 
   * Enabled devargs syntax
diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
index b25ff446f7..0af414bf78 100644
--- a/lib/vhost/rte_vhost_async.h
+++ b/lib/vhost/rte_vhost_async.h
@@ -246,6 +246,20 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
 __rte_experimental
 int rte_vhost_async_get_inflight(int vid, uint16_t queue_id);
 
+/**
+ * This function is lock-free version to return the amount of in-flight
+ * packets for the vhost queue which uses async channel acceleration.
+ *
+ * @param vid
+ *  id of vhost device to enqueue data
+ * @param queue_id
+ *  queue id to enqueue data
+ * @return
+ *  the amount of in-flight packets on success; -1 on failure
+ */
+__rte_experimental
+int rte_vhost_async_get_inflight_thread_unsafe(int vid, uint16_t queue_id);
+
 /**
  * This function checks async completion status and clear packets for
  * a specific vhost device queue. Packets which are inflight will be
diff --git a/lib/vhost/version.map b/lib/vhost/version.map
index c92a9d4962..b150dc408d 100644
--- a/lib/vhost/version.map
+++ b/lib/vhost/version.map
@@ -85,4 +85,7 @@ EXPERIMENTAL {
rte_vhost_async_channel_register_thread_unsafe;
rte_vhost_async_channel_unregister_thread_unsafe;
rte_vhost_clear_queue_thread_unsafe;
+
+   #added in 21.11
+   rte_vhost_async_get_inflight_thread_unsafe;
 };
diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
index 355ff37651..69e9d229af 100644
--- a/lib/vhost/vhost.c
+++ b/lib/vhost/vhost.c
@@ -1500,7 +1500,8 @@ rte_vhost_get_vdpa_device(int vid)
return dev->vdpa_dev;
 }
 
-int rte_vhost_get_log_base(int vid, uint64_t *log_base,
+int
+rte_vhost_get_log_base(int vid, uint64_t *log_base,
uint64_t *log_size)
 {
struct virtio_net *dev = get_device(vid);
@@ -1514,7 +1515,8 @@ int rte_vhost_get_log_base(int vid, uint64_t *log_base,
return 0;
 }
 
-int rte_vhost_get_vring_base(int vid, uint16_t queue_id,
+int
+rte_vhost_get_vring_base(int vid, uint16_t queue_id,
uint16_t *last_avail_idx, uint16_t *last_used_idx)
 {
struct vhost_virtqueue *vq;
@@ -1543,7 +1545,8 @@ int rte_vhost_get_vring_base(int vid, uint16_t queue_id,
return 0;
 }
 
-int rte_vhost_set_vring_base(int vid, uint16_t queue_id,
+int
+rte_vhost_set_vring_base(int vid, uint16_t queue_id,
uint16_t last_avail_idx, uint16_t last_used_idx)
 {
struct vhost_virtqueue *vq;
@@ -1606,7 +1609,8 @@ rte_vhost_get_vring_base_from_inflight(int vid,
return 0;
 }
 
-int rte_vhost_extern_callback_register(int vid,
+int
+rte_vhost_extern_callback_register(int vid,
struct rte_vhost_user_extern_ops const * const ops, void *ctx)
 {
struct virtio_net *dev = get_device(vid);
@@ -1854,7 +1858,8 @@ rte_vhost_async_channel_unregister_thread_unsafe(int vid, uint16_t queue_id)
return 0;
 }
 
-int rte_vhost_async_get_inflight(int vid, uint16_t queue_id)
+int
+rte_vhost_async_get_inflight(int vid, uint16_t queue_id)
 {
struct vhost_virtqueue *vq;
   

Re: [dpdk-dev] [PATCH] config/ppc: fix build with GCC >= 10

2021-09-15 Thread Bruce Richardson
On Wed, Sep 15, 2021 at 09:14:34AM +0100, Ferruh Yigit wrote:
> On 9/15/2021 6:08 AM, David Marchand wrote:
> > Like for python, multiline statements in meson must either use a
> > backslash character (explicit continuation) or be enclosed in ()
> > (implicit continuation).
> > 
> > python PEP8 recommends the latter [1], and it looks like meson had
> > an issue with backslash before 0.50 [2].
> > 
> > 1: https://www.python.org/dev/peps/pep-0008/#multiline-if-statements
> > 2: https://github.com/mesonbuild/meson/commit/90c9b868b20b
> > 
> > Fixes: 394407f50c90 ("config/ppc: ignore GCC 11 psabi warnings")
> > 
> > Reported-by: Ferruh Yigit 
> > Signed-off-by: David Marchand 
> > ---
> >  config/ppc/meson.build | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/config/ppc/meson.build b/config/ppc/meson.build
> > index 0b1948fc7c..aa1327a595 100644
> > --- a/config/ppc/meson.build
> > +++ b/config/ppc/meson.build
> > @@ -20,8 +20,8 @@ endif
> >  
> >  # Suppress the gcc warning "note: the layout of aggregates containing
> >  # vectors with 4-byte alignment has changed in GCC 5".
> > -if cc.get_id() == 'gcc' and cc.version().version_compare('>=10.0') and
> > -cc.version().version_compare('<12.0') and 
> > cc.has_argument('-Wno-psabi')
> > +if (cc.get_id() == 'gcc' and cc.version().version_compare('>=10.0') and
> > +cc.version().version_compare('<12.0') and 
> > cc.has_argument('-Wno-psabi'))
> >  add_project_arguments('-Wno-psabi', language: 'c')
> >  endif
> >  
> > 
> 
> Tested-by: Ferruh Yigit 
Acked-by: Bruce Richardson 



Re: [dpdk-dev] [PATCH 2/2] net/virtio: fix Tx completed mbufs leak on device stop

2021-09-15 Thread Andrew Rybchenko
On 9/13/21 6:41 PM, Maxime Coquelin wrote:
> 
> 
> On 8/18/21 4:13 PM, Andrew Rybchenko wrote:
>> From: Ivan Ilchenko 
>>
>> Free Tx completed mbufs on device stop. Not completed Tx mbufs cannot be
>> freed since they are still in use.
>>
>> Fixes: c1f86306a02 ("virtio: add new driver")
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: Ivan Ilchenko 
>> Signed-off-by: Andrew Rybchenko 
>> ---
>>  drivers/net/virtio/virtio_ethdev.c | 30 ++
>>  1 file changed, 30 insertions(+)
>>
>> diff --git a/drivers/net/virtio/virtio_ethdev.c 
>> b/drivers/net/virtio/virtio_ethdev.c
>> index e58085a2c9..ed3fefee7c 100644
>> --- a/drivers/net/virtio/virtio_ethdev.c
>> +++ b/drivers/net/virtio/virtio_ethdev.c
>> @@ -2393,6 +2393,34 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev 
>> *dev)
>>  PMD_INIT_LOG(DEBUG, "%d mbufs freed", mbuf_num);
>>  }
>>  
>> +static void
>> +virtio_tx_completed_cleanup(struct rte_eth_dev *dev)
>> +{
>> +struct virtio_hw *hw = dev->data->dev_private;
>> +struct virtqueue *vq;
>> +int qidx;
>> +void (*xmit_cleanup)(struct virtqueue *vq, uint16_t nb_used);
>> +
>> +if (virtio_with_packed_queue(hw)) {
>> +if (hw->use_vec_tx)
>> +xmit_cleanup = &virtio_xmit_cleanup_inorder_packed;
>> +else if (virtio_with_feature(hw, VIRTIO_F_IN_ORDER))
>> +xmit_cleanup = &virtio_xmit_cleanup_inorder_packed;
>> +else
>> +xmit_cleanup = &virtio_xmit_cleanup_normal_packed;
>> +} else {
>> +if (hw->use_inorder_tx)
>> +xmit_cleanup = &virtio_xmit_cleanup_inorder;
>> +else
>> +xmit_cleanup = &virtio_xmit_cleanup;
>> +}
>> +
>> +for (qidx = 0; qidx < hw->max_queue_pairs; qidx++) {
>> +vq = hw->vqs[2 * qidx + VTNET_SQ_TQ_QUEUE_IDX];
> 
> Maybe add a check to ensure that vq is non-NULL since it is dereferenced
> later without checking.

Good idea, I'll send v2.

> 
>> +xmit_cleanup(vq, virtqueue_nused(vq));
>> +}
>> +}
>> +
>>  /*
>>   * Stop device: disable interrupt and mark link down
>>   */
>> @@ -2411,6 +2439,8 @@ virtio_dev_stop(struct rte_eth_dev *dev)
>>  goto out_unlock;
>>  hw->started = 0;
>>  
>> +virtio_tx_completed_cleanup(dev);
>> +
>>  if (intr_conf->lsc || intr_conf->rxq) {
>>  virtio_intr_disable(dev);
>>  
>>



[dpdk-dev] [PATCH v2 1/2] net/virtio: fix Tx cleanup functions to have same signature

2021-09-15 Thread Andrew Rybchenko
From: Ivan Ilchenko 

There is a family of functions that clean up after completed transmits.
Fix the packed virtqueue cleanup functions to have the same signature
as the split virtqueue ones. This lets all functions of the family
match the same callback prototype.

Fixes: 892dc798fa9 ("net/virtio: implement Tx path for packed queues")
Cc: sta...@dpdk.org

Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/virtio/virtqueue.h | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 03957b2bd0..d0c48ca415 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -803,25 +803,26 @@ vq_ring_free_id_packed(struct virtqueue *vq, uint16_t id)
 }
 
 static void
-virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int num)
+virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, uint16_t num)
 {
uint16_t used_idx, id, curr_id, free_cnt = 0;
uint16_t size = vq->vq_nentries;
struct vring_packed_desc *desc = vq->vq_packed.ring.desc;
struct vq_desc_extra *dxp;
+   int nb = num;
 
used_idx = vq->vq_used_cons_idx;
/* desc_is_used has a load-acquire or rte_io_rmb inside
 * and wait for used desc in virtqueue.
 */
-   while (num > 0 && desc_is_used(&desc[used_idx], vq)) {
+   while (nb > 0 && desc_is_used(&desc[used_idx], vq)) {
id = desc[used_idx].id;
do {
curr_id = used_idx;
dxp = &vq->vq_descx[used_idx];
used_idx += dxp->ndescs;
free_cnt += dxp->ndescs;
-   num -= dxp->ndescs;
+   nb -= dxp->ndescs;
if (used_idx >= size) {
used_idx -= size;
vq->vq_packed.used_wrap_counter ^= 1;
@@ -837,7 +838,7 @@ virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int num)
 }
 
 static void
-virtio_xmit_cleanup_normal_packed(struct virtqueue *vq, int num)
+virtio_xmit_cleanup_normal_packed(struct virtqueue *vq, uint16_t num)
 {
uint16_t used_idx, id;
uint16_t size = vq->vq_nentries;
@@ -867,7 +868,7 @@ virtio_xmit_cleanup_normal_packed(struct virtqueue *vq, int num)
 
 /* Cleanup from completed transmits. */
 static inline void
-virtio_xmit_cleanup_packed(struct virtqueue *vq, int num, int in_order)
+virtio_xmit_cleanup_packed(struct virtqueue *vq, uint16_t num, int in_order)
 {
if (in_order)
virtio_xmit_cleanup_inorder_packed(vq, num);
-- 
2.30.2
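
A note on the `int nb = num;` local in the patch above: the prototype moves to
uint16_t, but the countdown stays signed. The sketch below is only an
illustration of why a signed counter is convenient when one step can subtract
more than what is left. The function names are hypothetical, and in the real
function the desc_is_used() check also bounds the loop, so this isolates the
counter behaviour only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count down 'num' in chunks of 'step' using a signed local, the way
 * the patch does with 'nb'. The loop ends once the budget is used up,
 * even when the last chunk overshoots it.
 */
static unsigned int
count_steps_signed(uint16_t num, uint16_t step)
{
	unsigned int iterations = 0;
	int nb = num;

	assert(step > 0);
	while (nb > 0) {
		nb -= step;	/* may go negative, which ends the loop */
		iterations++;
	}
	return iterations;
}

/*
 * Same countdown done directly on the uint16_t. An overshoot wraps
 * around to a large positive value, so "num > 0" stays true; 'limit'
 * is only there to bound the demonstration.
 */
static unsigned int
count_steps_unsigned(uint16_t num, uint16_t step, unsigned int limit)
{
	unsigned int iterations = 0;

	assert(step > 0);
	while (num > 0 && iterations < limit) {
		num -= step;	/* wraps modulo 65536 on overshoot */
		iterations++;
	}
	return iterations;
}
```

With num = 5 and step = 2 the signed version stops after three steps, while
the unsigned version only stops because of the artificial limit.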



[dpdk-dev] [PATCH v2 2/2] net/virtio: fix Tx completed mbufs leak on device stop

2021-09-15 Thread Andrew Rybchenko
From: Ivan Ilchenko 

Free Tx completed mbufs on device stop. Not completed Tx mbufs cannot be
freed since they are still in use.

Fixes: c1f86306a02 ("virtio: add new driver")
Cc: sta...@dpdk.org

Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
---
v2:
- check vq pointer vs NULL before calling cleanup function

 drivers/net/virtio/virtio_ethdev.c | 31 ++
 1 file changed, 31 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index da1633d77e..3e3b42eaf6 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -2401,6 +2401,35 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
PMD_INIT_LOG(DEBUG, "%d mbufs freed", mbuf_num);
 }
 
+static void
+virtio_tx_completed_cleanup(struct rte_eth_dev *dev)
+{
+   struct virtio_hw *hw = dev->data->dev_private;
+   struct virtqueue *vq;
+   int qidx;
+   void (*xmit_cleanup)(struct virtqueue *vq, uint16_t nb_used);
+
+   if (virtio_with_packed_queue(hw)) {
+   if (hw->use_vec_tx)
+   xmit_cleanup = &virtio_xmit_cleanup_inorder_packed;
+   else if (virtio_with_feature(hw, VIRTIO_F_IN_ORDER))
+   xmit_cleanup = &virtio_xmit_cleanup_inorder_packed;
+   else
+   xmit_cleanup = &virtio_xmit_cleanup_normal_packed;
+   } else {
+   if (hw->use_inorder_tx)
+   xmit_cleanup = &virtio_xmit_cleanup_inorder;
+   else
+   xmit_cleanup = &virtio_xmit_cleanup;
+   }
+
+   for (qidx = 0; qidx < hw->max_queue_pairs; qidx++) {
+   vq = hw->vqs[2 * qidx + VTNET_SQ_TQ_QUEUE_IDX];
+   if (vq != NULL)
+   xmit_cleanup(vq, virtqueue_nused(vq));
+   }
+}
+
 /*
  * Stop device: disable interrupt and mark link down
  */
@@ -2419,6 +2448,8 @@ virtio_dev_stop(struct rte_eth_dev *dev)
goto out_unlock;
hw->started = 0;
 
+   virtio_tx_completed_cleanup(dev);
+
if (intr_conf->lsc || intr_conf->rxq) {
virtio_intr_disable(dev);
 
-- 
2.30.2



Re: [dpdk-dev] [PATCH] net/virtio: wait device ready in device reset

2021-09-15 Thread Xueming(Steven) Li
On Thu, 2021-08-26 at 07:15 +, Xia, Chenbo wrote:
> Hi Andrew & Xueming,
> 
> > -Original Message-
> > From: Andrew Rybchenko 
> > Sent: Tuesday, August 24, 2021 11:41 PM
> > To: Xueming(Steven) Li 
> > Cc: dev@dpdk.org; Maxime Coquelin ; Xia, Chenbo
> > 
> > Subject: Re: [dpdk-dev] [PATCH] net/virtio: wait device ready in device 
> > reset
> > 
> > On 8/23/21 4:54 PM, Xueming(Steven) Li wrote:
> > > 
> > > 
> > > > -Original Message-
> > > > From: Andrew Rybchenko 
> > > > Sent: Monday, August 23, 2021 5:57 PM
> > > > To: Xueming(Steven) Li 
> > > > Cc: dev@dpdk.org; Maxime Coquelin ; Chenbo 
> > > > Xia
> > 
> > > > Subject: Re: [dpdk-dev] [PATCH] net/virtio: wait device ready in device
> > reset
> > > > 
> > > > On 8/23/21 9:39 AM, Xueming Li wrote:
> > > > > According to virtio spec, the device MUST reset when 0 is written to
> > > > > device_status, and present a 0 in device_status once that is done.
> > > > > 
> > > > > This patch adds the missing part of waiting status 0 in reset 
> > > > > function.
> > > > > 
> > > > > Signed-off-by: Xueming Li 
> > > > > ---
> > > > >  drivers/net/virtio/virtio.c | 7 +--
> > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/net/virtio/virtio.c b/drivers/net/virtio/virtio.c
> > > > > index 7e1e77797f..f003f612d6 100644
> > > > > --- a/drivers/net/virtio/virtio.c
> > > > > +++ b/drivers/net/virtio/virtio.c
> > > > > @@ -3,6 +3,8 @@
> > > > >   * Copyright(c) 2020 Red Hat, Inc.
> > > > >   */
> > > > > 
> > > > > +#include 
> > > > > +
> > > > >  #include "virtio.h"
> > > > > 
> > > > >  uint64_t
> > > > > @@ -39,8 +41,9 @@ void
> > > > >  virtio_reset(struct virtio_hw *hw)
> > > > >  {
> > > > >   VIRTIO_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
> > > > > - /* flush status write */
> > > > > - VIRTIO_OPS(hw)->get_status(hw);
> > > > > + /* Flush status write and wait device ready. */
> > > > > + while (VIRTIO_OPS(hw)->get_status(hw) != 
> > > > > VIRTIO_CONFIG_STATUS_RESET)
> > > > > + usleep(1000L);
> > > > 
> > > > Don't we need a protection against forever loop here?
> > > 
> > > Good question, ideally we need, kernel driver function vp_reset() seems to
> > have same issue.
> > 
> > Yes, I've seen it.
> > 
> > > How about leaving an error message before return?
> > 
> > @Maxime, @Chenbo, what do you think?
> 
> I would vote for waiting for some time before return rather than forever loop
> and error message is needed.
> 
> My understanding is for kernel, it's fine to sleep forever as kernel could 
> schedule
> it but for DPDK, it will lead to main lcore unable to do other things but 
> sleep
> forever. Meanwhile, users will see the app stuck but don't know what's wrong 
> here.
> 
> Thanks,
> Chenbo
> 

Hi all, thanks for your suggestions. A new version is posted:
https://mails.dpdk.org/archives/dev/2021-September/219866.html

>  



Re: [dpdk-dev] [PATCH v1] net/virtio: wait device ready during reset

2021-09-15 Thread Andrew Rybchenko
On 9/15/21 12:21 PM, Xueming Li wrote:
> According to virtio spec, the device MUST reset when 0 is written to
> device_status, and present 0 in device_status once reset is done.
> 
> This patch waits status value to be 0 during reset operation, if
> timeout in 3 seconds, log and continue.

I have no strong opinion on timeout.

> 
> Signed-off-by: Xueming Li 
> Cc: Andrew Rybchenko 
> ---
>  drivers/net/virtio/virtio.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio.c b/drivers/net/virtio/virtio.c
> index 7e1e77797f..f865b27b65 100644
> --- a/drivers/net/virtio/virtio.c
> +++ b/drivers/net/virtio/virtio.c
> @@ -3,7 +3,10 @@
>   * Copyright(c) 2020 Red Hat, Inc.
>   */
>  
> +#include 
> +
>  #include "virtio.h"
> +#include "virtio_logs.h"
>  
>  uint64_t
>  virtio_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
> @@ -38,9 +41,17 @@ virtio_write_dev_config(struct virtio_hw *hw, size_t 
> offset,
>  void
>  virtio_reset(struct virtio_hw *hw)
>  {
> + uint32_t retry = 0;
> +
>   VIRTIO_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
> - /* flush status write */
> - VIRTIO_OPS(hw)->get_status(hw);
> + /* Flush status write and wait device ready max 3 seconds. */
> + while (VIRTIO_OPS(hw)->get_status(hw) != VIRTIO_CONFIG_STATUS_RESET) {
> + if (retry++ > 3000) {
> + PMD_INIT_LOG(WARNING, "device reset timeout");

I think it would be very useful to log ethdev port ID here.

> + break;
> + }
> + usleep(1000L);
> + }
>  }
>  
>  void
> 
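
For readers following the thread, here is a minimal sketch of the bounded wait
being discussed, including the port ID in the timeout message as suggested
above. It is not the driver code: struct demo_dev, demo_get_status() and the
delay callback are hypothetical stand-ins so the example stays plain C (the
patch itself polls through VIRTIO_OPS(hw)->get_status() and sleeps with
usleep()). The figure of 3000 polls of 1 ms comes from the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_STATUS_RESET 0x00	/* value the device presents once reset is done */

/* Hypothetical device model: reports "reset" after a number of reads. */
struct demo_dev {
	uint16_t port_id;
	unsigned int reads_until_reset;
};

/* Hypothetical delay hook; a real caller would sleep for 'usec'. */
typedef void (*demo_delay_fn)(unsigned int usec);

static void
demo_no_delay(unsigned int usec)
{
	(void)usec;	/* the demo does not actually sleep */
}

static uint8_t
demo_get_status(struct demo_dev *dev)
{
	if (dev->reads_until_reset == 0)
		return DEMO_STATUS_RESET;
	dev->reads_until_reset--;
	return 0x0f;	/* any non-zero status: reset still in progress */
}

/*
 * Poll the status at most 'max_polls' times, waiting 1 ms between polls.
 * Returns 0 when the device reports reset, -1 on timeout (after logging).
 */
static int
demo_wait_reset(struct demo_dev *dev, unsigned int max_polls,
		demo_delay_fn delay)
{
	unsigned int poll;

	assert(dev != NULL && delay != NULL);
	for (poll = 0; poll < max_polls; poll++) {
		if (demo_get_status(dev) == DEMO_STATUS_RESET)
			return 0;
		delay(1000);
	}
	fprintf(stderr, "port %u: device reset timeout\n",
		(unsigned int)dev->port_id);
	return -1;
}
```

Returning a status instead of only logging lets the caller decide whether to
continue, which is a design choice of this sketch and not something the patch
does.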



Re: [dpdk-dev] [RFC 0/7] hide eth dev related structures

2021-09-15 Thread Jerin Jacob
On Tue, Sep 14, 2021 at 7:03 PM Ananyev, Konstantin
 wrote:
>
>
> Hi Jerin,
>
> > > NOTE: This is just an RFC to start further discussion and collect the 
> > > feedback.
> > > Due to significant amount of work, changes required are applied only to 
> > > two
> > > PMDs so far: net/i40e and net/ice.
> > > So to build it you'll need to add:
> > > -Denable_drivers='common/*,mempool/*,net/ice,net/i40e'
> > > to your config options.
> >
> > >
> > > That approach was selected to avoid(/minimize) possible performance 
> > > losses.
> > >
> > > So far I have done only a limited amount of functional and performance testing.
> > > Didn't spot any functional problems, and performance numbers
> > > remains the same before and after the patch on my box (testpmd, macswap 
> > > fwd).
> >
> >
> > Based on testing on octeontx2, we see some regression in testpmd and
> > a bit on l3fwd too.
> >
> > Without patch: 73.5mpps/core in testpmd iofwd
> > With patch: 72.5mpps/core in testpmd iofwd
> >
> > Based on my understanding it is due to additional indirection.
>
> From your patch below, it looks like not actually additional indirection,
> but extra memory dereference - func and dev pointers are now stored
> at different places.

Yup. I meant the same. We are on the same page.

> Plus the fact that now we dereference rte_eth_devices[]
> data inside PMD function. Which probably prevents compiler and CPU to load
>  rte_eth_devices[port_id].data and rte_eth_devices[port_id]. 
> pre_tx_burst_cbs[queue_id]
> in advance before calling actual RX/TX function.

Yes.

> About your approach: I don’t mind to add extra opaque 'void *data' pointer,
> but would prefer not to expose callback invocations code into inline function.
> Main reason for that - I think it still need to be reworked to allow 
> adding/removing
> callbacks without stopping the device. Something similar to what was done for 
> cryptodev
> callbacks. To be able to do that in future without another ABI breakage 
> callbacks related part
> needs to be kept internal.
> Though what we probably can do: add two dynamic arrays of opaque pointers to  
> rte_eth_burst_api.
> One for rx/tx queue data pointers, second for rx/tx callback pointers.
> To be more specific, something like:
>
> typedef uint16_t (*rte_eth_rx_burst_t)( void *rxq, struct rte_mbuf **rx_pkts, 
> uint16_t nb_pkts, void *cbs);
> typedef uint16_t (*rte_eth_tx_burst_t)(void *txq, struct rte_mbuf **tx_pkts, 
> uint16_t nb_pkts, void *cbs);
> 
>
> struct rte_eth_burst_api {
> rte_eth_rx_burst_t rx_pkt_burst;
> /**< PMD receive function. */
> rte_eth_tx_burst_t tx_pkt_burst;
> /**< PMD transmit function. */
> rte_eth_tx_prep_t tx_pkt_prepare;
> /**< PMD transmit prepare function. */
> rte_eth_rx_queue_count_t rx_queue_count;
> /**< Get the number of used RX descriptors. */
> rte_eth_rx_descriptor_status_t rx_descriptor_status;
> /**< Check the status of a Rx descriptor. */
> rte_eth_tx_descriptor_status_t tx_descriptor_status;
> /**< Check the status of a Tx descriptor. */
> struct {
>  void **queue_data;   /* point to 
> rte_eth_devices[port_id].data-> rx_queues */
>  void **cbs;  /*  points to 
> rte_eth_devices[port_id].post_rx_burst_cbs */
>} rx_data, tx_data;
> } __rte_cache_aligned;
>
> static inline uint16_t
> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
>  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> {
>struct rte_eth_burst_api *p;
>
> if (port_id >= RTE_MAX_ETHPORTS || queue_id >= 
> RTE_MAX_QUEUES_PER_PORT)
> return 0;
>
>   p =  &rte_eth_burst_api[port_id];
>   return p->rx_pkt_burst(p->rx_data.queue_data[queue_id], rx_pkts, 
> nb_pkts, p->rx_data.cbs[queue_id]);



That works.


> }
>
> Same for TX.
>
> If that looks ok to everyone, I'll try to prepare next version based on that.


Looks good to me.
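
To make the proposal above concrete, here is a small self-contained model of
the flat per-port table with per-queue data and callback pointers. All demo_*
names, sizes and the dummy driver are assumptions made for illustration; this
is not the ethdev API, only a sketch of the dispatch path being discussed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS  4
#define DEMO_MAX_QUEUES 8

struct demo_mbuf;	/* opaque packet buffer, never dereferenced here */

typedef uint16_t (*demo_rx_burst_t)(void *rxq, struct demo_mbuf **rx_pkts,
		uint16_t nb_pkts, void *cbs);

/* Per-port fast-path table: function pointer plus per-queue arrays. */
struct demo_burst_api {
	demo_rx_burst_t rx_pkt_burst;
	struct {
		void **queue_data;	/* one driver queue pointer per queue */
		void **cbs;		/* one opaque callback list per queue */
	} rx_data;
};

static struct demo_burst_api demo_burst_apis[DEMO_MAX_PORTS];
static void *demo_queues[DEMO_MAX_QUEUES];
static void *demo_cbs[DEMO_MAX_QUEUES];

/* Single table lookup, then the driver gets its queue and callbacks. */
static inline uint16_t
demo_rx_burst(uint16_t port_id, uint16_t queue_id,
	      struct demo_mbuf **rx_pkts, uint16_t nb_pkts)
{
	const struct demo_burst_api *p;

	if (port_id >= DEMO_MAX_PORTS || queue_id >= DEMO_MAX_QUEUES)
		return 0;

	p = &demo_burst_apis[port_id];
	return p->rx_pkt_burst(p->rx_data.queue_data[queue_id], rx_pkts,
			nb_pkts, p->rx_data.cbs[queue_id]);
}

/* Dummy driver queue: only tracks how many packets are pending. */
struct demo_rxq {
	uint16_t avail;
};

static uint16_t
demo_pmd_rx(void *rxq, struct demo_mbuf **rx_pkts, uint16_t nb_pkts,
	    void *cbs)
{
	struct demo_rxq *q = rxq;
	uint16_t n = q->avail < nb_pkts ? q->avail : nb_pkts;

	(void)rx_pkts;	/* no real packets in this model */
	(void)cbs;	/* callbacks would be run by the driver or a wrapper */
	q->avail -= n;
	return n;
}

/* Wire queue 0 of a port to the dummy driver. */
static void
demo_port_setup(uint16_t port_id, struct demo_rxq *q0)
{
	assert(port_id < DEMO_MAX_PORTS && q0 != NULL);
	demo_queues[0] = q0;
	demo_burst_apis[port_id].rx_pkt_burst = demo_pmd_rx;
	demo_burst_apis[port_id].rx_data.queue_data = demo_queues;
	demo_burst_apis[port_id].rx_data.cbs = demo_cbs;
}
```

Note that the queue arrays are sized for the maximum queue count, which
mirrors the drawback mentioned in the proposal: the bounds check is against
the compile-time maximum, so the arrays must always cover it.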

> In theory that should avoid extra dereference problem and even reduce 
> indirection.
> As a drawback data->rxq/txq should always be allocated for 
> RTE_MAX_QUEUES_PER_PORT entries,
> but I presume that’s not a big deal.
>
> As a side question - is there any reason why rte_ethdev_trace_rx_burst() is 
> invoked at very last point,
> while rte_ethdev_trace_tx_burst()  after CBs but before actual tx_pkt_burst()?
> It would make things simpler if tracing would always be done either on 
> entrance or exit of rx/tx_burst.

exit is fine.

>
> >
> > My suggestion to fix the problem by:
> > Removing the additional `data` redirection and pull callback function
> > pointers back
> > and keep rest as opaque as done in the existing patch like [1]
> >
> > I don't believe this has any real implication on future ABI stability
> > as we will not be adding
> > any new item in rte_eth_fp in any way as new features can be added in 
> > slowpath
> > rte_eth_dev as mentioned in the patch.

Ack

I will

Re: [dpdk-dev] [PATCH] ethdev: promote sibling iterators to stable

2021-09-15 Thread Ferruh Yigit
On 9/8/2021 9:54 AM, Kinsella, Ray wrote:
> 
> 
> On 06/09/2021 15:19, Andrew Rybchenko wrote:
>> On 9/6/21 4:02 PM, David Marchand wrote:
>>> This API saw no update since its introduction and will help applications
>>> like OVS ([1] and [2]) that currently look at rte_eth_devices[] to
>>> achieve the same.
>>>
>>> 1: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1285
>>> 2: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1476
>>>
>>> Signed-off-by: David Marchand 
>>
>> Acked-by: Andrew Rybchenko 
>>
> Acked-by: Ray Kinsella 
> 

Applied to dpdk-next-net/main, thanks.


[dpdk-dev] [PATCH] doc/contributing: add line continuation note to meson guide

2021-09-15 Thread Bruce Richardson
Add a note for the preference of using "()" rather than "\" for line
continuations in meson.

Suggested-by: David Marchand 
Signed-off-by: Bruce Richardson 
---
 doc/guides/contributing/coding_style.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/contributing/coding_style.rst 
b/doc/guides/contributing/coding_style.rst
index d648689f10..b27b5fcfdb 100644
--- a/doc/guides/contributing/coding_style.rst
+++ b/doc/guides/contributing/coding_style.rst
@@ -1024,6 +1024,16 @@ The following guidelines apply to the build system code 
in meson.build files in
 * Line continuations should be doubly-indented to ensure visible difference 
from normal indentation.
   Any line continuations beyond the first may be singly indented to avoid 
large amounts of indentation.
 
+* Where a line is split in the middle of a statement, e.g. a multiline `if` statement,
+  brackets should be used in preference to escaping the line break.
+
+Example::
+
+    if (condition1 and condition2            # line breaks inside () need no escaping
+        and condition3 and condition4)
+        x = y
+    endif
+
 * Lists of files or components must be alphabetical unless doing so would 
cause errors.
 
 * Two formats are supported for lists of files or list of components:
-- 
2.30.2



[dpdk-dev] [PATCH] eal: add telemetry callbacks for memory info

2021-09-15 Thread Harman Kalra
Registering new telemetry callbacks to dump named (memzones)
and unnamed (malloc) memory information to a file provided as
an argument.

Example:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.08.0", "pid": 34075, "max_output_len": 16384}
Connected to application: "dpdk-testpmd"
--> /eal/malloc_dump,/tmp/malloc_dump
{"/eal/malloc_dump": {"Malloc elements file: ": "/tmp/malloc_dump"}}
-->
--> /eal/malloc_info,/tmp/info
{"/eal/malloc_info": {"Malloc stats file: ": "/tmp/info"}}
-->
-->
--> /eal/memzone_dump,/tmp/memzone_info
{"/eal/memzone_dump": {"Memzones count: ": 11, \
"Memzones info file: ": "/tmp/memzone_info"}}

Signed-off-by: Harman Kalra 
---
 lib/eal/common/eal_common_memory.c | 78 ++
 1 file changed, 78 insertions(+)

diff --git a/lib/eal/common/eal_common_memory.c 
b/lib/eal/common/eal_common_memory.c
index f83b75092e..592b3453b6 100644
--- a/lib/eal/common/eal_common_memory.c
+++ b/lib/eal/common/eal_common_memory.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "eal_memalloc.h"
 #include "eal_private.h"
@@ -38,6 +40,7 @@
 
 #define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
 
+static int count;
 static void *next_baseaddr;
 static uint64_t system_page_sz;
 
@@ -1102,3 +1105,78 @@ rte_eal_memory_init(void)
rte_mcfg_mem_read_unlock();
return -1;
 }
+
+#define EAL_MEMZONE_DUMP_REQ   "/eal/memzone_dump"
+#define EAL_MALLOC_INFO_REQ"/eal/malloc_info"
+#define EAL_MALLOC_DUMP_REQ"/eal/malloc_dump"
+
+static void
+memzone_walk_clb(const struct rte_memzone *mz __rte_unused,
+void *arg __rte_unused)
+{
+   count++;
+}
+
+/* Callback handler for telemetry library to dump named and unnamed memory
+ * information.
+ */
+static int
+handle_eal_mem_info_request(const char *cmd, const char *params,
+ struct rte_tel_data *d)
+{
+   char filename[PATH_MAX];
+   FILE *fp;
+
+   if (params == NULL || strlen(params) == 0)
+   return -1;
+
+   rte_strscpy(filename, params, PATH_MAX);
+
+   fp = fopen(filename, "w+");
+   if (fp == NULL) {
+   RTE_LOG(ERR, EAL, "cannot open %s", filename);
+   return -1;
+   }
+
+   rte_tel_data_start_dict(d);
+   /* Dumping memzone info. */
+   if (strcmp(cmd, EAL_MEMZONE_DUMP_REQ) == 0) {
+   count = 0;
+   /* Callback to count memzones */
+   rte_memzone_walk(memzone_walk_clb, NULL);
+   rte_tel_data_add_dict_int(d, "Memzones count: ", count);
+   rte_tel_data_add_dict_string(d, "Memzones info file: ",
+filename);
+   rte_memzone_dump(fp);
+   }
+
+   /* Dumping malloc statistics */
+   if (strcmp(cmd, EAL_MALLOC_INFO_REQ) == 0) {
+   rte_tel_data_add_dict_string(d, "Malloc stats file: ",
+filename);
+   rte_malloc_dump_stats(fp, NULL);
+   }
+
+   /* Dumping malloc elements info */
+   if (strcmp(cmd, EAL_MALLOC_DUMP_REQ) == 0) {
+   rte_tel_data_add_dict_string(d, "Malloc elements file: ",
+filename);
+   rte_malloc_dump_heaps(fp);
+   }
+
+   fclose(fp);
+   return 0;
+}
+
+RTE_INIT(memory_telemetry)
+{
+   rte_telemetry_register_cmd(
+   EAL_MEMZONE_DUMP_REQ, handle_eal_mem_info_request,
+   "Dumps memzones info to file. Parameters: file name");
+   rte_telemetry_register_cmd(
+   EAL_MALLOC_INFO_REQ, handle_eal_mem_info_request,
+   "Dumps malloc info to file. Parameters: file name");
+   rte_telemetry_register_cmd(
+   EAL_MALLOC_DUMP_REQ, handle_eal_mem_info_request,
+   "Dumps malloc elems to file. Parameters: file name");
+}
-- 
2.18.0



Re: [dpdk-dev] [PATCH 1/3] usertools/dpdk-telemetry: fix flake8 errors

2021-09-15 Thread Kevin Laatz

On 13/09/2021 11:51, Bruce Richardson wrote:

Fix style errors reported by flake8.

Fixes: 6a2967c112a3 ("usertools: add new telemetry script")
Fixes: 2d9a697e41ca ("usertools: add file-prefix option for telemetry")
Cc: sta...@dpdk.org

Signed-off-by: Bruce Richardson 
---
  usertools/dpdk-telemetry.py | 9 -
  1 file changed, 4 insertions(+), 5 deletions(-)



Acked-by: Kevin Laatz 



Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing

2021-09-15 Thread Maxime Coquelin
Hi Kevin,

On 9/8/21 12:30 PM, Kevin Laatz wrote:
> Add the basic device probing for DSA devices bound to the IDXD kernel
> driver. These devices can be configured via sysfs and made available to
> DPDK if they are found during bus scan. Relevant documentation is included.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Kevin Laatz 
> ---
>  doc/guides/dmadevs/idxd.rst  |  64 +++
>  drivers/dma/idxd/idxd_bus.c  | 352 +++
>  drivers/dma/idxd/meson.build |   1 +
>  3 files changed, 417 insertions(+)
>  create mode 100644 drivers/dma/idxd/idxd_bus.c
> 

Sorry if it has been asked before, but what is the reason this DSA bus
driver is not in drivers/bus/?



Re: [dpdk-dev] [PATCH v4] app/testpmd: add option to display extended statistics

2021-09-15 Thread Ivan Ilchenko



On 9/2/21 7:08 PM, Ferruh Yigit wrote:

On 8/20/2021 2:55 PM, Andrew Rybchenko wrote:

From: Ivan Ilchenko 

Add a 'display-xstats' option to be used together with Rx/Tx statistics
(i.e. the 'stats-period' option or the 'show port stats' interactive command) to
display a specified list of extended statistics.


Overall +1 to the feature.

Just a reminder that same thing can be done via telemetry, custom xstat values
can be retrieved from any dpdk application (including testpmd) by a telemetry
client. cc'ed Ciara if more detail is required.


Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
---
v4:
 - split from patch series
 - move xstats information to rte_port structure to avoid
   extra global structure

  app/test-pmd/cmdline.c|  55 +
  app/test-pmd/config.c |  66 
  app/test-pmd/parameters.c |  14 
  app/test-pmd/testpmd.c| 110 ++
  app/test-pmd/testpmd.h|  23 ++
  doc/guides/testpmd_app_ug/run_app.rst |   5 ++
  6 files changed, 273 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 82253bc751..cd538ace30 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -3621,6 +3621,61 @@ cmdline_parse_inst_t cmd_stop = {
  
  /* *** SET CORELIST and PORTLIST CONFIGURATION *** */
  

Inserting the below function between the above comment and its function makes
the comment wrong.
This is in file 'cmdline.c', which has command line functions and static
functions that are needed for the command.
Below function is to parse the application parameter, and called by a function
in 'parameters.c'. Since that is the only consumer of this function, why not
move this function to 'parameters.c' file and make it static?

It will be done in the next version.

+int
+parse_xstats_list(const char *in_str, struct rte_eth_xstat_name **xstats,
+ unsigned int *xstats_num)
+{
+   int max_names_nb, names_nb;
+   int stringlen;
+   char **names;
+   char *str;
+   int ret;
+   int i;
+
+   names = NULL;
+   str = strdup(in_str);

'in_str' is an user input, it is coming from 'lgopts()', can you please double
check if it is guaranteed that it will be null terminated?

optarg points to argv that's guaranteed to be null terminated.



+   if (str == NULL) {
+   ret = ENOMEM;

Please return negative error values.
Only net/sfc has positive error values, since that is the syntax in its base
code and we let to keep the syntax, but for rest of the DPDK please use negative
error values.


+   goto out;
+   }
+   stringlen = strlen(str);
+
+   for (i = 0, max_names_nb = 1; str[i] != '\0'; i++) {
+   if (str[i] == ',')
+   max_names_nb++;
+   }
+
+   names = calloc(max_names_nb, sizeof(*names));
+   if (names == NULL) {
+   ret = ENOMEM;
+   goto out;
+   }
+
+   names_nb = rte_strsplit(str, stringlen, names, max_names_nb, ',');
+   if (names_nb < 0) {
+   ret = EINVAL;
+   goto out;
+   }

Should we check the length of each of the 'names' to prevent unnecessary allocation in
case the user provides something like --display-xstats ','?
I think this check can be done during copy below.

It's good to have, will be done in the next version.
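
As an illustration of the two review points above (negative error values and
rejecting empty names during the copy), here is a minimal standalone sketch.
It does not use rte_strsplit() or the testpmd structures; the function name,
the fixed name size and the choice of error codes are assumptions made only
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_SIZE 64

/*
 * Copy comma-separated names from 'in' into 'names'.
 * Returns the number of names copied, or a negative errno value:
 * -EINVAL for an empty or over-long name, -ENOSPC if there are more
 * than 'max_names' entries.
 */
static int
demo_parse_name_list(const char *in, char names[][DEMO_NAME_SIZE],
		     int max_names)
{
	const char *start = in;
	int count = 0;

	assert(in != NULL && names != NULL);
	for (;;) {
		const char *end = strchr(start, ',');
		size_t len = (end != NULL) ? (size_t)(end - start) :
					     strlen(start);

		/* Catches "", ",", "a,,b" and a trailing comma. */
		if (len == 0 || len >= DEMO_NAME_SIZE)
			return -EINVAL;
		if (count >= max_names)
			return -ENOSPC;

		memcpy(names[count], start, len);
		names[count][len] = '\0';
		count++;

		if (end == NULL)
			return count;
		start = end + 1;
	}
}
```

Because the empty-name check happens before anything is stored, an input such
as ',' fails early without any per-name allocation.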



+
+   *xstats = calloc(names_nb, sizeof(**xstats));
+   if (*xstats == NULL) {
+   ret = ENOMEM;
+   goto out;
+   }
+

When and how 'xstats' ('xstats_display') is freed?

I'll add free in the next version.



+   for (i = 0; i < names_nb; i++)
+   rte_strscpy((*xstats)[i].name, names[i],
+   sizeof((*xstats)[i].name));
+
+   *xstats_num = names_nb;
+   ret = 0;
+
+out:
+   free(names);
+   free(str);
+   return ret;
+}
+
  unsigned int
  parse_item_list(const char *str, const char *item_name, unsigned int 
max_items,
unsigned int *parsed_items, int check_unique_values)
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 31d8ba1b91..ea5b59f54f 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -173,6 +173,70 @@ print_ethaddr(const char *name, struct rte_ether_addr 
*eth_addr)
printf("%s%s", name, buf);
  }
  
+static void

+nic_xstats_display_periodic(portid_t port_id)
+{
+   struct xstat_display_info *xstats_info;
+   uint64_t *prev_values, *curr_values;
+   uint64_t diff_value, value_rate;
+   uint64_t *ids, *ids_supp;
+   struct timespec cur_time;
+   unsigned int i, i_supp;
+   size_t ids_supp_sz;
+   uint64_t diff_ns;
+   int rc;
+
+   xstats_info = &ports[port_id].xstats_info;
+
+   ids_supp_sz = xstats_info->ids_supp_sz;
+   if (xstats_display_num == 0 || ids_supp_sz == 0)
+   return;
+
+   printf("\n");
+
+   

Re: [dpdk-dev] [PATCH v2] net/virtio: wait device ready during reset

2021-09-15 Thread Andrew Rybchenko
On 9/15/21 1:12 PM, Xueming Li wrote:
> According to virtio spec, the device MUST reset when 0 is written to
> device_status, and present 0 in device_status once reset is done.
> 
> This patch waits status value to be 0 during reset operation, if
> timeout in 3 seconds, log and continue.
> 
> Signed-off-by: Xueming Li 
> Cc: Andrew Rybchenko 

Acked-by: Andrew Rybchenko 


[dpdk-dev] OpenSSL 3.0 released, deprecated functions - DPDK crypto compilation issues

2021-09-15 Thread Kusztal, ArkadiuszX
Hi,

OpenSSL 3.0 is now released, and since some PMDs in DPDK depend on libcrypto 
functions that are now deprecated, anyone who installs 3.0 may see compilation 
problems. If the adaptation cannot be made in time, we suggest putting a version 
constraint in the meson dependencies, "version : '<3.0.0'". It probably will not 
work with OpenSSL installed from source though (at least it did not work for me).

Regards,
Arek




Re: [dpdk-dev] [PATCH v4] net: fix Intel-specific Prepare the outer IPv4 hdr for checksum

2021-09-15 Thread Ferruh Yigit
On 9/7/2021 11:49 AM, Mohsin Kazmi wrote:
> Preparation of the headers for the hardware offload
> misses the outer IPv4 checksum offload.
> It results in bad checksum computed by hardware NIC.
> 
> This patch fixes the issue by setting the outer IPv4
> checksum field to 0.
> 
> Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Mohsin Kazmi 
> Acked-by: Qi Zhang 
> Acked-by: Olivier Matz 

<...>

> diff --git a/lib/net/rte_net.h b/lib/net/rte_net.h
> index 434435ffa2..42639bc154 100644
> --- a/lib/net/rte_net.h
> +++ b/lib/net/rte_net.h
> @@ -125,11 +125,22 @@ rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, 
> uint64_t ol_flags)

Not directly related with this patch, but is the function
'rte_net_intel_cksum_flags_prepare()' really Intel specific?

I can see sfc & ena are also using this function, should we rename it?


[dpdk-dev] [PATCH] net/sfc: fix getting accumulative SW xstat

2021-09-15 Thread Andrew Rybchenko
From: Ivan Ilchenko 

Add missing initialisation of the accumulative SW xstat to
zero since it is sum of per-queue xstats.

Fixes: fdd7719eb3c ("net/sfc: add xstats for Rx/Tx doorbells")
Cc: sta...@dpdk.org

Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/sfc/sfc_sw_stats.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/sfc/sfc_sw_stats.c b/drivers/net/sfc/sfc_sw_stats.c
index 8489b603f5..2b28ba29e6 100644
--- a/drivers/net/sfc/sfc_sw_stats.c
+++ b/drivers/net/sfc/sfc_sw_stats.c
@@ -313,6 +313,7 @@ sfc_sw_xstat_get_values_by_id(struct sfc_adapter *sa,
}
 
if (count_accum_value) {
+   values[accum_value_idx] = 0;
for (qid = 0; qid < nb_queues; ++qid) {
if (rte_bitmap_get(bmp, qid) != 0)
continue;
-- 
2.30.2



Re: [dpdk-dev] [PATCH 0/3] improvements for telemetry script

2021-09-15 Thread Power, Ciara
Hi Bruce,

>-Original Message-
>From: Richardson, Bruce 
>Sent: Monday 13 September 2021 11:52
>To: dev@dpdk.org
>Cc: Power, Ciara ; Hunt, David
>; Richardson, Bruce 
>Subject: [PATCH 0/3] improvements for telemetry script
>
>Patch 1 fixes errors reported by flake8 in the telemetry python script.
>Inspired by the work by Dave Hunt [1] the final two patches look to adjust the
>script so that it works nicer when commands come from an input pipe rather
>than from an interactive terminal.
>
>Without this set:
>  $ echo "/eal/params" | ./usertools/dpdk-telemetry.py
>  Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
>  {"version": "DPDK 21.11.0-rc0", "pid": 130033, "max_output_len": 16384}
>  Connected to application: "dpdk-test"
>  --> {"/eal/params": ["./build/app/test/dpdk-test", "-c", "F", "--no-huge"]}
>  --> Traceback (most recent call last):
>File "/home/bruce/dpdk.org/./usertools/dpdk-telemetry.py", line 109, in
>
>  handle_socket(os.path.join(rdir,
>'dpdk_telemetry.{}'.format(TELEMETRY_VERSION)))
>File "/home/bruce/dpdk.org/./usertools/dpdk-telemetry.py", line 78, in
>handle_socket
>  text = input('--> ').strip()
>  EOFError: EOF when reading a line
>
>With this patchset:
>  $ echo "/eal/params" | ./usertools/dpdk-telemetry.py
>  {"/eal/params": ["./build/app/test/dpdk-test", "-c", "F", "--no-huge"]}
>
>
>[1] http://patches.dpdk.org/project/dpdk/patch/20210909155625.24581-1-
>david.h...@intel.com/
>
>Bruce Richardson (3):
>  usertools/dpdk-telemetry: fix flake8 errors
>  usertools/dpdk_telemetry: fix handling EOF for input pipe
>  usertools/dpdk-telemetry: silence prompts for input pipes
>
> usertools/dpdk-telemetry.py | 38 ++---
> 1 file changed, 23 insertions(+), 15 deletions(-)
>
>--
>2.30.2

For the series,
Acked-by: Ciara Power 

Thanks!



Re: [dpdk-dev] [PATCH v1] usertools/telemetry: add non-interactive mode

2021-09-15 Thread Power, Ciara
Hi Dave,

>-Original Message-
>From: Richardson, Bruce 
>Sent: Monday 13 September 2021 11:54
>To: Hunt, David 
>Cc: dev@dpdk.org; Power, Ciara 
>Subject: Re: [dpdk-dev] [PATCH v1] usertools/telemetry: add non-interactive
>mode
>
>On Mon, Sep 13, 2021 at 11:43:25AM +0100, Bruce Richardson wrote:
>> On Thu, Sep 09, 2021 at 04:56:25PM +0100, David Hunt wrote:
>> > Add non-interactive mode to dpdk-telemetry.py so that a query string
>> > can be supplied on the command line, and script dumps out data and
>> > exits. Handy for calling from scripts.
>> >
>> > Signed-off-by: David Hunt 
>> > ---
>> Hi Dave,
>>
>> I'm not sure I like the use of "-q" for adding a query mode - it's
>> more a shortcut parameter for a "quiet" mode. If I may, I'd suggest an
>> alternative approach here might be to improve support for piping the
>> input commands to the script instead so that you can do e.g.
>>
>> "echo /ethdev/stats,0 | dpdk-telemetry.py"
>>
>> and have that work well in a script.
>>
>> I'll do up a patchset for improving that and upstream it for feedback.
>>
>
>Now at: http://patches.dpdk.org/project/dpdk/list/?series=18867
>
>/Bruce

Thanks for this, although I do think the improvement to allow for piping is a 
better solution.
I have acked Bruce's patchset, hopefully that will be suitable for your use 
case.

Thanks,
Ciara


Re: [dpdk-dev] OpenSSL 3.0 released, deprecated functions - DPDK crypto compilation issues

2021-09-15 Thread Akhil Goyal
Hi Arek,

Can you submit changes for the meson files needed to compile it with 3.0 if 
needed?

Regards,
Akhil

From: Kusztal, ArkadiuszX 
Sent: Wednesday, September 15, 2021 4:02 PM
To: dev@dpdk.org
Cc: Akhil Goyal ; asoma...@amd.com; Zhang, Roy Fan 
; Doherty, Declan 
Subject: [EXT] OpenSSL 3.0 released, deprecated functions - DPDK crypto 
compilation issues

External Email

Hi,

OpenSSL 3.0 is now released, and since some PMDs in DPDK depend on libcrypto 
functions that are now deprecated, anyone who installs 3.0 may see compilation 
problems. If the adaptation cannot be made in time, we suggest putting a version 
constraint in the meson dependencies, "version : '<3.0.0'". It probably will not 
work with OpenSSL installed from source though (at least it did not work for me).

Regards,
Arek




Re: [dpdk-dev] [PATCH v8 03/12] bpf: allow self-xor operation

2021-09-15 Thread Ananyev, Konstantin



> When doing BPF filter program conversion, a common way
> to zero a register in single instruction is:
>  xor r7,r7
> The BPF validator would not allow this because the value of
> r7 was undefined. But after this operation it is always zero.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/bpf/bpf_validate.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/bpf/bpf_validate.c b/lib/bpf/bpf_validate.c
> index 7b1291b382e9..7647a7454dc2 100644
> --- a/lib/bpf/bpf_validate.c
> +++ b/lib/bpf/bpf_validate.c
> @@ -661,8 +661,12 @@ eval_alu(struct bpf_verifier *bvf, const struct 
> ebpf_insn *ins)
> 
>   op = BPF_OP(ins->code);
> 
> - err = eval_defined((op != EBPF_MOV) ? rd : NULL,
> - (op != BPF_NEG) ? &rs : NULL);
> + /* Allow self-xor as way to zero register */
> + if (op == BPF_XOR && ins->src_reg == ins->dst_reg)
> + err = NULL;
> + else
> + err = eval_defined((op != EBPF_MOV) ? rd : NULL,
> +(op != BPF_NEG) ? &rs : NULL);

Two things:
- We probably need to check that this is an instruction with a source register
  (not an imm value).
- The rd value is not evaluated to zero, while it probably should be
  (that will help the evaluator to better predict further values).

So might be better to do something like:

/* Allow self-xor as way to zero register */
if (op == BPF_XOR && BPF_SRC(ins->code) == BPF_X &&
ins->src_reg == ins->dst_reg) {
eval_fill_imm(&rs, UINT64_MAX, 0);
eval_fill_imm(rd, UINT64_MAX, 0);
}

err = eval_defined((op != EBPF_MOV) ? rd : NULL,
   (op != BPF_NEG) ? &rs : NULL);
if (err != NULL)
return err;

...

Another thing - shouldn't that patch be treated like a fix (cc to stable, etc.)?
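
To illustrate the first point (why the source-operand check matters), here is
a small standalone sketch of the self-xor test. The DEMO_* macros mirror the
usual classic BPF opcode field encodings, and struct demo_insn is a simplified
stand-in assumed for this example, not the library's instruction type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode field encodings, mirroring the usual BPF_* macros. */
#define DEMO_BPF_OP(code)	((code) & 0xf0)
#define DEMO_BPF_SRC(code)	((code) & 0x08)
#define DEMO_BPF_ALU		0x04	/* instruction class */
#define DEMO_BPF_XOR		0xa0	/* ALU operation */
#define DEMO_BPF_K		0x00	/* source operand is the immediate */
#define DEMO_BPF_X		0x08	/* source operand is a register */

/* Simplified instruction layout, for illustration only. */
struct demo_insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int32_t imm;
};

/*
 * True only for "xor rN, rN". Without the DEMO_BPF_X test, an
 * immediate form such as "xor r0, 5" would also match, because its
 * unused src_reg field is 0 and equals dst_reg.
 */
static bool
demo_is_self_xor(const struct demo_insn *ins)
{
	assert(ins != NULL);
	return DEMO_BPF_OP(ins->code) == DEMO_BPF_XOR &&
	       DEMO_BPF_SRC(ins->code) == DEMO_BPF_X &&
	       ins->src_reg == ins->dst_reg;
}
```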

>   if (err != NULL)
>   return err;
> 
> --
> 2.30.2



Re: [dpdk-dev] [PATCH v1] usertools/telemetry: add non-interactive mode

2021-09-15 Thread David Hunt



On 15/9/2021 11:49 AM, Power, Ciara wrote:

Hi Dave,


-Original Message-
From: Richardson, Bruce 
Sent: Monday 13 September 2021 11:54
To: Hunt, David 
Cc: dev@dpdk.org; Power, Ciara 
Subject: Re: [dpdk-dev] [PATCH v1] usertools/telemetry: add non-interactive
mode

On Mon, Sep 13, 2021 at 11:43:25AM +0100, Bruce Richardson wrote:

On Thu, Sep 09, 2021 at 04:56:25PM +0100, David Hunt wrote:

Add non-interactive mode to dpdk-telemetry.py so that a query string
can be supplied on the command line, and script dumps out data and
exits. Handy for calling from scripts.

Signed-off-by: David Hunt 
---

Hi Dave,

I'm not sure I like the use of "-q" for adding a query mode - it's
more a shortcut parameter for a "quiet" mode. If I may, I'd suggest an
alternative approach here might be to improve support for piping the
input commands to the script instead so that you can do e.g.

"echo /ethdev/stats,0 | dpdk-telemetry.py"

and have that work well in a script.

I'll do up a patchset for improving that and upstream it for feedback.


Now at: http://patches.dpdk.org/project/dpdk/list/?series=18867

/Bruce

Thanks for this, although I do think the improvement to allow for piping is a 
better solution.
I have acked Bruce's patchset, hopefully that will be suitable for your use 
case.

Thanks,
Ciara



+1

I'll mark my patchset as superseded.

Rgds,
Dave.




Re: [dpdk-dev] [PATCH v8 04/12] bpf: add function to convert classic BPF to DPDK BPF

2021-09-15 Thread Ananyev, Konstantin



> 
> The pcap library emits classic BPF (32 bit) and is useful for
> creating filter programs.  The DPDK BPF library only implements
> extended BPF (eBPF).  Add a function to convert from old to
> new.
> 
> The rte_bpf_convert function uses rte_malloc to put the resulting
> program in hugepage shared memory so it can be passed from a
> secondary process to a primary process.
> 
> The code to convert was originally done as part of the Linux
> kernel implementation then converted to a userspace program.
> Both authors have agreed that it is allowable to license this
> as BSD licensed in DPDK.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/bpf/bpf_convert.c | 570 ++
>  lib/bpf/meson.build   |   5 +
>  lib/bpf/rte_bpf.h |  25 ++
>  lib/bpf/version.map   |   6 +
>  4 files changed, 606 insertions(+)
>  create mode 100644 lib/bpf/bpf_convert.c
> 
> diff --git a/lib/bpf/meson.build b/lib/bpf/meson.build
> index 63cbd60185e0..54f7610ae990 100644
> --- a/lib/bpf/meson.build
> +++ b/lib/bpf/meson.build
> @@ -25,3 +25,8 @@ if dep.found()
>  sources += files('bpf_load_elf.c')
>  ext_deps += dep
>  endif
> +
> +if dpdk_conf.has('RTE_PORT_PCAP')

Do we really need that 'if' above?
Why not to always have it enabled?

> +sources += files('bpf_convert.c')
> +ext_deps += pcap_dep
> +endif
> diff --git a/lib/bpf/rte_bpf.h b/lib/bpf/rte_bpf.h
> index 69116f36ba8b..2f23e272a376 100644
> --- a/lib/bpf/rte_bpf.h
> +++ b/lib/bpf/rte_bpf.h
> @@ -198,6 +198,31 @@ rte_bpf_exec_burst(const struct rte_bpf *bpf, void *ctx[], uint64_t rc[],
>  int
>  rte_bpf_get_jit(const struct rte_bpf *bpf, struct rte_bpf_jit *jit);
> 
> +#ifdef RTE_PORT_PCAP
> +
> +struct bpf_program;
> +
> +/**
> + * Convert a Classic BPF program from libpcap into a DPDK BPF code.
> + *
> + * @param prog
> + *  Classic BPF program from pcap_compile().
> + * @param prm
> + *  Result Extended BPF program.
> + * @return
> + *   Pointer to BPF program (allocated with *rte_malloc*)
> + *   that is used in future BPF operations,
> + *   or NULL on error, with error code set in rte_errno.
> + *   Possible rte_errno errors include:
> + *   - EINVAL - invalid parameter passed to function
> + *   - ENOMEM - can't reserve enough memory
> + */
> +__rte_experimental
> +struct rte_bpf_prm *
> +rte_bpf_convert(const struct bpf_program *prog);
> +
> +#endif
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/bpf/version.map b/lib/bpf/version.map
> index 0bf35f487666..47082d5003ef 100644
> --- a/lib/bpf/version.map
> +++ b/lib/bpf/version.map
> @@ -14,3 +14,9 @@ DPDK_22 {
> 
>   local: *;
>  };
> +
> +EXPERIMENTAL {
> + global:
> +
> + rte_bpf_convert;
> +};
> --

Cool feature, thanks for contributing.
Acked-by: Konstantin Ananyev 

> 2.30.2



Re: [dpdk-dev] [PATCH v4] net: fix Intel-specific Prepare the outer IPv4 hdr for checksum

2021-09-15 Thread Ferruh Yigit
On 9/7/2021 11:49 AM, Mohsin Kazmi wrote:
> Preparation of the headers for the hardware offload
> misses the outer IPv4 checksum offload.
> It results in bad checksum computed by hardware NIC.
> 
> This patch fixes the issue by setting the outer IPv4
> checksum field to 0.
> 
> Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Mohsin Kazmi 
> Acked-by: Qi Zhang 
> Acked-by: Olivier Matz 

Applied to dpdk-next-net/main, thanks.

Patch title updated as following while merging:
net: fix checksum offload for outer IPv4



Re: [dpdk-dev] [PATCH v8 05/12] bpf: add function to dump eBPF instructions

2021-09-15 Thread Ananyev, Konstantin



> When debugging converted (and other) programs it is useful
> to see disassembled eBPF output.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/bpf/bpf_convert.c |   7 ++-
>  lib/bpf/bpf_dump.c| 118 ++
>  lib/bpf/meson.build   |   1 +
>  lib/bpf/rte_bpf.h |  14 +
>  lib/bpf/version.map   |   1 +
>  5 files changed, 140 insertions(+), 1 deletion(-)
>  create mode 100644 lib/bpf/bpf_dump.c
> 
> diff --git a/lib/bpf/bpf_convert.c b/lib/bpf/bpf_convert.c
> index a46ffeb067dd..db84add7dcce 100644
> --- a/lib/bpf/bpf_convert.c
> +++ b/lib/bpf/bpf_convert.c
> @@ -331,7 +331,12 @@ static int bpf_convert_filter(const struct bpf_insn *prog, size_t len,
>   case BPF_LD | BPF_IND | BPF_H:
>   case BPF_LD | BPF_IND | BPF_B:
>   /* All arithmetic insns map as-is. */
> - *insn = BPF_RAW_INSN(fp->code, BPF_REG_A, BPF_REG_X, 0, fp->k);
> + insn->code = fp->code;
> + insn->dst_reg = BPF_REG_A;
> + bpf_src = BPF_SRC(fp->code);
> + insn->src_reg = bpf_src == BPF_X ? BPF_REG_X : 0;
> + insn->off = 0;
> + insn->imm = fp->k;
>   break;

Should this change be part of this patch?
It looks like it belongs to the previous one, no?

> 
>   /* Jump transformation cannot use BPF block macros
> diff --git a/lib/bpf/bpf_dump.c b/lib/bpf/bpf_dump.c
> new file mode 100644
> index ..a6a431e64903
> --- /dev/null
> +++ b/lib/bpf/bpf_dump.c
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 Stephen Hemminger
> + * Based on filter2xdp
> + * Copyright (C) 2017 Tobias Klauser
> + */
> +
> +#include 
> +#include 
> +
> +#include "rte_bpf.h"
> +
> +#define BPF_OP_INDEX(x) (BPF_OP(x) >> 4)
> +#define BPF_SIZE_INDEX(x) (BPF_SIZE(x) >> 3)
> +
> +static const char *const class_tbl[] = {
> + [BPF_LD] = "ld",   [BPF_LDX] = "ldx",[BPF_ST] = "st",
> + [BPF_STX] = "stx", [BPF_ALU] = "alu",[BPF_JMP] = "jmp",
> + [BPF_RET] = "ret", [BPF_MISC] = "alu64",
> +};
> +
> +static const char *const alu_op_tbl[16] = {
> + [BPF_ADD >> 4] = "add",[BPF_SUB >> 4] = "sub",
> + [BPF_MUL >> 4] = "mul",[BPF_DIV >> 4] = "div",
> + [BPF_OR >> 4] = "or",  [BPF_AND >> 4] = "and",
> + [BPF_LSH >> 4] = "lsh",[BPF_RSH >> 4] = "rsh",
> + [BPF_NEG >> 4] = "neg",[BPF_MOD >> 4] = "mod",
> + [BPF_XOR >> 4] = "xor",[EBPF_MOV >> 4] = "mov",
> + [EBPF_ARSH >> 4] = "arsh", [EBPF_END >> 4] = "endian",
> +};
> +
> +static const char *const size_tbl[] = {
> + [BPF_W >> 3] = "w",
> + [BPF_H >> 3] = "h",
> + [BPF_B >> 3] = "b",
> + [EBPF_DW >> 3] = "dw",
> +};
> +
> +static const char *const jump_tbl[16] = {
> + [BPF_JA >> 4] = "ja",  [BPF_JEQ >> 4] = "jeq",
> + [BPF_JGT >> 4] = "jgt",[BPF_JGE >> 4] = "jge",
> + [BPF_JSET >> 4] = "jset",  [EBPF_JNE >> 4] = "jne",
> + [EBPF_JSGT >> 4] = "jsgt", [EBPF_JSGE >> 4] = "jsge",
> + [EBPF_CALL >> 4] = "call", [EBPF_EXIT >> 4] = "exit",
> +};
> +
> +static void ebpf_dump(FILE *f, const struct ebpf_insn insn, size_t n)
> +{
> + const char *op, *postfix = "";
> + uint8_t cls = BPF_CLASS(insn.code);
> +
> + fprintf(f, " L%zu:\t", n);
> +
> + switch (cls) {
> + default:
> + fprintf(f, "unimp 0x%x // class: %s\n", insn.code,
> + class_tbl[cls]);
> + break;
> + case BPF_ALU:
> + postfix = "32";
> + /* fall through */
> + case EBPF_ALU64:
> + op = alu_op_tbl[BPF_OP_INDEX(insn.code)];
> + if (BPF_SRC(insn.code) == BPF_X)
> + fprintf(f, "%s%s r%u, r%u\n", op, postfix, insn.dst_reg,
> + insn.src_reg);
> + else
> + fprintf(f, "%s%s r%u, #0x%x\n", op, postfix,
> + insn.dst_reg, insn.imm);
> + break;
> + case BPF_LD:
> + op = "ld";
> + postfix = size_tbl[BPF_SIZE_INDEX(insn.code)];
> + if (BPF_MODE(insn.code) == BPF_IMM)
> + fprintf(f, "%s%s r%d, #0x%x\n", op, postfix,
> + insn.dst_reg, insn.imm);
> + else if (BPF_MODE(insn.code) == BPF_ABS)
> + fprintf(f, "%s%s r%d, [%d]\n", op, postfix,
> + insn.dst_reg, insn.imm);
> + else if (BPF_MODE(insn.code) == BPF_IND)
> + fprintf(f, "%s%s r%d, [r%u + %d]\n", op, postfix,
> + insn.dst_reg, insn.src_reg, insn.imm);
> + else
> + fprintf(f, "// BUG: LD opcode 0x%02x in eBPF insns\n",
> + insn.code);
> + break;
> + case BPF_LDX:
> + op = "ldx";
> + postfix = size_tb

Re: [dpdk-dev] [PATCH v3 04/17] dma/idxd: add bus device probing

2021-09-15 Thread Bruce Richardson
On Wed, Sep 15, 2021 at 12:12:34PM +0200, Maxime Coquelin wrote:
> Hi Kevin,
> 
> On 9/8/21 12:30 PM, Kevin Laatz wrote:
> > Add the basic device probing for DSA devices bound to the IDXD kernel
> > driver. These devices can be configured via sysfs and made available to
> > DPDK if they are found during bus scan. Relevant documentation is included.
> > 
> > Signed-off-by: Bruce Richardson 
> > Signed-off-by: Kevin Laatz 
> > ---
> >  doc/guides/dmadevs/idxd.rst  |  64 +++
> >  drivers/dma/idxd/idxd_bus.c  | 352 +++
> >  drivers/dma/idxd/meson.build |   1 +
> >  3 files changed, 417 insertions(+)
> >  create mode 100644 drivers/dma/idxd/idxd_bus.c
> > 
> 
> Sorry if it has been asked before, but what is the reason this DSA bus
> driver is not in drivers/bus/?
> 

This bus-driver solution came out of discussion previously when we were
looking to add DSA support to the ioat driver. Individual device drivers of
any class cannot themselves do any probing to discover devices, but only
bus drivers can do that. Therefore, to enable discovery of DSA devices used
by the kernel driver, a custom bus driver is necessary. Since this bus
driver only supports discovery of DSA devices using a single driver type
behind the scenes (referred to in previous discussions as a "singleton" bus
driver), it's kept in with the rest of the DSA driver code. It's not a full
bus driver, more a minimal driver whose job it is to scan /dev/dsa and
initialise the proper DSA/idxd driver.

So we could move it to drivers/bus, but I think it would lead to overall
higher maintenance and I fail to see any real benefits.

/Bruce
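
As a rough illustration of the "scan /dev/dsa" step described above, here is a
minimal standalone sketch. It is not the idxd bus driver code. It assumes
(this is an assumption, not something stated in this thread) that work queue
nodes in the scanned directory are named "wq<device>.<queue>", e.g. "wq0.1".
All ex_* names are hypothetical helpers made up for this example.

```c
#include <ctype.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse a run of decimal digits at *p; advance *p past it on success. */
static int
ex_parse_uint(const char **p, unsigned int *out)
{
	char *end;
	unsigned long v;

	if (!isdigit((unsigned char)**p))
		return -1;
	errno = 0;
	v = strtoul(*p, &end, 10);
	if (errno != 0 || v > UINT_MAX)
		return -1;
	*out = (unsigned int)v;
	*p = end;
	return 0;
}

/* Accept names of the assumed form "wq<dev>.<queue>", e.g. "wq0.1". */
static int
ex_parse_wq_name(const char *name, unsigned int *dev_id, unsigned int *wq_id)
{
	if (strncmp(name, "wq", 2) != 0)
		return -1;
	name += 2;
	if (ex_parse_uint(&name, dev_id) != 0 || *name != '.')
		return -1;
	name++;
	if (ex_parse_uint(&name, wq_id) != 0 || *name != '\0')
		return -1;
	return 0;
}

/* Count entries in dir_path that look like work queues; -1 if unreadable. */
static int
ex_scan_wq_dir(const char *dir_path)
{
	DIR *dir = opendir(dir_path);
	struct dirent *ent;
	int found = 0;

	if (dir == NULL)
		return -1;
	while ((ent = readdir(dir)) != NULL) {
		unsigned int dev_id, wq_id;

		/* A real bus driver would create a device object here. */
		if (ex_parse_wq_name(ent->d_name, &dev_id, &wq_id) == 0)
			found++;
	}
	closedir(dir);
	return found;
}
```

Calling ex_scan_wq_dir("/dev/dsa") on a system without the directory simply
returns -1, which matches the idea that nothing is probed when the kernel
driver has exposed no queues.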


Re: [dpdk-dev] [PATCH v8 10/12] test: enable bpf autotest

2021-09-15 Thread Ananyev, Konstantin



> The BPF autotest is defined but not run automatically.
> Since it is short, it should be added to the autotest suite.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test/meson.build | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 0d551ac9c2b2..cd18484bb73a 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -194,6 +194,8 @@ test_deps = [
>  fast_tests = [
>  ['acl_autotest', true],
>  ['atomic_autotest', false],
> +['bpf_autotest', true],
> +['bpf_convert_autotest', true],
>  ['bitops_autotest', true],
>  ['byteorder_autotest', true],
>  ['cksum_autotest', true],
> --

Acked-by: Konstantin Ananyev 

> 2.30.2



[dpdk-dev] [PATCH v5] app/testpmd: add option to display extended statistics

2021-09-15 Thread Andrew Rybchenko
From: Ivan Ilchenko 

Add 'display-xstats' option to be used together with Rx/Tx statistics
(i.e. 'stats-period' option or 'show port stats' interactive command) to
display a specified list of extended statistics.

Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
Acked-by: Ajit Khaparde 
---
v5:
- process review notes from Ferruh

v4:
- split from patch series
- move xstats information to rte_port structure to avoid
  extra global structure

 app/test-pmd/config.c |  62 +++
 app/test-pmd/parameters.c |  81 +++
 app/test-pmd/testpmd.c| 108 ++
 app/test-pmd/testpmd.h|  17 
 doc/guides/testpmd_app_ug/run_app.rst |   6 ++
 5 files changed, 274 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index f5765b34f7..d9f0b5caba 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -173,6 +173,65 @@ print_ethaddr(const char *name, struct rte_ether_addr *eth_addr)
printf("%s%s", name, buf);
 }
 
+static void
+nic_xstats_display_periodic(portid_t port_id)
+{
+   struct xstat_display_info *xstats_info;
+   uint64_t *prev_values, *curr_values;
+   uint64_t diff_value, value_rate;
+   struct timespec cur_time;
+   uint64_t *ids_supp;
+   size_t ids_supp_sz;
+   uint64_t diff_ns;
+   unsigned int i;
+   int rc;
+
+   xstats_info = &ports[port_id].xstats_info;
+
+   ids_supp_sz = xstats_info->ids_supp_sz;
+   if (ids_supp_sz == 0)
+   return;
+
+   printf("\n");
+
+   ids_supp = xstats_info->ids_supp;
+   prev_values = xstats_info->prev_values;
+   curr_values = xstats_info->curr_values;
+
+   rc = rte_eth_xstats_get_by_id(port_id, ids_supp, curr_values,
+ ids_supp_sz);
+   if (rc != (int)ids_supp_sz) {
+   fprintf(stderr,
+   "Failed to get values of %zu xstats for port %u - return code %d\n",
+   ids_supp_sz, port_id, rc);
+   return;
+   }
+
+   diff_ns = 0;
+   if (clock_gettime(CLOCK_TYPE_ID, &cur_time) == 0) {
+   uint64_t ns;
+
+   ns = cur_time.tv_sec * NS_PER_SEC;
+   ns += cur_time.tv_nsec;
+
+   if (xstats_info->prev_ns != 0)
+   diff_ns = ns - xstats_info->prev_ns;
+   xstats_info->prev_ns = ns;
+   }
+
+   printf("%-31s%-17s%s\n", " ", "Value", "Rate (since last show)");
+   for (i = 0; i < ids_supp_sz; i++) {
+   diff_value = (curr_values[i] > prev_values[i]) ?
+(curr_values[i] - prev_values[i]) : 0;
+   prev_values[i] = curr_values[i];
+   value_rate = diff_ns > 0 ?
+   (double)diff_value / diff_ns * NS_PER_SEC : 0;
+
+   printf("  %-25s%12"PRIu64" %15"PRIu64"\n",
+  xstats_display[i].name, curr_values[i], value_rate);
+   }
+}
+
 void
 nic_stats_display(portid_t port_id)
 {
@@ -243,6 +302,9 @@ nic_stats_display(portid_t port_id)
   PRIu64"  Tx-bps: %12"PRIu64"\n", mpps_rx, mbps_rx * 8,
   mpps_tx, mbps_tx * 8);
 
+   if (xstats_display_num > 0)
+   nic_xstats_display_periodic(port_id);
+
printf("  %s%s\n",
   nic_stats_border, nic_stats_border);
 }
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 3f94a82e32..b3217d6e5c 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -61,6 +61,10 @@ usage(char* progname)
   "(only if interactive is disabled).\n");
printf("  --stats-period=PERIOD: statistics will be shown "
   "every PERIOD seconds (only if interactive is disabled).\n");
+   printf("  --display-xstats xstat_name1[,...]: comma-separated list of "
+  "extended statistics to show. Used with --stats-period "
+  "specified or interactive commands that show Rx/Tx statistics "
+  "(i.e. 'show port stats').\n");
printf("  --nb-cores=N: set the number of forwarding cores "
   "(1 <= N <= %d).\n", nb_lcores);
printf("  --nb-ports=N: set the number of forwarding ports "
@@ -473,6 +477,72 @@ parse_event_printing_config(const char *optarg, int enable)
return 0;
 }
 
+static int
+parse_xstats_list(const char *in_str, struct rte_eth_xstat_name **xstats,
+ unsigned int *xstats_num)
+{
+   int max_names_nb, names_nb, nonempty_names_nb;
+   int name, nonempty_name;
+   int stringlen;
+   char **names;
+   char *str;
+   int ret;
+   int i;
+
+   names = NULL;
+   str = strdup(in_str);
+   if (str == NULL) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   stringlen = strlen(str);
+
+   for (i = 

Re: [dpdk-dev] [PATCH v8 08/12] test: add test for bpf_convert

2021-09-15 Thread Ananyev, Konstantin



> 
> Add some functional tests for the Classic BPF to DPDK BPF converter.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test/test_bpf.c | 173 
>  1 file changed, 173 insertions(+)
> 
> diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
> index 527c06b80708..1b5a178241d8 100644
> --- a/app/test/test_bpf.c
> +++ b/app/test/test_bpf.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -3233,3 +3234,175 @@ test_bpf(void)
>  }
> 
>  REGISTER_TEST_COMMAND(bpf_autotest, test_bpf);
> +
> +#ifdef RTE_PORT_PCAP
> +#include 
> +
> +static void
> +test_bpf_dump(struct bpf_program *cbf, const struct rte_bpf_prm *prm)
> +{
> + printf("cBPF program (%u insns)\n", cbf->bf_len);
> + bpf_dump(cbf, 1);
> +
> + printf("\neBPF program (%u insns)\n", prm->nb_ins);
> + rte_bpf_dump(stdout, prm->ins, prm->nb_ins);
> +}
> +
> +static int
> +test_bpf_match(pcap_t *pcap, const char *str, bool expected)
> +{
> + struct bpf_program fcode;
> + struct rte_bpf_prm *prm = NULL;
> + struct rte_bpf *bpf = NULL;
> + uint8_t tbuf[sizeof(struct dummy_mbuf)];
> + int ret = -1;
> + uint64_t rc;
> +
> + if (pcap_compile(pcap, &fcode, str, 1, PCAP_NETMASK_UNKNOWN)) {
> + printf("%s@%d: pcap_compile(\"%s\") failed: %s;\n",
> +__func__, __LINE__,  str, pcap_geterr(pcap));
> + return -1;
> + }
> +
> + prm = rte_bpf_convert(&fcode);
> + if (prm == NULL) {
> + printf("%s@%d: bpf_convert('%s') failed,, error=%d(%s);\n",
> +__func__, __LINE__, str, rte_errno, strerror(rte_errno));
> + goto error;
> + }
> +
> + bpf = rte_bpf_load(prm);
> + if (bpf == NULL) {
> + printf("%s@%d: failed to load bpf code, error=%d(%s);\n",
> + __func__, __LINE__, rte_errno, strerror(rte_errno));
> + goto error;
> + }
> +
> + test_ld_mbuf1_prepare(tbuf);
> + rc = rte_bpf_exec(bpf, tbuf);
> + if ((rc == 0) == expected)
> + ret = 0;
> + else
> + printf("%s@%d: failed match: expect %s 0 got %"PRIu64"\n",
> +__func__, __LINE__, expected ? "==" : "<>",  rc);
> +error:
> + if (bpf)
> + rte_bpf_destroy(bpf);
> + rte_free(prm);
> + pcap_freecode(&fcode);
> + return ret;
> +}
> +
> +/* Basic sanity test can we match a IP packet */
> +static int
> +test_bpf_filter_sanity(pcap_t *pcap)
> +{
> + int ret;
> +
> + ret = test_bpf_match(pcap, "ip", true);
> + ret |= test_bpf_match(pcap, "not ip", false);
> +
> + return ret;
> +}
> +
> +/*
> + * Some sample pcap filter strings from
> + * https://wiki.wireshark.org/CaptureFilters
> + */
> +static const char * const sample_filters[] = {
> + "host 172.18.5.4",
> + "net 192.168.0.0/24",
> + "src net 192.168.0.0/24",
> + "src net 192.168.0.0 mask 255.255.255.0",
> + "dst net 192.168.0.0/24",
> + "dst net 192.168.0.0 mask 255.255.255.0",
> + "port 53",
> + "host www.example.com and not (port 80 or port 25)",
> + "host www.example.com and not port 80 and not port 25",
> + "port not 53 and not arp",
> + "(tcp[0:2] > 1500 and tcp[0:2] < 1550) or (tcp[2:2] > 1500 and tcp[2:2] < 1550)",
> + "ether proto 0x888e",
> + "ether[0] & 1 = 0 and ip[16] >= 224",
> + "icmp[icmptype] != icmp-echo and icmp[icmptype] != icmp-echoreply",
> + "tcp[tcpflags] & (tcp-syn|tcp-fin) != 0 and not src and dst net 127.0.0.1",
> + "not ether dst 01:80:c2:00:00:0e",
> + "not broadcast and not multicast",
> + "dst host ff02::1",
> + "port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420",
> + /* Worms */
> + "dst port 135 and tcp port 135 and ip[2:2]==48",
> + "icmp[icmptype]==icmp-echo and ip[2:2]==92 and icmp[8:4]==0x",
> + "dst port 135 or dst port 445 or dst port 1433"
> + " and tcp[tcpflags] & (tcp-syn) != 0"
> + " and tcp[tcpflags] & (tcp-ack) = 0 and src net 192.168.0.0/24",
> + "tcp src port 443 and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4] = 0x18)"
> + " and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 1] = 0x03)"
> + " and (tcp[((tcp[12] & 0xF0) >> 4 ) * 4 + 2] < 0x04)"
> + " and ((ip[2:2] - 4 * (ip[0] & 0x0F) - 4 * ((tcp[12] & 0xF0) >> 4) > 69))",
> + /* Other */
> + "len = 128",
> +};
> +
> +static int
> +test_bpf_filter(pcap_t *pcap, const char *s)
> +{
> + struct bpf_program fcode;
> + struct rte_bpf_prm *prm = NULL;
> + struct rte_bpf *bpf = NULL;
> +
> + if (pcap_compile(pcap, &fcode, s, 1, PCAP_NETMASK_UNKNOWN)) {
> + printf("%s@%d: pcap_compile('%s') failed: %s;\n",
> +__func__, __LINE__, s, pcap_geterr(pcap));
> + return -1;
> + }
> +
> + prm = rte_bpf_convert(&fcode);
> + if (prm == NULL) {
> + printf("%s@%d: bpf_convert('%s') fai

Re: [dpdk-dev] [PATCH 1/4] vhost: support async dequeue for split ring

2021-09-15 Thread Xia, Chenbo
Hi Maxime & Yuan,

> -Original Message-
> From: Wang, YuanX 
> Sent: Wednesday, September 15, 2021 5:09 PM
> To: Xia, Chenbo ; Ma, WenwuX ;
> dev@dpdk.org
> Cc: maxime.coque...@redhat.com; Jiang, Cheng1 ; Hu,
> Jiayu ; Pai G, Sunil ; Yang,
> YvonneX ; Wang, Yinan 
> Subject: RE: [PATCH 1/4] vhost: support async dequeue for split ring
> 
> Hi Chenbo,
> 
> > -Original Message-
> > From: Xia, Chenbo 
> > Sent: Wednesday, September 15, 2021 10:52 AM
> > To: Ma, WenwuX ; dev@dpdk.org
> > Cc: maxime.coque...@redhat.com; Jiang, Cheng1 ;
> > Hu, Jiayu ; Pai G, Sunil ; Yang,
> > YvonneX ; Wang, YuanX
> > ; Wang, Yinan 
> > Subject: RE: [PATCH 1/4] vhost: support async dequeue for split ring
> >
> > Hi,
> >
> > > -Original Message-
> > > From: Ma, WenwuX 
> > > Sent: Tuesday, September 7, 2021 4:49 AM
> > > To: dev@dpdk.org
> > > Cc: maxime.coque...@redhat.com; Xia, Chenbo ;
> > > Jiang,
> > > Cheng1 ; Hu, Jiayu ; Pai
> > > G, Sunil ; Yang, YvonneX
> > > ; Wang, YuanX ; Ma,
> > > WenwuX ; Wang, Yinan 
> > > Subject: [PATCH 1/4] vhost: support async dequeue for split ring
> > >
> > > From: Yuan Wang 
> > >
> > > This patch implements asynchronous dequeue data path for split ring.
> > > A new asynchronous dequeue function is introduced. With this function,
> > > the application can try to receive packets from the guest with
> > > offloading copies to the async channel, thus saving precious CPU
> > > cycles.
> > >
> > > Signed-off-by: Yuan Wang 
> > > Signed-off-by: Jiayu Hu 
> > > Signed-off-by: Wenwu Ma 
> > > Tested-by: Yinan Wang 
> > > ---
> > >  doc/guides/prog_guide/vhost_lib.rst |   9 +
> > >  lib/vhost/rte_vhost_async.h |  36 +-
> > >  lib/vhost/version.map   |   3 +
> > >  lib/vhost/vhost.h   |   3 +-
> > >  lib/vhost/virtio_net.c  | 531 
> > >  5 files changed, 579 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/doc/guides/prog_guide/vhost_lib.rst
> > > b/doc/guides/prog_guide/vhost_lib.rst
> > > index 171e0096f6..9ed544db7a 100644
> > > --- a/doc/guides/prog_guide/vhost_lib.rst
> > > +++ b/doc/guides/prog_guide/vhost_lib.rst
> > > @@ -303,6 +303,15 @@ The following is an overview of some key Vhost
> > > API
> > > functions:
> > >Clear inflight packets which are submitted to DMA engine in vhost
> > > async data
> > >path. Completed packets are returned to applications through ``pkts``.
> > >
> > > +* ``rte_vhost_async_try_dequeue_burst(vid, queue_id, mbuf_pool, pkts,
> > > +count,
> > > nr_inflight)``
> > > +
> > > +  This function tries to receive packets from the guest with
> > > + offloading  copies to the async channel. The packets that are
> > > + transfer completed  are returned in ``pkts``. The other packets that
> > > + their copies are submitted  to the async channel but not completed are
> > called "in-flight packets".
> > > +  This function will not return in-flight packets until their copies
> > > + are  completed by the async channel.
> > > +
> > >  Vhost-user Implementations
> > >  --
> > >
> > > diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> > > index ad71555a7f..5e2429ab70 100644
> > > --- a/lib/vhost/rte_vhost_async.h
> > > +++ b/lib/vhost/rte_vhost_async.h
> > > @@ -83,12 +83,18 @@ struct rte_vhost_async_channel_ops {
> > >   uint16_t max_packets);
> > >  };
> > >
> > > +struct async_nethdr {
> > > + struct virtio_net_hdr hdr;
> > > + bool valid;
> > > +};
> > > +
> >
> > As a struct exposed in public headers, it's better to prefix it with rte_.
> > In this case I would prefer rte_async_net_hdr.
> >
> > >  /**
> > > - * inflight async packet information
> > > + * in-flight async packet information
> > >   */
> > >  struct async_inflight_info {
> >
> > Could you help to rename it too? Like rte_async_inflight_info.
> 
> You are right, these two structs are for internal use and not suitable for
> exposure in the public header,
> but they are used for async channel, I think it's not suitable to be placed in
> other headers.
> Could you give some advice on which file to put them in?

@Maxime, what do you think of this? I think changing it, renaming it, or moving
it is an ABI breakage either way. But since it's never used by any application,
I guess it's not a big problem. So what do you think we should do with the
struct? I would vote for moving it temporarily to a header like vhost.h. At some
point, we can create a new internal async header for structs like this. Or
should we create it now?

@Yuan, thinking again about struct async_nethdr: do we really need to define
it? For now, the header is invalid only when virtio_net_with_host_offload(dev)
is false, right? So why not use that to tell whether the hdr is valid when you
need to check?

Thanks,
Chenbo

> 
> >
> > >   struct rte_mbuf *mbuf;
> > > - uint16_t descs; /* num of descs inflight */
> > > + struct async_nethdr nethdr;
> > > + uint16_t descs; /* num of descs in-flight */
> > >   uint16_t nr

Re: [dpdk-dev] [PATCH v2] net/virtio: wait device ready during reset

2021-09-15 Thread Xia, Chenbo
> -Original Message-
> From: Xueming Li 
> Sent: Wednesday, September 15, 2021 6:12 PM
> To: dev@dpdk.org
> Cc: xuemi...@nvidia.com; Andrew Rybchenko ;
> Maxime Coquelin ; Xia, Chenbo
> 
> Subject: [PATCH v2] net/virtio: wait device ready during reset
> 
> According to virtio spec, the device MUST reset when 0 is written to
> device_status, and present 0 in device_status once reset is done.
> 
> This patch waits for the status value to become 0 during the reset
> operation; if it times out after 3 seconds, log and continue.
> 
> Signed-off-by: Xueming Li 
> Cc: Andrew Rybchenko 
> --
> 2.33.0

Reviewed-by: Chenbo Xia 
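
For readers skimming the thread, the behaviour described in the commit message
(poll device_status until it reads 0, give up after a timeout and carry on)
can be sketched as below. This is only a sketch under stated assumptions, not
the patch itself: the callback types, the ex_* names and the fake device are
all hypothetical and exist only to make the example self-contained. A real
driver would plug in its own status read and delay routines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callbacks: read device_status, and sleep for some ms. */
typedef uint8_t (*ex_read_status_t)(void *ctx);
typedef void (*ex_delay_ms_t)(void *ctx, unsigned int ms);

/*
 * Poll until the status reads 0 or timeout_ms has elapsed.
 * Returns true if the reset completed, false on timeout; the caller
 * is expected to log the timeout and continue.
 */
static bool
ex_wait_reset_done(ex_read_status_t read_status, ex_delay_ms_t delay_ms,
		   void *ctx, unsigned int timeout_ms, unsigned int poll_ms)
{
	unsigned int waited_ms = 0;

	if (poll_ms == 0)
		poll_ms = 1; /* avoid a loop that never makes progress */

	while (read_status(ctx) != 0) {
		if (waited_ms >= timeout_ms)
			return false;
		delay_ms(ctx, poll_ms);
		waited_ms += poll_ms;
	}
	return true;
}

/* Fake device for demonstration: busy for a fixed number of reads. */
struct ex_fake_dev {
	unsigned int busy_reads;
	unsigned int slept_ms;
};

static uint8_t
ex_fake_read_status(void *ctx)
{
	struct ex_fake_dev *dev = ctx;

	if (dev->busy_reads == 0)
		return 0;
	dev->busy_reads--;
	return 1;
}

static void
ex_fake_delay_ms(void *ctx, unsigned int ms)
{
	struct ex_fake_dev *dev = ctx;

	dev->slept_ms += ms; /* record the delay instead of sleeping */
}
```

Passing the delay as a callback keeps the sketch independent of any particular
sleep API and makes the timeout path easy to exercise with a fake device.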


Re: [dpdk-dev] OpenSSL 3.0 released, deprecated functions - DPDK crypto compilation issues

2021-09-15 Thread Kusztal, ArkadiuszX
Sure Akhil, I will do some more testing and send it (unless a better
proposal comes in). It needs to be backported as well.

From: Akhil Goyal 
Sent: Wednesday, September 15, 2021 12:51 PM
To: Kusztal, ArkadiuszX ; dev@dpdk.org
Cc: asoma...@amd.com; Zhang, Roy Fan ; Doherty, Declan 

Subject: RE: OpenSSL 3.0 released, deprecated functions - DPDK crypto 
compilation issues

Hi Arek,

Can you submit changes for the meson files needed to compile it with 3.0 if 
needed?

Regards,
Akhil

From: Kusztal, ArkadiuszX <arkadiuszx.kusz...@intel.com>
Sent: Wednesday, September 15, 2021 4:02 PM
To: dev@dpdk.org
Cc: Akhil Goyal <gak...@marvell.com>; asoma...@amd.com; Zhang, Roy Fan <roy.fan.zh...@intel.com>; Doherty, Declan <declan.dohe...@intel.com>
Subject: [EXT] OpenSSL 3.0 released, deprecated functions - DPDK crypto 
compilation issues

Hi,

OpenSSL 3.0 is now released, and since some PMDs in DPDK depend on libcrypto
functions that are now deprecated, anyone who installs 3.0 may see compilation
problems. So, in case the adaptation cannot be made in time, we suggest putting
a version constraint in the meson dependencies, "version : '<3.0.0'". It
probably will not work with OpenSSL installed from source though (at least it
did not work for me).

Regards,
Arek




[dpdk-dev] [PATCH v2] eal: add additional info if lcore exceeds max cores

2021-09-15 Thread David Hunt
If the user requests to use an lcore above 128 using -l or -c,
the eal will exit with "EAL: invalid core list syntax" and
very little other useful information.

This patch adds some extra information suggesting the use of --lcores
so that physical cores above RTE_MAX_LCORE (default 128) can be
used. This is achieved with the --lcores option by mapping
the logical cores in the application onto physical cores.

There is no change in functionality, just additional messages
suggesting how the --lcores option might be used for the supplied
list of lcores. For example, if "-l 12-14,130,132" is used, we
see the following additional output on the command line:

EAL: Error = One of the 5 cores provided exceeds RTE_MAX_LCORE (128)
EAL: Please use --lcores instead, e.g. --lcores 0@12,1@13,2@14,3@130,4@132

Signed-off-by: David Hunt 

---
changes in v2
   * Rather than increasing the default max lcores (as in v1),
 it was agreed to do this instead (switch to --lcores).
   * As the other patches in the v1 of the set are no longer related
 to this change, I'll submit as a separate patch set.
---
 lib/eal/common/eal_common_options.c | 31 +
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/lib/eal/common/eal_common_options.c b/lib/eal/common/eal_common_options.c
index ff5861b5f3..5c7a5a45a5 100644
--- a/lib/eal/common/eal_common_options.c
+++ b/lib/eal/common/eal_common_options.c
@@ -836,6 +836,8 @@ eal_parse_service_corelist(const char *corelist)
return 0;
 }
 
+#define MAX_LCORES_STRING 512
+
 static int
 eal_parse_corelist(const char *corelist, int *cores)
 {
@@ -843,6 +845,9 @@ eal_parse_corelist(const char *corelist, int *cores)
char *end = NULL;
int min, max;
int idx;
+   bool overflow = false;
+   char lcores[MAX_LCORES_STRING] = "";
+   int len = 0;
 
for (idx = 0; idx < RTE_MAX_LCORE; idx++)
cores[idx] = -1;
@@ -862,8 +867,10 @@ eal_parse_corelist(const char *corelist, int *cores)
idx = strtol(corelist, &end, 10);
if (errno || end == NULL)
return -1;
-   if (idx < 0 || idx >= RTE_MAX_LCORE)
+   if (idx < 0)
return -1;
+   if (idx >= RTE_MAX_LCORE)
+   overflow = true;
while (isblank(*end))
end++;
if (*end == '-') {
@@ -873,10 +880,19 @@ eal_parse_corelist(const char *corelist, int *cores)
if (min == RTE_MAX_LCORE)
min = idx;
for (idx = min; idx <= max; idx++) {
-   if (cores[idx] == -1) {
-   cores[idx] = count;
-   count++;
+   if (idx < RTE_MAX_LCORE) {
+   if (cores[idx] == -1)
+   cores[idx] = count;
}
+   count++;
+   if (count == 1)
+   len = len + snprintf(&lcores[len],
+   MAX_LCORES_STRING - len,
+   "%d@%d", count-1, idx);
+   else
+   len = len + snprintf(&lcores[len],
+   MAX_LCORES_STRING - len,
+   ",%d@%d", count-1, idx);
}
min = RTE_MAX_LCORE;
} else
@@ -886,6 +902,13 @@ eal_parse_corelist(const char *corelist, int *cores)
 
if (count == 0)
return -1;
+   if (overflow) {
+   RTE_LOG(ERR, EAL, "Error = One of the %d cores provided exceeds RTE_MAX_LCORE (%d)\n",
+   count, RTE_MAX_LCORE);
+   RTE_LOG(ERR, EAL, "Please use --lcores instead, e.g. --lcores %s\n",
+   lcores);
+   return -1;
+   }
return 0;
 }
 
-- 
2.17.1



Re: [dpdk-dev] [PATCH v1 1/6] build: increase default of max lcores to 512

2021-09-15 Thread David Hunt



On 14/9/2021 12:29 PM, David Marchand wrote:

On Tue, Sep 14, 2021 at 1:07 PM David Hunt  wrote:

“ERROR: logical core 212 is above the maximum lcore number permitted.
Please use the --lcores option to map lcores onto physical cores, e.g.
--lcores="(0-3)@(212-215).”

If you could directly provide the right --lcores syntax based on what
user provided with -c or -l, it would be even better.
This should be not that difficult.


Agreed. I now have something working that when given "-l 12-16,130,132",
will output the following:

EAL: One of the 7 cores provided exceeds RTE_MAX_LCORE (128)
EAL: Please use --lcores instead, e.g. --lcores "(0-6)@(12-16,130,132)"

That's not equivalent.

(0-6)@(12-16,130,132) means 7 lcores with each lcore running on the
same group of physical cores.
-l 12-16,130,132 means 7 lcores running on dedicated physical cores.
I would expect 0@12,1@13,2@14,3@15,4@16,5@130,6@132


You can see with debug logs:

$ echo quit | ./build/app/dpdk-testpmd --log-level=*:debug --no-huge
-m 512 --lcores '(0-2)@(0-2)' -- --total-num-mbufs 2048 |& grep
lcore.*is.ready
EAL: Main lcore 0 is ready (tid=7feb9550bc00;cpuset=[0,1,2])
EAL: lcore 1 is ready (tid=7feb909ce700;cpuset=[0,1,2])
EAL: lcore 2 is ready (tid=7feb901cd700;cpuset=[0,1,2])

vs

$ echo quit | ./build/app/dpdk-testpmd --log-level=*:debug --no-huge
-m 512 --lcores 0@0,1@1,2@2 -- --total-num-mbufs 2048 |& grep
lcore.*is.ready
EAL: Main lcore 0 is ready (tid=7fba1cd1ac00;cpuset=[0])
EAL: lcore 2 is ready (tid=7fba179dc700;cpuset=[2])
EAL: lcore 1 is ready (tid=7fba181dd700;cpuset=[1])



Hi David,

   Thanks for the clarification. I've made the relevant changes and 
submitted a v2. Hopefully the suggested parameters are correct this time! :)


Regards,
Dave.
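
For reference, the one-to-one mapping David Marchand expects can be sketched roughly as below. This is only an illustration of the "lcore@cpu" formatting, assuming a hypothetical helper name (build_lcores_arg) and a plain array of physical core ids as input; it is not the code from the v2 patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of EAL): format a list of physical core
 * ids as a one-to-one --lcores argument, e.g. {12, 13} -> "0@12,1@13".
 * Returns the string length, or -1 if the buffer is too small.
 */
static int
build_lcores_arg(const int *cpus, size_t n, char *buf, size_t buflen)
{
	size_t len = 0;
	size_t i;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		/* lcore ids are assigned sequentially, one per physical core */
		int ret = snprintf(buf + len, buflen - len, "%s%zu@%d",
				   i == 0 ? "" : ",", i, cpus[i]);

		if (ret < 0 || (size_t)ret >= buflen - len)
			return -1;
		len += (size_t)ret;
	}

	return (int)len;
}
```

Given the cores from "-l 12-16,130,132", this yields the string suggested in the review, whereas "(0-6)@(12-16,130,132)" would let every lcore float across all seven cores.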





[dpdk-dev] [PATCH] net/virtio: report max/min/align Tx desc limits in dev info

2021-09-15 Thread Andrew Rybchenko
From: Ivan Ilchenko 

Report max/min/align Tx descriptor limits in the device info get callback.
Before calling the callback, rte_eth_dev_info_get() sets default values
of nb_min to zero and nb_max to UINT16_MAX. These defaults are not
correct for the driver, so one can't rely on them.

Signed-off-by: Ivan Ilchenko 
Signed-off-by: Andrew Rybchenko 
---
 drivers/net/virtio/virtio_ethdev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 6c7d9bf58d..be5e4c0011 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -2579,6 +2579,7 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 * The Queue Size value does not have to be a power of 2.
 */
dev_info->rx_desc_lim.nb_max = UINT16_MAX;
+   dev_info->tx_desc_lim.nb_max = UINT16_MAX;
} else {
/*
 * According to 2.6 Split Virtqueues:
@@ -2586,6 +2587,7 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 * Size value is 32768.
 */
dev_info->rx_desc_lim.nb_max = 32768;
+   dev_info->tx_desc_lim.nb_max = 32768;
}
/*
 * Actual minimum is not the same for virtqueues of different kinds,
@@ -2594,7 +2596,9 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 */
dev_info->rx_desc_lim.nb_min = RTE_MAX(DEFAULT_RX_FREE_THRESH,
   RTE_VIRTIO_VPMD_RX_REARM_THRESH);
+   dev_info->tx_desc_lim.nb_min = DEFAULT_TX_FREE_THRESH;
dev_info->rx_desc_lim.nb_align = 1;
+   dev_info->tx_desc_lim.nb_align = 1;
 
return 0;
 }
-- 
2.30.2
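
To illustrate why correct limits matter to an application, here is a minimal sketch of how a requested ring size might be adjusted against nb_min/nb_max/nb_align. The struct below is a local stand-in that only mirrors the three field names used in the patch, and adjust_nb_desc() is a hypothetical helper; real applications would more likely rely on rte_eth_dev_adjust_nb_rx_tx_desc(). The value 32 used for nb_min in the usage note is illustrative only.

```c
#include <stdint.h>

/* Local stand-in mirroring the three limit fields set by the patch. */
struct desc_lim_sketch {
	uint16_t nb_max;   /* maximum number of descriptors */
	uint16_t nb_min;   /* minimum number of descriptors */
	uint16_t nb_align; /* descriptor count must be a multiple of this */
};

/* Hypothetical helper: fit a requested descriptor count into the limits. */
static uint16_t
adjust_nb_desc(uint16_t requested, const struct desc_lim_sketch *lim)
{
	uint32_t align = lim->nb_align ? lim->nb_align : 1;
	uint32_t n = requested;

	/* Round up to the required alignment first. */
	n = (n + align - 1) / align * align;

	/* Clamp to the maximum, keeping the result aligned. */
	if (n > lim->nb_max)
		n = lim->nb_max - (lim->nb_max % align);

	/* Finally enforce the minimum. */
	if (n < lim->nb_min)
		n = lim->nb_min;

	return (uint16_t)n;
}
```

With the split-ring values from the patch (nb_max = 32768, nb_align = 1) and an illustrative nb_min of 32, a request for 40000 descriptors is reduced to 32768 and a request for 0 is raised to 32. With the old defaults (nb_min = 0, nb_max = UINT16_MAX) neither request would have been corrected.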



[dpdk-dev] [PATCH v14] eventdev: simplify Rx adapter event vector config

2021-09-15 Thread pbhagavatula
From: Pavan Nikhilesh 

Include vector configuration into the structure
``rte_event_eth_rx_adapter_queue_conf`` that is used to configure
Rx adapter ethernet device Rx queue parameters.
This simplifies event vector configuration as it avoids splitting
configuration per Rx queue.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Jay Jayatheerthan 
---
 v14 Changes:
 - Update documentation.
 v13 Changes:
 - Fix cnxk driver compilation.
 v12 Changes:
 - Remove deprication notice.
 - Remove unnecessary change Id.

 app/test-eventdev/test_pipeline_common.c  |  16 +-
 .../prog_guide/event_ethernet_rx_adapter.rst  |  11 +-
 doc/guides/rel_notes/deprecation.rst  |   9 -
 drivers/event/cnxk/cn10k_eventdev.c   |  77 
 drivers/event/cnxk/cnxk_eventdev_adptr.c  |  41 
 lib/eventdev/eventdev_pmd.h   |  29 ---
 lib/eventdev/rte_event_eth_rx_adapter.c   | 179 ++
 lib/eventdev/rte_event_eth_rx_adapter.h   |  30 ---
 lib/eventdev/version.map  |   1 -
 9 files changed, 112 insertions(+), 281 deletions(-)

diff --git a/app/test-eventdev/test_pipeline_common.c b/app/test-eventdev/test_pipeline_common.c
index 6ee530d4cd..2697547641 100644
--- a/app/test-eventdev/test_pipeline_common.c
+++ b/app/test-eventdev/test_pipeline_common.c
@@ -332,7 +332,6 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, 
uint8_t stride,
uint16_t prod;
struct rte_mempool *vector_pool = NULL;
struct rte_event_eth_rx_adapter_queue_conf queue_conf;
-   struct rte_event_eth_rx_adapter_event_vector_config vec_conf;

memset(&queue_conf, 0,
sizeof(struct rte_event_eth_rx_adapter_queue_conf));
@@ -398,8 +397,12 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, 
uint8_t stride,
}

if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) {
+   queue_conf.vector_sz = opt->vector_size;
+   queue_conf.vector_timeout_ns =
+   opt->vector_tmo_nsec;
queue_conf.rx_queue_flags |=
RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR;
+   queue_conf.vector_mp = vector_pool;
} else {
evt_err("Rx adapter doesn't support event 
vector");
return -EINVAL;
@@ -419,17 +422,6 @@ pipeline_event_rx_adapter_setup(struct evt_options *opt, 
uint8_t stride,
return ret;
}

-   if (opt->ena_vector) {
-   vec_conf.vector_sz = opt->vector_size;
-   vec_conf.vector_timeout_ns = opt->vector_tmo_nsec;
-   vec_conf.vector_mp = vector_pool;
-   if (rte_event_eth_rx_adapter_queue_event_vector_config(
-   prod, prod, -1, &vec_conf) < 0) {
-   evt_err("Failed to configure event 
vectorization for Rx adapter");
-   return -EINVAL;
-   }
-   }
-
if (!(cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT)) {
uint32_t service_id = -1U;

diff --git a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
index c01e5a9666..0780b6f711 100644
--- a/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_rx_adapter.rst
@@ -195,12 +195,17 @@ The event devices, ethernet device pairs which support 
the capability
 flow characteristics and generate a ``rte_event`` containing 
``rte_event_vector``
 whose event type is either ``RTE_EVENT_TYPE_ETHDEV_VECTOR`` or
 ``RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR``.
-The aggregation size and timeout are configurable at a queue level and the
-maximum, minimum vector sizes and timeouts vary based on the device capability
-and can be queried using ``rte_event_eth_rx_adapter_vector_limits_get``.
+The maximum, minimum vector sizes and timeouts vary based on the device
+capability and can be queried using
+``rte_event_eth_rx_adapter_vector_limits_get``.
 The Rx adapter additionally might include useful data such as ethernet device
 port and queue identifier in the ``rte_event_vector::port`` and
 ``rte_event_vector::queue`` and mark ``rte_event_vector::attr_valid`` as true.
+The aggregation size and timeout are configurable at a queue level by setting
+``rte_event_eth_rx_adapter_queue_conf::vector_sz``,
+``rte_event_eth_rx_adapter_queue_conf::vector_timeout_ns`` and
+``rte_event_eth_rx_adapter_queue_conf::vector_mp`` when adding queues using
+``rte_event_eth_rx_adapter_queue_add``.

 A loop processing ``rte_event_vector`` containing mbufs is shown below.

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.r
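
As a rough illustration of the per-queue configuration described in the documentation hunk above, the sketch below validates a requested vector size and timeout against device limits and then fills the three vector fields. Everything here is a local stand-in: the structs only borrow the field names vector_sz, vector_timeout_ns, vector_mp and rx_queue_flags from the patch, while the limits struct layout, the flag value and the helper name fill_vector_conf() are assumptions made for the example, not the DPDK definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the limits reported by the device (layout assumed). */
struct vector_limits_sketch {
	uint16_t min_sz;
	uint16_t max_sz;
	uint64_t min_timeout_ns;
	uint64_t max_timeout_ns;
};

/* Stand-in for the queue configuration; field names follow the patch. */
struct queue_conf_sketch {
	uint32_t rx_queue_flags;
	uint16_t vector_sz;
	uint64_t vector_timeout_ns;
	void *vector_mp;
};

/* Illustrative flag value, not the real RTE_* constant. */
#define SKETCH_QUEUE_EVENT_VECTOR (1u << 1)

/* Hypothetical helper: validate and fill the vector part of the config. */
static int
fill_vector_conf(struct queue_conf_sketch *qc,
		 const struct vector_limits_sketch *lim,
		 uint16_t sz, uint64_t timeout_ns, void *pool)
{
	if (pool == NULL)
		return -1;
	if (sz < lim->min_sz || sz > lim->max_sz)
		return -1;
	if (timeout_ns < lim->min_timeout_ns ||
	    timeout_ns > lim->max_timeout_ns)
		return -1;

	/* All vector settings travel with the queue configuration. */
	qc->vector_sz = sz;
	qc->vector_timeout_ns = timeout_ns;
	qc->vector_mp = pool;
	qc->rx_queue_flags |= SKETCH_QUEUE_EVENT_VECTOR;

	return 0;
}
```

The point of the design is that a single structure is passed when the queue is added, so there is no second configuration call that could fail after the queue already exists.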

Re: [dpdk-dev] [PATCH v14] eventdev: simplify Rx adapter event vector config

2021-09-15 Thread Kinsella, Ray



On 15/09/2021 14:15, pbhagavat...@marvell.com wrote:
> From: Pavan Nikhilesh 
> 
> Include vector configuration into the structure
> ``rte_event_eth_rx_adapter_queue_conf`` that is used to configure
> Rx adapter ethernet device Rx queue parameters.
> This simplifies event vector configuration as it avoids splitting
> configuration per Rx queue.
> 
> Signed-off-by: Pavan Nikhilesh 
> Acked-by: Jay Jayatheerthan 
> ---
>  v14 Changes:
>  - Update documentation.
>  v13 Changes:
>  - Fix cnxk driver compilation.
>  v12 Changes:
>  - Remove deprication notice.
>  - Remove unnecessary change Id.
> 
>  app/test-eventdev/test_pipeline_common.c  |  16 +-
>  .../prog_guide/event_ethernet_rx_adapter.rst  |  11 +-
>  doc/guides/rel_notes/deprecation.rst  |   9 -
>  drivers/event/cnxk/cn10k_eventdev.c   |  77 
>  drivers/event/cnxk/cnxk_eventdev_adptr.c  |  41 
>  lib/eventdev/eventdev_pmd.h   |  29 ---
>  lib/eventdev/rte_event_eth_rx_adapter.c   | 179 ++
>  lib/eventdev/rte_event_eth_rx_adapter.h   |  30 ---
>  lib/eventdev/version.map  |   1 -
>  9 files changed, 112 insertions(+), 281 deletions(-)

FYI - deprication is spelt deprecation 

Acked-by: Ray Kinsella 



[dpdk-dev] [PATCH v1] net/virtio: wait device ready during reset

2021-09-15 Thread Xueming Li
According to virtio spec, the device MUST reset when 0 is written to
device_status, and present 0 in device_status once reset is done.

This patch waits for the status value to become 0 during the reset
operation. If that does not happen within 3 seconds, it logs a warning
and continues.

Signed-off-by: Xueming Li 
Cc: Andrew Rybchenko 
---
 drivers/net/virtio/virtio.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio.c b/drivers/net/virtio/virtio.c
index 7e1e77797f..f865b27b65 100644
--- a/drivers/net/virtio/virtio.c
+++ b/drivers/net/virtio/virtio.c
@@ -3,7 +3,10 @@
  * Copyright(c) 2020 Red Hat, Inc.
  */
 
+#include 
+
 #include "virtio.h"
+#include "virtio_logs.h"
 
 uint64_t
 virtio_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
@@ -38,9 +41,17 @@ virtio_write_dev_config(struct virtio_hw *hw, size_t offset,
 void
 virtio_reset(struct virtio_hw *hw)
 {
+   uint32_t retry = 0;
+
VIRTIO_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
-   /* flush status write */
-   VIRTIO_OPS(hw)->get_status(hw);
+   /* Flush status write and wait device ready max 3 seconds. */
+   while (VIRTIO_OPS(hw)->get_status(hw) != VIRTIO_CONFIG_STATUS_RESET) {
+   if (retry++ > 3000) {
+   PMD_INIT_LOG(WARNING, "device reset timeout");
+   break;
+   }
+   usleep(1000L);
+   }
 }
 
 void
-- 
2.33.0
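
The bounded polling pattern used by the patch can be sketched in isolation as follows. This is an illustration under stated assumptions, not the driver code: the status reader and the delay are passed in as callbacks so the loop can be exercised without hardware, and the names wait_for_reset, fake_dev, fake_get_status and no_delay are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t (*get_status_fn)(void *ctx);
typedef void (*delay_fn)(unsigned int usec);

/*
 * Poll until the device reports status 0 (reset done) or the poll budget
 * is used up. Returns true if the device came back in time.
 */
static bool
wait_for_reset(get_status_fn get_status, void *ctx, delay_fn delay,
	       unsigned int max_polls, unsigned int poll_us)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if (get_status(ctx) == 0)
			return true;
		delay(poll_us);
	}

	/* One last read after the final delay. */
	return get_status(ctx) == 0;
}

/* Fake device for illustration: reports non-zero for the first N reads. */
struct fake_dev {
	unsigned int busy_reads;
	unsigned int reads;
};

static uint8_t
fake_get_status(void *ctx)
{
	struct fake_dev *dev = ctx;

	dev->reads++;
	return dev->reads > dev->busy_reads ? 0 : 0x0f;
}

/* Delay stub so the example does not actually sleep. */
static void
no_delay(unsigned int usec)
{
	(void)usec;
}
```

With max_polls = 3000 and poll_us = 1000 this gives roughly the 3 second budget mentioned in the commit message, assuming the delay callback sleeps for the requested time.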



Re: [dpdk-dev] [PATCH 02/32] net/ngbe: support scattered Rx

2021-09-15 Thread Ferruh Yigit
On 9/8/2021 9:37 AM, Jiawen Wu wrote:
> Add scattered Rx function to support receiving segmented mbufs.
> 
> Signed-off-by: Jiawen Wu 
> ---
>  doc/guides/nics/features/ngbe.ini |   1 +
>  doc/guides/nics/ngbe.rst  |   1 +
>  drivers/net/ngbe/ngbe_ethdev.c|  20 +-
>  drivers/net/ngbe/ngbe_ethdev.h|   8 +
>  drivers/net/ngbe/ngbe_rxtx.c  | 541 ++
>  drivers/net/ngbe/ngbe_rxtx.h  |   5 +
>  6 files changed, 574 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/nics/features/ngbe.ini 
> b/doc/guides/nics/features/ngbe.ini
> index 8b7588184a..f85754eb7a 100644
> --- a/doc/guides/nics/features/ngbe.ini
> +++ b/doc/guides/nics/features/ngbe.ini
> @@ -8,6 +8,7 @@ Speed capabilities   = Y
>  Link status  = Y
>  Link status event= Y
>  Queue start/stop = Y
> +Scattered Rx = Y
>  Packet type parsing  = Y
>  Multiprocess aware   = Y
>  Linux= Y
> diff --git a/doc/guides/nics/ngbe.rst b/doc/guides/nics/ngbe.rst
> index d044397cd5..463452ce8c 100644
> --- a/doc/guides/nics/ngbe.rst
> +++ b/doc/guides/nics/ngbe.rst
> @@ -13,6 +13,7 @@ Features
>  
>  - Packet type information
>  - Link state information
> +- Scattered for RX
>  
>  
>  Prerequisites
> diff --git a/drivers/net/ngbe/ngbe_ethdev.c b/drivers/net/ngbe/ngbe_ethdev.c
> index 4388d93560..fba0a2dcfd 100644
> --- a/drivers/net/ngbe/ngbe_ethdev.c
> +++ b/drivers/net/ngbe/ngbe_ethdev.c
> @@ -140,8 +140,16 @@ eth_ngbe_dev_init(struct rte_eth_dev *eth_dev, void 
> *init_params __rte_unused)
>   eth_dev->rx_pkt_burst = &ngbe_recv_pkts;
>   eth_dev->tx_pkt_burst = &ngbe_xmit_pkts_simple;
>  
> - if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> + /*
> +  * For secondary processes, we don't initialise any further as primary
> +  * has already done this work. Only check we don't need a different
> +  * Rx and Tx function.
> +  */
> + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> + ngbe_set_rx_function(eth_dev);
> +
>   return 0;
> + }
>  
>   rte_eth_copy_pci_info(eth_dev, pci_dev);
>  
> @@ -528,6 +536,9 @@ ngbe_dev_stop(struct rte_eth_dev *dev)
>  
>   ngbe_dev_clear_queues(dev);
>  
> + /* Clear stored conf */
> + dev->data->scattered_rx = 0;
> +
>   /* Clear recorded link status */
>   memset(&link, 0, sizeof(link));
>   rte_eth_linkstatus_set(dev, &link);
> @@ -628,6 +639,8 @@ ngbe_dev_info_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *dev_info)
>   dev_info->max_tx_queues = (uint16_t)hw->mac.max_tx_queues;
>   dev_info->min_rx_bufsize = 1024;
>   dev_info->max_rx_pktlen = 15872;
> + dev_info->rx_offload_capa = (ngbe_get_rx_port_offloads(dev) |
> +  dev_info->rx_queue_offload_capa);
>  
>   dev_info->default_rxconf = (struct rte_eth_rxconf) {
>   .rx_thresh = {
> @@ -670,7 +683,10 @@ ngbe_dev_info_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *dev_info)
>  const uint32_t *
>  ngbe_dev_supported_ptypes_get(struct rte_eth_dev *dev)
>  {
> - if (dev->rx_pkt_burst == ngbe_recv_pkts)
> + if (dev->rx_pkt_burst == ngbe_recv_pkts ||
> + dev->rx_pkt_burst == ngbe_recv_pkts_sc_single_alloc ||
> + dev->rx_pkt_burst == ngbe_recv_pkts_sc_bulk_alloc ||
> + dev->rx_pkt_burst == ngbe_recv_pkts_bulk_alloc)
>   return ngbe_get_supported_ptypes();
>  
>   return NULL;
> diff --git a/drivers/net/ngbe/ngbe_ethdev.h b/drivers/net/ngbe/ngbe_ethdev.h
> index 486c6c3839..e7fe9a03b7 100644
> --- a/drivers/net/ngbe/ngbe_ethdev.h
> +++ b/drivers/net/ngbe/ngbe_ethdev.h
> @@ -106,6 +106,14 @@ int ngbe_dev_tx_queue_stop(struct rte_eth_dev *dev, 
> uint16_t tx_queue_id);
>  uint16_t ngbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>   uint16_t nb_pkts);
>  
> +uint16_t ngbe_recv_pkts_bulk_alloc(void *rx_queue, struct rte_mbuf **rx_pkts,
> + uint16_t nb_pkts);
> +
> +uint16_t ngbe_recv_pkts_sc_single_alloc(void *rx_queue,
> + struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
> +uint16_t ngbe_recv_pkts_sc_bulk_alloc(void *rx_queue,
> + struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
> +
>  uint16_t ngbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
>   uint16_t nb_pkts);
>  
> diff --git a/drivers/net/ngbe/ngbe_rxtx.c b/drivers/net/ngbe/ngbe_rxtx.c
> index a3ef0f7577..49fa978853 100644
> --- a/drivers/net/ngbe/ngbe_rxtx.c
> +++ b/drivers/net/ngbe/ngbe_rxtx.c
> @@ -263,6 +263,243 @@ ngbe_rxd_pkt_info_to_pkt_type(uint32_t pkt_info, 
> uint16_t ptid_mask)
>   return ngbe_decode_ptype(ptid);
>  }
>  
> +/*
> + * LOOK_AHEAD defines how many desc statuses to check beyond the
> + * current descriptor.
> + * It must be a pound define for optimal performance.
> + * Do not change the value of LOOK_AHEAD, as the ngbe_rx_scan_hw_ring
> + * function only works with LOOK_AHEAD=8.
> + */
>

[dpdk-dev] [PATCH v2 0/4] iavf: add iAVF IPsec inline crypto support

2021-09-15 Thread Radu Nicolau
Add support for inline crypto for IPsec, for ESP transport and
tunnel over IPv4 and IPv6, as well as supporting the offload for
ESP over UDP, and in conjunction with TSO for UDP and TCP flows.

Radu Nicolau (4):
  common/iavf: add iAVF IPsec inline crypto support
  net/iavf: add iAVF IPsec inline crypto support
  net/iavf: Add xstats support for inline IPsec crypto
  net/iavf: add watchdog for VFLR

 drivers/common/iavf/iavf_type.h   |  215 +-
 drivers/common/iavf/virtchnl.h|   17 +-
 drivers/common/iavf/virtchnl_inline_ipsec.h   |  553 +
 drivers/net/iavf/iavf.h   |   53 +-
 drivers/net/iavf/iavf_ethdev.c|  222 +-
 drivers/net/iavf/iavf_generic_flow.c  |   16 +
 drivers/net/iavf/iavf_generic_flow.h  |2 +
 drivers/net/iavf/iavf_ipsec_crypto.c  | 1918 +
 drivers/net/iavf/iavf_ipsec_crypto.h  |   96 +
 .../net/iavf/iavf_ipsec_crypto_capabilities.h |  383 
 drivers/net/iavf/iavf_rxtx.c  |  729 +--
 drivers/net/iavf/iavf_rxtx.h  |  567 -
 drivers/net/iavf/iavf_rxtx_vec_sse.c  |   10 +-
 drivers/net/iavf/iavf_vchnl.c |  166 +-
 drivers/net/iavf/meson.build  |3 +-
 drivers/net/iavf/rte_pmd_iavf.h   |1 +
 drivers/net/iavf/version.map  |3 +
 17 files changed, 4615 insertions(+), 339 deletions(-)
 create mode 100644 drivers/common/iavf/virtchnl_inline_ipsec.h
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto.c
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto.h
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto_capabilities.h

-- 
v2: small updates and fixes in the flow related section

2.25.1



[dpdk-dev] [PATCH v2 1/4] common/iavf: add iAVF IPsec inline crypto support

2021-09-15 Thread Radu Nicolau
Add support for inline crypto for IPsec.

Signed-off-by: Declan Doherty 
Signed-off-by: Abhijit Sinha 
Signed-off-by: Radu Nicolau 
---
 drivers/common/iavf/iavf_type.h | 215 +++-
 drivers/common/iavf/virtchnl.h  |  17 +-
 drivers/common/iavf/virtchnl_inline_ipsec.h | 553 
 3 files changed, 775 insertions(+), 10 deletions(-)
 create mode 100644 drivers/common/iavf/virtchnl_inline_ipsec.h

diff --git a/drivers/common/iavf/iavf_type.h b/drivers/common/iavf/iavf_type.h
index 73dfb47e70..1f8f8ae5fd 100644
--- a/drivers/common/iavf/iavf_type.h
+++ b/drivers/common/iavf/iavf_type.h
@@ -709,11 +709,29 @@ enum iavf_rx_prog_status_desc_error_bits {
 #define IAVF_FOUR_BIT_MASK 0xF
 #define IAVF_EIGHTEEN_BIT_MASK 0x3
 
-/* TX Descriptor */
+/* TX Data Descriptor */
 struct iavf_tx_desc {
-   __le64 buffer_addr; /* Address of descriptor's data buf */
-   __le64 cmd_type_offset_bsz;
-};
+   union {
+   struct {
+   __le64 buffer_addr; /* Addr of descriptor's data buf */
+   __le64 cmd_type_offset_bsz;
+   };
+   struct {
+   __le64 qw0; /**< data buffer address */
+   __le64 qw1; /**< dtyp, cmd, offset, buf_sz and l2tag1 */
+   };
+   struct {
+   __le64 buffer_addr; /**< Data buffer address */
+   __le64 type:4;  /**< Descriptor type */
+   __le64 cmd:12;  /**< Command field */
+   __le64 offset_l2len:7;  /**< L2 header length */
+   __le64 offset_l3len:7;  /**< L3 header length */
+   __le64 offset_l4len:4;  /**< L4 header length */
+   __le64 buffer_sz:14;/**< Data buffer size */
+   __le64 l2tag1:16;   /**< L2 Tag 1 value */
+   } debug __rte_packed;
+   };
+} __rte_packed;
 
 #define IAVF_TXD_QW1_DTYPE_SHIFT   0
 #define IAVF_TXD_QW1_DTYPE_MASK(0xFUL << 
IAVF_TXD_QW1_DTYPE_SHIFT)
@@ -723,6 +741,7 @@ enum iavf_tx_desc_dtype_value {
IAVF_TX_DESC_DTYPE_NOP  = 0x1, /* same as Context desc */
IAVF_TX_DESC_DTYPE_CONTEXT  = 0x1,
IAVF_TX_DESC_DTYPE_FCOE_CTX = 0x2,
+   IAVF_TX_DESC_DTYPE_IPSEC= 0x3,
IAVF_TX_DESC_DTYPE_FILTER_PROG  = 0x8,
IAVF_TX_DESC_DTYPE_DDP_CTX  = 0x9,
IAVF_TX_DESC_DTYPE_FLEX_DATA= 0xB,
@@ -734,7 +753,7 @@ enum iavf_tx_desc_dtype_value {
 #define IAVF_TXD_QW1_CMD_SHIFT 4
 #define IAVF_TXD_QW1_CMD_MASK  (0x3FFUL << IAVF_TXD_QW1_CMD_SHIFT)
 
-enum iavf_tx_desc_cmd_bits {
+enum iavf_tx_data_desc_cmd_bits {
IAVF_TX_DESC_CMD_EOP= 0x0001,
IAVF_TX_DESC_CMD_RS = 0x0002,
IAVF_TX_DESC_CMD_ICRC   = 0x0004,
@@ -778,18 +797,79 @@ enum iavf_tx_desc_length_fields {
 #define IAVF_TXD_QW1_L2TAG1_SHIFT  48
 #define IAVF_TXD_QW1_L2TAG1_MASK   (0xULL << IAVF_TXD_QW1_L2TAG1_SHIFT)
 
+#define IAVF_TXD_DATA_QW1_DTYPE_SHIFT  (0)
+#define IAVF_TXD_DATA_QW1_DTYPE_MASK   (0xFUL << IAVF_TXD_QW1_DTYPE_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_CMD_SHIFT(4)
+#define IAVF_TXD_DATA_QW1_CMD_MASK (0x3FFUL << IAVF_TXD_DATA_QW1_CMD_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_OFFSET_SHIFT (16)
+#define IAVF_TXD_DATA_QW1_OFFSET_MASK  (0x3ULL << \
+   IAVF_TXD_DATA_QW1_OFFSET_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_OFFSET_MACLEN_SHIFT  (IAVF_TXD_DATA_QW1_OFFSET_SHIFT)
+#define IAVF_TXD_DATA_QW1_OFFSET_MACLEN_MASK   \
+   (0x7FUL << IAVF_TXD_DATA_QW1_OFFSET_MACLEN_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_OFFSET_IPLEN_SHIFT   \
+   (IAVF_TXD_DATA_QW1_OFFSET_SHIFT + IAVF_TX_DESC_LENGTH_IPLEN_SHIFT)
+#define IAVF_TXD_DATA_QW1_OFFSET_IPLEN_MASK\
+   (0x7FUL << IAVF_TXD_DATA_QW1_OFFSET_IPLEN_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_OFFSET_L4LEN_SHIFT   \
+   (IAVF_TXD_DATA_QW1_OFFSET_SHIFT + IAVF_TX_DESC_LENGTH_L4_FC_LEN_SHIFT)
+#define IAVF_TXD_DATA_QW1_OFFSET_L4LEN_MASK\
+   (0xFUL << IAVF_TXD_DATA_QW1_OFFSET_L4LEN_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_MACLEN_MASK  \
+   (0x7FUL << IAVF_TX_DESC_LENGTH_MACLEN_SHIFT)
+#define IAVF_TXD_DATA_QW1_IPLEN_MASK   \
+   (0x7FUL << IAVF_TX_DESC_LENGTH_IPLEN_SHIFT)
+#define IAVF_TXD_DATA_QW1_L4LEN_MASK   \
+   (0xFUL << IAVF_TX_DESC_LENGTH_L4_FC_LEN_SHIFT)
+#define IAVF_TXD_DATA_QW1_FCLEN_MASK   \
+   (0xFUL << IAVF_TX_DESC_LENGTH_L4_FC_LEN_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_TX_BUF_SZ_SHIFT  (34)
+#define IAVF_TXD_DATA_QW1_TX_BUF_SZ_MASK   \
+   (0x3FFFULL << IAVF_TXD_DATA_QW1_TX_BUF_SZ_SHIFT)
+
+#define IAVF_TXD_DATA_QW1_L2TAG1_SHIFT (48)
+#define IAVF_TXD_DATA_QW1_L2TAG1_MASK  \
+   (0xULL << IAVF_TXD_DATA_QW1_L2TAG1_SHIFT)
+
 /* Context descriptors */
 struct iavf_tx_context_desc {
+   union {
+ 

[dpdk-dev] [PATCH v2 2/4] net/iavf: add iAVF IPsec inline crypto support

2021-09-15 Thread Radu Nicolau
Add support for inline crypto for IPsec, for ESP transport and
tunnel over IPv4 and IPv6, as well as supporting the offload for
ESP over UDP, and in conjunction with TSO for UDP and TCP flows.
Implement support for rte_security packet metadata.

Add definition for IPsec descriptors, extend support for offload
in data and context descriptor to support

Add support to the virtual channel mailbox for IPsec Crypto request
operations. IPsec Crypto requests receive an initial acknowledgement
from the physical function driver confirming receipt of the request,
and then an asynchronous response reporting success/failure of the
request, including any response data.

Add enhanced descriptor debugging

Refactor of scalar tx burst function to support integration of offload

Signed-off-by: Declan Doherty 
Signed-off-by: Abhijit Sinha 
Signed-off-by: Radu Nicolau 
---
 drivers/net/iavf/iavf.h   |   26 +
 drivers/net/iavf/iavf_ethdev.c|   41 +-
 drivers/net/iavf/iavf_generic_flow.c  |   16 +
 drivers/net/iavf/iavf_generic_flow.h  |2 +
 drivers/net/iavf/iavf_ipsec_crypto.c  | 1918 +
 drivers/net/iavf/iavf_ipsec_crypto.h  |   96 +
 .../net/iavf/iavf_ipsec_crypto_capabilities.h |  383 
 drivers/net/iavf/iavf_rxtx.c  |  729 +--
 drivers/net/iavf/iavf_rxtx.h  |  579 -
 drivers/net/iavf/iavf_rxtx_vec_sse.c  |   10 +-
 drivers/net/iavf/iavf_vchnl.c |  166 +-
 drivers/net/iavf/meson.build  |3 +-
 drivers/net/iavf/rte_pmd_iavf.h   |1 +
 drivers/net/iavf/version.map  |3 +
 14 files changed, 3660 insertions(+), 313 deletions(-)
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto.c
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto.h
 create mode 100644 drivers/net/iavf/iavf_ipsec_crypto_capabilities.h

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index b3bd078111..934ef48278 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -189,6 +189,7 @@ struct iavf_info {
uint64_t supported_rxdid;
uint8_t *proto_xtr; /* proto xtr type for all queues */
volatile enum virtchnl_ops pend_cmd; /* pending command not finished */
+   rte_atomic32_t pend_cmd_count;
int cmd_retval; /* return value of the cmd response from PF */
uint8_t *aq_resp; /* buffer to store the adminq response from PF */
 
@@ -216,6 +217,7 @@ struct iavf_info {
rte_spinlock_t flow_ops_lock;
struct iavf_parser_list rss_parser_list;
struct iavf_parser_list dist_parser_list;
+   struct iavf_parser_list ipsec_crypto_parser_list;
 
struct iavf_fdir_info fdir; /* flow director info */
/* indicate large VF support enabled or not */
@@ -238,6 +240,7 @@ enum iavf_proto_xtr_type {
IAVF_PROTO_XTR_IPV6_FLOW,
IAVF_PROTO_XTR_TCP,
IAVF_PROTO_XTR_IP_OFFSET,
+   IAVF_PROTO_XTR_IPSEC_CRYPTO_SAID,
IAVF_PROTO_XTR_MAX,
 };
 
@@ -249,11 +252,14 @@ struct iavf_devargs {
uint8_t proto_xtr[IAVF_MAX_QUEUE_NUM];
 };
 
+struct iavf_security_ctx;
+
 /* Structure to store private data for each VF instance. */
 struct iavf_adapter {
struct iavf_hw hw;
struct rte_eth_dev *eth_dev;
struct iavf_info vf;
+   struct iavf_security_ctx *security_ctx;
 
bool rx_bulk_alloc_allowed;
/* For vector PMD */
@@ -272,6 +278,8 @@ struct iavf_adapter {
(&((struct iavf_adapter *)adapter)->vf)
 #define IAVF_DEV_PRIVATE_TO_HW(adapter) \
(&((struct iavf_adapter *)adapter)->hw)
+#define IAVF_DEV_PRIVATE_TO_IAVF_SECURITY_CTX(adapter) \
+   (((struct iavf_adapter *)adapter)->security_ctx)
 
 /* IAVF_VSI_TO */
 #define IAVF_VSI_TO_HW(vsi) \
@@ -340,9 +348,24 @@ _atomic_set_cmd(struct iavf_info *vf, enum virtchnl_ops 
ops)
if (!ret)
PMD_DRV_LOG(ERR, "There is incomplete cmd %d", vf->pend_cmd);
 
+   rte_atomic32_set(&vf->pend_cmd_count, 1);
+
return !ret;
 }
 
+/* Check there is pending cmd in execution. If none, set new command. */
+static inline int
+_atomic_set_async_response_cmd(struct iavf_info *vf, enum virtchnl_ops ops)
+{
+   int ret = rte_atomic32_cmpset(&vf->pend_cmd, VIRTCHNL_OP_UNKNOWN, ops);
+
+   if (!ret)
+   PMD_DRV_LOG(ERR, "There is incomplete cmd %d", vf->pend_cmd);
+
+   rte_atomic32_set(&vf->pend_cmd_count, 2);
+
+   return !ret;
+}
 int iavf_check_api_version(struct iavf_adapter *adapter);
 int iavf_get_vf_resource(struct iavf_adapter *adapter);
 void iavf_handle_virtchnl_msg(struct rte_eth_dev *dev);
@@ -399,5 +422,8 @@ int iavf_set_q_tc_map(struct rte_eth_dev *dev,
uint16_t size);
 void iavf_tm_conf_init(struct rte_eth_dev *dev);
 void iavf_tm_conf_uninit(struct rte_eth_dev *dev);
+int iavf_ipsec_crypto_request(struct iavf_adapter *adapter,
+   uint8_t *msg, size_t msg_len,
+   uint8_t *resp
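
The two-stage exchange described in the commit message (an acknowledgement followed by an asynchronous response) can be modelled with a pending-command slot plus a response counter. The sketch below uses C11 atomics instead of rte_atomic32_t, and the way the counter is consumed in cmd_on_response() is an assumption, since that part of the patch is not visible here. All names (cmd_state, cmd_begin, cmd_on_response) are hypothetical. Unlike the hunk above, the sketch only sets the counter when the slot was actually claimed.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { OP_NONE = 0 };

struct cmd_state {
	atomic_int pending_op;     /* OP_NONE when no command is in flight */
	atomic_int responses_left; /* messages still expected for pending_op */
};

/* Claim the command slot; fails if another command is still pending. */
static bool
cmd_begin(struct cmd_state *s, int op, int expected_responses)
{
	int expected = OP_NONE;

	if (!atomic_compare_exchange_strong(&s->pending_op, &expected, op))
		return false;

	atomic_store(&s->responses_left, expected_responses);
	return true;
}

/*
 * Account for one message from the PF. Returns true when it was the last
 * expected message, in which case the command slot is released.
 */
static bool
cmd_on_response(struct cmd_state *s)
{
	if (atomic_fetch_sub(&s->responses_left, 1) == 1) {
		atomic_store(&s->pending_op, OP_NONE);
		return true;
	}

	return false;
}
```

A regular command would call cmd_begin() with one expected response, while an IPsec crypto request would use two, matching the values 1 and 2 written to pend_cmd_count in the patch.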

[dpdk-dev] [PATCH v2 3/4] net/iavf: Add xstats support for inline IPsec crypto

2021-09-15 Thread Radu Nicolau
Add per queue counters for maintaining statistics for inline IPsec
crypto offload, which can be retrieved through the
rte_security_session_stats_get() with more detailed errors through the
rte_ethdev xstats.

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 drivers/net/iavf/iavf.h| 21 -
 drivers/net/iavf/iavf_ethdev.c | 84 --
 drivers/net/iavf/iavf_rxtx.h   | 12 -
 3 files changed, 89 insertions(+), 28 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index 934ef48278..d5f574b4b3 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -92,6 +92,25 @@ struct iavf_adapter;
 struct iavf_rx_queue;
 struct iavf_tx_queue;
 
+
+struct iavf_ipsec_crypto_stats {
+   uint64_t icount;
+   uint64_t ibytes;
+   struct {
+   uint64_t count;
+   uint64_t sad_miss;
+   uint64_t not_processed;
+   uint64_t icv_check;
+   uint64_t ipsec_length;
+   uint64_t misc;
+   } ierrors;
+};
+
+struct iavf_eth_xstats {
+   struct virtchnl_eth_stats eth_stats;
+   struct iavf_ipsec_crypto_stats ips_stats;
+};
+
 /* Structure that defines a VSI, associated with a adapter. */
 struct iavf_vsi {
struct iavf_adapter *adapter; /* Backreference to associated adapter */
@@ -101,7 +120,7 @@ struct iavf_vsi {
uint16_t max_macaddrs;   /* Maximum number of MAC addresses */
uint16_t base_vector;
uint16_t msix_intr;  /* The MSIX interrupt binds to VSI */
-   struct virtchnl_eth_stats eth_stats_offset;
+   struct iavf_eth_xstats eth_stats_offset;
 };
 
 struct rte_flow;
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 8a562e0942..b8b8d2e394 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -89,6 +89,7 @@ static const uint32_t *iavf_dev_supported_ptypes_get(struct 
rte_eth_dev *dev);
 static int iavf_dev_stats_get(struct rte_eth_dev *dev,
 struct rte_eth_stats *stats);
 static int iavf_dev_stats_reset(struct rte_eth_dev *dev);
+static int iavf_dev_xstats_reset(struct rte_eth_dev *dev);
 static int iavf_dev_xstats_get(struct rte_eth_dev *dev,
 struct rte_eth_xstat *xstats, unsigned int n);
 static int iavf_dev_xstats_get_names(struct rte_eth_dev *dev,
@@ -144,21 +145,37 @@ struct rte_iavf_xstats_name_off {
unsigned int offset;
 };
 
+#define _OFF_OF(a) offsetof(struct iavf_eth_xstats, a)
 static const struct rte_iavf_xstats_name_off rte_iavf_stats_strings[] = {
-   {"rx_bytes", offsetof(struct iavf_eth_stats, rx_bytes)},
-   {"rx_unicast_packets", offsetof(struct iavf_eth_stats, rx_unicast)},
-   {"rx_multicast_packets", offsetof(struct iavf_eth_stats, rx_multicast)},
-   {"rx_broadcast_packets", offsetof(struct iavf_eth_stats, rx_broadcast)},
-   {"rx_dropped_packets", offsetof(struct iavf_eth_stats, rx_discards)},
+   {"rx_bytes", _OFF_OF(eth_stats.rx_bytes)},
+   {"rx_unicast_packets", _OFF_OF(eth_stats.rx_unicast)},
+   {"rx_multicast_packets", _OFF_OF(eth_stats.rx_multicast)},
+   {"rx_broadcast_packets", _OFF_OF(eth_stats.rx_broadcast)},
+   {"rx_dropped_packets", _OFF_OF(eth_stats.rx_discards)},
{"rx_unknown_protocol_packets", offsetof(struct iavf_eth_stats,
rx_unknown_protocol)},
-   {"tx_bytes", offsetof(struct iavf_eth_stats, tx_bytes)},
-   {"tx_unicast_packets", offsetof(struct iavf_eth_stats, tx_unicast)},
-   {"tx_multicast_packets", offsetof(struct iavf_eth_stats, tx_multicast)},
-   {"tx_broadcast_packets", offsetof(struct iavf_eth_stats, tx_broadcast)},
-   {"tx_dropped_packets", offsetof(struct iavf_eth_stats, tx_discards)},
-   {"tx_error_packets", offsetof(struct iavf_eth_stats, tx_errors)},
+   {"tx_bytes", _OFF_OF(eth_stats.tx_bytes)},
+   {"tx_unicast_packets", _OFF_OF(eth_stats.tx_unicast)},
+   {"tx_multicast_packets", _OFF_OF(eth_stats.tx_multicast)},
+   {"tx_broadcast_packets", _OFF_OF(eth_stats.tx_broadcast)},
+   {"tx_dropped_packets", _OFF_OF(eth_stats.tx_discards)},
+   {"tx_error_packets", _OFF_OF(eth_stats.tx_errors)},
+
+   {"inline_ipsec_crypto_ipackets", _OFF_OF(ips_stats.icount)},
+   {"inline_ipsec_crypto_ibytes", _OFF_OF(ips_stats.ibytes)},
+   {"inline_ipsec_crypto_ierrors", _OFF_OF(ips_stats.ierrors.count)},
+   {"inline_ipsec_crypto_ierrors_sad_lookup",
+   _OFF_OF(ips_stats.ierrors.sad_miss)},
+   {"inline_ipsec_crypto_ierrors_not_processed",
+   _OFF_OF(ips_stats.ierrors.not_processed)},
+   {"inline_ipsec_crypto_ierrors_icv_fail",
+   _OFF_OF(ips_stats.ierrors.icv_check)},
+   {"inline_ipsec_crypto_ierrors_length",
+   _OFF_OF(ips_stats.ierrors.ipsec_length)},
+   {"inline_ipsec_crypto_ierrors_misc",
+  
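
The name/offset table in this hunk relies on offsetof() with nested members, so that one table can cover both the Ethernet and the IPsec counters. A stripped-down sketch of the technique follows. The struct layouts, stat names and the xstat_read() helper are simplified stand-ins for illustration and do not reproduce the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the two groups of counters. */
struct eth_stats_sketch {
	uint64_t rx_bytes;
	uint64_t tx_bytes;
};

struct ipsec_stats_sketch {
	uint64_t icount;
	struct {
		uint64_t count;
		uint64_t sad_miss;
	} ierrors;
};

struct xstats_sketch {
	struct eth_stats_sketch eth;
	struct ipsec_stats_sketch ips;
};

struct xstat_name_off {
	const char *name;
	size_t offset;
};

/* Offset of a (possibly nested) member inside struct xstats_sketch. */
#define XOFF(member) offsetof(struct xstats_sketch, member)

static const struct xstat_name_off xstat_table[] = {
	{"rx_bytes", XOFF(eth.rx_bytes)},
	{"tx_bytes", XOFF(eth.tx_bytes)},
	{"ipsec_ipackets", XOFF(ips.icount)},
	{"ipsec_ierrors", XOFF(ips.ierrors.count)},
	{"ipsec_ierrors_sad_miss", XOFF(ips.ierrors.sad_miss)},
};

/* Look a counter up by name and copy its value out via the stored offset. */
static int
xstat_read(const struct xstats_sketch *st, const char *name, uint64_t *value)
{
	size_t i;

	for (i = 0; i < sizeof(xstat_table) / sizeof(xstat_table[0]); i++) {
		if (strcmp(xstat_table[i].name, name) == 0) {
			memcpy(value, (const char *)st + xstat_table[i].offset,
			       sizeof(*value));
			return 0;
		}
	}

	return -1;
}
```

Because every offset is computed against the same outer struct, adding a new group of counters only requires a new member and new table rows.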

[dpdk-dev] [PATCH v2 4/4] net/iavf: add watchdog for VFLR

2021-09-15 Thread Radu Nicolau
Add a watchdog to the iAVF PMD which supports monitoring the VFLR
register. If the device is not already in reset and a VF reset in
progress is detected, notify the user through a callback and enter the
reset state. If the device is already in reset, poll for completion of
the reset.

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 drivers/net/iavf/iavf.h|  6 +++
 drivers/net/iavf/iavf_ethdev.c | 97 ++
 2 files changed, 103 insertions(+)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index d5f574b4b3..4481d2e134 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -212,6 +212,12 @@ struct iavf_info {
int cmd_retval; /* return value of the cmd response from PF */
uint8_t *aq_resp; /* buffer to store the adminq response from PF */
 
+   struct {
+   uint8_t enabled:1;
+   uint64_t period_us;
+   } watchdog;
+   /** iAVF watchdog configuration */
+
/* Event from pf */
bool dev_closed;
bool link_up;
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index b8b8d2e394..1c9b58293e 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iavf.h"
 #include "iavf_rxtx.h"
@@ -239,6 +240,94 @@ iavf_tm_ops_get(struct rte_eth_dev *dev __rte_unused,
return 0;
 }
 
+
+static int
+iavf_vfr_inprogress(struct iavf_hw *hw)
+{
+   int inprogress = 0;
+
+   if ((IAVF_READ_REG(hw, IAVF_VFGEN_RSTAT) &
+   IAVF_VFGEN_RSTAT_VFR_STATE_MASK) ==
+   VIRTCHNL_VFR_INPROGRESS)
+   inprogress = 1;
+
+   if (inprogress)
+   PMD_DRV_LOG(INFO, "Watchdog detected VFR in progress");
+
+   return inprogress;
+}
+
+static void
+iavf_dev_watchdog(void *cb_arg)
+{
+   struct iavf_adapter *adapter = cb_arg;
+   struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(adapter);
+   int vfr_inprogress = 0, rc = 0;
+
+   /* check if watchdog has been disabled since last call */
+   if (!adapter->vf.watchdog.enabled)
+   return;
+
+   /* If in reset then poll vfr_inprogress register for completion */
+   if (adapter->vf.vf_reset) {
+   vfr_inprogress = iavf_vfr_inprogress(hw);
+
+   if (!vfr_inprogress) {
+   PMD_DRV_LOG(INFO, "VF \"%s\" reset has completed",
+   adapter->eth_dev->data->name);
+   adapter->vf.vf_reset = false;
+   }
+   /* If not in reset then poll vfr_inprogress register for VFLR event */
+   } else {
+   vfr_inprogress = iavf_vfr_inprogress(hw);
+
+   if (vfr_inprogress) {
+   PMD_DRV_LOG(INFO,
+   "VF \"%s\" reset event has been detected by 
watchdog",
+   adapter->eth_dev->data->name);
+
+   /* enter reset state with VFLR event */
+   adapter->vf.vf_reset = true;
+
+   rte_eth_dev_callback_process(adapter->eth_dev,
+   RTE_ETH_EVENT_INTR_RESET, NULL);
+   }
+   }
+
+   /* re-alarm watchdog */
+   rc = rte_eal_alarm_set(adapter->vf.watchdog.period_us,
+   &iavf_dev_watchdog, cb_arg);
+
+   if (rc)
+   PMD_DRV_LOG(ERR, "Failed \"%s\" to reset device watchdog alarm",
+   adapter->eth_dev->data->name);
+}
+
+static void
+iavf_dev_watchdog_enable(struct iavf_adapter *adapter, uint64_t period_us)
+{
+   int rc;
+
+   PMD_DRV_LOG(INFO, "Enabling device watchdog");
+
+   adapter->vf.watchdog.enabled = 1;
+   adapter->vf.watchdog.period_us = period_us;
+
+   rc = rte_eal_alarm_set(adapter->vf.watchdog.period_us,
+   &iavf_dev_watchdog, (void *)adapter);
+   if (rc)
+   PMD_DRV_LOG(ERR, "Failed to enable device watchdog");
+}
+
+static void
+iavf_dev_watchdog_disable(struct iavf_adapter *adapter)
+{
+   PMD_DRV_LOG(INFO, "Disabling device watchdog");
+
+   adapter->vf.watchdog.enabled = 0;
+   adapter->vf.watchdog.period_us = 0;
+}
+
 static int
 iavf_set_mc_addr_list(struct rte_eth_dev *dev,
struct rte_ether_addr *mc_addrs,
@@ -2423,6 +2512,11 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
 
iavf_default_rss_disable(adapter);
 
+
+   /* Start device watchdog, set polling period to 500us */
+   iavf_dev_watchdog_enable(adapter, 500);
+
+
return 0;
 }
 
@@ -2493,6 +2587,9 @@ iavf_dev_close(struct rte_eth_dev *dev)
if (vf->vf_reset && !rte_pci_set_bus_master(pci_dev, true))
vf->vf_reset = false;
 
+   /* disable watchdog */
+   iavf_dev_watchdog_disable(adapter);
+
return ret;
 }
 
-- 
2.25.1
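
The watchdog callback in the patch above is a small two-state machine that re-arms itself on every tick. A minimal sketch of that logic, assuming a simplified model in which the register read is passed in as a boolean; the names wd_model and wd_model_tick are hypothetical and are not part of the iavf driver:

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the per-VF watchdog state. */
struct wd_model {
	bool enabled;              /* mirrors vf.watchdog.enabled */
	bool vf_reset;             /* mirrors vf.vf_reset */
	unsigned int reset_events; /* counts reset notifications sent */
};

/*
 * One watchdog tick. vfr_inprogress stands in for the register read
 * done by iavf_vfr_inprogress() in the patch.
 * Returns true when the callback would re-arm its alarm.
 */
static bool
wd_model_tick(struct wd_model *wd, bool vfr_inprogress)
{
	/* Disabled since the last call: stop without re-arming. */
	if (!wd->enabled)
		return false;

	if (wd->vf_reset) {
		/* In reset: wait for the in-progress flag to clear. */
		if (!vfr_inprogress)
			wd->vf_reset = false;
	} else if (vfr_inprogress) {
		/* Not in reset: a set flag means a new reset event. */
		wd->vf_reset = true;
		wd->reset_events++;
	}

	return true;
}
```

In this model a reset that stays in progress over several ticks is reported once, and clearing the enabled flag stops the chain at the next tick rather than immediately.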



[dpdk-dev] [PATCH v3] app/testpmd: add command to print representor info

2021-09-15 Thread Andrew Rybchenko
From: Viacheslav Galaktionov 

Make it simpler to debug configurations and code related to the representor
info API.

Signed-off-by: Viacheslav Galaktionov 
Signed-off-by: Andrew Rybchenko 
Reviewed-by: Andy Moreton 
Reviewed-by: Xueming(Steven) Li 
---
v3:
- change command to "show port info (port_id) representor"

v2:
- change output format to log just one line per range

 app/test-pmd/cmdline.c | 137 +
 1 file changed, 137 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 7dd3965d6f..2f24d7 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -162,6 +162,10 @@ static void cmd_help_long_parsed(void *parsed_result,
"show port (info|stats|summary|xstats|fdir|dcb_tc) (port_id|all)\n"
"Display information for port_id, or all.\n\n"
 
+   "show port info (port_id) representor\n"
+   "Show supported representors"
+   " for a specific port\n\n"
+
"show port port_id (module_eeprom|eeprom)\n"
"Display the module EEPROM or EEPROM information for port_id.\n\n"
 
@@ -7904,6 +7908,138 @@ cmdline_parse_inst_t cmd_showport = {
},
 };
 
+/* *** show port representors information *** */
+struct cmd_representor_info_result {
+   cmdline_fixed_string_t cmd_show;
+   cmdline_fixed_string_t cmd_port;
+   cmdline_fixed_string_t cmd_info;
+   cmdline_fixed_string_t cmd_keyword;
+   portid_t cmd_pid;
+};
+
+static void
+cmd_representor_info_parsed(void *parsed_result,
+   __rte_unused struct cmdline *cl,
+   __rte_unused void *data)
+{
+   struct cmd_representor_info_result *res = parsed_result;
+   struct rte_eth_representor_info *info;
+   struct rte_eth_representor_range *range;
+   uint32_t range_diff;
+   uint32_t i;
+   int ret;
+   int num;
+
+   if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
+   fprintf(stderr, "Invalid port id %u\n", res->cmd_pid);
+   return;
+   }
+
+   ret = rte_eth_representor_info_get(res->cmd_pid, NULL);
+   if (ret < 0) {
+   fprintf(stderr,
+   "Failed to get the number of representor info ranges for port %hu: %s\n",
+   res->cmd_pid, rte_strerror(-ret));
+   return;
+   }
+   num = ret;
+
+   info = calloc(1, sizeof(*info) + num * sizeof(info->ranges[0]));
+   if (info == NULL) {
+   fprintf(stderr,
+   "Failed to allocate memory for representor info for port %hu\n",
+   res->cmd_pid);
+   return;
+   }
+   info->nb_ranges_alloc = num;
+
+   ret = rte_eth_representor_info_get(res->cmd_pid, info);
+   if (ret < 0) {
+   fprintf(stderr,
+   "Failed to get the representor info for port %hu: %s\n",
+   res->cmd_pid, rte_strerror(-ret));
+   free(info);
+   return;
+   }
+
+   printf("Port controller: %hu\n", info->controller);
+   printf("Port PF: %hu\n", info->pf);
+
+   printf("Ranges: %u\n", info->nb_ranges);
+   for (i = 0; i < info->nb_ranges; i++) {
+   range = &info->ranges[i];
+   range_diff = range->id_end - range->id_base;
+
+   printf("%u. ", i + 1);
+   printf("'%s' ", range->name);
+   if (range_diff > 0)
+   printf("[%u-%u]: ", range->id_base, range->id_end);
+   else
+   printf("[%u]: ", range->id_base);
+
+   printf("Controller %d, PF %d", range->controller, range->pf);
+
+   switch (range->type) {
+   case RTE_ETH_REPRESENTOR_NONE:
+   printf(", NONE\n");
+   break;
+   case RTE_ETH_REPRESENTOR_VF:
+   if (range_diff > 0) {
+   printf(", VF %d..%d\n", range->vf,
+  range->vf + range_diff);
+   } else {
+   printf(", VF %d\n", range->vf);
+   }
+   break;
+   case RTE_ETH_REPRESENTOR_SF:
+   printf(", SF %d\n", range->sf);
+   break;
+   case RTE_ETH_REPRESENTOR_PF:
+   if (range_diff > 0)
+   printf("..%d\n", range->pf + range_diff);
+   else
+   printf("\n");
+   break;
+   default:
+   printf(", UNKNOWN TYPE %d\n", range->type);
+   break;
+   }
+   }
+
+   free(info);
+}
+
+cmdline_parse_token_string_t cmd_representor_info_show =
+

Re: [dpdk-dev] [PATCH v2] app/testpmd: add command to print representor info

2021-09-15 Thread Andrew Rybchenko
On 9/14/21 7:36 PM, Ferruh Yigit wrote:
> On 9/14/2021 5:17 PM, Andrew Rybchenko wrote:
>> On 9/14/21 6:52 PM, Ferruh Yigit wrote:
>>> On 8/31/2021 5:12 PM, Andrew Rybchenko wrote:
>>>> From: Viacheslav Galaktionov
>>>>
>>>> Make it simpler to debug configurations and code related to the representor
>>>> info API.
>>>>
>>>> Signed-off-by: Viacheslav Galaktionov
>>>> Signed-off-by: Andrew Rybchenko
>>>> Reviewed-by: Andy Moreton
>>>> ---
>>>> v2:
>>>> - change output format to log just one line per range
>>>>
>>>>  app/test-pmd/cmdline.c | 135 +
>>>>  1 file changed, 135 insertions(+)
>>>>
>>>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>>>> index 82253bc751..ae700f9dd1 100644
>>>> --- a/app/test-pmd/cmdline.c
>>>> +++ b/app/test-pmd/cmdline.c
>>>> @@ -236,6 +236,10 @@ static void cmd_help_long_parsed(void *parsed_result,
>>>>  "Show port supported ptypes"
>>>>  " for a specific port\n\n"
>>>>
>>>> +  "show port (port_id) representor info\n"
>>>> +  "Show supported representors"
>>>> +  " for a specific port\n\n"
>>>> +
>>>
>>> What do you think about extending the existing "show port info #" command
>>> instead of creating a new command for it?
>>
>> My fear with such an approach is that the output of "show port
>> info #" is already too long, and adding representor info
>> there will make it even longer.
>>
> 
> That is a fair concern. What about extending the existing command with a new
> keyword to print just the representor info:
> "show port info # representor"

Good idea, see v3.

>>> Since "show port info #" is a well known command, it can simplify the usage.
>>> When port is representor port it can display additional info.
>>>
>>
>> Just to be clear: it will output information for "backer"
>> (or parent) port which should be used to create representors.
>>



Re: [dpdk-dev] [PATCH v21 4/7] dmadev: introduce DMA device library implementation

2021-09-15 Thread Kevin Laatz

On 07/09/2021 13:56, Chengwen Feng wrote:

This patch introduces the DMA device library implementation, which includes
configuration and I/O with the DMA devices.

Signed-off-by: Chengwen Feng 
Acked-by: Bruce Richardson 
Acked-by: Morten Brørup 
Reviewed-by: Kevin Laatz 
Reviewed-by: Conor Walsh 
---
  config/rte_config.h  |   3 +
  lib/dmadev/meson.build   |   1 +
  lib/dmadev/rte_dmadev.c  | 607 +++
  lib/dmadev/rte_dmadev.h  | 118 ++-
  lib/dmadev/rte_dmadev_core.h |   2 +
  lib/dmadev/version.map   |   1 +
  6 files changed, 720 insertions(+), 12 deletions(-)
  create mode 100644 lib/dmadev/rte_dmadev.c


[snip]


  /**
   * @warning
@@ -941,10 +1018,27 @@ rte_dmadev_completed(uint16_t dev_id, uint16_t vchan, const uint16_t nb_cpls,
   *   status array are also set.
   */
  __rte_experimental
-uint16_t
+static inline uint16_t
  rte_dmadev_completed_status(uint16_t dev_id, uint16_t vchan,
const uint16_t nb_cpls, uint16_t *last_idx,
-   enum rte_dma_status_code *status);
+   enum rte_dma_status_code *status)
+{
+   struct rte_dmadev *dev = &rte_dmadevices[dev_id];
+   uint16_t idx;
+
+#ifdef RTE_DMADEV_DEBUG
+   if (!rte_dmadev_is_valid_dev(dev_id) || !dev->data->dev_started ||
+   vchan >= dev->data->dev_conf.nb_vchans ||
+   nb_cpls == 0 || status == NULL)
+   return 0;
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_status, 0);
+#endif
+
+   if (last_idx == NULL)
+   last_idx = &idx;


Hi Chengwen,

An internal Coverity scan on the IDXD dmadev driver patches flagged a
potential NULL pointer dereference when using completed_status().

IMO it is a false positive for the driver code, since this should be
checked at the library API level. However, the check is also not present
in the library.

For the v22, can you please add a NULL pointer check for status here, like
the one you have for last_idx?


/Kevin


+
+   return (*dev->completed_status)(dev, vchan, nb_cpls, last_idx, status);
+}
  
  #ifdef __cplusplus

  }
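
The two pointer arguments discussed above need different treatment: last_idx is optional and can be redirected to a local dummy, while status is an output array the driver writes to. A minimal sketch of one way to guard both, assuming the wrapper returns 0 when status is NULL (the review does not say which behaviour is wanted); the demo_* types and functions are hypothetical stand-ins, not the dmadev definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the dmadev status enum. */
enum demo_status_code { DEMO_STATUS_OK = 0, DEMO_STATUS_ERROR };

/* Hypothetical driver callback signature. */
typedef uint16_t (*demo_completed_status_t)(uint16_t nb_cpls,
		uint16_t *last_idx, enum demo_status_code *status);

static uint16_t
demo_completed_status(demo_completed_status_t drv_cb, uint16_t nb_cpls,
		uint16_t *last_idx, enum demo_status_code *status)
{
	uint16_t idx;

	/* status is written by the driver, so reject NULL unconditionally. */
	if (nb_cpls == 0 || status == NULL)
		return 0;

	/* last_idx is optional: point it at a local dummy when absent. */
	if (last_idx == NULL)
		last_idx = &idx;

	return drv_cb(nb_cpls, last_idx, status);
}

/* Hypothetical driver callback that dereferences both pointers. */
static uint16_t
demo_drv_cb(uint16_t nb_cpls, uint16_t *last_idx,
		enum demo_status_code *status)
{
	uint16_t i;

	for (i = 0; i < nb_cpls; i++)
		status[i] = DEMO_STATUS_OK;
	*last_idx = (uint16_t)(nb_cpls - 1);

	return nb_cpls;
}
```

With the guard in the wrapper, the driver callback can dereference both pointers without its own checks, which is the division of responsibility the review asks for.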



[dpdk-dev] logs about hugepages detection

2021-09-15 Thread Thomas Monjalon
Hi,

I would like to discuss some issues in logging of hugepage lookup.
The issues to be discussed will be enumerated and numbered below.
I will take an example of an x86 machine with 2M and 1G pages.
I reserve only 2M pages:

usertools/dpdk-hugepages.py -p 2M -r 80M

If I start a DPDK application with --log-level info
the only message I read makes me think something is wrong:

EAL: No available 1048576 kB hugepages reported

1/ Log level is too high.

If I start with EAL in debug level, I can see which page size is used:

--log-level debug --log-level lib.eal:debug

EAL: No available 1048576 kB hugepages reported
[...]
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152

2/ The positive message should be at the same level as the negative one.

3/ The sizes are sometimes written in bytes, sometimes in kB.
It should be always the highest unit, including GB.

When using the --in-memory mode, things are worse:

EAL: No available 1048576 kB hugepages reported
EAL: In-memory mode enabled, hugepages of size 1073741824 bytes will be allocated anonymously
EAL: No free 1048576 kB hugepages reported on node 0
EAL: No available 1048576 kB hugepages reported
[...]
EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
EAL: Detected memory type: socket_id:0 hugepage_sz:2097152

4/ The unavailability of 1G should be reported only once.

5/ If non-reserved pages can be used without reservation,
it should be better documented.

Please correct me if I'm wrong, and give your opinion.
I could work on some patches if needed.
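
Point 3/ could be handled by a single formatting helper used by every log message. A minimal sketch, assuming binary multiples labelled kB/MB/GB as in the messages quoted above; the function name format_page_size is hypothetical and does not exist in EAL:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write a page size using the highest unit that divides it exactly,
 * e.g. 2097152 becomes "2 MB". Returns the snprintf() result.
 */
static int
format_page_size(uint64_t bytes, char *buf, size_t len)
{
	static const char *const units[] = { "B", "kB", "MB", "GB", "TB" };
	size_t u = 0;

	/* Step up one unit at a time while the value stays a whole number. */
	while (u + 1 < sizeof(units) / sizeof(units[0]) &&
			bytes >= 1024 && bytes % 1024 == 0) {
		bytes /= 1024;
		u++;
	}

	return snprintf(buf, len, "%" PRIu64 " %s", bytes, units[u]);
}
```

Values that are not an exact multiple of 1024 stay in the lower unit, so no precision is lost in the log.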




[dpdk-dev] [PATCH v2 0/9] IPsec Sec GW new features

2021-09-15 Thread Radu Nicolau
Update the IPsec sample app with new features and updates:
- egress TSO support
- telemetry support
- add reset callback
- stats screen configurable as a command line parameter
- UDP encapsulation support for inline crypto
- ESN with configurable start value

Depends on series 18837 ('new features for ipsec and security libraries')

Radu Nicolau (9):
  examples/ipsec-secgw: update create inline session
  examples/ipsec-secgw: update SA parameters with L3 options
  examples/ipsec-secgw: add support for telemetry
  examples/ipsec-secgw: add stats interval argument
  examples/ipsec-secgw: add support for TSO
  examples/ipsec-secgw: add support for defining initial sequence number
value
  examples/ipsec-secgw: add ethdev reset callback
  examples/ipsec-secgw: add support for additional algorithms
  examples/ipsec-secgw: add support for inline crypto UDP encapsulation

 doc/guides/sample_app_ug/ipsec_secgw.rst |  36 ++
 examples/ipsec-secgw/ipsec-secgw.c   | 408 +--
 examples/ipsec-secgw/ipsec-secgw.h   |  48 ++-
 examples/ipsec-secgw/ipsec.c |  95 +-
 examples/ipsec-secgw/ipsec.h |  14 +-
 examples/ipsec-secgw/meson.build |   2 +-
 examples/ipsec-secgw/sa.c| 305 +++--
 7 files changed, 816 insertions(+), 92 deletions(-)

-- 
v2: reworked the patchset to improve quality and address feedback

2.25.1



[dpdk-dev] [PATCH v2 1/9] examples/ipsec-secgw: update create inline session

2021-09-15 Thread Radu Nicolau
Rework the create inline session function to update the session
configuration parameters before the session is created.
Also increase the RSS key array size to prevent buffer overflows
with PMDs that copy more than 40 bytes.

Signed-off-by: Radu Nicolau 
---
 examples/ipsec-secgw/ipsec.c | 56 ++--
 1 file changed, 48 insertions(+), 8 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index 5b032fecfb..0af49f3f4b 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -167,21 +167,61 @@ create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa *sa,
.action_type = ips->type,
.protocol = RTE_SECURITY_PROTOCOL_IPSEC,
{.ipsec = {
-   .spi = sa->spi,
+   .spi = rte_cpu_to_be_32(sa->spi),
.salt = sa->salt,
.options = { 0 },
.replay_win_sz = 0,
.direction = sa->direction,
-   .proto = RTE_SECURITY_IPSEC_SA_PROTO_ESP,
-   .mode = (sa->flags == IP4_TUNNEL ||
-   sa->flags == IP6_TUNNEL) ?
-   RTE_SECURITY_IPSEC_SA_MODE_TUNNEL :
-   RTE_SECURITY_IPSEC_SA_MODE_TRANSPORT,
+   .proto = RTE_SECURITY_IPSEC_SA_PROTO_ESP
} },
.crypto_xform = sa->xforms,
.userdata = NULL,
};
 
+   if (IS_TRANSPORT(sa->flags)) {
+   sess_conf.ipsec.mode = RTE_SECURITY_IPSEC_SA_MODE_TRANSPORT;
+   if (IS_IP4(sa->flags)) {
+   sess_conf.ipsec.tunnel.type =
+   RTE_SECURITY_IPSEC_TUNNEL_IPV4;
+
+   sess_conf.ipsec.tunnel.ipv4.src_ip.s_addr =
+   sa->src.ip.ip4;
+   sess_conf.ipsec.tunnel.ipv4.dst_ip.s_addr =
+   sa->dst.ip.ip4;
+   } else if (IS_IP6(sa->flags)) {
+   sess_conf.ipsec.tunnel.type =
+   RTE_SECURITY_IPSEC_TUNNEL_IPV6;
+
+   memcpy(sess_conf.ipsec.tunnel.ipv6.src_addr.s6_addr,
+   sa->src.ip.ip6.ip6_b, 16);
+   memcpy(sess_conf.ipsec.tunnel.ipv6.dst_addr.s6_addr,
+   sa->dst.ip.ip6.ip6_b, 16);
+   }
+   } else if (IS_TUNNEL(sa->flags)) {
+   sess_conf.ipsec.mode = RTE_SECURITY_IPSEC_SA_MODE_TUNNEL;
+
+   if (IS_IP4(sa->flags)) {
+   sess_conf.ipsec.tunnel.type =
+   RTE_SECURITY_IPSEC_TUNNEL_IPV4;
+
+   sess_conf.ipsec.tunnel.ipv4.src_ip.s_addr =
+   sa->src.ip.ip4;
+   sess_conf.ipsec.tunnel.ipv4.dst_ip.s_addr =
+   sa->dst.ip.ip4;
+   } else if (IS_IP6(sa->flags)) {
+   sess_conf.ipsec.tunnel.type =
+   RTE_SECURITY_IPSEC_TUNNEL_IPV6;
+
+   memcpy(sess_conf.ipsec.tunnel.ipv6.src_addr.s6_addr,
+   sa->src.ip.ip6.ip6_b, 16);
+   memcpy(sess_conf.ipsec.tunnel.ipv6.dst_addr.s6_addr,
+   sa->dst.ip.ip6.ip6_b, 16);
+   } else {
+   RTE_LOG(ERR, IPSEC, "invalid tunnel type\n");
+   return -1;
+   }
+   }
+
RTE_LOG_DP(DEBUG, IPSEC, "Create session for SA spi %u on port %u\n",
sa->spi, sa->portid);
 
@@ -267,10 +307,10 @@ create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa *sa,
sa->attr.ingress = (sa->direction ==
RTE_SECURITY_IPSEC_SA_DIR_INGRESS);
if (sa->attr.ingress) {
-   uint8_t rss_key[40];
+   uint8_t rss_key[64];
struct rte_eth_rss_conf rss_conf = {
.rss_key = rss_key,
-   .rss_key_len = 40,
+   .rss_key_len = sizeof(rss_key),
};
struct rte_eth_dev_info dev_info;
uint16_t queue[RTE_MAX_QUEUES_PER_PORT];
-- 
2.25.1



[dpdk-dev] [PATCH v2 2/9] examples/ipsec-secgw: update SA parameters with L3 options

2021-09-15 Thread Radu Nicolau
Set the L3 offset and L3 length in the SA parameters

Signed-off-by: Radu Nicolau 
---
 examples/ipsec-secgw/sa.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index 17a28556c9..7fb8fef264 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -1316,11 +1316,15 @@ fill_ipsec_sa_prm(struct rte_ipsec_sa_prm *prm, const struct ipsec_sa *ss,
 
if (IS_IP4_TUNNEL(ss->flags)) {
prm->ipsec_xform.tunnel.type = RTE_SECURITY_IPSEC_TUNNEL_IPV4;
+   prm->tun.hdr_l3_len = sizeof(*v4);
+   prm->tun.hdr_l3_off = 0;
prm->tun.hdr_len = sizeof(*v4);
prm->tun.next_proto = rc;
prm->tun.hdr = v4;
} else if (IS_IP6_TUNNEL(ss->flags)) {
prm->ipsec_xform.tunnel.type = RTE_SECURITY_IPSEC_TUNNEL_IPV6;
+   prm->tun.hdr_l3_len = sizeof(*v6);
+   prm->tun.hdr_l3_off = 0;
prm->tun.hdr_len = sizeof(*v6);
prm->tun.next_proto = rc;
prm->tun.hdr = v6;
-- 
2.25.1



[dpdk-dev] [PATCH v2 3/9] examples/ipsec-secgw: add support for telemetry

2021-09-15 Thread Radu Nicolau
Add telemetry support to the IPsec GW sample app

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 doc/guides/sample_app_ug/ipsec_secgw.rst |  11 +
 examples/ipsec-secgw/ipsec-secgw.c   | 365 ++-
 examples/ipsec-secgw/ipsec-secgw.h   |  33 +-
 examples/ipsec-secgw/ipsec.h |   2 +
 examples/ipsec-secgw/meson.build |   2 +-
 examples/ipsec-secgw/sa.c|  15 +-
 6 files changed, 406 insertions(+), 22 deletions(-)

diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst
index 78171b25f9..20bc1e6bc4 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -720,6 +720,17 @@ where each options means:
 
* *udp-encap*
 
+ 
+
+ * Option to enable per SA telemetry.
+   Currently only supported with IPsec library path.
+
+ * Optional: Yes, it is disabled by default
+
+ * Syntax:
+
+   * *telemetry*
+
 Example SA rules:
 
 .. code-block:: console
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index f252d34985..265fff4bef 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "event_helper.h"
 #include "flow.h"
@@ -671,7 +672,7 @@ send_single_packet(struct rte_mbuf *m, uint16_t port, uint8_t proto)
 
 static inline void
 inbound_sp_sa(struct sp_ctx *sp, struct sa_ctx *sa, struct traffic_type *ip,
-   uint16_t lim)
+   uint16_t lim, struct ipsec_spd_stats *stats)
 {
struct rte_mbuf *m;
uint32_t i, j, res, sa_idx;
@@ -688,25 +689,30 @@ inbound_sp_sa(struct sp_ctx *sp, struct sa_ctx *sa, struct traffic_type *ip,
res = ip->res[i];
if (res == BYPASS) {
ip->pkts[j++] = m;
+   stats->bypass++;
continue;
}
if (res == DISCARD) {
free_pkts(&m, 1);
+   stats->discard++;
continue;
}
 
/* Only check SPI match for processed IPSec packets */
if (i < lim && ((m->ol_flags & PKT_RX_SEC_OFFLOAD) == 0)) {
+   stats->discard++;
free_pkts(&m, 1);
continue;
}
 
sa_idx = res - 1;
if (!inbound_sa_check(sa, m, sa_idx)) {
+   stats->discard++;
free_pkts(&m, 1);
continue;
}
ip->pkts[j++] = m;
+   stats->protect++;
}
ip->num = j;
 }
@@ -750,6 +756,7 @@ static inline void
 process_pkts_inbound(struct ipsec_ctx *ipsec_ctx,
struct ipsec_traffic *traffic)
 {
+   unsigned int lcoreid = rte_lcore_id();
uint16_t nb_pkts_in, n_ip4, n_ip6;
 
n_ip4 = traffic->ip4.num;
@@ -765,16 +772,20 @@ process_pkts_inbound(struct ipsec_ctx *ipsec_ctx,
ipsec_process(ipsec_ctx, traffic);
}
 
-   inbound_sp_sa(ipsec_ctx->sp4_ctx, ipsec_ctx->sa_ctx, &traffic->ip4,
-   n_ip4);
+   inbound_sp_sa(ipsec_ctx->sp4_ctx,
+   ipsec_ctx->sa_ctx, &traffic->ip4, n_ip4,
+   &core_statistics[lcoreid].inbound.spd4);
 
-   inbound_sp_sa(ipsec_ctx->sp6_ctx, ipsec_ctx->sa_ctx, &traffic->ip6,
-   n_ip6);
+   inbound_sp_sa(ipsec_ctx->sp6_ctx,
+   ipsec_ctx->sa_ctx, &traffic->ip6, n_ip6,
+   &core_statistics[lcoreid].inbound.spd6);
 }
 
 static inline void
-outbound_sp(struct sp_ctx *sp, struct traffic_type *ip,
-   struct traffic_type *ipsec)
+outbound_spd_lookup(struct sp_ctx *sp,
+   struct traffic_type *ip,
+   struct traffic_type *ipsec,
+   struct ipsec_spd_stats *stats)
 {
struct rte_mbuf *m;
uint32_t i, j, sa_idx;
@@ -785,17 +796,23 @@ outbound_sp(struct sp_ctx *sp, struct traffic_type *ip,
rte_acl_classify((struct rte_acl_ctx *)sp, ip->data, ip->res,
ip->num, DEFAULT_MAX_CATEGORIES);
 
-   j = 0;
-   for (i = 0; i < ip->num; i++) {
+   for (i = 0, j = 0; i < ip->num; i++) {
m = ip->pkts[i];
sa_idx = ip->res[i] - 1;
-   if (ip->res[i] == DISCARD)
+
+   if (unlikely(ip->res[i] == DISCARD)) {
free_pkts(&m, 1);
-   else if (ip->res[i] == BYPASS)
+
+   stats->discard++;
+   } else if (unlikely(ip->res[i] == BYPASS)) {
ip->pkts[j++] = m;
-   else {
+
+   stats->bypass++;
+   } else {
ipsec->res[ipsec->num] = sa_idx;
ipsec->pkts[ipsec->num++] = m;
+
+ 

[dpdk-dev] [PATCH v2 4/9] examples/ipsec-secgw: add stats interval argument

2021-09-15 Thread Radu Nicolau
Add -t for stats screen update interval, disabled by default.

Signed-off-by: Radu Nicolau 
---
 doc/guides/sample_app_ug/ipsec_secgw.rst |  5 
 examples/ipsec-secgw/ipsec-secgw.c   | 29 
 examples/ipsec-secgw/ipsec-secgw.h   | 15 
 3 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst
index 20bc1e6bc4..0d55e74022 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -127,6 +127,7 @@ The application has a number of command line options::
 -p PORTMASK -P -u PORTMASK -j FRAMESIZE
 -l -w REPLAY_WINDOW_SIZE -e -a
 -c SAD_CACHE_SIZE
+-t STATISTICS_INTERVAL
 -s NUMBER_OF_MBUFS_IN_PACKET_POOL
 -f CONFIG_FILE_PATH
 --config (port,queue,lcore)[,(port,queue,lcore)]
@@ -176,6 +177,10 @@ Where:
 Zero value disables cache.
 Default value: 128.
 
+*   ``-t``: specifies the statistics screen update interval. If set to zero or
+omitted statistics screen is disabled.
+Default value: 0.
+
 *   ``-s``: sets number of mbufs in packet pool, if not provided number of mbufs
 will be calculated based on number of cores, eth ports and crypto queues.
 
diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index 265fff4bef..60b25be872 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -181,6 +181,7 @@ static uint32_t frag_tbl_sz;
 static uint32_t frame_buf_size = RTE_MBUF_DEFAULT_BUF_SIZE;
 static uint32_t mtu_size = RTE_ETHER_MTU;
 static uint64_t frag_ttl_ns = MAX_FRAG_TTL_NS;
+static uint32_t stats_interval;
 
 /* application wide librte_ipsec/SA parameters */
 struct app_sa_prm app_sa_prm = {
@@ -292,7 +293,6 @@ adjust_ipv6_pktlen(struct rte_mbuf *m, const struct rte_ipv6_hdr *iph,
}
 }
 
-#if (STATS_INTERVAL > 0)
 
 /* Print out statistics on packet distribution */
 static void
@@ -352,9 +352,8 @@ print_stats_cb(__rte_unused void *param)
   total_packets_dropped);
printf("\n\n");
 
-   rte_eal_alarm_set(STATS_INTERVAL * US_PER_S, print_stats_cb, NULL);
+   rte_eal_alarm_set(stats_interval * US_PER_S, print_stats_cb, NULL);
 }
-#endif /* STATS_INTERVAL */
 
 static inline void
 prepare_one_packet(struct rte_mbuf *pkt, struct ipsec_traffic *t)
@@ -1435,6 +1434,7 @@ print_usage(const char *prgname)
" [-e]"
" [-a]"
" [-c]"
+   " [-t STATS_INTERVAL]"
" [-s NUMBER_OF_MBUFS_IN_PKT_POOL]"
" -f CONFIG_FILE"
" --config (port,queue,lcore)[,(port,queue,lcore)]"
@@ -1459,6 +1459,8 @@ print_usage(const char *prgname)
"  -a enables SA SQN atomic behaviour\n"
"  -c specifies inbound SAD cache size,\n"
" zero value disables the cache (default value: 128)\n"
+   "  -t specifies statistics screen update interval,\n"
+   " zero disables statistics screen (default value: 0)\n"
"  -s number of mbufs in packet pool, if not specified number\n"
" of mbufs will be calculated based on number of cores,\n"
" ports and crypto queues\n"
@@ -1666,7 +1668,7 @@ parse_args(int32_t argc, char **argv, struct eh_conf *eh_conf)
 
argvopt = argv;
 
-   while ((opt = getopt_long(argc, argvopt, "aelp:Pu:f:j:w:c:s:",
+   while ((opt = getopt_long(argc, argvopt, "aelp:Pu:f:j:w:c:t:s:",
lgopts, &option_index)) != EOF) {
 
switch (opt) {
@@ -1747,6 +1749,15 @@ parse_args(int32_t argc, char **argv, struct eh_conf *eh_conf)
}
app_sa_prm.cache_sz = ret;
break;
+   case 't':
+   ret = parse_decimal(optarg);
+   if (ret < 0) {
+   printf("Invalid interval value: %s\n", optarg);
+   print_usage(prgname);
+   return -1;
+   }
+   stats_interval = ret;
+   break;
case CMD_LINE_OPT_CONFIG_NUM:
ret = parse_config(optarg);
if (ret) {
@@ -3350,11 +3361,11 @@ main(int32_t argc, char **argv)
 
check_all_ports_link_status(enabled_port_mask);
 
-#if (STATS_INTERVAL > 0)
-   rte_eal_alarm_set(STATS_INTERVAL * US_PER_S, print_stats_cb, NULL);
-#else
-   RTE_LOG(INFO, IPSEC, "Stats display disabled\n");
-#endif /* STATS_INTERVAL */
+   if (stats_interval > 0)
+   rte_eal_alarm_set(stats_inter

[dpdk-dev] [PATCH v2 5/9] examples/ipsec-secgw: add support for TSO

2021-09-15 Thread Radu Nicolau
Add support to allow the user to specify the MSS for TSO offload on a
per-SA basis. MSS configuration in the context of IPsec is only supported
for outbound SAs using inline IPsec crypto offload.

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 doc/guides/sample_app_ug/ipsec_secgw.rst | 10 ++
 examples/ipsec-secgw/ipsec.h |  1 +
 examples/ipsec-secgw/sa.c| 15 +++
 3 files changed, 26 insertions(+)

diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst
index 0d55e74022..7727051394 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -736,6 +736,16 @@ where each options means:
 
* *telemetry*
 
+ 
+
+ * Maximum segment size for TSO offload, available for egress SAs only.
+
+ * Optional: Yes, TSO offload not set by default
+
+ * Syntax:
+
+   * *mss N* N is the segment size
+
 Example SA rules:
 
 .. code-block:: console
diff --git a/examples/ipsec-secgw/ipsec.h b/examples/ipsec-secgw/ipsec.h
index a3de8952b6..c3da5fb243 100644
--- a/examples/ipsec-secgw/ipsec.h
+++ b/examples/ipsec-secgw/ipsec.h
@@ -141,6 +141,7 @@ struct ipsec_sa {
enum rte_security_ipsec_sa_direction direction;
uint8_t udp_encap;
uint16_t portid;
+   uint16_t mss;
uint8_t fdir_qid;
uint8_t fdir_flag;
 
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index db5fd46e67..1a53430ec9 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -683,6 +683,16 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
continue;
}
 
+   if (strcmp(tokens[ti], "mss") == 0) {
+   INCREMENT_TOKEN_INDEX(ti, n_tokens, status);
+   if (status->status < 0)
+   return;
+   rule->mss = atoi(tokens[ti]);
+   if (status->status < 0)
+   return;
+   continue;
+   }
+
if (strcmp(tokens[ti], "fallback") == 0) {
struct rte_ipsec_session *fb;
 
@@ -1320,6 +1330,11 @@ fill_ipsec_sa_prm(struct rte_ipsec_sa_prm *prm, const struct ipsec_sa *ss,
prm->ipsec_xform.options.ecn = 1;
prm->ipsec_xform.options.copy_dscp = 1;
 
+   if (ss->mss > 0) {
+   prm->ipsec_xform.options.tso = 1;
+   prm->ipsec_xform.mss = ss->mss;
+   }
+
if (IS_IP4_TUNNEL(ss->flags)) {
prm->ipsec_xform.tunnel.type = RTE_SECURITY_IPSEC_TUNNEL_IPV4;
prm->tun.hdr_l3_len = sizeof(*v4);
-- 
2.25.1



[dpdk-dev] [PATCH v2 6/9] examples/ipsec-secgw: add support for defining initial sequence number value

2021-09-15 Thread Radu Nicolau
Add an esn field to the SA definition block to allow setting the initial ESN value

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 doc/guides/sample_app_ug/ipsec_secgw.rst | 10 ++
 examples/ipsec-secgw/ipsec.c |  5 +
 examples/ipsec-secgw/ipsec.h |  1 +
 examples/ipsec-secgw/sa.c| 15 +++
 4 files changed, 31 insertions(+)

diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst
index 7727051394..dc3ced244d 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -746,6 +746,16 @@ where each options means:
 
* *mss N* N is the segment size
 
+ 
+
+ * Enable ESN and set the initial ESN value.
+
+ * Optional: Yes, ESN not enabled by default
+
+ * Syntax:
+
+   * *esn N* N is the initial ESN value
+
 Example SA rules:
 
 .. code-block:: console
diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index 0af49f3f4b..868089ad3e 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -222,6 +222,11 @@ create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa *sa,
}
}
 
+   if (sa->esn > 0) {
+   sess_conf.ipsec.options.esn = 1;
+   sess_conf.ipsec.esn.value = sa->esn;
+   }
+
RTE_LOG_DP(DEBUG, IPSEC, "Create session for SA spi %u on port %u\n",
sa->spi, sa->portid);
 
diff --git a/examples/ipsec-secgw/ipsec.h b/examples/ipsec-secgw/ipsec.h
index c3da5fb243..2807b41ebb 100644
--- a/examples/ipsec-secgw/ipsec.h
+++ b/examples/ipsec-secgw/ipsec.h
@@ -142,6 +142,7 @@ struct ipsec_sa {
uint8_t udp_encap;
uint16_t portid;
uint16_t mss;
+   uint64_t esn;
uint8_t fdir_qid;
uint8_t fdir_flag;
 
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index 1a53430ec9..cfab416c9c 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -693,6 +693,16 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
continue;
}
 
+   if (strcmp(tokens[ti], "esn") == 0) {
+   INCREMENT_TOKEN_INDEX(ti, n_tokens, status);
+   if (status->status < 0)
+   return;
+   rule->esn = atoll(tokens[ti]);
+   if (status->status < 0)
+   return;
+   continue;
+   }
+
if (strcmp(tokens[ti], "fallback") == 0) {
struct rte_ipsec_session *fb;
 
@@ -1335,6 +1345,11 @@ fill_ipsec_sa_prm(struct rte_ipsec_sa_prm *prm, const struct ipsec_sa *ss,
prm->ipsec_xform.mss = ss->mss;
}
 
+   if (ss->esn > 0) {
+   prm->ipsec_xform.options.esn = 1;
+   prm->ipsec_xform.esn.value = ss->esn;
+   }
+
if (IS_IP4_TUNNEL(ss->flags)) {
prm->ipsec_xform.tunnel.type = RTE_SECURITY_IPSEC_TUNNEL_IPV4;
prm->tun.hdr_l3_len = sizeof(*v4);
-- 
2.25.1



[dpdk-dev] [PATCH v2 7/9] examples/ipsec-secgw: add ethdev reset callback

2021-09-15 Thread Radu Nicolau
Add event handler for ethdev reset callback

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 examples/ipsec-secgw/ipsec-secgw.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c
index 60b25be872..ba8880e363 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -2559,6 +2559,17 @@ inline_ipsec_event_callback(uint16_t port_id, enum rte_eth_event_type type,
return -1;
 }
 
+static int
+ethdev_reset_event_callback(uint16_t port_id,
+   enum rte_eth_event_type type __rte_unused,
+void *param __rte_unused, void *ret_param __rte_unused)
+{
+   printf("Reset Event on port id %d\n", port_id);
+   printf("Force quit application");
+   force_quit = true;
+   return 0;
+}
+
 static uint16_t
 rx_callback(__rte_unused uint16_t port, __rte_unused uint16_t queue,
struct rte_mbuf *pkt[], uint16_t nb_pkts,
@@ -,6 +3344,9 @@ main(int32_t argc, char **argv)
rte_strerror(-ret), portid);
}
 
+   rte_eth_dev_callback_register(portid, RTE_ETH_EVENT_INTR_RESET,
+   ethdev_reset_event_callback, NULL);
+
rte_eth_dev_callback_register(portid,
RTE_ETH_EVENT_IPSEC, inline_ipsec_event_callback, NULL);
}
-- 
2.25.1



[dpdk-dev] [PATCH v2 8/9] examples/ipsec-secgw: add support for additional algorithms

2021-09-15 Thread Radu Nicolau
Add support for AES-GMAC, AES_CTR, AES_XCBC_MAC,
AES_CCM, CHACHA20_POLY1305

Signed-off-by: Declan Doherty 
Signed-off-by: Radu Nicolau 
---
 examples/ipsec-secgw/ipsec.h |   3 +-
 examples/ipsec-secgw/sa.c| 133 ---
 2 files changed, 126 insertions(+), 10 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec.h b/examples/ipsec-secgw/ipsec.h
index 2807b41ebb..3ec3e55170 100644
--- a/examples/ipsec-secgw/ipsec.h
+++ b/examples/ipsec-secgw/ipsec.h
@@ -65,8 +65,7 @@ struct ip_addr {
} ip;
 };
 
-#define MAX_KEY_SIZE   36
-
+#define MAX_KEY_SIZE   96
 /*
  * application wide SA parameters
  */
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index cfab416c9c..bd58edebc9 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -45,6 +45,7 @@ struct supported_cipher_algo {
 struct supported_auth_algo {
const char *keyword;
enum rte_crypto_auth_algorithm algo;
+   uint16_t iv_len;
uint16_t digest_len;
uint16_t key_len;
uint8_t key_not_req;
@@ -97,6 +98,20 @@ const struct supported_cipher_algo cipher_algos[] = {
.block_size = 4,
.key_len = 20
},
+   {
+   .keyword = "aes-192-ctr",
+   .algo = RTE_CRYPTO_CIPHER_AES_CTR,
+   .iv_len = 16,
+   .block_size = 16,
+   .key_len = 28
+   },
+   {
+   .keyword = "aes-256-ctr",
+   .algo = RTE_CRYPTO_CIPHER_AES_CTR,
+   .iv_len = 16,
+   .block_size = 16,
+   .key_len = 36
+   },
{
.keyword = "3des-cbc",
.algo = RTE_CRYPTO_CIPHER_3DES_CBC,
@@ -125,6 +140,31 @@ const struct supported_auth_algo auth_algos[] = {
.algo = RTE_CRYPTO_AUTH_SHA256_HMAC,
.digest_len = 16,
.key_len = 32
+   },
+   {
+   .keyword = "sha384-hmac",
+   .algo = RTE_CRYPTO_AUTH_SHA384_HMAC,
+   .digest_len = 24,
+   .key_len = 48
+   },
+   {
+   .keyword = "sha512-hmac",
+   .algo = RTE_CRYPTO_AUTH_SHA512_HMAC,
+   .digest_len = 32,
+   .key_len = 64
+   },
+   {
+   .keyword = "aes-gmac",
+   .algo = RTE_CRYPTO_AUTH_AES_GMAC,
+   .iv_len = 8,
+   .digest_len = 16,
+   .key_len = 20
+   },
+   {
+   .keyword = "aes-xcbc-mac-96",
+   .algo = RTE_CRYPTO_AUTH_AES_XCBC_MAC,
+   .digest_len = 12,
+   .key_len = 16
}
 };
 
@@ -155,6 +195,42 @@ const struct supported_aead_algo aead_algos[] = {
.key_len = 36,
.digest_len = 16,
.aad_len = 8,
+   },
+   {
+   .keyword = "aes-128-ccm",
+   .algo = RTE_CRYPTO_AEAD_AES_CCM,
+   .iv_len = 8,
+   .block_size = 4,
+   .key_len = 20,
+   .digest_len = 16,
+   .aad_len = 8,
+   },
+   {
+   .keyword = "aes-192-ccm",
+   .algo = RTE_CRYPTO_AEAD_AES_CCM,
+   .iv_len = 8,
+   .block_size = 4,
+   .key_len = 28,
+   .digest_len = 16,
+   .aad_len = 8,
+   },
+   {
+   .keyword = "aes-256-ccm",
+   .algo = RTE_CRYPTO_AEAD_AES_CCM,
+   .iv_len = 8,
+   .block_size = 4,
+   .key_len = 36,
+   .digest_len = 16,
+   .aad_len = 8,
+   },
+   {
+   .keyword = "chacha20-poly1305",
+   .algo = RTE_CRYPTO_AEAD_CHACHA20_POLY1305,
+   .iv_len = 12,
+   .block_size = 64,
+   .key_len = 36,
+   .digest_len = 16,
+   .aad_len = 8,
}
 };
 
@@ -483,6 +559,15 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
if (status->status < 0)
return;
 
+   if (algo->algo == RTE_CRYPTO_AUTH_AES_GMAC) {
+   key_len -= 4;
+   rule->auth_key_len = key_len;
+   rule->iv_len = algo->iv_len;
+   memcpy(&rule->salt,
+   &rule->auth_key[key_len], 4);
+   }
+
+
auth_algo_p = 1;
continue;
}
@@ -1173,8 +1258,20 @@ sa_add_rules(struct sa_ctx *sa_ctx, const struct ipsec_sa entries[],
break;
}
 
-   if (sa->aead_algo == RTE_CRYPTO_AEAD_AES_GCM) {
-   iv_length = 12;
+
+   if (sa->aead_algo == RTE_CRYPTO_AEAD_AES_GCM ||
+   sa->aead_algo == RTE_CRYPTO_AEAD_AES_CC

[dpdk-dev] [PATCH v2 9/9] examples/ipsec-secgw: add support for inline crypto UDP encapsulation

2021-09-15 Thread Radu Nicolau
Enable UDP encapsulation for both transport and tunnel modes for the
inline crypto offload path.

Signed-off-by: Radu Nicolau 
---
 examples/ipsec-secgw/ipsec.c |  34 --
 examples/ipsec-secgw/ipsec.h |   7 +-
 examples/ipsec-secgw/sa.c| 123 ---
 3 files changed, 136 insertions(+), 28 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index 868089ad3e..edc0b21478 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -222,6 +222,13 @@ create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa *sa,
}
}
 
+   if (sa->udp_encap) {
+   sess_conf.ipsec.options.udp_encap = 1;
+
+   sess_conf.ipsec.udp.sport = htons(sa->udp.sport);
+   sess_conf.ipsec.udp.dport = htons(sa->udp.dport);
+   }
+
if (sa->esn > 0) {
sess_conf.ipsec.options.esn = 1;
sess_conf.ipsec.esn.value = sa->esn;
@@ -295,12 +302,31 @@ create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa *sa,
sa->ipv4_spec.hdr.src_addr = sa->src.ip.ip4;
}
 
-   sa->pattern[2].type = RTE_FLOW_ITEM_TYPE_ESP;
-   sa->pattern[2].spec = &sa->esp_spec;
-   sa->pattern[2].mask = &rte_flow_item_esp_mask;
sa->esp_spec.hdr.spi = rte_cpu_to_be_32(sa->spi);
 
-   sa->pattern[3].type = RTE_FLOW_ITEM_TYPE_END;
+   if (sa->udp_encap) {
+
+   sa->udp_spec.hdr.dst_port =
+   rte_cpu_to_be_16(sa->udp.dport);
+   sa->udp_spec.hdr.src_port =
+   rte_cpu_to_be_16(sa->udp.sport);
+
+   sa->pattern[2].mask = &rte_flow_item_udp_mask;
+   sa->pattern[2].type = RTE_FLOW_ITEM_TYPE_UDP;
+   sa->pattern[2].spec = &sa->udp_spec;
+
+   sa->pattern[3].type = RTE_FLOW_ITEM_TYPE_ESP;
+   sa->pattern[3].spec = &sa->esp_spec;
+   sa->pattern[3].mask = &rte_flow_item_esp_mask;
+
+   sa->pattern[4].type = RTE_FLOW_ITEM_TYPE_END;
+   } else {
+   sa->pattern[2].type = RTE_FLOW_ITEM_TYPE_ESP;
+   sa->pattern[2].spec = &sa->esp_spec;
+   sa->pattern[2].mask = &rte_flow_item_esp_mask;
+
+   sa->pattern[3].type = RTE_FLOW_ITEM_TYPE_END;
+   }
 
sa->action[0].type = RTE_FLOW_ACTION_TYPE_SECURITY;
sa->action[0].conf = ips->security.ses;
diff --git a/examples/ipsec-secgw/ipsec.h b/examples/ipsec-secgw/ipsec.h
index 3ec3e55170..5fa4e62f37 100644
--- a/examples/ipsec-secgw/ipsec.h
+++ b/examples/ipsec-secgw/ipsec.h
@@ -128,6 +128,10 @@ struct ipsec_sa {
 
struct ip_addr src;
struct ip_addr dst;
+   struct {
+   uint16_t sport;
+   uint16_t dport;
+   } udp;
uint8_t cipher_key[MAX_KEY_SIZE];
uint16_t cipher_key_len;
uint8_t auth_key[MAX_KEY_SIZE];
@@ -145,7 +149,7 @@ struct ipsec_sa {
uint8_t fdir_qid;
uint8_t fdir_flag;
 
-#define MAX_RTE_FLOW_PATTERN (4)
+#define MAX_RTE_FLOW_PATTERN (5)
 #define MAX_RTE_FLOW_ACTIONS (3)
struct rte_flow_item pattern[MAX_RTE_FLOW_PATTERN];
struct rte_flow_action action[MAX_RTE_FLOW_ACTIONS];
@@ -154,6 +158,7 @@ struct ipsec_sa {
struct rte_flow_item_ipv4 ipv4_spec;
struct rte_flow_item_ipv6 ipv6_spec;
};
+   struct rte_flow_item_udp udp_spec;
struct rte_flow_item_esp esp_spec;
struct rte_flow *flow;
struct rte_security_session_conf sess_conf;
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index bd58edebc9..847ac37b81 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -882,6 +883,11 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
app_sa_prm.udp_encap = 1;
udp_encap_p = 1;
break;
+   case RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO:
+   rule->udp_encap = 1;
+   rule->udp.sport = 0;
+   rule->udp.dport = 4500;
+   break;
default:
APP_CHECK(0, status,
"UDP encapsulation not supported for "
@@ -969,6 +975,8 @@ print_one_sa_rule(const struct ipsec_sa *sa, int inbound)
}
 
printf("mode:");
+   if (sa->udp_encap)
+   printf("UDP encapsulated ");
 
switch (WITHOUT_TRANSPORT_VERSION(sa->flags)) {
  

Re: [dpdk-dev] [PATCH] build: fix building when essential drivers in disable list

2021-09-15 Thread Thomas Monjalon
09/09/2021 15:38, Nicolau, Radu:
> On 8/18/2021 2:42 PM, Bruce Richardson wrote:
> > The PCI and vdev bus drivers cannot be disabled for DPDK builds and
> > special logic is put in place to not skip them when they are specified
> > in the disable list. This logic is broken though, as the inclusion of
> > the driver-specific meson.build file is only included in the "else" leg
> > of the condition check. This means that when they are specified as
> > disabled the PCI and vdev buses are not disabled, but neither are their
> > source files compiled.
> >
> > Fix this by moving the "subdir()" call into the next "if build" block,
> > ensuring that if not disabled the sources are always included. To take
> > account of the fact that the subdir call could itself disable the
> > driver, we add a break call into the following loop to ensure we quickly
> > fall through to the following block which stops processing appropriately
> > if the driver is disabled.
> >
> > Fixes: 2e33309ebe03 ("config: enable/disable drivers in Arm builds")
> > Cc: juraj.lin...@pantheon.tech
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> Tested-by: Radu Nicolau 
> Acked-by: Radu Nicolau 

Applied, thanks




[dpdk-dev] [PATCH] telemetry: fix "in-memory" process socket conflicts

2021-09-15 Thread Bruce Richardson
When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it started up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to resolve the error:
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it is not needed.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: sta...@dpdk.org

Reported-by: David Marchand 
Signed-off-by: Bruce Richardson 
---
 lib/telemetry/telemetry.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index 8665db8d03..5be2834757 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -421,15 +421,30 @@ create_socket(char *path)

struct sockaddr_un sun = {.sun_family = AF_UNIX};
strlcpy(sun.sun_path, path, sizeof(sun.sun_path));
-   unlink(sun.sun_path);
+
if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
struct stat st;

-   TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
-   if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode))
+   /* first check if we have a runtime dir */
+   if (stat(socket_dir, &st) < 0 || !S_ISDIR(st.st_mode)) {
TMTY_LOG(ERR, "Cannot access DPDK runtime directory: %s\n", socket_dir);
-   sun.sun_path[0] = 0;
-   goto error;
+   goto error;
+   }
+
+   /* check if current socket is active */
+   if (connect(sock, &sun, sizeof(sun)) == 0) {
+   TMTY_LOG(ERR, "Error binding telemetry socket, path already in use\n");
+   TMTY_LOG(ERR, "Use '--file-prefix' to select a different socket path, or '--no-telemetry' to disable\n");
+   path[0] = 0;
+   goto error;
+   }
+
+   /* socket is not active, delete and attempt rebind */
+   unlink(sun.sun_path);
+   if (bind(sock, (void *) &sun, sizeof(sun)) < 0) {
+   TMTY_LOG(ERR, "Error binding socket: %s\n", strerror(errno));
+   goto error;
+   }
}

if (listen(sock, 1) < 0) {
--
2.30.2
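
For readers following the logic of this fix outside of the diff, here is a
minimal standalone sketch of the same "bind, probe with connect, then unlink
and rebind" sequence. It is not the DPDK code: the helper name
bind_unless_active() is made up for illustration, it uses SOCK_STREAM as an
assumption, and it leaves logging out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of DPDK: bind an AF_UNIX socket to 'path'.
 * Returns the bound fd, or -1 if another process is actively using the
 * path or on any other error.
 */
static int
bind_unless_active(const char *path)
{
	struct sockaddr_un sun = { .sun_family = AF_UNIX };
	int sock;

	assert(path != NULL);
	sock = socket(AF_UNIX, SOCK_STREAM, 0);
	if (sock < 0)
		return -1;
	snprintf(sun.sun_path, sizeof(sun.sun_path), "%s", path);

	/* common case: the path is free and the first bind succeeds */
	if (bind(sock, (struct sockaddr *)&sun, sizeof(sun)) == 0)
		return sock;

	/* bind failed: if a peer accepts our connection, the socket is live */
	if (connect(sock, (struct sockaddr *)&sun, sizeof(sun)) == 0) {
		close(sock);
		return -1;
	}

	/* nobody is listening: treat the file as stale, remove it and retry */
	unlink(sun.sun_path);
	if (bind(sock, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
		close(sock);
		return -1;
	}
	return sock;
}
```

The caller is still expected to call listen() on the returned fd. The point
of probing before unlink() is that a second process never deletes a socket
file that a running process is serving.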



[dpdk-dev] [PATCH v2] net/virtio: wait device ready during reset

2021-09-15 Thread Xueming Li
According to virtio spec, the device MUST reset when 0 is written to
device_status, and present 0 in device_status once reset is done.

This patch waits for the status value to become 0 during the reset
operation. If that does not happen within 3 seconds, it logs a warning
and continues.

Signed-off-by: Xueming Li 
Cc: Andrew Rybchenko 
---
 drivers/net/virtio/virtio.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio.c b/drivers/net/virtio/virtio.c
index 7e1e77797f..d9e642f412 100644
--- a/drivers/net/virtio/virtio.c
+++ b/drivers/net/virtio/virtio.c
@@ -3,7 +3,10 @@
  * Copyright(c) 2020 Red Hat, Inc.
  */
 
+#include 
+
 #include "virtio.h"
+#include "virtio_logs.h"
 
 uint64_t
 virtio_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
@@ -38,9 +41,17 @@ virtio_write_dev_config(struct virtio_hw *hw, size_t offset,
 void
 virtio_reset(struct virtio_hw *hw)
 {
+   uint32_t retry = 0;
+
VIRTIO_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
-   /* flush status write */
-   VIRTIO_OPS(hw)->get_status(hw);
+   /* Flush status write and wait device ready max 3 seconds. */
+   while (VIRTIO_OPS(hw)->get_status(hw) != VIRTIO_CONFIG_STATUS_RESET) {
+   if (retry++ > 3000) {
+   PMD_INIT_LOG(WARNING, "port %u device reset timeout", hw->port_id);
+   break;
+   }
+   usleep(1000L);
+   }
 }
 
 void
-- 
2.33.0



Re: [dpdk-dev] [PATCH v1 0/7] make rte_intr_handle internal

2021-09-15 Thread Harman Kalra
Ping...
Kindly review the series. I would also like to ask PMD maintainers (who use
the interrupt APIs) to validate the series for their respective drivers,
as many drivers underwent interrupt-related changes in patch 5 of the series.

Thanks
Harman

> -Original Message-
> From: Harman Kalra 
> Sent: Friday, September 3, 2021 6:11 PM
> To: dev@dpdk.org
> Cc: Harman Kalra 
> Subject: [PATCH v1 0/7] make rte_intr_handle internal
> 
> Moving struct rte_intr_handle as an internal structure to avoid any ABI
> breakages in future. Since this structure defines some static arrays and
> changing respective macros breaks the ABI.
> Eg:
> Currently RTE_MAX_RXTX_INTR_VEC_ID imposes a limit of maximum 512
> MSI-X interrupts that can be defined for a PCI device, while PCI specification
> allows maximum 2048 MSI-X interrupts that can be used.
> If some PCI device requires more than 512 vectors, either change the
> RTE_MAX_RXTX_INTR_VEC_ID limit or dynamically allocate based on PCI
> device MSI-X size on probe time. Either way its an ABI breakage.
> 
> Change already included in 21.11 ABI improvement spreadsheet (item 42):
> https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.com_s
> preadsheets_d_1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE_edit-
> 23gid-
> 3D0&d=DwICaQ&c=nKjWec2b6R0mOyPaz7xtfQ&r=5ESHPj7V-
> 7JdkxT_Z_SU6RrS37ys4U
> XudBQ_rrS5LRo&m=7dl3OmXU7QHMmWYB6V1hYJtq1cUkjfhXUwze2Si_48c
> &s=lh6DEGhR
> Bg1shODpAy3RQk-H-0uQx5icRfUBf9dtCp4&e=
> 
> 
> This series makes struct rte_intr_handle totally opaque to the outside world
> by wrapping it inside a .c file and providing get set wrapper APIs to read or
> manipulate its fields.. Any changes to be made to any of the fields should be
> done via these get set APIs.
> Introduced a new eal_common_interrupts.c where all these APIs are
> defined and also hides struct rte_intr_handle definition.
> 
> Details on each patch of the series:
> Patch 1: eal: interrupt handle API prototypes This patch provides prototypes
> of all the new get set APIs, and also rearranges the headers related to
> interrupt framework. Epoll related definitions prototypes are moved into a
> new header i.e.
> rte_epoll.h and APIs defined in rte_eal_interrupts.h which were driver
> specific are moved to rte_interrupts.h (as anyways it was accessible and used
> outside DPDK library. Later in the series rte_eal_interrupts.h is removed.
> 
> Patch 2: eal/interrupts: implement get set APIs Implementing all get, set and
> alloc APIs. Alloc APIs are implemented to allocate memory for interrupt
> handle instance. Currently most of the drivers defines interrupt handle
> instance as static but now it cant be static as size of rte_intr_handle is
> unknown to all the drivers.
> Drivers are expected to allocate interrupt instances during initialization and
> free these instances during cleanup phase.
> 
> Patch 3: eal/interrupts: avoid direct access to interrupt handle Modifying the
> interrupt framework for linux and freebsd to use these get set alloc APIs as
> per requirement and avoid accessing the fields directly.
> 
> Patch 4: test/interrupt: apply get set interrupt handle APIs Updating
> interrupt test suite to use interrupt handle APIs.
> 
> Patch 5: drivers: remove direct access to interrupt handle fields Modifying 
> all
> the drivers and libraries which are currently directly accessing the interrupt
> handle fields. Drivers are expected to allocated the interrupt instance, use
> get set APIs with the allocated interrupt handle and free it on cleanup.
> 
> Patch 6: eal/interrupts: make interrupt handle structure opaque In this patch
> rte_eal_interrupt.h is removed, struct rte_intr_handle definition is moved to
> c file to make it completely opaque. As part of interrupt handle allocation,
> array like efds and elist(which are currently
> static) are dynamically allocated with default size
> (RTE_MAX_RXTX_INTR_VEC_ID). Later these arrays can be reallocated as per
> device requirement using new API rte_intr_handle_event_list_update().
> Eg, on PCI device probing MSIX size can be queried and these arrays can be
> reallocated accordingly.
> 
> Patch 7: eal/alarm: introduce alarm fini routine Introducing alarm fini 
> routine,
> as the memory allocated for alarm interrupt instance can be freed in alarm
> fini.
> 
> Testing performed:
> 1. Validated the series by running interrupts and alarm test suite.
> 2. Validate l3fwd power functionality with octeontx2 and i40e intel cards,
>where interrupts are expected on packet arrival.
> 
> v1:
> * Fixed freebsd compilation failure
> * Fixed seg fault in case of memif
> 
> Harman Kalra (7):
>   eal: interrupt handle API prototypes
>   eal/interrupts: implement get set APIs
>   eal/interrupts: avoid direct access to interrupt handle
>   test/interrupt: apply get set interrupt handle APIs
>   drivers: remove direct access to interrupt handle fields
>   eal/interrupts: make interrupt handle structure opaque
>   eal/alarm: introduce alarm fini routin

Re: [dpdk-dev] [PATCH v1] net/virtio: wait device ready during reset

2021-09-15 Thread Xueming(Steven) Li
On Wed, 2021-09-15 at 12:27 +0300, Andrew Rybchenko wrote:
> On 9/15/21 12:21 PM, Xueming Li wrote:
> > According to virtio spec, the device MUST reset when 0 is written to
> > device_status, and present 0 in device_status once reset is done.
> > 
> > This patch waits status value to be 0 during reset operation, if
> > timeout in 3 seconds, log and continue.
> 
> I have no strong opinion on timeout.
> 
> > 
> > Signed-off-by: Xueming Li 
> > Cc: Andrew Rybchenko 
> > ---
> >  drivers/net/virtio/virtio.c | 15 +--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/virtio/virtio.c b/drivers/net/virtio/virtio.c
> > index 7e1e77797f..f865b27b65 100644
> > --- a/drivers/net/virtio/virtio.c
> > +++ b/drivers/net/virtio/virtio.c
> > @@ -3,7 +3,10 @@
> >   * Copyright(c) 2020 Red Hat, Inc.
> >   */
> >  
> > +#include 
> > +
> >  #include "virtio.h"
> > +#include "virtio_logs.h"
> >  
> >  uint64_t
> >  virtio_negotiate_features(struct virtio_hw *hw, uint64_t host_features)
> > @@ -38,9 +41,17 @@ virtio_write_dev_config(struct virtio_hw *hw, size_t 
> > offset,
> >  void
> >  virtio_reset(struct virtio_hw *hw)
> >  {
> > +   uint32_t retry = 0;
> > +
> > VIRTIO_OPS(hw)->set_status(hw, VIRTIO_CONFIG_STATUS_RESET);
> > -   /* flush status write */
> > -   VIRTIO_OPS(hw)->get_status(hw);
> > +   /* Flush status write and wait device ready max 3 seconds. */
> > +   while (VIRTIO_OPS(hw)->get_status(hw) != VIRTIO_CONFIG_STATUS_RESET) {
> > +   if (retry++ > 3000) {
> > +   PMD_INIT_LOG(WARNING, "device reset timeout");
> 
> I think it would be very useful to log ethdev port ID here.

Good suggestion, v2 sent.

> 
> > +   break;
> > +   }
> > +   usleep(1000L);
> > +   }
> >  }
> >  
> >  void
> > 
> 



[dpdk-dev] [PATCH] net/virtio: remove blank lines in log

2021-09-15 Thread Thomas Monjalon
The macro PMD_INIT_LOG already appends the line feed character.
Redundant \n characters are removed.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/virtio/virtio_ethdev.c|  2 +-
 drivers/net/virtio/virtio_pci.c   |  2 +-
 drivers/net/virtio/virtio_pci_ethdev.c|  6 ++---
 .../net/virtio/virtio_user/virtio_user_dev.c  | 22 +--
 drivers/net/virtio/virtio_user_ethdev.c   |  2 +-
 5 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 9ca8bae0fe..d722fbe04a 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -212,7 +212,7 @@ virtio_send_command_packed(struct virtnet_ctl *cvq,
"vq->vq_avail_idx=%d\n"
"vq->vq_used_cons_idx=%d\n"
"vq->vq_packed.cached_flags=0x%x\n"
-   "vq->vq_packed.used_wrap_counter=%d\n",
+   "vq->vq_packed.used_wrap_counter=%d",
vq->vq_free_cnt,
vq->vq_avail_idx,
vq->vq_used_cons_idx,
diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c
index c14d1339c9..182cfc9eae 100644
--- a/drivers/net/virtio/virtio_pci.c
+++ b/drivers/net/virtio/virtio_pci.c
@@ -410,7 +410,7 @@ static int
 modern_features_ok(struct virtio_hw *hw)
 {
if (!virtio_with_feature(hw, VIRTIO_F_VERSION_1)) {
-   PMD_INIT_LOG(ERR, "Version 1+ required with modern devices\n");
+   PMD_INIT_LOG(ERR, "Version 1+ required with modern devices");
return -1;
}
 
diff --git a/drivers/net/virtio/virtio_pci_ethdev.c 
b/drivers/net/virtio/virtio_pci_ethdev.c
index 4083853c48..54645dc62e 100644
--- a/drivers/net/virtio/virtio_pci_ethdev.c
+++ b/drivers/net/virtio/virtio_pci_ethdev.c
@@ -81,7 +81,7 @@ eth_virtio_pci_init(struct rte_eth_dev *eth_dev)
VTPCI_DEV(hw) = pci_dev;
ret = vtpci_init(RTE_ETH_DEV_TO_PCI(eth_dev), dev);
if (ret) {
-   PMD_INIT_LOG(ERR, "Failed to init PCI device\n");
+   PMD_INIT_LOG(ERR, "Failed to init PCI device");
return -1;
}
} else {
@@ -93,14 +93,14 @@ eth_virtio_pci_init(struct rte_eth_dev *eth_dev)
 
ret = virtio_remap_pci(RTE_ETH_DEV_TO_PCI(eth_dev), dev);
if (ret < 0) {
-   PMD_INIT_LOG(ERR, "Failed to remap PCI device\n");
+   PMD_INIT_LOG(ERR, "Failed to remap PCI device");
return -1;
}
}
 
ret = eth_virtio_dev_init(eth_dev);
if (ret < 0) {
-   PMD_INIT_LOG(ERR, "Failed to init virtio device\n");
+   PMD_INIT_LOG(ERR, "Failed to init virtio device");
goto err_unmap;
}
 
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 16c58710d7..200942d622 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -44,7 +44,7 @@ virtio_user_create_queue(struct virtio_user_dev *dev, 
uint32_t queue_sel)
file.fd = dev->callfds[queue_sel];
ret = dev->ops->set_vring_call(dev, &file);
if (ret < 0) {
-   PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u\n", dev->path, queue_sel);
+   PMD_INIT_LOG(ERR, "(%s) Failed to create queue %u", dev->path, queue_sel);
return -1;
}
 
@@ -108,7 +108,7 @@ virtio_user_kick_queue(struct virtio_user_dev *dev, 
uint32_t queue_sel)
 
return 0;
 err:
-   PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u\n", dev->path, queue_sel);
+   PMD_INIT_LOG(ERR, "(%s) Failed to kick queue %u", dev->path, queue_sel);
 
return -1;
 }
@@ -214,7 +214,7 @@ virtio_user_start_device(struct virtio_user_dev *dev)
pthread_mutex_unlock(&dev->mutex);
rte_mcfg_mem_read_unlock();
 
-   PMD_INIT_LOG(ERR, "(%s) Failed to start device\n", dev->path);
+   PMD_INIT_LOG(ERR, "(%s) Failed to start device", dev->path);
 
/* TODO: free resource here or caller to check */
return -1;
@@ -255,7 +255,7 @@ int virtio_user_stop_device(struct virtio_user_dev *dev)
 err:
pthread_mutex_unlock(&dev->mutex);
 
-   PMD_INIT_LOG(ERR, "(%s) Failed to stop device\n", dev->path);
+   PMD_INIT_LOG(ERR, "(%s) Failed to stop device", dev->path);
 
return -1;
 }
@@ -500,17 +500,17 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
}
 
if (dev->ops->setup(dev) < 0) {
-   PMD_INIT_LOG(ERR, "(%s) Failed to setup backend\n", dev->path);
+   PMD_INIT_LOG(ERR, "(%s) Failed to setup backend", dev->path);
return -1;
}
 
if (virtio_user_dev_init_notify(dev

Re: [dpdk-dev] [PATCH v5 1/3] security: enforce semantics for Tx inline processing

2021-09-15 Thread Ananyev, Konstantin



> 
> Not all net PMD's/HW can parse packet and identify L2 header and
> L3 header locations on Tx. This is inline with other Tx offloads
> requirements such as L3 checksum, L4 checksum offload, etc,
> where mbuf.l2_len, mbuf.l3_len etc, needs to be set for HW to be
> able to generate checksum. Since Inline IPSec is also such a Tx
> offload, some PMD's at least need mbuf.l2_len to be valid to
> find L3 header and perform Outbound IPSec processing.
> 
> Hence, this patch updates documentation to enforce setting
> mbuf.l2_len while setting PKT_TX_SEC_OFFLOAD in mbuf.ol_flags
> for Inline IPSec Crypto / Protocol offload processing to
> work on Tx.
> 
> Signed-off-by: Nithin Dabilpuram 
> Acked-by: Akhil Goyal 
> ---
>  doc/guides/nics/features.rst | 2 ++
>  lib/mbuf/rte_mbuf_core.h | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index a96e12d..4fce8cd 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -430,6 +430,7 @@ of protocol operations. See Security library and PMD 
> documentation for more deta
> 
>  * **[uses]   rte_eth_rxconf,rte_eth_rxmode**: 
> ``offloads:DEV_RX_OFFLOAD_SECURITY``,
>  * **[uses]   rte_eth_txconf,rte_eth_txmode**: 
> ``offloads:DEV_TX_OFFLOAD_SECURITY``.
> +* **[uses]   mbuf**: ``mbuf.l2_len``.
>  * **[implements] rte_security_ops**: ``session_create``, ``session_update``,
>``session_stats_get``, ``session_destroy``, ``set_pkt_metadata``, 
> ``capabilities_get``.
>  * **[provides] rte_eth_dev_info**: 
> ``rx_offload_capa,rx_queue_offload_capa:DEV_RX_OFFLOAD_SECURITY``,
> @@ -451,6 +452,7 @@ protocol operations. See security library and PMD 
> documentation for more details
> 
>  * **[uses]   rte_eth_rxconf,rte_eth_rxmode**: 
> ``offloads:DEV_RX_OFFLOAD_SECURITY``,
>  * **[uses]   rte_eth_txconf,rte_eth_txmode**: 
> ``offloads:DEV_TX_OFFLOAD_SECURITY``.
> +* **[uses]   mbuf**: ``mbuf.l2_len``.
>  * **[implements] rte_security_ops**: ``session_create``, ``session_update``,
>``session_stats_get``, ``session_destroy``, ``set_pkt_metadata``, 
> ``get_userdata``,
>``capabilities_get``.
> diff --git a/lib/mbuf/rte_mbuf_core.h b/lib/mbuf/rte_mbuf_core.h
> index bb38d7f..9d8e3dd 100644
> --- a/lib/mbuf/rte_mbuf_core.h
> +++ b/lib/mbuf/rte_mbuf_core.h
> @@ -228,6 +228,8 @@ extern "C" {
> 
>  /**
>   * Request security offload processing on the TX packet.
> + * To use Tx security offload, the user needs to fill l2_len in mbuf
> + * indicating L2 header size and where L3 header starts.
>   */
>  #define PKT_TX_SEC_OFFLOAD   (1ULL << 43)
> 
> --

Acked-by: Konstantin Ananyev 

> 2.8.4



Re: [dpdk-dev] logs about hugepages detection

2021-09-15 Thread Bruce Richardson
On Wed, Sep 15, 2021 at 03:52:35PM +0200, Thomas Monjalon wrote:
> Hi,
> 
> I would like to discuss some issues in logging of hugepage lookup.
> The issues to be discussed will be enumerated and numbered below.
> I will take an example of an x86 machine with 2M and 1G pages.
> I reserve only 2M pages:
> 
>   usertools/dpdk-hugepages.py -p 2M -r 80M
> 
> If I start a DPDK application with --log-level info
> the only message I read makes me think something is wrong:
> 
>   EAL: No available 1048576 kB hugepages reported
> 
> 1/ Log level is too high.
> 

Agreed.

> If I start with EAL in debug level, I can see which page size is used:
> 
>   --log-level debug --log-level lib.eal:debug
> 
>   EAL: No available 1048576 kB hugepages reported
>   [...]
>   EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> 
> 2/ The positive message should be at the same level as the negative one.
> 

A bit uncertain about this, as I think it need not always be the case. I
think the log messages should be assessed independently.

> 3/ The sizes are sometimes written in bytes, sometimes in kB.
> It should be always the highest unit, including GB.
> 
> When using the --in-memory mode, things are worst:
> 
>   EAL: No available 1048576 kB hugepages reported
>   EAL: In-memory mode enabled, hugepages of size 1073741824 bytes will be 
> allocated anonymously
>   EAL: No free 1048576 kB hugepages reported on node 0
>   EAL: No available 1048576 kB hugepages reported
>   [...]
>   EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
>   EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> 

Yes, things should be consistent, having highest units is nice-to-have. If
everything is consistently reported in KB or MB it's probably fine.

> 4/ The unavailability of 1G should be reported only once.
> 
I'd actually suggest that the unavailability of 1G pages should not be
reported at all if 2MB pages are available. If we imagine a hypothetical
architecture with 15 hugepage sizes, if more than enough memory is
available for DPDK use via one page size, would we really want to know or
care about the fact that 14 page sizes are unavailable?

> 5/ If non-reserved pages can be used without reservation, it should be better 
> documented.
> 
> Please correct me if I'm wrong, and give your opinion.
> I could work on some patches if needed.
> 

Regards,
/Bruce
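
As a rough illustration of point 3 (always printing sizes in the highest
unit), a helper along the following lines could be used by the log calls.
This is only a sketch: format_page_size() is a hypothetical name, not an
existing EAL function, and the choice to fall back to a smaller unit when
the size is not an exact multiple is an assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write 'bytes' into 'buf' using the largest unit
 * (GB, MB, kB) that divides it exactly, else plain bytes.
 * Returns the snprintf() result.
 */
static int
format_page_size(char *buf, size_t len, uint64_t bytes)
{
	static const struct {
		uint64_t size;
		const char *name;
	} units[] = {
		{ UINT64_C(1) << 30, "GB" },
		{ UINT64_C(1) << 20, "MB" },
		{ UINT64_C(1) << 10, "kB" },
	};
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < sizeof(units) / sizeof(units[0]); i++) {
		if (bytes >= units[i].size && bytes % units[i].size == 0)
			return snprintf(buf, len, "%" PRIu64 " %s",
					bytes / units[i].size, units[i].name);
	}
	return snprintf(buf, len, "%" PRIu64 " B", bytes);
}
```

With such a helper, the 1048576 kB and 1073741824 byte values from the logs
above would both be printed the same way, and 2097152 would be printed in MB.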


[dpdk-dev] [PATCH v1] sched: adds function to get 64 bits greatest common divisor

2021-09-15 Thread Xueming Li
This patch adds a new function that computes the greatest common
divisor of two 64-bit numbers, and changes the original 32-bit function
to call this new 64-bit version.

Signed-off-by: Xueming Li 
---
v1: add 64 bits version and make 32 bits api call it

 lib/sched/rte_sched_common.h | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/lib/sched/rte_sched_common.h b/lib/sched/rte_sched_common.h
index 96706df7bd..1056543a84 100644
--- a/lib/sched/rte_sched_common.h
+++ b/lib/sched/rte_sched_common.h
@@ -51,10 +51,10 @@ rte_min_pos_4_u16(uint16_t *x)
  *gcd(a, b) = gcd(b, a mod b)
  *
  */
-static inline uint32_t
-rte_get_gcd(uint32_t a, uint32_t b)
+static inline uint64_t
+rte_get_gcd64(uint64_t a, uint64_t b)
 {
-   uint32_t c;
+   uint64_t c;
 
if (a == 0)
return b;
@@ -76,6 +76,19 @@ rte_get_gcd(uint32_t a, uint32_t b)
return a;
 }
 
+/*
+ * Compute the Greatest Common Divisor (GCD) of two u32 numbers.
+ * This implementation uses Euclid's algorithm:
+ *gcd(a, 0) = a
+ *gcd(a, b) = gcd(b, a mod b)
+ *
+ */
+static inline uint32_t
+rte_get_gcd(uint32_t a, uint32_t b)
+{
+   return rte_get_gcd64(a, b);
+}
+
 /*
  * Compute the Lowest Common Denominator (LCD) of two numbers.
  * This implementation computes GCD first:
-- 
2.33.0
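
Since the diff only shows fragments of the function body, here is a
self-contained sketch of the same idea: a 64-bit Euclid GCD with a 32-bit
wrapper on top. The names gcd_u64() and gcd_u32() are placeholders chosen
for this sketch, and the loop is a plain textbook form that is not copied
from rte_sched_common.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Euclid's algorithm on 64-bit operands:
 *   gcd(a, 0) = a
 *   gcd(a, b) = gcd(b, a mod b)
 */
static inline uint64_t
gcd_u64(uint64_t a, uint64_t b)
{
	while (b != 0) {
		uint64_t r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/* 32-bit variant implemented on top of the 64-bit one. */
static inline uint32_t
gcd_u32(uint32_t a, uint32_t b)
{
	uint64_t g = gcd_u64(a, b);

	/* the GCD never exceeds the larger operand, so it fits in 32 bits */
	assert(g <= UINT32_MAX);
	return (uint32_t)g;
}
```

The narrowing return in the wrapper is safe for the reason given in the
comment, which is why the patch can simply return the 64-bit result from the
32-bit function.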



[dpdk-dev] [PATCH] eal: remove plural syntax for single resource

2021-09-15 Thread Thomas Monjalon
Some comments and logs about cores, nodes and pages
were using the plural or hypothetical plural "(s)" form
even when preceded by "0" or "no".

It is replaced with singular form where appropriate.

Signed-off-by: Thomas Monjalon 
---
 lib/eal/common/eal_common_lcore.c | 13 +
 lib/eal/common/rte_malloc.c   |  4 ++--
 lib/eal/linux/eal_hugepage_info.c |  4 ++--
 lib/eal/linux/eal_memory.c|  4 ++--
 4 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index 66d6bad1a7..6778ecc98f 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -191,9 +191,12 @@ rte_eal_cpu_init(void)
/* Set the count of enabled logical cores of the EAL configuration */
config->lcore_count = count;
RTE_LOG(DEBUG, EAL,
-   "Support maximum %u logical core(s) by configuration.\n",
-   RTE_MAX_LCORE);
-   RTE_LOG(INFO, EAL, "Detected %u lcore(s)\n", config->lcore_count);
+   "Support maximum %u logical core%s by configuration.\n",
+   RTE_MAX_LCORE,
+   RTE_MAX_LCORE > 1 ? "s" : "");
+   RTE_LOG(INFO, EAL, "Detected %u lcore%s\n",
+   config->lcore_count,
+   config->lcore_count > 1 ? "s" : "");
 
/* sort all socket id's in ascending order */
qsort(lcore_to_socket_id, RTE_DIM(lcore_to_socket_id),
@@ -208,7 +211,9 @@ rte_eal_cpu_init(void)
socket_id;
prev_socket_id = socket_id;
}
-   RTE_LOG(INFO, EAL, "Detected %u NUMA nodes\n", config->numa_node_count);
+   RTE_LOG(INFO, EAL, "Detected %u NUMA node%s\n",
+   config->numa_node_count,
+   config->numa_node_count > 1 ? "s" : "");
 
return 0;
 }
diff --git a/lib/eal/common/rte_malloc.c b/lib/eal/common/rte_malloc.c
index 9d39e58c08..04cbb0078b 100644
--- a/lib/eal/common/rte_malloc.c
+++ b/lib/eal/common/rte_malloc.c
@@ -65,10 +65,10 @@ malloc_socket(const char *type, size_t size, unsigned int 
align,
if (size == 0 || (align && !rte_is_power_of_2(align)))
return NULL;
 
-   /* if there are no hugepages and if we are not allocating from an
+   /* if there are no hugepage and if we are not allocating from an
 * external heap, use memory from any socket available. checking for
 * socket being external may return -1 in case of invalid socket, but
-* that's OK - if there are no hugepages, it doesn't matter.
+* that's OK - if there are no hugepage, it doesn't matter.
 */
if (rte_malloc_heap_socket_is_external(socket_arg) != 1 &&
!rte_eal_has_hugepages())
diff --git a/lib/eal/linux/eal_hugepage_info.c 
b/lib/eal/linux/eal_hugepage_info.c
index d97792cade..5673784186 100644
--- a/lib/eal/linux/eal_hugepage_info.c
+++ b/lib/eal/linux/eal_hugepage_info.c
@@ -117,7 +117,7 @@ get_num_hugepages(const char *subdir, size_t sz)
over_pages = 0;
 
if (num_pages == 0 && over_pages == 0)
-   RTE_LOG(WARNING, EAL, "No available %zu kB hugepages reported\n",
+   RTE_LOG(WARNING, EAL, "No available %zu kB hugepage reported\n",
sz >> 10);
 
num_pages += over_pages;
@@ -158,7 +158,7 @@ get_num_hugepages_on_node(const char *subdir, unsigned int socket, size_t sz)
return 0;
 
if (num_pages == 0)
-   RTE_LOG(WARNING, EAL, "No free %zu kB hugepages reported on node %u\n",
+   RTE_LOG(WARNING, EAL, "No free %zu kB hugepage reported on node %u\n",
sz >> 10, socket);
 
/*
diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 03a4f2dd2d..d73d233a7d 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -1002,7 +1002,7 @@ remap_needed_hugepages(struct hugepage_file *hugepages, int n_pages)
cur = &hugepages[cur_page];
prev = cur_page == 0 ? NULL : &hugepages[cur_page - 1];
 
-   /* if size is zero, no more pages left */
+   /* if size is zero, no more page left */
if (cur->size == 0)
break;
 
@@ -1550,7 +1550,7 @@ eal_legacy_hugepage_attach(void)
struct rte_memseg_list *msl;
struct rte_memseg *ms;
 
-   /* if size is zero, no more pages left */
+   /* if size is zero, no more page left */
if (map_sz == 0)
break;
 
-- 
2.33.0



Re: [dpdk-dev] [PATCH v5 2/3] security: add option for faster udata or mdata access

2021-09-15 Thread Ananyev, Konstantin


> 
> Currently rte_security_set_pkt_metadata() and rte_security_get_userdata()
> methods to set pkt metadata on Inline outbound and get userdata
> after Inline inbound processing is always driver specific callbacks.
> 
> For drivers that do not have much to do in the callbacks but just
> to update metadata in rte_security dynamic field and get userdata
> from rte_security dynamic field, having to jump to a PMD specific
> callback is costly per packet operation. This patch provides
> a mechanism to do the same in inline function and avoid function
> pointer jump if a driver supports the same.
> 
> Signed-off-by: Nithin Dabilpuram 
> Acked-by: Akhil Goyal 
> ---
>  doc/guides/rel_notes/deprecation.rst   |  4 ---
>  doc/guides/rel_notes/release_21_08.rst |  6 +
>  lib/security/rte_security.c|  8 +++---
>  lib/security/rte_security.h| 48 
> +++---
>  lib/security/version.map   |  2 ++
>  5 files changed, 56 insertions(+), 12 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 59445a6..70ef45e 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -276,10 +276,6 @@ Deprecation Notices
>content. On Linux and FreeBSD, supported prior to DPDK 20.11,
>original structure will be kept until DPDK 21.11.
> 
> -* security: The functions ``rte_security_set_pkt_metadata`` and
> -  ``rte_security_get_userdata`` will be made inline functions and additional
> -  flags will be added in structure ``rte_security_ctx`` in DPDK 21.11.
> -
>  * cryptodev: The structure ``rte_crypto_op`` would be updated to reduce
>reserved bytes to 2 (from 3), and use 1 byte to indicate warnings and other
>information from the crypto/security operation. This field will be used to
> diff --git a/doc/guides/rel_notes/release_21_08.rst 
> b/doc/guides/rel_notes/release_21_08.rst
> index b4cbf2d..59ff15a 100644
> --- a/doc/guides/rel_notes/release_21_08.rst
> +++ b/doc/guides/rel_notes/release_21_08.rst
> @@ -223,6 +223,12 @@ ABI Changes
> 
>  * No ABI change that would break compatibility with 20.11.
> 
> +* security: ``rte_security_set_pkt_metadata`` and 
> ``rte_security_get_userdata``
> +  routines used by Inline outbound and Inline inbound security processing are
> +  made inline and enhanced to do simple 64-bit set/get for PMD's that donot
> +  have much processing in PMD specific callbacks but just 64-bit set/get.
> +  This avoids a per-pkt function pointer jump overhead for such PMD's.
> +
> 
>  Known Issues
>  
> diff --git a/lib/security/rte_security.c b/lib/security/rte_security.c
> index e8116d5..fe81ed3 100644
> --- a/lib/security/rte_security.c
> +++ b/lib/security/rte_security.c
> @@ -122,9 +122,9 @@ rte_security_session_destroy(struct rte_security_ctx 
> *instance,
>  }
> 
>  int
> -rte_security_set_pkt_metadata(struct rte_security_ctx *instance,
> -   struct rte_security_session *sess,
> -   struct rte_mbuf *m, void *params)
> +__rte_security_set_pkt_metadata(struct rte_security_ctx *instance,
> + struct rte_security_session *sess,
> + struct rte_mbuf *m, void *params)
>  {
>  #ifdef RTE_DEBUG
>   RTE_PTR_OR_ERR_RET(sess, -EINVAL);
> @@ -137,7 +137,7 @@ rte_security_set_pkt_metadata(struct rte_security_ctx 
> *instance,
>  }
> 
>  void *
> -rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
> +__rte_security_get_userdata(struct rte_security_ctx *instance, uint64_t md)
>  {
>   void *userdata = NULL;
> 
> diff --git a/lib/security/rte_security.h b/lib/security/rte_security.h
> index 2e136d7..3124134 100644
> --- a/lib/security/rte_security.h
> +++ b/lib/security/rte_security.h
> @@ -71,8 +71,18 @@ struct rte_security_ctx {
>   /**< Pointer to security ops for the device */
>   uint16_t sess_cnt;
>   /**< Number of sessions attached to this context */
> + uint32_t flags;
> + /**< Flags for security context */
>  };
> 
> +#define RTE_SEC_CTX_F_FAST_SET_MDATA 0x0001
> +/**< Driver uses fast metadata update without using driver specific callback 
> */

It is probably worth mentioning somewhere that it is the driver's
responsibility to call rte_security_dynfield_register() to expose that flag.

> +
> +#define RTE_SEC_CTX_F_FAST_GET_UDATA 0x0002
> +/**< Driver provides udata using fast method without using driver specific
> + * callback.
> + */
> +
>  /**
>   * IPSEC tunnel parameters
>   *
> @@ -494,6 +504,12 @@ static inline bool 
> rte_security_dynfield_is_registered(void)
>   return rte_security_dynfield_offset >= 0;
>  }
> 
> +/** Function to call PMD specific function pointer set_pkt_metadata() */
> +__rte_experimental
> +extern int __rte_security_set_pkt_metadata(struct rte_security_ctx *instance,
> +struct rte_security_session *sess,
> 

Re: [dpdk-dev] [PATCH v5 3/3] examples/ipsec-secgw: update event mode inline path

2021-09-15 Thread Ananyev, Konstantin
> Update mbuf.l2_len with L2 header size for outbound
> inline processing.
> 
> This patch also fixes a bug in arg parsing.
> 
> Signed-off-by: Nithin Dabilpuram 
> Acked-by: Akhil Goyal 
> ---
>  examples/ipsec-secgw/ipsec-secgw.c  |  2 ++
>  examples/ipsec-secgw/ipsec_worker.c | 41 
> -
>  2 files changed, 29 insertions(+), 14 deletions(-)
> 
> diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
> b/examples/ipsec-secgw/ipsec-secgw.c
> index f252d34..7ad94cb 100644
> --- a/examples/ipsec-secgw/ipsec-secgw.c
> +++ b/examples/ipsec-secgw/ipsec-secgw.c
> @@ -1495,6 +1495,8 @@ parse_portmask(const char *portmask)
>   char *end = NULL;
>   unsigned long pm;
> 
> + errno = 0;
> +
>   /* parse hexadecimal string */
>   pm = strtoul(portmask, &end, 16);
>   if ((portmask[0] == '\0') || (end == NULL) || (*end != '\0'))
> diff --git a/examples/ipsec-secgw/ipsec_worker.c 
> b/examples/ipsec-secgw/ipsec_worker.c
> index 647e22d..c545497 100644
> --- a/examples/ipsec-secgw/ipsec_worker.c
> +++ b/examples/ipsec-secgw/ipsec_worker.c
> @@ -12,6 +12,11 @@
>  #include "ipsec-secgw.h"
>  #include "ipsec_worker.h"
> 
> +struct port_drv_mode_data {
> + struct rte_security_session *sess;
> + struct rte_security_ctx *ctx;
> +};
> +
>  static inline enum pkt_type
>  process_ipsec_get_pkt_type(struct rte_mbuf *pkt, uint8_t **nlp)
>  {
> @@ -60,7 +65,8 @@ ipsec_event_pre_forward(struct rte_mbuf *m, unsigned int 
> port_id)
> 
>  static inline void
>  prepare_out_sessions_tbl(struct sa_ctx *sa_out,
> - struct rte_security_session **sess_tbl, uint16_t size)
> +  struct port_drv_mode_data *data,
> +  uint16_t size)
>  {
>   struct rte_ipsec_session *pri_sess;
>   struct ipsec_sa *sa;
> @@ -95,9 +101,10 @@ prepare_out_sessions_tbl(struct sa_ctx *sa_out,
>   }
> 
>   /* Use only first inline session found for a given port */
> - if (sess_tbl[sa->portid])
> + if (data[sa->portid].sess)
>   continue;
> - sess_tbl[sa->portid] = pri_sess->security.ses;
> + data[sa->portid].sess = pri_sess->security.ses;
> + data[sa->portid].ctx = pri_sess->security.ctx;
>   }
>  }
> 
> @@ -356,9 +363,8 @@ process_ipsec_ev_outbound(struct ipsec_ctx *ctx, struct 
> route_table *rt,
>   goto drop_pkt_and_exit;
>   }
> 
> - if (sess->security.ol_flags & RTE_SECURITY_TX_OLOAD_NEED_MDATA)
> - *(struct rte_security_session **)rte_security_dynfield(pkt) =
> - sess->security.ses;
> + rte_security_set_pkt_metadata(sess->security.ctx,
> +   sess->security.ses, pkt, NULL);
> 
>   /* Mark the packet for Tx security offload */
>   pkt->ol_flags |= PKT_TX_SEC_OFFLOAD;
> @@ -367,6 +373,9 @@ process_ipsec_ev_outbound(struct ipsec_ctx *ctx, struct 
> route_table *rt,
>   port_id = sa->portid;
> 
>  send_pkt:
> + /* Provide L2 len for Outbound processing */
> + pkt->l2_len = RTE_ETHER_HDR_LEN;
> +
>   /* Update mac addresses */
>   update_mac_addrs(pkt, port_id);
> 
> @@ -398,7 +407,7 @@ static void
>  ipsec_wrkr_non_burst_int_port_drv_mode(struct eh_event_link_info *links,
>   uint8_t nb_links)
>  {
> - struct rte_security_session *sess_tbl[RTE_MAX_ETHPORTS] = { NULL };
> + struct port_drv_mode_data data[RTE_MAX_ETHPORTS];
>   unsigned int nb_rx = 0;
>   struct rte_mbuf *pkt;
>   struct rte_event ev;
> @@ -412,6 +421,8 @@ ipsec_wrkr_non_burst_int_port_drv_mode(struct 
> eh_event_link_info *links,
>   return;
>   }
> 
> + memset(&data, 0, sizeof(struct port_drv_mode_data));
> +
>   /* Get core ID */
>   lcore_id = rte_lcore_id();
> 
> @@ -422,8 +433,8 @@ ipsec_wrkr_non_burst_int_port_drv_mode(struct 
> eh_event_link_info *links,
>* Prepare security sessions table. In outbound driver mode
>* we always use first session configured for a given port
>*/
> - prepare_out_sessions_tbl(socket_ctx[socket_id].sa_out, sess_tbl,
> - RTE_MAX_ETHPORTS);
> + prepare_out_sessions_tbl(socket_ctx[socket_id].sa_out, data,
> +  RTE_MAX_ETHPORTS);
> 
>   RTE_LOG(INFO, IPSEC,
>   "Launching event mode worker (non-burst - Tx internal port - "
> @@ -460,19 +471,21 @@ ipsec_wrkr_non_burst_int_port_drv_mode(struct 
> eh_event_link_info *links,
> 
>   if (!is_unprotected_port(port_id)) {
> 
> - if (unlikely(!sess_tbl[port_id])) {
> + if (unlikely(!data[port_id].sess)) {
>   rte_pktmbuf_free(pkt);
>   continue;
>   }
> 
>   /* Save security session */
> - if (rte_security_dynfield_is_registered())
> - *(st

Re: [dpdk-dev] [PATCH v21 4/7] dmadev: introduce DMA device library implementation

2021-09-15 Thread Bruce Richardson
On Wed, Sep 15, 2021 at 02:51:55PM +0100, Kevin Laatz wrote:
> On 07/09/2021 13:56, Chengwen Feng wrote:
> > This patch introduce DMA device library implementation which includes
> > configuration and I/O with the DMA devices.
> > 
> > Signed-off-by: Chengwen Feng 
> > Acked-by: Bruce Richardson 
> > Acked-by: Morten Brørup 
> > Reviewed-by: Kevin Laatz 
> > Reviewed-by: Conor Walsh 
> > ---
> >   config/rte_config.h  |   3 +
> >   lib/dmadev/meson.build   |   1 +
> >   lib/dmadev/rte_dmadev.c  | 607 +++
> >   lib/dmadev/rte_dmadev.h  | 118 ++-
> >   lib/dmadev/rte_dmadev_core.h |   2 +
> >   lib/dmadev/version.map   |   1 +
> >   6 files changed, 720 insertions(+), 12 deletions(-)
> >   create mode 100644 lib/dmadev/rte_dmadev.c
> > 
> [snip]
> 
> >   /**
> >* @warning
> > @@ -941,10 +1018,27 @@ rte_dmadev_completed(uint16_t dev_id, uint16_t 
> > vchan, const uint16_t nb_cpls,
> >*   status array are also set.
> >*/
> >   __rte_experimental
> > -uint16_t
> > +static inline uint16_t
> >   rte_dmadev_completed_status(uint16_t dev_id, uint16_t vchan,
> > const uint16_t nb_cpls, uint16_t *last_idx,
> > -   enum rte_dma_status_code *status);
> > +   enum rte_dma_status_code *status)
> > +{
> > +   struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +   uint16_t idx;
> > +
> > +#ifdef RTE_DMADEV_DEBUG
> > +   if (!rte_dmadev_is_valid_dev(dev_id) || !dev->data->dev_started ||
> > +   vchan >= dev->data->dev_conf.nb_vchans ||
> > +   nb_cpls == 0 || status == NULL)
> > +   return 0;
> > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->completed_status, 0);
> > +#endif
> > +
> > +   if (last_idx == NULL)
> > +   last_idx = &idx;
> 
> Hi Chengwen,
> 
> An internal coverity scan on the IDXD dmadev driver patches flagged a
> potential null pointer dereference when using completed_status().
> 
> IMO it is a false positive for the driver code since it should be checked at
> the library API level, however the check is also not present in the library.
> 
> For the v22, can you add the NULL pointer check for status here, like you
> have for last_idx, please?
> 
I think the check would have to be different than that for last_idx, since
the status pointer is a pointer to an array, rather than a single value -
which precludes a simple replacement in the wrapper function that the
compiler can inline away if unnecessary.
It's probably best to add it as a check in the debug block, with an
error-return if status is NULL.

/Bruce
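
To illustrate the distinction made above, here is a minimal sketch in plain
C. It is not the dmadev code: sketch_completed_status(),
fake_completed_status() and the SKETCH_DEBUG macro are hypothetical names
used only for this example, and the status type is simplified to int. An
optional scalar output such as last_idx can be redirected to a local dummy,
while a NULL status array cannot be substituted because its required length
is only known at run time, so it is rejected in the debug-only block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver's completed_status callback. */
static uint16_t
fake_completed_status(uint16_t nb_cpls, uint16_t *last_idx, int *status)
{
	uint16_t i;

	/* Mark every completion as successful (0). */
	for (i = 0; i < nb_cpls; i++)
		status[i] = 0;
	if (nb_cpls > 0)
		*last_idx = (uint16_t)(nb_cpls - 1);
	return nb_cpls;
}

/* Hypothetical wrapper showing the two kinds of pointer handling. */
static inline uint16_t
sketch_completed_status(uint16_t nb_cpls, uint16_t *last_idx, int *status)
{
	uint16_t idx;

#ifdef SKETCH_DEBUG
	/* Required array argument: no local substitute, so return an error. */
	if (nb_cpls == 0 || status == NULL)
		return 0;
#endif

	/* Optional scalar argument: point it at a local the caller never sees. */
	if (last_idx == NULL)
		last_idx = &idx;

	return fake_completed_status(nb_cpls, last_idx, status);
}
```

When the caller passes a non-NULL last_idx, the compiler can drop the
substitution after inlining; the status check costs nothing in non-debug
builds because it is compiled out.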


Re: [dpdk-dev] logs about hugepages detection

2021-09-15 Thread Thomas Monjalon
15/09/2021 16:25, Bruce Richardson:
> On Wed, Sep 15, 2021 at 03:52:35PM +0200, Thomas Monjalon wrote:
> > Hi,
> > 
> > I would like to discuss some issues in logging of hugepage lookup.
> > The issues to be discussed will be enumerated and numbered below.
> > I will take an example of an x86 machine with 2M and 1G pages.
> > I reserve only 2M pages:
> > 
> > usertools/dpdk-hugepages.py -p 2M -r 80M
> > 
> > If I start a DPDK application with --log-level info
> > the only message I read makes me think something is wrong:
> > 
> > EAL: No available 1048576 kB hugepages reported
> > 
> > 1/ Log level is too high.
> > 
> 
> Agreed.
> 
> > If I start with EAL in debug level, I can see which page size is used:
> > 
> > --log-level debug --log-level lib.eal:debug
> > 
> > EAL: No available 1048576 kB hugepages reported
> > [...]
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > 
> > 2/ The positive message should be at the same level as the negative one.
> 
> A bit uncertain about this, as I think it need not always be the case. I
> think the log messages should be assessed independently.

Not sure what you mean. Which level for which message?

> > 3/ The sizes are sometimes written in bytes, sometimes in kB.
> > It should be always the highest unit, including GB.
> > 
> > When using the --in-memory mode, things are worse:
> > 
> > EAL: No available 1048576 kB hugepages reported
> > EAL: In-memory mode enabled, hugepages of size 1073741824 bytes will be 
> > allocated anonymously
> > EAL: No free 1048576 kB hugepages reported on node 0
> > EAL: No available 1048576 kB hugepages reported
> > [...]
> > EAL: Detected memory type: socket_id:0 hugepage_sz:1073741824
> > EAL: Detected memory type: socket_id:0 hugepage_sz:2097152
> > 
> 
> Yes, things should be consistent, having highest units is nice-to-have. If
> everything is consistently reported in KB or MB it's probably fine.

Fine but not nice :)
I'm looking to improve the user experience, so "1GB" is definitely easier
to read than "1048576 kB", not talking about "1073741824".
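
As a rough sketch of what "highest unit" could mean in practice, the helper
below divides a byte count by 1024 for as long as the division is exact and
then prints the result with the matching suffix. fmt_page_size() is a
hypothetical name for this example, not an existing EAL function, and the
"1 GB" spelling with a space is only an assumption about the desired format.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a size in bytes using the largest exact unit. */
static void
fmt_page_size(uint64_t bytes, char *buf, size_t len)
{
	static const char * const units[] = { "B", "kB", "MB", "GB", "TB" };
	unsigned int u = 0;

	/* Move up one unit only while no precision is lost. */
	while (u < 4 && bytes != 0 && (bytes % 1024) == 0) {
		bytes /= 1024;
		u++;
	}
	snprintf(buf, len, "%" PRIu64 " %s", bytes, units[u]);
}
```

With this approach 1073741824 is rendered as "1 GB" and 2097152 as "2 MB",
while a size that is not a multiple of 1024 stays in bytes.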

> > 4/ The unavailability of 1G should be reported only once.
> > 
> I'd actually suggest that the unavailability of 1G pages should not be
> reported at all if 2MB pages are available. If we imagine a hypothetical
> architecture with 15 hugepage sizes, if more than enough memory is
> available for DPDK use via one page size, would we really want to know or
> care about the fact that 14 page sizes are unavailable?

I agree.

> > 5/ If non-reserved pages can be used without reservation, it should be 
> > better documented.
> > 
> > Please correct me if I'm wrong, and give your opinion.
> > I could work on some patches if needed.





[dpdk-dev] [RFC PATCH 02/12] common/mlx5: read software parsing capabilities from DevX

2021-09-15 Thread Tal Shnaiderman
mlx5 in Windows needs the software parsing hca capabilities
to query the NIC for TSO and Checksum offloading support.

Added the capability as part of the capabilities
queried by the PMD using DevX.

Signed-off-by: Tal Shnaiderman 
---
 drivers/common/mlx5/mlx5_devx_cmds.c | 6 ++
 drivers/common/mlx5/mlx5_devx_cmds.h | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c
index 56407cc332..70ba74e112 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -991,6 +991,12 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
hcattr, tunnel_lro_gre);
attr->tunnel_lro_vxlan = MLX5_GET(per_protocol_networking_offload_caps,
  hcattr, tunnel_lro_vxlan);
+   attr->swp = MLX5_GET(per_protocol_networking_offload_caps,
+ hcattr, swp);
+   attr->swp_csum = MLX5_GET(per_protocol_networking_offload_caps,
+ hcattr, swp_csum);
+   attr->swp_lso = MLX5_GET(per_protocol_networking_offload_caps,
+ hcattr, swp_lso);
attr->lro_max_msg_sz_mode = MLX5_GET
(per_protocol_networking_offload_caps,
 hcattr, lro_max_msg_sz_mode);
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h 
b/drivers/common/mlx5/mlx5_devx_cmds.h
index e576e30f24..caa444bc15 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -116,6 +116,9 @@ struct mlx5_hca_attr {
uint32_t lro_cap:1;
uint32_t tunnel_lro_gre:1;
uint32_t tunnel_lro_vxlan:1;
+   uint32_t swp:1;
+   uint32_t swp_csum:1;
+   uint32_t swp_lso:1;
uint32_t lro_max_msg_sz_mode:2;
uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS];
uint16_t lro_min_mss_size;
-- 
2.16.1.windows.4



[dpdk-dev] [RFC PATCH 01/12] net/mlx5: fix software parsing support query

2021-09-15 Thread Tal Shnaiderman
Currently, the PMD decides if the software parsing
offload can enable outer IPv4 checksum and tunneled
TSO support by checking config->hw_csum and config->tso
respectively.

This is incorrect, the right way is to check the following
flags returned by the mlx5dv_query_device function:

MLX5DV_SW_PARSING - check general swp support.
MLX5DV_SW_PARSING_CSUM - check swp checksum support.
MLX5DV_SW_PARSING_LSO - check swp LSO/TSO support.

The fix enables the offloads according to the correct
flags returned by the kernel.

Fixes: e46821e9fcdc60 ("net/mlx5: separate generic tunnel TSO from the standard one")
Cc: sta...@dpdk.org

Signed-off-by: Tal Shnaiderman 
---
 drivers/net/mlx5/linux/mlx5_os.c |  3 ++-
 drivers/net/mlx5/linux/mlx5_os.h | 12 
 drivers/net/mlx5/mlx5.h  |  2 +-
 drivers/net/mlx5/mlx5_txq.c  | 15 +--
 4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 470b16cb9a..536b39ba9c 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1112,7 +1112,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
swp = dv_attr.sw_parsing_caps.sw_parsing_offloads;
DRV_LOG(DEBUG, "SWP support: %u", swp);
 #endif
-   config->swp = !!swp;
+   config->swp = swp & (MLX5_SW_PARSING_CAP | MLX5_SW_PARSING_CSUM_CAP |
+   MLX5_SW_PARSING_TSO_CAP);
 #ifdef HAVE_IBV_DEVICE_STRIDING_RQ_SUPPORT
if (dv_attr.comp_mask & MLX5DV_CONTEXT_MASK_STRIDING_RQ) {
struct mlx5dv_striding_rq_caps mprq_caps =
diff --git a/drivers/net/mlx5/linux/mlx5_os.h b/drivers/net/mlx5/linux/mlx5_os.h
index 2991d37df2..da036edb72 100644
--- a/drivers/net/mlx5/linux/mlx5_os.h
+++ b/drivers/net/mlx5/linux/mlx5_os.h
@@ -21,4 +21,16 @@ enum {
 
 int mlx5_auxiliary_get_ifindex(const char *sf_name);
 
+
+enum mlx5_sw_parsing_offloads {
+#ifdef HAVE_IBV_MLX5_MOD_SWP
+   MLX5_SW_PARSING_CAP  = MLX5DV_SW_PARSING,
+   MLX5_SW_PARSING_CSUM_CAP = MLX5DV_SW_PARSING_CSUM,
+   MLX5_SW_PARSING_TSO_CAP  = MLX5DV_SW_PARSING_LSO,
+#else
+   MLX5_SW_PARSING_CAP  = 0,
+   MLX5_SW_PARSING_CSUM_CAP = 0,
+   MLX5_SW_PARSING_TSO_CAP  = 0,
+#endif
+};
 #endif /* RTE_PMD_MLX5_OS_H_ */
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e02714e231..a56f39cd5f 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -260,7 +260,7 @@ struct mlx5_dev_config {
unsigned int dv_xmeta_en:2; /* Enable extensive flow metadata. */
unsigned int lacp_by_user:1;
/* Enable user to manage LACP traffic. */
-   unsigned int swp:1; /* Tx generic tunnel checksum and TSO offload. */
+   unsigned int swp:3; /* Tx generic tunnel checksum and TSO offload. */
unsigned int devx:1; /* Whether devx interface is available or not. */
unsigned int dest_tir:1; /* Whether advanced DR API is available. */
unsigned int reclaim_mode:2; /* Memory reclaim mode. */
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index eb4d34ca55..8dca2b7f79 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -111,9 +111,9 @@ mlx5_get_tx_port_offloads(struct rte_eth_dev *dev)
if (config->tx_pp)
offloads |= DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP;
if (config->swp) {
-   if (config->hw_csum)
+   if (config->swp & MLX5_SW_PARSING_CSUM_CAP)
offloads |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;
-   if (config->tso)
+   if (config->swp & MLX5_SW_PARSING_TSO_CAP)
offloads |= (DEV_TX_OFFLOAD_IP_TNL_TSO |
 DEV_TX_OFFLOAD_UDP_TNL_TSO);
}
@@ -979,10 +979,13 @@ txq_set_params(struct mlx5_txq_ctrl *txq_ctrl)
txq_ctrl->txq.tso_en = 1;
}
txq_ctrl->txq.tunnel_en = config->tunnel_en | config->swp;
-   txq_ctrl->txq.swp_en = ((DEV_TX_OFFLOAD_IP_TNL_TSO |
-DEV_TX_OFFLOAD_UDP_TNL_TSO |
-DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM) &
-   txq_ctrl->txq.offloads) && config->swp;
+   txq_ctrl->txq.swp_en = (((DEV_TX_OFFLOAD_IP_TNL_TSO |
+ DEV_TX_OFFLOAD_UDP_TNL_TSO) &
+ txq_ctrl->txq.offloads) && (config->swp &
+ MLX5_SW_PARSING_TSO_CAP)) |
+   ((DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM &
+txq_ctrl->txq.offloads) && (config->swp &
+MLX5_SW_PARSING_CSUM_CAP));
 }
 
 /**
-- 
2.16.1.windows.4


