RE: [PATCH 1/2] config/arm: Do not require processor information
> -----Original Message-----
> From: Akihiko Odaki
> Sent: Thursday, April 20, 2023 9:40 AM
> To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš
> Cc: dev@dpdk.org; nd
> Subject: Re: [PATCH 1/2] config/arm: Do not require processor information
>
> On 2023/04/17 16:41, Ruifeng Wang wrote:
> >> -----Original Message-----
> >> From: Akihiko Odaki
> >> Sent: Friday, April 14, 2023 8:42 PM
> >> To: Ruifeng Wang ; Bruce Richardson
> >> Cc: dev@dpdk.org; Akihiko Odaki
> >> Subject: [PATCH 1/2] config/arm: Do not require processor information
> >>
> >> DPDK can be built even without exact processor information for x86
> >> and ppc, so allow building for Arm even if the targeted processor is
> >> unknown.
> >
> > Hi Akihiko,
> >
> > The design idea was to require an explicit generic build.
> > Default/native build doesn't fall back to generic build when SoC info
> > is not on the list, so the user has less chance to generate a
> > suboptimal binary by accident.
>
> Hi,
>
> It is true that a suboptimal binary can result, but the rationale here
> is that we tolerate that for x86 and ppc, so it should not really
> matter for Arm either. On x86 and ppc you don't need to modify
> meson.build just to run DTS on a development machine.

What modification do you need for a development machine? I suppose
"meson setup build -Dplatform=generic" will generate a binary that can
run on your development machine.

> Regards,
> Akihiko Odaki
Re: [PATCH 1/2] config/arm: Do not require processor information
On 2023/04/20 16:10, Ruifeng Wang wrote:
>> On 2023/04/17 16:41, Ruifeng Wang wrote:
>>>> DPDK can be built even without exact processor information for x86
>>>> and ppc, so allow building for Arm even if the targeted processor is
>>>> unknown.
>>>
>>> Hi Akihiko,
>>>
>>> The design idea was to require an explicit generic build.
>>> Default/native build doesn't fall back to generic build when SoC info
>>> is not on the list, so the user has less chance to generate a
>>> suboptimal binary by accident.
>>
>> Hi,
>>
>> It is true that a suboptimal binary can result, but the rationale here
>> is that we tolerate that for x86 and ppc, so it should not really
>> matter for Arm either. On x86 and ppc you don't need to modify
>> meson.build just to run DTS on a development machine.
>
> What modification do you need for a development machine? I suppose
> "meson setup build -Dplatform=generic" will generate a binary that can
> run on your development machine.

I didn't describe the situation well. I use DPDK Test Suite for
testing, and it determines what flags are passed to Meson. You need to
modify DPDK's meson.build or DTS to get it built.

Regards,
Akihiko Odaki
RE: [EXT] [PATCH] crypto/uadk: set queue pair in dev_configure
> By default, uadk only allocates two queues for each algorithm, which
> will impact performance.
> Set the queue pair number as required in dev_configure.
> The default max queue pair number is 8, which can be modified
> via the parameter max_nb_queue_pairs.

Please add documentation for the newly added devarg in uadk.rst.

> Example:
> sudo dpdk-test-crypto-perf -l 0-10 --vdev crypto_uadk,max_nb_queue_pairs=10
>      -- --devtype crypto_uadk --optype cipher-only --buffer-sz 8192
>
>  lcore id   Buf Size   Burst Size   Gbps      Cycles/Buf
>     3        8192         32        7.5226     871.19
>     7        8192         32        7.5225     871.20
>     1        8192         32        7.5225     871.20
>     4        8192         32        7.5224     871.21
>     5        8192         32        7.5224     871.21
>    10        8192         32        7.5223     871.22
>     9        8192         32        7.5223     871.23
>     2        8192         32        7.5222     871.23
>     8        8192         32        7.5222     871.23
>     6        8192         32        7.5218     871.28

No need to mention the above test result in the patch description.

> Signed-off-by: Zhangfei Gao
> ---
>  drivers/crypto/uadk/uadk_crypto_pmd.c         | 19 +++++++++++++++--
>  drivers/crypto/uadk/uadk_crypto_pmd_private.h |  1 +
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/uadk/uadk_crypto_pmd.c b/drivers/crypto/uadk/uadk_crypto_pmd.c
> index 4f729e0f07..34aae99342 100644
> --- a/drivers/crypto/uadk/uadk_crypto_pmd.c
> +++ b/drivers/crypto/uadk/uadk_crypto_pmd.c
> @@ -357,8 +357,15 @@ static const struct rte_cryptodev_capabilities uadk_crypto_v2_capabilities[] = {
>  /* Configure device */
>  static int
>  uadk_crypto_pmd_config(struct rte_cryptodev *dev __rte_unused,
> -		       struct rte_cryptodev_config *config __rte_unused)
> +		       struct rte_cryptodev_config *config)
>  {
> +	char env[128];
> +
> +	/* set queue pairs num via env */
> +	sprintf(env, "sync:%d@0", config->nb_queue_pairs);
> +	setenv("WD_CIPHER_CTX_NUM", env, 1);
> +	setenv("WD_DIGEST_CTX_NUM", env, 1);
> +

Who is the intended user of this environment variable?
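As a reply to the question above, here is a standalone sketch of what the hunk does: build the context-number string "sync:&lt;n&gt;@0" and export it through the environment for the UADK library to consume when it creates the queues. It uses a bounded snprintf() rather than the patch's sprintf(); the helper name is illustrative and not part of the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build the context-number string and publish it in the environment.
 * snprintf() bounds the write into the fixed-size buffer. */
static int
set_ctx_num_env(unsigned int nb_queue_pairs)
{
	char env[128];

	if (snprintf(env, sizeof(env), "sync:%u@0", nb_queue_pairs) < 0)
		return -1;
	if (setenv("WD_CIPHER_CTX_NUM", env, 1) != 0)
		return -1;
	return setenv("WD_DIGEST_CTX_NUM", env, 1);
}
```

Calling `set_ctx_num_env(8)` leaves "sync:8@0" in both variables for the library that reads them.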
> 	return 0;
>  }
>
> @@ -434,7 +441,7 @@ uadk_crypto_pmd_info_get(struct rte_cryptodev *dev,
>  	if (dev_info != NULL) {
>  		dev_info->driver_id = dev->driver_id;
>  		dev_info->driver_name = dev->device->driver->name;
> -		dev_info->max_nb_queue_pairs = 128;
> +		dev_info->max_nb_queue_pairs = priv->max_nb_qpairs;
>  		/* No limit of number of sessions */
>  		dev_info->sym.max_nb_sessions = 0;
>  		dev_info->feature_flags = dev->feature_flags;
> @@ -1015,6 +1022,7 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  	struct uadk_crypto_priv *priv;
>  	struct rte_cryptodev *dev;
>  	struct uacce_dev *udev;
> +	const char *input_args;
>  	const char *name;
>
>  	udev = wd_get_accel_dev("cipher");
> @@ -1030,6 +1038,9 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  	if (name == NULL)
>  		return -EINVAL;
>
> +	input_args = rte_vdev_device_args(vdev);
> +	rte_cryptodev_pmd_parse_input_args(&init_params, input_args);
> +
>  	dev = rte_cryptodev_pmd_create(name, &vdev->device, &init_params);
>  	if (dev == NULL) {
>  		UADK_LOG(ERR, "driver %s: create failed", init_params.name);
> @@ -1044,6 +1055,7 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  		   RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO;
>  	priv = dev->data->dev_private;
>  	priv->version = version;
> +	priv->max_nb_qpairs = init_params.max_nb_queue_pairs;

Is the user free to give any number as max? Do you want to add a check
here? You should also document the allowed max and min values.
>
>  	rte_cryptodev_pmd_probing_finish(dev);
>
> @@ -1078,4 +1090,7 @@ static struct cryptodev_driver uadk_crypto_drv;
>  RTE_PMD_REGISTER_VDEV(UADK_CRYPTO_DRIVER_NAME, uadk_crypto_pmd);
>  RTE_PMD_REGISTER_CRYPTO_DRIVER(uadk_crypto_drv, uadk_crypto_pmd.driver,
>  		uadk_cryptodev_driver_id);
> +RTE_PMD_REGISTER_PARAM_STRING(UADK_CRYPTO_DRIVER_NAME,
> +	"max_nb_queue_pairs= "
> +	"socket_id=");
>  RTE_LOG_REGISTER_DEFAULT(uadk_crypto_logtype, INFO);
> diff --git a/drivers/crypto/uadk/uadk_crypto_pmd_private.h b/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> index 9075f0f058..5a7dbff117 100644
> --- a/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> +++ b/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> @@ -67,6 +67,7 @@ struct uadk_crypto_priv {
>  	bool env_cipher_init;
>  	bool env_auth_init;
>  	enum uadk_crypto_version version;
> +	unsigned int max_nb
RE: [PATCH 2/2] config/arm: Enable NUMA for generic Arm build
> -----Original Message-----
> From: Akihiko Odaki
> Sent: Friday, April 14, 2023 8:42 PM
> To: Ruifeng Wang ; Bruce Richardson
> Cc: dev@dpdk.org; Akihiko Odaki
> Subject: [PATCH 2/2] config/arm: Enable NUMA for generic Arm build
>
> We enable NUMA even if the presence of NUMA is unknown for the other
> architectures. Enable NUMA for the generic Arm build too.
>
> Signed-off-by: Akihiko Odaki
> ---
>  config/arm/meson.build | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 724c00ad7e..f8ee7cdafb 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -271,13 +271,15 @@ implementers = {
>  soc_generic = {
>      'description': 'Generic un-optimized build for armv8 aarch64 exec mode',
>      'implementer': 'generic',
> -    'part_number': 'generic'
> +    'part_number': 'generic',
> +    'numa': true

The default value of numa is true, so there is no need to add it here:

    if not soc_config.get('numa', true)
        has_libnuma = 0
    endif

>  }
>
>  soc_generic_aarch32 = {
>      'description': 'Generic un-optimized build for armv8 aarch32 exec mode',
>      'implementer': 'generic',
> -    'part_number': 'generic_aarch32'
> +    'part_number': 'generic_aarch32',
> +    'numa': true
>  }
>
>  soc_armada = {
> --
> 2.40.0
RE: [RFC] lib: set/get max memzone segments
Devendra Singh Rawat, Alok Prasad - can you please give your feedback
on the qede driver updates?

> -----Original Message-----
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally
> hard-coded as 2560. For applications requiring different values of
> this parameter, it is more convenient to set the max value via an rte
> API rather than changing the DPDK source code per application. In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all. With this option there
> is no need to recompile DPDK, and it allows using an in-box packaged
> DPDK.
>
> An example usage for updating RTE_MAX_MEMZONE would be an application
> that uses the DPDK mempool library, which is based on the DPDK memzone
> library. The application may need to create a number of steering
> tables, each of which will require its own mempool allocation.
>
> This commit is not about how to optimize the application usage of
> mempool, nor about how to improve the mempool implementation based on
> memzone. It is about how to make the max memzone definition run-time
> customized.
>
> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max). If not called, the default memzone count
> (RTE_MAX_MEMZONE) is used. There is also an API to query the effective
> max memzone count: rte_memzone_max_get().
>
> Signed-off-by: Ophir Munk
> ---
>  app/test/test_func_reentrancy.c     |  2 +-
>  app/test/test_malloc_perf.c         |  2 +-
>  app/test/test_memzone.c             |  2 +-
>  config/rte_config.h                 |  1 -
>  drivers/net/qede/base/bcm_osal.c    | 26 +-
>  drivers/net/qede/base/bcm_osal.h    |  3 +++
>  drivers/net/qede/qede_main.c        |  7 +++
>  lib/eal/common/eal_common_memzone.c | 28
[PATCH] common/idpf: remove device stop flag
Remove the device stop flag, as we already have dev->data->dev_started.
This also fixes an issue where closing a port directly, without
starting it first, caused error messages to be reported in dev_stop.

Fixes: 14aa6ed8f2ec ("net/idpf: support device start and stop")
Fixes: 1082a773a86b ("common/idpf: add vport structure")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang
---
 drivers/common/idpf/idpf_common_device.h | 2 --
 drivers/net/cpfl/cpfl_ethdev.c           | 6 +-----
 drivers/net/idpf/idpf_ethdev.c           | 6 +-----
 3 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/common/idpf/idpf_common_device.h
index c2dc2f16b9..7a54f7c937 100644
--- a/drivers/common/idpf/idpf_common_device.h
+++ b/drivers/common/idpf/idpf_common_device.h
@@ -110,8 +110,6 @@ struct idpf_vport {
 	uint16_t devarg_id;
 
-	bool stopped;
-
 	bool rx_vec_allowed;
 	bool tx_vec_allowed;
 	bool rx_use_avx512;
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c
index ede730fd50..f1d4425ce2 100644
--- a/drivers/net/cpfl/cpfl_ethdev.c
+++ b/drivers/net/cpfl/cpfl_ethdev.c
@@ -798,8 +798,6 @@ cpfl_dev_start(struct rte_eth_dev *dev)
 	if (cpfl_dev_stats_reset(dev))
 		PMD_DRV_LOG(ERR, "Failed to reset stats");
 
-	vport->stopped = 0;
-
 	return 0;
 
 err_vport:
@@ -817,7 +815,7 @@ cpfl_dev_stop(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
 
-	if (vport->stopped == 1)
+	if (dev->data->dev_started == 0)
 		return 0;
 
 	idpf_vc_vport_ena_dis(vport, false);
@@ -828,8 +826,6 @@ cpfl_dev_stop(struct rte_eth_dev *dev)
 
 	idpf_vc_vectors_dealloc(vport);
 
-	vport->stopped = 1;
-
 	return 0;
 }
diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c
index e02ec2ec5a..e01eb3a2ec 100644
--- a/drivers/net/idpf/idpf_ethdev.c
+++ b/drivers/net/idpf/idpf_ethdev.c
@@ -792,8 +792,6 @@ idpf_dev_start(struct rte_eth_dev *dev)
 	if (idpf_dev_stats_reset(dev))
 		PMD_DRV_LOG(ERR, "Failed to reset stats");
 
-	vport->stopped = 0;
-
 	return 0;
 
 err_vport:
@@ -811,7 +809,7 @@ idpf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
 
-	if (vport->stopped == 1)
+	if (dev->data->dev_started == 0)
 		return 0;
 
 	idpf_vc_vport_ena_dis(vport, false);
@@ -822,8 +820,6 @@ idpf_dev_stop(struct rte_eth_dev *dev)
 
 	idpf_vc_vectors_dealloc(vport);
 
-	vport->stopped = 1;
-
 	return 0;
 }
-- 
2.31.1
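The guard the patch switches to can be sketched standalone: rte_eth_dev_stop() in the ethdev layer clears dev_started only after the PMD stop callback returns, so inside the callback dev_started == 0 means the port was never started (the close-without-start case) and teardown can be skipped without errors. The structures below are illustrative mocks, not the real DPDK ones.

```c
#include <stdint.h>

/* Mock of the per-port data the ethdev layer maintains. */
struct mock_dev_data {
	uint8_t dev_started;
};

static int teardown_calls;

/* Mock stop callback: skip teardown when the port never started. */
static int
mock_dev_stop(struct mock_dev_data *data)
{
	if (data->dev_started == 0)
		return 0;	/* nothing was set up, nothing to undo */
	teardown_calls++;	/* disable vport, free queues and vectors */
	return 0;
}
```

With this, a private `stopped` flag that has to be kept in sync by every driver becomes unnecessary.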
Re: [RFC] lib: set/get max memzone segments
19/04/2023 16:51, Tyler Retzlaff:
> On Wed, Apr 19, 2023 at 11:36:34AM +0300, Ophir Munk wrote:
> > In current DPDK the RTE_MAX_MEMZONE definition is unconditionally
> > hard-coded as 2560. For applications requiring different values of
> > this parameter, it is more convenient to set the max value via an
> > rte API rather than changing the DPDK source code per application.
> > In many organizations, the possibility to compile a private DPDK
> > library for a particular application does not exist at all. With
> > this option there is no need to recompile DPDK, and it allows using
> > an in-box packaged DPDK.
> >
> > An example usage for updating RTE_MAX_MEMZONE would be an
> > application that uses the DPDK mempool library, which is based on
> > the DPDK memzone library. The application may need to create a
> > number of steering tables, each of which will require its own
> > mempool allocation.
> >
> > This commit is not about how to optimize the application usage of
> > mempool, nor about how to improve the mempool implementation based
> > on memzone. It is about how to make the max memzone definition
> > run-time customized.
> >
> > This commit adds an API which must be called before rte_eal_init():
> > rte_memzone_max_set(int max). If not called, the default memzone
> > count (RTE_MAX_MEMZONE) is used. There is also an API to query the
> > effective max memzone count: rte_memzone_max_get().
> >
> > Signed-off-by: Ophir Munk
> > ---
>
> the use case that each application may want a different non-hard-coded
> value makes sense.
>
> it's less clear to me that requiring it be called before eal init
> makes sense over just providing it as configuration to eal init so
> that it is composed.

Why do you think it would be better as an EAL init option?
From an API perspective, I think it is simpler to call a dedicated
function. And I don't think a user wants to deal with it when starting
the application.

> can you elaborate further on why you need get if you have a one-shot
> set? why would the application not know the value if you can only ever
> call it once before init?

The "get" function is used in this patch by the test and the qede
driver. The application could use it as well, especially to query the
default value.
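The one-shot semantics being discussed can be modelled in a few lines. This is only a mock of the proposed behaviour: the setter is rejected once EAL is initialized (modelled here by a flag), and the getter returns the effective value, which is the compiled-in default when the setter was never called. Names and signatures are illustrative, not the final upstream API.

```c
#define RTE_MAX_MEMZONE 2560	/* compiled-in default from the RFC */

static int memzone_max = RTE_MAX_MEMZONE;
static int eal_initialized;	/* stand-in for "rte_eal_init() ran" */

/* One-shot setter: only honoured before init and for sane values. */
static int
memzone_max_set(int max)
{
	if (eal_initialized || max <= 0)
		return -1;	/* too late, or invalid value */
	memzone_max = max;
	return 0;
}

/* Query the effective maximum (default or overridden). */
static int
memzone_max_get(void)
{
	return memzone_max;
}
```

A driver such as qede can size its own arrays from `memzone_max_get()` instead of the compile-time constant.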
[PATCH] app/dma-perf: introduce dma-perf application
There are many high-performance DMA devices supported in DPDK now, and
these DMA devices can also be integrated into other modules of DPDK as
accelerators, such as Vhost. Before integrating DMA into applications,
developers need to know the performance of these DMA devices in various
scenarios and the performance of CPUs in the same scenarios, such as
with different buffer lengths. Only in this way can we know the target
performance of the application accelerated by using them. This patch
introduces a high-performance testing tool, which supports comparing
the performance of CPU and DMA in different scenarios automatically
with a pre-set config file. Memory copy performance tests are supported
for now.

Signed-off-by: Cheng Jiang
Signed-off-by: Jiayu Hu
Signed-off-by: Yuan Wang
Acked-by: Morten Brørup
---
 app/meson.build               |   1 +
 app/test-dma-perf/benchmark.c | 467 ++++++++++++++++++++++++++++++++++
 app/test-dma-perf/config.ini  |  56 ++++
 app/test-dma-perf/main.c      | 445 ++++++++++++++++++++++++++++++++
 app/test-dma-perf/main.h      |  56 ++++
 app/test-dma-perf/meson.build |  17 ++
 6 files changed, 1042 insertions(+)
 create mode 100644 app/test-dma-perf/benchmark.c
 create mode 100644 app/test-dma-perf/config.ini
 create mode 100644 app/test-dma-perf/main.c
 create mode 100644 app/test-dma-perf/main.h
 create mode 100644 app/test-dma-perf/meson.build

diff --git a/app/meson.build b/app/meson.build
index e32ea4bd5c..514cb2f7b2 100644
--- a/app/meson.build
+++ b/app/meson.build
@@ -19,6 +19,7 @@ apps = [
         'test-cmdline',
         'test-compress-perf',
         'test-crypto-perf',
+        'test-dma-perf',
         'test-eventdev',
         'test-fib',
         'test-flow-perf',
diff --git a/app/test-dma-perf/benchmark.c b/app/test-dma-perf/benchmark.c
new file mode 100644
index 00..36e3413bdc
--- /dev/null
+++ b/app/test-dma-perf/benchmark.c
@@ -0,0 +1,467 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Intel Corporation
+ */
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#include "main.h"
+
+#define MAX_DMA_CPL_NB 255
+
+#define TEST_WAIT_U_SECOND 1
+
+#define CSV_LINE_DMA_FMT "Scenario %u,%u,%u,%u,%u,%u,%" PRIu64 ",%.3lf,%.3lf\n"
+#define CSV_LINE_CPU_FMT "Scenario %u,%u,NA,%u,%u,%u,%" PRIu64 ",%.3lf,%.3lf\n"
+
+struct worker_info {
+	bool ready_flag;
+	bool start_flag;
+	bool stop_flag;
+	uint32_t total_cpl;
+	uint32_t test_cpl;
+};
+
+struct lcore_params {
+	uint8_t scenario_id;
+	unsigned int lcore_id;
+	uint16_t worker_id;
+	uint16_t dev_id;
+	uint32_t nr_buf;
+	uint16_t kick_batch;
+	uint32_t buf_size;
+	uint16_t test_secs;
+	struct rte_mbuf **srcs;
+	struct rte_mbuf **dsts;
+	struct worker_info worker_info;
+};
+
+static struct rte_mempool *src_pool;
+static struct rte_mempool *dst_pool;
+
+static volatile struct lcore_params *worker_params[MAX_WORKER_NB];
+
+uint16_t dmadev_ids[MAX_WORKER_NB];
+uint32_t nb_dmadevs;
+
+#define PRINT_ERR(...) print_err(__func__, __LINE__, __VA_ARGS__)
+
+static inline int
+__rte_format_printf(3, 4)
+print_err(const char *func, int lineno, const char *format, ...)
+{
+	va_list ap;
+	int ret;
+
+	ret = fprintf(stderr, "In %s:%d - ", func, lineno);
+	va_start(ap, format);
+	ret += vfprintf(stderr, format, ap);
+	va_end(ap);
+
+	return ret;
+}
+
+static inline void
+calc_result(uint32_t buf_size, uint32_t nr_buf, uint16_t nb_workers, uint16_t test_secs,
+		uint32_t total_cnt, uint32_t *memory, uint32_t *ave_cycle,
+		float *bandwidth, float *mops)
+{
+	*memory = (buf_size * (nr_buf / nb_workers) * 2) / (1024 * 1024);
+	*ave_cycle = test_secs * rte_get_timer_hz() / total_cnt;
+	*bandwidth = (buf_size * 8 * (rte_get_timer_hz() / (float)*ave_cycle)) / 1000000000;
+	*mops = (float)rte_get_timer_hz() / *ave_cycle / 1000000;
+}
+
+static void
+output_result(uint8_t scenario_id, uint32_t lcore_id, uint16_t dev_id, uint64_t ave_cycle,
+		uint32_t buf_size, uint32_t nr_buf, uint32_t memory,
+		float bandwidth, float mops, bool is_dma)
+{
+	if (is_dma)
+		printf("lcore %u, DMA %u:\n", lcore_id, dev_id);
+	else
+		printf("lcore %u\n", lcore_id);
+
+	printf("average cycles/op: %" PRIu64 ", buffer size: %u, nr_buf: %u, memory: %uMB, frequency: %" PRIu64 ".\n",
+			ave_cycle, buf_size, nr_buf, memory, rte_get_timer_hz());
+	printf("Average bandwidth: %.3lfGbps, MOps: %.3lf\n", bandwidth, mops);
+
+	if (is_dma)
+		snprintf(output_str[lcore_id], MAX_OUTPUT_STR_LEN, CSV_LINE_DMA_FMT,
+			scenario_id, lcore_id, dev_id, buf_size,
+			nr_buf, memory, ave_cycle, bandwidth, mops);
+
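The arithmetic in calc_result() can be checked standalone: given the TSC frequency, the test duration and the number of completed copies, derive cycles per copy and the resulting bandwidth. The 1e9 divisor for Gbps is an assumption matching the units the tool prints.

```c
#include <stdint.h>

/* Cycles consumed per completed copy over the whole test window. */
static uint64_t
cycles_per_op(uint64_t timer_hz, uint16_t test_secs, uint32_t total_cnt)
{
	return (uint64_t)test_secs * timer_hz / total_cnt;
}

/* Copy bandwidth in Gbps for a given buffer size and cycles/op. */
static double
bandwidth_gbps(uint32_t buf_size, uint64_t timer_hz, uint64_t ave_cycle)
{
	double ops_per_sec = (double)timer_hz / (double)ave_cycle;

	return (double)buf_size * 8.0 * ops_per_sec / 1e9;
}
```

For example, 1,000,000 copies of 8192 bytes in one second on a 2 GHz timer gives 2000 cycles/op and about 65.5 Gbps.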
[PATCH] common/idpf: remove unnecessary field in vport
Remove the pointer to the rte_eth_dev instance, as:

1. there is already a pointer to rte_eth_dev_data.
2. a pointer to rte_eth_dev will break multi-process usage.

Signed-off-by: Qi Zhang
---
 drivers/common/idpf/idpf_common_device.h | 1 -
 drivers/net/cpfl/cpfl_ethdev.c           | 4 ++--
 drivers/net/idpf/idpf_ethdev.c           | 4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/common/idpf/idpf_common_device.h
index 7a54f7c937..d29bcc71ab 100644
--- a/drivers/common/idpf/idpf_common_device.h
+++ b/drivers/common/idpf/idpf_common_device.h
@@ -117,7 +117,6 @@ struct idpf_vport {
 
 	struct virtchnl2_vport_stats eth_stats_offset;
 
-	void *dev;
 	/* Event from ipf */
 	bool link_up;
 	uint32_t link_speed;
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c
index f1d4425ce2..680c2326ec 100644
--- a/drivers/net/cpfl/cpfl_ethdev.c
+++ b/drivers/net/cpfl/cpfl_ethdev.c
@@ -1061,7 +1061,8 @@ static void
 cpfl_handle_event_msg(struct idpf_vport *vport, uint8_t *msg, uint16_t msglen)
 {
 	struct virtchnl2_event *vc_event = (struct virtchnl2_event *)msg;
-	struct rte_eth_dev *dev = (struct rte_eth_dev *)vport->dev;
+	struct rte_eth_dev_data *data = vport->dev_data;
+	struct rte_eth_dev *dev = &rte_eth_devices[data->port_id];
 
 	if (msglen < sizeof(struct virtchnl2_event)) {
 		PMD_DRV_LOG(ERR, "Error event");
@@ -1245,7 +1246,6 @@ cpfl_dev_vport_init(struct rte_eth_dev *dev, void *init_params)
 	vport->adapter = &adapter->base;
 	vport->sw_idx = param->idx;
 	vport->devarg_id = param->devarg_id;
-	vport->dev = dev;
 
 	memset(&create_vport_info, 0, sizeof(create_vport_info));
 	ret = idpf_vport_info_init(vport, &create_vport_info);
diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c
index e01eb3a2ec..38ad4e7ac0 100644
--- a/drivers/net/idpf/idpf_ethdev.c
+++ b/drivers/net/idpf/idpf_ethdev.c
@@ -1024,7 +1024,8 @@ static void
 idpf_handle_event_msg(struct idpf_vport *vport, uint8_t *msg, uint16_t msglen)
 {
 	struct virtchnl2_event *vc_event = (struct virtchnl2_event *)msg;
-	struct rte_eth_dev *dev = (struct rte_eth_dev *)vport->dev;
+	struct rte_eth_dev_data *data = vport->dev_data;
+	struct rte_eth_dev *dev = &rte_eth_devices[data->port_id];
 
 	if (msglen < sizeof(struct virtchnl2_event)) {
 		PMD_DRV_LOG(ERR, "Error event");
@@ -1235,7 +1236,6 @@ idpf_dev_vport_init(struct rte_eth_dev *dev, void *init_params)
 	vport->adapter = &adapter->base;
 	vport->sw_idx = param->idx;
 	vport->devarg_id = param->devarg_id;
-	vport->dev = dev;
 
 	memset(&create_vport_info, 0, sizeof(create_vport_info));
 	ret = idpf_vport_info_init(vport, &create_vport_info);
-- 
2.31.1
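The multi-process rationale can be sketched standalone: dev_data lives in shared memory, while the rte_eth_devices[] array is mapped per process, so a device pointer cached by the primary process would be stale in a secondary one. Indexing the local array by port_id, as the patch does, is valid in every process. All names below are illustrative mocks.

```c
#include <stdint.h>

#define MOCK_MAX_PORTS 8

/* Shared, process-independent per-port data (holds the port_id). */
struct mock_eth_dev_data { uint16_t port_id; };

/* Per-process device table, analogous to rte_eth_devices[]. */
struct mock_eth_dev { struct mock_eth_dev_data *data; };

static struct mock_eth_dev mock_eth_devices[MOCK_MAX_PORTS];

/* Resolve the local device from shared data via port_id. */
static struct mock_eth_dev *
dev_from_data(const struct mock_eth_dev_data *data)
{
	return &mock_eth_devices[data->port_id];
}
```

Each process resolves its own `mock_eth_devices[port_id]` entry, so no raw pointer ever crosses the process boundary.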
[PATCH] net/mlx5: add timestamp ascending order error statistics
The ConnectX NICs support packet send scheduling at a specified moment
of time. An application can set the desired timestamp value in the
dynamic mbuf field, and the driver will push a special WAIT WQE to the
hardware queue in order to suspend the entire queue operations till the
specified time moment, then the PMD pushes the regular WQE for packet
sending. In the following packets the scheduling can be requested
again, with different timestamps, and the driver pushes WAIT WQEs
accordingly.

The timestamps should be provided by the application in ascending order
as packets are queued to the hardware queue, otherwise the hardware
would not be able to perform scheduling correctly - it handles the WAIT
WQEs in the order they were pushed, there is no reordering, neither in
the PMD nor in the NIC, and, obviously, regular hardware can't work as
a time machine and wait for some already-elapsed moment in the past.

Signed-off-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_tx.h   |  5 +++++
 drivers/net/mlx5/mlx5_txpp.c | 12 +++++++++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9eae692037..e03f1f6385 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1186,6 +1186,7 @@ struct mlx5_dev_txpp {
 	uint64_t err_clock_queue; /* Clock Queue errors. */
 	uint64_t err_ts_past; /* Timestamp in the past. */
 	uint64_t err_ts_future; /* Timestamp in the distant future. */
+	uint64_t err_ts_order; /* Timestamp not in ascending order. */
 };
 
 /* Sample ID information of eCPRI flex parser structure. */
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index d0c6303a2d..cc8f7e98aa 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -162,6 +162,7 @@ struct mlx5_txq_data {
 	uint16_t idx; /* Queue index. */
 	uint64_t rt_timemask; /* Scheduling timestamp mask. */
 	uint64_t ts_mask; /* Timestamp flag dynamic mask. */
+	uint64_t ts_last; /* Last scheduled timestamp. */
 	int32_t ts_offset; /* Timestamp field dynamic offset. */
 	struct mlx5_dev_ctx_shared *sh; /* Shared context. */
 	struct mlx5_txq_stats stats; /* TX queue counters. */
@@ -1682,6 +1683,10 @@ mlx5_tx_schedule_send(struct mlx5_txq_data *restrict txq,
 		return MLX5_TXCMP_CODE_EXIT;
 	/* Convert the timestamp into completion to wait. */
 	ts = *RTE_MBUF_DYNFIELD(loc->mbuf, txq->ts_offset, uint64_t *);
+	if (txq->ts_last && ts < txq->ts_last)
+		__atomic_fetch_add(&txq->sh->txpp.err_ts_order,
+				   1, __ATOMIC_RELAXED);
+	txq->ts_last = ts;
 	wqe = txq->wqes + (txq->wqe_ci & txq->wqe_m);
 	sh = txq->sh;
 	if (txq->wait_on_time) {
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 0e1da1d5f5..5a5df2d1bb 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -29,6 +29,7 @@ static const char * const mlx5_txpp_stat_names[] = {
 	"tx_pp_clock_queue_errors", /* Clock Queue errors. */
 	"tx_pp_timestamp_past_errors", /* Timestamp in the past. */
 	"tx_pp_timestamp_future_errors", /* Timestamp in the distant future. */
+	"tx_pp_timestamp_order_errors", /* Timestamp not in ascending order. */
 	"tx_pp_jitter", /* Timestamp jitter (one Clock Queue completion). */
 	"tx_pp_wander", /* Timestamp wander (half of Clock Queue CQEs). */
 	"tx_pp_sync_lost", /* Scheduling synchronization lost. */
@@ -758,6 +759,7 @@ mlx5_txpp_start_service(struct mlx5_dev_ctx_shared *sh)
 	sh->txpp.err_clock_queue = 0;
 	sh->txpp.err_ts_past = 0;
 	sh->txpp.err_ts_future = 0;
+	sh->txpp.err_ts_order = 0;
 	/* Attach interrupt handler to process Rearm Queue completions. */
 	fd = mlx5_os_get_devx_channel_fd(sh->txpp.echan);
 	ret = mlx5_os_set_nonblock_channel_fd(fd);
@@ -1034,6 +1036,7 @@ int mlx5_txpp_xstats_reset(struct rte_eth_dev *dev)
 	__atomic_store_n(&sh->txpp.err_clock_queue, 0, __ATOMIC_RELAXED);
 	__atomic_store_n(&sh->txpp.err_ts_past, 0, __ATOMIC_RELAXED);
 	__atomic_store_n(&sh->txpp.err_ts_future, 0, __ATOMIC_RELAXED);
+	__atomic_store_n(&sh->txpp.err_ts_order, 0, __ATOMIC_RELAXED);
 
 	return 0;
 }
@@ -1221,9 +1224,12 @@ mlx5_txpp_xstats_get(struct rte_eth_dev *dev,
 		stats[n_used + 4].value =
 			__atomic_load_n(&sh->txpp.err_ts_future,
 					__ATOMIC_RELAXED);
-		stats[n_used + 5].value = mlx5_txpp_xstats_jitter(&sh->txpp);
-		stats[n_used + 6].value = mlx5_txpp_xstats_wander(&sh->txpp);
-		stats[n_used + 7].value = sh->txpp.sync_lost;
+		stats[n_used + 5].value =
+			__
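The ascending-order check added in mlx5_tx_schedule_send() can be sketched standalone: a timestamp lower than the previously queued one cannot be honoured (WAIT WQEs are consumed in order), so it is only counted in a relaxed atomic error statistic rather than rejected. The names below are illustrative, not the driver's.

```c
#include <stdint.h>
#include <stdatomic.h>

static _Atomic uint64_t err_ts_order;	/* out-of-order counter */
static uint64_t ts_last;		/* last scheduled timestamp */

/* Count a requested timestamp that went backwards, then remember it. */
static void
track_tx_timestamp(uint64_t ts)
{
	if (ts_last != 0 && ts < ts_last)
		atomic_fetch_add_explicit(&err_ts_order, 1,
					  memory_order_relaxed);
	ts_last = ts;
}
```

A relaxed increment is enough here because the counter is purely statistical; no ordering with the WQE writes is required.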
[RFC 0/2] ethdev: extend modify field API
This patch-set extends the modify field action API to support 2 special
cases:

1. Modify field when the relevant header appears multiple times inside
   the same encapsulation level.
2. Modify the Geneve option header which is specified by its "type" and
   "class" fields.

In the current API, the header type is provided by the
"rte_flow_field_id" enumeration and the encapsulation level
(inner/outer/tunnel) is specified by the "data.level" field. However,
there is no way to specify a particular header inside an encapsulation
level. For example, for this packet:

  eth / mpls / mpls / mpls / ipv4 / udp

neither the second nor the third MPLS header can be modified using this
API.

Michael Baum (2):
  ethdev: add GENEVE TLV option modification support
  ethdev: add MPLS header modification support

 app/test-pmd/cmdline_flow.c        | 69 +++++++++++++++++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst | 33 ++++++++++-----
 lib/ethdev/rte_flow.h              | 72 ++++++++++++++++++++++++++++--
 3 files changed, 165 insertions(+), 9 deletions(-)

-- 
2.25.1
[RFC 1/2] ethdev: add GENEVE TLV option modification support
Add modify field support for GENEVE option fields:
- "RTE_FLOW_FIELD_GENEVE_OPT_TYPE"
- "RTE_FLOW_FIELD_GENEVE_OPT_CLASS"
- "RTE_FLOW_FIELD_GENEVE_OPT_DATA"

Each GENEVE TLV option is identified by both its "class" and "type", so
2 new fields were added to the "rte_flow_action_modify_data" structure
to help specify which option to modify.

To make room for those 2 new fields, the "level" field moved to
"uint8_t", which is more than enough for the encapsulation level.

Signed-off-by: Michael Baum
---
 app/test-pmd/cmdline_flow.c        | 47 +++++++++++++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst | 27 ++++++++++-----
 lib/ethdev/rte_flow.h              | 51 +++++++++++++++++++++++++-
 3 files changed, 118 insertions(+), 7 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 58939ec321..db8bd30cb1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -636,11 +636,15 @@ enum index {
 	ACTION_MODIFY_FIELD_DST_TYPE_VALUE,
 	ACTION_MODIFY_FIELD_DST_LEVEL,
 	ACTION_MODIFY_FIELD_DST_LEVEL_VALUE,
+	ACTION_MODIFY_FIELD_DST_TYPE_ID,
+	ACTION_MODIFY_FIELD_DST_CLASS_ID,
 	ACTION_MODIFY_FIELD_DST_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_TYPE,
 	ACTION_MODIFY_FIELD_SRC_TYPE_VALUE,
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
 	ACTION_MODIFY_FIELD_SRC_LEVEL_VALUE,
+	ACTION_MODIFY_FIELD_SRC_TYPE_ID,
+	ACTION_MODIFY_FIELD_SRC_CLASS_ID,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
 	ACTION_MODIFY_FIELD_SRC_POINTER,
@@ -854,7 +858,8 @@ static const char *const modify_field_ids[] = {
 	"ipv4_ecn", "ipv6_ecn", "gtp_psc_qfi", "meter_color",
 	"ipv6_proto",
 	"flex_item",
-	"hash_result", NULL
+	"hash_result",
+	"geneve_opt_type", "geneve_opt_class", "geneve_opt_data", NULL
 };
 
 static const char *const meter_colors[] = {
@@ -2295,6 +2300,8 @@ static const enum index next_action_sample[] = {
 static const enum index action_modify_field_dst[] = {
 	ACTION_MODIFY_FIELD_DST_LEVEL,
+	ACTION_MODIFY_FIELD_DST_TYPE_ID,
+	ACTION_MODIFY_FIELD_DST_CLASS_ID,
 	ACTION_MODIFY_FIELD_DST_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_TYPE,
 	ZERO,
@@ -2302,6 +2309,8 @@ static const enum index action_modify_field_dst[] = {
 static const enum index action_modify_field_src[] = {
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
+	ACTION_MODIFY_FIELD_SRC_TYPE_ID,
+	ACTION_MODIFY_FIELD_SRC_CLASS_ID,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
 	ACTION_MODIFY_FIELD_SRC_POINTER,
@@ -6388,6 +6397,24 @@ static const struct token token_list[] = {
 		.call = parse_vc_modify_field_level,
 		.comp = comp_none,
 	},
+	[ACTION_MODIFY_FIELD_DST_TYPE_ID] = {
+		.name = "dst_type_id",
+		.help = "destination field type ID",
+		.next = NEXT(action_modify_field_dst,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					dst.type)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_MODIFY_FIELD_DST_CLASS_ID] = {
+		.name = "dst_class",
+		.help = "destination field class ID",
+		.next = NEXT(action_modify_field_dst,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					dst.class_id)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_MODIFY_FIELD_DST_OFFSET] = {
 		.name = "dst_offset",
 		.help = "destination field bit offset",
@@ -6423,6 +6450,24 @@ static const struct token token_list[] = {
 		.call = parse_vc_modify_field_level,
 		.comp = comp_none,
 	},
+	[ACTION_MODIFY_FIELD_SRC_TYPE_ID] = {
+		.name = "src_type_id",
+		.help = "source field type ID",
+		.next = NEXT(action_modify_field_src,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					src.type)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_MODIFY_FIELD_SRC_CLASS_ID] = {
+		.name = "src_class",
+		.help = "source field class ID",
+		.next = NEXT(action_modify_field_src,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					src.class_id)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_MODIFY_FIELD_SRC_OFFSET] = {
 		.name = "src_offset",
 		.help = "source field bit offset",
diff --git a/doc/guides/prog_gu
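The descriptor change this RFC describes can be illustrated with a plain-C sketch: "level" shrinks to uint8_t, making room for the Geneve option selectors "type" and "class_id". The field names follow the commit message; the exact layout and types in rte_flow.h may differ.

```c
#include <stdint.h>

/* Illustrative mock of the extended modify-field descriptor. */
struct mock_modify_data {
	int field;		/* e.g. a RTE_FLOW_FIELD_GENEVE_OPT_* id */
	uint8_t level;		/* encapsulation level, was wider before */
	uint8_t type;		/* Geneve option type selector (new) */
	uint16_t class_id;	/* Geneve option class selector (new) */
	uint32_t offset;	/* bit offset inside the selected field */
};
```

With both selectors present, an action can pick one specific TLV option among several carried by the same Geneve header.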
[RFC 2/2] ethdev: add MPLS header modification support
Add support for MPLS modify header using "RTE_FLOW_FIELD_MPLS" id. Since MPLS heaser might appear more the one time in inner/outer/tunnel, a new field was added to "rte_flow_action_modify_data" structure in addition to "level" field. The "sub_level" field is the index of the header inside encapsulation level. It is used for modify multiple MPLS headers in same encapsulation level. This addition enables to modify multiple VLAN headers too, so the description of "RTE_FLOW_FIELD_VLAN_" was updated. Signed-off-by: Michael Baum --- app/test-pmd/cmdline_flow.c| 24 ++- doc/guides/prog_guide/rte_flow.rst | 6 lib/ethdev/rte_flow.h | 47 -- 3 files changed, 61 insertions(+), 16 deletions(-) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index db8bd30cb1..ffeedefc35 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -636,6 +636,7 @@ enum index { ACTION_MODIFY_FIELD_DST_TYPE_VALUE, ACTION_MODIFY_FIELD_DST_LEVEL, ACTION_MODIFY_FIELD_DST_LEVEL_VALUE, + ACTION_MODIFY_FIELD_DST_SUB_LEVEL, ACTION_MODIFY_FIELD_DST_TYPE_ID, ACTION_MODIFY_FIELD_DST_CLASS_ID, ACTION_MODIFY_FIELD_DST_OFFSET, @@ -643,6 +644,7 @@ enum index { ACTION_MODIFY_FIELD_SRC_TYPE_VALUE, ACTION_MODIFY_FIELD_SRC_LEVEL, ACTION_MODIFY_FIELD_SRC_LEVEL_VALUE, + ACTION_MODIFY_FIELD_SRC_SUB_LEVEL, ACTION_MODIFY_FIELD_SRC_TYPE_ID, ACTION_MODIFY_FIELD_SRC_CLASS_ID, ACTION_MODIFY_FIELD_SRC_OFFSET, @@ -859,7 +861,7 @@ static const char *const modify_field_ids[] = { "ipv6_proto", "flex_item", "hash_result", - "geneve_opt_type", "geneve_opt_class", "geneve_opt_data", NULL + "geneve_opt_type", "geneve_opt_class", "geneve_opt_data", "mpls", NULL }; static const char *const meter_colors[] = { @@ -2300,6 +2302,7 @@ static const enum index next_action_sample[] = { static const enum index action_modify_field_dst[] = { ACTION_MODIFY_FIELD_DST_LEVEL, + ACTION_MODIFY_FIELD_DST_SUB_LEVEL, ACTION_MODIFY_FIELD_DST_TYPE_ID, ACTION_MODIFY_FIELD_DST_CLASS_ID, ACTION_MODIFY_FIELD_DST_OFFSET, @@ 
-2309,6 +2312,7 @@ static const enum index action_modify_field_src[] = { ACTION_MODIFY_FIELD_SRC_LEVEL, + ACTION_MODIFY_FIELD_SRC_SUB_LEVEL, ACTION_MODIFY_FIELD_SRC_TYPE_ID, ACTION_MODIFY_FIELD_SRC_CLASS_ID, ACTION_MODIFY_FIELD_SRC_OFFSET, @@ -6397,6 +6401,15 @@ static const struct token token_list[] = { .call = parse_vc_modify_field_level, .comp = comp_none, }, + [ACTION_MODIFY_FIELD_DST_SUB_LEVEL] = { + .name = "dst_sub_level", + .help = "destination field sub level", + .next = NEXT(action_modify_field_dst, +NEXT_ENTRY(COMMON_UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field, + dst.sub_level)), + .call = parse_vc_conf, + }, [ACTION_MODIFY_FIELD_DST_TYPE_ID] = { .name = "dst_type_id", .help = "destination field type ID", @@ -6450,6 +6463,15 @@ static const struct token token_list[] = { .call = parse_vc_modify_field_level, .comp = comp_none, }, + [ACTION_MODIFY_FIELD_SRC_SUB_LEVEL] = { + .name = "src_sub_level", + .help = "source field sub level", + .next = NEXT(action_modify_field_src, +NEXT_ENTRY(COMMON_UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field, + src.sub_level)), + .call = parse_vc_conf, + }, [ACTION_MODIFY_FIELD_SRC_TYPE_ID] = { .name = "src_type_id", .help = "source field type ID", diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst index dc86e040ec..b5d8ce26c5 100644 --- a/doc/guides/prog_guide/rte_flow.rst +++ b/doc/guides/prog_guide/rte_flow.rst @@ -2939,6 +2939,10 @@ as well as any tag element in the tag array: For the tag array (in case of multiple tags are supported and present) ``level`` translates directly into the array index. +- ``sub_level`` is the index of the header inside the encapsulation level. + It is used to modify either ``VLAN`` or ``MPLS`` headers, multiple of + which might be present in the same encapsulation level. 
+ ``type`` is used to specify (along with ``class_id``) the Geneve option which is being modified. This field is relevant only for ``RTE_FLOW_FIELD_GENEVE_OPT_`` type. @@ -3004,6 +3008,8 @@ value as sequence of bytes {xxx, xxx, 0x85, xxx, xxx, xxx}. +-+-
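The new ``sub_level`` field is, in effect, plain array indexing into a stack of repeated headers within one encapsulation level. A minimal standalone sketch of that selection (the struct and helper names here are hypothetical, not part of the patch):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Minimal stand-in for a 4-byte MPLS header (label/exp/bos/ttl). */
struct mpls_hdr {
	uint8_t b[4];
};

/* Hypothetical helper: pick the sub_level-th MPLS header inside one
 * encapsulation level, mirroring how "sub_level" indexes repeated headers. */
static struct mpls_hdr *
mpls_select(struct mpls_hdr *stack, size_t n_labels, uint8_t sub_level)
{
	if (sub_level >= n_labels)
		return NULL;
	return &stack[sub_level];
}
```

A driver would apply the modify-field action to the header this index selects; out-of-range indices are rejected.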
[PATCH 0/7] update idpf and cpfl timestamp
Add timestamp offload feature support for ACC. Use an alarm to save the master time, solving the timestamp roll over issue. Adjust timestamp mbuf registering at dev start. Wenjing Qiao (7): common/idpf: fix 64b timestamp roll over issue net/idpf: save master time by alarm net/cpfl: save master time by alarm common/idpf: support timestamp offload feature for ACC common/idpf: add timestamp enable flag for rxq net/cpfl: register timestamp mbuf when starting dev net/idpf: register timestamp mbuf when starting dev config/meson.build | 3 + drivers/common/idpf/base/idpf_osdep.h | 48 + drivers/common/idpf/idpf_common_rxtx.c | 133 ++--- drivers/common/idpf/idpf_common_rxtx.h | 5 +- drivers/common/idpf/version.map| 4 + drivers/net/cpfl/cpfl_ethdev.c | 19 drivers/net/cpfl/cpfl_ethdev.h | 3 + drivers/net/cpfl/cpfl_rxtx.c | 2 + drivers/net/idpf/idpf_ethdev.c | 19 drivers/net/idpf/idpf_ethdev.h | 3 + drivers/net/idpf/idpf_rxtx.c | 3 + meson_options.txt | 2 + 12 files changed, 186 insertions(+), 58 deletions(-) -- 2.25.1
[PATCH 1/7] common/idpf: fix 64b timestamp roll over issue
Reading the MTS register at the first packet causes the timestamp roll over issue. To support calculating the 64b timestamp, an alarm is needed to save the master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/common/idpf/idpf_common_rxtx.c | 108 - drivers/common/idpf/idpf_common_rxtx.h | 3 +- drivers/common/idpf/version.map| 1 + 3 files changed, 55 insertions(+), 57 deletions(-) diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index fc87e3e243..19bcb94077 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -4,6 +4,7 @@ #include #include +#include #include "idpf_common_rxtx.h" @@ -442,56 +443,23 @@ idpf_qc_split_rxq_mbufs_alloc(struct idpf_rx_queue *rxq) return 0; } -#define IDPF_TIMESYNC_REG_WRAP_GUARD_BAND 10000 /* Helper function to convert a 32b nanoseconds timestamp to 64b. */ static inline uint64_t -idpf_tstamp_convert_32b_64b(struct idpf_adapter *ad, uint32_t flag, - uint32_t in_timestamp) +idpf_tstamp_convert_32b_64b(uint64_t time_hw, uint32_t in_timestamp) { -#ifdef RTE_ARCH_X86_64 - struct idpf_hw *hw = &ad->hw; const uint64_t mask = 0xFFFFFFFF; - uint32_t hi, lo, lo2, delta; + const uint32_t half_overflow_duration = 0x1 << 31; + uint32_t delta; uint64_t ns; - if (flag != 0) { - IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); - IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_EXEC_CMD_M | - PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); - lo = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - hi = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_H_0); - /* -* On typical system, the delta between lo and lo2 is ~1000ns, -* so 10000 seems a large-enough but not overly-big guard band. 
-*/ - if (lo > (UINT32_MAX - IDPF_TIMESYNC_REG_WRAP_GUARD_BAND)) - lo2 = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - else - lo2 = lo; - - if (lo2 < lo) { - lo = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - hi = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_H_0); - } - - ad->time_hw = ((uint64_t)hi << 32) | lo; - } - - delta = (in_timestamp - (uint32_t)(ad->time_hw & mask)); - if (delta > (mask / 2)) { - delta = ((uint32_t)(ad->time_hw & mask) - in_timestamp); - ns = ad->time_hw - delta; + delta = (in_timestamp - (uint32_t)(time_hw & mask)); + if (delta > half_overflow_duration) { + delta = ((uint32_t)(time_hw & mask) - in_timestamp); + ns = time_hw - delta; } else { - ns = ad->time_hw + delta; + ns = time_hw + delta; } - return ns; -#else /* !RTE_ARCH_X86_64 */ - RTE_SET_USED(ad); - RTE_SET_USED(flag); - RTE_SET_USED(in_timestamp); - return 0; -#endif /* RTE_ARCH_X86_64 */ } #define IDPF_RX_FLEX_DESC_ADV_STATUS0_XSUM_S \ @@ -659,9 +627,6 @@ idpf_dp_splitq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, rx_desc_ring = rxq->rx_ring; ptype_tbl = rxq->adapter->ptype_tbl; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) - rxq->hw_register_set = 1; - while (nb_rx < nb_pkts) { rx_desc = &rx_desc_ring[rx_id]; @@ -720,10 +685,8 @@ idpf_dp_splitq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, if (idpf_timestamp_dynflag > 0 && (rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP)) { /* timestamp */ - ts_ns = idpf_tstamp_convert_32b_64b(ad, - rxq->hw_register_set, + ts_ns = idpf_tstamp_convert_32b_64b(ad->time_hw, rte_le_to_cpu_32(rx_desc->ts_high)); - rxq->hw_register_set = 0; *RTE_MBUF_DYNFIELD(rxm, idpf_timestamp_dynfield_offset, rte_mbuf_timestamp_t *) = ts_ns; @@ -1077,9 +1040,6 @@ idpf_dp_singleq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, rx_ring = rxq->rx_ring; ptype_tbl = rxq->adapter->ptype_tbl; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) - rxq->hw_register_set = 1; - while (nb_rx < nb_pkts) { rxdp = &rx_ring[rx_id]; rx_status0 = 
rte_le_to_cpu_16(rxdp->flex_nic_wb.status_error0); @@ -1142,10 +1102,8 @@ idpf_dp_
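The 32b-to-64b conversion above can be exercised in isolation: the cached master time supplies the upper bits, and a delta larger than half the 32-bit wrap period means the descriptor stamp sits on the other side of a rollover. A self-contained sketch of that arithmetic (the function name is illustrative; the driver's version lives in idpf_common_rxtx.c):

```c
#include <stdint.h>
#include <assert.h>

/* time_hw: recently cached 64-bit master time; in_ts: 32-bit HW stamp.
 * If the unsigned 32-bit delta exceeds half the wrap period, the stamp is
 * assumed to come from just before the cached time, so we subtract instead. */
static uint64_t
tstamp_32b_to_64b(uint64_t time_hw, uint32_t in_ts)
{
	const uint64_t mask = 0xFFFFFFFFULL;
	const uint32_t half = 1U << 31;
	uint32_t delta = in_ts - (uint32_t)(time_hw & mask);

	if (delta > half)
		return time_hw - (uint32_t)((uint32_t)(time_hw & mask) - in_ts);
	return time_hw + delta;
}
```

Unsigned wraparound makes the forward case work even when the 32-bit counter has just rolled over, which is why the cached master time only needs to be refreshed about once per second.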
[PATCH 2/7] net/idpf: save master time by alarm
Using alarm to save master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/net/idpf/idpf_ethdev.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c index e02ec2ec5a..3f33ffbc78 100644 --- a/drivers/net/idpf/idpf_ethdev.c +++ b/drivers/net/idpf/idpf_ethdev.c @@ -761,6 +761,12 @@ idpf_dev_start(struct rte_eth_dev *dev) goto err_vec; } + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_set(1000 * 1000, + &idpf_dev_read_time_hw, + (void *)base); + } + ret = idpf_vc_vectors_alloc(vport, req_vecs_num); if (ret != 0) { PMD_DRV_LOG(ERR, "Failed to allocate interrupt vectors"); @@ -810,6 +816,7 @@ static int idpf_dev_stop(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; + struct idpf_adapter *base = vport->adapter; if (vport->stopped == 1) return 0; @@ -822,6 +829,11 @@ idpf_dev_stop(struct rte_eth_dev *dev) idpf_vc_vectors_dealloc(vport); + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_cancel(idpf_dev_read_time_hw, +base); + } + vport->stopped = 1; return 0; -- 2.25.1
[PATCH 3/7] net/cpfl: save master time by alarm
Using alarm to save master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/net/cpfl/cpfl_ethdev.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c index ede730fd50..82d8147494 100644 --- a/drivers/net/cpfl/cpfl_ethdev.c +++ b/drivers/net/cpfl/cpfl_ethdev.c @@ -767,6 +767,12 @@ cpfl_dev_start(struct rte_eth_dev *dev) goto err_vec; } + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_set(1000 * 1000, + &idpf_dev_read_time_hw, + (void *)base); + } + ret = idpf_vc_vectors_alloc(vport, req_vecs_num); if (ret != 0) { PMD_DRV_LOG(ERR, "Failed to allocate interrupt vectors"); @@ -816,6 +822,7 @@ static int cpfl_dev_stop(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; + struct idpf_adapter *base = vport->adapter; if (vport->stopped == 1) return 0; @@ -828,6 +835,11 @@ cpfl_dev_stop(struct rte_eth_dev *dev) idpf_vc_vectors_dealloc(vport); + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_cancel(idpf_dev_read_time_hw, +base); + } + vport->stopped = 1; return 0; -- 2.25.1
[PATCH 4/7] common/idpf: support timestamp offload feature for ACC
For ACC, get the master time from MTS registers via shared memory. Notice: this is a workaround, and it will be removed after a generic solution is provided. Signed-off-by: Wenjing Qiao --- config/meson.build | 3 ++ drivers/common/idpf/base/idpf_osdep.h | 48 ++ drivers/common/idpf/idpf_common_rxtx.c | 30 +--- meson_options.txt | 2 ++ 4 files changed, 79 insertions(+), 4 deletions(-) diff --git a/config/meson.build b/config/meson.build index fa730a1b14..8d74f301b4 100644 --- a/config/meson.build +++ b/config/meson.build @@ -316,6 +316,9 @@ endif if get_option('mbuf_refcnt_atomic') dpdk_conf.set('RTE_MBUF_REFCNT_ATOMIC', true) endif +if get_option('enable_acc_timestamp') +dpdk_conf.set('IDPF_ACC_TIMESTAMP', true) +endif dpdk_conf.set10('RTE_IOVA_IN_MBUF', get_option('enable_iova_as_pa')) compile_time_cpuflags = [] diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/common/idpf/base/idpf_osdep.h index 99ae9cf60a..e634939a51 100644 --- a/drivers/common/idpf/base/idpf_osdep.h +++ b/drivers/common/idpf/base/idpf_osdep.h @@ -24,6 +24,13 @@ #include #include +#ifdef IDPF_ACC_TIMESTAMP +#include +#include +#include +#include +#endif /* IDPF_ACC_TIMESTAMP */ + #define INLINE inline #define STATIC static @@ -361,4 +368,45 @@ idpf_hweight32(u32 num) #endif +#ifdef IDPF_ACC_TIMESTAMP +#define IDPF_ACC_TIMESYNC_BASE_ADDR 0x480D50 +#define IDPF_ACC_GLTSYN_TIME_H (IDPF_ACC_TIMESYNC_BASE_ADDR + 0x1C) +#define IDPF_ACC_GLTSYN_TIME_L (IDPF_ACC_TIMESYNC_BASE_ADDR + 0x10) + +inline uint32_t +idpf_mmap_r32(uint64_t pa) +{ + int fd; + void *bp, *vp; + uint32_t rval = 0xdeadbeef; + uint32_t ps, ml, of; + + fd = open("/dev/mem", (O_RDWR | O_SYNC)); + if (fd == -1) { + perror("/dev/mem"); + return -1; + } + ml = ps = getpagesize(); + of = (uint32_t)pa & (ps - 1); + if (of + (sizeof(uint32_t) * 4) > ps) + ml *= 2; + bp = mmap(NULL, ml, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, pa & ~(uint64_t)(ps - 1)); + if (bp == MAP_FAILED) { + perror("mmap"); + goto done; + } + + vp = (char *)bp + 
of; + + rval = *(volatile uint32_t *)vp; + if (munmap(bp, ml) == -1) + perror("munmap"); +done: + close(fd); + + return rval; +} + +#endif /* IDPF_ACC_TIMESTAMP */ + #endif /* _IDPF_OSDEP_H_ */ diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index 19bcb94077..9c58f3fb11 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -1582,12 +1582,36 @@ idpf_qc_splitq_rx_vec_setup(struct idpf_rx_queue *rxq) void idpf_dev_read_time_hw(void *cb_arg) { -#ifdef RTE_ARCH_X86_64 struct idpf_adapter *ad = (struct idpf_adapter *)cb_arg; uint32_t hi, lo, lo2; int rc = 0; +#ifndef IDPF_ACC_TIMESTAMP struct idpf_hw *hw = &ad->hw; +#endif /* !IDPF_ACC_TIMESTAMP */ +#ifdef IDPF_ACC_TIMESTAMP + + lo = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + hi = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_H); + DRV_LOG(DEBUG, "lo : %X,", lo); + DRV_LOG(DEBUG, "hi : %X,", hi); + /* +* On typical system, the delta between lo and lo2 is ~1000ns, +* so 10000 seems a large-enough but not overly-big guard band. 
+*/ + if (lo > (UINT32_MAX - IDPF_TIMESYNC_REG_WRAP_GUARD_BAND)) + lo2 = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + else + lo2 = lo; + + if (lo2 < lo) { + lo = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + hi = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_H); + } + + ad->time_hw = ((uint64_t)hi << 32) | lo; + +#else /* !IDPF_ACC_TIMESTAMP */ IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_EXEC_CMD_M | PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); @@ -1608,9 +1632,7 @@ idpf_dev_read_time_hw(void *cb_arg) } ad->time_hw = ((uint64_t)hi << 32) | lo; -#else /* !RTE_ARCH_X86_64 */ - ad->time_hw = 0; -#endif /* RTE_ARCH_X86_64 */ +#endif /* IDPF_ACC_TIMESTAMP */ /* re-alarm watchdog */ rc = rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, cb_arg); diff --git a/meson_options.txt b/meson_options.txt index 82c8297065..31fc634aa0 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -52,3 +52,5 @@ option('tests', type: 'boolean', value: true, description: 'build unit tests') option('use_hpet', type: 'boolean', value: false, description: 'use HPET timer in EAL') +option('enable_acc_timestamp', type: 'boolean', value: false, description: + 'enable timestamp on ACC.') -- 2.25.1
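The /dev/mem helper above must hand mmap(2) a page-aligned file offset, so the physical register address is split into an aligned base and an in-page offset, doubling the mapping length when the read window could cross a page boundary. That address arithmetic can be shown standalone (the struct and function names are hypothetical):

```c
#include <stdint.h>
#include <assert.h>

/* Result of splitting a physical address into an mmap request. */
struct map_req {
	uint64_t base;   /* page-aligned address to pass to mmap */
	uint32_t offset; /* byte offset of the register inside the mapping */
	uint32_t length; /* mapping length; doubled if the window may cross a page */
};

static struct map_req
map_request(uint64_t pa, uint32_t page_size)
{
	struct map_req r;

	r.offset = (uint32_t)(pa & (page_size - 1));
	r.base = pa & ~(uint64_t)(page_size - 1);
	r.length = page_size;
	/* same guard as the helper above: room for four 32-bit registers */
	if (r.offset + 4 * sizeof(uint32_t) > page_size)
		r.length *= 2;
	return r;
}
```

The register is then read at `base mapping + offset`, exactly as idpf_mmap_r32() does.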
[PATCH 5/7] common/idpf: add timestamp enable flag for rxq
A rxq can be configured with timestamp offload. So, add timestamp enable flag for rxq. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/common/idpf/idpf_common_rxtx.c | 3 ++- drivers/common/idpf/idpf_common_rxtx.h | 2 ++ drivers/common/idpf/version.map| 3 +++ 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index 9c58f3fb11..7afe7afe3f 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -354,7 +354,7 @@ int idpf_qc_ts_mbuf_register(struct idpf_rx_queue *rxq) { int err; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) { + if (!rxq->ts_enable && (rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP)) { /* Register mbuf field and flag for Rx timestamp */ err = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, &idpf_timestamp_dynflag); @@ -363,6 +363,7 @@ idpf_qc_ts_mbuf_register(struct idpf_rx_queue *rxq) "Cannot register mbuf field/flag for timestamp"); return -EINVAL; } + rxq->ts_enable = TRUE; } return 0; } diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/common/idpf/idpf_common_rxtx.h index af1425eb3f..cb7f5a3ba8 100644 --- a/drivers/common/idpf/idpf_common_rxtx.h +++ b/drivers/common/idpf/idpf_common_rxtx.h @@ -142,6 +142,8 @@ struct idpf_rx_queue { struct idpf_rx_queue *bufq2; uint64_t offloads; + + bool ts_enable; /* if timestamp is enabled */ }; struct idpf_tx_entry { diff --git a/drivers/common/idpf/version.map b/drivers/common/idpf/version.map index c67c554911..15b42b4d2e 100644 --- a/drivers/common/idpf/version.map +++ b/drivers/common/idpf/version.map @@ -69,5 +69,8 @@ INTERNAL { idpf_vport_rss_config; idpf_vport_stats_update; + idpf_timestamp_dynfield_offset; + idpf_timestamp_dynflag; + local: *; }; -- 2.25.1
[PATCH 6/7] net/cpfl: register timestamp mbuf when starting dev
Since timestamp offload is only supported at port level, registering the timestamp mbuf should be done at the dev start stage. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/net/cpfl/cpfl_ethdev.c | 7 +++ drivers/net/cpfl/cpfl_ethdev.h | 3 +++ drivers/net/cpfl/cpfl_rxtx.c | 2 ++ 3 files changed, 12 insertions(+) diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c index 82d8147494..416273f567 100644 --- a/drivers/net/cpfl/cpfl_ethdev.c +++ b/drivers/net/cpfl/cpfl_ethdev.c @@ -771,6 +771,13 @@ cpfl_dev_start(struct rte_eth_dev *dev) rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, (void *)base); + /* Register mbuf field and flag for Rx timestamp */ + ret = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, + &idpf_timestamp_dynflag); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Cannot register mbuf field/flag for timestamp"); + return -EINVAL; + } } ret = idpf_vc_vectors_alloc(vport, req_vecs_num); diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/cpfl/cpfl_ethdev.h index 200dfcac02..eec253bc77 100644 --- a/drivers/net/cpfl/cpfl_ethdev.h +++ b/drivers/net/cpfl/cpfl_ethdev.h @@ -57,6 +57,9 @@ /* Device IDs */ #define IDPF_DEV_ID_CPF0x1453 +extern int idpf_timestamp_dynfield_offset; +extern uint64_t idpf_timestamp_dynflag; + struct cpfl_vport_param { struct cpfl_adapter_ext *adapter; uint16_t devarg_id; /* arg id from user */ diff --git a/drivers/net/cpfl/cpfl_rxtx.c b/drivers/net/cpfl/cpfl_rxtx.c index de59b31b3d..cdb5b37da0 100644 --- a/drivers/net/cpfl/cpfl_rxtx.c +++ b/drivers/net/cpfl/cpfl_rxtx.c @@ -529,6 +529,8 @@ cpfl_rx_queue_init(struct rte_eth_dev *dev, uint16_t rx_queue_id) frame_size > rxq->rx_buf_len) dev->data->scattered_rx = 1; + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) + rxq->ts_enable = TRUE; err = idpf_qc_ts_mbuf_register(rxq); if (err != 0) { PMD_DRV_LOG(ERR, "fail to register timestamp mbuf %u", -- 
2.25.1
[PATCH 7/7] net/idpf: register timestamp mbuf when starting dev
Since timestamp offload is only supported at port level, registering the timestamp mbuf should be done at the dev start stage. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/net/idpf/idpf_ethdev.c | 7 +++ drivers/net/idpf/idpf_ethdev.h | 3 +++ drivers/net/idpf/idpf_rxtx.c | 3 +++ 3 files changed, 13 insertions(+) diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c index 3f33ffbc78..7c43f51c25 100644 --- a/drivers/net/idpf/idpf_ethdev.c +++ b/drivers/net/idpf/idpf_ethdev.c @@ -765,6 +765,13 @@ idpf_dev_start(struct rte_eth_dev *dev) rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, (void *)base); + /* Register mbuf field and flag for Rx timestamp */ + ret = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, + &idpf_timestamp_dynflag); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Cannot register mbuf field/flag for timestamp"); + return -EINVAL; + } } ret = idpf_vc_vectors_alloc(vport, req_vecs_num); diff --git a/drivers/net/idpf/idpf_ethdev.h b/drivers/net/idpf/idpf_ethdev.h index 3c2c932438..256e348710 100644 --- a/drivers/net/idpf/idpf_ethdev.h +++ b/drivers/net/idpf/idpf_ethdev.h @@ -55,6 +55,9 @@ #define IDPF_ALARM_INTERVAL5 /* us */ +extern int idpf_timestamp_dynfield_offset; +extern uint64_t idpf_timestamp_dynflag; + struct idpf_vport_param { struct idpf_adapter_ext *adapter; uint16_t devarg_id; /* arg id from user */ diff --git a/drivers/net/idpf/idpf_rxtx.c b/drivers/net/idpf/idpf_rxtx.c index 414f9a37f6..1aaf0142d2 100644 --- a/drivers/net/idpf/idpf_rxtx.c +++ b/drivers/net/idpf/idpf_rxtx.c @@ -529,6 +529,9 @@ idpf_rx_queue_init(struct rte_eth_dev *dev, uint16_t rx_queue_id) frame_size > rxq->rx_buf_len) dev->data->scattered_rx = 1; + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) + rxq->ts_enable = TRUE; + err = idpf_qc_ts_mbuf_register(rxq); if (err != 0) { PMD_DRV_LOG(ERR, "fail to register timestamp mbuf %u", -- 2.25.1
[RFC] net/mlx5: add MPLS modify field support
Add support for modify field in tunnel MPLS header. For now it is supported only to copy from. Signed-off-by: Michael Baum --- drivers/common/mlx5/mlx5_prm.h | 5 + drivers/net/mlx5/mlx5_flow_dv.c | 23 +++ drivers/net/mlx5/mlx5_flow_hw.c | 16 +--- 3 files changed, 37 insertions(+), 7 deletions(-) diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index ed3d5efbb7..04c1400a1e 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -787,6 +787,11 @@ enum mlx5_modification_field { MLX5_MODI_TUNNEL_HDR_DW_1 = 0x75, MLX5_MODI_GTPU_FIRST_EXT_DW_0 = 0x76, MLX5_MODI_HASH_RESULT = 0x81, + MLX5_MODI_IN_MPLS_LABEL_0 = 0x8a, + MLX5_MODI_IN_MPLS_LABEL_1, + MLX5_MODI_IN_MPLS_LABEL_2, + MLX5_MODI_IN_MPLS_LABEL_3, + MLX5_MODI_IN_MPLS_LABEL_4, MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A, MLX5_MODI_INVALID = INT_MAX, }; diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index f136f43b0a..93cce16a1e 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -1388,6 +1388,7 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev, case RTE_FLOW_FIELD_GENEVE_VNI: return 24; case RTE_FLOW_FIELD_GTP_TEID: + case RTE_FLOW_FIELD_MPLS: case RTE_FLOW_FIELD_TAG: return 32; case RTE_FLOW_FIELD_MARK: @@ -1435,6 +1436,12 @@ flow_modify_info_mask_32_masked(uint32_t length, uint32_t off, uint32_t post_mas return rte_cpu_to_be_32(mask & post_mask); } +static __rte_always_inline enum mlx5_modification_field +mlx5_mpls_modi_field_get(const struct rte_flow_action_modify_data *data) +{ + return MLX5_MODI_IN_MPLS_LABEL_0 + data->sub_level; +} + static void mlx5_modify_flex_item(const struct rte_eth_dev *dev, const struct mlx5_flex_item *flex, @@ -1893,6 +1900,16 @@ mlx5_flow_field_id_to_modify_info else info[idx].offset = off_be; break; + case RTE_FLOW_FIELD_MPLS: + MLX5_ASSERT(data->offset + width <= 32); + off_be = 32 - (data->offset + width); + info[idx] = (struct field_modify_info){4, 0, + 
mlx5_mpls_modi_field_get(data)}; + if (mask) + mask[idx] = flow_modify_info_mask_32(width, off_be); + else + info[idx].offset = off_be; + break; case RTE_FLOW_FIELD_TAG: { MLX5_ASSERT(data->offset + width <= 32); @@ -5344,6 +5361,12 @@ flow_dv_validate_action_modify_field(struct rte_eth_dev *dev, RTE_FLOW_ERROR_TYPE_ACTION, action, "modifications of the GENEVE Network" " Identifier is not supported"); + if (action_modify_field->dst.field == RTE_FLOW_FIELD_MPLS || + action_modify_field->src.field == RTE_FLOW_FIELD_MPLS) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ACTION, action, + "modifications of the MPLS header " + "is not supported"); if (action_modify_field->dst.field == RTE_FLOW_FIELD_MARK || action_modify_field->src.field == RTE_FLOW_FIELD_MARK) if (config->dv_xmeta_en == MLX5_XMETA_MODE_LEGACY || diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c index 7e0ee8d883..fd2ad3bb58 100644 --- a/drivers/net/mlx5/mlx5_flow_hw.c +++ b/drivers/net/mlx5/mlx5_flow_hw.c @@ -3546,10 +3546,8 @@ flow_hw_validate_action_modify_field(const struct rte_flow_action *action, const struct rte_flow_action *mask, struct rte_flow_error *error) { - const struct rte_flow_action_modify_field *action_conf = - action->conf; - const struct rte_flow_action_modify_field *mask_conf = - mask->conf; + const struct rte_flow_action_modify_field *action_conf = action->conf; + const struct rte_flow_action_modify_field *mask_conf = mask->conf; if (action_conf->operation != mask_conf->operation) return rte_flow_error_set(error, EINVAL, @@ -3604,6 +3602,11 @@ flow_hw_validate_action_modify_field(const struct rte_flow_action *action, return rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ACTION, action, "modifying Geneve VNI is not supported"); + /* Due to HW bug, tunnel MPLS header is read only. */ + if (action_conf->dst.field == RTE_FLOW_FIELD_MPLS) + return rte_flow_error_set(error, EINVAL, + RTE_FLOW_ERROR_TYPE_A
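The modify-info path above turns a field of `width` bits at MSB-relative `offset` into a shift of `off_be = 32 - (offset + width)` bits plus a contiguous big-endian mask. A simplified standalone sketch of that mask math (bswap32 stands in for rte_cpu_to_be_32() on a little-endian host; this is an illustration of the idea, not the exact flow_modify_info_mask_32() implementation):

```c
#include <stdint.h>
#include <assert.h>

/* 32-bit byte swap, equivalent to rte_cpu_to_be_32() on little-endian hosts. */
static uint32_t
bswap32(uint32_t v)
{
	return ((v & 0x000000FFU) << 24) | ((v & 0x0000FF00U) << 8) |
	       ((v & 0x00FF0000U) >> 8)  | ((v & 0xFF000000U) >> 24);
}

/* Build the big-endian mask covering `width` bits shifted up by `off_be`.
 * The width == 32 case is special-cased to avoid an undefined 32-bit shift. */
static uint32_t
modify_info_mask_32(uint32_t width, uint32_t off_be)
{
	uint32_t mask = (width == 32) ? UINT32_MAX : ((1U << width) - 1) << off_be;

	return bswap32(mask);
}
```

For a full 32-bit MPLS word (width 32, offset 0) the mask is all ones; a 20-bit label at offset 0 gives off_be = 12.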
[RFC PATCH v1 0/5] dts: add tg abstractions and scapy
The implementation adds abstractions for all traffic generators as well as those that can capture individual packets and investigate (not just count) them. The traffic generators reside on traffic generator nodes which are also added, along with some related code. Juraj Linkeš (5): dts: add scapy dependency dts: add traffic generator config dts: traffic generator abstractions dts: scapy traffic generator implementation dts: add traffic generator node to dts runner dts/conf.yaml | 25 ++ dts/framework/config/__init__.py | 107 +- dts/framework/config/conf_yaml_schema.json| 172 - dts/framework/dts.py | 42 ++- dts/framework/remote_session/linux_session.py | 55 +++ dts/framework/remote_session/os_session.py| 22 +- dts/framework/remote_session/posix_session.py | 3 + .../remote_session/remote/remote_session.py | 7 + dts/framework/testbed_model/__init__.py | 1 + .../capturing_traffic_generator.py| 155 dts/framework/testbed_model/hw/port.py| 55 +++ dts/framework/testbed_model/node.py | 4 +- dts/framework/testbed_model/scapy.py | 348 ++ dts/framework/testbed_model/sut_node.py | 5 +- dts/framework/testbed_model/tg_node.py| 62 .../testbed_model/traffic_generator.py| 59 +++ dts/poetry.lock | 18 +- dts/pyproject.toml| 1 + 18 files changed, 1103 insertions(+), 38 deletions(-) create mode 100644 dts/framework/testbed_model/capturing_traffic_generator.py create mode 100644 dts/framework/testbed_model/hw/port.py create mode 100644 dts/framework/testbed_model/scapy.py create mode 100644 dts/framework/testbed_model/tg_node.py create mode 100644 dts/framework/testbed_model/traffic_generator.py -- 2.30.2
[RFC PATCH v1 1/5] dts: add scapy dependency
Required for scapy traffic generator. Signed-off-by: Juraj Linkeš --- dts/poetry.lock| 18 +- dts/pyproject.toml | 1 + 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/dts/poetry.lock b/dts/poetry.lock index 64d6c18f35..4b6c42e280 100644 --- a/dts/poetry.lock +++ b/dts/poetry.lock @@ -425,6 +425,22 @@ files = [ {file = "PyYAML-6.0.tar.gz", hash = "sha256:68fb519c14306fec9720a2a5b45bc9f0c8d1b9c72adf45c37baedfcd949c35a2"}, ] +[[package]] +name = "scapy" +version = "2.5.0" +description = "Scapy: interactive packet manipulation tool" +category = "main" +optional = false +python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4" +files = [ +{file = "scapy-2.5.0.tar.gz", hash = "sha256:5b260c2b754fd8d409ba83ee7aee294ecdbb2c235f9f78fe90bc11cb6e5debc2"}, +] + +[package.extras] +basic = ["ipython"] +complete = ["cryptography (>=2.0)", "ipython", "matplotlib", "pyx"] +docs = ["sphinx (>=3.0.0)", "sphinx_rtd_theme (>=0.4.3)", "tox (>=3.0.0)"] + [[package]] name = "snowballstemmer" version = "2.2.0" @@ -504,4 +520,4 @@ jsonschema = ">=4,<5" [metadata] lock-version = "2.0" python-versions = "^3.10" -content-hash = "af71d1ffeb4372d870bd02a8bf101577254e03ddb8b12d02ab174f80069fd853" +content-hash = "fba5dcbb12d55a9c6b3f59f062d3a40973ff8360edb023b6e4613522654ba7c1" diff --git a/dts/pyproject.toml b/dts/pyproject.toml index 72d5b0204d..fc9b6278bb 100644 --- a/dts/pyproject.toml +++ b/dts/pyproject.toml @@ -23,6 +23,7 @@ pexpect = "^4.8.0" warlock = "^2.0.1" PyYAML = "^6.0" types-PyYAML = "^6.0.8" +scapy = "^2.5.0" [tool.poetry.group.dev.dependencies] mypy = "^0.961" -- 2.30.2
[RFC PATCH v1 2/5] dts: add traffic generator config
Node configuration - where to connect, what ports to use and what TG to use. Signed-off-by: Juraj Linkeš --- dts/conf.yaml | 25 +++ dts/framework/config/__init__.py | 107 +++-- dts/framework/config/conf_yaml_schema.json | 172 - 3 files changed, 287 insertions(+), 17 deletions(-) diff --git a/dts/conf.yaml b/dts/conf.yaml index a9bd8a3ecf..4e5fd3560f 100644 --- a/dts/conf.yaml +++ b/dts/conf.yaml @@ -13,6 +13,7 @@ executions: test_suites: - hello_world system_under_test: "SUT 1" +traffic_generator_system: "TG 1" nodes: - name: "SUT 1" hostname: sut1.change.me.localhost @@ -25,3 +26,27 @@ nodes: hugepages: # optional; if removed, will use system hugepage configuration amount: 256 force_first_numa: false +ports: + - pci: ":00:08.0" +dpdk_os_driver: vfio-pci +os_driver: i40e +peer_node: "TG 1" +peer_pci: ":00:08.0" + - name: "TG 1" +hostname: tg1.change.me.localhost +user: root +arch: x86_64 +os: linux +lcores: "" +use_first_core: false +hugepages: # optional; if removed, will use system hugepage configuration +amount: 256 +force_first_numa: false +ports: + - pci: ":00:08.0" +dpdk_os_driver: rdma +os_driver: rdma +peer_node: "SUT 1" +peer_pci: ":00:08.0" +traffic_generator: + type: SCAPY diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index ebb0823ff5..6b1c3159f7 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -12,7 +12,7 @@ import pathlib from dataclasses import dataclass from enum import Enum, auto, unique -from typing import Any, TypedDict +from typing import Any, TypedDict, Union import warlock # type: ignore import yaml @@ -61,6 +61,18 @@ class Compiler(StrEnum): msvc = auto() +@unique +class NodeType(StrEnum): +physical = auto() +virtual = auto() + + +@unique +class TrafficGeneratorType(StrEnum): +NONE = auto() +SCAPY = auto() + + # Slots enables some optimizations, by pre-allocating space for the defined # attributes in the underlying data structure. 
# @@ -72,6 +84,41 @@ class HugepageConfiguration: force_first_numa: bool +@dataclass(slots=True, frozen=True) +class PortConfig: +id: int +node: str +pci: str +dpdk_os_driver: str +os_driver: str +peer_node: str +peer_pci: str + +@staticmethod +def from_dict(id: int, node: str, d: dict) -> "PortConfig": +return PortConfig(id=id, node=node, **d) + + +@dataclass(slots=True, frozen=True) +class TrafficGeneratorConfig: +traffic_generator_type: TrafficGeneratorType + +@staticmethod +def from_dict(d: dict): +# This looks useless now, but is designed to allow expansion to traffic +# generators that require more configuration later. +match TrafficGeneratorType(d["type"]): +case TrafficGeneratorType.SCAPY: +return ScapyTrafficGeneratorConfig( +traffic_generator_type=TrafficGeneratorType.SCAPY +) + + +@dataclass(slots=True, frozen=True) +class ScapyTrafficGeneratorConfig(TrafficGeneratorConfig): +pass + + @dataclass(slots=True, frozen=True) class NodeConfiguration: name: str @@ -82,29 +129,52 @@ class NodeConfiguration: os: OS lcores: str use_first_core: bool -memory_channels: int hugepages: HugepageConfiguration | None +ports: list[PortConfig] @staticmethod -def from_dict(d: dict) -> "NodeConfiguration": +def from_dict(d: dict) -> Union["SUTConfiguration", "TGConfiguration"]: hugepage_config = d.get("hugepages") if hugepage_config: if "force_first_numa" not in hugepage_config: hugepage_config["force_first_numa"] = False hugepage_config = HugepageConfiguration(**hugepage_config) -return NodeConfiguration( -name=d["name"], -hostname=d["hostname"], -user=d["user"], -password=d.get("password"), -arch=Architecture(d["arch"]), -os=OS(d["os"]), -lcores=d.get("lcores", "1"), -use_first_core=d.get("use_first_core", False), -memory_channels=d.get("memory_channels", 1), -hugepages=hugepage_config, -) +common_config = {"name": d["name"], +"hostname": d["hostname"], +"user": d["user"], +"password": d.get("password"), +"arch": Architecture(d["arch"]), +"os": OS(d["os"]), +"lcores": 
d.get("lcores", "1"), +"use_first_core": d.get("use_first_core", False), +"hugepages": hugepage_config, +"ports": [ +PortConfig.from_dict(i, d["name"]
[RFC PATCH v1 3/5] dts: traffic generator abstractions
There are traffic abstractions for all traffic generators and for traffic generators that can capture (not just count) packets. There are also related abstractions, such as TGNode, where the traffic generators reside, and some related code. Signed-off-by: Juraj Linkeš --- dts/framework/remote_session/os_session.py| 22 ++- dts/framework/remote_session/posix_session.py | 3 + .../capturing_traffic_generator.py| 155 ++ dts/framework/testbed_model/hw/port.py| 55 +++ dts/framework/testbed_model/node.py | 4 +- dts/framework/testbed_model/sut_node.py | 5 +- dts/framework/testbed_model/tg_node.py| 62 +++ .../testbed_model/traffic_generator.py| 59 +++ 8 files changed, 360 insertions(+), 5 deletions(-) create mode 100644 dts/framework/testbed_model/capturing_traffic_generator.py create mode 100644 dts/framework/testbed_model/hw/port.py create mode 100644 dts/framework/testbed_model/tg_node.py create mode 100644 dts/framework/testbed_model/traffic_generator.py diff --git a/dts/framework/remote_session/os_session.py b/dts/framework/remote_session/os_session.py index 4c48ae2567..56d7fef06c 100644 --- a/dts/framework/remote_session/os_session.py +++ b/dts/framework/remote_session/os_session.py @@ -10,6 +10,7 @@ from framework.logger import DTSLOG from framework.settings import SETTINGS from framework.testbed_model import LogicalCore +from framework.testbed_model.hw.port import PortIdentifier from framework.utils import EnvVarsDict, MesonArgs from .remote import CommandResult, RemoteSession, create_remote_session @@ -37,6 +38,7 @@ def __init__( self.name = name self._logger = logger self.remote_session = create_remote_session(node_config, name, logger) +self._disable_terminal_colors() def close(self, force: bool = False) -> None: """ @@ -53,7 +55,7 @@ def is_alive(self) -> bool: def send_command( self, command: str, -timeout: float, +timeout: float = SETTINGS.timeout, verify: bool = False, env: EnvVarsDict | None = None, ) -> CommandResult: """ 
return self.remote_session.send_command(command, timeout, verify, env) +@abstractmethod +def _disable_terminal_colors(self) -> None: +""" +Disable the colors in the ssh session. +""" + @abstractmethod def guess_dpdk_remote_dir(self, remote_dir) -> PurePath: """ @@ -173,3 +181,15 @@ def setup_hugepages(self, hugepage_amount: int, force_first_numa: bool) -> None: if needed and mount the hugepages if needed. If force_first_numa is True, configure hugepages just on the first socket. """ + +@abstractmethod +def get_logical_name_of_port(self, id: PortIdentifier) -> str | None: +""" +Gets the logical name (eno1, ens5, etc) of a port by the port's identifier. +""" + +@abstractmethod +def check_link_is_up(self, id: PortIdentifier) -> bool: +""" +Check that the link is up. +""" diff --git a/dts/framework/remote_session/posix_session.py b/dts/framework/remote_session/posix_session.py index d38062e8d6..288fbabf1e 100644 --- a/dts/framework/remote_session/posix_session.py +++ b/dts/framework/remote_session/posix_session.py @@ -219,3 +219,6 @@ def _remove_dpdk_runtime_dirs( def get_dpdk_file_prefix(self, dpdk_prefix) -> str: return "" + +def _disable_terminal_colors(self) -> None: +self.remote_session.send_command("export TERM=xterm-mono") diff --git a/dts/framework/testbed_model/capturing_traffic_generator.py b/dts/framework/testbed_model/capturing_traffic_generator.py new file mode 100644 index 00..7beeb139c1 --- /dev/null +++ b/dts/framework/testbed_model/capturing_traffic_generator.py @@ -0,0 +1,155 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2022 University of New Hampshire +# + +import itertools +import uuid +from abc import abstractmethod + +import scapy.utils +from scapy.packet import Packet + +from framework.testbed_model.hw.port import PortIdentifier +from framework.settings import SETTINGS + +from .traffic_generator import TrafficGenerator + + +def _get_default_capture_name() -> str: +""" +This is the function used for the default implementation of 
capture names. +""" +return str(uuid.uuid4()) + + +class CapturingTrafficGenerator(TrafficGenerator): +""" +A mixin interface which enables a packet generator to declare that it can capture +packets and return them to the user. + +All packet functions added by this class should write out the captured packets +to a pcap file in output, allowing for easier analysis of failed tests. +""" + +def is_capturing(self) -> bool: +return True + +@abstractmethod +def send_packet_and_capture( +self, +sen
[RFC PATCH v1 4/5] dts: scapy traffic generator implementation
Scapy is a traffic generator capable of sending and receiving traffic. Since it's a software traffic generator, it's not suitable for performance testing, but it is suitable for functional testing. Signed-off-by: Juraj Linkeš --- dts/framework/remote_session/linux_session.py | 55 +++ .../remote_session/remote/remote_session.py | 7 + dts/framework/testbed_model/scapy.py | 348 ++ 3 files changed, 410 insertions(+) create mode 100644 dts/framework/testbed_model/scapy.py diff --git a/dts/framework/remote_session/linux_session.py b/dts/framework/remote_session/linux_session.py index a1e3bc3a92..b99a27bba4 100644 --- a/dts/framework/remote_session/linux_session.py +++ b/dts/framework/remote_session/linux_session.py @@ -2,13 +2,29 @@ # Copyright(c) 2023 PANTHEON.tech s.r.o. # Copyright(c) 2023 University of New Hampshire +import json +from typing import TypedDict +from typing_extensions import NotRequired + + from framework.exception import RemoteCommandExecutionError from framework.testbed_model import LogicalCore +from framework.testbed_model.hw.port import PortIdentifier from framework.utils import expand_range from .posix_session import PosixSession +class LshwOutputConfigurationDict(TypedDict): +link: str + + +class LshwOutputDict(TypedDict): +businfo: str +logicalname: NotRequired[str] +configuration: LshwOutputConfigurationDict + + class LinuxSession(PosixSession): """ The implementation of non-Posix compliant parts of Linux remote sessions. 
@@ -105,3 +121,42 @@ def _configure_huge_pages( self.remote_session.send_command( f"echo {amount} | sudo tee {hugepage_config_path}" ) + +def get_lshw_info(self) -> list[LshwOutputDict]: +output = self.remote_session.send_expect("lshw -quiet -json -C network", "#") +assert not isinstance( +output, int +), "send_expect returned an int when it should have been a string" +return json.loads(output) + +def get_logical_name_of_port(self, id: PortIdentifier) -> str | None: +self._logger.debug(f"Searching for logical name of {id.pci}") +assert ( +id.node == self.name +), "Attempted to get the logical port name on the wrong node" +port_info_list: list[LshwOutputDict] = self.get_lshw_info() +for port_info in port_info_list: +if f"pci@{id.pci}" == port_info.get("businfo"): +if "logicalname" in port_info: +self._logger.debug( +f"Found logical name for port {id.pci}, {port_info.get('logicalname')}" +) +return port_info.get("logicalname") +else: +self._logger.warning( +f"Attempted to get the logical name of {id.pci}, but none existed" +) +return None +self._logger.warning(f"No port at pci address {id.pci} found.") +return None + +def check_link_is_up(self, id: PortIdentifier) -> bool | None: +self._logger.debug(f"Checking link status for {id.pci}") +port_info_list: list[LshwOutputDict] = self.get_lshw_info() +for port_info in port_info_list: +if f"pci@{id.pci}" == port_info.get("businfo"): +status = port_info["configuration"]["link"] +self._logger.debug(f"Found link status for port {id.pci}, {status}") +return status == "up" +self._logger.warning(f"No port at pci address {id.pci} found.") +return None diff --git a/dts/framework/remote_session/remote/remote_session.py b/dts/framework/remote_session/remote/remote_session.py index 91dee3cb4f..5b36e2d7d2 100644 --- a/dts/framework/remote_session/remote/remote_session.py +++ b/dts/framework/remote_session/remote/remote_session.py @@ -84,6 +84,13 @@ def _connect(self) -> None: Create connection to assigned node. 
""" +@abstractmethod +def send_expect( +self, command: str, prompt: str, timeout: float = 15, +verify: bool = False +) -> str | int: +"" + def send_command( self, command: str, diff --git a/dts/framework/testbed_model/scapy.py b/dts/framework/testbed_model/scapy.py new file mode 100644 index 00..1e5caab897 --- /dev/null +++ b/dts/framework/testbed_model/scapy.py @@ -0,0 +1,348 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2022 University of New Hampshire +# + +import inspect +import json +import marshal +import types +import xmlrpc.client +from typing import TypedDict +from xmlrpc.server import SimpleXMLRPCServer + +import scapy.all +from scapy.packet import Packet +from typing_extensions import NotRequired + +from framework.config import OS +from framework.logger import getLogger +from .tg_node import TGNode +from .hw.port import Port, PortIdentifier +from .capturing_traffic_generator
[RFC PATCH v1 5/5] dts: add traffic generator node to dts runner
Initialize the TG node and do basic verification. Signed-off-by: Juraj Linkeš --- dts/framework/dts.py| 42 - dts/framework/testbed_model/__init__.py | 1 + 2 files changed, 28 insertions(+), 15 deletions(-) diff --git a/dts/framework/dts.py b/dts/framework/dts.py index 0502284580..9c82bfe1f4 100644 --- a/dts/framework/dts.py +++ b/dts/framework/dts.py @@ -9,7 +9,7 @@ from .logger import DTSLOG, getLogger from .test_result import BuildTargetResult, DTSResult, ExecutionResult, Result from .test_suite import get_test_suites -from .testbed_model import SutNode +from .testbed_model import SutNode, TGNode, Node from .utils import check_dts_python_version dts_logger: DTSLOG = getLogger("DTSRunner") @@ -27,28 +27,40 @@ def run_all() -> None: # check the python version of the server that run dts check_dts_python_version() -nodes: dict[str, SutNode] = {} +nodes: dict[str, Node] = {} try: # for all Execution sections for execution in CONFIGURATION.executions: sut_node = None +tg_node = None if execution.system_under_test.name in nodes: # a Node with the same name already exists sut_node = nodes[execution.system_under_test.name] -else: -# the SUT has not been initialized yet -try: + +if execution.traffic_generator_system.name in nodes: +# a Node with the same name already exists +tg_node = nodes[execution.traffic_generator_system.name] + +try: +if not sut_node: sut_node = SutNode(execution.system_under_test) -result.update_setup(Result.PASS) -except Exception as e: -dts_logger.exception( -f"Connection to node {execution.system_under_test} failed." -) -result.update_setup(Result.FAIL, e) -else: -nodes[sut_node.name] = sut_node - -if sut_node: +if not tg_node: +tg_node = TGNode(execution.traffic_generator_system) +tg_node.verify() +result.update_setup(Result.PASS) +except Exception as e: +failed_node = execution.system_under_test.name +if sut_node: +failed_node = execution.traffic_generator_system.name +dts_logger.exception( +f"Creation of node {failed_node} failed." 
+) +result.update_setup(Result.FAIL, e) +else: +nodes[sut_node.name] = sut_node +nodes[tg_node.name] = tg_node + +if sut_node and tg_node: _run_execution(sut_node, execution, result) except Exception as e: diff --git a/dts/framework/testbed_model/__init__.py b/dts/framework/testbed_model/__init__.py index f54a947051..5cbb859e47 100644 --- a/dts/framework/testbed_model/__init__.py +++ b/dts/framework/testbed_model/__init__.py @@ -20,3 +20,4 @@ ) from .node import Node from .sut_node import SutNode +from .tg_node import TGNode -- 2.30.2
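The runner changes above hinge on caching nodes by name so that executions sharing a SUT or TG configuration reuse one already-connected node. A minimal sketch of that caching pattern, with a simplified stand-in `Node` class and a hypothetical execution list:

```python
class Node:
    """Simplified stand-in for the framework's Node/SutNode/TGNode."""
    def __init__(self, name: str):
        self.name = name

created: list[str] = []

def get_or_create_node(nodes: dict[str, Node], name: str) -> Node:
    node = nodes.get(name)
    if node is None:
        node = Node(name)        # the expensive step: opens a remote session
        created.append(name)
        nodes[name] = node
    return node

nodes: dict[str, Node] = {}
executions = [("sut1", "tg1"), ("sut1", "tg2")]   # hypothetical config
for sut_name, tg_name in executions:
    sut_node = get_or_create_node(nodes, sut_name)
    tg_node = get_or_create_node(nodes, tg_name)

print(created)  # each node created once: ['sut1', 'tg1', 'tg2']
```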
[PATCH] crypto/qat: support to enable insecure algorithms
All insecure algorithms are disabled by default in cryptodev. Use qat_legacy_capa to enable the legacy algorithms. Signed-off-by: Vikash Poddar --- drivers/common/qat/qat_device.c | 1 + drivers/common/qat/qat_device.h | 3 +- drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c | 90 drivers/crypto/qat/qat_crypto.h | 1 + drivers/crypto/qat/qat_sym.c | 3 + 5 files changed, 62 insertions(+), 36 deletions(-) diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c index 8bce2ac073..b8da684973 100644 --- a/drivers/common/qat/qat_device.c +++ b/drivers/common/qat/qat_device.c @@ -365,6 +365,7 @@ static int qat_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct qat_pci_device *qat_pci_dev; struct qat_dev_hw_spec_funcs *ops_hw; struct qat_dev_cmd_param qat_dev_cmd_param[] = { + { QAT_LEGACY_CAPA, 0 }, { QAT_IPSEC_MB_LIB, 0 }, { SYM_ENQ_THRESHOLD_NAME, 0 }, { ASYM_ENQ_THRESHOLD_NAME, 0 }, diff --git a/drivers/common/qat/qat_device.h b/drivers/common/qat/qat_device.h index bc3da04238..12b8cc46b1 100644 --- a/drivers/common/qat/qat_device.h +++ b/drivers/common/qat/qat_device.h @@ -17,12 +17,13 @@ #define QAT_DEV_NAME_MAX_LEN 64 +#define QAT_LEGACY_CAPA "qat_legacy_capa" #define QAT_IPSEC_MB_LIB "qat_ipsec_mb_lib" #define SYM_ENQ_THRESHOLD_NAME "qat_sym_enq_threshold" #define ASYM_ENQ_THRESHOLD_NAME "qat_asym_enq_threshold" #define COMP_ENQ_THRESHOLD_NAME "qat_comp_enq_threshold" #define QAT_CMD_SLICE_MAP "qat_cmd_slice_disable" -#define QAT_CMD_SLICE_MAP_POS 4 +#define QAT_CMD_SLICE_MAP_POS 5 #define MAX_QP_THRESHOLD_SIZE 32 /** diff --git a/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c b/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c index 60ca0fc0d2..3cd1c42d94 100644 --- a/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c +++ b/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c @@ -12,10 +12,39 @@ #define MIXED_CRYPTO_MIN_FW_VER 0x0409 -static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { +static struct rte_cryptodev_capabilities 
qat_sym_crypto_legacy_caps_gen2[] = { + QAT_SYM_CIPHER_CAP(DES_CBC, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(3DES_CBC, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(3DES_CTR, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 16, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(DES_DOCSISBPI, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 8, 0), CAP_RNG(iv_size, 8, 8, 0)), QAT_SYM_PLAIN_AUTH_CAP(SHA1, CAP_SET(block_size, 64), CAP_RNG(digest_size, 1, 20, 1)), + QAT_SYM_AUTH_CAP(SHA224, + CAP_SET(block_size, 64), + CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 28, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + QAT_SYM_AUTH_CAP(SHA224_HMAC, + CAP_SET(block_size, 64), + CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 28, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + QAT_SYM_AUTH_CAP(SHA1_HMAC, + CAP_SET(block_size, 64), + CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 20, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST() +}; + + +static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { QAT_SYM_AEAD_CAP(AES_GCM, CAP_SET(block_size, 16), CAP_RNG(key_size, 16, 32, 8), CAP_RNG(digest_size, 8, 16, 4), @@ -32,10 +61,6 @@ static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { CAP_SET(block_size, 16), CAP_RNG(key_size, 16, 16, 0), CAP_RNG(digest_size, 4, 16, 4), CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), - QAT_SYM_AUTH_CAP(SHA224, - CAP_SET(block_size, 64), - CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 28, 1), - CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), QAT_SYM_AUTH_CAP(SHA256, CAP_SET(block_size, 64), CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 32, 1), @@ -51,14 +76,6 @@ static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { QAT_SYM_PLAIN_AUTH_CAP(SHA3_256, CAP_SET(block_size, 136), CAP_RNG(digest_size, 
32, 32, 0)), - QAT_SYM_AUTH_CAP(SHA1_HMAC, - CAP_SET(block_size, 64), - CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 20, 1), - CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO
[RFC 1/5] app/testpmd: add trace dump command
The "dump_trace" CLI command is added to trigger saving the trace dumps to the trace directory. Signed-off-by: Viacheslav Ovsiienko --- app/test-pmd/cmdline.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 7b20bef4e9..be9e3a9ed6 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -39,6 +39,7 @@ #include #endif #include +#include #include #include @@ -8367,6 +8368,8 @@ static void cmd_dump_parsed(void *parsed_result, rte_lcore_dump(stdout); else if (!strcmp(res->dump, "dump_log_types")) rte_log_dump(stdout); + else if (!strcmp(res->dump, "dump_trace")) + rte_trace_save(); } static cmdline_parse_token_string_t cmd_dump_dump = @@ -8379,7 +8382,8 @@ static cmdline_parse_token_string_t cmd_dump_dump = "dump_mempool#" "dump_devargs#" "dump_lcores#" - "dump_log_types"); + "dump_log_types#" + "dump_trace"); static cmdline_parse_inst_t cmd_dump = { .f = cmd_dump_parsed, /* function to call */ -- 2.18.1
[RFC 0/5] net/mlx5: introduce Tx datapath tracing
The mlx5 PMD provides send scheduling at a specific moment in time, and for the related kind of applications it would be extremely useful to have extra debug information - when and how packets were scheduled and when the actual sending was completed by the NIC hardware (it helps the application to track internal delay issues). Because the DPDK Tx datapath API does not provide for any feedback from the driver and the feature looks to be mlx5-specific, it seems reasonable to engage the existing DPDK datapath tracing capability. The work cycle is supposed to be: - compile the application with tracing enabled - run the application with EAL parameters configuring the tracing in the mlx5 Tx datapath - store the dump file with the gathered tracing information - run the analyzing script (in Python) to combine related events (packet firing and completion) and see the data in a human-readable view Below is a detailed "how to" for gathering all the debug data, including the full timings information, with an mlx5 NIC. 1. Build the DPDK application with datapath tracing enabled The meson option should be specified: -Denable_trace_fp=true The c_args should be specified: -DALLOW_EXPERIMENTAL_API The DPDK configuration examples: meson configure --buildtype=debug -Denable_trace_fp=true -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=debug -Denable_trace_fp=true -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=release -Denable_trace_fp=true -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=release -Denable_trace_fp=true -Dc_args='-DALLOW_EXPERIMENTAL_API' build 2. 
Configuring the NIC If the send completion timings are important, the NIC should be configured to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV settings parameter should be configured to TRUE, for example with the following command (followed by an FW/driver reset): sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1 3. Run the DPDK application to gather the traces EAL parameters controlling the trace capability at runtime: --trace=pmd.net.mlx5.tx - the regular expression enabling tracepoints with matching names. At least "pmd.net.mlx5.tx" must be enabled to gather all events needed to analyze the mlx5 Tx datapath and its timings. By default all tracepoints are disabled. --trace-dir=/var/log - trace storing directory --trace-bufsz=B|K|M - optional, trace data buffer size per thread. The default is 1MB. --trace-mode=overwrite|discard - optional, selects trace data buffer mode. 4. Installing or Building the Babeltrace2 Package The gathered trace data can be analyzed with the provided Python script. To parse the trace data, the script uses the Babeltrace2 library. The package should be either installed or built from source code as shown below: git clone https://github.com/efficios/babeltrace.git cd babeltrace ./bootstrap ./configure --disable-api-doc --disable-man-pages --disable-python-bindings-doc --enable-python-plugins --enable-python-binding 5. Running the Analyzing Script The analyzing script is located in the folder: ./drivers/net/mlx5/tools It requires Python 3.6 and the Babeltrace2 package, and takes the trace data file as its only parameter. For example: ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39 6. Interpreting the Script Output Data All the timings are given in nanoseconds. The list of Tx (and upcoming Rx) bursts per port/queue is presented in the output. Each list element contains the list of built WQEs with specific opcodes, and each WQE contains the list of the encompassed packets to send. 
Signed-off-by: Viacheslav Ovsiienko Viacheslav Ovsiienko (5): app/testpmd: add trace dump command common/mlx5: introduce tracepoints for mlx5 drivers net/mlx5: add Tx datapath tracing net/mlx5: add comprehensive send completion trace net/mlx5: add Tx datapath trace analyzing script app/test-pmd/cmdline.c | 6 +- drivers/common/mlx5/meson.build | 1 + drivers/common/mlx5/mlx5_trace.c | 25 +++ drivers/common/mlx5/mlx5_trace.h | 72 +++ drivers/common/mlx5/version.map | 8 + drivers/net/mlx5/linux/mlx5_verbs.c | 8 +- drivers/net/mlx5/mlx5_devx.c | 8 +- drivers/net/mlx5/mlx5_rx.h | 19 -- drivers/net/mlx5/mlx5_rxtx.h | 19 ++ drivers/net/mlx5/mlx5_tx.c | 9 + drivers/net/mlx5/mlx5_tx.h | 88 - drivers/net/mlx5/tools/mlx5_trace.py | 271 +++ 12 files changed, 504 insertions(+), 30 deletions(-) create mode 100644 drivers/common/mlx5/mlx5_trace.c create mode 100644 drivers/c
[RFC 2/5] common/mlx5: introduce tracepoints for mlx5 drivers
There is an intention to engage DPDK tracing capabilities for mlx5 PMDs monitoring and profiling in various modes. The patch introduces tracepoints for the Tx datapath in the ethernet device driver. Signed-off-by: Viacheslav Ovsiienko --- drivers/common/mlx5/meson.build | 1 + drivers/common/mlx5/mlx5_trace.c | 25 +++ drivers/common/mlx5/mlx5_trace.h | 72 drivers/common/mlx5/version.map | 8 4 files changed, 106 insertions(+) create mode 100644 drivers/common/mlx5/mlx5_trace.c create mode 100644 drivers/common/mlx5/mlx5_trace.h diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build index 9dc809f192..e074ffb140 100644 --- a/drivers/common/mlx5/meson.build +++ b/drivers/common/mlx5/meson.build @@ -19,6 +19,7 @@ sources += files( 'mlx5_common_mp.c', 'mlx5_common_mr.c', 'mlx5_malloc.c', +'mlx5_trace.c', 'mlx5_common_pci.c', 'mlx5_common_devx.c', 'mlx5_common_utils.c', diff --git a/drivers/common/mlx5/mlx5_trace.c b/drivers/common/mlx5/mlx5_trace.c new file mode 100644 index 00..b9f14413ad --- /dev/null +++ b/drivers/common/mlx5/mlx5_trace.c @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2022 NVIDIA Corporation & Affiliates + */ + +#include +#include + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_entry, + pmd.net.mlx5.tx.entry) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_exit, + pmd.net.mlx5.tx.exit) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wqe, + pmd.net.mlx5.tx.wqe) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wait, + pmd.net.mlx5.tx.wait) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_push, + pmd.net.mlx5.tx.push) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_complete, + pmd.net.mlx5.tx.complete) + diff --git a/drivers/common/mlx5/mlx5_trace.h b/drivers/common/mlx5/mlx5_trace.h new file mode 100644 index 00..57512e654f --- /dev/null +++ b/drivers/common/mlx5/mlx5_trace.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2022 NVIDIA Corporation & Affiliates + 
*/ + +#ifndef RTE_PMD_MLX5_TRACE_H_ +#define RTE_PMD_MLX5_TRACE_H_ + +/** + * @file + * + * API for mlx5 PMD trace support + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include +#include + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_entry, + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id), + rte_trace_point_emit_u16(port_id); + rte_trace_point_emit_u16(queue_id); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_exit, + RTE_TRACE_POINT_ARGS(uint16_t nb_sent, uint16_t nb_req), + rte_trace_point_emit_u16(nb_sent); + rte_trace_point_emit_u16(nb_req); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_wqe, + RTE_TRACE_POINT_ARGS(uint32_t opcode), + rte_trace_point_emit_u32(opcode); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_wait, + RTE_TRACE_POINT_ARGS(uint64_t ts), + rte_trace_point_emit_u64(ts); +) + + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_push, + RTE_TRACE_POINT_ARGS(const struct rte_mbuf *mbuf, uint16_t wqe_id), + rte_trace_point_emit_ptr(mbuf); + rte_trace_point_emit_u32(mbuf->pkt_len); + rte_trace_point_emit_u16(mbuf->nb_segs); + rte_trace_point_emit_u16(wqe_id); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_complete, + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id, +uint16_t wqe_id, uint64_t ts), + rte_trace_point_emit_u16(port_id); + rte_trace_point_emit_u16(queue_id); + rte_trace_point_emit_u64(ts); + rte_trace_point_emit_u16(wqe_id); +) + +#ifdef __cplusplus +} +#endif + +#endif /* RTE_PMD_MLX5_TRACE_H_ */ diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index e05e1aa8c5..d0ec8571e6 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -158,5 +158,13 @@ INTERNAL { mlx5_os_interrupt_handler_create; # WINDOWS_NO_EXPORT mlx5_os_interrupt_handler_destroy; # WINDOWS_NO_EXPORT + + __rte_pmd_mlx5_trace_tx_entry; + __rte_pmd_mlx5_trace_tx_exit; + __rte_pmd_mlx5_trace_tx_wqe; + __rte_pmd_mlx5_trace_tx_wait; + __rte_pmd_mlx5_trace_tx_push; + 
__rte_pmd_mlx5_trace_tx_complete; + local: *; }; -- 2.18.1
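The tracepoints registered above come in pairs per burst: pmd.net.mlx5.tx.entry at tx_burst start and pmd.net.mlx5.tx.exit at its end, which lets a post-processor compute per-burst durations by pairing the two events per CPU. A sketch on synthetic events — the event dicts and field names here are simplified stand-ins for what a real trace reader yields:

```python
events = [  # synthetic events in timestamp order, one CPU
    {"name": "pmd.net.mlx5.tx.entry", "cpu_id": 0, "ts": 1000},
    {"name": "pmd.net.mlx5.tx.wqe",   "cpu_id": 0, "ts": 1010},
    {"name": "pmd.net.mlx5.tx.exit",  "cpu_id": 0, "ts": 1500},
]

open_bursts: dict[int, int] = {}   # cpu_id -> entry timestamp
durations: list[int] = []

for ev in events:
    if ev["name"] == "pmd.net.mlx5.tx.entry":
        open_bursts[ev["cpu_id"]] = ev["ts"]
    elif ev["name"] == "pmd.net.mlx5.tx.exit":
        start = open_bursts.pop(ev["cpu_id"], None)
        if start is not None:
            durations.append(ev["ts"] - start)

print(durations)  # [500]
```

Patch 5/5 applies the same per-CPU pairing idea, keyed by `cpu_id`, in mlx5_trace.py.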
[RFC 3/5] net/mlx5: add Tx datapath tracing
The patch adds tracing capability to the Tx datapath. To engage this tracing capability, the following steps should be taken: - meson option -Denable_trace_fp=true - meson option -Dc_args='-DALLOW_EXPERIMENTAL_API' - EAL command line parameter --trace=pmd.net.mlx5.tx.* The Tx datapath tracing allows getting information on how packets are pushed into hardware descriptors, timestamping for scheduled wait and send completions, etc. To provide a human-readable form of the trace results, a dedicated post-processing script is provided. Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/mlx5_rx.h | 19 --- drivers/net/mlx5/mlx5_rxtx.h | 19 +++ drivers/net/mlx5/mlx5_tx.c | 9 + drivers/net/mlx5/mlx5_tx.h | 25 +++-- 4 files changed, 51 insertions(+), 21 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h index 8b87adad36..1b5f110ccc 100644 --- a/drivers/net/mlx5/mlx5_rx.h +++ b/drivers/net/mlx5/mlx5_rx.h @@ -376,25 +376,6 @@ mlx5_rx_mb2mr(struct mlx5_rxq_data *rxq, struct rte_mbuf *mb) return mlx5_mr_mempool2mr_bh(mr_ctrl, mb->pool, addr); } -/** - * Convert timestamp from HW format to linear counter - * from Packet Pacing Clock Queue CQE timestamp format. - * - * @param sh - * Pointer to the device shared context. Might be needed - * to convert according current device configuration. - * @param ts - * Timestamp from CQE to convert. - * @return - * UTC in nanoseconds - */ -static __rte_always_inline uint64_t -mlx5_txpp_convert_rx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t ts) -{ - RTE_SET_USED(sh); - return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S; -} - /** * Set timestamp in mbuf dynamic field. 
* diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 876aa14ae6..b109d50758 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -43,4 +43,23 @@ int mlx5_queue_state_modify_primary(struct rte_eth_dev *dev, int mlx5_queue_state_modify(struct rte_eth_dev *dev, struct mlx5_mp_arg_queue_state_modify *sm); +/** + * Convert timestamp from HW format to linear counter + * from Packet Pacing Clock Queue CQE timestamp format. + * + * @param sh + * Pointer to the device shared context. Might be needed + * to convert according current device configuration. + * @param ts + * Timestamp from CQE to convert. + * @return + * UTC in nanoseconds + */ +static __rte_always_inline uint64_t +mlx5_txpp_convert_rx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t ts) +{ + RTE_SET_USED(sh); + return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S; +} + #endif /* RTE_PMD_MLX5_RXTX_H_ */ diff --git a/drivers/net/mlx5/mlx5_tx.c b/drivers/net/mlx5/mlx5_tx.c index 14e1487e59..1fe9521dfc 100644 --- a/drivers/net/mlx5/mlx5_tx.c +++ b/drivers/net/mlx5/mlx5_tx.c @@ -232,6 +232,15 @@ mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq, MLX5_ASSERT((txq->fcqs[txq->cq_ci & txq->cqe_m] >> 16) == cqe->wqe_counter); #endif + if (__rte_trace_point_fp_is_enabled()) { + uint64_t ts = rte_be_to_cpu_64(cqe->timestamp); + uint16_t wqe_id = rte_be_to_cpu_16(cqe->wqe_counter); + + if (txq->rt_timestamp) + ts = mlx5_txpp_convert_rx_ts(NULL, ts); + rte_pmd_mlx5_trace_tx_complete(txq->port_id, txq->idx, + wqe_id, ts); + } ring_doorbell = true; ++txq->cq_ci; last_cqe = cqe; diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h index cc8f7e98aa..7f624de58e 100644 --- a/drivers/net/mlx5/mlx5_tx.h +++ b/drivers/net/mlx5/mlx5_tx.h @@ -19,6 +19,8 @@ #include "mlx5.h" #include "mlx5_autoconf.h" +#include "mlx5_trace.h" +#include "mlx5_rxtx.h" /* TX burst subroutines return codes. 
*/ enum mlx5_txcmp_code { @@ -764,6 +766,9 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq, cs->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); cs->misc = RTE_BE32(0); + if (__rte_trace_point_fp_is_enabled() && !loc->pkts_sent) + rte_pmd_mlx5_trace_tx_entry(txq->port_id, txq->idx); + rte_pmd_mlx5_trace_tx_wqe((txq->wqe_ci << 8) | opcode); } /** @@ -1692,6 +1697,7 @@ mlx5_tx_schedule_send(struct mlx5_txq_data *restrict txq, if (txq->wait_on_time) { /* The wait on time capability should be used. */ ts -= sh->txpp.skew; + rte_pmd_mlx5_trace_tx_wait(ts); mlx5_tx_cseg_init(txq, loc, wqe, 1 + sizeof(struct mlx5_wqe_wseg) / MLX5_WSEG_SIZE, @@ -1706,6 +17
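The mlx5_txpp_convert_rx_ts() helper that this patch moves to mlx5_rxtx.h converts the Packet Pacing CQE timestamp to a linear nanosecond counter; its formula implies the hardware packs seconds in the high 32 bits and nanoseconds in the low 32 bits. A quick Python restatement of the same arithmetic:

```python
NS_PER_S = 1_000_000_000

def txpp_convert_ts(ts: int) -> int:
    # Mirrors mlx5_txpp_convert_rx_ts(): low 32 bits are nanoseconds,
    # high 32 bits are seconds, result is a linear ns counter.
    return (ts & 0xFFFFFFFF) + (ts >> 32) * NS_PER_S

ts = (2 << 32) | 5          # 2 seconds, 5 ns in the packed CQE format
print(txpp_convert_ts(ts))  # 2000000005
```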
[RFC 4/5] net/mlx5: add comprehensive send completion trace
There is a demand to trace the send completion of every WQE if time scheduling is enabled. The patch extends the size of the completion queue and requests a completion on every issued WQE in the send queue. As a result, the hardware provides a CQE for each completed WQE and the driver is able to fetch the completion timestamp for the dedicated operation. The added code is under the RTE_ENABLE_TRACE_FP conditional compilation flag and does not impact the release code. Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/linux/mlx5_verbs.c | 8 +++- drivers/net/mlx5/mlx5_devx.c| 8 +++- drivers/net/mlx5/mlx5_tx.h | 63 +++-- 3 files changed, 71 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/linux/mlx5_verbs.c b/drivers/net/mlx5/linux/mlx5_verbs.c index 67a7bec22b..f3f717f17b 100644 --- a/drivers/net/mlx5/linux/mlx5_verbs.c +++ b/drivers/net/mlx5/linux/mlx5_verbs.c @@ -968,8 +968,12 @@ mlx5_txq_ibv_obj_new(struct rte_eth_dev *dev, uint16_t idx) rte_errno = EINVAL; return -rte_errno; } - cqe_n = desc / MLX5_TX_COMP_THRESH + - 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; + if (__rte_trace_point_fp_is_enabled() && + txq_data->offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) + cqe_n = UINT16_MAX / 2 - 1; + else + cqe_n = desc / MLX5_TX_COMP_THRESH + + 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; txq_obj->cq = mlx5_glue->create_cq(priv->sh->cdev->ctx, cqe_n, NULL, NULL, 0); if (txq_obj->cq == NULL) { diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c index 4369d2557e..5082a7e178 100644 --- a/drivers/net/mlx5/mlx5_devx.c +++ b/drivers/net/mlx5/mlx5_devx.c @@ -1465,8 +1465,12 @@ mlx5_txq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx) MLX5_ASSERT(ppriv); txq_obj->txq_ctrl = txq_ctrl; txq_obj->dev = dev; - cqe_n = (1UL << txq_data->elts_n) / MLX5_TX_COMP_THRESH + - 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; + if (__rte_trace_point_fp_is_enabled() && + txq_data->offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) + cqe_n = UINT16_MAX / 2 - 1; + else + cqe_n = (1UL << txq_data->elts_n) / 
MLX5_TX_COMP_THRESH + + 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; log_desc_n = log2above(cqe_n); cqe_n = 1UL << log_desc_n; if (cqe_n > UINT16_MAX) { diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h index 7f624de58e..9f29df280f 100644 --- a/drivers/net/mlx5/mlx5_tx.h +++ b/drivers/net/mlx5/mlx5_tx.h @@ -728,6 +728,54 @@ mlx5_tx_request_completion(struct mlx5_txq_data *__rte_restrict txq, } } +/** + * Set completion request flag for all issued WQEs. + * This routine is intended to be used with enabled fast path tracing + * and send scheduling on time to provide the detailed report in trace + * for send completions on every WQE. + * + * @param txq + * Pointer to TX queue structure. + * @param loc + * Pointer to burst routine local context. + * @param olx + * Configured Tx offloads mask. It is fully defined at + * compile time and may be used for optimization. + */ +static __rte_always_inline void +mlx5_tx_request_completion_trace(struct mlx5_txq_data *__rte_restrict txq, +struct mlx5_txq_local *__rte_restrict loc, +unsigned int olx) +{ + uint16_t head = txq->elts_comp; + + while (txq->wqe_comp != txq->wqe_ci) { + volatile struct mlx5_wqe *wqe; + uint32_t wqe_n; + + MLX5_ASSERT(loc->wqe_last); + wqe = txq->wqes + (txq->wqe_comp & txq->wqe_m); + if (wqe == loc->wqe_last) { + head = txq->elts_head; + head += MLX5_TXOFF_CONFIG(INLINE) ? + 0 : loc->pkts_sent - loc->pkts_copy; + txq->elts_comp = head; + } + /* Completion request flag was set on cseg constructing. */ +#ifdef RTE_LIBRTE_MLX5_DEBUG + txq->fcqs[txq->cq_pi++ & txq->cqe_m] = head | + (wqe->cseg.opcode >> 8) << 16; +#else + txq->fcqs[txq->cq_pi++ & txq->cqe_m] = head; +#endif + /* A CQE slot must always be available. */ + MLX5_ASSERT((txq->cq_pi - txq->cq_ci) <= txq->cqe_s); + /* Advance to the next WQE in the queue. 
*/ + wqe_n = rte_be_to_cpu_32(wqe->cseg.sq_ds) & 0x3F; + txq->wqe_comp += RTE_ALIGN(wqe_n, 4) / 4; + } +} + /** * Build the Control Segment with specified opcode: * - MLX5_OPCODE_SEND @@ -754,7 +802,7 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq, struct mlx5_wqe *__rte_restrict wqe, unsigned int ds, unsigned i
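To correlate these per-WQE completions, the trace side packs the WQE index and opcode into one value ((wqe_ci << 8) | opcode, as emitted via rte_pmd_mlx5_trace_tx_wqe in patch 3/5), and the analysis script of patch 5/5 unpacks it again. A sketch of both directions in Python (the helper names here are mine, not from the patch):

```python
def pack_wqe_event(wqe_ci: int, opcode: int) -> int:
    # What the driver emits: rte_pmd_mlx5_trace_tx_wqe((wqe_ci << 8) | opcode)
    return ((wqe_ci & 0xFFFF) << 8) | (opcode & 0xFF)

def unpack_wqe_event(value: int) -> tuple[int, int]:
    # What mlx5_trace.py recovers: WQE id from bits 8..23, opcode from bits 0..7
    return (value >> 8) & 0xFFFF, value & 0xFF

packed = pack_wqe_event(0x1234, 0x0A)  # 0x0A is SEND in the script's opcode table
print(unpack_wqe_event(packed))        # (4660, 10)
```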
[RFC 5/5] net/mlx5: add Tx datapath trace analyzing script
The Python script is intended to analyze mlx5 PMD datapath traces and report: - tx_burst routine timings - how packets are pushed to WQEs - how packet sending is completed with timings Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/tools/mlx5_trace.py | 271 +++ 1 file changed, 271 insertions(+) create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py diff --git a/drivers/net/mlx5/tools/mlx5_trace.py b/drivers/net/mlx5/tools/mlx5_trace.py new file mode 100755 index 00..c8fa63a7b9 --- /dev/null +++ b/drivers/net/mlx5/tools/mlx5_trace.py @@ -0,0 +1,271 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright (c) 2023 NVIDIA Corporation & Affiliates + +''' +Analyzing the mlx5 PMD datapath tracings +''' +import sys +import argparse +import pathlib +import bt2 + +PFX_TX = "pmd.net.mlx5.tx." +PFX_TX_LEN = len(PFX_TX) + +tx_blst = {}# current Tx bursts per CPU +tx_qlst = {}# active Tx queues per port/queue +tx_wlst = {}# wait timestamp list per CPU + +class mlx5_queue(object): +def __init__(self): +self.done_burst = []# completed bursts +self.wait_burst = []# waiting for completion +self.pq_id = 0 + +def log(self): +for txb in self.done_burst: +txb.log() + + +class mlx5_mbuf(object): +def __init__(self): +self.wqe = 0# wqe id +self.ptr = None # first packet mbuf pointer +self.len = 0# packet data length +self.nseg = 0 # number of segments + +def log(self): +out = "%X: %u" % (self.ptr, self.len) +if self.nseg != 1: +out += " (%d segs)" % self.nseg +print(out) + + +class mlx5_wqe(object): +def __init__(self): +self.mbuf = [] # list of mbufs in WQE +self.wait_ts = 0# preceding wait/push timestamp +self.comp_ts = 0# send/recv completion timestamp +self.opcode = 0 + +def log(self): +id = (self.opcode >> 8) & 0x +op = self.opcode & 0xFF +fl = self.opcode >> 24 +out = " %04X: " % id +if op == 0xF: +out += "WAIT" +elif op == 0x29: +out += "EMPW" +elif op == 0xE: +out += "TSO " +elif op == 0xA: +out += "SEND" +else: +out += "0x%02X" % op +if 
self.comp_ts != 0: +out += " (%d, %d)" % (self.wait_ts, self.comp_ts - self.wait_ts) +else: +out += " (%d)" % self.wait_ts +print(out) +for mbuf in self.mbuf: +mbuf.log() + +# return 0 if WQE in not completed +def comp(self, wqe_id, ts): +if self.comp_ts != 0: +return 1 +id = (self.opcode >> 8) & 0x +if id > wqe_id: +id -= wqe_id +if id <= 0x8000: +return 0 +else: +id = wqe_id - id +if id >= 0x8000: +return 0 +self.comp_ts = ts +return 1 + + +class mlx5_burst(object): +def __init__(self): +self.wqes = [] # issued burst WQEs +self.done = 0 # number of sent/recv packets +self.req = 0# requested number of packets +self.call_ts = 0# burst routine invocation +self.done_ts = 0# burst routine done +self.queue = None + +def log(self): +port = self.queue.pq_id >> 16 +queue = self.queue.pq_id & 0x +if self.req == 0: +print("%u: tx(p=%u, q=%u, %u/%u pkts (incomplete)" % + (self.call_ts, port, queue, self.done, self.req)) +else: +print("%u: tx(p=%u, q=%u, %u/%u pkts in %u" % + (self.call_ts, port, queue, self.done, self.req, + self.done_ts - self.call_ts)) +for wqe in self.wqes: +wqe.log() + +# return 0 if not all of WQEs in burst completed +def comp(self, wqe_id, ts): +wlen = len(self.wqes) +if wlen == 0: +return 0 +for wqe in self.wqes: +if wqe.comp(wqe_id, ts) == 0: +return 0 +return 1 + + +def do_tx_entry(msg): +event = msg.event +cpu_id = event["cpu_id"] +burst = tx_blst.get(cpu_id) +if burst is not None: +# continue existing burst after WAIT +return +# allocate the new burst and append to the queue +burst = mlx5_burst() +burst.call_ts = msg.default_clock_snapshot.ns_from_origin +tx_blst[cpu_id] = burst +pq_id = event["port_id"] << 16 | event["queue_id"] +queue = tx_qlst.get(pq_id) +if queue is None: +# queue does not exist - allocate the new one +queue = mlx5_queue(); +queue.pq_id = pq_id +tx_qlst[pq_id] = queue +burst.queue = queue +queue.wait_burst.ap
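The comp() method of mlx5_wqe above compares 16-bit WQE ids with two explicit branches around the 0x8000 midpoint so the comparison survives counter wrap-around. The branches collapse into a single modulo-2^16 distance check; a sketch with an invented helper name, assuming the same midpoint convention:

```python
def wqe_reached(own_id: int, completed_id: int) -> bool:
    """True when a CQE reporting completed_id also covers own_id.

    Equivalent to the two branches in mlx5_wqe.comp(): a WQE id counts as
    "behind or equal" when its modulo-2^16 distance from the completed id
    is below the 0x8000 midpoint, which remains correct when the 16-bit
    id counter wraps.
    """
    return ((completed_id - own_id) & 0xFFFF) < 0x8000
```

For example, a WQE with id 0xFFF0 is considered completed by a CQE reporting id 0x0005, even though numerically 5 < 0xFFF0.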
Re: [RFC 2/5] common/mlx5: introduce tracepoints for mlx5 drivers
On Thu, Apr 20, 2023 at 3:38 PM Viacheslav Ovsiienko wrote: > > There is an intention to engage DPDK tracing capabilities > for mlx5 PMDs monitoring and profiling in various modes. > The patch introduces tracepoints for the Tx datapath in > the ethernet device driver. > > Signed-off-by: Viacheslav Ovsiienko > --- > drivers/common/mlx5/meson.build | 1 + > drivers/common/mlx5/mlx5_trace.c | 25 +++ > drivers/common/mlx5/mlx5_trace.h | 72 > drivers/common/mlx5/version.map | 8 > 4 files changed, 106 insertions(+) > create mode 100644 drivers/common/mlx5/mlx5_trace.c > create mode 100644 drivers/common/mlx5/mlx5_trace.h > > diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build > index 9dc809f192..e074ffb140 100644 > --- a/drivers/common/mlx5/meson.build > +++ b/drivers/common/mlx5/meson.build > @@ -19,6 +19,7 @@ sources += files( > 'mlx5_common_mp.c', > 'mlx5_common_mr.c', > 'mlx5_malloc.c', > +'mlx5_trace.c', > 'mlx5_common_pci.c', > 'mlx5_common_devx.c', > 'mlx5_common_utils.c', > diff --git a/drivers/common/mlx5/mlx5_trace.c > b/drivers/common/mlx5/mlx5_trace.c > new file mode 100644 > index 00..b9f14413ad > --- /dev/null > +++ b/drivers/common/mlx5/mlx5_trace.c > @@ -0,0 +1,25 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright (c) 2022 NVIDIA Corporation & Affiliates > + */ > + > +#include > +#include > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_entry, > + pmd.net.mlx5.tx.entry) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_exit, > + pmd.net.mlx5.tx.exit) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wqe, > + pmd.net.mlx5.tx.wqe) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wait, > + pmd.net.mlx5.tx.wait) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_push, > + pmd.net.mlx5.tx.push) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_complete, > + pmd.net.mlx5.tx.complete) > + > diff --git a/drivers/common/mlx5/mlx5_trace.h > b/drivers/common/mlx5/mlx5_trace.h > new file mode 100644 > 
index 00..57512e654f > --- /dev/null > +++ b/drivers/common/mlx5/mlx5_trace.h > @@ -0,0 +1,72 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright (c) 2022 NVIDIA Corporation & Affiliates > + */ > + > +#ifndef RTE_PMD_MLX5_TRACE_H_ > +#define RTE_PMD_MLX5_TRACE_H_ > + > +/** > + * @file > + * > + * API for mlx5 PMD trace support > + */ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include > +#include > +#include > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_entry, > + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id), > + rte_trace_point_emit_u16(port_id); > + rte_trace_point_emit_u16(queue_id); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_exit, > + RTE_TRACE_POINT_ARGS(uint16_t nb_sent, uint16_t nb_req), > + rte_trace_point_emit_u16(nb_sent); > + rte_trace_point_emit_u16(nb_req); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_wqe, > + RTE_TRACE_POINT_ARGS(uint32_t opcode), > + rte_trace_point_emit_u32(opcode); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_wait, > + RTE_TRACE_POINT_ARGS(uint64_t ts), > + rte_trace_point_emit_u64(ts); > +) > + > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_push, > + RTE_TRACE_POINT_ARGS(const struct rte_mbuf *mbuf, uint16_t wqe_id), > + rte_trace_point_emit_ptr(mbuf); > + rte_trace_point_emit_u32(mbuf->pkt_len); > + rte_trace_point_emit_u16(mbuf->nb_segs); > + rte_trace_point_emit_u16(wqe_id); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_complete, > + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id, > +uint16_t wqe_id, uint64_t ts), > + rte_trace_point_emit_u16(port_id); > + rte_trace_point_emit_u16(queue_id); > + rte_trace_point_emit_u64(ts); > + rte_trace_point_emit_u16(wqe_id); > +) > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* RTE_PMD_MLX5_TRACE_H_ */ > diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map > index e05e1aa8c5..d0ec8571e6 100644 > --- a/drivers/common/mlx5/version.map > +++ 
b/drivers/common/mlx5/version.map > @@ -158,5 +158,13 @@ INTERNAL { > > mlx5_os_interrupt_handler_create; # WINDOWS_NO_EXPORT > mlx5_os_interrupt_handler_destroy; # WINDOWS_NO_EXPORT > + > + __rte_pmd_mlx5_trace_tx_entry; > + __rte_pmd_mlx5_trace_tx_exit; > + __rte_pmd_mlx5_trace_tx_wqe; > + __rte_pmd_mlx5_trace_tx_wait; > + __rte_pmd_mlx5_trace_tx_push; > + __rte_pmd_mlx5_trace_tx_complete; No need to expose these symbols; exporting them is being removed from the rest of DPDK. An application can call rte_trace_point_lookup() to get these addresses. > + > local: *; > }; > -- > 2.18.1 >
Re: [RFC 1/5] app/testpmd: add trace dump command
On Thu, Apr 20, 2023 at 3:39 PM Viacheslav Ovsiienko wrote: > > The "dump_trace" CLI command is added to trigger > saving the trace dumps to the trace directory. > > Signed-off-by: Viacheslav Ovsiienko > --- > app/test-pmd/cmdline.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c > index 7b20bef4e9..be9e3a9ed6 100644 > --- a/app/test-pmd/cmdline.c > +++ b/app/test-pmd/cmdline.c > @@ -39,6 +39,7 @@ > #include > #endif > #include > +#include > > #include > #include > @@ -8367,6 +8368,8 @@ static void cmd_dump_parsed(void *parsed_result, > rte_lcore_dump(stdout); > else if (!strcmp(res->dump, "dump_log_types")) > rte_log_dump(stdout); > + else if (!strcmp(res->dump, "dump_trace")) > + rte_trace_save(); Isn't it saving the trace? If so, change the command to save_trace or similar. > } > > static cmdline_parse_token_string_t cmd_dump_dump = > @@ -8379,7 +8382,8 @@ static cmdline_parse_token_string_t cmd_dump_dump = > "dump_mempool#" > "dump_devargs#" > "dump_lcores#" > - "dump_log_types"); > + "dump_log_types#" > + "dump_trace"); > > static cmdline_parse_inst_t cmd_dump = { > .f = cmd_dump_parsed, /* function to call */ > -- > 2.18.1 >
Re: [dpdk-web] [RFC PATCH] process: new library approval in principle
On Wed, Apr 19, 2023 at 9:10 PM Kevin Traynor wrote: > > On 13/02/2023 09:26, jer...@marvell.com wrote: > > From: Jerin Jacob > > > > Based on TB meeting[1] action item, defining > > the process for new library approval in principle. > > > > [1] > > https://mails.dpdk.org/archives/dev/2023-January/260035.html > > > > Signed-off-by: Jerin Jacob > > --- > > content/process/_index.md | 33 + > > 1 file changed, 33 insertions(+) > > create mode 100644 content/process/_index.md > > > > diff --git a/content/process/_index.md b/content/process/_index.md > > new file mode 100644 > > index 000..21c2642 > > --- /dev/null > > +++ b/content/process/_index.md > > @@ -0,0 +1,33 @@ > > > > +title = "Process" > > +weight = "9" > > > > + > > +## Process for new library approval in principle > > + > > +### Rationale > > + > > +Adding a new library to the DPDK codebase with a proper RFC and then full > > patch-sets is > > +significant work. Getting an early approval-in-principle for a library helps DPDK contributors > > +avoid wasted effort if it is not suitable for various reasons. > > + > > +### Process > > + > > +1. When a contributor would like to add a new library to DPDK code base, > > the contributor must send > > +the following items to DPDK mailing list for TB approval-in-principle. > > + > > + - Purpose of the library. > > + - Scope of the library. > > + - Any licensing constraints. > > + - Justification for adding to DPDK. > > + - Any other implementations of the same functionality in other > > libs/products and how this version differs. > > - Dependencies > > (Need to know if it's introducing new dependencies to the project) Ack. I will add in next version. > > > + - Public API specification header file as RFC > > + - Optional and good to have. > > + - TB may additionally request this collateral if needed to get more > > clarity on scope and purpose. > > + > > +2. TB to schedule discussion on this in upcoming TB meeting along with > > author.
Based on the TB > > +schedule and/or author availability, TB may need maximum three TB meeting > > slots. > > + > > +3. Based on mailing list and TB meeting discussions, TB to vote for > > approval-in-principle and share > > +the decision in the mailing list. > > + > > How about having three outcomes: > - Approval in principal > - Not approved > - Further information needed Ack. I will add in next version. >
Re: 21.11.4 patches review and test
On 20/04/2023 03:40, Xu, HailinX wrote: -Original Message- From: Xu, HailinX Sent: Thursday, April 13, 2023 2:13 PM To: Kevin Traynor ; sta...@dpdk.org Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan Subject: RE: 21.11.4 patches review and test -Original Message- From: Kevin Traynor Sent: Thursday, April 6, 2023 7:38 PM To: sta...@dpdk.org Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Kevin Traynor ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan Subject: 21.11.4 patches review and test Hi all, Here is a list of patches targeted for stable release 21.11.4. The planned date for the final release is 25th April. Please help with testing and validation of your use cases and report any issues/results with reply-all to this mail. For the final release the fixes and reported validations will be added to the release notes. A release candidate tarball can be found at: https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1 These patches are located at branch 21.11 of dpdk-stable repo: https://dpdk.org/browse/dpdk-stable/ Thanks. Kevin HI All, Update the test status for Intel part. Till now dpdk21.11.4-rc1 validation test rate is 85%. No critical issue is found. 2 new bugs are found, 1 new issue is under confirming by Intel Dev. New bugs: --20.11.8-rc1 also has these two issues 1. pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drop about 23.5% when send small packets https://bugs.dpdk.org/show_bug.cgi?id=1212-- no fix yet 2. 
some of the virtio tests are failing: -- Intel dev is investigating # Basic Intel(R) NIC testing * Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, etc. - All test done. No new dpdk issue is found. * PF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * VF(i40e, ixgbe): test scenarios including VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * PF/VF(ice): test scenarios including Switch features/Package Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc. - All test done. No new dpdk issue is found. * Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test, etc. - All test done. No new dpdk issue is found. * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - QAT&SW/FIB library, etc. - Ongoing. # Basic cryptodev and virtio testing * Virtio: both function and performance test are covered. Such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing/VMware ESXi 8.0, etc. - All test done. Found bug 1. * Cryptodev: * Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. - Execution rate is 90%. Found bug 2. * Performance test: test scenarios including Throughput Performance/Cryptodev Latency, etc. - All test done. No new dpdk issue is found. Regards, Xu, Hailin Update the test status for Intel part. Completed all dpdk21.11.4-rc1 validation. No critical issue is found. Hi. Thanks for testing. 2 new bugs are found, 1 new issue is being confirmed by Intel dev. New bugs: -- 20.11.8-rc1 also has these two issues 1.
pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drops about 23.5% when sending small packets https://bugs.dpdk.org/show_bug.cgi?id=1212 -- not fixed yet; it only occurs on the specified platform Do you know which patch caused the regression? I'm not fully clear from the Bz for 20.11. The backported patch ID'd as root cause [0] in 20.11 is in the previous releases of 20.11 (and 21.11). Trying to understand because then it would have shown in testing for previous releases. Or is this a new test introduced for the latest LTS releases? And if so, what is the baseline performance based on? [0] commit 1c9a7fba5c90e0422b517404499ed106f647bcff Author: Mattias Rönnblom Date: Mon Jul 11 14:11:32 2022 +0200 net: accept unaligned data in checksum routines 2. some of the virtio tests are failing: -- Intel dev is investigating ok, thank you. Kevin. # Basic Intel(R) NIC testing * Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such a
[PATCH] app/crypto-perf: check crypto result
Check crypto result in latency tests. Checking result won't affect the test results as latency is calculated using timestamps which are done before enqueue and after dequeue. Ignoring result means the data can be false positive. Signed-off-by: Anoob Joseph --- app/test-crypto-perf/cperf_test_latency.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c index 406e082e4e..64bef2cc0e 100644 --- a/app/test-crypto-perf/cperf_test_latency.c +++ b/app/test-crypto-perf/cperf_test_latency.c @@ -134,6 +134,7 @@ cperf_latency_test_runner(void *arg) uint16_t test_burst_size; uint8_t burst_size_idx = 0; uint32_t imix_idx = 0; + int ret = 0; static uint16_t display_once; @@ -258,10 +259,16 @@ cperf_latency_test_runner(void *arg) } if (likely(ops_deqd)) { - /* Free crypto ops so they can be reused. */ - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status == RTE_CRYPTO_OP_STATUS_ERROR) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } + /* Free crypto ops so they can be reused. */ rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -289,8 +296,14 @@ cperf_latency_test_runner(void *arg) tsc_end = rte_rdtsc_precise(); if (ops_deqd != 0) { - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status == RTE_CRYPTO_OP_STATUS_ERROR) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -301,6 +314,10 @@ cperf_latency_test_runner(void *arg) } } + /* If there was any failure in crypto op, exit */ + if (ret) + return ret; + for (i = 0; i < tsc_idx; i++) { tsc_val = ctx->res[i].tsc_end - ctx->res[i].tsc_start; tsc_max = RTE_MAX(tsc_val, tsc_max); -- 2.25.1
[PATCH v2] crypto/ipsec_mb: enqueue counter fix
This patch removes enqueue op counter update from the process_op_bit function where the process is now done in dequeue stage. The original stats increment was incorrect as they shouldn't have been updated at all in this function. Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") Cc: piotrx.bronow...@intel.com Cc: sta...@dpdk.org Signed-off-by: Saoirse O'Donovan --- v2: Added cc stable for 21.11 and 22.11 backport. A similar fix has been sent to 20.11 LTS stable, in the interest of time. In that fix, the enqueued stat is still in use, therefore only the fix to the count increment was necessary. Here is the mail archive link: https://mails.dpdk.org/archives/stable/2023-April/043550.html --- drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/ipsec_mb/pmd_snow3g.c b/drivers/crypto/ipsec_mb/pmd_snow3g.c index 8ed069f428..e64df1a462 100644 --- a/drivers/crypto/ipsec_mb/pmd_snow3g.c +++ b/drivers/crypto/ipsec_mb/pmd_snow3g.c @@ -372,9 +372,10 @@ process_ops(struct rte_crypto_op **ops, struct snow3g_session *session, /** Process a crypto op with length/offset in bits. 
*/ static int process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, - struct ipsec_mb_qp *qp, uint16_t *accumulated_enqueued_ops) + struct ipsec_mb_qp *qp) { - uint32_t enqueued_op, processed_op; + unsigned int processed_op; + int ret; switch (session->op) { case IPSEC_MB_OP_ENCRYPT_ONLY: @@ -421,9 +422,10 @@ process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, if (unlikely(processed_op != 1)) return 0; - enqueued_op = rte_ring_enqueue(qp->ingress_queue, op); - qp->stats.enqueued_count += enqueued_op; - *accumulated_enqueued_ops += enqueued_op; + + ret = rte_ring_enqueue(qp->ingress_queue, op); + if (ret != 0) + return ret; return 1; } @@ -439,7 +441,6 @@ snow3g_pmd_dequeue_burst(void *queue_pair, struct snow3g_session *prev_sess = NULL, *curr_sess = NULL; uint32_t i; uint8_t burst_size = 0; - uint16_t enqueued_ops = 0; uint8_t processed_ops; uint32_t nb_dequeued; @@ -479,8 +480,7 @@ snow3g_pmd_dequeue_burst(void *queue_pair, prev_sess = NULL; } - processed_ops = process_op_bit(curr_c_op, curr_sess, - qp, &enqueued_ops); + processed_ops = process_op_bit(curr_c_op, curr_sess, qp); if (processed_ops != 1) break; -- 2.25.1
Re: [PATCH v2] crypto/ipsec_mb: enqueue counter fix
On 20/04/2023 11:31, Saoirse O'Donovan wrote: This patch removes enqueue op counter update from the process_op_bit function where the process is now done in dequeue stage. The original stats increment was incorrect as they shouldn't have been updated at all in this function. Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") Cc: piotrx.bronow...@intel.com Cc: sta...@dpdk.org Signed-off-by: Saoirse O'Donovan --- v2: Added cc stable for 21.11 and 22.11 backport. A similar fix has been sent to 20.11 LTS stable, in the interest of time. In that fix, the enqueued stat is still in use, therefore only the fix to the count increment was necessary. Thanks for the explanation. As it has the correct tags, we will pick this up for 21.11 and 22.11 LTS releases in the normal workflow, which is after it has been released as part of a DPDK main branch release. thanks, Kevin. Here is the mail archive link: https://mails.dpdk.org/archives/stable/2023-April/043550.html --- drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/ipsec_mb/pmd_snow3g.c b/drivers/crypto/ipsec_mb/pmd_snow3g.c index 8ed069f428..e64df1a462 100644 --- a/drivers/crypto/ipsec_mb/pmd_snow3g.c +++ b/drivers/crypto/ipsec_mb/pmd_snow3g.c @@ -372,9 +372,10 @@ process_ops(struct rte_crypto_op **ops, struct snow3g_session *session, /** Process a crypto op with length/offset in bits. 
*/ static int process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, - struct ipsec_mb_qp *qp, uint16_t *accumulated_enqueued_ops) + struct ipsec_mb_qp *qp) { - uint32_t enqueued_op, processed_op; + unsigned int processed_op; + int ret; switch (session->op) { case IPSEC_MB_OP_ENCRYPT_ONLY: @@ -421,9 +422,10 @@ process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, if (unlikely(processed_op != 1)) return 0; - enqueued_op = rte_ring_enqueue(qp->ingress_queue, op); - qp->stats.enqueued_count += enqueued_op; - *accumulated_enqueued_ops += enqueued_op; + + ret = rte_ring_enqueue(qp->ingress_queue, op); + if (ret != 0) + return ret; return 1; } @@ -439,7 +441,6 @@ snow3g_pmd_dequeue_burst(void *queue_pair, struct snow3g_session *prev_sess = NULL, *curr_sess = NULL; uint32_t i; uint8_t burst_size = 0; - uint16_t enqueued_ops = 0; uint8_t processed_ops; uint32_t nb_dequeued; @@ -479,8 +480,7 @@ snow3g_pmd_dequeue_burst(void *queue_pair, prev_sess = NULL; } - processed_ops = process_op_bit(curr_c_op, curr_sess, - qp, &enqueued_ops); + processed_ops = process_op_bit(curr_c_op, curr_sess, qp); if (processed_ops != 1) break;
RE: [PATCH v2] crypto/ipsec_mb: enqueue counter fix
> -Original Message- > From: Saoirse O'Donovan > Sent: Thursday 20 April 2023 11:32 > To: Ji, Kai ; De Lara Guarch, Pablo > > Cc: dev@dpdk.org; O'Donovan, Saoirse ; > Bronowski, PiotrX ; sta...@dpdk.org > Subject: [PATCH v2] crypto/ipsec_mb: enqueue counter fix > > This patch removes enqueue op counter update from the process_op_bit > function where the process is now done in dequeue stage. The original stats > increment was incorrect as they shouldn't have been updated at all in this > function. > > Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") > Cc: piotrx.bronow...@intel.com > Cc: sta...@dpdk.org > > Signed-off-by: Saoirse O'Donovan > > --- > v2: Added cc stable for 21.11 and 22.11 backport. > > A similar fix has been sent to 20.11 LTS stable, in the interest of time. In > that > fix, the enqueued stat is still in use, therefore only the fix to the count > increment was necessary. > > Here is the mail archive link: > https://mails.dpdk.org/archives/stable/2023-April/043550.html > --- > drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) Acked-by: Ciara Power
[PATCH] devtools: allow patch to multiple groups for the same driver
The PMD source code resides in the ./drivers folder of the DPDK project and is split into several groups depending on the PMD class (common, net, regex, etc.). For some vendors, drivers of different classes operate over the same hardware; for example, Nvidia PMDs operate over the ConnectX NIC series. It often happens that the same minor fix should be applied to multiple drivers of the same vendor across different classes. The check-git-log.sh script checks the consistency between the files a patch touches and its commit message headline, and prevents updating multiple drivers in a single commit. This patch relaxes that strict check and allows updating multiple drivers in different classes for a single vendor. Signed-off-by: Viacheslav Ovsiienko --- devtools/check-git-log.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/devtools/check-git-log.sh b/devtools/check-git-log.sh index af751e49ab..b66e8fe553 100755 --- a/devtools/check-git-log.sh +++ b/devtools/check-git-log.sh @@ -80,7 +80,9 @@ bad=$(for commit in $commits ; do continue drv=$(echo "$files" | grep '^drivers/' | cut -d "/" -f 2,3 | sort -u) drvgrp=$(echo "$drv" | cut -d "/" -f 1 | uniq) - if [ $(echo "$drvgrp" | wc -l) -gt 1 ] ; then + drvpmd=$(echo "$drv" | cut -d "/" -f 2 | uniq) + if [ $(echo "$drvgrp" | wc -l) -gt 1 ] && \ + [ $(echo "$drvpmd" | wc -l) -gt 1 ] ; then echo "$headline" | grep -v '^drivers:' elif [ $(echo "$drv" | wc -l) -gt 1 ] ; then echo "$headline" | grep -v "^drivers/$drvgrp" -- 2.18.1
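Under the relaxed rule above, a commit touching e.g. drivers/common/mlx5 and drivers/net/mlx5 no longer needs the generic "drivers:" headline prefix; that prefix is only demanded when both the class and the driver name differ across the touched files. A Python sketch of the new condition (the helper name is ours, mirroring the cut -d/ -f 2,3 | sort -u pipeline and the two uniq'd columns):

```python
def needs_generic_drivers_headline(files):
    """Return True when the commit must use a generic 'drivers:' headline,
    i.e. the touched drivers span multiple classes AND multiple driver
    names (the new, relaxed rule)."""
    drv = sorted({"/".join(f.split("/")[1:3])
                  for f in files if f.startswith("drivers/")})
    classes = {d.split("/")[0] for d in drv}   # common, net, regex, ...
    names = {d.split("/")[1] for d in drv}     # mlx5, ice, ...
    return len(classes) > 1 and len(names) > 1
```

So a cross-class fix confined to one vendor driver name passes with a driver-specific headline, while a commit spanning unrelated drivers still requires the generic prefix.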
[PATCH v2] app/crypto-perf: check crypto result
Check crypto result in latency tests. Checking result won't affect the test results as latency is calculated using timestamps which are done before enqueue and after dequeue. Ignoring result means the data can be false positive. Signed-off-by: Anoob Joseph --- v2: - Improved result check (treat all non success as errors) app/test-crypto-perf/cperf_test_latency.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c index 406e082e4e..f1676a9aa9 100644 --- a/app/test-crypto-perf/cperf_test_latency.c +++ b/app/test-crypto-perf/cperf_test_latency.c @@ -134,6 +134,7 @@ cperf_latency_test_runner(void *arg) uint16_t test_burst_size; uint8_t burst_size_idx = 0; uint32_t imix_idx = 0; + int ret = 0; static uint16_t display_once; @@ -258,10 +259,16 @@ cperf_latency_test_runner(void *arg) } if (likely(ops_deqd)) { - /* Free crypto ops so they can be reused. */ - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } + /* Free crypto ops so they can be reused. */ rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -289,8 +296,14 @@ cperf_latency_test_runner(void *arg) tsc_end = rte_rdtsc_precise(); if (ops_deqd != 0) { - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -301,6 +314,10 @@ cperf_latency_test_runner(void *arg) } } + /* If there was any failure in crypto op, exit */ + if (ret) + return ret; + for (i = 0; i < tsc_idx; i++) { tsc_val = ctx->res[i].tsc_end - ctx->res[i].tsc_start; tsc_max = RTE_MAX(tsc_val, tsc_max); -- 2.25.1
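The v2 change above records a dequeue timestamp for every op but flags the whole run if any op's status is not success (v1 only caught RTE_CRYPTO_OP_STATUS_ERROR; v2 treats anything non-success as a failure). A sketch of that pattern in Python, with dict-based stand-ins for rte_crypto_op; the names are ours:

```python
SUCCESS = "SUCCESS"  # stand-in for RTE_CRYPTO_OP_STATUS_SUCCESS

def drain_and_check(ops_deqd, timestamps, ts_now):
    """Record a timestamp for every dequeued op, but remember whether any
    op finished with a non-success status. Timestamping still happens for
    all ops so the latency numbers are unaffected; only the return value
    reports the failure."""
    ret = 0
    for op in ops_deqd:
        if op["status"] != SUCCESS:
            ret = -1                       # any non-success fails the run
        timestamps.append((op["id"], ts_now))
    return ret
```

As in the patch, the failure is only acted on after the burst loop ends, so a bad op does not distort the measured latencies of the others.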
[PATCH v1] dts: create tarball from git ref
Add additional convenience options for specifying what DPDK version to test. Signed-off-by: Juraj Linkeš --- dts/framework/config/__init__.py | 11 +-- dts/framework/settings.py| 20 ++--- dts/framework/utils.py | 140 +++ 3 files changed, 152 insertions(+), 19 deletions(-) diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index ebb0823ff5..a4b73483e6 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -11,21 +11,14 @@ import os.path import pathlib from dataclasses import dataclass -from enum import Enum, auto, unique +from enum import auto, unique from typing import Any, TypedDict import warlock # type: ignore import yaml from framework.settings import SETTINGS - - -class StrEnum(Enum): -@staticmethod -def _generate_next_value_( -name: str, start: int, count: int, last_values: object -) -> str: -return name +from framework.utils import StrEnum @unique diff --git a/dts/framework/settings.py b/dts/framework/settings.py index 71955f4581..cfa39d011b 100644 --- a/dts/framework/settings.py +++ b/dts/framework/settings.py @@ -10,7 +10,7 @@ from pathlib import Path from typing import Any, TypeVar -from .exception import ConfigurationError +from .utils import DPDKGitTarball _T = TypeVar("_T") @@ -124,11 +124,13 @@ def _get_parser() -> argparse.ArgumentParser: parser.add_argument( "--tarball", "--snapshot", +"--git-ref", action=_env_arg("DTS_DPDK_TARBALL"), default="dpdk.tar.xz", type=Path, -help="[DTS_DPDK_TARBALL] Path to DPDK source code tarball " -"which will be used in testing.", +help="[DTS_DPDK_TARBALL] Path to DPDK source code tarball or a git commit ID, " +"tag ID or tree ID to test. 
To test local changes, first commit them, " +"then use the commit ID with this option.", ) parser.add_argument( @@ -160,21 +162,19 @@ def _get_parser() -> argparse.ArgumentParser: return parser -def _check_tarball_path(parsed_args: argparse.Namespace) -> None: -if not os.path.exists(parsed_args.tarball): -raise ConfigurationError(f"DPDK tarball '{parsed_args.tarball}' doesn't exist.") - - def _get_settings() -> _Settings: parsed_args = _get_parser().parse_args() -_check_tarball_path(parsed_args) return _Settings( config_file_path=parsed_args.config_file, output_dir=parsed_args.output_dir, timeout=parsed_args.timeout, verbose=(parsed_args.verbose == "Y"), skip_setup=(parsed_args.skip_setup == "Y"), -dpdk_tarball_path=parsed_args.tarball, +dpdk_tarball_path=Path( +DPDKGitTarball(parsed_args.tarball, parsed_args.output_dir) +) +if not os.path.exists(parsed_args.tarball) +else Path(parsed_args.tarball), compile_timeout=parsed_args.compile_timeout, test_cases=parsed_args.test_cases.split(",") if parsed_args.test_cases else [], re_run=parsed_args.re_run, diff --git a/dts/framework/utils.py b/dts/framework/utils.py index 55e0b0ef0e..0623106b78 100644 --- a/dts/framework/utils.py +++ b/dts/framework/utils.py @@ -3,7 +3,26 @@ # Copyright(c) 2022-2023 PANTHEON.tech s.r.o. 
# Copyright(c) 2022-2023 University of New Hampshire +import atexit +import os +import subprocess import sys +from enum import Enum +from pathlib import Path +from subprocess import SubprocessError + +from .exception import ConfigurationError + + +class StrEnum(Enum): +@staticmethod +def _generate_next_value_( +name: str, start: int, count: int, last_values: object +) -> str: +return name + +def __str__(self) -> str: +return self.name def check_dts_python_version() -> None: @@ -80,3 +99,124 @@ def __init__(self, default_library: str | None = None, **dpdk_args: str | bool): def __str__(self) -> str: return " ".join(f"{self._default_library} {self._dpdk_args}".split()) + + +class _TarCompressionFormat(StrEnum): +"""Compression formats that tar can use. + +Enum names are the shell compression commands +and Enum values are the associated file extensions. +""" + +gzip = "gz" +compress = "Z" +bzip2 = "bz2" +lzip = "lz" +lzma = "lzma" +lzop = "lzo" +xz = "xz" +zstd = "zst" + + +class DPDKGitTarball(object): +"""Create a compressed tarball of DPDK from the repository. + +The DPDK version is specified with git object git_ref. +The tarball will be compressed with _TarCompressionFormat, +which must be supported by the DTS execution environment. +The resulting tarball will be put into output_dir. + +The class supports the os.PathLike protocol, +which is used to get the Path of the tarball:: + +from pathlib import Path +tarball = DPDKGitTarball("HEAD", "output") +tarball_path = Pa
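The patch moves StrEnum into framework/utils.py and adds __str__, so members print as their names while values stay free for other data (e.g. the file extensions in _TarCompressionFormat, whose names are the shell compression commands). A minimal self-contained sketch of that pattern, with illustrative member sets:

```python
from enum import Enum, auto

class StrEnum(Enum):
    """Sketch of the helper moved to framework/utils.py: auto() values
    become the member name, and str() yields the name."""
    @staticmethod
    def _generate_next_value_(name, start, count, last_values):
        return name

    def __str__(self):
        return self.name

class TarCompressionFormat(StrEnum):
    # Names are the shell compression commands, values the file
    # extensions, as in the patch's _TarCompressionFormat.
    gzip = "gz"
    xz = "xz"

class NodeType(StrEnum):
    # With auto(), value == name thanks to _generate_next_value_.
    physical = auto()
    virtual = auto()
```

This lets code build filenames like f"dpdk.tar.{fmt.value}" while logging the human-readable command name via str(fmt).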
Re: [PATCH] doc: add PMD known issue
On Thu, 20 Apr 2023 06:14:29 + Mingjin Ye wrote: > Add a known issue: ASLR feature causes core dump. > > Signed-off-by: Mingjin Ye > --- Please provide a backtrace. This should be fixable. Fixing a bug is always better than documenting it.
Re: [PATCH] doc: add PMD known issue
On Thu, Apr 20, 2023 at 06:14:29AM +, Mingjin Ye wrote: > Add a known issue: ASLR feature causes core dump. > > Signed-off-by: Mingjin Ye > --- > doc/guides/nics/ixgbe.rst | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst > index b1d77ab7ab..c346e377e2 100644 > --- a/doc/guides/nics/ixgbe.rst > +++ b/doc/guides/nics/ixgbe.rst > @@ -461,3 +461,18 @@ show bypass config > Show the bypass configuration for a bypass enabled NIC using the lowest port > on the NIC:: > > testpmd> show bypass config (port_id) > + > +ASLR feature causes core dump > +~ > + > +Core dump may occur when we start secondary processes on the vf port. > +Mainstream Linux distributions have the ASLR feature enabled by default, > +and the text segment of the process's memory space is randomized. > +The secondary process calls the function address shared by the primary > +process, resulting in a core dump. > + > + .. Note:: > + > + Support for ASLR features varies by distribution. Redhat and > + Centos series distributions work fine. Ubuntu distributions > + will core dump, other Linux distributions are unknown. > -- I disagree with this description of the bug. ASLR is not the problem; instead, the driver is just not multi-process aware and uses the same pointers in both primary and secondary processes. You will hit this issue even without ASLR if the primary and secondary processes use different static binaries. Therefore, IMHO, the title should be that the VF driver is not multi-process safe, rather than pinning the blame on ASLR. /Bruce
[Bug 1217] RTE flow: Port state changing to error when RTE flow is enabled/disabled on Intel X722
https://bugs.dpdk.org/show_bug.cgi?id=1217 Bug ID: 1217 Summary: RTE flow: Port state changing to error when RTE flow is enabled/disabled on Intel X722 Product: DPDK Version: 22.03 Hardware: x86 OS: Linux Status: UNCONFIRMED Severity: major Priority: Normal Component: other Assignee: dev@dpdk.org Reporter: ltham...@usc.edu Target Milestone: --- Created attachment 249 --> https://bugs.dpdk.org/attachment.cgi?id=249&action=edit rte_flow_port_state_error_on_X722 logs When RTE flows are enabled and disabled couple of times on Intel X722 interface, the port state turns from Up/down to error state with error logs "Interface TenGigabitEthernetb5/0/0 error -95: Unknown error -95", "i40e_dev_sync_phy_type(): Failed to sync phy type: status=-7". Please check the attachment for logs. -- You are receiving this mail because: You are the assignee for the bug.
RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release
> -Original Message- > From: Yunjian Wang > Sent: Thursday, April 20, 2023 1:44 AM > To: dev@dpdk.org > Cc: Honnappa Nagarahalli ; > konstantin.v.anan...@yandex.ru; luyi...@huawei.com; Yunjian Wang > ; sta...@dpdk.org > Subject: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release > > After the memzone is freed, it is not removed from the 'rte_ring_tailq'. > If rte_ring_lookup is called at this time, it will cause a use-after-free > problem. > This change prevents that from happening. > > Fixes: 4e32101f9b01 ("ring: support freeing") > Cc: sta...@dpdk.org > > Suggested-by: Honnappa Nagarahalli This is incorrect, this is not a suggestion from me. Please remove this. > Signed-off-by: Yunjian Wang Other than the above, the patch looks fine. Reviewed-by: Honnappa Nagarahalli > --- > v2: update code suggested by Honnappa Nagarahalli > --- > lib/ring/rte_ring.c | 8 +++- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index > 8ed455043d..2755323b8a 100644 > --- a/lib/ring/rte_ring.c > +++ b/lib/ring/rte_ring.c > @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r) > return; > } > > - if (rte_memzone_free(r->memzone) != 0) { > - RTE_LOG(ERR, RING, "Cannot free memory\n"); > - return; > - } > - > ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list); > rte_mcfg_tailq_write_lock(); > > @@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r) > > TAILQ_REMOVE(ring_list, te, next); > > + if (rte_memzone_free(r->memzone) != 0) > + RTE_LOG(ERR, RING, "Cannot free memory\n"); > + > rte_mcfg_tailq_write_unlock(); > > rte_free(te); > -- > 2.33.0
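[Editorial sketch] The ordering that the v2 patch establishes — unlink the entry from the shared tailq inside the critical section, and only then free the backing memory — can be illustrated with a hypothetical mini-registry. All names below are invented for illustration; this is not DPDK code, and the real implementation uses `rte_mcfg_tailq_write_lock()` around the same sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A named entry in a shared registry; `mem` stands in for the
 * ring's memzone. */
struct entry {
    char name[32];
    void *mem;
    struct entry *next;
};

static struct entry *registry; /* protected by a lock in real code */

static void registry_add(struct entry *e)
{
    e->next = registry;
    registry = e;
}

/* Stand-in for rte_ring_lookup(): scans the shared list by name. */
static struct entry *registry_lookup(const char *name)
{
    for (struct entry *e = registry; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* The fix's ordering: 1) unlink from the shared list, 2) only then
 * free the backing memory -- both inside the same critical section,
 * so a concurrent lookup can never return freed memory. */
static void registry_free(struct entry *e)
{
    /* lock_registry(); -- elided */
    struct entry **pp = &registry;
    while (*pp != NULL && *pp != e)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = e->next;   /* step 1: no lookup can find it anymore */
    free(e->mem);        /* step 2: now it is safe to free */
    /* unlock_registry(); -- elided */
    free(e);
}
```

The bug the patch fixes is the reverse order: freeing the memzone first leaves a window where a lookup still finds the entry but its memory is gone.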
Re: [RFC 06/27] vhost: don't dump unneeded pages with IOTLB
On Fri, Mar 31, 2023 at 11:43 AM Maxime Coquelin wrote: > > On IOTLB entry removal, previous fixes took care of not > marking pages shared with other IOTLB entries as DONTDUMP. > > However, if an IOTLB entry is spanned on multiple pages, > the other pages were kept as DODUMP while they might not > have been shared with other entries, increasing needlessly > the coredump size. > > This patch addresses this issue by excluding only the > shared pages from madvise's DONTDUMP. > > Fixes: dea092d0addb ("vhost: fix madvise arguments alignment") > Cc: sta...@dpdk.org > > Signed-off-by: Maxime Coquelin Looks good to me. Acked-by: Mike Pattrick
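[Editorial sketch] The page-exclusion arithmetic the patch describes — dump-exclude an entry's pages except the boundary pages still shared with neighboring IOTLB entries — can be sketched as a standalone helper. The names are hypothetical; the real code operates on IOTLB entries and calls `madvise()` with `MADV_DONTDUMP` on the resulting range.

```c
#include <assert.h>
#include <stdint.h>

#define PG 4096u /* assumed page size for the sketch */

/* For a removed entry [start, start+len), compute the page-aligned
 * sub-range that may safely be marked DONTDUMP. Boundary pages that
 * are still referenced by the previous/next entries are kept as
 * DODUMP; only the remaining pages are excluded from the coredump.
 * Returns the length to mark (0 means nothing), start in *out_start. */
static uint64_t dontdump_range(uint64_t start, uint64_t len,
                               int first_page_shared, int last_page_shared,
                               uint64_t *out_start)
{
    uint64_t lo = start & ~(uint64_t)(PG - 1);                   /* align down */
    uint64_t hi = (start + len + PG - 1) & ~(uint64_t)(PG - 1);  /* align up */

    if (first_page_shared)
        lo += PG; /* first page still used by the previous entry */
    if (last_page_shared)
        hi -= PG; /* last page still used by the next entry */

    if (hi <= lo) {
        *out_start = 0;
        return 0;
    }
    *out_start = lo;
    return hi - lo;
}
```

This captures the point of the fix: before it, only the shared boundary pages were handled, and interior pages of multi-page entries stayed DODUMP, bloating the coredump.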
Re: [RFC] lib: set/get max memzone segments
On Thu, Apr 20, 2023 at 09:43:28AM +0200, Thomas Monjalon wrote: > 19/04/2023 16:51, Tyler Retzlaff: > > On Wed, Apr 19, 2023 at 11:36:34AM +0300, Ophir Munk wrote: > > > In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard > > > coded as 2560. For applications requiring different values of this > > > parameter – it is more convenient to set the max value via an rte API - > > > rather than changing the dpdk source code per application. In many > > > organizations, the possibility to compile a private DPDK library for a > > > particular application does not exist at all. With this option there is > > > no need to recompile DPDK and it allows using an in-box packaged DPDK. > > > An example usage for updating the RTE_MAX_MEMZONE would be of an > > > application that uses the DPDK mempool library which is based on DPDK > > > memzone library. The application may need to create a number of > > > steering tables, each of which will require its own mempool allocation. > > > This commit is not about how to optimize the application usage of > > > mempool nor about how to improve the mempool implementation based on > > > memzone. It is about how to make the max memzone definition - run-time > > > customized. > > > This commit adds an API which must be called before rte_eal_init(): > > > rte_memzone_max_set(int max). If not called, the default memzone > > > (RTE_MAX_MEMZONE) is used. There is also an API to query the effective > > > max memzone: rte_memzone_max_get(). > > > > > > Signed-off-by: Ophir Munk > > > --- > > > > the use case of each application may want a different non-hard coded > > value makes sense. > > > > it's less clear to me that requiring it be called before eal init makes > > sense over just providing it as configuration to eal init so that it is > > composed. > > Why do you think it would be better as EAL init option? > From an API perspective, I think it is simpler to call a dedicated function. 
> And I don't think a user wants to deal with it when starting the application. because a dedicated function that can be called detached from the eal state enables an opportunity for accidental and confusing use outside the correct context. i know the above prescribes not to do this, but now you can call set after eal init, and we protect against calling it after init by failing. what do we do sensibly with the failure? > > > can you elaborate further on why you need get if you have a one-shot > > set? why would the application not know the value if you can only ever > > call it once before init? > > The "get" function is used in this patch by test and qede driver. > The application could use it as well, especially to query the default value. this seems incoherent to me: why does the application not know if it has called set or not? if it called set it knows what the value is; if it didn't call set it knows what the default is. anyway, the use case is valid and i would like to see the ability to change it dynamically. i'd prefer not to see an api like this be introduced as prescribed, but that's for you folks to decide. anyway, i own a lot of apis that operate just like the proposed one and they're a great source of support overhead. i prefer not to rely on documenting a contract when i can enforce the contract and the implicit state machine mechanically with the api instead. fwiw a nicer pattern for this kind of framework-influencing config might look something like this: struct eal_config config; eal_config_init(&config); // defaults are set, entire state made valid eal_config_set_max_memzone(&config, 1024); // default is overridden rte_eal_init(&config); ty
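[Editorial sketch] Tyler's snippet above, expanded into a compilable form. Every name here is hypothetical (DPDK's real `rte_eal_init()` takes `argc`/`argv`, not a config struct); the point is the pattern: init fills a fully valid default state, the application overrides selected fields, and init consumes the struct, so there is no window in which a "set" call arrives too late and must fail.

```c
#include <assert.h>

/* Hypothetical config struct for the framework-influencing-config
 * pattern discussed in the thread. */
struct eal_config {
    unsigned int max_memzones;
};

/* Mirrors the hard-coded RTE_MAX_MEMZONE default mentioned above. */
#define EAL_DEFAULT_MAX_MEMZONES 2560u

static void eal_config_init(struct eal_config *cfg)
{
    cfg->max_memzones = EAL_DEFAULT_MAX_MEMZONES; /* whole state valid */
}

static void eal_config_set_max_memzone(struct eal_config *cfg,
                                       unsigned int max)
{
    cfg->max_memzones = max;                      /* default overridden */
}

/* Stand-in for an init that sizes its tables from the config; there is
 * no separate set/get API whose call ordering must be documented. */
static int fake_eal_init(const struct eal_config *cfg)
{
    return cfg->max_memzones > 0 ? 0 : -1;
}
```

Because the struct is the only channel, a "get" becomes unnecessary: the application always knows what it put (or left) in the config.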
Re: [PATCH 1/1] net/ixgbe: add a proper memory barrier for LoongArch
On Fri, Apr 7, 2023 at 4:50 PM, Min Zhou wrote: Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor, which has 64 cores and 4 NUMA nodes. The reason is that the read ordering of the status and the rest of the descriptor fields in this function may not be correct on the LoongArch processor. We should add rte_rmb() to ensure the read ordering is correct. We also did the same thing in the ixgbe_recv_pkts() function. Signed-off-by: Min Zhou --- drivers/net/ixgbe/ixgbe_rxtx.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..16391a42f9 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,9 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; +#if defined(RTE_ARCH_LOONGARCH) + rte_rmb(); +#endif rxd = *rxdp; /* @@ -2122,6 +2125,9 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; +#if defined(RTE_ARCH_LOONGARCH) + rte_rmb(); +#endif rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " Kindly ping. Any comments or suggestions will be appreciated. Min
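[Editorial sketch] The ordering requirement behind this patch can be illustrated with a minimal single-threaded C11 sketch. The descriptor layout is hypothetical; DPDK's `rte_rmb()` between the DD-bit check and the descriptor copy plays the role that the acquire load plays here. On strongly ordered x86 a plain load happens to suffice, which is why the bug only surfaced on a weakly ordered CPU like LoongArch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical RX descriptor: the device writes `payload` first and
 * sets the DD bit in `status` last to publish the descriptor. */
struct desc {
    _Atomic uint32_t status;  /* DD bit written last by the device */
    uint32_t payload;         /* rest of the descriptor */
};

#define DD_BIT 0x1u

/* The consumer must observe the DD bit *before* reading the rest of
 * the descriptor. The acquire load forbids the CPU from hoisting the
 * payload read above the status check (what rte_rmb() enforces). */
static int read_desc(struct desc *d, uint32_t *out)
{
    if (!(atomic_load_explicit(&d->status, memory_order_acquire) & DD_BIT))
        return 0;             /* descriptor not ready yet */
    *out = d->payload;        /* ordered after the DD check */
    return 1;
}
```

Without the barrier, a weakly ordered core may read a stale `payload` speculatively before the DD bit becomes visible, which is consistent with the segfault reported on the 3C5000.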
[PATCH 0/2] vhost: add port mirroring function in the vhost lib
Similar to the port mirroring function on a switch or router, this patch set implements such a function in the Vhost lib. When data is sent to a front-end, it will also send the data to its mirror front-end. When data is received from a front-end, it will also send the data to its mirror front-end. Cheng Jiang (2): vhost: add ingress API for port mirroring datapath vhost: add egress API for port mirroring datapath lib/vhost/rte_vhost_async.h | 17 + lib/vhost/version.map |3 + lib/vhost/virtio_net.c | 1266 +++ 3 files changed, 1286 insertions(+) -- 2.35.1
[PATCH 1/2] vhost: add ingress API for port mirroring datapath
Similar to the port mirroring function on the switch or router, this patch also implements an ingress function on the Vhost lib. When data is sent to a front-end, it will also send the data to its mirror front-end. Signed-off-by: Cheng Jiang Signed-off-by: Yuan Wang Signed-off-by: Wenwu Ma --- lib/vhost/rte_vhost_async.h | 6 + lib/vhost/version.map | 1 + lib/vhost/virtio_net.c | 688 3 files changed, 695 insertions(+) diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h index 8f190dd44b..30aaf66b60 100644 --- a/lib/vhost/rte_vhost_async.h +++ b/lib/vhost/rte_vhost_async.h @@ -286,6 +286,12 @@ __rte_experimental int rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id); +__rte_experimental +uint16_t rte_vhost_async_try_egress_burst(int vid, uint16_t queue_id, + int mr_vid, uint16_t mr_queue_id, + struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, + int *nr_inflight, int16_t dma_id, uint16_t vchan_id); + #ifdef __cplusplus } #endif diff --git a/lib/vhost/version.map b/lib/vhost/version.map index d322a4a888..95f75a6928 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -98,6 +98,7 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + rte_vhost_async_try_egress_burst; }; INTERNAL { diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index be28ea5151..c7e99d403e 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -4262,3 +4262,691 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, return count; } + + +static __rte_always_inline uint16_t +async_poll_egress_completed_split(struct virtio_net *dev, struct vhost_virtqueue *vq, + struct virtio_net *mr_dev, struct vhost_virtqueue *mr_vq, + struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, + uint16_t vchan_id, bool legacy_ol_flags) +{ + uint16_t nr_cpl_pkts = 0; + uint16_t start_idx, from, i; + struct async_inflight_info *pkts_info = vq->async->pkts_info; + uint16_t n_descs 
= 0; + + vhost_async_dma_check_completed(dev, dma_id, vchan_id, VHOST_DMA_MAX_COPY_COMPLETE); + + start_idx = async_get_first_inflight_pkt_idx(vq); + + from = start_idx; + while (vq->async->pkts_cmpl_flag[from] && count--) { + vq->async->pkts_cmpl_flag[from] = false; + from = (from + 1) % vq->size; + nr_cpl_pkts++; + } + + if (nr_cpl_pkts == 0) + return 0; + + for (i = 0; i < nr_cpl_pkts; i++) { + from = (start_idx + i) % vq->size; + n_descs += pkts_info[from].descs; + pkts[i] = pkts_info[from].mbuf; + + if (virtio_net_with_host_offload(dev)) + vhost_dequeue_offload(dev, &pkts_info[from].nethdr, pkts[i], + legacy_ol_flags); + } + + /* write back completed descs to used ring and update used idx */ + write_back_completed_descs_split(vq, nr_cpl_pkts); + __atomic_add_fetch(&vq->used->idx, nr_cpl_pkts, __ATOMIC_RELEASE); + vhost_vring_call_split(dev, vq); + + write_back_completed_descs_split(mr_vq, n_descs); + __atomic_add_fetch(&mr_vq->used->idx, n_descs, __ATOMIC_RELEASE); + vhost_vring_call_split(mr_dev, mr_vq); + + vq->async->pkts_inflight_n -= nr_cpl_pkts; + + return nr_cpl_pkts; +} + +static __rte_always_inline int +egress_async_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t buf_iova, + struct virtio_net *mr_dev, uint64_t mr_buf_iova, + struct rte_mbuf *m, uint32_t mbuf_offset, uint32_t cpy_len) +{ + struct vhost_async *async = vq->async; + uint64_t mapped_len, mr_mapped_len; + uint32_t buf_offset = 0; + void *src, *dst, *mr_dst; + void *host_iova, *mr_host_iova; + + while (cpy_len) { + host_iova = (void *)(uintptr_t)gpa_to_first_hpa(dev, + buf_iova + buf_offset, cpy_len, &mapped_len); + if (unlikely(!host_iova)) { + VHOST_LOG_DATA(dev->ifname, ERR, "%s: failed to get host iova.\n", __func__); + return -1; + } + + mr_host_iova = (void *)(uintptr_t)gpa_to_first_hpa(mr_dev, + mr_buf_iova + buf_offset, cpy_len, &mr_mapped_len); + if (unlikely(!mr_host_iova)) { + VHOST_LOG_DATA(mr_dev->ifname, ERR, "%s: failed to get mirror hpa.\n", __func__); + 
return -1; + } + + if (unlikely(mr_mapped_len != mapped_len)) { + VHOST_LOG_DATA(dev->ifname, ERR, "original mapped len is not equal to mirror le
[PATCH 2/2] vhost: add egress API for port mirroring datapath
This patch implements egress function on the Vhost lib. When packets are received from a front-end, it will also send the packets to its mirror front-end. Signed-off-by: Cheng Jiang Signed-off-by: Yuan Wang Signed-off-by: Wenwu Ma --- lib/vhost/rte_vhost_async.h | 11 + lib/vhost/version.map | 2 + lib/vhost/virtio_net.c | 682 +--- 3 files changed, 643 insertions(+), 52 deletions(-) diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h index 30aaf66b60..4df473f1ec 100644 --- a/lib/vhost/rte_vhost_async.h +++ b/lib/vhost/rte_vhost_async.h @@ -286,6 +286,17 @@ __rte_experimental int rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id); +__rte_experimental +uint16_t rte_vhost_submit_ingress_mirroring_burst(int vid, uint16_t queue_id, + int mirror_vid, uint16_t mirror_queue_id, + struct rte_mbuf **pkts, uint16_t count, + int16_t dma_id, uint16_t vchan_id); + +__rte_experimental +uint16_t rte_vhost_poll_ingress_completed(int vid, uint16_t queue_id, int mr_vid, + uint16_t mr_queue_id, struct rte_mbuf **pkts, uint16_t count, + int16_t dma_id, uint16_t vchan_id); + __rte_experimental uint16_t rte_vhost_async_try_egress_burst(int vid, uint16_t queue_id, int mr_vid, uint16_t mr_queue_id, diff --git a/lib/vhost/version.map b/lib/vhost/version.map index 95f75a6928..347ea6ac9c 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -98,6 +98,8 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + rte_vhost_submit_ingress_mirroring_burst; + rte_vhost_poll_ingress_completed; rte_vhost_async_try_egress_burst; }; diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index c7e99d403e..f4c96c3216 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -4263,6 +4263,634 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, return count; } +static __rte_always_inline int +async_mirror_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t buf_iova, + struct 
virtio_net *mr_dev, uint64_t mr_buf_iova, + struct rte_mbuf *m, uint32_t mbuf_offset, uint32_t cpy_len, bool is_ingress) +{ + struct vhost_async *async = vq->async; + uint64_t mapped_len, mr_mapped_len; + uint32_t buf_offset = 0; + void *src, *dst, *mr_dst; + void *host_iova, *mr_host_iova; + + while (cpy_len) { + host_iova = (void *)(uintptr_t)gpa_to_first_hpa(dev, + buf_iova + buf_offset, cpy_len, &mapped_len); + if (unlikely(!host_iova)) { + VHOST_LOG_DATA(dev->ifname, ERR, "%s: failed to get host iova.\n", __func__); + return -1; + } + mr_host_iova = (void *)(uintptr_t)gpa_to_first_hpa(mr_dev, + mr_buf_iova + buf_offset, cpy_len, &mr_mapped_len); + if (unlikely(!mr_host_iova)) { + VHOST_LOG_DATA(mr_dev->ifname, ERR, "%s: failed to get mirror host iova.\n", __func__); + return -1; + } + + if (unlikely(mr_mapped_len != mapped_len)) { + VHOST_LOG_DATA(dev->ifname, ERR, "original mapped len is not equal to mirror len\n"); + return -1; + } + + if (is_ingress) { + src = (void *)(uintptr_t)rte_pktmbuf_iova_offset(m, mbuf_offset); + dst = host_iova; + mr_dst = mr_host_iova; + } else { + src = host_iova; + dst = (void *)(uintptr_t)rte_pktmbuf_iova_offset(m, mbuf_offset); + mr_dst = mr_host_iova; + } + + if (unlikely(async_iter_add_iovec(dev, async, src, dst, (size_t)mapped_len))) + return -1; + if (unlikely(async_iter_add_iovec(mr_dev, async, src, mr_dst, (size_t)mapped_len))) + return -1; + + cpy_len -= (uint32_t)mapped_len; + mbuf_offset += (uint32_t)mapped_len; + buf_offset += (uint32_t)mapped_len; + } + + return 0; +} + +static __rte_always_inline uint16_t +vhost_poll_ingress_completed(struct virtio_net *dev, struct vhost_virtqueue *vq, + struct virtio_net *mr_dev, struct vhost_virtqueue *mr_vq, + struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id) +{ + struct vhost_async *async = vq->async; + struct vhost_async *mr_async = mr_vq->async; + struct async_inflight_info *pkts_info = async->pkts_info; + + uint16_t nr_cpl_pkts = 0, n_descs = 0; + 
uint16_t mr_n_descs = 0; + uint16_
RE: [PATCH] usertools: enhance CPU layout
Hi Stephen, > -Original Message- > From: Stephen Hemminger > Sent: Wednesday, April 19, 2023 12:47 AM > To: Lu, Wenzhuo > Cc: dev@dpdk.org > Subject: Re: [PATCH] usertools: enhance CPU layout > > On Tue, 18 Apr 2023 13:25:41 +0800 > Wenzhuo Lu wrote: > > > The cores in a single CPU may not all be the same. > > The user tool is updated to show the > > difference between the cores. > > > > This patch adds the below information: > > 1, Group the cores based on the die. > > 2, A core is either a performance core or an > >efficiency core. > >A performance core is shown as 'Core-P'. > >An efficiency core is shown as 'Core-E'. > > 3, All the E-cores which share the same L2-cache > >are grouped into one module. > > > > The known limitation: > > 1, Telling whether a core is a P-core or an E-core is based on whether > >this core shares L2 cache with others. > > > > Signed-off-by: Wenzhuo Lu > > Side note: > This tool only exists because of the lack of a simple tool at the time. > Looking around, I found that there is a tool 'lstopo' under the hwloc package > that gives output in many formats including graphical and seems to do a better > job than the DPDK python script. > > Not sure how much farther DPDK should go in this area? > Really should be a distro tool. Many thanks for your review and comments. I have to say I'm a green hand in this field; I just imitated the existing code to write mine. So, I'm still trying to understand and handle the comments :) It's better to understand more about our opinion of this script before sending a v2 patch. I've used 'lstopo'. It's a great tool. In my opinion, considering there are Linux tools to show all kinds of information, the reason that DPDK has its own tool is to summarize and emphasize the information that is important to DPDK. Here it's that some cores are more powerful than others. When the users use a testpmd-like APP, they can choose the appropriate cores after DPDK reminds them about the difference between cores. Adding Thomas for more suggestions. Thanks.
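[Editorial sketch] The shared-L2 heuristic listed as a known limitation above can be expressed as a pure function. This is a hypothetical illustration in C (the actual tool is a Python script); the per-core L2 cache id would come from something like `/sys/devices/system/cpu/cpu*/cache/index2/id` on Linux.

```c
#include <assert.h>

enum core_kind { CORE_P, CORE_E };

/* Heuristic from the patch: a core that shares its L2 cache with at
 * least one other core is classified as an E-core (E-cores sit in
 * modules sharing one L2); a core with a private L2 is a P-core.
 * `l2_id[i]` is the L2 cache id of core i. */
static enum core_kind classify_core(const int *l2_id, int ncores, int core)
{
    for (int i = 0; i < ncores; i++)
        if (i != core && l2_id[i] == l2_id[core])
            return CORE_E;   /* shares L2 -> part of an E-core module */
    return CORE_P;           /* private L2 -> P-core */
}
```

The limitation the commit message admits is visible here: any other reason for a shared L2 (e.g. SMT siblings on some parts) would also be classified as E-cores by this rule.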
RE: [PATCH v5] enhance NUMA affinity heuristic
> -Original Message- > From: Thomas Monjalon > Sent: April 19, 2023 20:17 > To: You, KaisenX > Cc: dev@dpdk.org; Zhou, YidingX ; > david.march...@redhat.com; Matz, Olivier ; > ferruh.yi...@amd.com; zhou...@loongson.cn; sta...@dpdk.org; > Richardson, Bruce ; jer...@marvell.com; > Burakov, Anatoly > Subject: Re: [PATCH v5] enhance NUMA affinity heuristic > > 13/04/2023 02:56, You, KaisenX: > > From: You, KaisenX > > > From: Thomas Monjalon > > > > > > > > I'm not comfortable with this patch. > > > > > > > > First, there is no comment in the code which helps to understand the logic. > > > > Second, I'm afraid changing the value of the per-core variable > > > > _socket_id may have an impact on some applications. > > > > > > Hi Thomas, I'm sorry to bother you again, but we can't think of a > > better solution for now; would you please give me some suggestions, and > then I will modify it accordingly. > > You need to better explain the logic, > both in the commit message and in code comments. > When that is done, it will be easier to have a discussion with other > maintainers and community experts. > Thank you > Thank you for your reply, I'll explain my patch in more detail next. When a DPDK application is started on only one NUMA node, memory is allocated for only one socket. When interrupt threads use memory, memory may not be found on the socket where the interrupt thread is currently located, and memory has to be reallocated from hugepages; this operation can lead to performance degradation. So my modification is in the function malloc_get_numa_socket, to make sure that the first socket with memory can be returned. If you can accept my explanation and modification, I will send the V6 version to improve the commit message and code comments. > > > Thank you for your reply. > > > First, about comments, I can submit a new patch to add comments to > > > help understand. 
> > > Second, if you do not change the value of the per-core variable_ > > > socket_ id, /lib/eal/common/malloc_heap.c > > > malloc_get_numa_socket(void) > > > { > > > const struct internal_config *conf = > > > eal_get_internal_configuration(); > > > unsigned int socket_id = rte_socket_id(); // The return value of > > > "rte_socket_id()" is 1 > > > unsigned int idx; > > > > > > if (socket_id != (unsigned int)SOCKET_ID_ANY) > > > return socket_id;//so return here > > > > > > This will cause return here, This function returns the socket_id of > > > unallocated memory. > > > > > > If you have a better solution, I can modify it. > > >
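[Editorial sketch] The fallback heuristic described in this thread — prefer the current core's socket, but fall back to the first socket that actually has memory — can be sketched as a pure function. All names are hypothetical simplifications; this is not the actual `malloc_get_numa_socket()` implementation.

```c
#include <assert.h>

#define SOCKET_ID_ANY (-1)

/* Simplified version of the proposed heuristic: return the caller's
 * socket when it has hugepage memory; otherwise return the first
 * socket with memory, avoiding a fresh (slow) hugepage allocation on
 * a memory-less socket. */
static int pick_numa_socket(int current_socket,
                            const unsigned long *mem_per_socket,
                            int nsockets)
{
    if (current_socket != SOCKET_ID_ANY &&
        current_socket < nsockets &&
        mem_per_socket[current_socket] > 0)
        return current_socket;       /* fast path: local memory exists */

    for (int s = 0; s < nsockets; s++)
        if (mem_per_socket[s] > 0)
            return s;                /* first socket with memory */

    return SOCKET_ID_ANY;            /* no memory anywhere */
}
```

This is the scenario from the thread: an application started on one NUMA node allocates memory for only one socket, so an interrupt thread running on the other socket should be steered to the socket that has memory rather than triggering a new hugepage allocation.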
Re: [PATCH v2 06/10] net/octeon_ep: fix DMA incompletion
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch fixes the DMA incompletion 1) Please remove "This patch" in every commit description in this series, as it is quite implicit. 2) Please add a Fixes: tag. 3) Explain what the problem was and how it is being fixed.
Re: [PATCH v2 05/10] net/octeon_ep: support ISM
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch adds ISM specific functionality. See following commit as reference, and update new acronyms like ISM and others at devtools/words-case.txt commit 33c942d19260817502b49403f0baaab6113774b2 Author: Ashwin Sekhar T K Date: Fri Sep 17 16:28:39 2021 +0530 devtools: add Marvell acronyms for commit checks Update word list with Marvell specific acronyms. CPT -> Cryptographic Accelerator Unit CQ -> Completion Queue LBK -> Loopback Interface Unit LMT -> Large Atomic Store Unit MCAM -> Match Content Addressable Memory NIX -> Network Interface Controller Unit NPA -> Network Pool Allocator NPC -> Network Parser and CAM Unit ROC -> Rest Of Chip RQ -> Receive Queue RVU -> Resource Virtualization Unit SQ -> Send Queue SSO -> Schedule Synchronize Order Unit TIM -> Timer Unit Suggested-by: Ferruh Yigit Signed-off-by: Ashwin Sekhar T K Reviewed-by: Jerin Jacob > > Signed-off-by: Sathesh Edara > --- > drivers/net/octeon_ep/cnxk_ep_vf.c| 35 +++-- > drivers/net/octeon_ep/cnxk_ep_vf.h| 12 ++ > drivers/net/octeon_ep/otx2_ep_vf.c| 45 ++--- > drivers/net/octeon_ep/otx2_ep_vf.h| 14 +++ > drivers/net/octeon_ep/otx_ep_common.h | 16 > drivers/net/octeon_ep/otx_ep_ethdev.c | 36 + > drivers/net/octeon_ep/otx_ep_rxtx.c | 56 +-- > 7 files changed, 194 insertions(+), 20 deletions(-) > > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.c > b/drivers/net/octeon_ep/cnxk_ep_vf.c > index 1a92887109..a437ae68cb 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.c > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.c > @@ -2,11 +2,12 @@ > * Copyright(C) 2022 Marvell. 
> */ > > +#include > #include > > #include > #include > - > +#include > #include "cnxk_ep_vf.h" > > static void > @@ -85,6 +86,7 @@ cnxk_ep_vf_setup_iq_regs(struct otx_ep_device *otx_ep, > uint32_t iq_no) > struct otx_ep_instr_queue *iq = otx_ep->instr_queue[iq_no]; > int loop = OTX_EP_BUSY_LOOP_COUNT; > volatile uint64_t reg_val = 0ull; > + uint64_t ism_addr; > > reg_val = oct_ep_read64(otx_ep->hw_addr + > CNXK_EP_R_IN_CONTROL(iq_no)); > > @@ -132,6 +134,19 @@ cnxk_ep_vf_setup_iq_regs(struct otx_ep_device *otx_ep, > uint32_t iq_no) > */ > oct_ep_write64(OTX_EP_CLEAR_SDP_IN_INT_LVLS, >otx_ep->hw_addr + CNXK_EP_R_IN_INT_LEVELS(iq_no)); > + /* Set up IQ ISM registers and structures */ > + ism_addr = (otx_ep->ism_buffer_mz->iova | CNXK_EP_ISM_EN > + | CNXK_EP_ISM_MSIX_DIS) > + + CNXK_EP_IQ_ISM_OFFSET(iq_no); > + rte_write64(ism_addr, (uint8_t *)otx_ep->hw_addr + > + CNXK_EP_R_IN_CNTS_ISM(iq_no)); > + iq->inst_cnt_ism = > + (uint32_t *)((uint8_t *)otx_ep->ism_buffer_mz->addr > ++ CNXK_EP_IQ_ISM_OFFSET(iq_no)); > + otx_ep_err("SDP_R[%d] INST Q ISM virt: %p, dma: 0x%" PRIX64, iq_no, > + (void *)iq->inst_cnt_ism, ism_addr); > + *iq->inst_cnt_ism = 0; > + iq->inst_cnt_ism_prev = 0; > return 0; > } > > @@ -142,6 +157,7 @@ cnxk_ep_vf_setup_oq_regs(struct otx_ep_device *otx_ep, > uint32_t oq_no) > uint64_t oq_ctl = 0ull; > int loop = OTX_EP_BUSY_LOOP_COUNT; > struct otx_ep_droq *droq = otx_ep->droq[oq_no]; > + uint64_t ism_addr; > > /* Wait on IDLE to set to 1, supposed to configure BADDR > * as long as IDLE is 0 > @@ -201,9 +217,22 @@ cnxk_ep_vf_setup_oq_regs(struct otx_ep_device *otx_ep, > uint32_t oq_no) > rte_write32((uint32_t)reg_val, droq->pkts_sent_reg); > > otx_ep_dbg("SDP_R[%d]_sent: %x", oq_no, > rte_read32(droq->pkts_sent_reg)); > - loop = OTX_EP_BUSY_LOOP_COUNT; > + /* Set up ISM registers and structures */ > + ism_addr = (otx_ep->ism_buffer_mz->iova | CNXK_EP_ISM_EN > + | CNXK_EP_ISM_MSIX_DIS) > + + CNXK_EP_OQ_ISM_OFFSET(oq_no); > + rte_write64(ism_addr, (uint8_t 
*)otx_ep->hw_addr + > + CNXK_EP_R_OUT_CNTS_ISM(oq_no)); > + droq->pkts_sent_ism = > + (uint32_t *)((uint8_t *)otx_ep->ism_buffer_mz->addr > ++ CNXK_EP_OQ_ISM_OFFSET(oq_no)); > + otx_ep_err("SDP_R[%d] OQ ISM virt: %p dma: 0x%" PRIX64, > + oq_no, (void *)droq->pkts_sent_ism, ism_addr); > + *droq->pkts_sent_ism = 0; > + droq->pkts_sent_ism_prev = 0; > > - while (((rte_read32(droq->pkts_sent_reg)) != 0ull)) { > + loop = OTX_EP_BUSY_LOOP_COUNT; > + while (((rte_read32(droq->pkts_sent_reg)) != 0ull) && loop--) { > reg_val = rte_read32(droq->pkts_sent_reg); > rte_write32((uint32_t)reg_val, droq->pkts_sent_reg); > rte_delay_ms
Re: [PATCH v2 08/10] net/octeon_ep: support Mailbox between VF and PF
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch adds the mailbox communication between > VF and PF and supports the following mailbox > messages. > - Get and set MAC address > - Get link information > - Get stats > - Get and set link status > - Set and get MTU > - Send notification to PF > > Signed-off-by: Sathesh Edara 1) Change "Mailbox" to "mailbox" in subject line 2) Please cross check, Do you need to update new items in doc/guides/nics/features/octeon_ep.ini by adding this new features. See doc/guides/nics/features.rst for list of features. > --- > drivers/net/octeon_ep/cnxk_ep_vf.c| 1 + > drivers/net/octeon_ep/cnxk_ep_vf.h| 12 +- > drivers/net/octeon_ep/meson.build | 1 + > drivers/net/octeon_ep/otx_ep_common.h | 26 +++ > drivers/net/octeon_ep/otx_ep_ethdev.c | 143 +++- > drivers/net/octeon_ep/otx_ep_mbox.c | 309 ++ > drivers/net/octeon_ep/otx_ep_mbox.h | 163 ++ > 7 files changed, 642 insertions(+), 13 deletions(-) > create mode 100644 drivers/net/octeon_ep/otx_ep_mbox.c > create mode 100644 drivers/net/octeon_ep/otx_ep_mbox.h > > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.c > b/drivers/net/octeon_ep/cnxk_ep_vf.c > index a437ae68cb..cadb4ecbf9 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.c > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.c > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include "otx_ep_common.h" > #include "cnxk_ep_vf.h" > > static void > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.h > b/drivers/net/octeon_ep/cnxk_ep_vf.h > index 072b38ea15..86277449ea 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.h > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.h > @@ -5,7 +5,7 @@ > #define _CNXK_EP_VF_H_ > > #include > -#include "otx_ep_common.h" > + > #define CNXK_CONFIG_XPANSION_BAR 0x38 > #define CNXK_CONFIG_PCIE_CAP 0x70 > #define CNXK_CONFIG_PCIE_DEVCAP 0x74 > @@ -92,6 +92,10 @@ > #define CNXK_EP_R_OUT_BYTE_CNT_START 0x10190 > #define CNXK_EP_R_OUT_CNTS_ISM_START 0x10510 > > +#define CNXK_EP_R_MBOX_PF_VF_DATA_START0x10210 > 
+#define CNXK_EP_R_MBOX_VF_PF_DATA_START0x10230 > +#define CNXK_EP_R_MBOX_PF_VF_INT_START 0x10220 > + > #define CNXK_EP_R_OUT_CNTS(ring)\ > (CNXK_EP_R_OUT_CNTS_START + ((ring) * CNXK_EP_RING_OFFSET)) > > @@ -125,6 +129,12 @@ > #define CNXK_EP_R_OUT_CNTS_ISM(ring) \ > (CNXK_EP_R_OUT_CNTS_ISM_START + ((ring) * CNXK_EP_RING_OFFSET)) > > +#define CNXK_EP_R_MBOX_VF_PF_DATA(ring) \ > + (CNXK_EP_R_MBOX_VF_PF_DATA_START + ((ring) * CNXK_EP_RING_OFFSET)) > + > +#define CNXK_EP_R_MBOX_PF_VF_INT(ring) \ > + (CNXK_EP_R_MBOX_PF_VF_INT_START + ((ring) * CNXK_EP_RING_OFFSET)) > + > /*-- R_OUT Masks */ > #define CNXK_EP_R_OUT_INT_LEVELS_BMODE (1ULL << 63) > #define CNXK_EP_R_OUT_INT_LEVELS_TIMET (32) > diff --git a/drivers/net/octeon_ep/meson.build > b/drivers/net/octeon_ep/meson.build > index a267b60290..e698bf9792 100644 > --- a/drivers/net/octeon_ep/meson.build > +++ b/drivers/net/octeon_ep/meson.build > @@ -8,4 +8,5 @@ sources = files( > 'otx_ep_vf.c', > 'otx2_ep_vf.c', > 'cnxk_ep_vf.c', > +'otx_ep_mbox.c', > ) > diff --git a/drivers/net/octeon_ep/otx_ep_common.h > b/drivers/net/octeon_ep/otx_ep_common.h > index 3beec71968..0bf5454a39 100644 > --- a/drivers/net/octeon_ep/otx_ep_common.h > +++ b/drivers/net/octeon_ep/otx_ep_common.h > @@ -4,6 +4,7 @@ > #ifndef _OTX_EP_COMMON_H_ > #define _OTX_EP_COMMON_H_ > > +#include > > #define OTX_EP_NW_PKT_OP 0x1220 > #define OTX_EP_NW_CMD_OP 0x1221 > @@ -67,6 +68,9 @@ > #define oct_ep_read64(addr) rte_read64_relaxed((void *)(addr)) > #define oct_ep_write64(val, addr) rte_write64_relaxed((val), (void *)(addr)) > > +/* Mailbox maximum data size */ > +#define MBOX_MAX_DATA_BUF_SIZE 320 > + > /* Input Request Header format */ > union otx_ep_instr_irh { > uint64_t u64; > @@ -488,6 +492,18 @@ struct otx_ep_device { > > /* DMA buffer for SDP ISM messages */ > const struct rte_memzone *ism_buffer_mz; > + > + /* Mailbox lock */ > + rte_spinlock_t mbox_lock; > + > + /* Mailbox data */ > + uint8_t mbox_data_buf[MBOX_MAX_DATA_BUF_SIZE]; > + > + /* 
Mailbox data index */ > + int32_t mbox_data_index; > + > + /* Mailbox receive message length */ > + int32_t mbox_rcv_message_len; > }; > > int otx_ep_setup_iqs(struct otx_ep_device *otx_ep, uint32_t iq_no, > @@ -541,6 +557,16 @@ struct otx_ep_buf_free_info { > #define OTX_EP_CLEAR_SLIST_DBELL 0x > #define OTX_EP_CLEAR_SDP_OUT_PKT_CNT 0xF > > +/* Max overhead includes > + * - Ethernet hdr > + * - CRC > + * - nested VLANs > + * - octeon rx info > + */ > +#define OTX_EP_ETH_OVERHEAD \ > + (RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + \ > +(2
Re: [PATCH v2 10/10] net/octeon_ep: set secondary process dev ops
On Wed, Apr 5, 2023 at 7:57 PM Sathesh Edara wrote: > > This patch sets the dev ops and transmit/receive > callbacks for secondary process. Change the commit message to "fix ..." and add a Fixes: tag if this is just a bug fix. BTW, "Multiprocess aware" is missing in doc/guides/nics/features/octeon_ep.ini > > Signed-off-by: Sathesh Edara > --- > drivers/net/octeon_ep/otx_ep_ethdev.c | 22 +++--- > 1 file changed, 19 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c > b/drivers/net/octeon_ep/otx_ep_ethdev.c > index 885fbb475f..a9868909f8 100644 > --- a/drivers/net/octeon_ep/otx_ep_ethdev.c > +++ b/drivers/net/octeon_ep/otx_ep_ethdev.c > @@ -527,9 +527,17 @@ otx_ep_dev_stats_get(struct rte_eth_dev *eth_dev, > static int > otx_ep_dev_close(struct rte_eth_dev *eth_dev) > { > - struct otx_ep_device *otx_epvf = OTX_EP_DEV(eth_dev); > + struct otx_ep_device *otx_epvf; > uint32_t num_queues, q_no; > > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = NULL; > + eth_dev->rx_pkt_burst = NULL; > + eth_dev->tx_pkt_burst = NULL; > + return 0; > + } > + > + otx_epvf = OTX_EP_DEV(eth_dev); > otx_ep_mbox_send_dev_exit(eth_dev); > otx_epvf->fn_list.disable_io_queues(otx_epvf); > num_queues = otx_epvf->nb_rx_queues; > @@ -593,8 +601,12 @@ static const struct eth_dev_ops otx_ep_eth_dev_ops = { > static int > otx_ep_eth_dev_uninit(struct rte_eth_dev *eth_dev) > { > - if (rte_eal_process_type() != RTE_PROC_PRIMARY) > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = NULL; > + eth_dev->rx_pkt_burst = NULL; > + eth_dev->tx_pkt_burst = NULL; > return 0; > + } > > eth_dev->dev_ops = NULL; > eth_dev->rx_pkt_burst = NULL; > @@ -642,8 +654,12 @@ otx_ep_eth_dev_init(struct rte_eth_dev *eth_dev) > struct rte_ether_addr vf_mac_addr; > > /* Single process support */ > - if (rte_eal_process_type() != RTE_PROC_PRIMARY) > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = &otx_ep_eth_dev_ops; > + 
eth_dev->rx_pkt_burst = &otx_ep_recv_pkts; > + eth_dev->tx_pkt_burst = &otx2_ep_xmit_pkts; > return 0; > + } > > rte_eth_copy_pci_info(eth_dev, pdev); > otx_epvf->eth_dev = eth_dev; > -- > 2.31.1 >
Re: [PATCH v2] eventdev/timer: fix timeout event wait behavior
On Thu, Apr 13, 2023 at 1:31 AM Carrillo, Erik G wrote: > > > -Original Message- > > From: Shijith Thotton > > Sent: Tuesday, March 21, 2023 12:20 AM > > To: Carrillo, Erik G ; jer...@marvell.com > > Cc: Shijith Thotton ; dev@dpdk.org; > > pbhagavat...@marvell.com; sta...@dpdk.org > > Subject: [PATCH v2] eventdev/timer: fix timeout event wait behavior > > > > Improved the accuracy and consistency of timeout event wait behavior by > > refactoring it. Previously, the delay function used for waiting could be > > inaccurate, leading to inconsistent results. This commit updates the wait > > behavior to use a timeout-based approach, enabling the wait for the exact > > number of timer ticks before proceeding. > > > > The new function timeout_event_dequeue mimics the behavior of the > > tested systems closely. It dequeues timer expiry events until either the > > expected number of events have been dequeued or the specified time has > > elapsed. The WAIT_TICKS macro defines the waiting behavior based on the > > type of timer being used (software or hardware). > > > > Fixes: d1f3385d0076 ("test: add event timer adapter auto-test") > > > > Signed-off-by: Shijith Thotton > Thanks for the update. > > Acked-by: Erik Gabriel Carrillo Applied to dpdk-next-net-eventdev/for-main. Thanks
[PATCH 0/4] app: introduce testgraph application
This patch series introduces testgraph application that verifies graph architecture, it provides an infra to verify the graph & node libraries and scale the test coverage by adding newer configurations to exercise various graph topologies & graph-walk models required by the DPDK applications. Also this series adds two new nodes (punt_kernel & kernel_recv) to the node library. Vamsi Attunuru (4): node: add pkt punt to kernel node node: add a node to receive pkts from kernel node: remove hardcoded node next details app: add testgraph application app/meson.build |1 + app/test-graph/cmdline.c| 212 + app/test-graph/cmdline_graph.c | 297 ++ app/test-graph/cmdline_graph.h | 19 + app/test-graph/meson.build | 17 + app/test-graph/parameters.c | 157 app/test-graph/testgraph.c | 1309 +++ app/test-graph/testgraph.h | 92 ++ doc/guides/prog_guide/graph_lib.rst | 17 + doc/guides/tools/index.rst |1 + doc/guides/tools/testgraph.rst | 131 +++ lib/node/ethdev_rx.c|2 - lib/node/kernel_recv.c | 277 ++ lib/node/kernel_recv_priv.h | 74 ++ lib/node/meson.build|2 + lib/node/punt_kernel.c | 125 +++ lib/node/punt_kernel_priv.h | 36 + 17 files changed, 2767 insertions(+), 2 deletions(-) create mode 100644 app/test-graph/cmdline.c create mode 100644 app/test-graph/cmdline_graph.c create mode 100644 app/test-graph/cmdline_graph.h create mode 100644 app/test-graph/meson.build create mode 100644 app/test-graph/parameters.c create mode 100644 app/test-graph/testgraph.c create mode 100644 app/test-graph/testgraph.h create mode 100644 doc/guides/tools/testgraph.rst create mode 100644 lib/node/kernel_recv.c create mode 100644 lib/node/kernel_recv_priv.h create mode 100644 lib/node/punt_kernel.c create mode 100644 lib/node/punt_kernel_priv.h -- 2.25.1
[PATCH 1/4] node: add pkt punt to kernel node
Patch adds a node to punt the packets to kernel over a raw socket. Signed-off-by: Vamsi Attunuru --- doc/guides/prog_guide/graph_lib.rst | 10 +++ lib/node/meson.build| 1 + lib/node/punt_kernel.c | 125 lib/node/punt_kernel_priv.h | 36 4 files changed, 172 insertions(+) diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst index 1cfdc86433..b3b5b14827 100644 --- a/doc/guides/prog_guide/graph_lib.rst +++ b/doc/guides/prog_guide/graph_lib.rst @@ -392,3 +392,13 @@ null This node ignores the set of objects passed to it and reports that all are processed. + +punt_kernel +~~~ +This node punts the packets to kernel using a raw socket interface. For sending +the received packets, raw socket uses the packet's destination IP address in +sockaddr_in address structure and node uses ``sendto`` function to send data +on the raw socket. + +Aftering sending the burst of packets to kernel, this node redirects the same +objects to pkt_drop node to free up the packet buffers. diff --git a/lib/node/meson.build b/lib/node/meson.build index dbdf673c86..48c2da73f7 100644 --- a/lib/node/meson.build +++ b/lib/node/meson.build @@ -17,6 +17,7 @@ sources = files( 'null.c', 'pkt_cls.c', 'pkt_drop.c', +'punt_kernel.c', ) headers = files('rte_node_ip4_api.h', 'rte_node_eth_api.h') # Strict-aliasing rules are violated by uint8_t[] to context size casts. diff --git a/lib/node/punt_kernel.c b/lib/node/punt_kernel.c new file mode 100644 index 00..e5dd15b759 --- /dev/null +++ b/lib/node/punt_kernel.c @@ -0,0 +1,125 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "node_private.h" +#include "punt_kernel_priv.h" + +static __rte_always_inline void +punt_kernel_process_mbuf(struct rte_node *node, struct rte_mbuf **mbufs, uint16_t cnt) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + struct sockaddr_in sin = {0}; + struct rte_ipv4_hdr *ip4; + size_t len; + char *buf; + int i; + + for (i = 0; i < cnt; i++) { + ip4 = rte_pktmbuf_mtod(mbufs[i], struct rte_ipv4_hdr *); + len = rte_pktmbuf_data_len(mbufs[i]); + buf = (char *)ip4; + + sin.sin_family = AF_INET; + sin.sin_port = 0; + sin.sin_addr.s_addr = ip4->dst_addr; + + if (sendto(ctx->sock, buf, len, 0, (struct sockaddr *)&sin, sizeof(sin)) < 0) + node_err("punt_kernel", "Unable to send packets: %s\n", strerror(errno)); + } +} + +static uint16_t +punt_kernel_node_process(struct rte_graph *graph __rte_unused, struct rte_node *node, void **objs, +uint16_t nb_objs) +{ + struct rte_mbuf **pkts = (struct rte_mbuf **)objs; + uint16_t obj_left = nb_objs; + +#define PREFETCH_CNT 4 + + while (obj_left >= 12) { + /* Prefetch next-next mbufs */ + rte_prefetch0(pkts[8]); + rte_prefetch0(pkts[9]); + rte_prefetch0(pkts[10]); + rte_prefetch0(pkts[11]); + + /* Prefetch next mbuf data */ + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[4], void *, pkts[4]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[5], void *, pkts[5]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[6], void *, pkts[6]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[7], void *, pkts[7]->l2_len)); + + punt_kernel_process_mbuf(node, pkts, PREFETCH_CNT); + + obj_left -= PREFETCH_CNT; + pkts += PREFETCH_CNT; + } + + while (obj_left > 0) { + punt_kernel_process_mbuf(node, pkts, 1); + + obj_left--; + pkts++; + } + + rte_node_next_stream_move(graph, node, PUNT_KERNEL_NEXT_PKT_DROP); + + return nb_objs; +} + +static int +punt_kernel_node_init(const struct rte_graph *graph 
__rte_unused, struct rte_node *node) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + + ctx->sock = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); + if (ctx->sock < 0) + node_err("punt_kernel", "Unable to open RAW socket\n"); + + return 0; +} + +static void +punt_kernel_node_fini(const struct rte_graph *graph __rte_unused, struct rte_node *node) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + + if (ctx->sock >= 0) { + close(ctx->sock); + ctx->sock = -1; + } +} + +static struct rte_node_register punt_kernel_node_base = { + .process = punt_kernel_node_process, + .name = "punt_kernel", + + .init =
[PATCH 2/4] node: add a node to receive pkts from kernel
Patch adds a node to receive packets from kernel over a raw socket. Signed-off-by: Vamsi Attunuru --- doc/guides/prog_guide/graph_lib.rst | 7 + lib/node/kernel_recv.c | 277 lib/node/kernel_recv_priv.h | 74 lib/node/meson.build| 1 + 4 files changed, 359 insertions(+) diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst index b3b5b14827..1057f16de8 100644 --- a/doc/guides/prog_guide/graph_lib.rst +++ b/doc/guides/prog_guide/graph_lib.rst @@ -402,3 +402,10 @@ on the raw socket. Aftering sending the burst of packets to kernel, this node redirects the same objects to pkt_drop node to free up the packet buffers. + +kernel_recv +~~~ +This node receives packets from kernel over a raw socket interface. Uses ``poll`` +function to poll on the socket fd for ``POLLIN`` events to read the packets from +raw socket to stream buffer and does ``rte_node_next_stream_move()`` when there +are received packets. diff --git a/lib/node/kernel_recv.c b/lib/node/kernel_recv.c new file mode 100644 index 00..361dcc3b5f --- /dev/null +++ b/lib/node/kernel_recv.c @@ -0,0 +1,277 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ethdev_rx_priv.h" +#include "kernel_recv_priv.h" +#include "node_private.h" + +static struct kernel_recv_node_main kernel_recv_main; + +static inline struct rte_mbuf * +alloc_rx_mbuf(kernel_recv_node_ctx_t *ctx) +{ + kernel_recv_info_t *rx = ctx->recv_info; + + if (rx->idx >= rx->cnt) { + uint16_t cnt; + + rx->idx = 0; + rx->cnt = 0; + + cnt = rte_pktmbuf_alloc_bulk(ctx->pktmbuf_pool, rx->rx_bufs, KERN_RECV_CACHE_COUNT); + if (cnt <= 0) + return NULL; + + rx->cnt = cnt; + } + + return rx->rx_bufs[rx->idx++]; +} + +static inline void +mbuf_update(struct rte_mbuf **mbufs, uint16_t nb_pkts) +{ + struct rte_net_hdr_lens hdr_lens; + struct rte_mbuf *m; + int i; + + for (i = 0; i < nb_pkts; i++) { + m = mbufs[i]; + + m->packet_type = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK); + + m->ol_flags = 0; + m->tx_offload = 0; + + m->l2_len = hdr_lens.l2_len; + m->l3_len = hdr_lens.l3_len; + m->l4_len = hdr_lens.l4_len; + } +} + +static uint16_t +recv_pkt_parse(void **objs, uint16_t nb_pkts) +{ + uint16_t pkts_left = nb_pkts; + struct rte_mbuf **pkts; + int i; + + pkts = (struct rte_mbuf **)objs; + + if (pkts_left >= 4) { + for (i = 0; i < 4; i++) + rte_prefetch0(rte_pktmbuf_mtod(pkts[i], void *)); + } + + while (pkts_left >= 12) { + /* Prefetch next-next mbufs */ + rte_prefetch0(pkts[8]); + rte_prefetch0(pkts[9]); + rte_prefetch0(pkts[10]); + rte_prefetch0(pkts[11]); + + /* Prefetch next mbuf data */ + rte_prefetch0(rte_pktmbuf_mtod(pkts[4], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[5], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[6], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[7], void *)); + + /* Extract ptype of mbufs */ + mbuf_update(pkts, 4); + + pkts += 4; + pkts_left -= 4; + } + + if (pkts_left > 0) + mbuf_update(pkts, pkts_left); + + return nb_pkts; +} + +static uint16_t 
+kernel_recv_node_do(struct rte_graph *graph, struct rte_node *node, kernel_recv_node_ctx_t *ctx) +{ + kernel_recv_info_t *rx; + uint16_t next_index; + int fd; + + rx = ctx->recv_info; + next_index = rx->cls_next; + + fd = rx->sock; + if (fd > 0) { + struct rte_mbuf **mbufs; + uint16_t len = 0, count = 0; + int nb_cnt, i; + + nb_cnt = (node->size >= RTE_GRAPH_BURST_SIZE) ? RTE_GRAPH_BURST_SIZE : node->size; + + mbufs = (struct rte_mbuf **)node->objs; + for (i = 0; i < nb_cnt; i++) { + struct rte_mbuf *m = alloc_rx_mbuf(ctx); + + if (!m) + break; + + len = read(fd, rte_pktmbuf_mtod(m, char *), rte_pktmbuf_tailroom(m)); + if (len == 0 || len == 0x) { + rte_pktmbuf_free(m); + if (rx->idx <= 0) + node_dbg("kernel_recv", "rx_mbuf array is empty\n"); +
[PATCH 3/4] node: remove hardcoded node next details
For the ethdev_rx node, node_next details can be populated at node cloning time, and the same get assigned to the node context structure during node initialization. The patch removes the overriding of node_next details in node init(). Signed-off-by: Vamsi Attunuru --- lib/node/ethdev_rx.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/lib/node/ethdev_rx.c b/lib/node/ethdev_rx.c index a19237b42f..85816c489c 100644 --- a/lib/node/ethdev_rx.c +++ b/lib/node/ethdev_rx.c @@ -194,8 +194,6 @@ ethdev_rx_node_init(const struct rte_graph *graph, struct rte_node *node) RTE_VERIFY(elem != NULL); - ctx->cls_next = ETHDEV_RX_NEXT_PKT_CLS; - /* Check and setup ptype */ return ethdev_ptype_setup(ctx->port_id, ctx->queue_id); } -- 2.25.1
[PATCH 4/4] app: add testgraph application
Patch adds test-graph application to validate graph and node libraries. Signed-off-by: Vamsi Attunuru --- app/meson.build|1 + app/test-graph/cmdline.c | 212 ++ app/test-graph/cmdline_graph.c | 297 app/test-graph/cmdline_graph.h | 19 + app/test-graph/meson.build | 17 + app/test-graph/parameters.c| 157 app/test-graph/testgraph.c | 1309 app/test-graph/testgraph.h | 92 +++ doc/guides/tools/index.rst |1 + doc/guides/tools/testgraph.rst | 131 10 files changed, 2236 insertions(+) diff --git a/app/meson.build b/app/meson.build index 74d2420f67..6c7b24e604 100644 --- a/app/meson.build +++ b/app/meson.build @@ -22,6 +22,7 @@ apps = [ 'test-eventdev', 'test-fib', 'test-flow-perf', +'test-graph', 'test-gpudev', 'test-mldev', 'test-pipeline', diff --git a/app/test-graph/cmdline.c b/app/test-graph/cmdline.c new file mode 100644 index 00..a07a8a24f9 --- /dev/null +++ b/app/test-graph/cmdline.c @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. + */ + +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cmdline_graph.h" +#include "testgraph.h" + +static struct cmdline *testgraph_cl; +static cmdline_parse_ctx_t *main_ctx; + +/* *** Help command with introduction. 
*** */ +struct cmd_help_brief_result { + cmdline_fixed_string_t help; +}; + +static void +cmd_help_brief_parsed(__rte_unused void *parsed_result, struct cmdline *cl, __rte_unused void *data) +{ + cmdline_printf(cl, + "\n" + "Help is available for the following sections:\n\n" + "help control: Start and stop graph walk.\n" + "help display: Displaying port, stats and config " + "information.\n" + "help config : Configuration information.\n" + "help all: All of the above sections.\n\n"); +} + +static cmdline_parse_token_string_t cmd_help_brief_help = + TOKEN_STRING_INITIALIZER(struct cmd_help_brief_result, help, "help"); + +static cmdline_parse_inst_t cmd_help_brief = { + .f = cmd_help_brief_parsed, + .data = NULL, + .help_str = "help: Show help", + .tokens = { + (void *)&cmd_help_brief_help, + NULL, + }, +}; + +/* *** Help command with help sections. *** */ +struct cmd_help_long_result { + cmdline_fixed_string_t help; + cmdline_fixed_string_t section; +}; + +static void +cmd_help_long_parsed(void *parsed_result, struct cmdline *cl, __rte_unused void *data) +{ + int show_all = 0; + struct cmd_help_long_result *res = parsed_result; + + if (!strcmp(res->section, "all")) + show_all = 1; + + if (show_all || !strcmp(res->section, "control")) { + + cmdline_printf(cl, "\n" + "Control forwarding:\n" + "---\n\n" + + "start graph_walk\n" + " Start graph_walk on worker threads.\n\n" + + "stop graph_walk\n" + " Stop worker threads from running graph_walk.\n\n" + + "quit\n" + "Quit to prompt.\n\n"); + } + + if (show_all || !strcmp(res->section, "display")) { + + cmdline_printf(cl, + "\n" + "Display:\n" + "\n\n" + + "show node_list\n" + " Display the list of supported nodes.\n\n" + + "show graph_stats\n" + " Display the node statistics of graph cluster.\n\n"); + } + + if (show_all || !strcmp(res->section, "config")) { + cmdline_printf(cl, "\n" + "Configuration:\n" + "--\n" + "set lcore_config (port_id0,rxq0,lcore_idX),..." 
+ ".,(port_idX,rxqX,lcoreidY)\n" + " Set lcore configuration.\n\n" + + "create_graph (node0_name,node1_name,...,nodeX_name)\n" + " Create graph instances using the provided node details.\n\n" + + "destroy_graph\n" + " Destroy the graph instances.\n\n"); + } +} + +static cmdline_parse_tok
[PATCH] net/cpfl: update the doc of CPFL PMD
This patch updates cpfl.rst doc, adjusting the order of chapters referring to IDPF PMD doc. Signed-off-by: Mingxia Liu --- doc/guides/nics/cpfl.rst | 44 +--- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst index 91dbec306d..d25db088eb 100644 --- a/doc/guides/nics/cpfl.rst +++ b/doc/guides/nics/cpfl.rst @@ -20,27 +20,6 @@ Follow the DPDK :doc:`../linux_gsg/index` to setup the basic DPDK environment. To get better performance on Intel platforms, please follow the :doc:`../linux_gsg/nic_perf_intel_platform`. -Features - - -Vector PMD -~~ - -Vector path for Rx and Tx path are selected automatically. -The paths are chosen based on 2 conditions: - -- ``CPU`` - - On the x86 platform, the driver checks if the CPU supports AVX512. - If the CPU supports AVX512 and EAL argument ``--force-max-simd-bitwidth`` - is set to 512, AVX512 paths will be chosen. - -- ``Offload features`` - - The supported HW offload features are described in the document cpfl.ini, - A value "P" means the offload feature is not supported by vector path. - If any not supported features are used, cpfl vector PMD is disabled - and the scalar paths are chosen. Configuration - @@ -104,3 +83,26 @@ Driver compilation and testing -- Refer to the document :doc:`build_and_test` for details. + + +Features + + +Vector PMD +~~ + +Vector path for Rx and Tx path are selected automatically. +The paths are chosen based on 2 conditions: + +- ``CPU`` + + On the x86 platform, the driver checks if the CPU supports AVX512. + If the CPU supports AVX512 and EAL argument ``--force-max-simd-bitwidth`` + is set to 512, AVX512 paths will be chosen. + +- ``Offload features`` + + The supported HW offload features are described in the document cpfl.ini, + A value "P" means the offload feature is not supported by vector path. + If any not supported features are used, cpfl vector PMD is disabled + and the scalar paths are chosen. -- 2.34.1
NVIDIA roadmap for 23.07
Please find below the NVIDIA roadmap for the 23.07 release: A. rte_flow new APIs = [1] Update an existing rule's actions in a flow template API table. Value: The user can update an existing flow's actions in flight directly, without removing the old rule entry and then inserting a new one. The action update can have a different action list. Updating the actions of a given flow entry supports all action types, but only with the optimize-by-index matcher. ethdev: add flow rule actions update API: https://patchwork.dpdk.org/project/dpdk/patch/20230418195807.352514-1-akozy...@nvidia.com/ [2] Support Quota flow action and item Value: allows one or multiple flows to share a volume quota, in which traffic usage can be monitored by the application to ensure usage stays within a predefined limit. The Quota action limits traffic according to a pre-defined configuration. The quota action updates the ‘quota’ value and sets the packet quota state (PASS or BLOCK). The quota item matches on the flow quota state. ethdev: add quota flow action and item: https://patches.dpdk.org/project/dpdk/patch/20221221073547.988-2-getel...@nvidia.com/ [3] Add IPv6 extension push/remove. app/testpmd: add IPv6 extension push remove cli: https://patchwork.dpdk.org/project/dpdk/patch/20230417092540.2617450-3-rongw...@nvidia.com/ ethdev: add IPv6 extension push remove action: https://patchwork.dpdk.org/project/dpdk/patch/20230417022630.2377505-2-rongw...@nvidia.com/ Add new flow actions to support pushing/removing an IPv6 extension header. [4] Flow template API Geneve plus options support Value: Already supported in the non-template API; this adds support to the template API. Users need to support more than one TLV option header in their networks. The private and dedicated APIs are used to handle the parsers on CX-* and BF-*. This will not only provide compatibility but also extend the functionality compared to the non-template API.
E.g., more than one TLV option header can be supported, more fields can be modified, and the source and destination can both be option headers. This supports the standard and customized Geneve and Geneve options. ethdev: extend modify field API (for MPLS and GENEVE): https://patchwork.dpdk.org/project/dpdk/cover/20230420092145.522389-1-michae...@nvidia.com/ [5] Local / remote mirroring support in the flow template API Value: Parity of the mirroring support with the non-template API. In addition, the template API also supports multiple-port mirroring. Multiple destinations can be supported, and local and remote mirroring can both be in the same rule. This provides more diagnostic and lawful-interception abilities to cloud infrastructure applications. ethdev: add indirect list flow action: https://patches.dpdk.org/project/dpdk/patch/20230418172144.24365-1-getel...@nvidia.com/ [6] vRoCE feature: be able to monitor cloud guest RoCE (RDMA over Converged Ethernet) stats on the cloud provider side (ECN/CNP) Value: Guest RoCE traffic (UDP dport 4791) needs matching and monitoring support on the provider application side. With the new item support, RoCE traffic with specific patterns can be counted with the COUNT action, and the statistics are visible to the provider on the host or bare-metal DPU side. More actions can be supported as well, not only the counter. Users can count the number of RoCE packets. ethdev: add flow item for RoCE infiniband BT: http://patches.dpdk.org/project/dpdk/patch/20230324032615.4141031-1-dongz...@nvidia.com/ B. Net/mlx5 PMD updates = [1] Protection against non-ascending burst order. Value: With accurate scheduling, packets may be rescheduled before sending; it is the user’s responsibility to ensure the timestamps of packets to be rescheduled are in ascending order when pushing the WQE, or else the hardware would not be able to perform scheduling correctly. A software counter has been added to record application errors in such cases.
It will give more insight and help with debugging when such an error occurs. net/mlx5: introduce Tx datapath tracing: http://patches.dpdk.org/project/dpdk/cover/20230420100803.494-1-viachesl...@nvidia.com/ [2] Add a flow offload action to route packets to the kernel. Value: Parity with the support in the non-template API. It allows an application to re-route packets directly to the kernel without software involvement. net/mlx5/hws: support dest root table action: https://patches.dpdk.org/project/dpdk/patch/20230320141229.104748-1-hamd...@nvidia.com/ [3] Forward to SW packets that are too big for encap (match on size > X) Value: In some customer environments it is not possible to control the MTU size, and if packets are about to be encapsulated, their final length might exceed the MTU. This can be used to identify packets that are longer than a predefined size. Add support for IP length range matching (IPv4/IPv6) in the flow template API.