RE: [PATCH 1/2] config/arm: Do not require processor information
> -----Original Message-----
> From: Akihiko Odaki
> Sent: Thursday, April 20, 2023 9:40 AM
> To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš
> Cc: dev@dpdk.org; nd
> Subject: Re: [PATCH 1/2] config/arm: Do not require processor information
>
> On 2023/04/17 16:41, Ruifeng Wang wrote:
> >> -----Original Message-----
> >> From: Akihiko Odaki
> >> Sent: Friday, April 14, 2023 8:42 PM
> >> To: Ruifeng Wang ; Bruce Richardson
> >> Cc: dev@dpdk.org; Akihiko Odaki
> >> Subject: [PATCH 1/2] config/arm: Do not require processor information
> >>
> >> DPDK can be built even without exact processor information for x86
> >> and ppc, so allow building for Arm even if the targeted processor is
> >> unknown.
> >
> > Hi Akihiko,
> >
> > The design idea was to require an explicit generic build.
> > Default/native build doesn't fall back to generic build when SoC info
> > is not on the list, so the user has less chance to generate a
> > suboptimal binary by accident.
>
> Hi,
>
> It is true that a suboptimal binary can result, but the rationale here
> is that we tolerate that for x86 and ppc, so it should not really
> matter for Arm either. On x86 and ppc you don't need to modify
> meson.build just to run DTS on a development machine.

What modification do you need for a development machine? I suppose
"meson setup build -Dplatform=generic" will generate a binary that can
run on your development machine.

> Regards,
> Akihiko Odaki
Re: [PATCH 1/2] config/arm: Do not require processor information
On 2023/04/20 16:10, Ruifeng Wang wrote:
>> On 2023/04/17 16:41, Ruifeng Wang wrote:
>>>> DPDK can be built even without exact processor information for x86
>>>> and ppc, so allow building for Arm even if the targeted processor is
>>>> unknown.
>>>
>>> Hi Akihiko,
>>>
>>> The design idea was to require an explicit generic build.
>>> Default/native build doesn't fall back to generic build when SoC info
>>> is not on the list, so the user has less chance to generate a
>>> suboptimal binary by accident.
>>
>> Hi,
>>
>> It is true that a suboptimal binary can result, but the rationale here
>> is that we tolerate that for x86 and ppc, so it should not really
>> matter for Arm either. On x86 and ppc you don't need to modify
>> meson.build just to run DTS on a development machine.
>
> What modification do you need for a development machine? I suppose
> "meson setup build -Dplatform=generic" will generate a binary that can
> run on your development machine.

I didn't describe the situation well. I use DPDK Test Suite for
testing, and it determines what flags are passed to Meson. You need to
modify DPDK's meson.build or DTS to get it built.

Regards,
Akihiko Odaki
RE: [EXT] [PATCH] crypto/uadk: set queue pair in dev_configure
> By default, uadk only allocates two queues for each algorithm, which
> will impact performance.
> Set the queue pair number as required in dev_configure.
> The default max queue pair number is 8, which can be modified
> via the parameter max_nb_queue_pairs.

Please add documentation for the newly added devarg in uadk.rst.

> Example:
> sudo dpdk-test-crypto-perf -l 0-10 --vdev crypto_uadk,max_nb_queue_pairs=10
>      -- --devtype crypto_uadk --optype cipher-only --buffer-sz 8192
>
>  lcore id   Buf Size   Burst Size   Gbps      Cycles/Buf
>     3        8192         32        7.5226     871.19
>     7        8192         32        7.5225     871.20
>     1        8192         32        7.5225     871.20
>     4        8192         32        7.5224     871.21
>     5        8192         32        7.5224     871.21
>    10        8192         32        7.5223     871.22
>     9        8192         32        7.5223     871.23
>     2        8192         32        7.5222     871.23
>     8        8192         32        7.5222     871.23
>     6        8192         32        7.5218     871.28

No need to mention the above test result in the patch description.

> Signed-off-by: Zhangfei Gao
> ---
>  drivers/crypto/uadk/uadk_crypto_pmd.c         | 19 +++++++++++++++--
>  drivers/crypto/uadk/uadk_crypto_pmd_private.h |  1 +
>  2 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/uadk/uadk_crypto_pmd.c b/drivers/crypto/uadk/uadk_crypto_pmd.c
> index 4f729e0f07..34aae99342 100644
> --- a/drivers/crypto/uadk/uadk_crypto_pmd.c
> +++ b/drivers/crypto/uadk/uadk_crypto_pmd.c
> @@ -357,8 +357,15 @@ static const struct rte_cryptodev_capabilities uadk_crypto_v2_capabilities[] = {
>  /* Configure device */
>  static int
>  uadk_crypto_pmd_config(struct rte_cryptodev *dev __rte_unused,
> -		       struct rte_cryptodev_config *config __rte_unused)
> +		       struct rte_cryptodev_config *config)
>  {
> +	char env[128];
> +
> +	/* set queue pairs num via env */
> +	sprintf(env, "sync:%d@0", config->nb_queue_pairs);
> +	setenv("WD_CIPHER_CTX_NUM", env, 1);
> +	setenv("WD_DIGEST_CTX_NUM", env, 1);
> +

Who is the intended user of this environment variable?
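As a reply to the question above, here is a standalone sketch of what the hunk does: build the context-number string "sync:&lt;n&gt;@0" and export it through the environment for the UADK library to consume when it creates the queues. It uses a bounded snprintf() rather than the patch's sprintf(); the helper name is illustrative and not part of the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build the context-number string and publish it in the environment.
 * snprintf() bounds the write into the fixed-size buffer. */
static int
set_ctx_num_env(unsigned int nb_queue_pairs)
{
	char env[128];

	if (snprintf(env, sizeof(env), "sync:%u@0", nb_queue_pairs) < 0)
		return -1;
	if (setenv("WD_CIPHER_CTX_NUM", env, 1) != 0)
		return -1;
	return setenv("WD_DIGEST_CTX_NUM", env, 1);
}
```

Calling `set_ctx_num_env(8)` leaves "sync:8@0" in both variables for the library that reads them.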
> 	return 0;
>  }
>
> @@ -434,7 +441,7 @@ uadk_crypto_pmd_info_get(struct rte_cryptodev *dev,
>  	if (dev_info != NULL) {
>  		dev_info->driver_id = dev->driver_id;
>  		dev_info->driver_name = dev->device->driver->name;
> -		dev_info->max_nb_queue_pairs = 128;
> +		dev_info->max_nb_queue_pairs = priv->max_nb_qpairs;
>  		/* No limit of number of sessions */
>  		dev_info->sym.max_nb_sessions = 0;
>  		dev_info->feature_flags = dev->feature_flags;
> @@ -1015,6 +1022,7 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  	struct uadk_crypto_priv *priv;
>  	struct rte_cryptodev *dev;
>  	struct uacce_dev *udev;
> +	const char *input_args;
>  	const char *name;
>
>  	udev = wd_get_accel_dev("cipher");
> @@ -1030,6 +1038,9 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  	if (name == NULL)
>  		return -EINVAL;
>
> +	input_args = rte_vdev_device_args(vdev);
> +	rte_cryptodev_pmd_parse_input_args(&init_params, input_args);
> +
>  	dev = rte_cryptodev_pmd_create(name, &vdev->device, &init_params);
>  	if (dev == NULL) {
>  		UADK_LOG(ERR, "driver %s: create failed", init_params.name);
> @@ -1044,6 +1055,7 @@ uadk_cryptodev_probe(struct rte_vdev_device *vdev)
>  		   RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO;
>  	priv = dev->data->dev_private;
>  	priv->version = version;
> +	priv->max_nb_qpairs = init_params.max_nb_queue_pairs;

Is the user free to give any number as max? Do you want to add a check
here? You should also document the allowed max and min values.
>
>  	rte_cryptodev_pmd_probing_finish(dev);
>
> @@ -1078,4 +1090,7 @@ static struct cryptodev_driver uadk_crypto_drv;
>  RTE_PMD_REGISTER_VDEV(UADK_CRYPTO_DRIVER_NAME, uadk_crypto_pmd);
>  RTE_PMD_REGISTER_CRYPTO_DRIVER(uadk_crypto_drv, uadk_crypto_pmd.driver,
>  		uadk_cryptodev_driver_id);
> +RTE_PMD_REGISTER_PARAM_STRING(UADK_CRYPTO_DRIVER_NAME,
> +	"max_nb_queue_pairs= "
> +	"socket_id=");
>  RTE_LOG_REGISTER_DEFAULT(uadk_crypto_logtype, INFO);
> diff --git a/drivers/crypto/uadk/uadk_crypto_pmd_private.h b/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> index 9075f0f058..5a7dbff117 100644
> --- a/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> +++ b/drivers/crypto/uadk/uadk_crypto_pmd_private.h
> @@ -67,6 +67,7 @@ struct uadk_crypto_priv {
>  	bool env_cipher_init;
>  	bool env_auth_init;
>  	enum uadk_crypto_version version;
> +	unsigned int max_nb
RE: [PATCH 2/2] config/arm: Enable NUMA for generic Arm build
> -----Original Message-----
> From: Akihiko Odaki
> Sent: Friday, April 14, 2023 8:42 PM
> To: Ruifeng Wang ; Bruce Richardson
> Cc: dev@dpdk.org; Akihiko Odaki
> Subject: [PATCH 2/2] config/arm: Enable NUMA for generic Arm build
>
> We enable NUMA even if the presence of NUMA is unknown for the other
> architectures. Enable NUMA for the generic Arm build too.
>
> Signed-off-by: Akihiko Odaki
> ---
>  config/arm/meson.build | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 724c00ad7e..f8ee7cdafb 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -271,13 +271,15 @@ implementers = {
>  soc_generic = {
>      'description': 'Generic un-optimized build for armv8 aarch64 exec mode',
>      'implementer': 'generic',
> -    'part_number': 'generic'
> +    'part_number': 'generic',
> +    'numa': true

The default value of numa is true, so there is no need to add it here:

    if not soc_config.get('numa', true)
        has_libnuma = 0
    endif

>  }
>
>  soc_generic_aarch32 = {
>      'description': 'Generic un-optimized build for armv8 aarch32 exec mode',
>      'implementer': 'generic',
> -    'part_number': 'generic_aarch32'
> +    'part_number': 'generic_aarch32',
> +    'numa': true
>  }
>
>  soc_armada = {
> --
> 2.40.0
RE: [RFC] lib: set/get max memzone segments
Devendra Singh Rawat, Alok Prasad - can you please give your feedback
on the qede driver updates?

> -----Original Message-----
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally
> hard-coded as 2560. For applications requiring different values of
> this parameter, it is more convenient to set the max value via an rte
> API rather than changing the DPDK source code per application. In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all. With this option there
> is no need to recompile DPDK, and it allows using an in-box packaged
> DPDK.
>
> An example usage for updating RTE_MAX_MEMZONE would be an application
> that uses the DPDK mempool library, which is based on the DPDK memzone
> library. The application may need to create a number of steering
> tables, each of which will require its own mempool allocation.
>
> This commit is not about how to optimize the application usage of
> mempool, nor about how to improve the mempool implementation based on
> memzone. It is about how to make the max memzone definition run-time
> customized.
>
> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max). If not called, the default memzone count
> (RTE_MAX_MEMZONE) is used. There is also an API to query the effective
> max memzone count: rte_memzone_max_get().
>
> Signed-off-by: Ophir Munk
> ---
>  app/test/test_func_reentrancy.c     |  2 +-
>  app/test/test_malloc_perf.c         |  2 +-
>  app/test/test_memzone.c             |  2 +-
>  config/rte_config.h                 |  1 -
>  drivers/net/qede/base/bcm_osal.c    | 26 +-
>  drivers/net/qede/base/bcm_osal.h    |  3 +++
>  drivers/net/qede/qede_main.c        |  7 +++
>  lib/eal/common/eal_common_memzone.c | 28
[PATCH] common/idpf: remove device stop flag
Remove the device stop flag, as we already have dev->data->dev_started.
This also fixes an issue where closing a port directly, without
starting it first, caused error messages to be reported in dev_stop.

Fixes: 14aa6ed8f2ec ("net/idpf: support device start and stop")
Fixes: 1082a773a86b ("common/idpf: add vport structure")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang
---
 drivers/common/idpf/idpf_common_device.h | 2 --
 drivers/net/cpfl/cpfl_ethdev.c           | 6 +-----
 drivers/net/idpf/idpf_ethdev.c           | 6 +-----
 3 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/common/idpf/idpf_common_device.h
index c2dc2f16b9..7a54f7c937 100644
--- a/drivers/common/idpf/idpf_common_device.h
+++ b/drivers/common/idpf/idpf_common_device.h
@@ -110,8 +110,6 @@ struct idpf_vport {
 	uint16_t devarg_id;
 
-	bool stopped;
-
 	bool rx_vec_allowed;
 	bool tx_vec_allowed;
 	bool rx_use_avx512;
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c
index ede730fd50..f1d4425ce2 100644
--- a/drivers/net/cpfl/cpfl_ethdev.c
+++ b/drivers/net/cpfl/cpfl_ethdev.c
@@ -798,8 +798,6 @@ cpfl_dev_start(struct rte_eth_dev *dev)
 	if (cpfl_dev_stats_reset(dev))
 		PMD_DRV_LOG(ERR, "Failed to reset stats");
 
-	vport->stopped = 0;
-
 	return 0;
 
 err_vport:
@@ -817,7 +815,7 @@ cpfl_dev_stop(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
 
-	if (vport->stopped == 1)
+	if (dev->data->dev_started == 0)
 		return 0;
 
 	idpf_vc_vport_ena_dis(vport, false);
@@ -828,8 +826,6 @@ cpfl_dev_stop(struct rte_eth_dev *dev)
 
 	idpf_vc_vectors_dealloc(vport);
 
-	vport->stopped = 1;
-
 	return 0;
 }
diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c
index e02ec2ec5a..e01eb3a2ec 100644
--- a/drivers/net/idpf/idpf_ethdev.c
+++ b/drivers/net/idpf/idpf_ethdev.c
@@ -792,8 +792,6 @@ idpf_dev_start(struct rte_eth_dev *dev)
 	if (idpf_dev_stats_reset(dev))
 		PMD_DRV_LOG(ERR, "Failed to reset stats");
 
-	vport->stopped = 0;
-
 	return 0;
 
 err_vport:
@@ -811,7 +809,7 @@ idpf_dev_stop(struct rte_eth_dev *dev)
 {
 	struct idpf_vport *vport = dev->data->dev_private;
 
-	if (vport->stopped == 1)
+	if (dev->data->dev_started == 0)
 		return 0;
 
 	idpf_vc_vport_ena_dis(vport, false);
@@ -822,8 +820,6 @@ idpf_dev_stop(struct rte_eth_dev *dev)
 
 	idpf_vc_vectors_dealloc(vport);
 
-	vport->stopped = 1;
-
 	return 0;
 }
-- 
2.31.1
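The guard the patch switches to can be sketched standalone: rte_eth_dev_stop() in the ethdev layer clears dev_started only after the PMD stop callback returns, so inside the callback dev_started == 0 means the port was never started (the close-without-start case) and teardown can be skipped without errors. The structures below are illustrative mocks, not the real DPDK ones.

```c
#include <stdint.h>

/* Mock of the per-port data the ethdev layer maintains. */
struct mock_dev_data {
	uint8_t dev_started;
};

static int teardown_calls;

/* Mock stop callback: skip teardown when the port never started. */
static int
mock_dev_stop(struct mock_dev_data *data)
{
	if (data->dev_started == 0)
		return 0;	/* nothing was set up, nothing to undo */
	teardown_calls++;	/* disable vport, free queues and vectors */
	return 0;
}
```

With this, a private `stopped` flag that has to be kept in sync by every driver becomes unnecessary.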
Re: [RFC] lib: set/get max memzone segments
19/04/2023 16:51, Tyler Retzlaff:
> On Wed, Apr 19, 2023 at 11:36:34AM +0300, Ophir Munk wrote:
> > In current DPDK the RTE_MAX_MEMZONE definition is unconditionally
> > hard-coded as 2560. For applications requiring different values of
> > this parameter, it is more convenient to set the max value via an
> > rte API rather than changing the DPDK source code per application.
> > In many organizations, the possibility to compile a private DPDK
> > library for a particular application does not exist at all. With
> > this option there is no need to recompile DPDK, and it allows using
> > an in-box packaged DPDK.
> >
> > An example usage for updating RTE_MAX_MEMZONE would be an
> > application that uses the DPDK mempool library, which is based on
> > the DPDK memzone library. The application may need to create a
> > number of steering tables, each of which will require its own
> > mempool allocation.
> >
> > This commit is not about how to optimize the application usage of
> > mempool, nor about how to improve the mempool implementation based
> > on memzone. It is about how to make the max memzone definition
> > run-time customized.
> >
> > This commit adds an API which must be called before rte_eal_init():
> > rte_memzone_max_set(int max). If not called, the default memzone
> > count (RTE_MAX_MEMZONE) is used. There is also an API to query the
> > effective max memzone count: rte_memzone_max_get().
> >
> > Signed-off-by: Ophir Munk
> > ---
>
> the use case that each application may want a different non-hard-coded
> value makes sense.
>
> it's less clear to me that requiring it be called before eal init
> makes sense over just providing it as configuration to eal init so
> that it is composed.

Why do you think it would be better as an EAL init option?
From an API perspective, I think it is simpler to call a dedicated
function. And I don't think a user wants to deal with it when starting
the application.

> can you elaborate further on why you need get if you have a one-shot
> set? why would the application not know the value if you can only ever
> call it once before init?

The "get" function is used in this patch by the test and the qede
driver. The application could use it as well, especially to query the
default value.
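The one-shot semantics being discussed can be modelled in a few lines. This is only a mock of the proposed behaviour: the setter is rejected once EAL is initialized (modelled here by a flag), and the getter returns the effective value, which is the compiled-in default when the setter was never called. Names and signatures are illustrative, not the final upstream API.

```c
#define RTE_MAX_MEMZONE 2560	/* compiled-in default from the RFC */

static int memzone_max = RTE_MAX_MEMZONE;
static int eal_initialized;	/* stand-in for "rte_eal_init() ran" */

/* One-shot setter: only honoured before init and for sane values. */
static int
memzone_max_set(int max)
{
	if (eal_initialized || max <= 0)
		return -1;	/* too late, or invalid value */
	memzone_max = max;
	return 0;
}

/* Query the effective maximum (default or overridden). */
static int
memzone_max_get(void)
{
	return memzone_max;
}
```

A driver such as qede can size its own arrays from `memzone_max_get()` instead of the compile-time constant.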
[PATCH] app/dma-perf: introduce dma-perf application
There are many high-performance DMA devices supported in DPDK now, and
these DMA devices can also be integrated into other modules of DPDK as
accelerators, such as Vhost. Before integrating DMA into applications,
developers need to know the performance of these DMA devices in various
scenarios and the performance of CPUs in the same scenarios, such as
with different buffer lengths. Only in this way can we know the target
performance of the application accelerated by using them. This patch
introduces a high-performance testing tool, which supports comparing
the performance of CPU and DMA in different scenarios automatically
with a pre-set config file. Memory copy performance tests are supported
for now.

Signed-off-by: Cheng Jiang
Signed-off-by: Jiayu Hu
Signed-off-by: Yuan Wang
Acked-by: Morten Brørup
---
 app/meson.build               |   1 +
 app/test-dma-perf/benchmark.c | 467 ++++++++++++++++++++++++++++++++++
 app/test-dma-perf/config.ini  |  56 ++++
 app/test-dma-perf/main.c      | 445 ++++++++++++++++++++++++++++++++
 app/test-dma-perf/main.h      |  56 ++++
 app/test-dma-perf/meson.build |  17 ++
 6 files changed, 1042 insertions(+)
 create mode 100644 app/test-dma-perf/benchmark.c
 create mode 100644 app/test-dma-perf/config.ini
 create mode 100644 app/test-dma-perf/main.c
 create mode 100644 app/test-dma-perf/main.h
 create mode 100644 app/test-dma-perf/meson.build

diff --git a/app/meson.build b/app/meson.build
index e32ea4bd5c..514cb2f7b2 100644
--- a/app/meson.build
+++ b/app/meson.build
@@ -19,6 +19,7 @@ apps = [
         'test-cmdline',
         'test-compress-perf',
         'test-crypto-perf',
+        'test-dma-perf',
         'test-eventdev',
         'test-fib',
         'test-flow-perf',
diff --git a/app/test-dma-perf/benchmark.c b/app/test-dma-perf/benchmark.c
new file mode 100644
index 00..36e3413bdc
--- /dev/null
+++ b/app/test-dma-perf/benchmark.c
@@ -0,0 +1,467 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Intel Corporation
+ */
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#include "main.h"
+
+#define MAX_DMA_CPL_NB 255
+
+#define TEST_WAIT_U_SECOND 1
+
+#define CSV_LINE_DMA_FMT "Scenario %u,%u,%u,%u,%u,%u,%" PRIu64 ",%.3lf,%.3lf\n"
+#define CSV_LINE_CPU_FMT "Scenario %u,%u,NA,%u,%u,%u,%" PRIu64 ",%.3lf,%.3lf\n"
+
+struct worker_info {
+	bool ready_flag;
+	bool start_flag;
+	bool stop_flag;
+	uint32_t total_cpl;
+	uint32_t test_cpl;
+};
+
+struct lcore_params {
+	uint8_t scenario_id;
+	unsigned int lcore_id;
+	uint16_t worker_id;
+	uint16_t dev_id;
+	uint32_t nr_buf;
+	uint16_t kick_batch;
+	uint32_t buf_size;
+	uint16_t test_secs;
+	struct rte_mbuf **srcs;
+	struct rte_mbuf **dsts;
+	struct worker_info worker_info;
+};
+
+static struct rte_mempool *src_pool;
+static struct rte_mempool *dst_pool;
+
+static volatile struct lcore_params *worker_params[MAX_WORKER_NB];
+
+uint16_t dmadev_ids[MAX_WORKER_NB];
+uint32_t nb_dmadevs;
+
+#define PRINT_ERR(...) print_err(__func__, __LINE__, __VA_ARGS__)
+
+static inline int
+__rte_format_printf(3, 4)
+print_err(const char *func, int lineno, const char *format, ...)
+{
+	va_list ap;
+	int ret;
+
+	ret = fprintf(stderr, "In %s:%d - ", func, lineno);
+	va_start(ap, format);
+	ret += vfprintf(stderr, format, ap);
+	va_end(ap);
+
+	return ret;
+}
+
+static inline void
+calc_result(uint32_t buf_size, uint32_t nr_buf, uint16_t nb_workers, uint16_t test_secs,
+		uint32_t total_cnt, uint32_t *memory, uint32_t *ave_cycle,
+		float *bandwidth, float *mops)
+{
+	*memory = (buf_size * (nr_buf / nb_workers) * 2) / (1024 * 1024);
+	*ave_cycle = test_secs * rte_get_timer_hz() / total_cnt;
+	*bandwidth = (buf_size * 8 * (rte_get_timer_hz() / (float)*ave_cycle)) / 1000000000;
+	*mops = (float)rte_get_timer_hz() / *ave_cycle / 1000000;
+}
+
+static void
+output_result(uint8_t scenario_id, uint32_t lcore_id, uint16_t dev_id, uint64_t ave_cycle,
+		uint32_t buf_size, uint32_t nr_buf, uint32_t memory,
+		float bandwidth, float mops, bool is_dma)
+{
+	if (is_dma)
+		printf("lcore %u, DMA %u:\n", lcore_id, dev_id);
+	else
+		printf("lcore %u\n", lcore_id);
+
+	printf("average cycles/op: %" PRIu64 ", buffer size: %u, nr_buf: %u, memory: %uMB, frequency: %" PRIu64 ".\n",
+			ave_cycle, buf_size, nr_buf, memory, rte_get_timer_hz());
+	printf("Average bandwidth: %.3lfGbps, MOps: %.3lf\n", bandwidth, mops);
+
+	if (is_dma)
+		snprintf(output_str[lcore_id], MAX_OUTPUT_STR_LEN, CSV_LINE_DMA_FMT,
+			scenario_id, lcore_id, dev_id, buf_size,
+			nr_buf, memory, ave_cycle, bandwidth, mops);
+
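The arithmetic in calc_result() can be checked standalone: given the TSC frequency, the test duration and the number of completed copies, derive cycles per copy and the resulting bandwidth. The 1e9 divisor for Gbps is an assumption matching the units the tool prints.

```c
#include <stdint.h>

/* Cycles consumed per completed copy over the whole test window. */
static uint64_t
cycles_per_op(uint64_t timer_hz, uint16_t test_secs, uint32_t total_cnt)
{
	return (uint64_t)test_secs * timer_hz / total_cnt;
}

/* Copy bandwidth in Gbps for a given buffer size and cycles/op. */
static double
bandwidth_gbps(uint32_t buf_size, uint64_t timer_hz, uint64_t ave_cycle)
{
	double ops_per_sec = (double)timer_hz / (double)ave_cycle;

	return (double)buf_size * 8.0 * ops_per_sec / 1e9;
}
```

For example, 1,000,000 copies of 8192 bytes in one second on a 2 GHz timer gives 2000 cycles/op and about 65.5 Gbps.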
[PATCH] common/idpf: remove unnecessary field in vport
Remove the pointer to the rte_eth_dev instance, as:

1. there is already a pointer to rte_eth_dev_data.
2. a pointer to rte_eth_dev will break multi-process usage.

Signed-off-by: Qi Zhang
---
 drivers/common/idpf/idpf_common_device.h | 1 -
 drivers/net/cpfl/cpfl_ethdev.c           | 4 ++--
 drivers/net/idpf/idpf_ethdev.c           | 4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/common/idpf/idpf_common_device.h b/drivers/common/idpf/idpf_common_device.h
index 7a54f7c937..d29bcc71ab 100644
--- a/drivers/common/idpf/idpf_common_device.h
+++ b/drivers/common/idpf/idpf_common_device.h
@@ -117,7 +117,6 @@ struct idpf_vport {
 
 	struct virtchnl2_vport_stats eth_stats_offset;
 
-	void *dev;
 	/* Event from ipf */
 	bool link_up;
 	uint32_t link_speed;
diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c
index f1d4425ce2..680c2326ec 100644
--- a/drivers/net/cpfl/cpfl_ethdev.c
+++ b/drivers/net/cpfl/cpfl_ethdev.c
@@ -1061,7 +1061,8 @@ static void
 cpfl_handle_event_msg(struct idpf_vport *vport, uint8_t *msg, uint16_t msglen)
 {
 	struct virtchnl2_event *vc_event = (struct virtchnl2_event *)msg;
-	struct rte_eth_dev *dev = (struct rte_eth_dev *)vport->dev;
+	struct rte_eth_dev_data *data = vport->dev_data;
+	struct rte_eth_dev *dev = &rte_eth_devices[data->port_id];
 
 	if (msglen < sizeof(struct virtchnl2_event)) {
 		PMD_DRV_LOG(ERR, "Error event");
@@ -1245,7 +1246,6 @@ cpfl_dev_vport_init(struct rte_eth_dev *dev, void *init_params)
 	vport->adapter = &adapter->base;
 	vport->sw_idx = param->idx;
 	vport->devarg_id = param->devarg_id;
-	vport->dev = dev;
 
 	memset(&create_vport_info, 0, sizeof(create_vport_info));
 	ret = idpf_vport_info_init(vport, &create_vport_info);
diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c
index e01eb3a2ec..38ad4e7ac0 100644
--- a/drivers/net/idpf/idpf_ethdev.c
+++ b/drivers/net/idpf/idpf_ethdev.c
@@ -1024,7 +1024,8 @@ static void
 idpf_handle_event_msg(struct idpf_vport *vport, uint8_t *msg, uint16_t msglen)
 {
 	struct virtchnl2_event *vc_event = (struct virtchnl2_event *)msg;
-	struct rte_eth_dev *dev = (struct rte_eth_dev *)vport->dev;
+	struct rte_eth_dev_data *data = vport->dev_data;
+	struct rte_eth_dev *dev = &rte_eth_devices[data->port_id];
 
 	if (msglen < sizeof(struct virtchnl2_event)) {
 		PMD_DRV_LOG(ERR, "Error event");
@@ -1235,7 +1236,6 @@ idpf_dev_vport_init(struct rte_eth_dev *dev, void *init_params)
 	vport->adapter = &adapter->base;
 	vport->sw_idx = param->idx;
 	vport->devarg_id = param->devarg_id;
-	vport->dev = dev;
 
 	memset(&create_vport_info, 0, sizeof(create_vport_info));
 	ret = idpf_vport_info_init(vport, &create_vport_info);
-- 
2.31.1
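The multi-process rationale can be sketched standalone: dev_data lives in shared memory, while the rte_eth_devices[] array is mapped per process, so a device pointer cached by the primary process would be stale in a secondary one. Indexing the local array by port_id, as the patch does, is valid in every process. All names below are illustrative mocks.

```c
#include <stdint.h>

#define MOCK_MAX_PORTS 8

/* Shared, process-independent per-port data (holds the port_id). */
struct mock_eth_dev_data { uint16_t port_id; };

/* Per-process device table, analogous to rte_eth_devices[]. */
struct mock_eth_dev { struct mock_eth_dev_data *data; };

static struct mock_eth_dev mock_eth_devices[MOCK_MAX_PORTS];

/* Resolve the local device from shared data via port_id. */
static struct mock_eth_dev *
dev_from_data(const struct mock_eth_dev_data *data)
{
	return &mock_eth_devices[data->port_id];
}
```

Each process resolves its own `mock_eth_devices[port_id]` entry, so no raw pointer ever crosses the process boundary.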
[PATCH] net/mlx5: add timestamp ascending order error statistics
The ConnectX NICs support packet send scheduling at a specified moment
of time. An application can set the desired timestamp value in the
dynamic mbuf field, and the driver will push a special WAIT WQE to the
hardware queue in order to suspend the entire queue operations till the
specified time moment, then the PMD pushes the regular WQE for packet
sending. In the following packets the scheduling can be requested
again, with different timestamps, and the driver pushes WAIT WQEs
accordingly.

The timestamps should be provided by the application in ascending order
as packets are queued to the hardware queue, otherwise the hardware
would not be able to perform scheduling correctly - it handles the WAIT
WQEs in the order they were pushed, there is no reordering, neither in
the PMD nor in the NIC, and, obviously, regular hardware can't work as
a time machine and wait for some already-elapsed moment in the past.

Signed-off-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5.h      |  1 +
 drivers/net/mlx5/mlx5_tx.h   |  5 +++++
 drivers/net/mlx5/mlx5_txpp.c | 12 +++++++++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 9eae692037..e03f1f6385 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1186,6 +1186,7 @@ struct mlx5_dev_txpp {
 	uint64_t err_clock_queue; /* Clock Queue errors. */
 	uint64_t err_ts_past; /* Timestamp in the past. */
 	uint64_t err_ts_future; /* Timestamp in the distant future. */
+	uint64_t err_ts_order; /* Timestamp not in ascending order. */
 };
 
 /* Sample ID information of eCPRI flex parser structure. */
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index d0c6303a2d..cc8f7e98aa 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -162,6 +162,7 @@ struct mlx5_txq_data {
 	uint16_t idx; /* Queue index. */
 	uint64_t rt_timemask; /* Scheduling timestamp mask. */
 	uint64_t ts_mask; /* Timestamp flag dynamic mask. */
+	uint64_t ts_last; /* Last scheduled timestamp. */
 	int32_t ts_offset; /* Timestamp field dynamic offset. */
 	struct mlx5_dev_ctx_shared *sh; /* Shared context. */
 	struct mlx5_txq_stats stats; /* TX queue counters. */
@@ -1682,6 +1683,10 @@ mlx5_tx_schedule_send(struct mlx5_txq_data *restrict txq,
 		return MLX5_TXCMP_CODE_EXIT;
 	/* Convert the timestamp into completion to wait. */
 	ts = *RTE_MBUF_DYNFIELD(loc->mbuf, txq->ts_offset, uint64_t *);
+	if (txq->ts_last && ts < txq->ts_last)
+		__atomic_fetch_add(&txq->sh->txpp.err_ts_order,
+				   1, __ATOMIC_RELAXED);
+	txq->ts_last = ts;
 	wqe = txq->wqes + (txq->wqe_ci & txq->wqe_m);
 	sh = txq->sh;
 	if (txq->wait_on_time) {
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 0e1da1d5f5..5a5df2d1bb 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -29,6 +29,7 @@ static const char * const mlx5_txpp_stat_names[] = {
 	"tx_pp_clock_queue_errors", /* Clock Queue errors. */
 	"tx_pp_timestamp_past_errors", /* Timestamp in the past. */
 	"tx_pp_timestamp_future_errors", /* Timestamp in the distant future. */
+	"tx_pp_timestamp_order_errors", /* Timestamp not in ascending order. */
 	"tx_pp_jitter", /* Timestamp jitter (one Clock Queue completion). */
 	"tx_pp_wander", /* Timestamp wander (half of Clock Queue CQEs). */
 	"tx_pp_sync_lost", /* Scheduling synchronization lost. */
@@ -758,6 +759,7 @@ mlx5_txpp_start_service(struct mlx5_dev_ctx_shared *sh)
 	sh->txpp.err_clock_queue = 0;
 	sh->txpp.err_ts_past = 0;
 	sh->txpp.err_ts_future = 0;
+	sh->txpp.err_ts_order = 0;
 	/* Attach interrupt handler to process Rearm Queue completions. */
 	fd = mlx5_os_get_devx_channel_fd(sh->txpp.echan);
 	ret = mlx5_os_set_nonblock_channel_fd(fd);
@@ -1034,6 +1036,7 @@ int mlx5_txpp_xstats_reset(struct rte_eth_dev *dev)
 	__atomic_store_n(&sh->txpp.err_clock_queue, 0, __ATOMIC_RELAXED);
 	__atomic_store_n(&sh->txpp.err_ts_past, 0, __ATOMIC_RELAXED);
 	__atomic_store_n(&sh->txpp.err_ts_future, 0, __ATOMIC_RELAXED);
+	__atomic_store_n(&sh->txpp.err_ts_order, 0, __ATOMIC_RELAXED);
 
 	return 0;
 }
@@ -1221,9 +1224,12 @@ mlx5_txpp_xstats_get(struct rte_eth_dev *dev,
 		stats[n_used + 4].value =
 			__atomic_load_n(&sh->txpp.err_ts_future,
 					__ATOMIC_RELAXED);
-		stats[n_used + 5].value = mlx5_txpp_xstats_jitter(&sh->txpp);
-		stats[n_used + 6].value = mlx5_txpp_xstats_wander(&sh->txpp);
-		stats[n_used + 7].value = sh->txpp.sync_lost;
+		stats[n_used + 5].value =
+			__
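The ascending-order check added in mlx5_tx_schedule_send() can be sketched standalone: a timestamp lower than the previously queued one cannot be honoured (WAIT WQEs are consumed in order), so it is only counted in a relaxed atomic error statistic rather than rejected. The names below are illustrative, not the driver's.

```c
#include <stdint.h>
#include <stdatomic.h>

static _Atomic uint64_t err_ts_order;	/* out-of-order counter */
static uint64_t ts_last;		/* last scheduled timestamp */

/* Count a requested timestamp that went backwards, then remember it. */
static void
track_tx_timestamp(uint64_t ts)
{
	if (ts_last != 0 && ts < ts_last)
		atomic_fetch_add_explicit(&err_ts_order, 1,
					  memory_order_relaxed);
	ts_last = ts;
}
```

A relaxed increment is enough here because the counter is purely statistical; no ordering with the WQE writes is required.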
[RFC 0/2] ethdev: extend modify field API
This patch-set extends the modify field action API to support 2 special
cases:

1. Modify field when the relevant header appears multiple times inside
   the same encapsulation level.
2. Modify the Geneve option header which is specified by its "type" and
   "class" fields.

In the current API, the header type is provided by the
"rte_flow_field_id" enumeration and the encapsulation level
(inner/outer/tunnel) is specified by the "data.level" field. However,
there is no way to specify a particular header inside an encapsulation
level. For example, for this packet:

  eth / mpls / mpls / mpls / ipv4 / udp

neither the second nor the third MPLS header can be modified using this
API.

Michael Baum (2):
  ethdev: add GENEVE TLV option modification support
  ethdev: add MPLS header modification support

 app/test-pmd/cmdline_flow.c        | 69 +++++++++++++++++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst | 33 ++++++++++-----
 lib/ethdev/rte_flow.h              | 72 ++++++++++++++++++++++++++++--
 3 files changed, 165 insertions(+), 9 deletions(-)

-- 
2.25.1
[RFC 1/2] ethdev: add GENEVE TLV option modification support
Add modify field support for GENEVE option fields:
- "RTE_FLOW_FIELD_GENEVE_OPT_TYPE"
- "RTE_FLOW_FIELD_GENEVE_OPT_CLASS"
- "RTE_FLOW_FIELD_GENEVE_OPT_DATA"

Each GENEVE TLV option is identified by both its "class" and "type", so
2 new fields were added to the "rte_flow_action_modify_data" structure
to help specify which option to modify.

To make room for those 2 new fields, the "level" field moved to
"uint8_t", which is more than enough for the encapsulation level.

Signed-off-by: Michael Baum
---
 app/test-pmd/cmdline_flow.c        | 47 +++++++++++++++++++++++++-
 doc/guides/prog_guide/rte_flow.rst | 27 ++++++++++-----
 lib/ethdev/rte_flow.h              | 51 +++++++++++++++++++++++++-
 3 files changed, 118 insertions(+), 7 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 58939ec321..db8bd30cb1 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -636,11 +636,15 @@ enum index {
 	ACTION_MODIFY_FIELD_DST_TYPE_VALUE,
 	ACTION_MODIFY_FIELD_DST_LEVEL,
 	ACTION_MODIFY_FIELD_DST_LEVEL_VALUE,
+	ACTION_MODIFY_FIELD_DST_TYPE_ID,
+	ACTION_MODIFY_FIELD_DST_CLASS_ID,
 	ACTION_MODIFY_FIELD_DST_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_TYPE,
 	ACTION_MODIFY_FIELD_SRC_TYPE_VALUE,
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
 	ACTION_MODIFY_FIELD_SRC_LEVEL_VALUE,
+	ACTION_MODIFY_FIELD_SRC_TYPE_ID,
+	ACTION_MODIFY_FIELD_SRC_CLASS_ID,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
 	ACTION_MODIFY_FIELD_SRC_POINTER,
@@ -854,7 +858,8 @@ static const char *const modify_field_ids[] = {
 	"ipv4_ecn", "ipv6_ecn", "gtp_psc_qfi", "meter_color",
 	"ipv6_proto",
 	"flex_item",
-	"hash_result", NULL
+	"hash_result",
+	"geneve_opt_type", "geneve_opt_class", "geneve_opt_data", NULL
 };
 
 static const char *const meter_colors[] = {
@@ -2295,6 +2300,8 @@ static const enum index next_action_sample[] = {
 static const enum index action_modify_field_dst[] = {
 	ACTION_MODIFY_FIELD_DST_LEVEL,
+	ACTION_MODIFY_FIELD_DST_TYPE_ID,
+	ACTION_MODIFY_FIELD_DST_CLASS_ID,
 	ACTION_MODIFY_FIELD_DST_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_TYPE,
 	ZERO,
@@ -2302,6 +2309,8 @@ static const enum index action_modify_field_dst[] = {
 static const enum index action_modify_field_src[] = {
 	ACTION_MODIFY_FIELD_SRC_LEVEL,
+	ACTION_MODIFY_FIELD_SRC_TYPE_ID,
+	ACTION_MODIFY_FIELD_SRC_CLASS_ID,
 	ACTION_MODIFY_FIELD_SRC_OFFSET,
 	ACTION_MODIFY_FIELD_SRC_VALUE,
 	ACTION_MODIFY_FIELD_SRC_POINTER,
@@ -6388,6 +6397,24 @@ static const struct token token_list[] = {
 		.call = parse_vc_modify_field_level,
 		.comp = comp_none,
 	},
+	[ACTION_MODIFY_FIELD_DST_TYPE_ID] = {
+		.name = "dst_type_id",
+		.help = "destination field type ID",
+		.next = NEXT(action_modify_field_dst,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					dst.type)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_MODIFY_FIELD_DST_CLASS_ID] = {
+		.name = "dst_class",
+		.help = "destination field class ID",
+		.next = NEXT(action_modify_field_dst,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					dst.class_id)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_MODIFY_FIELD_DST_OFFSET] = {
 		.name = "dst_offset",
 		.help = "destination field bit offset",
@@ -6423,6 +6450,24 @@ static const struct token token_list[] = {
 		.call = parse_vc_modify_field_level,
 		.comp = comp_none,
 	},
+	[ACTION_MODIFY_FIELD_SRC_TYPE_ID] = {
+		.name = "src_type_id",
+		.help = "source field type ID",
+		.next = NEXT(action_modify_field_src,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					src.type)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_MODIFY_FIELD_SRC_CLASS_ID] = {
+		.name = "src_class",
+		.help = "source field class ID",
+		.next = NEXT(action_modify_field_src,
+			     NEXT_ENTRY(COMMON_UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field,
+					src.class_id)),
+		.call = parse_vc_conf,
+	},
 	[ACTION_MODIFY_FIELD_SRC_OFFSET] = {
 		.name = "src_offset",
 		.help = "source field bit offset",
diff --git a/doc/guides/prog_gu
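The descriptor change this RFC describes can be illustrated with a plain-C sketch: "level" shrinks to uint8_t, making room for the Geneve option selectors "type" and "class_id". The field names follow the commit message; the exact layout and types in rte_flow.h may differ.

```c
#include <stdint.h>

/* Illustrative mock of the extended modify-field descriptor. */
struct mock_modify_data {
	int field;		/* e.g. a RTE_FLOW_FIELD_GENEVE_OPT_* id */
	uint8_t level;		/* encapsulation level, was wider before */
	uint8_t type;		/* Geneve option type selector (new) */
	uint16_t class_id;	/* Geneve option class selector (new) */
	uint32_t offset;	/* bit offset inside the selected field */
};
```

With both selectors present, an action can pick one specific TLV option among several carried by the same Geneve header.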
[RFC 2/2] ethdev: add MPLS header modification support
Add support for MPLS modify header using "RTE_FLOW_FIELD_MPLS" id. Since MPLS heaser might appear more the one time in inner/outer/tunnel, a new field was added to "rte_flow_action_modify_data" structure in addition to "level" field. The "sub_level" field is the index of the header inside encapsulation level. It is used for modify multiple MPLS headers in same encapsulation level. This addition enables to modify multiple VLAN headers too, so the description of "RTE_FLOW_FIELD_VLAN_" was updated. Signed-off-by: Michael Baum --- app/test-pmd/cmdline_flow.c| 24 ++- doc/guides/prog_guide/rte_flow.rst | 6 lib/ethdev/rte_flow.h | 47 -- 3 files changed, 61 insertions(+), 16 deletions(-) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index db8bd30cb1..ffeedefc35 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -636,6 +636,7 @@ enum index { ACTION_MODIFY_FIELD_DST_TYPE_VALUE, ACTION_MODIFY_FIELD_DST_LEVEL, ACTION_MODIFY_FIELD_DST_LEVEL_VALUE, + ACTION_MODIFY_FIELD_DST_SUB_LEVEL, ACTION_MODIFY_FIELD_DST_TYPE_ID, ACTION_MODIFY_FIELD_DST_CLASS_ID, ACTION_MODIFY_FIELD_DST_OFFSET, @@ -643,6 +644,7 @@ enum index { ACTION_MODIFY_FIELD_SRC_TYPE_VALUE, ACTION_MODIFY_FIELD_SRC_LEVEL, ACTION_MODIFY_FIELD_SRC_LEVEL_VALUE, + ACTION_MODIFY_FIELD_SRC_SUB_LEVEL, ACTION_MODIFY_FIELD_SRC_TYPE_ID, ACTION_MODIFY_FIELD_SRC_CLASS_ID, ACTION_MODIFY_FIELD_SRC_OFFSET, @@ -859,7 +861,7 @@ static const char *const modify_field_ids[] = { "ipv6_proto", "flex_item", "hash_result", - "geneve_opt_type", "geneve_opt_class", "geneve_opt_data", NULL + "geneve_opt_type", "geneve_opt_class", "geneve_opt_data", "mpls", NULL }; static const char *const meter_colors[] = { @@ -2300,6 +2302,7 @@ static const enum index next_action_sample[] = { static const enum index action_modify_field_dst[] = { ACTION_MODIFY_FIELD_DST_LEVEL, + ACTION_MODIFY_FIELD_DST_SUB_LEVEL, ACTION_MODIFY_FIELD_DST_TYPE_ID, ACTION_MODIFY_FIELD_DST_CLASS_ID, ACTION_MODIFY_FIELD_DST_OFFSET, @@ 
-2309,6 +2312,7 @@ static const enum index action_modify_field_src[] = { ACTION_MODIFY_FIELD_SRC_LEVEL, + ACTION_MODIFY_FIELD_SRC_SUB_LEVEL, ACTION_MODIFY_FIELD_SRC_TYPE_ID, ACTION_MODIFY_FIELD_SRC_CLASS_ID, ACTION_MODIFY_FIELD_SRC_OFFSET, @@ -6397,6 +6401,15 @@ static const struct token token_list[] = { .call = parse_vc_modify_field_level, .comp = comp_none, }, + [ACTION_MODIFY_FIELD_DST_SUB_LEVEL] = { + .name = "dst_sub_level", + .help = "destination field sub level", + .next = NEXT(action_modify_field_dst, +NEXT_ENTRY(COMMON_UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field, + dst.sub_level)), + .call = parse_vc_conf, + }, [ACTION_MODIFY_FIELD_DST_TYPE_ID] = { .name = "dst_type_id", .help = "destination field type ID", @@ -6450,6 +6463,15 @@ static const struct token token_list[] = { .call = parse_vc_modify_field_level, .comp = comp_none, }, + [ACTION_MODIFY_FIELD_SRC_SUB_LEVEL] = { + .name = "src_sub_level", + .help = "source field sub level", + .next = NEXT(action_modify_field_src, +NEXT_ENTRY(COMMON_UNSIGNED)), + .args = ARGS(ARGS_ENTRY(struct rte_flow_action_modify_field, + src.sub_level)), + .call = parse_vc_conf, + }, [ACTION_MODIFY_FIELD_SRC_TYPE_ID] = { .name = "src_type_id", .help = "source field type ID", diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst index dc86e040ec..b5d8ce26c5 100644 --- a/doc/guides/prog_guide/rte_flow.rst +++ b/doc/guides/prog_guide/rte_flow.rst @@ -2939,6 +2939,10 @@ as well as any tag element in the tag array: For the tag array (in case of multiple tags are supported and present) ``level`` translates directly into the array index. +- ``sub_level`` is the index of the header inside the encapsulation level. + It is used to modify either ``VLAN`` or ``MPLS`` headers, multiple of + which might be present in the same encapsulation level. 
+ ``type`` is used to specify (along with ``class_id``) the Geneve option which is being modified. This field is relevant only for ``RTE_FLOW_FIELD_GENEVE_OPT_`` type. @@ -3004,6 +3008,8 @@ value as sequence of bytes {xxx, xxx, 0x85, xxx, xxx, xxx}. +-+-
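The new ``sub_level`` field is, in effect, plain array indexing into a stack of repeated headers within one encapsulation level. A minimal standalone sketch of that selection (the struct and helper names here are hypothetical, not part of the patch):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Minimal stand-in for a 4-byte MPLS header (label/exp/bos/ttl). */
struct mpls_hdr {
	uint8_t b[4];
};

/* Hypothetical helper: pick the sub_level-th MPLS header inside one
 * encapsulation level, mirroring how "sub_level" indexes repeated headers. */
static struct mpls_hdr *
mpls_select(struct mpls_hdr *stack, size_t n_labels, uint8_t sub_level)
{
	if (sub_level >= n_labels)
		return NULL;
	return &stack[sub_level];
}
```

A driver would apply the modify-field action to the header this index selects; out-of-range indices are rejected.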
[PATCH 0/7] update idpf and cpfl timestamp
Add timestamp offload feature support for ACC. Use an alarm to save the master time, solving the timestamp roll over issue. Adjust timestamp mbuf registering at dev start. Wenjing Qiao (7): common/idpf: fix 64b timestamp roll over issue net/idpf: save master time by alarm net/cpfl: save master time by alarm common/idpf: support timestamp offload feature for ACC common/idpf: add timestamp enable flag for rxq net/cpfl: register timestamp mbuf when starting dev net/idpf: register timestamp mbuf when starting dev config/meson.build | 3 + drivers/common/idpf/base/idpf_osdep.h | 48 + drivers/common/idpf/idpf_common_rxtx.c | 133 ++--- drivers/common/idpf/idpf_common_rxtx.h | 5 +- drivers/common/idpf/version.map| 4 + drivers/net/cpfl/cpfl_ethdev.c | 19 drivers/net/cpfl/cpfl_ethdev.h | 3 + drivers/net/cpfl/cpfl_rxtx.c | 2 + drivers/net/idpf/idpf_ethdev.c | 19 drivers/net/idpf/idpf_ethdev.h | 3 + drivers/net/idpf/idpf_rxtx.c | 3 + meson_options.txt | 2 + 12 files changed, 186 insertions(+), 58 deletions(-) -- 2.25.1
[PATCH 1/7] common/idpf: fix 64b timestamp roll over issue
Reading the MTS register at the first packet causes the timestamp roll over issue. To support calculating the 64b timestamp, an alarm is needed to save the master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/common/idpf/idpf_common_rxtx.c | 108 - drivers/common/idpf/idpf_common_rxtx.h | 3 +- drivers/common/idpf/version.map| 1 + 3 files changed, 55 insertions(+), 57 deletions(-) diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index fc87e3e243..19bcb94077 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -4,6 +4,7 @@ #include #include +#include #include "idpf_common_rxtx.h" @@ -442,56 +443,23 @@ idpf_qc_split_rxq_mbufs_alloc(struct idpf_rx_queue *rxq) return 0; } -#define IDPF_TIMESYNC_REG_WRAP_GUARD_BAND 10000 /* Helper function to convert a 32b nanoseconds timestamp to 64b. */ static inline uint64_t -idpf_tstamp_convert_32b_64b(struct idpf_adapter *ad, uint32_t flag, - uint32_t in_timestamp) +idpf_tstamp_convert_32b_64b(uint64_t time_hw, uint32_t in_timestamp) { -#ifdef RTE_ARCH_X86_64 - struct idpf_hw *hw = &ad->hw; const uint64_t mask = 0xFFFFFFFF; - uint32_t hi, lo, lo2, delta; + const uint32_t half_overflow_duration = 0x1 << 31; + uint32_t delta; uint64_t ns; - if (flag != 0) { - IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); - IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_EXEC_CMD_M | - PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); - lo = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - hi = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_H_0); - /* -* On typical system, the delta between lo and lo2 is ~1000ns, -* so 10000 seems a large-enough but not overly-big guard band. 
-*/ - if (lo > (UINT32_MAX - IDPF_TIMESYNC_REG_WRAP_GUARD_BAND)) - lo2 = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - else - lo2 = lo; - - if (lo2 < lo) { - lo = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_L_0); - hi = IDPF_READ_REG(hw, PF_GLTSYN_SHTIME_H_0); - } - - ad->time_hw = ((uint64_t)hi << 32) | lo; - } - - delta = (in_timestamp - (uint32_t)(ad->time_hw & mask)); - if (delta > (mask / 2)) { - delta = ((uint32_t)(ad->time_hw & mask) - in_timestamp); - ns = ad->time_hw - delta; + delta = (in_timestamp - (uint32_t)(time_hw & mask)); + if (delta > half_overflow_duration) { + delta = ((uint32_t)(time_hw & mask) - in_timestamp); + ns = time_hw - delta; } else { - ns = ad->time_hw + delta; + ns = time_hw + delta; } - return ns; -#else /* !RTE_ARCH_X86_64 */ - RTE_SET_USED(ad); - RTE_SET_USED(flag); - RTE_SET_USED(in_timestamp); - return 0; -#endif /* RTE_ARCH_X86_64 */ } #define IDPF_RX_FLEX_DESC_ADV_STATUS0_XSUM_S \ @@ -659,9 +627,6 @@ idpf_dp_splitq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, rx_desc_ring = rxq->rx_ring; ptype_tbl = rxq->adapter->ptype_tbl; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) - rxq->hw_register_set = 1; - while (nb_rx < nb_pkts) { rx_desc = &rx_desc_ring[rx_id]; @@ -720,10 +685,8 @@ idpf_dp_splitq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, if (idpf_timestamp_dynflag > 0 && (rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP)) { /* timestamp */ - ts_ns = idpf_tstamp_convert_32b_64b(ad, - rxq->hw_register_set, + ts_ns = idpf_tstamp_convert_32b_64b(ad->time_hw, rte_le_to_cpu_32(rx_desc->ts_high)); - rxq->hw_register_set = 0; *RTE_MBUF_DYNFIELD(rxm, idpf_timestamp_dynfield_offset, rte_mbuf_timestamp_t *) = ts_ns; @@ -1077,9 +1040,6 @@ idpf_dp_singleq_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, rx_ring = rxq->rx_ring; ptype_tbl = rxq->adapter->ptype_tbl; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) - rxq->hw_register_set = 1; - while (nb_rx < nb_pkts) { rxdp = &rx_ring[rx_id]; rx_status0 = 
rte_le_to_cpu_16(rxdp->flex_nic_wb.status_error0); @@ -1142,10 +1102,8 @@ idpf_dp_
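The 32b-to-64b conversion above can be exercised in isolation: the cached master time supplies the upper bits, and a delta larger than half the 32-bit wrap period means the descriptor stamp sits on the other side of a rollover. A self-contained sketch of that arithmetic (the function name is illustrative; the driver's version lives in idpf_common_rxtx.c):

```c
#include <stdint.h>
#include <assert.h>

/* time_hw: recently cached 64-bit master time; in_ts: 32-bit HW stamp.
 * If the unsigned 32-bit delta exceeds half the wrap period, the stamp is
 * assumed to come from just before the cached time, so we subtract instead. */
static uint64_t
tstamp_32b_to_64b(uint64_t time_hw, uint32_t in_ts)
{
	const uint64_t mask = 0xFFFFFFFFULL;
	const uint32_t half = 1U << 31;
	uint32_t delta = in_ts - (uint32_t)(time_hw & mask);

	if (delta > half)
		return time_hw - (uint32_t)((uint32_t)(time_hw & mask) - in_ts);
	return time_hw + delta;
}
```

Unsigned wraparound makes the forward case work even when the 32-bit counter has just rolled over, which is why the cached master time only needs to be refreshed about once per second.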
[PATCH 2/7] net/idpf: save master time by alarm
Using alarm to save master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/net/idpf/idpf_ethdev.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c index e02ec2ec5a..3f33ffbc78 100644 --- a/drivers/net/idpf/idpf_ethdev.c +++ b/drivers/net/idpf/idpf_ethdev.c @@ -761,6 +761,12 @@ idpf_dev_start(struct rte_eth_dev *dev) goto err_vec; } + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_set(1000 * 1000, + &idpf_dev_read_time_hw, + (void *)base); + } + ret = idpf_vc_vectors_alloc(vport, req_vecs_num); if (ret != 0) { PMD_DRV_LOG(ERR, "Failed to allocate interrupt vectors"); @@ -810,6 +816,7 @@ static int idpf_dev_stop(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; + struct idpf_adapter *base = vport->adapter; if (vport->stopped == 1) return 0; @@ -822,6 +829,11 @@ idpf_dev_stop(struct rte_eth_dev *dev) idpf_vc_vectors_dealloc(vport); + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_cancel(idpf_dev_read_time_hw, +base); + } + vport->stopped = 1; return 0; -- 2.25.1
[PATCH 3/7] net/cpfl: save master time by alarm
Using alarm to save master time from registers every 1 second. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao --- drivers/net/cpfl/cpfl_ethdev.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c index ede730fd50..82d8147494 100644 --- a/drivers/net/cpfl/cpfl_ethdev.c +++ b/drivers/net/cpfl/cpfl_ethdev.c @@ -767,6 +767,12 @@ cpfl_dev_start(struct rte_eth_dev *dev) goto err_vec; } + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_set(1000 * 1000, + &idpf_dev_read_time_hw, + (void *)base); + } + ret = idpf_vc_vectors_alloc(vport, req_vecs_num); if (ret != 0) { PMD_DRV_LOG(ERR, "Failed to allocate interrupt vectors"); @@ -816,6 +822,7 @@ static int cpfl_dev_stop(struct rte_eth_dev *dev) { struct idpf_vport *vport = dev->data->dev_private; + struct idpf_adapter *base = vport->adapter; if (vport->stopped == 1) return 0; @@ -828,6 +835,11 @@ cpfl_dev_stop(struct rte_eth_dev *dev) idpf_vc_vectors_dealloc(vport); + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) { + rte_eal_alarm_cancel(idpf_dev_read_time_hw, +base); + } + vport->stopped = 1; return 0; -- 2.25.1
[PATCH 4/7] common/idpf: support timestamp offload feature for ACC
For ACC, get the master time from MTS registers via shared memory. Notice: this is a workaround, and it will be removed after a generic solution is provided. Signed-off-by: Wenjing Qiao --- config/meson.build | 3 ++ drivers/common/idpf/base/idpf_osdep.h | 48 ++ drivers/common/idpf/idpf_common_rxtx.c | 30 +--- meson_options.txt | 2 ++ 4 files changed, 79 insertions(+), 4 deletions(-) diff --git a/config/meson.build b/config/meson.build index fa730a1b14..8d74f301b4 100644 --- a/config/meson.build +++ b/config/meson.build @@ -316,6 +316,9 @@ endif if get_option('mbuf_refcnt_atomic') dpdk_conf.set('RTE_MBUF_REFCNT_ATOMIC', true) endif +if get_option('enable_acc_timestamp') +dpdk_conf.set('IDPF_ACC_TIMESTAMP', true) +endif dpdk_conf.set10('RTE_IOVA_IN_MBUF', get_option('enable_iova_as_pa')) compile_time_cpuflags = [] diff --git a/drivers/common/idpf/base/idpf_osdep.h b/drivers/common/idpf/base/idpf_osdep.h index 99ae9cf60a..e634939a51 100644 --- a/drivers/common/idpf/base/idpf_osdep.h +++ b/drivers/common/idpf/base/idpf_osdep.h @@ -24,6 +24,13 @@ #include #include +#ifdef IDPF_ACC_TIMESTAMP +#include +#include +#include +#include +#endif /* IDPF_ACC_TIMESTAMP */ + #define INLINE inline #define STATIC static @@ -361,4 +368,45 @@ idpf_hweight32(u32 num) #endif +#ifdef IDPF_ACC_TIMESTAMP +#define IDPF_ACC_TIMESYNC_BASE_ADDR 0x480D50 +#define IDPF_ACC_GLTSYN_TIME_H (IDPF_ACC_TIMESYNC_BASE_ADDR + 0x1C) +#define IDPF_ACC_GLTSYN_TIME_L (IDPF_ACC_TIMESYNC_BASE_ADDR + 0x10) + +inline uint32_t +idpf_mmap_r32(uint64_t pa) +{ + int fd; + void *bp, *vp; + uint32_t rval = 0xdeadbeef; + uint32_t ps, ml, of; + + fd = open("/dev/mem", (O_RDWR | O_SYNC)); + if (fd == -1) { + perror("/dev/mem"); + return -1; + } + ml = ps = getpagesize(); + of = (uint32_t)pa & (ps - 1); + if (of + (sizeof(uint32_t) * 4) > ps) + ml *= 2; + bp = mmap(NULL, ml, (PROT_READ | PROT_WRITE), MAP_SHARED, fd, pa & ~(uint64_t)(ps - 1)); + if (bp == MAP_FAILED) { + perror("mmap"); + goto done; + } + + vp = (char *)bp + 
of; + + rval = *(volatile uint32_t *)vp; + if (munmap(bp, ml) == -1) + perror("munmap"); +done: + close(fd); + + return rval; +} + +#endif /* IDPF_ACC_TIMESTAMP */ + #endif /* _IDPF_OSDEP_H_ */ diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index 19bcb94077..9c58f3fb11 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -1582,12 +1582,36 @@ idpf_qc_splitq_rx_vec_setup(struct idpf_rx_queue *rxq) void idpf_dev_read_time_hw(void *cb_arg) { -#ifdef RTE_ARCH_X86_64 struct idpf_adapter *ad = (struct idpf_adapter *)cb_arg; uint32_t hi, lo, lo2; int rc = 0; +#ifndef IDPF_ACC_TIMESTAMP struct idpf_hw *hw = &ad->hw; +#endif /* !IDPF_ACC_TIMESTAMP */ +#ifdef IDPF_ACC_TIMESTAMP + + lo = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + hi = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_H); + DRV_LOG(DEBUG, "lo : %X,", lo); + DRV_LOG(DEBUG, "hi : %X,", hi); + /* +* On typical system, the delta between lo and lo2 is ~1000ns, +* so 10000 seems a large-enough but not overly-big guard band. 
+*/ + if (lo > (UINT32_MAX - IDPF_TIMESYNC_REG_WRAP_GUARD_BAND)) + lo2 = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + else + lo2 = lo; + + if (lo2 < lo) { + lo = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_L); + hi = idpf_mmap_r32(IDPF_ACC_GLTSYN_TIME_H); + } + + ad->time_hw = ((uint64_t)hi << 32) | lo; + +#else /* !IDPF_ACC_TIMESTAMP */ IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); IDPF_WRITE_REG(hw, GLTSYN_CMD_SYNC_0_0, PF_GLTSYN_CMD_SYNC_EXEC_CMD_M | PF_GLTSYN_CMD_SYNC_SHTIME_EN_M); @@ -1608,9 +1632,7 @@ idpf_dev_read_time_hw(void *cb_arg) } ad->time_hw = ((uint64_t)hi << 32) | lo; -#else /* !RTE_ARCH_X86_64 */ - ad->time_hw = 0; -#endif /* RTE_ARCH_X86_64 */ +#endif /* IDPF_ACC_TIMESTAMP */ /* re-alarm watchdog */ rc = rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, cb_arg); diff --git a/meson_options.txt b/meson_options.txt index 82c8297065..31fc634aa0 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -52,3 +52,5 @@ option('tests', type: 'boolean', value: true, description: 'build unit tests') option('use_hpet', type: 'boolean', value: false, description: 'use HPET timer in EAL') +option('enable_acc_timestamp', type: 'boolean', value: false, description: + 'enable timestamp on ACC.') -- 2.25.1
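The /dev/mem helper above must hand mmap(2) a page-aligned file offset, so the physical register address is split into an aligned base and an in-page offset, doubling the mapping length when the read window could cross a page boundary. That address arithmetic can be shown standalone (the struct and function names are hypothetical):

```c
#include <stdint.h>
#include <assert.h>

/* Result of splitting a physical address into an mmap request. */
struct map_req {
	uint64_t base;   /* page-aligned address to pass to mmap */
	uint32_t offset; /* byte offset of the register inside the mapping */
	uint32_t length; /* mapping length; doubled if the window may cross a page */
};

static struct map_req
map_request(uint64_t pa, uint32_t page_size)
{
	struct map_req r;

	r.offset = (uint32_t)(pa & (page_size - 1));
	r.base = pa & ~(uint64_t)(page_size - 1);
	r.length = page_size;
	/* same guard as the helper above: room for four 32-bit registers */
	if (r.offset + 4 * sizeof(uint32_t) > page_size)
		r.length *= 2;
	return r;
}
```

The register is then read at `base mapping + offset`, exactly as idpf_mmap_r32() does.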
[PATCH 5/7] common/idpf: add timestamp enable flag for rxq
A rxq can be configured with timestamp offload. So, add timestamp enable flag for rxq. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/common/idpf/idpf_common_rxtx.c | 3 ++- drivers/common/idpf/idpf_common_rxtx.h | 2 ++ drivers/common/idpf/version.map| 3 +++ 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/common/idpf/idpf_common_rxtx.c b/drivers/common/idpf/idpf_common_rxtx.c index 9c58f3fb11..7afe7afe3f 100644 --- a/drivers/common/idpf/idpf_common_rxtx.c +++ b/drivers/common/idpf/idpf_common_rxtx.c @@ -354,7 +354,7 @@ int idpf_qc_ts_mbuf_register(struct idpf_rx_queue *rxq) { int err; - if ((rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP) != 0) { + if (!rxq->ts_enable && (rxq->offloads & IDPF_RX_OFFLOAD_TIMESTAMP)) { /* Register mbuf field and flag for Rx timestamp */ err = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, &idpf_timestamp_dynflag); @@ -363,6 +363,7 @@ idpf_qc_ts_mbuf_register(struct idpf_rx_queue *rxq) "Cannot register mbuf field/flag for timestamp"); return -EINVAL; } + rxq->ts_enable = TRUE; } return 0; } diff --git a/drivers/common/idpf/idpf_common_rxtx.h b/drivers/common/idpf/idpf_common_rxtx.h index af1425eb3f..cb7f5a3ba8 100644 --- a/drivers/common/idpf/idpf_common_rxtx.h +++ b/drivers/common/idpf/idpf_common_rxtx.h @@ -142,6 +142,8 @@ struct idpf_rx_queue { struct idpf_rx_queue *bufq2; uint64_t offloads; + + bool ts_enable; /* if timestamp is enabled */ }; struct idpf_tx_entry { diff --git a/drivers/common/idpf/version.map b/drivers/common/idpf/version.map index c67c554911..15b42b4d2e 100644 --- a/drivers/common/idpf/version.map +++ b/drivers/common/idpf/version.map @@ -69,5 +69,8 @@ INTERNAL { idpf_vport_rss_config; idpf_vport_stats_update; + idpf_timestamp_dynfield_offset; + idpf_timestamp_dynflag; + local: *; }; -- 2.25.1
[PATCH 6/7] net/cpfl: register timestamp mbuf when starting dev
Since timestamp offload is only supported at port level, registering the timestamp mbuf should be done at the dev start stage. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/net/cpfl/cpfl_ethdev.c | 7 +++ drivers/net/cpfl/cpfl_ethdev.h | 3 +++ drivers/net/cpfl/cpfl_rxtx.c | 2 ++ 3 files changed, 12 insertions(+) diff --git a/drivers/net/cpfl/cpfl_ethdev.c b/drivers/net/cpfl/cpfl_ethdev.c index 82d8147494..416273f567 100644 --- a/drivers/net/cpfl/cpfl_ethdev.c +++ b/drivers/net/cpfl/cpfl_ethdev.c @@ -771,6 +771,13 @@ cpfl_dev_start(struct rte_eth_dev *dev) rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, (void *)base); + /* Register mbuf field and flag for Rx timestamp */ + ret = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, + &idpf_timestamp_dynflag); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Cannot register mbuf field/flag for timestamp"); + return -EINVAL; + } } ret = idpf_vc_vectors_alloc(vport, req_vecs_num); diff --git a/drivers/net/cpfl/cpfl_ethdev.h b/drivers/net/cpfl/cpfl_ethdev.h index 200dfcac02..eec253bc77 100644 --- a/drivers/net/cpfl/cpfl_ethdev.h +++ b/drivers/net/cpfl/cpfl_ethdev.h @@ -57,6 +57,9 @@ /* Device IDs */ #define IDPF_DEV_ID_CPF0x1453 +extern int idpf_timestamp_dynfield_offset; +extern uint64_t idpf_timestamp_dynflag; + struct cpfl_vport_param { struct cpfl_adapter_ext *adapter; uint16_t devarg_id; /* arg id from user */ diff --git a/drivers/net/cpfl/cpfl_rxtx.c b/drivers/net/cpfl/cpfl_rxtx.c index de59b31b3d..cdb5b37da0 100644 --- a/drivers/net/cpfl/cpfl_rxtx.c +++ b/drivers/net/cpfl/cpfl_rxtx.c @@ -529,6 +529,8 @@ cpfl_rx_queue_init(struct rte_eth_dev *dev, uint16_t rx_queue_id) frame_size > rxq->rx_buf_len) dev->data->scattered_rx = 1; + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) + rxq->ts_enable = TRUE; err = idpf_qc_ts_mbuf_register(rxq); if (err != 0) { PMD_DRV_LOG(ERR, "fail to register timestamp mbuf %u", -- 
2.25.1
[PATCH 7/7] net/idpf: register timestamp mbuf when starting dev
Since timestamp offload is only supported at port level, registering the timestamp mbuf should be done at the dev start stage. Fixes: 8c6098afa075 ("common/idpf: add Rx/Tx data path") Cc: sta...@dpdk.org Signed-off-by: Wenjing Qiao Suggested-by: Jingjing Wu --- drivers/net/idpf/idpf_ethdev.c | 7 +++ drivers/net/idpf/idpf_ethdev.h | 3 +++ drivers/net/idpf/idpf_rxtx.c | 3 +++ 3 files changed, 13 insertions(+) diff --git a/drivers/net/idpf/idpf_ethdev.c b/drivers/net/idpf/idpf_ethdev.c index 3f33ffbc78..7c43f51c25 100644 --- a/drivers/net/idpf/idpf_ethdev.c +++ b/drivers/net/idpf/idpf_ethdev.c @@ -765,6 +765,13 @@ idpf_dev_start(struct rte_eth_dev *dev) rte_eal_alarm_set(1000 * 1000, &idpf_dev_read_time_hw, (void *)base); + /* Register mbuf field and flag for Rx timestamp */ + ret = rte_mbuf_dyn_rx_timestamp_register(&idpf_timestamp_dynfield_offset, + &idpf_timestamp_dynflag); + if (ret != 0) { + PMD_DRV_LOG(ERR, "Cannot register mbuf field/flag for timestamp"); + return -EINVAL; + } } ret = idpf_vc_vectors_alloc(vport, req_vecs_num); diff --git a/drivers/net/idpf/idpf_ethdev.h b/drivers/net/idpf/idpf_ethdev.h index 3c2c932438..256e348710 100644 --- a/drivers/net/idpf/idpf_ethdev.h +++ b/drivers/net/idpf/idpf_ethdev.h @@ -55,6 +55,9 @@ #define IDPF_ALARM_INTERVAL5 /* us */ +extern int idpf_timestamp_dynfield_offset; +extern uint64_t idpf_timestamp_dynflag; + struct idpf_vport_param { struct idpf_adapter_ext *adapter; uint16_t devarg_id; /* arg id from user */ diff --git a/drivers/net/idpf/idpf_rxtx.c b/drivers/net/idpf/idpf_rxtx.c index 414f9a37f6..1aaf0142d2 100644 --- a/drivers/net/idpf/idpf_rxtx.c +++ b/drivers/net/idpf/idpf_rxtx.c @@ -529,6 +529,9 @@ idpf_rx_queue_init(struct rte_eth_dev *dev, uint16_t rx_queue_id) frame_size > rxq->rx_buf_len) dev->data->scattered_rx = 1; + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) + rxq->ts_enable = TRUE; + err = idpf_qc_ts_mbuf_register(rxq); if (err != 0) { PMD_DRV_LOG(ERR, "fail to register timestamp mbuf %u", -- 2.25.1
[RFC] net/mlx5: add MPLS modify field support
Add support for modify field in tunnel MPLS header. For now it is supported only to copy from. Signed-off-by: Michael Baum --- drivers/common/mlx5/mlx5_prm.h | 5 + drivers/net/mlx5/mlx5_flow_dv.c | 23 +++ drivers/net/mlx5/mlx5_flow_hw.c | 16 +--- 3 files changed, 37 insertions(+), 7 deletions(-) diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index ed3d5efbb7..04c1400a1e 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -787,6 +787,11 @@ enum mlx5_modification_field { MLX5_MODI_TUNNEL_HDR_DW_1 = 0x75, MLX5_MODI_GTPU_FIRST_EXT_DW_0 = 0x76, MLX5_MODI_HASH_RESULT = 0x81, + MLX5_MODI_IN_MPLS_LABEL_0 = 0x8a, + MLX5_MODI_IN_MPLS_LABEL_1, + MLX5_MODI_IN_MPLS_LABEL_2, + MLX5_MODI_IN_MPLS_LABEL_3, + MLX5_MODI_IN_MPLS_LABEL_4, MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A, MLX5_MODI_INVALID = INT_MAX, }; diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index f136f43b0a..93cce16a1e 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -1388,6 +1388,7 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev, case RTE_FLOW_FIELD_GENEVE_VNI: return 24; case RTE_FLOW_FIELD_GTP_TEID: + case RTE_FLOW_FIELD_MPLS: case RTE_FLOW_FIELD_TAG: return 32; case RTE_FLOW_FIELD_MARK: @@ -1435,6 +1436,12 @@ flow_modify_info_mask_32_masked(uint32_t length, uint32_t off, uint32_t post_mas return rte_cpu_to_be_32(mask & post_mask); } +static __rte_always_inline enum mlx5_modification_field +mlx5_mpls_modi_field_get(const struct rte_flow_action_modify_data *data) +{ + return MLX5_MODI_IN_MPLS_LABEL_0 + data->sub_level; +} + static void mlx5_modify_flex_item(const struct rte_eth_dev *dev, const struct mlx5_flex_item *flex, @@ -1893,6 +1900,16 @@ mlx5_flow_field_id_to_modify_info else info[idx].offset = off_be; break; + case RTE_FLOW_FIELD_MPLS: + MLX5_ASSERT(data->offset + width <= 32); + off_be = 32 - (data->offset + width); + info[idx] = (struct field_modify_info){4, 0, + 
mlx5_mpls_modi_field_get(data)}; + if (mask) + mask[idx] = flow_modify_info_mask_32(width, off_be); + else + info[idx].offset = off_be; + break; case RTE_FLOW_FIELD_TAG: { MLX5_ASSERT(data->offset + width <= 32); @@ -5344,6 +5361,12 @@ flow_dv_validate_action_modify_field(struct rte_eth_dev *dev, RTE_FLOW_ERROR_TYPE_ACTION, action, "modifications of the GENEVE Network" " Identifier is not supported"); + if (action_modify_field->dst.field == RTE_FLOW_FIELD_MPLS || + action_modify_field->src.field == RTE_FLOW_FIELD_MPLS) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ACTION, action, + "modifications of the MPLS header " + "is not supported"); if (action_modify_field->dst.field == RTE_FLOW_FIELD_MARK || action_modify_field->src.field == RTE_FLOW_FIELD_MARK) if (config->dv_xmeta_en == MLX5_XMETA_MODE_LEGACY || diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c index 7e0ee8d883..fd2ad3bb58 100644 --- a/drivers/net/mlx5/mlx5_flow_hw.c +++ b/drivers/net/mlx5/mlx5_flow_hw.c @@ -3546,10 +3546,8 @@ flow_hw_validate_action_modify_field(const struct rte_flow_action *action, const struct rte_flow_action *mask, struct rte_flow_error *error) { - const struct rte_flow_action_modify_field *action_conf = - action->conf; - const struct rte_flow_action_modify_field *mask_conf = - mask->conf; + const struct rte_flow_action_modify_field *action_conf = action->conf; + const struct rte_flow_action_modify_field *mask_conf = mask->conf; if (action_conf->operation != mask_conf->operation) return rte_flow_error_set(error, EINVAL, @@ -3604,6 +3602,11 @@ flow_hw_validate_action_modify_field(const struct rte_flow_action *action, return rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ACTION, action, "modifying Geneve VNI is not supported"); + /* Due to HW bug, tunnel MPLS header is read only. */ + if (action_conf->dst.field == RTE_FLOW_FIELD_MPLS) + return rte_flow_error_set(error, EINVAL, + RTE_FLOW_ERROR_TYPE_A
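The modify-info path above turns a field of `width` bits at MSB-relative `offset` into a shift of `off_be = 32 - (offset + width)` bits plus a contiguous big-endian mask. A simplified standalone sketch of that mask math (bswap32 stands in for rte_cpu_to_be_32() on a little-endian host; this is an illustration of the idea, not the exact flow_modify_info_mask_32() implementation):

```c
#include <stdint.h>
#include <assert.h>

/* 32-bit byte swap, equivalent to rte_cpu_to_be_32() on little-endian hosts. */
static uint32_t
bswap32(uint32_t v)
{
	return ((v & 0x000000FFU) << 24) | ((v & 0x0000FF00U) << 8) |
	       ((v & 0x00FF0000U) >> 8)  | ((v & 0xFF000000U) >> 24);
}

/* Build the big-endian mask covering `width` bits shifted up by `off_be`.
 * The width == 32 case is special-cased to avoid an undefined 32-bit shift. */
static uint32_t
modify_info_mask_32(uint32_t width, uint32_t off_be)
{
	uint32_t mask = (width == 32) ? UINT32_MAX : ((1U << width) - 1) << off_be;

	return bswap32(mask);
}
```

For a full 32-bit MPLS word (width 32, offset 0) the mask is all ones; a 20-bit label at offset 0 gives off_be = 12.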
[RFC PATCH v1 0/5] dts: add tg abstractions and scapy
The implementation adds abstractions for all traffic generators as well as those that can capture individual packets and investigate (not just count) them. The traffic generators reside on traffic generator nodes which are also added, along with some related code. Juraj Linkeš (5): dts: add scapy dependency dts: add traffic generator config dts: traffic generator abstractions dts: scapy traffic generator implementation dts: add traffic generator node to dts runner dts/conf.yaml | 25 ++ dts/framework/config/__init__.py | 107 +- dts/framework/config/conf_yaml_schema.json| 172 - dts/framework/dts.py | 42 ++- dts/framework/remote_session/linux_session.py | 55 +++ dts/framework/remote_session/os_session.py| 22 +- dts/framework/remote_session/posix_session.py | 3 + .../remote_session/remote/remote_session.py | 7 + dts/framework/testbed_model/__init__.py | 1 + .../capturing_traffic_generator.py| 155 dts/framework/testbed_model/hw/port.py| 55 +++ dts/framework/testbed_model/node.py | 4 +- dts/framework/testbed_model/scapy.py | 348 ++ dts/framework/testbed_model/sut_node.py | 5 +- dts/framework/testbed_model/tg_node.py| 62 .../testbed_model/traffic_generator.py| 59 +++ dts/poetry.lock | 18 +- dts/pyproject.toml| 1 + 18 files changed, 1103 insertions(+), 38 deletions(-) create mode 100644 dts/framework/testbed_model/capturing_traffic_generator.py create mode 100644 dts/framework/testbed_model/hw/port.py create mode 100644 dts/framework/testbed_model/scapy.py create mode 100644 dts/framework/testbed_model/tg_node.py create mode 100644 dts/framework/testbed_model/traffic_generator.py -- 2.30.2
[RFC PATCH v1 1/5] dts: add scapy dependency
Required for scapy traffic generator. Signed-off-by: Juraj Linkeš --- dts/poetry.lock| 18 +- dts/pyproject.toml | 1 + 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/dts/poetry.lock b/dts/poetry.lock index 64d6c18f35..4b6c42e280 100644 --- a/dts/poetry.lock +++ b/dts/poetry.lock @@ -425,6 +425,22 @@ files = [ {file = "PyYAML-6.0.tar.gz", hash = "sha256:68fb519c14306fec9720a2a5b45bc9f0c8d1b9c72adf45c37baedfcd949c35a2"}, ] +[[package]] +name = "scapy" +version = "2.5.0" +description = "Scapy: interactive packet manipulation tool" +category = "main" +optional = false +python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, <4" +files = [ +{file = "scapy-2.5.0.tar.gz", hash = "sha256:5b260c2b754fd8d409ba83ee7aee294ecdbb2c235f9f78fe90bc11cb6e5debc2"}, +] + +[package.extras] +basic = ["ipython"] +complete = ["cryptography (>=2.0)", "ipython", "matplotlib", "pyx"] +docs = ["sphinx (>=3.0.0)", "sphinx_rtd_theme (>=0.4.3)", "tox (>=3.0.0)"] + [[package]] name = "snowballstemmer" version = "2.2.0" @@ -504,4 +520,4 @@ jsonschema = ">=4,<5" [metadata] lock-version = "2.0" python-versions = "^3.10" -content-hash = "af71d1ffeb4372d870bd02a8bf101577254e03ddb8b12d02ab174f80069fd853" +content-hash = "fba5dcbb12d55a9c6b3f59f062d3a40973ff8360edb023b6e4613522654ba7c1" diff --git a/dts/pyproject.toml b/dts/pyproject.toml index 72d5b0204d..fc9b6278bb 100644 --- a/dts/pyproject.toml +++ b/dts/pyproject.toml @@ -23,6 +23,7 @@ pexpect = "^4.8.0" warlock = "^2.0.1" PyYAML = "^6.0" types-PyYAML = "^6.0.8" +scapy = "^2.5.0" [tool.poetry.group.dev.dependencies] mypy = "^0.961" -- 2.30.2
[RFC PATCH v1 2/5] dts: add traffic generator config
Node configuration - where to connect, what ports to use and what TG to use. Signed-off-by: Juraj Linkeš --- dts/conf.yaml | 25 +++ dts/framework/config/__init__.py | 107 +++-- dts/framework/config/conf_yaml_schema.json | 172 - 3 files changed, 287 insertions(+), 17 deletions(-) diff --git a/dts/conf.yaml b/dts/conf.yaml index a9bd8a3ecf..4e5fd3560f 100644 --- a/dts/conf.yaml +++ b/dts/conf.yaml @@ -13,6 +13,7 @@ executions: test_suites: - hello_world system_under_test: "SUT 1" +traffic_generator_system: "TG 1" nodes: - name: "SUT 1" hostname: sut1.change.me.localhost @@ -25,3 +26,27 @@ nodes: hugepages: # optional; if removed, will use system hugepage configuration amount: 256 force_first_numa: false +ports: + - pci: ":00:08.0" +dpdk_os_driver: vfio-pci +os_driver: i40e +peer_node: "TG 1" +peer_pci: ":00:08.0" + - name: "TG 1" +hostname: tg1.change.me.localhost +user: root +arch: x86_64 +os: linux +lcores: "" +use_first_core: false +hugepages: # optional; if removed, will use system hugepage configuration +amount: 256 +force_first_numa: false +ports: + - pci: ":00:08.0" +dpdk_os_driver: rdma +os_driver: rdma +peer_node: "SUT 1" +peer_pci: ":00:08.0" +traffic_generator: + type: SCAPY diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index ebb0823ff5..6b1c3159f7 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -12,7 +12,7 @@ import pathlib from dataclasses import dataclass from enum import Enum, auto, unique -from typing import Any, TypedDict +from typing import Any, TypedDict, Union import warlock # type: ignore import yaml @@ -61,6 +61,18 @@ class Compiler(StrEnum): msvc = auto() +@unique +class NodeType(StrEnum): +physical = auto() +virtual = auto() + + +@unique +class TrafficGeneratorType(StrEnum): +NONE = auto() +SCAPY = auto() + + # Slots enables some optimizations, by pre-allocating space for the defined # attributes in the underlying data structure. 
# @@ -72,6 +84,41 @@ class HugepageConfiguration: force_first_numa: bool +@dataclass(slots=True, frozen=True) +class PortConfig: +id: int +node: str +pci: str +dpdk_os_driver: str +os_driver: str +peer_node: str +peer_pci: str + +@staticmethod +def from_dict(id: int, node: str, d: dict) -> "PortConfig": +return PortConfig(id=id, node=node, **d) + + +@dataclass(slots=True, frozen=True) +class TrafficGeneratorConfig: +traffic_generator_type: TrafficGeneratorType + +@staticmethod +def from_dict(d: dict): +# This looks useless now, but is designed to allow expansion to traffic +# generators that require more configuration later. +match TrafficGeneratorType(d["type"]): +case TrafficGeneratorType.SCAPY: +return ScapyTrafficGeneratorConfig( +traffic_generator_type=TrafficGeneratorType.SCAPY +) + + +@dataclass(slots=True, frozen=True) +class ScapyTrafficGeneratorConfig(TrafficGeneratorConfig): +pass + + @dataclass(slots=True, frozen=True) class NodeConfiguration: name: str @@ -82,29 +129,52 @@ class NodeConfiguration: os: OS lcores: str use_first_core: bool -memory_channels: int hugepages: HugepageConfiguration | None +ports: list[PortConfig] @staticmethod -def from_dict(d: dict) -> "NodeConfiguration": +def from_dict(d: dict) -> Union["SUTConfiguration", "TGConfiguration"]: hugepage_config = d.get("hugepages") if hugepage_config: if "force_first_numa" not in hugepage_config: hugepage_config["force_first_numa"] = False hugepage_config = HugepageConfiguration(**hugepage_config) -return NodeConfiguration( -name=d["name"], -hostname=d["hostname"], -user=d["user"], -password=d.get("password"), -arch=Architecture(d["arch"]), -os=OS(d["os"]), -lcores=d.get("lcores", "1"), -use_first_core=d.get("use_first_core", False), -memory_channels=d.get("memory_channels", 1), -hugepages=hugepage_config, -) +common_config = {"name": d["name"], +"hostname": d["hostname"], +"user": d["user"], +"password": d.get("password"), +"arch": Architecture(d["arch"]), +"os": OS(d["os"]), +"lcores": 
d.get("lcores", "1"), +"use_first_core": d.get("use_first_core", False), +"hugepages": hugepage_config, +"ports": [ +PortConfig.from_dict(i, d["name"]
[RFC PATCH v1 3/5] dts: traffic generator abstractions
There are traffic abstractions for all traffic generators and for traffic generators that can capture (not just count) packets. There are also related abstractions, such as TGNode, where the traffic generators reside, and some related code. Signed-off-by: Juraj Linkeš --- dts/framework/remote_session/os_session.py| 22 ++- dts/framework/remote_session/posix_session.py | 3 + .../capturing_traffic_generator.py| 155 ++ dts/framework/testbed_model/hw/port.py| 55 +++ dts/framework/testbed_model/node.py | 4 +- dts/framework/testbed_model/sut_node.py | 5 +- dts/framework/testbed_model/tg_node.py| 62 +++ .../testbed_model/traffic_generator.py| 59 +++ 8 files changed, 360 insertions(+), 5 deletions(-) create mode 100644 dts/framework/testbed_model/capturing_traffic_generator.py create mode 100644 dts/framework/testbed_model/hw/port.py create mode 100644 dts/framework/testbed_model/tg_node.py create mode 100644 dts/framework/testbed_model/traffic_generator.py diff --git a/dts/framework/remote_session/os_session.py b/dts/framework/remote_session/os_session.py index 4c48ae2567..56d7fef06c 100644 --- a/dts/framework/remote_session/os_session.py +++ b/dts/framework/remote_session/os_session.py @@ -10,6 +10,7 @@ from framework.logger import DTSLOG from framework.settings import SETTINGS from framework.testbed_model import LogicalCore +from framework.testbed_model.hw.port import PortIdentifier from framework.utils import EnvVarsDict, MesonArgs from .remote import CommandResult, RemoteSession, create_remote_session @@ -37,6 +38,7 @@ def __init__( self.name = name self._logger = logger self.remote_session = create_remote_session(node_config, name, logger) +self._disable_terminal_colors() def close(self, force: bool = False) -> None: """ @@ -53,7 +55,7 @@ def is_alive(self) -> bool: def send_command( self, command: str, -timeout: float, +timeout: float = SETTINGS.timeout, verify: bool = False, env: EnvVarsDict | None = None, ) -> CommandResult: """ 
return self.remote_session.send_command(command, timeout, verify, env) +@abstractmethod +def _disable_terminal_colors(self) -> None: +""" +Disable the colors in the ssh session. +""" + @abstractmethod def guess_dpdk_remote_dir(self, remote_dir) -> PurePath: """ @@ -173,3 +181,15 @@ def setup_hugepages(self, hugepage_amount: int, force_first_numa: bool) -> None: if needed and mount the hugepages if needed. If force_first_numa is True, configure hugepages just on the first socket. """ + +@abstractmethod +def get_logical_name_of_port(self, id: PortIdentifier) -> str | None: +""" +Gets the logical name (eno1, ens5, etc) of a port by the port's identifier. +""" + +@abstractmethod +def check_link_is_up(self, id: PortIdentifier) -> bool: +""" +Check that the link is up. +""" diff --git a/dts/framework/remote_session/posix_session.py b/dts/framework/remote_session/posix_session.py index d38062e8d6..288fbabf1e 100644 --- a/dts/framework/remote_session/posix_session.py +++ b/dts/framework/remote_session/posix_session.py @@ -219,3 +219,6 @@ def _remove_dpdk_runtime_dirs( def get_dpdk_file_prefix(self, dpdk_prefix) -> str: return "" + +def _disable_terminal_colors(self) -> None: +self.remote_session.send_command("export TERM=xterm-mono") diff --git a/dts/framework/testbed_model/capturing_traffic_generator.py b/dts/framework/testbed_model/capturing_traffic_generator.py new file mode 100644 index 00..7beeb139c1 --- /dev/null +++ b/dts/framework/testbed_model/capturing_traffic_generator.py @@ -0,0 +1,155 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2022 University of New Hampshire +# + +import itertools +import uuid +from abc import abstractmethod + +import scapy.utils +from scapy.packet import Packet + +from framework.testbed_model.hw.port import PortIdentifier +from framework.settings import SETTINGS + +from .traffic_generator import TrafficGenerator + + +def _get_default_capture_name() -> str: +""" +This is the function used for the default implementation of 
capture names. +""" +return str(uuid.uuid4()) + + +class CapturingTrafficGenerator(TrafficGenerator): +""" +A mixin interface which enables a packet generator to declare that it can capture +packets and return them to the user. + +All packet functions added by this class should write out the captured packets +to a pcap file in output, allowing for easier analysis of failed tests. +""" + +def is_capturing(self) -> bool: +return True + +@abstractmethod +def send_packet_and_capture( +self, +sen
[RFC PATCH v1 4/5] dts: scapy traffic generator implementation
Scapy is a traffic generator capable of sending and receiving traffic. Since it's a software traffic generator, it's not suitable for performance testing, but it is suitable for functional testing. Signed-off-by: Juraj Linkeš --- dts/framework/remote_session/linux_session.py | 55 +++ .../remote_session/remote/remote_session.py | 7 + dts/framework/testbed_model/scapy.py | 348 ++ 3 files changed, 410 insertions(+) create mode 100644 dts/framework/testbed_model/scapy.py diff --git a/dts/framework/remote_session/linux_session.py b/dts/framework/remote_session/linux_session.py index a1e3bc3a92..b99a27bba4 100644 --- a/dts/framework/remote_session/linux_session.py +++ b/dts/framework/remote_session/linux_session.py @@ -2,13 +2,29 @@ # Copyright(c) 2023 PANTHEON.tech s.r.o. # Copyright(c) 2023 University of New Hampshire +import json +from typing import TypedDict +from typing_extensions import NotRequired + + from framework.exception import RemoteCommandExecutionError from framework.testbed_model import LogicalCore +from framework.testbed_model.hw.port import PortIdentifier from framework.utils import expand_range from .posix_session import PosixSession +class LshwOutputConfigurationDict(TypedDict): +link: str + + +class LshwOutputDict(TypedDict): +businfo: str +logicalname: NotRequired[str] +configuration: LshwOutputConfigurationDict + + class LinuxSession(PosixSession): """ The implementation of non-Posix compliant parts of Linux remote sessions. 
@@ -105,3 +121,42 @@ def _configure_huge_pages( self.remote_session.send_command( f"echo {amount} | sudo tee {hugepage_config_path}" ) + +def get_lshw_info(self) -> list[LshwOutputDict]: +output = self.remote_session.send_expect("lshw -quiet -json -C network", "#") +assert not isinstance( +output, int +), "send_expect returned an int when it should have been a string" +return json.loads(output) + +def get_logical_name_of_port(self, id: PortIdentifier) -> str | None: +self._logger.debug(f"Searching for logical name of {id.pci}") +assert ( +id.node == self.name +), "Attempted to get the logical port name on the wrong node" +port_info_list: list[LshwOutputDict] = self.get_lshw_info() +for port_info in port_info_list: +if f"pci@{id.pci}" == port_info.get("businfo"): +if "logicalname" in port_info: +self._logger.debug( +f"Found logical name for port {id.pci}, {port_info.get('logicalname')}" +) +return port_info.get("logicalname") +else: +self._logger.warning( +f"Attempted to get the logical name of {id.pci}, but none existed" +) +return None +self._logger.warning(f"No port at pci address {id.pci} found.") +return None + +def check_link_is_up(self, id: PortIdentifier) -> bool | None: +self._logger.debug(f"Checking link status for {id.pci}") +port_info_list: list[LshwOutputDict] = self.get_lshw_info() +for port_info in port_info_list: +if f"pci@{id.pci}" == port_info.get("businfo"): +status = port_info["configuration"]["link"] +self._logger.debug(f"Found link status for port {id.pci}, {status}") +return status == "up" +self._logger.warning(f"No port at pci address {id.pci} found.") +return None diff --git a/dts/framework/remote_session/remote/remote_session.py b/dts/framework/remote_session/remote/remote_session.py index 91dee3cb4f..5b36e2d7d2 100644 --- a/dts/framework/remote_session/remote/remote_session.py +++ b/dts/framework/remote_session/remote/remote_session.py @@ -84,6 +84,13 @@ def _connect(self) -> None: Create connection to assigned node. 
""" +@abstractmethod +def send_expect( +self, command: str, prompt: str, timeout: float = 15, +verify: bool = False +) -> str | int: +"" + def send_command( self, command: str, diff --git a/dts/framework/testbed_model/scapy.py b/dts/framework/testbed_model/scapy.py new file mode 100644 index 00..1e5caab897 --- /dev/null +++ b/dts/framework/testbed_model/scapy.py @@ -0,0 +1,348 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2022 University of New Hampshire +# + +import inspect +import json +import marshal +import types +import xmlrpc.client +from typing import TypedDict +from xmlrpc.server import SimpleXMLRPCServer + +import scapy.all +from scapy.packet import Packet +from typing_extensions import NotRequired + +from framework.config import OS +from framework.logger import getLogger +from .tg_node import TGNode +from .hw.port import Port, PortIdentifier +from .capturing_traffic_generator
[RFC PATCH v1 5/5] dts: add traffic generator node to dts runner
Initialize the TG node and do basic verification. Signed-off-by: Juraj Linkeš --- dts/framework/dts.py| 42 - dts/framework/testbed_model/__init__.py | 1 + 2 files changed, 28 insertions(+), 15 deletions(-) diff --git a/dts/framework/dts.py b/dts/framework/dts.py index 0502284580..9c82bfe1f4 100644 --- a/dts/framework/dts.py +++ b/dts/framework/dts.py @@ -9,7 +9,7 @@ from .logger import DTSLOG, getLogger from .test_result import BuildTargetResult, DTSResult, ExecutionResult, Result from .test_suite import get_test_suites -from .testbed_model import SutNode +from .testbed_model import SutNode, TGNode, Node from .utils import check_dts_python_version dts_logger: DTSLOG = getLogger("DTSRunner") @@ -27,28 +27,40 @@ def run_all() -> None: # check the python version of the server that run dts check_dts_python_version() -nodes: dict[str, SutNode] = {} +nodes: dict[str, Node] = {} try: # for all Execution sections for execution in CONFIGURATION.executions: sut_node = None +tg_node = None if execution.system_under_test.name in nodes: # a Node with the same name already exists sut_node = nodes[execution.system_under_test.name] -else: -# the SUT has not been initialized yet -try: + +if execution.traffic_generator_system.name in nodes: +# a Node with the same name already exists +tg_node = nodes[execution.traffic_generator_system.name] + +try: +if not sut_node: sut_node = SutNode(execution.system_under_test) -result.update_setup(Result.PASS) -except Exception as e: -dts_logger.exception( -f"Connection to node {execution.system_under_test} failed." -) -result.update_setup(Result.FAIL, e) -else: -nodes[sut_node.name] = sut_node - -if sut_node: +if not tg_node: +tg_node = TGNode(execution.traffic_generator_system) +tg_node.verify() +result.update_setup(Result.PASS) +except Exception as e: +failed_node = execution.system_under_test.name +if sut_node: +failed_node = execution.traffic_generator_system.name +dts_logger.exception( +f"Creation of node {failed_node} failed." 
+) +result.update_setup(Result.FAIL, e) +else: +nodes[sut_node.name] = sut_node +nodes[tg_node.name] = tg_node + +if sut_node and tg_node: _run_execution(sut_node, execution, result) except Exception as e: diff --git a/dts/framework/testbed_model/__init__.py b/dts/framework/testbed_model/__init__.py index f54a947051..5cbb859e47 100644 --- a/dts/framework/testbed_model/__init__.py +++ b/dts/framework/testbed_model/__init__.py @@ -20,3 +20,4 @@ ) from .node import Node from .sut_node import SutNode +from .tg_node import TGNode -- 2.30.2
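The runner changes above hinge on caching nodes by name so that executions sharing a SUT or TG configuration reuse one already-connected node. A minimal sketch of that caching pattern, with a simplified stand-in `Node` class and a hypothetical execution list:

```python
class Node:
    """Simplified stand-in for the framework's Node/SutNode/TGNode."""
    def __init__(self, name: str):
        self.name = name

created: list[str] = []

def get_or_create_node(nodes: dict[str, Node], name: str) -> Node:
    node = nodes.get(name)
    if node is None:
        node = Node(name)        # the expensive step: opens a remote session
        created.append(name)
        nodes[name] = node
    return node

nodes: dict[str, Node] = {}
executions = [("sut1", "tg1"), ("sut1", "tg2")]   # hypothetical config
for sut_name, tg_name in executions:
    sut_node = get_or_create_node(nodes, sut_name)
    tg_node = get_or_create_node(nodes, tg_name)

print(created)  # each node created once: ['sut1', 'tg1', 'tg2']
```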
[PATCH] crypto/qat: support to enable insecure algorithms
All insecure algorithms are disabled by default in cryptodev. Use qat_legacy_capa to enable the legacy algorithms. Signed-off-by: Vikash Poddar --- drivers/common/qat/qat_device.c | 1 + drivers/common/qat/qat_device.h | 3 +- drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c | 90 drivers/crypto/qat/qat_crypto.h | 1 + drivers/crypto/qat/qat_sym.c | 3 + 5 files changed, 62 insertions(+), 36 deletions(-) diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c index 8bce2ac073..b8da684973 100644 --- a/drivers/common/qat/qat_device.c +++ b/drivers/common/qat/qat_device.c @@ -365,6 +365,7 @@ static int qat_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct qat_pci_device *qat_pci_dev; struct qat_dev_hw_spec_funcs *ops_hw; struct qat_dev_cmd_param qat_dev_cmd_param[] = { + { QAT_LEGACY_CAPA, 0 }, { QAT_IPSEC_MB_LIB, 0 }, { SYM_ENQ_THRESHOLD_NAME, 0 }, { ASYM_ENQ_THRESHOLD_NAME, 0 }, diff --git a/drivers/common/qat/qat_device.h b/drivers/common/qat/qat_device.h index bc3da04238..12b8cc46b1 100644 --- a/drivers/common/qat/qat_device.h +++ b/drivers/common/qat/qat_device.h @@ -17,12 +17,13 @@ #define QAT_DEV_NAME_MAX_LEN 64 +#define QAT_LEGACY_CAPA "qat_legacy_capa" #define QAT_IPSEC_MB_LIB "qat_ipsec_mb_lib" #define SYM_ENQ_THRESHOLD_NAME "qat_sym_enq_threshold" #define ASYM_ENQ_THRESHOLD_NAME "qat_asym_enq_threshold" #define COMP_ENQ_THRESHOLD_NAME "qat_comp_enq_threshold" #define QAT_CMD_SLICE_MAP "qat_cmd_slice_disable" -#define QAT_CMD_SLICE_MAP_POS 4 +#define QAT_CMD_SLICE_MAP_POS 5 #define MAX_QP_THRESHOLD_SIZE 32 /** diff --git a/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c b/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c index 60ca0fc0d2..3cd1c42d94 100644 --- a/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c +++ b/drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c @@ -12,10 +12,39 @@ #define MIXED_CRYPTO_MIN_FW_VER 0x0409 -static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { +static struct rte_cryptodev_capabilities 
qat_sym_crypto_legacy_caps_gen2[] = { + QAT_SYM_CIPHER_CAP(DES_CBC, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(3DES_CBC, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(3DES_CTR, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 16, 24, 8), CAP_RNG(iv_size, 8, 8, 0)), + QAT_SYM_CIPHER_CAP(DES_DOCSISBPI, + CAP_SET(block_size, 8), + CAP_RNG(key_size, 8, 8, 0), CAP_RNG(iv_size, 8, 8, 0)), QAT_SYM_PLAIN_AUTH_CAP(SHA1, CAP_SET(block_size, 64), CAP_RNG(digest_size, 1, 20, 1)), + QAT_SYM_AUTH_CAP(SHA224, + CAP_SET(block_size, 64), + CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 28, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + QAT_SYM_AUTH_CAP(SHA224_HMAC, + CAP_SET(block_size, 64), + CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 28, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + QAT_SYM_AUTH_CAP(SHA1_HMAC, + CAP_SET(block_size, 64), + CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 20, 1), + CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), + RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST() +}; + + +static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { QAT_SYM_AEAD_CAP(AES_GCM, CAP_SET(block_size, 16), CAP_RNG(key_size, 16, 32, 8), CAP_RNG(digest_size, 8, 16, 4), @@ -32,10 +61,6 @@ static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { CAP_SET(block_size, 16), CAP_RNG(key_size, 16, 16, 0), CAP_RNG(digest_size, 4, 16, 4), CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), - QAT_SYM_AUTH_CAP(SHA224, - CAP_SET(block_size, 64), - CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 28, 1), - CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO(iv_size)), QAT_SYM_AUTH_CAP(SHA256, CAP_SET(block_size, 64), CAP_RNG_ZERO(key_size), CAP_RNG(digest_size, 1, 32, 1), @@ -51,14 +76,6 @@ static struct rte_cryptodev_capabilities qat_sym_crypto_caps_gen2[] = { QAT_SYM_PLAIN_AUTH_CAP(SHA3_256, CAP_SET(block_size, 136), CAP_RNG(digest_size, 
32, 32, 0)), - QAT_SYM_AUTH_CAP(SHA1_HMAC, - CAP_SET(block_size, 64), - CAP_RNG(key_size, 1, 64, 1), CAP_RNG(digest_size, 1, 20, 1), - CAP_RNG_ZERO(aad_size), CAP_RNG_ZERO
[RFC 1/5] app/testpmd: add trace dump command
The "dump_trace" CLI command is added to trigger saving the trace dumps to the trace directory. Signed-off-by: Viacheslav Ovsiienko --- app/test-pmd/cmdline.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 7b20bef4e9..be9e3a9ed6 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -39,6 +39,7 @@ #include #endif #include +#include #include #include @@ -8367,6 +8368,8 @@ static void cmd_dump_parsed(void *parsed_result, rte_lcore_dump(stdout); else if (!strcmp(res->dump, "dump_log_types")) rte_log_dump(stdout); + else if (!strcmp(res->dump, "dump_trace")) + rte_trace_save(); } static cmdline_parse_token_string_t cmd_dump_dump = @@ -8379,7 +8382,8 @@ static cmdline_parse_token_string_t cmd_dump_dump = "dump_mempool#" "dump_devargs#" "dump_lcores#" - "dump_log_types"); + "dump_log_types#" + "dump_trace"); static cmdline_parse_inst_t cmd_dump = { .f = cmd_dump_parsed, /* function to call */ -- 2.18.1
[RFC 0/5] net/mlx5: introduce Tx datapath tracing
The mlx5 PMD provides send scheduling at a specific moment in time, and for the related kind of applications it would be extremely useful to have extra debug information - when and how packets were scheduled and when the actual sending was completed by the NIC hardware (it helps the application to track internal delay issues). Because the DPDK Tx datapath API does not provide for any feedback from the driver and the feature looks to be mlx5-specific, it seems reasonable to engage the existing DPDK datapath tracing capability. The work cycle is supposed to be: - compile the application with tracing enabled - run the application with EAL parameters configuring the tracing in the mlx5 Tx datapath - store the dump file with the gathered tracing information - run the analyzing script (in Python) to combine related events (packet firing and completion) and see the data in a human-readable view Below is a detailed "how to" for gathering all the debug data, including the full timings information, with an mlx5 NIC. 1. Build the DPDK application with datapath tracing enabled The meson option should be specified: -Denable_trace_fp=true The c_args should be specified: -DALLOW_EXPERIMENTAL_API The DPDK configuration examples: meson configure --buildtype=debug -Denable_trace_fp=true -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=debug -Denable_trace_fp=true -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=release -Denable_trace_fp=true -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build meson configure --buildtype=release -Denable_trace_fp=true -Dc_args='-DALLOW_EXPERIMENTAL_API' build 2. 
Configuring the NIC If the send completion timings are important, the NIC should be configured to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV settings parameter should be configured to TRUE, for example with the following command (followed by an FW/driver reset): sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1 3. Run the DPDK application to gather the traces EAL parameters controlling the trace capability at runtime: --trace=pmd.net.mlx5.tx - the regular expression enabling tracepoints with matching names. At least "pmd.net.mlx5.tx" must be enabled to gather all events needed to analyze the mlx5 Tx datapath and its timings. By default all tracepoints are disabled. --trace-dir=/var/log - trace storing directory --trace-bufsz=B|K|M - optional, trace data buffer size per thread. The default is 1MB. --trace-mode=overwrite|discard - optional, selects trace data buffer mode. 4. Installing or Building the Babeltrace2 Package The gathered trace data can be analyzed with the provided Python script. To parse the trace data, the script uses the Babeltrace2 library. The package should be either installed or built from source code as shown below: git clone https://github.com/efficios/babeltrace.git cd babeltrace ./bootstrap ./configure --disable-api-doc --disable-man-pages --disable-python-bindings-doc --enable-python-plugins --enable-python-binding 5. Running the Analyzing Script The analyzing script is located in the folder: ./drivers/net/mlx5/tools It requires Python 3.6 and the Babeltrace2 package, and takes the trace data file as its only parameter. For example: ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39 6. Interpreting the Script Output Data All the timings are given in nanoseconds. The list of Tx (and upcoming Rx) bursts per port/queue is presented in the output. Each list element contains the list of built WQEs with specific opcodes, and each WQE contains the list of the encompassed packets to send. 
Signed-off-by: Viacheslav Ovsiienko Viacheslav Ovsiienko (5): app/testpmd: add trace dump command common/mlx5: introduce tracepoints for mlx5 drivers net/mlx5: add Tx datapath tracing net/mlx5: add comprehensive send completion trace net/mlx5: add Tx datapath trace analyzing script app/test-pmd/cmdline.c | 6 +- drivers/common/mlx5/meson.build | 1 + drivers/common/mlx5/mlx5_trace.c | 25 +++ drivers/common/mlx5/mlx5_trace.h | 72 +++ drivers/common/mlx5/version.map | 8 + drivers/net/mlx5/linux/mlx5_verbs.c | 8 +- drivers/net/mlx5/mlx5_devx.c | 8 +- drivers/net/mlx5/mlx5_rx.h | 19 -- drivers/net/mlx5/mlx5_rxtx.h | 19 ++ drivers/net/mlx5/mlx5_tx.c | 9 + drivers/net/mlx5/mlx5_tx.h | 88 - drivers/net/mlx5/tools/mlx5_trace.py | 271 +++ 12 files changed, 504 insertions(+), 30 deletions(-) create mode 100644 drivers/common/mlx5/mlx5_trace.c create mode 100644 drivers/c
[RFC 2/5] common/mlx5: introduce tracepoints for mlx5 drivers
There is an intention to engage DPDK tracing capabilities for mlx5 PMDs monitoring and profiling in various modes. The patch introduces tracepoints for the Tx datapath in the ethernet device driver. Signed-off-by: Viacheslav Ovsiienko --- drivers/common/mlx5/meson.build | 1 + drivers/common/mlx5/mlx5_trace.c | 25 +++ drivers/common/mlx5/mlx5_trace.h | 72 drivers/common/mlx5/version.map | 8 4 files changed, 106 insertions(+) create mode 100644 drivers/common/mlx5/mlx5_trace.c create mode 100644 drivers/common/mlx5/mlx5_trace.h diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build index 9dc809f192..e074ffb140 100644 --- a/drivers/common/mlx5/meson.build +++ b/drivers/common/mlx5/meson.build @@ -19,6 +19,7 @@ sources += files( 'mlx5_common_mp.c', 'mlx5_common_mr.c', 'mlx5_malloc.c', +'mlx5_trace.c', 'mlx5_common_pci.c', 'mlx5_common_devx.c', 'mlx5_common_utils.c', diff --git a/drivers/common/mlx5/mlx5_trace.c b/drivers/common/mlx5/mlx5_trace.c new file mode 100644 index 00..b9f14413ad --- /dev/null +++ b/drivers/common/mlx5/mlx5_trace.c @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2022 NVIDIA Corporation & Affiliates + */ + +#include +#include + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_entry, + pmd.net.mlx5.tx.entry) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_exit, + pmd.net.mlx5.tx.exit) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wqe, + pmd.net.mlx5.tx.wqe) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wait, + pmd.net.mlx5.tx.wait) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_push, + pmd.net.mlx5.tx.push) + +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_complete, + pmd.net.mlx5.tx.complete) + diff --git a/drivers/common/mlx5/mlx5_trace.h b/drivers/common/mlx5/mlx5_trace.h new file mode 100644 index 00..57512e654f --- /dev/null +++ b/drivers/common/mlx5/mlx5_trace.h @@ -0,0 +1,72 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright (c) 2022 NVIDIA Corporation & Affiliates + 
*/ + +#ifndef RTE_PMD_MLX5_TRACE_H_ +#define RTE_PMD_MLX5_TRACE_H_ + +/** + * @file + * + * API for mlx5 PMD trace support + */ + +#ifdef __cplusplus +extern "C" { +#endif + +#include +#include +#include + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_entry, + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id), + rte_trace_point_emit_u16(port_id); + rte_trace_point_emit_u16(queue_id); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_exit, + RTE_TRACE_POINT_ARGS(uint16_t nb_sent, uint16_t nb_req), + rte_trace_point_emit_u16(nb_sent); + rte_trace_point_emit_u16(nb_req); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_wqe, + RTE_TRACE_POINT_ARGS(uint32_t opcode), + rte_trace_point_emit_u32(opcode); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_wait, + RTE_TRACE_POINT_ARGS(uint64_t ts), + rte_trace_point_emit_u64(ts); +) + + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_push, + RTE_TRACE_POINT_ARGS(const struct rte_mbuf *mbuf, uint16_t wqe_id), + rte_trace_point_emit_ptr(mbuf); + rte_trace_point_emit_u32(mbuf->pkt_len); + rte_trace_point_emit_u16(mbuf->nb_segs); + rte_trace_point_emit_u16(wqe_id); +) + +RTE_TRACE_POINT_FP( + rte_pmd_mlx5_trace_tx_complete, + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id, +uint16_t wqe_id, uint64_t ts), + rte_trace_point_emit_u16(port_id); + rte_trace_point_emit_u16(queue_id); + rte_trace_point_emit_u64(ts); + rte_trace_point_emit_u16(wqe_id); +) + +#ifdef __cplusplus +} +#endif + +#endif /* RTE_PMD_MLX5_TRACE_H_ */ diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index e05e1aa8c5..d0ec8571e6 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -158,5 +158,13 @@ INTERNAL { mlx5_os_interrupt_handler_create; # WINDOWS_NO_EXPORT mlx5_os_interrupt_handler_destroy; # WINDOWS_NO_EXPORT + + __rte_pmd_mlx5_trace_tx_entry; + __rte_pmd_mlx5_trace_tx_exit; + __rte_pmd_mlx5_trace_tx_wqe; + __rte_pmd_mlx5_trace_tx_wait; + __rte_pmd_mlx5_trace_tx_push; + 
__rte_pmd_mlx5_trace_tx_complete; + local: *; }; -- 2.18.1
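The tracepoints registered above come in pairs per burst: pmd.net.mlx5.tx.entry at tx_burst start and pmd.net.mlx5.tx.exit at its end, which lets a post-processor compute per-burst durations by pairing the two events per CPU. A sketch on synthetic events — the event dicts and field names here are simplified stand-ins for what a real trace reader yields:

```python
events = [  # synthetic events in timestamp order, one CPU
    {"name": "pmd.net.mlx5.tx.entry", "cpu_id": 0, "ts": 1000},
    {"name": "pmd.net.mlx5.tx.wqe",   "cpu_id": 0, "ts": 1010},
    {"name": "pmd.net.mlx5.tx.exit",  "cpu_id": 0, "ts": 1500},
]

open_bursts: dict[int, int] = {}   # cpu_id -> entry timestamp
durations: list[int] = []

for ev in events:
    if ev["name"] == "pmd.net.mlx5.tx.entry":
        open_bursts[ev["cpu_id"]] = ev["ts"]
    elif ev["name"] == "pmd.net.mlx5.tx.exit":
        start = open_bursts.pop(ev["cpu_id"], None)
        if start is not None:
            durations.append(ev["ts"] - start)

print(durations)  # [500]
```

Patch 5/5 applies the same per-CPU pairing idea, keyed by `cpu_id`, in mlx5_trace.py.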
[RFC 3/5] net/mlx5: add Tx datapath tracing
The patch adds tracing capability to the Tx datapath. To engage this tracing capability, the following steps should be taken: - meson option -Denable_trace_fp=true - meson option -Dc_args='-DALLOW_EXPERIMENTAL_API' - EAL command line parameter --trace=pmd.net.mlx5.tx.* The Tx datapath tracing allows getting information on how packets are pushed into hardware descriptors, timestamping for scheduled wait and send completions, etc. To provide a human-readable form of the trace results, a dedicated post-processing script is provided. Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/mlx5_rx.h | 19 --- drivers/net/mlx5/mlx5_rxtx.h | 19 +++ drivers/net/mlx5/mlx5_tx.c | 9 + drivers/net/mlx5/mlx5_tx.h | 25 +++-- 4 files changed, 51 insertions(+), 21 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h index 8b87adad36..1b5f110ccc 100644 --- a/drivers/net/mlx5/mlx5_rx.h +++ b/drivers/net/mlx5/mlx5_rx.h @@ -376,25 +376,6 @@ mlx5_rx_mb2mr(struct mlx5_rxq_data *rxq, struct rte_mbuf *mb) return mlx5_mr_mempool2mr_bh(mr_ctrl, mb->pool, addr); } -/** - * Convert timestamp from HW format to linear counter - * from Packet Pacing Clock Queue CQE timestamp format. - * - * @param sh - * Pointer to the device shared context. Might be needed - * to convert according current device configuration. - * @param ts - * Timestamp from CQE to convert. - * @return - * UTC in nanoseconds - */ -static __rte_always_inline uint64_t -mlx5_txpp_convert_rx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t ts) -{ - RTE_SET_USED(sh); - return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S; -} - /** * Set timestamp in mbuf dynamic field. 
* diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index 876aa14ae6..b109d50758 100644 --- a/drivers/net/mlx5/mlx5_rxtx.h +++ b/drivers/net/mlx5/mlx5_rxtx.h @@ -43,4 +43,23 @@ int mlx5_queue_state_modify_primary(struct rte_eth_dev *dev, int mlx5_queue_state_modify(struct rte_eth_dev *dev, struct mlx5_mp_arg_queue_state_modify *sm); +/** + * Convert timestamp from HW format to linear counter + * from Packet Pacing Clock Queue CQE timestamp format. + * + * @param sh + * Pointer to the device shared context. Might be needed + * to convert according current device configuration. + * @param ts + * Timestamp from CQE to convert. + * @return + * UTC in nanoseconds + */ +static __rte_always_inline uint64_t +mlx5_txpp_convert_rx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t ts) +{ + RTE_SET_USED(sh); + return (ts & UINT32_MAX) + (ts >> 32) * NS_PER_S; +} + #endif /* RTE_PMD_MLX5_RXTX_H_ */ diff --git a/drivers/net/mlx5/mlx5_tx.c b/drivers/net/mlx5/mlx5_tx.c index 14e1487e59..1fe9521dfc 100644 --- a/drivers/net/mlx5/mlx5_tx.c +++ b/drivers/net/mlx5/mlx5_tx.c @@ -232,6 +232,15 @@ mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq, MLX5_ASSERT((txq->fcqs[txq->cq_ci & txq->cqe_m] >> 16) == cqe->wqe_counter); #endif + if (__rte_trace_point_fp_is_enabled()) { + uint64_t ts = rte_be_to_cpu_64(cqe->timestamp); + uint16_t wqe_id = rte_be_to_cpu_16(cqe->wqe_counter); + + if (txq->rt_timestamp) + ts = mlx5_txpp_convert_rx_ts(NULL, ts); + rte_pmd_mlx5_trace_tx_complete(txq->port_id, txq->idx, + wqe_id, ts); + } ring_doorbell = true; ++txq->cq_ci; last_cqe = cqe; diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h index cc8f7e98aa..7f624de58e 100644 --- a/drivers/net/mlx5/mlx5_tx.h +++ b/drivers/net/mlx5/mlx5_tx.h @@ -19,6 +19,8 @@ #include "mlx5.h" #include "mlx5_autoconf.h" +#include "mlx5_trace.h" +#include "mlx5_rxtx.h" /* TX burst subroutines return codes. 
*/ enum mlx5_txcmp_code { @@ -764,6 +766,9 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq, cs->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR << MLX5_COMP_MODE_OFFSET); cs->misc = RTE_BE32(0); + if (__rte_trace_point_fp_is_enabled() && !loc->pkts_sent) + rte_pmd_mlx5_trace_tx_entry(txq->port_id, txq->idx); + rte_pmd_mlx5_trace_tx_wqe((txq->wqe_ci << 8) | opcode); } /** @@ -1692,6 +1697,7 @@ mlx5_tx_schedule_send(struct mlx5_txq_data *restrict txq, if (txq->wait_on_time) { /* The wait on time capability should be used. */ ts -= sh->txpp.skew; + rte_pmd_mlx5_trace_tx_wait(ts); mlx5_tx_cseg_init(txq, loc, wqe, 1 + sizeof(struct mlx5_wqe_wseg) / MLX5_WSEG_SIZE, @@ -1706,6 +17
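The mlx5_txpp_convert_rx_ts() helper that this patch moves to mlx5_rxtx.h converts the Packet Pacing CQE timestamp to a linear nanosecond counter; its formula implies the hardware packs seconds in the high 32 bits and nanoseconds in the low 32 bits. A quick Python restatement of the same arithmetic:

```python
NS_PER_S = 1_000_000_000

def txpp_convert_ts(ts: int) -> int:
    # Mirrors mlx5_txpp_convert_rx_ts(): low 32 bits are nanoseconds,
    # high 32 bits are seconds, result is a linear ns counter.
    return (ts & 0xFFFFFFFF) + (ts >> 32) * NS_PER_S

ts = (2 << 32) | 5          # 2 seconds, 5 ns in the packed CQE format
print(txpp_convert_ts(ts))  # 2000000005
```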
[RFC 4/5] net/mlx5: add comprehensive send completion trace
There is a demand to trace the send completion of every WQE if time scheduling is enabled. The patch extends the size of the completion queue and requests a completion on every issued WQE in the send queue. As a result, the hardware provides a CQE for each completed WQE and the driver is able to fetch the completion timestamp for the dedicated operation. The added code is under the RTE_ENABLE_TRACE_FP conditional compilation flag and does not impact the release code. Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/linux/mlx5_verbs.c | 8 +++- drivers/net/mlx5/mlx5_devx.c| 8 +++- drivers/net/mlx5/mlx5_tx.h | 63 +++-- 3 files changed, 71 insertions(+), 8 deletions(-) diff --git a/drivers/net/mlx5/linux/mlx5_verbs.c b/drivers/net/mlx5/linux/mlx5_verbs.c index 67a7bec22b..f3f717f17b 100644 --- a/drivers/net/mlx5/linux/mlx5_verbs.c +++ b/drivers/net/mlx5/linux/mlx5_verbs.c @@ -968,8 +968,12 @@ mlx5_txq_ibv_obj_new(struct rte_eth_dev *dev, uint16_t idx) rte_errno = EINVAL; return -rte_errno; } - cqe_n = desc / MLX5_TX_COMP_THRESH + - 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; + if (__rte_trace_point_fp_is_enabled() && + txq_data->offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) + cqe_n = UINT16_MAX / 2 - 1; + else + cqe_n = desc / MLX5_TX_COMP_THRESH + + 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; txq_obj->cq = mlx5_glue->create_cq(priv->sh->cdev->ctx, cqe_n, NULL, NULL, 0); if (txq_obj->cq == NULL) { diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c index 4369d2557e..5082a7e178 100644 --- a/drivers/net/mlx5/mlx5_devx.c +++ b/drivers/net/mlx5/mlx5_devx.c @@ -1465,8 +1465,12 @@ mlx5_txq_devx_obj_new(struct rte_eth_dev *dev, uint16_t idx) MLX5_ASSERT(ppriv); txq_obj->txq_ctrl = txq_ctrl; txq_obj->dev = dev; - cqe_n = (1UL << txq_data->elts_n) / MLX5_TX_COMP_THRESH + - 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; + if (__rte_trace_point_fp_is_enabled() && + txq_data->offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) + cqe_n = UINT16_MAX / 2 - 1; + else + cqe_n = (1UL << txq_data->elts_n) / 
MLX5_TX_COMP_THRESH + + 1 + MLX5_TX_COMP_THRESH_INLINE_DIV; log_desc_n = log2above(cqe_n); cqe_n = 1UL << log_desc_n; if (cqe_n > UINT16_MAX) { diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h index 7f624de58e..9f29df280f 100644 --- a/drivers/net/mlx5/mlx5_tx.h +++ b/drivers/net/mlx5/mlx5_tx.h @@ -728,6 +728,54 @@ mlx5_tx_request_completion(struct mlx5_txq_data *__rte_restrict txq, } } +/** + * Set completion request flag for all issued WQEs. + * This routine is intended to be used with enabled fast path tracing + * and send scheduling on time to provide the detailed report in trace + * for send completions on every WQE. + * + * @param txq + * Pointer to TX queue structure. + * @param loc + * Pointer to burst routine local context. + * @param olx + * Configured Tx offloads mask. It is fully defined at + * compile time and may be used for optimization. + */ +static __rte_always_inline void +mlx5_tx_request_completion_trace(struct mlx5_txq_data *__rte_restrict txq, +struct mlx5_txq_local *__rte_restrict loc, +unsigned int olx) +{ + uint16_t head = txq->elts_comp; + + while (txq->wqe_comp != txq->wqe_ci) { + volatile struct mlx5_wqe *wqe; + uint32_t wqe_n; + + MLX5_ASSERT(loc->wqe_last); + wqe = txq->wqes + (txq->wqe_comp & txq->wqe_m); + if (wqe == loc->wqe_last) { + head = txq->elts_head; + head += MLX5_TXOFF_CONFIG(INLINE) ? + 0 : loc->pkts_sent - loc->pkts_copy; + txq->elts_comp = head; + } + /* Completion request flag was set on cseg constructing. */ +#ifdef RTE_LIBRTE_MLX5_DEBUG + txq->fcqs[txq->cq_pi++ & txq->cqe_m] = head | + (wqe->cseg.opcode >> 8) << 16; +#else + txq->fcqs[txq->cq_pi++ & txq->cqe_m] = head; +#endif + /* A CQE slot must always be available. */ + MLX5_ASSERT((txq->cq_pi - txq->cq_ci) <= txq->cqe_s); + /* Advance to the next WQE in the queue. 
*/ + wqe_n = rte_be_to_cpu_32(wqe->cseg.sq_ds) & 0x3F; + txq->wqe_comp += RTE_ALIGN(wqe_n, 4) / 4; + } +} + /** * Build the Control Segment with specified opcode: * - MLX5_OPCODE_SEND @@ -754,7 +802,7 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq, struct mlx5_wqe *__rte_restrict wqe, unsigned int ds, unsigned i
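To correlate these per-WQE completions, the trace side packs the WQE index and opcode into one value ((wqe_ci << 8) | opcode, as emitted via rte_pmd_mlx5_trace_tx_wqe in patch 3/5), and the analysis script of patch 5/5 unpacks it again. A sketch of both directions in Python (the helper names here are mine, not from the patch):

```python
def pack_wqe_event(wqe_ci: int, opcode: int) -> int:
    # What the driver emits: rte_pmd_mlx5_trace_tx_wqe((wqe_ci << 8) | opcode)
    return ((wqe_ci & 0xFFFF) << 8) | (opcode & 0xFF)

def unpack_wqe_event(value: int) -> tuple[int, int]:
    # What mlx5_trace.py recovers: WQE id from bits 8..23, opcode from bits 0..7
    return (value >> 8) & 0xFFFF, value & 0xFF

packed = pack_wqe_event(0x1234, 0x0A)  # 0x0A is SEND in the script's opcode table
print(unpack_wqe_event(packed))        # (4660, 10)
```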
[RFC 5/5] net/mlx5: add Tx datapath trace analyzing script
The Python script is intended to analyze mlx5 PMD datapath traces and report: - tx_burst routine timings - how packets are pushed to WQEs - how packet sending is completed with timings Signed-off-by: Viacheslav Ovsiienko --- drivers/net/mlx5/tools/mlx5_trace.py | 271 +++ 1 file changed, 271 insertions(+) create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py diff --git a/drivers/net/mlx5/tools/mlx5_trace.py b/drivers/net/mlx5/tools/mlx5_trace.py new file mode 100755 index 00..c8fa63a7b9 --- /dev/null +++ b/drivers/net/mlx5/tools/mlx5_trace.py @@ -0,0 +1,271 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright (c) 2023 NVIDIA Corporation & Affiliates + +''' +Analyzing the mlx5 PMD datapath tracings +''' +import sys +import argparse +import pathlib +import bt2 + +PFX_TX = "pmd.net.mlx5.tx." +PFX_TX_LEN = len(PFX_TX) + +tx_blst = {}# current Tx bursts per CPU +tx_qlst = {}# active Tx queues per port/queue +tx_wlst = {}# wait timestamp list per CPU + +class mlx5_queue(object): +def __init__(self): +self.done_burst = []# completed bursts +self.wait_burst = []# waiting for completion +self.pq_id = 0 + +def log(self): +for txb in self.done_burst: +txb.log() + + +class mlx5_mbuf(object): +def __init__(self): +self.wqe = 0# wqe id +self.ptr = None # first packet mbuf pointer +self.len = 0# packet data length +self.nseg = 0 # number of segments + +def log(self): +out = "%X: %u" % (self.ptr, self.len) +if self.nseg != 1: +out += " (%d segs)" % self.nseg +print(out) + + +class mlx5_wqe(object): +def __init__(self): +self.mbuf = [] # list of mbufs in WQE +self.wait_ts = 0# preceding wait/push timestamp +self.comp_ts = 0# send/recv completion timestamp +self.opcode = 0 + +def log(self): +id = (self.opcode >> 8) & 0x +op = self.opcode & 0xFF +fl = self.opcode >> 24 +out = " %04X: " % id +if op == 0xF: +out += "WAIT" +elif op == 0x29: +out += "EMPW" +elif op == 0xE: +out += "TSO " +elif op == 0xA: +out += "SEND" +else: +out += "0x%02X" % op +if 
self.comp_ts != 0: +out += " (%d, %d)" % (self.wait_ts, self.comp_ts - self.wait_ts) +else: +out += " (%d)" % self.wait_ts +print(out) +for mbuf in self.mbuf: +mbuf.log() + +# return 0 if WQE in not completed +def comp(self, wqe_id, ts): +if self.comp_ts != 0: +return 1 +id = (self.opcode >> 8) & 0x +if id > wqe_id: +id -= wqe_id +if id <= 0x8000: +return 0 +else: +id = wqe_id - id +if id >= 0x8000: +return 0 +self.comp_ts = ts +return 1 + + +class mlx5_burst(object): +def __init__(self): +self.wqes = [] # issued burst WQEs +self.done = 0 # number of sent/recv packets +self.req = 0# requested number of packets +self.call_ts = 0# burst routine invocation +self.done_ts = 0# burst routine done +self.queue = None + +def log(self): +port = self.queue.pq_id >> 16 +queue = self.queue.pq_id & 0x +if self.req == 0: +print("%u: tx(p=%u, q=%u, %u/%u pkts (incomplete)" % + (self.call_ts, port, queue, self.done, self.req)) +else: +print("%u: tx(p=%u, q=%u, %u/%u pkts in %u" % + (self.call_ts, port, queue, self.done, self.req, + self.done_ts - self.call_ts)) +for wqe in self.wqes: +wqe.log() + +# return 0 if not all of WQEs in burst completed +def comp(self, wqe_id, ts): +wlen = len(self.wqes) +if wlen == 0: +return 0 +for wqe in self.wqes: +if wqe.comp(wqe_id, ts) == 0: +return 0 +return 1 + + +def do_tx_entry(msg): +event = msg.event +cpu_id = event["cpu_id"] +burst = tx_blst.get(cpu_id) +if burst is not None: +# continue existing burst after WAIT +return +# allocate the new burst and append to the queue +burst = mlx5_burst() +burst.call_ts = msg.default_clock_snapshot.ns_from_origin +tx_blst[cpu_id] = burst +pq_id = event["port_id"] << 16 | event["queue_id"] +queue = tx_qlst.get(pq_id) +if queue is None: +# queue does not exist - allocate the new one +queue = mlx5_queue(); +queue.pq_id = pq_id +tx_qlst[pq_id] = queue +burst.queue = queue +queue.wait_burst.ap
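The comp() method of mlx5_wqe above compares 16-bit WQE ids with two explicit branches around the 0x8000 midpoint so the comparison survives counter wrap-around. The branches collapse into a single modulo-2^16 distance check; a sketch with an invented helper name, assuming the same midpoint convention:

```python
def wqe_reached(own_id: int, completed_id: int) -> bool:
    """True when a CQE reporting completed_id also covers own_id.

    Equivalent to the two branches in mlx5_wqe.comp(): a WQE id counts as
    "behind or equal" when its modulo-2^16 distance from the completed id
    is below the 0x8000 midpoint, which remains correct when the 16-bit
    id counter wraps.
    """
    return ((completed_id - own_id) & 0xFFFF) < 0x8000
```

For example, a WQE with id 0xFFF0 is considered completed by a CQE reporting id 0x0005, even though numerically 5 < 0xFFF0.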
Re: [RFC 2/5] common/mlx5: introduce tracepoints for mlx5 drivers
On Thu, Apr 20, 2023 at 3:38 PM Viacheslav Ovsiienko wrote: > > There is an intention to engage DPDK tracing capabilities > for mlx5 PMDs monitoring and profiling in various modes. > The patch introduces tracepoints for the Tx datapath in > the ethernet device driver. > > Signed-off-by: Viacheslav Ovsiienko > --- > drivers/common/mlx5/meson.build | 1 + > drivers/common/mlx5/mlx5_trace.c | 25 +++ > drivers/common/mlx5/mlx5_trace.h | 72 > drivers/common/mlx5/version.map | 8 > 4 files changed, 106 insertions(+) > create mode 100644 drivers/common/mlx5/mlx5_trace.c > create mode 100644 drivers/common/mlx5/mlx5_trace.h > > diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build > index 9dc809f192..e074ffb140 100644 > --- a/drivers/common/mlx5/meson.build > +++ b/drivers/common/mlx5/meson.build > @@ -19,6 +19,7 @@ sources += files( > 'mlx5_common_mp.c', > 'mlx5_common_mr.c', > 'mlx5_malloc.c', > +'mlx5_trace.c', > 'mlx5_common_pci.c', > 'mlx5_common_devx.c', > 'mlx5_common_utils.c', > diff --git a/drivers/common/mlx5/mlx5_trace.c > b/drivers/common/mlx5/mlx5_trace.c > new file mode 100644 > index 00..b9f14413ad > --- /dev/null > +++ b/drivers/common/mlx5/mlx5_trace.c > @@ -0,0 +1,25 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright (c) 2022 NVIDIA Corporation & Affiliates > + */ > + > +#include > +#include > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_entry, > + pmd.net.mlx5.tx.entry) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_exit, > + pmd.net.mlx5.tx.exit) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wqe, > + pmd.net.mlx5.tx.wqe) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_wait, > + pmd.net.mlx5.tx.wait) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_push, > + pmd.net.mlx5.tx.push) > + > +RTE_TRACE_POINT_REGISTER(rte_pmd_mlx5_trace_tx_complete, > + pmd.net.mlx5.tx.complete) > + > diff --git a/drivers/common/mlx5/mlx5_trace.h > b/drivers/common/mlx5/mlx5_trace.h > new file mode 100644 > 
index 00..57512e654f > --- /dev/null > +++ b/drivers/common/mlx5/mlx5_trace.h > @@ -0,0 +1,72 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright (c) 2022 NVIDIA Corporation & Affiliates > + */ > + > +#ifndef RTE_PMD_MLX5_TRACE_H_ > +#define RTE_PMD_MLX5_TRACE_H_ > + > +/** > + * @file > + * > + * API for mlx5 PMD trace support > + */ > + > +#ifdef __cplusplus > +extern "C" { > +#endif > + > +#include > +#include > +#include > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_entry, > + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id), > + rte_trace_point_emit_u16(port_id); > + rte_trace_point_emit_u16(queue_id); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_exit, > + RTE_TRACE_POINT_ARGS(uint16_t nb_sent, uint16_t nb_req), > + rte_trace_point_emit_u16(nb_sent); > + rte_trace_point_emit_u16(nb_req); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_wqe, > + RTE_TRACE_POINT_ARGS(uint32_t opcode), > + rte_trace_point_emit_u32(opcode); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_wait, > + RTE_TRACE_POINT_ARGS(uint64_t ts), > + rte_trace_point_emit_u64(ts); > +) > + > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_push, > + RTE_TRACE_POINT_ARGS(const struct rte_mbuf *mbuf, uint16_t wqe_id), > + rte_trace_point_emit_ptr(mbuf); > + rte_trace_point_emit_u32(mbuf->pkt_len); > + rte_trace_point_emit_u16(mbuf->nb_segs); > + rte_trace_point_emit_u16(wqe_id); > +) > + > +RTE_TRACE_POINT_FP( > + rte_pmd_mlx5_trace_tx_complete, > + RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id, > +uint16_t wqe_id, uint64_t ts), > + rte_trace_point_emit_u16(port_id); > + rte_trace_point_emit_u16(queue_id); > + rte_trace_point_emit_u64(ts); > + rte_trace_point_emit_u16(wqe_id); > +) > + > +#ifdef __cplusplus > +} > +#endif > + > +#endif /* RTE_PMD_MLX5_TRACE_H_ */ > diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map > index e05e1aa8c5..d0ec8571e6 100644 > --- a/drivers/common/mlx5/version.map > +++ 
b/drivers/common/mlx5/version.map > @@ -158,5 +158,13 @@ INTERNAL { > > mlx5_os_interrupt_handler_create; # WINDOWS_NO_EXPORT > mlx5_os_interrupt_handler_destroy; # WINDOWS_NO_EXPORT > + > + __rte_pmd_mlx5_trace_tx_entry; > + __rte_pmd_mlx5_trace_tx_exit; > + __rte_pmd_mlx5_trace_tx_wqe; > + __rte_pmd_mlx5_trace_tx_wait; > + __rte_pmd_mlx5_trace_tx_push; > + __rte_pmd_mlx5_trace_tx_complete; No need to expose these symbols; exporting them is being removed from the rest of DPDK. An application can call rte_trace_point_lookup() to get these addresses. > + > local: *; > }; > -- > 2.18.1 >
Re: [RFC 1/5] app/testpmd: add trace dump command
On Thu, Apr 20, 2023 at 3:39 PM Viacheslav Ovsiienko wrote: > > The "dump_trace" CLI command is added to trigger > saving the trace dumps to the trace directory. > > Signed-off-by: Viacheslav Ovsiienko > --- > app/test-pmd/cmdline.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c > index 7b20bef4e9..be9e3a9ed6 100644 > --- a/app/test-pmd/cmdline.c > +++ b/app/test-pmd/cmdline.c > @@ -39,6 +39,7 @@ > #include > #endif > #include > +#include > > #include > #include > @@ -8367,6 +8368,8 @@ static void cmd_dump_parsed(void *parsed_result, > rte_lcore_dump(stdout); > else if (!strcmp(res->dump, "dump_log_types")) > rte_log_dump(stdout); > + else if (!strcmp(res->dump, "dump_trace")) > + rte_trace_save(); Isn't it saving the trace? If so, change the command to save_trace or similar. > } > > static cmdline_parse_token_string_t cmd_dump_dump = > @@ -8379,7 +8382,8 @@ static cmdline_parse_token_string_t cmd_dump_dump = > "dump_mempool#" > "dump_devargs#" > "dump_lcores#" > - "dump_log_types"); > + "dump_log_types#" > + "dump_trace"); > > static cmdline_parse_inst_t cmd_dump = { > .f = cmd_dump_parsed, /* function to call */ > -- > 2.18.1 >
Re: [dpdk-web] [RFC PATCH] process: new library approval in principle
On Wed, Apr 19, 2023 at 9:10 PM Kevin Traynor wrote: > > On 13/02/2023 09:26, jer...@marvell.com wrote: > > From: Jerin Jacob > > > > Based on TB meeting[1] action item, defining > > the process for new library approval in principle. > > > > [1] > > https://mails.dpdk.org/archives/dev/2023-January/260035.html > > > > Signed-off-by: Jerin Jacob > > --- > > content/process/_index.md | 33 + > > 1 file changed, 33 insertions(+) > > create mode 100644 content/process/_index.md > > > > diff --git a/content/process/_index.md b/content/process/_index.md > > new file mode 100644 > > index 000..21c2642 > > --- /dev/null > > +++ b/content/process/_index.md > > @@ -0,0 +1,33 @@ > > > > +title = "Process" > > +weight = "9" > > > > + > > +## Process for new library approval in principle > > + > > +### Rationale > > + > > +Adding a new library to the DPDK codebase with a proper RFC and then full > > patch-sets is > > +significant work. Getting an early approval-in-principle for a library helps DPDK contributors > > +avoid wasted effort if it is not suitable for various reasons. > > + > > +### Process > > + > > +1. When a contributor would like to add a new library to DPDK code base, > > the contributor must send > > +the following items to DPDK mailing list for TB approval-in-principle. > > + > > + - Purpose of the library. > > + - Scope of the library. > > + - Any licensing constraints. > > + - Justification for adding to DPDK. > > + - Any other implementations of the same functionality in other > > libs/products and how this version differs. > > - Dependencies > > (Need to know if it's introducing new dependencies to the project) Ack. I will add in next version. > > > + - Public API specification header file as RFC > > + - Optional and good to have. > > + - TB may additionally request this collateral if needed to get more > > clarity on scope and purpose. > > + > > +2. TB to schedule discussion on this in upcoming TB meeting along with > > author.
Based on the TB > > +schedule and/or author availability, TB may need maximum three TB meeting > > slots. > > + > > +3. Based on mailing list and TB meeting discussions, TB to vote for > > approval-in-principle and share > > +the decision in the mailing list. > > + > > How about having three outcomes: > - Approval in principal > - Not approved > - Further information needed Ack. I will add in next version. >
Re: 21.11.4 patches review and test
On 20/04/2023 03:40, Xu, HailinX wrote: -Original Message- From: Xu, HailinX Sent: Thursday, April 13, 2023 2:13 PM To: Kevin Traynor ; sta...@dpdk.org Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan Subject: RE: 21.11.4 patches review and test -Original Message- From: Kevin Traynor Sent: Thursday, April 6, 2023 7:38 PM To: sta...@dpdk.org Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Kevin Traynor ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan Subject: 21.11.4 patches review and test Hi all, Here is a list of patches targeted for stable release 21.11.4. The planned date for the final release is 25th April. Please help with testing and validation of your use cases and report any issues/results with reply-all to this mail. For the final release the fixes and reported validations will be added to the release notes. A release candidate tarball can be found at: https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1 These patches are located at branch 21.11 of dpdk-stable repo: https://dpdk.org/browse/dpdk-stable/ Thanks. Kevin HI All, Update the test status for Intel part. Till now dpdk21.11.4-rc1 validation test rate is 85%. No critical issue is found. 2 new bugs are found, 1 new issue is under confirming by Intel Dev. New bugs: --20.11.8-rc1 also has these two issues 1. pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drop about 23.5% when send small packets https://bugs.dpdk.org/show_bug.cgi?id=1212-- no fix yet 2. 
some of the virtio tests are failing: -- Intel dev is investigating # Basic Intel(R) NIC testing * Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, etc. - All test done. No new dpdk issue is found. * PF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * VF(i40e, ixgbe): test scenarios including VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * PF/VF(ice): test scenarios including Switch features/Package Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc. - All test done. No new dpdk issue is found. * Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test, etc. - All test done. No new dpdk issue is found. * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - QAT&SW/FIB library, etc. - Ongoing. # Basic cryptodev and virtio testing * Virtio: both function and performance test are covered. Such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing/VMware ESXi 8.0, etc. - All test done. Found bug 1. * Cryptodev: * Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. - Execution rate is 90%. Found bug 2. * Performance test: test scenarios including Throughput Performance/Cryptodev Latency, etc. - All test done. No new dpdk issue is found. Regards, Xu, Hailin Update the test status for Intel part. Completed all dpdk21.11.4-rc1 validation. No critical issue is found. Hi. Thanks for testing. 2 new bugs are found, 1 new issue is being confirmed by Intel dev. New bugs: -- 20.11.8-rc1 also has these two issues 1.
pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drops about 23.5% when sending small packets https://bugs.dpdk.org/show_bug.cgi?id=1212 -- not fixed yet; it only occurs on the specified platform Do you know which patch caused the regression? I'm not fully clear from the Bz for 20.11. The backported patch ID'd as root cause [0] in 20.11 is in the previous releases of 20.11 (and 21.11). Trying to understand because then it would have shown in testing for previous releases. Or is this a new test introduced for the latest LTS releases? And if so, what is the baseline performance based on? [0] commit 1c9a7fba5c90e0422b517404499ed106f647bcff Author: Mattias Rönnblom Date: Mon Jul 11 14:11:32 2022 +0200 net: accept unaligned data in checksum routines 2. some of the virtio tests are failing: -- Intel dev is investigating ok, thank you. Kevin. # Basic Intel(R) NIC testing * Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such a
[PATCH] app/crypto-perf: check crypto result
Check crypto result in latency tests. Checking result won't affect the test results as latency is calculated using timestamps which are done before enqueue and after dequeue. Ignoring result means the data can be false positive. Signed-off-by: Anoob Joseph --- app/test-crypto-perf/cperf_test_latency.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c index 406e082e4e..64bef2cc0e 100644 --- a/app/test-crypto-perf/cperf_test_latency.c +++ b/app/test-crypto-perf/cperf_test_latency.c @@ -134,6 +134,7 @@ cperf_latency_test_runner(void *arg) uint16_t test_burst_size; uint8_t burst_size_idx = 0; uint32_t imix_idx = 0; + int ret = 0; static uint16_t display_once; @@ -258,10 +259,16 @@ cperf_latency_test_runner(void *arg) } if (likely(ops_deqd)) { - /* Free crypto ops so they can be reused. */ - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status == RTE_CRYPTO_OP_STATUS_ERROR) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } + /* Free crypto ops so they can be reused. */ rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -289,8 +296,14 @@ cperf_latency_test_runner(void *arg) tsc_end = rte_rdtsc_precise(); if (ops_deqd != 0) { - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status == RTE_CRYPTO_OP_STATUS_ERROR) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -301,6 +314,10 @@ cperf_latency_test_runner(void *arg) } } + /* If there was any failure in crypto op, exit */ + if (ret) + return ret; + for (i = 0; i < tsc_idx; i++) { tsc_val = ctx->res[i].tsc_end - ctx->res[i].tsc_start; tsc_max = RTE_MAX(tsc_val, tsc_max); -- 2.25.1
[PATCH v2] crypto/ipsec_mb: enqueue counter fix
This patch removes enqueue op counter update from the process_op_bit function where the process is now done in dequeue stage. The original stats increment was incorrect as they shouldn't have been updated at all in this function. Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") Cc: piotrx.bronow...@intel.com Cc: sta...@dpdk.org Signed-off-by: Saoirse O'Donovan --- v2: Added cc stable for 21.11 and 22.11 backport. A similar fix has been sent to 20.11 LTS stable, in the interest of time. In that fix, the enqueued stat is still in use, therefore only the fix to the count increment was necessary. Here is the mail archive link: https://mails.dpdk.org/archives/stable/2023-April/043550.html --- drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/ipsec_mb/pmd_snow3g.c b/drivers/crypto/ipsec_mb/pmd_snow3g.c index 8ed069f428..e64df1a462 100644 --- a/drivers/crypto/ipsec_mb/pmd_snow3g.c +++ b/drivers/crypto/ipsec_mb/pmd_snow3g.c @@ -372,9 +372,10 @@ process_ops(struct rte_crypto_op **ops, struct snow3g_session *session, /** Process a crypto op with length/offset in bits. 
*/ static int process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, - struct ipsec_mb_qp *qp, uint16_t *accumulated_enqueued_ops) + struct ipsec_mb_qp *qp) { - uint32_t enqueued_op, processed_op; + unsigned int processed_op; + int ret; switch (session->op) { case IPSEC_MB_OP_ENCRYPT_ONLY: @@ -421,9 +422,10 @@ process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, if (unlikely(processed_op != 1)) return 0; - enqueued_op = rte_ring_enqueue(qp->ingress_queue, op); - qp->stats.enqueued_count += enqueued_op; - *accumulated_enqueued_ops += enqueued_op; + + ret = rte_ring_enqueue(qp->ingress_queue, op); + if (ret != 0) + return ret; return 1; } @@ -439,7 +441,6 @@ snow3g_pmd_dequeue_burst(void *queue_pair, struct snow3g_session *prev_sess = NULL, *curr_sess = NULL; uint32_t i; uint8_t burst_size = 0; - uint16_t enqueued_ops = 0; uint8_t processed_ops; uint32_t nb_dequeued; @@ -479,8 +480,7 @@ snow3g_pmd_dequeue_burst(void *queue_pair, prev_sess = NULL; } - processed_ops = process_op_bit(curr_c_op, curr_sess, - qp, &enqueued_ops); + processed_ops = process_op_bit(curr_c_op, curr_sess, qp); if (processed_ops != 1) break; -- 2.25.1
Re: [PATCH v2] crypto/ipsec_mb: enqueue counter fix
On 20/04/2023 11:31, Saoirse O'Donovan wrote: This patch removes enqueue op counter update from the process_op_bit function where the process is now done in dequeue stage. The original stats increment was incorrect as they shouldn't have been updated at all in this function. Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") Cc: piotrx.bronow...@intel.com Cc: sta...@dpdk.org Signed-off-by: Saoirse O'Donovan --- v2: Added cc stable for 21.11 and 22.11 backport. A similar fix has been sent to 20.11 LTS stable, in the interest of time. In that fix, the enqueued stat is still in use, therefore only the fix to the count increment was necessary. Thanks for the explanation. As it has the correct tags, we will pick this up for 21.11 and 22.11 LTS releases in the normal workflow, which is after it has been released as part of a DPDK main branch release. thanks, Kevin. Here is the mail archive link: https://mails.dpdk.org/archives/stable/2023-April/043550.html --- drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/crypto/ipsec_mb/pmd_snow3g.c b/drivers/crypto/ipsec_mb/pmd_snow3g.c index 8ed069f428..e64df1a462 100644 --- a/drivers/crypto/ipsec_mb/pmd_snow3g.c +++ b/drivers/crypto/ipsec_mb/pmd_snow3g.c @@ -372,9 +372,10 @@ process_ops(struct rte_crypto_op **ops, struct snow3g_session *session, /** Process a crypto op with length/offset in bits. 
*/ static int process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, - struct ipsec_mb_qp *qp, uint16_t *accumulated_enqueued_ops) + struct ipsec_mb_qp *qp) { - uint32_t enqueued_op, processed_op; + unsigned int processed_op; + int ret; switch (session->op) { case IPSEC_MB_OP_ENCRYPT_ONLY: @@ -421,9 +422,10 @@ process_op_bit(struct rte_crypto_op *op, struct snow3g_session *session, if (unlikely(processed_op != 1)) return 0; - enqueued_op = rte_ring_enqueue(qp->ingress_queue, op); - qp->stats.enqueued_count += enqueued_op; - *accumulated_enqueued_ops += enqueued_op; + + ret = rte_ring_enqueue(qp->ingress_queue, op); + if (ret != 0) + return ret; return 1; } @@ -439,7 +441,6 @@ snow3g_pmd_dequeue_burst(void *queue_pair, struct snow3g_session *prev_sess = NULL, *curr_sess = NULL; uint32_t i; uint8_t burst_size = 0; - uint16_t enqueued_ops = 0; uint8_t processed_ops; uint32_t nb_dequeued; @@ -479,8 +480,7 @@ snow3g_pmd_dequeue_burst(void *queue_pair, prev_sess = NULL; } - processed_ops = process_op_bit(curr_c_op, curr_sess, - qp, &enqueued_ops); + processed_ops = process_op_bit(curr_c_op, curr_sess, qp); if (processed_ops != 1) break;
RE: [PATCH v2] crypto/ipsec_mb: enqueue counter fix
> -Original Message- > From: Saoirse O'Donovan > Sent: Thursday 20 April 2023 11:32 > To: Ji, Kai ; De Lara Guarch, Pablo > > Cc: dev@dpdk.org; O'Donovan, Saoirse ; > Bronowski, PiotrX ; sta...@dpdk.org > Subject: [PATCH v2] crypto/ipsec_mb: enqueue counter fix > > This patch removes enqueue op counter update from the process_op_bit > function where the process is now done in dequeue stage. The original stats > increment was incorrect as they shouldn't have been updated at all in this > function. > > Fixes: 4f1cfda59ad3 ("crypto/ipsec_mb: move snow3g PMD") > Cc: piotrx.bronow...@intel.com > Cc: sta...@dpdk.org > > Signed-off-by: Saoirse O'Donovan > > --- > v2: Added cc stable for 21.11 and 22.11 backport. > > A similar fix has been sent to 20.11 LTS stable, in the interest of time. In > that > fix, the enqueued stat is still in use, therefore only the fix to the count > increment was necessary. > > Here is the mail archive link: > https://mails.dpdk.org/archives/stable/2023-April/043550.html > --- > drivers/crypto/ipsec_mb/pmd_snow3g.c | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) Acked-by: Ciara Power
[PATCH] devtools: allow patch to multiple groups for the same driver
The PMD source code resides in the ./drivers folder of the DPDK project and is split into several groups depending on the PMD class (common, net, regex, etc.). For some vendors, drivers of different classes operate over the same hardware; for example, Nvidia PMDs operate over the ConnectX NIC series. It often happens that the same minor fix should be applied to multiple drivers of the same vendor across different classes. The check-git-log.sh script checks the consistency between the files a patch touches and its commit message headline, and prevents updating multiple drivers in a single commit. This patch relaxes that strict check and allows updating multiple drivers in different classes for a single vendor. Signed-off-by: Viacheslav Ovsiienko --- devtools/check-git-log.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/devtools/check-git-log.sh b/devtools/check-git-log.sh index af751e49ab..b66e8fe553 100755 --- a/devtools/check-git-log.sh +++ b/devtools/check-git-log.sh @@ -80,7 +80,9 @@ bad=$(for commit in $commits ; do continue drv=$(echo "$files" | grep '^drivers/' | cut -d "/" -f 2,3 | sort -u) drvgrp=$(echo "$drv" | cut -d "/" -f 1 | uniq) - if [ $(echo "$drvgrp" | wc -l) -gt 1 ] ; then + drvpmd=$(echo "$drv" | cut -d "/" -f 2 | uniq) + if [ $(echo "$drvgrp" | wc -l) -gt 1 ] && \ + [ $(echo "$drvpmd" | wc -l) -gt 1 ] ; then echo "$headline" | grep -v '^drivers:' elif [ $(echo "$drv" | wc -l) -gt 1 ] ; then echo "$headline" | grep -v "^drivers/$drvgrp" -- 2.18.1
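Under the relaxed rule above, a commit touching e.g. drivers/common/mlx5 and drivers/net/mlx5 no longer needs the generic "drivers:" headline prefix; that prefix is only demanded when both the class and the driver name differ across the touched files. A Python sketch of the new condition (the helper name is ours, mirroring the cut -d/ -f 2,3 | sort -u pipeline and the two uniq'd columns):

```python
def needs_generic_drivers_headline(files):
    """Return True when the commit must use a generic 'drivers:' headline,
    i.e. the touched drivers span multiple classes AND multiple driver
    names (the new, relaxed rule)."""
    drv = sorted({"/".join(f.split("/")[1:3])
                  for f in files if f.startswith("drivers/")})
    classes = {d.split("/")[0] for d in drv}   # common, net, regex, ...
    names = {d.split("/")[1] for d in drv}     # mlx5, ice, ...
    return len(classes) > 1 and len(names) > 1
```

So a cross-class fix confined to one vendor driver name passes with a driver-specific headline, while a commit spanning unrelated drivers still requires the generic prefix.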
[PATCH v2] app/crypto-perf: check crypto result
Check crypto result in latency tests. Checking result won't affect the test results as latency is calculated using timestamps which are done before enqueue and after dequeue. Ignoring result means the data can be false positive. Signed-off-by: Anoob Joseph --- v2: - Improved result check (treat all non success as errors) app/test-crypto-perf/cperf_test_latency.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c index 406e082e4e..f1676a9aa9 100644 --- a/app/test-crypto-perf/cperf_test_latency.c +++ b/app/test-crypto-perf/cperf_test_latency.c @@ -134,6 +134,7 @@ cperf_latency_test_runner(void *arg) uint16_t test_burst_size; uint8_t burst_size_idx = 0; uint32_t imix_idx = 0; + int ret = 0; static uint16_t display_once; @@ -258,10 +259,16 @@ cperf_latency_test_runner(void *arg) } if (likely(ops_deqd)) { - /* Free crypto ops so they can be reused. */ - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } + /* Free crypto ops so they can be reused. */ rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -289,8 +296,14 @@ cperf_latency_test_runner(void *arg) tsc_end = rte_rdtsc_precise(); if (ops_deqd != 0) { - for (i = 0; i < ops_deqd; i++) + for (i = 0; i < ops_deqd; i++) { + struct rte_crypto_op *op = ops_processed[i]; + + if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) + ret = -1; + store_timestamp(ops_processed[i], tsc_end); + } rte_mempool_put_bulk(ctx->pool, (void **)ops_processed, ops_deqd); @@ -301,6 +314,10 @@ cperf_latency_test_runner(void *arg) } } + /* If there was any failure in crypto op, exit */ + if (ret) + return ret; + for (i = 0; i < tsc_idx; i++) { tsc_val = ctx->res[i].tsc_end - ctx->res[i].tsc_start; tsc_max = RTE_MAX(tsc_val, tsc_max); -- 2.25.1
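The v2 change above records a dequeue timestamp for every op but flags the whole run if any op's status is not success (v1 only caught RTE_CRYPTO_OP_STATUS_ERROR; v2 treats anything non-success as a failure). A sketch of that pattern in Python, with dict-based stand-ins for rte_crypto_op; the names are ours:

```python
SUCCESS = "SUCCESS"  # stand-in for RTE_CRYPTO_OP_STATUS_SUCCESS

def drain_and_check(ops_deqd, timestamps, ts_now):
    """Record a timestamp for every dequeued op, but remember whether any
    op finished with a non-success status. Timestamping still happens for
    all ops so the latency numbers are unaffected; only the return value
    reports the failure."""
    ret = 0
    for op in ops_deqd:
        if op["status"] != SUCCESS:
            ret = -1                       # any non-success fails the run
        timestamps.append((op["id"], ts_now))
    return ret
```

As in the patch, the failure is only acted on after the burst loop ends, so a bad op does not distort the measured latencies of the others.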
[PATCH v1] dts: create tarball from git ref
Add additional convenience options for specifying what DPDK version to test. Signed-off-by: Juraj Linkeš --- dts/framework/config/__init__.py | 11 +-- dts/framework/settings.py| 20 ++--- dts/framework/utils.py | 140 +++ 3 files changed, 152 insertions(+), 19 deletions(-) diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index ebb0823ff5..a4b73483e6 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -11,21 +11,14 @@ import os.path import pathlib from dataclasses import dataclass -from enum import Enum, auto, unique +from enum import auto, unique from typing import Any, TypedDict import warlock # type: ignore import yaml from framework.settings import SETTINGS - - -class StrEnum(Enum): -@staticmethod -def _generate_next_value_( -name: str, start: int, count: int, last_values: object -) -> str: -return name +from framework.utils import StrEnum @unique diff --git a/dts/framework/settings.py b/dts/framework/settings.py index 71955f4581..cfa39d011b 100644 --- a/dts/framework/settings.py +++ b/dts/framework/settings.py @@ -10,7 +10,7 @@ from pathlib import Path from typing import Any, TypeVar -from .exception import ConfigurationError +from .utils import DPDKGitTarball _T = TypeVar("_T") @@ -124,11 +124,13 @@ def _get_parser() -> argparse.ArgumentParser: parser.add_argument( "--tarball", "--snapshot", +"--git-ref", action=_env_arg("DTS_DPDK_TARBALL"), default="dpdk.tar.xz", type=Path, -help="[DTS_DPDK_TARBALL] Path to DPDK source code tarball " -"which will be used in testing.", +help="[DTS_DPDK_TARBALL] Path to DPDK source code tarball or a git commit ID, " +"tag ID or tree ID to test. 
To test local changes, first commit them, " +"then use the commit ID with this option.", ) parser.add_argument( @@ -160,21 +162,19 @@ def _get_parser() -> argparse.ArgumentParser: return parser -def _check_tarball_path(parsed_args: argparse.Namespace) -> None: -if not os.path.exists(parsed_args.tarball): -raise ConfigurationError(f"DPDK tarball '{parsed_args.tarball}' doesn't exist.") - - def _get_settings() -> _Settings: parsed_args = _get_parser().parse_args() -_check_tarball_path(parsed_args) return _Settings( config_file_path=parsed_args.config_file, output_dir=parsed_args.output_dir, timeout=parsed_args.timeout, verbose=(parsed_args.verbose == "Y"), skip_setup=(parsed_args.skip_setup == "Y"), -dpdk_tarball_path=parsed_args.tarball, +dpdk_tarball_path=Path( +DPDKGitTarball(parsed_args.tarball, parsed_args.output_dir) +) +if not os.path.exists(parsed_args.tarball) +else Path(parsed_args.tarball), compile_timeout=parsed_args.compile_timeout, test_cases=parsed_args.test_cases.split(",") if parsed_args.test_cases else [], re_run=parsed_args.re_run, diff --git a/dts/framework/utils.py b/dts/framework/utils.py index 55e0b0ef0e..0623106b78 100644 --- a/dts/framework/utils.py +++ b/dts/framework/utils.py @@ -3,7 +3,26 @@ # Copyright(c) 2022-2023 PANTHEON.tech s.r.o. 
# Copyright(c) 2022-2023 University of New Hampshire +import atexit +import os +import subprocess import sys +from enum import Enum +from pathlib import Path +from subprocess import SubprocessError + +from .exception import ConfigurationError + + +class StrEnum(Enum): +@staticmethod +def _generate_next_value_( +name: str, start: int, count: int, last_values: object +) -> str: +return name + +def __str__(self) -> str: +return self.name def check_dts_python_version() -> None: @@ -80,3 +99,124 @@ def __init__(self, default_library: str | None = None, **dpdk_args: str | bool): def __str__(self) -> str: return " ".join(f"{self._default_library} {self._dpdk_args}".split()) + + +class _TarCompressionFormat(StrEnum): +"""Compression formats that tar can use. + +Enum names are the shell compression commands +and Enum values are the associated file extensions. +""" + +gzip = "gz" +compress = "Z" +bzip2 = "bz2" +lzip = "lz" +lzma = "lzma" +lzop = "lzo" +xz = "xz" +zstd = "zst" + + +class DPDKGitTarball(object): +"""Create a compressed tarball of DPDK from the repository. + +The DPDK version is specified with git object git_ref. +The tarball will be compressed with _TarCompressionFormat, +which must be supported by the DTS execution environment. +The resulting tarball will be put into output_dir. + +The class supports the os.PathLike protocol, +which is used to get the Path of the tarball:: + +from pathlib import Path +tarball = DPDKGitTarball("HEAD", "output") +tarball_path = Pa
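The patch moves StrEnum into framework/utils.py and adds __str__, so members print as their names while values stay free for other data (e.g. the file extensions in _TarCompressionFormat, whose names are the shell compression commands). A minimal self-contained sketch of that pattern, with illustrative member sets:

```python
from enum import Enum, auto

class StrEnum(Enum):
    """Sketch of the helper moved to framework/utils.py: auto() values
    become the member name, and str() yields the name."""
    @staticmethod
    def _generate_next_value_(name, start, count, last_values):
        return name

    def __str__(self):
        return self.name

class TarCompressionFormat(StrEnum):
    # Names are the shell compression commands, values the file
    # extensions, as in the patch's _TarCompressionFormat.
    gzip = "gz"
    xz = "xz"

class NodeType(StrEnum):
    # With auto(), value == name thanks to _generate_next_value_.
    physical = auto()
    virtual = auto()
```

This lets code build filenames like f"dpdk.tar.{fmt.value}" while logging the human-readable command name via str(fmt).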
Re: [PATCH] doc: add PMD known issue
On Thu, 20 Apr 2023 06:14:29 + Mingjin Ye wrote: > Add a known issue: ASLR feature causes core dump. > > Signed-off-by: Mingjin Ye > --- Please provide a backtrace. This should be fixable. Fixing a bug is always better than documenting it.
Re: [PATCH] doc: add PMD known issue
On Thu, Apr 20, 2023 at 06:14:29AM +, Mingjin Ye wrote: > Add a known issue: ASLR feature causes core dump. > > Signed-off-by: Mingjin Ye > --- > doc/guides/nics/ixgbe.rst | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst > index b1d77ab7ab..c346e377e2 100644 > --- a/doc/guides/nics/ixgbe.rst > +++ b/doc/guides/nics/ixgbe.rst > @@ -461,3 +461,18 @@ show bypass config > Show the bypass configuration for a bypass enabled NIC using the lowest port > on the NIC:: > > testpmd> show bypass config (port_id) > + > +ASLR feature causes core dump > +~ > + > +Core dump may occur when we start secondary processes on the vf port. > +Mainstream Linux distributions have the ASLR feature enabled by default, > +and the text segment of the process's memory space is randomized. > +The secondary process calls the function address shared by the primary > +process, resulting in a core dump. > + > + .. Note:: > + > + Support for ASLR features varies by distribution. Redhat and > + Centos series distributions work fine. Ubuntu distributions > + will core dump, other Linux distributions are unknown. > -- I disagree with this description of the bug. ASLR is not the problem; instead, the driver is just not multi-process aware and uses the same pointers in both primary and secondary processes. You will hit this issue even without ASLR if the primary and secondary processes use different static binaries. Therefore, IMHO, the title should be that the VF driver is not multi-process safe, rather than pinning the blame on ASLR. /Bruce
[Bug 1217] RTE flow: Port state changing to error when RTE flow is enabled/disabled on Intel X722
https://bugs.dpdk.org/show_bug.cgi?id=1217 Bug ID: 1217 Summary: RTE flow: Port state changing to error when RTE flow is enabled/disabled on Intel X722 Product: DPDK Version: 22.03 Hardware: x86 OS: Linux Status: UNCONFIRMED Severity: major Priority: Normal Component: other Assignee: dev@dpdk.org Reporter: ltham...@usc.edu Target Milestone: --- Created attachment 249 --> https://bugs.dpdk.org/attachment.cgi?id=249&action=edit rte_flow_port_state_error_on_X722 logs When RTE flows are enabled and disabled couple of times on Intel X722 interface, the port state turns from Up/down to error state with error logs "Interface TenGigabitEthernetb5/0/0 error -95: Unknown error -95", "i40e_dev_sync_phy_type(): Failed to sync phy type: status=-7". Please check the attachment for logs. -- You are receiving this mail because: You are the assignee for the bug.
RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release
> -Original Message- > From: Yunjian Wang > Sent: Thursday, April 20, 2023 1:44 AM > To: dev@dpdk.org > Cc: Honnappa Nagarahalli ; > konstantin.v.anan...@yandex.ru; luyi...@huawei.com; Yunjian Wang > ; sta...@dpdk.org > Subject: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release > > After the memzone is freed, it is not removed from the 'rte_ring_tailq'. > If rte_ring_lookup is called at this time, it will cause a use-after-free > problem. > This change prevents that from happening. > > Fixes: 4e32101f9b01 ("ring: support freeing") > Cc: sta...@dpdk.org > > Suggested-by: Honnappa Nagarahalli This is incorrect, this is not a suggestion from me. Please remove this. > Signed-off-by: Yunjian Wang Other than the above, the patch looks fine. Reviewed-by: Honnappa Nagarahalli > --- > v2: update code suggested by Honnappa Nagarahalli > --- > lib/ring/rte_ring.c | 8 +++- > 1 file changed, 3 insertions(+), 5 deletions(-) > > diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index > 8ed455043d..2755323b8a 100644 > --- a/lib/ring/rte_ring.c > +++ b/lib/ring/rte_ring.c > @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r) > return; > } > > - if (rte_memzone_free(r->memzone) != 0) { > - RTE_LOG(ERR, RING, "Cannot free memory\n"); > - return; > - } > - > ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list); > rte_mcfg_tailq_write_lock(); > > @@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r) > > TAILQ_REMOVE(ring_list, te, next); > > + if (rte_memzone_free(r->memzone) != 0) > + RTE_LOG(ERR, RING, "Cannot free memory\n"); > + > rte_mcfg_tailq_write_unlock(); > > rte_free(te); > -- > 2.33.0
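[Editorial sketch] The ordering that the v2 patch establishes — unlink the entry from the shared tailq inside the critical section, and only then free the backing memory — can be illustrated with a hypothetical mini-registry. All names below are invented for illustration; this is not DPDK code, and the real implementation uses `rte_mcfg_tailq_write_lock()` around the same sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A named entry in a shared registry; `mem` stands in for the
 * ring's memzone. */
struct entry {
    char name[32];
    void *mem;
    struct entry *next;
};

static struct entry *registry; /* protected by a lock in real code */

static void registry_add(struct entry *e)
{
    e->next = registry;
    registry = e;
}

/* Stand-in for rte_ring_lookup(): scans the shared list by name. */
static struct entry *registry_lookup(const char *name)
{
    for (struct entry *e = registry; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* The fix's ordering: 1) unlink from the shared list, 2) only then
 * free the backing memory -- both inside the same critical section,
 * so a concurrent lookup can never return freed memory. */
static void registry_free(struct entry *e)
{
    /* lock_registry(); -- elided */
    struct entry **pp = &registry;
    while (*pp != NULL && *pp != e)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = e->next;   /* step 1: no lookup can find it anymore */
    free(e->mem);        /* step 2: now it is safe to free */
    /* unlock_registry(); -- elided */
    free(e);
}
```

The bug the patch fixes is the reverse order: freeing the memzone first leaves a window where a lookup still finds the entry but its memory is gone.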
Re: [RFC 06/27] vhost: don't dump unneeded pages with IOTLB
On Fri, Mar 31, 2023 at 11:43 AM Maxime Coquelin wrote: > > On IOTLB entry removal, previous fixes took care of not > marking pages shared with other IOTLB entries as DONTDUMP. > > However, if an IOTLB entry is spanned on multiple pages, > the other pages were kept as DODUMP while they might not > have been shared with other entries, increasing needlessly > the coredump size. > > This patch addresses this issue by excluding only the > shared pages from madvise's DONTDUMP. > > Fixes: dea092d0addb ("vhost: fix madvise arguments alignment") > Cc: sta...@dpdk.org > > Signed-off-by: Maxime Coquelin Looks good to me. Acked-by: Mike Pattrick
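[Editorial sketch] The page-exclusion arithmetic the patch describes — dump-exclude an entry's pages except the boundary pages still shared with neighboring IOTLB entries — can be sketched as a standalone helper. The names are hypothetical; the real code operates on IOTLB entries and calls `madvise()` with `MADV_DONTDUMP` on the resulting range.

```c
#include <assert.h>
#include <stdint.h>

#define PG 4096u /* assumed page size for the sketch */

/* For a removed entry [start, start+len), compute the page-aligned
 * sub-range that may safely be marked DONTDUMP. Boundary pages that
 * are still referenced by the previous/next entries are kept as
 * DODUMP; only the remaining pages are excluded from the coredump.
 * Returns the length to mark (0 means nothing), start in *out_start. */
static uint64_t dontdump_range(uint64_t start, uint64_t len,
                               int first_page_shared, int last_page_shared,
                               uint64_t *out_start)
{
    uint64_t lo = start & ~(uint64_t)(PG - 1);                   /* align down */
    uint64_t hi = (start + len + PG - 1) & ~(uint64_t)(PG - 1);  /* align up */

    if (first_page_shared)
        lo += PG; /* first page still used by the previous entry */
    if (last_page_shared)
        hi -= PG; /* last page still used by the next entry */

    if (hi <= lo) {
        *out_start = 0;
        return 0;
    }
    *out_start = lo;
    return hi - lo;
}
```

This captures the point of the fix: before it, only the shared boundary pages were handled, and interior pages of multi-page entries stayed DODUMP, bloating the coredump.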
Re: [RFC] lib: set/get max memzone segments
On Thu, Apr 20, 2023 at 09:43:28AM +0200, Thomas Monjalon wrote: > 19/04/2023 16:51, Tyler Retzlaff: > > On Wed, Apr 19, 2023 at 11:36:34AM +0300, Ophir Munk wrote: > > > In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard > > > coded as 2560. For applications requiring different values of this > > > parameter – it is more convenient to set the max value via an rte API - > > > rather than changing the dpdk source code per application. In many > > > organizations, the possibility to compile a private DPDK library for a > > > particular application does not exist at all. With this option there is > > > no need to recompile DPDK and it allows using an in-box packaged DPDK. > > > An example usage for updating the RTE_MAX_MEMZONE would be of an > > > application that uses the DPDK mempool library which is based on DPDK > > > memzone library. The application may need to create a number of > > > steering tables, each of which will require its own mempool allocation. > > > This commit is not about how to optimize the application usage of > > > mempool nor about how to improve the mempool implementation based on > > > memzone. It is about how to make the max memzone definition - run-time > > > customized. > > > This commit adds an API which must be called before rte_eal_init(): > > > rte_memzone_max_set(int max). If not called, the default memzone > > > (RTE_MAX_MEMZONE) is used. There is also an API to query the effective > > > max memzone: rte_memzone_max_get(). > > > > > > Signed-off-by: Ophir Munk > > > --- > > > > the use case of each application may want a different non-hard coded > > value makes sense. > > > > it's less clear to me that requiring it be called before eal init makes > > sense over just providing it as configuration to eal init so that it is > > composed. > > Why do you think it would be better as EAL init option? > From an API perspective, I think it is simpler to call a dedicated function. 
> And I don't think a user wants to deal with it when starting the application. because a dedicated function that can be called detached from the eal state enables an opportunity for accidental and confusing use outside the correct context. i know the above prescribes not to do this, but now you can call set after eal init, and we protect against calling it after init by failing. what do we do sensibly with the failure? > > > can you elaborate further on why you need get if you have a one-shot > > set? why would the application not know the value if you can only ever > > call it once before init? > > The "get" function is used in this patch by test and qede driver. > The application could use it as well, especially to query the default value. this seems incoherent to me: why does the application not know if it has called set or not? if it called set it knows what the value is; if it didn't call set it knows what the default is. anyway, the use case is valid and i would like to see the ability to change it dynamically. i'd prefer not to see an api like this be introduced as prescribed, but that's for you folks to decide. anyway, i own a lot of apis that operate just like the proposed one and they're a great source of support overhead. i prefer not to rely on documenting a contract when i can enforce the contract and the implicit state machine mechanically with the api instead. fwiw a nicer pattern for this kind of framework-influencing config might look something like this: struct eal_config config; eal_config_init(&config); // defaults are set, entire state made valid eal_config_set_max_memzone(&config, 1024); // default is overridden rte_eal_init(&config); ty
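[Editorial sketch] Tyler's snippet above, expanded into a compilable form. Every name here is hypothetical (DPDK's real `rte_eal_init()` takes `argc`/`argv`, not a config struct); the point is the pattern: init fills a fully valid default state, the application overrides selected fields, and init consumes the struct, so there is no window in which a "set" call arrives too late and must fail.

```c
#include <assert.h>

/* Hypothetical config struct for the framework-influencing-config
 * pattern discussed in the thread. */
struct eal_config {
    unsigned int max_memzones;
};

/* Mirrors the hard-coded RTE_MAX_MEMZONE default mentioned above. */
#define EAL_DEFAULT_MAX_MEMZONES 2560u

static void eal_config_init(struct eal_config *cfg)
{
    cfg->max_memzones = EAL_DEFAULT_MAX_MEMZONES; /* whole state valid */
}

static void eal_config_set_max_memzone(struct eal_config *cfg,
                                       unsigned int max)
{
    cfg->max_memzones = max;                      /* default overridden */
}

/* Stand-in for an init that sizes its tables from the config; there is
 * no separate set/get API whose call ordering must be documented. */
static int fake_eal_init(const struct eal_config *cfg)
{
    return cfg->max_memzones > 0 ? 0 : -1;
}
```

Because the struct is the only channel, a "get" becomes unnecessary: the application always knows what it put (or left) in the config.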
Re: [PATCH 1/1] net/ixgbe: add a proper memory barrier for LoongArch
On Fri, Apr 7, 2023 at 4:50 PM, Min Zhou wrote: Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor, which has 64 cores and 4 NUMA nodes. The reason is that the read ordering of the status and the rest of the descriptor fields in this function may not be correct on the LoongArch processor. We should add rte_rmb() to ensure the read ordering is correct. We also did the same thing in the ixgbe_recv_pkts() function. Signed-off-by: Min Zhou --- drivers/net/ixgbe/ixgbe_rxtx.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..16391a42f9 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,9 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; +#if defined(RTE_ARCH_LOONGARCH) + rte_rmb(); +#endif rxd = *rxdp; /* @@ -2122,6 +2125,9 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; +#if defined(RTE_ARCH_LOONGARCH) + rte_rmb(); +#endif rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " Kindly ping. Any comments or suggestions will be appreciated. Min
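[Editorial sketch] The ordering requirement behind this patch can be illustrated with a minimal single-threaded C11 sketch. The descriptor layout is hypothetical; DPDK's `rte_rmb()` between the DD-bit check and the descriptor copy plays the role that the acquire load plays here. On strongly ordered x86 a plain load happens to suffice, which is why the bug only surfaced on a weakly ordered CPU like LoongArch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical RX descriptor: the device writes `payload` first and
 * sets the DD bit in `status` last to publish the descriptor. */
struct desc {
    _Atomic uint32_t status;  /* DD bit written last by the device */
    uint32_t payload;         /* rest of the descriptor */
};

#define DD_BIT 0x1u

/* The consumer must observe the DD bit *before* reading the rest of
 * the descriptor. The acquire load forbids the CPU from hoisting the
 * payload read above the status check (what rte_rmb() enforces). */
static int read_desc(struct desc *d, uint32_t *out)
{
    if (!(atomic_load_explicit(&d->status, memory_order_acquire) & DD_BIT))
        return 0;             /* descriptor not ready yet */
    *out = d->payload;        /* ordered after the DD check */
    return 1;
}
```

Without the barrier, a weakly ordered core may read a stale `payload` speculatively before the DD bit becomes visible, which is consistent with the segfault reported on the 3C5000.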
[PATCH 0/2] vhost: add port mirroring function in the vhost lib
Similar to the port mirroring function on a switch or router, this patch set implements such a function in the Vhost lib. When data is sent to a front-end, it will also send the data to its mirror front-end. When data is received from a front-end, it will also send the data to its mirror front-end. Cheng Jiang (2): vhost: add ingress API for port mirroring datapath vhost: add egress API for port mirroring datapath lib/vhost/rte_vhost_async.h | 17 + lib/vhost/version.map |3 + lib/vhost/virtio_net.c | 1266 +++ 3 files changed, 1286 insertions(+) -- 2.35.1
[PATCH 1/2] vhost: add ingress API for port mirroring datapath
Similar to the port mirroring function on the switch or router, this patch also implements an ingress function on the Vhost lib. When data is sent to a front-end, it will also send the data to its mirror front-end. Signed-off-by: Cheng Jiang Signed-off-by: Yuan Wang Signed-off-by: Wenwu Ma --- lib/vhost/rte_vhost_async.h | 6 + lib/vhost/version.map | 1 + lib/vhost/virtio_net.c | 688 3 files changed, 695 insertions(+) diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h index 8f190dd44b..30aaf66b60 100644 --- a/lib/vhost/rte_vhost_async.h +++ b/lib/vhost/rte_vhost_async.h @@ -286,6 +286,12 @@ __rte_experimental int rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id); +__rte_experimental +uint16_t rte_vhost_async_try_egress_burst(int vid, uint16_t queue_id, + int mr_vid, uint16_t mr_queue_id, + struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count, + int *nr_inflight, int16_t dma_id, uint16_t vchan_id); + #ifdef __cplusplus } #endif diff --git a/lib/vhost/version.map b/lib/vhost/version.map index d322a4a888..95f75a6928 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -98,6 +98,7 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + rte_vhost_async_try_egress_burst; }; INTERNAL { diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index be28ea5151..c7e99d403e 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -4262,3 +4262,691 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, return count; } + + +static __rte_always_inline uint16_t +async_poll_egress_completed_split(struct virtio_net *dev, struct vhost_virtqueue *vq, + struct virtio_net *mr_dev, struct vhost_virtqueue *mr_vq, + struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, + uint16_t vchan_id, bool legacy_ol_flags) +{ + uint16_t nr_cpl_pkts = 0; + uint16_t start_idx, from, i; + struct async_inflight_info *pkts_info = vq->async->pkts_info; + uint16_t n_descs 
= 0; + + vhost_async_dma_check_completed(dev, dma_id, vchan_id, VHOST_DMA_MAX_COPY_COMPLETE); + + start_idx = async_get_first_inflight_pkt_idx(vq); + + from = start_idx; + while (vq->async->pkts_cmpl_flag[from] && count--) { + vq->async->pkts_cmpl_flag[from] = false; + from = (from + 1) % vq->size; + nr_cpl_pkts++; + } + + if (nr_cpl_pkts == 0) + return 0; + + for (i = 0; i < nr_cpl_pkts; i++) { + from = (start_idx + i) % vq->size; + n_descs += pkts_info[from].descs; + pkts[i] = pkts_info[from].mbuf; + + if (virtio_net_with_host_offload(dev)) + vhost_dequeue_offload(dev, &pkts_info[from].nethdr, pkts[i], + legacy_ol_flags); + } + + /* write back completed descs to used ring and update used idx */ + write_back_completed_descs_split(vq, nr_cpl_pkts); + __atomic_add_fetch(&vq->used->idx, nr_cpl_pkts, __ATOMIC_RELEASE); + vhost_vring_call_split(dev, vq); + + write_back_completed_descs_split(mr_vq, n_descs); + __atomic_add_fetch(&mr_vq->used->idx, n_descs, __ATOMIC_RELEASE); + vhost_vring_call_split(mr_dev, mr_vq); + + vq->async->pkts_inflight_n -= nr_cpl_pkts; + + return nr_cpl_pkts; +} + +static __rte_always_inline int +egress_async_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t buf_iova, + struct virtio_net *mr_dev, uint64_t mr_buf_iova, + struct rte_mbuf *m, uint32_t mbuf_offset, uint32_t cpy_len) +{ + struct vhost_async *async = vq->async; + uint64_t mapped_len, mr_mapped_len; + uint32_t buf_offset = 0; + void *src, *dst, *mr_dst; + void *host_iova, *mr_host_iova; + + while (cpy_len) { + host_iova = (void *)(uintptr_t)gpa_to_first_hpa(dev, + buf_iova + buf_offset, cpy_len, &mapped_len); + if (unlikely(!host_iova)) { + VHOST_LOG_DATA(dev->ifname, ERR, "%s: failed to get host iova.\n", __func__); + return -1; + } + + mr_host_iova = (void *)(uintptr_t)gpa_to_first_hpa(mr_dev, + mr_buf_iova + buf_offset, cpy_len, &mr_mapped_len); + if (unlikely(!mr_host_iova)) { + VHOST_LOG_DATA(mr_dev->ifname, ERR, "%s: failed to get mirror hpa.\n", __func__); + 
return -1; + } + + if (unlikely(mr_mapped_len != mapped_len)) { + VHOST_LOG_DATA(dev->ifname, ERR, "original mapped len is not equal to mirror le
[PATCH 2/2] vhost: add egress API for port mirroring datapath
This patch implements egress function on the Vhost lib. When packets are received from a front-end, it will also send the packets to its mirror front-end. Signed-off-by: Cheng Jiang Signed-off-by: Yuan Wang Signed-off-by: Wenwu Ma --- lib/vhost/rte_vhost_async.h | 11 + lib/vhost/version.map | 2 + lib/vhost/virtio_net.c | 682 +--- 3 files changed, 643 insertions(+), 52 deletions(-) diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h index 30aaf66b60..4df473f1ec 100644 --- a/lib/vhost/rte_vhost_async.h +++ b/lib/vhost/rte_vhost_async.h @@ -286,6 +286,17 @@ __rte_experimental int rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id); +__rte_experimental +uint16_t rte_vhost_submit_ingress_mirroring_burst(int vid, uint16_t queue_id, + int mirror_vid, uint16_t mirror_queue_id, + struct rte_mbuf **pkts, uint16_t count, + int16_t dma_id, uint16_t vchan_id); + +__rte_experimental +uint16_t rte_vhost_poll_ingress_completed(int vid, uint16_t queue_id, int mr_vid, + uint16_t mr_queue_id, struct rte_mbuf **pkts, uint16_t count, + int16_t dma_id, uint16_t vchan_id); + __rte_experimental uint16_t rte_vhost_async_try_egress_burst(int vid, uint16_t queue_id, int mr_vid, uint16_t mr_queue_id, diff --git a/lib/vhost/version.map b/lib/vhost/version.map index 95f75a6928..347ea6ac9c 100644 --- a/lib/vhost/version.map +++ b/lib/vhost/version.map @@ -98,6 +98,8 @@ EXPERIMENTAL { # added in 22.11 rte_vhost_async_dma_unconfigure; rte_vhost_vring_call_nonblock; + rte_vhost_submit_ingress_mirroring_burst; + rte_vhost_poll_ingress_completed; rte_vhost_async_try_egress_burst; }; diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index c7e99d403e..f4c96c3216 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -4263,6 +4263,634 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id, return count; } +static __rte_always_inline int +async_mirror_fill_seg(struct virtio_net *dev, struct vhost_virtqueue *vq, uint64_t buf_iova, + struct 
virtio_net *mr_dev, uint64_t mr_buf_iova, + struct rte_mbuf *m, uint32_t mbuf_offset, uint32_t cpy_len, bool is_ingress) +{ + struct vhost_async *async = vq->async; + uint64_t mapped_len, mr_mapped_len; + uint32_t buf_offset = 0; + void *src, *dst, *mr_dst; + void *host_iova, *mr_host_iova; + + while (cpy_len) { + host_iova = (void *)(uintptr_t)gpa_to_first_hpa(dev, + buf_iova + buf_offset, cpy_len, &mapped_len); + if (unlikely(!host_iova)) { + VHOST_LOG_DATA(dev->ifname, ERR, "%s: failed to get host iova.\n", __func__); + return -1; + } + mr_host_iova = (void *)(uintptr_t)gpa_to_first_hpa(mr_dev, + mr_buf_iova + buf_offset, cpy_len, &mr_mapped_len); + if (unlikely(!mr_host_iova)) { + VHOST_LOG_DATA(mr_dev->ifname, ERR, "%s: failed to get mirror host iova.\n", __func__); + return -1; + } + + if (unlikely(mr_mapped_len != mapped_len)) { + VHOST_LOG_DATA(dev->ifname, ERR, "original mapped len is not equal to mirror len\n"); + return -1; + } + + if (is_ingress) { + src = (void *)(uintptr_t)rte_pktmbuf_iova_offset(m, mbuf_offset); + dst = host_iova; + mr_dst = mr_host_iova; + } else { + src = host_iova; + dst = (void *)(uintptr_t)rte_pktmbuf_iova_offset(m, mbuf_offset); + mr_dst = mr_host_iova; + } + + if (unlikely(async_iter_add_iovec(dev, async, src, dst, (size_t)mapped_len))) + return -1; + if (unlikely(async_iter_add_iovec(mr_dev, async, src, mr_dst, (size_t)mapped_len))) + return -1; + + cpy_len -= (uint32_t)mapped_len; + mbuf_offset += (uint32_t)mapped_len; + buf_offset += (uint32_t)mapped_len; + } + + return 0; +} + +static __rte_always_inline uint16_t +vhost_poll_ingress_completed(struct virtio_net *dev, struct vhost_virtqueue *vq, + struct virtio_net *mr_dev, struct vhost_virtqueue *mr_vq, + struct rte_mbuf **pkts, uint16_t count, int16_t dma_id, uint16_t vchan_id) +{ + struct vhost_async *async = vq->async; + struct vhost_async *mr_async = mr_vq->async; + struct async_inflight_info *pkts_info = async->pkts_info; + + uint16_t nr_cpl_pkts = 0, n_descs = 0; + 
uint16_t mr_n_descs = 0; + uint16_
RE: [PATCH] usertools: enhance CPU layout
Hi Stephen, > -Original Message- > From: Stephen Hemminger > Sent: Wednesday, April 19, 2023 12:47 AM > To: Lu, Wenzhuo > Cc: dev@dpdk.org > Subject: Re: [PATCH] usertools: enhance CPU layout > > On Tue, 18 Apr 2023 13:25:41 +0800 > Wenzhuo Lu wrote: > > > The cores in a single CPU may not all be the same. > > The user tool is updated to show the > > difference between the cores. > > > > This patch adds the below information: > > 1, Group the cores based on the die. > > 2, A core is either a performance core or an > >efficiency core. > >A performance core is shown as 'Core-P'. > >An efficiency core is shown as 'Core-E'. > > 3, All the E-cores which share the same L2-cache > >are grouped into one module. > > > > The known limitation: > > 1, Telling whether a core is a P-core or an E-core is based on whether > >this core shares L2 cache with others. > > > > Signed-off-by: Wenzhuo Lu > > Side note: > This tool only exists because of the lack of a simple tool at the time. > Looking around, I found that there is a tool 'lstopo' under the hwloc package > that gives output in many formats including graphical and seems to do a better > job than the DPDK python script. > > Not sure how much farther DPDK should go in this area? > Really should be a distro tool. Many thanks for your review and comments. I have to say I'm a green hand in this field; I just imitated the existing code to write mine. So, I'm still trying to understand and handle the comments :) It's better to understand more about our opinion of this script before sending a v2 patch. I've used 'lstopo'. It's a great tool. In my opinion, considering there are Linux tools to show all kinds of information, the reason that DPDK has its own tool is to summarize and emphasize the information that is important to DPDK. Here it's that some cores are more powerful than others. When the users use a testpmd-like APP, they can choose the appropriate cores after DPDK reminds them about the difference between cores. Adding Thomas for more suggestions. Thanks.
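[Editorial sketch] The shared-L2 heuristic listed as a known limitation above can be expressed as a pure function. This is a hypothetical illustration in C (the actual tool is a Python script); the per-core L2 cache id would come from something like `/sys/devices/system/cpu/cpu*/cache/index2/id` on Linux.

```c
#include <assert.h>

enum core_kind { CORE_P, CORE_E };

/* Heuristic from the patch: a core that shares its L2 cache with at
 * least one other core is classified as an E-core (E-cores sit in
 * modules sharing one L2); a core with a private L2 is a P-core.
 * `l2_id[i]` is the L2 cache id of core i. */
static enum core_kind classify_core(const int *l2_id, int ncores, int core)
{
    for (int i = 0; i < ncores; i++)
        if (i != core && l2_id[i] == l2_id[core])
            return CORE_E;   /* shares L2 -> part of an E-core module */
    return CORE_P;           /* private L2 -> P-core */
}
```

The limitation the commit message admits is visible here: any other reason for a shared L2 (e.g. SMT siblings on some parts) would also be classified as E-cores by this rule.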
RE: [PATCH v5] enhance NUMA affinity heuristic
> -Original Message- > From: Thomas Monjalon > Sent: April 19, 2023 20:17 > To: You, KaisenX > Cc: dev@dpdk.org; Zhou, YidingX ; > david.march...@redhat.com; Matz, Olivier ; > ferruh.yi...@amd.com; zhou...@loongson.cn; sta...@dpdk.org; > Richardson, Bruce ; jer...@marvell.com; > Burakov, Anatoly > Subject: Re: [PATCH v5] enhance NUMA affinity heuristic > > 13/04/2023 02:56, You, KaisenX: > > From: You, KaisenX > > > From: Thomas Monjalon > > > > > > > > I'm not comfortable with this patch. > > > > > > > > First, there is no comment in the code which helps to understand the logic. > > > > Second, I'm afraid changing the value of the per-core variable > > > > _socket_id may have an impact on some applications. > > > > > > Hi Thomas, I'm sorry to bother you again, but we can't think of a > > better solution for now; would you please give me some suggestions, and > then I will modify it accordingly. > > You need to better explain the logic, > both in the commit message and in code comments. > When that is done, it will be easier to have a discussion with other > maintainers and community experts. > Thank you > Thank you for your reply, I'll explain my patch in more detail next. When a DPDK application is started on only one NUMA node, memory is allocated for only one socket. When interrupt threads use memory, memory may not be found on the socket where the interrupt thread is currently located, and memory has to be reallocated from hugepages; this operation can lead to performance degradation. So my modification is in the function malloc_get_numa_socket, to make sure that the first socket with memory can be returned. If you can accept my explanation and modification, I will send the V6 version to improve the commit message and code comments. > > > Thank you for your reply. > > > First, about comments, I can submit a new patch to add comments to > > > help understand. 
> > > Second, if you do not change the value of the per-core variable_ > > > socket_ id, /lib/eal/common/malloc_heap.c > > > malloc_get_numa_socket(void) > > > { > > > const struct internal_config *conf = > > > eal_get_internal_configuration(); > > > unsigned int socket_id = rte_socket_id(); // The return value of > > > "rte_socket_id()" is 1 > > > unsigned int idx; > > > > > > if (socket_id != (unsigned int)SOCKET_ID_ANY) > > > return socket_id;//so return here > > > > > > This will cause return here, This function returns the socket_id of > > > unallocated memory. > > > > > > If you have a better solution, I can modify it. > > >
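[Editorial sketch] The fallback heuristic described in this thread — prefer the current core's socket, but fall back to the first socket that actually has memory — can be sketched as a pure function. All names are hypothetical simplifications; this is not the actual `malloc_get_numa_socket()` implementation.

```c
#include <assert.h>

#define SOCKET_ID_ANY (-1)

/* Simplified version of the proposed heuristic: return the caller's
 * socket when it has hugepage memory; otherwise return the first
 * socket with memory, avoiding a fresh (slow) hugepage allocation on
 * a memory-less socket. */
static int pick_numa_socket(int current_socket,
                            const unsigned long *mem_per_socket,
                            int nsockets)
{
    if (current_socket != SOCKET_ID_ANY &&
        current_socket < nsockets &&
        mem_per_socket[current_socket] > 0)
        return current_socket;       /* fast path: local memory exists */

    for (int s = 0; s < nsockets; s++)
        if (mem_per_socket[s] > 0)
            return s;                /* first socket with memory */

    return SOCKET_ID_ANY;            /* no memory anywhere */
}
```

This is the scenario from the thread: an application started on one NUMA node allocates memory for only one socket, so an interrupt thread running on the other socket should be steered to the socket that has memory rather than triggering a new hugepage allocation.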
Re: [PATCH v2 06/10] net/octeon_ep: fix DMA incompletion
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch fixes the DMA incompletion 1) Please remove "This patch" in every commit description in this series, as it is quite implicit. 2) Please add a Fixes: tag. 3) Explain what the problem was and how it is being fixed.
Re: [PATCH v2 05/10] net/octeon_ep: support ISM
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch adds ISM specific functionality. See following commit as reference, and update new acronyms like ISM and others at devtools/words-case.txt commit 33c942d19260817502b49403f0baaab6113774b2 Author: Ashwin Sekhar T K Date: Fri Sep 17 16:28:39 2021 +0530 devtools: add Marvell acronyms for commit checks Update word list with Marvell specific acronyms. CPT -> Cryptographic Accelerator Unit CQ -> Completion Queue LBK -> Loopback Interface Unit LMT -> Large Atomic Store Unit MCAM -> Match Content Addressable Memory NIX -> Network Interface Controller Unit NPA -> Network Pool Allocator NPC -> Network Parser and CAM Unit ROC -> Rest Of Chip RQ -> Receive Queue RVU -> Resource Virtualization Unit SQ -> Send Queue SSO -> Schedule Synchronize Order Unit TIM -> Timer Unit Suggested-by: Ferruh Yigit Signed-off-by: Ashwin Sekhar T K Reviewed-by: Jerin Jacob > > Signed-off-by: Sathesh Edara > --- > drivers/net/octeon_ep/cnxk_ep_vf.c| 35 +++-- > drivers/net/octeon_ep/cnxk_ep_vf.h| 12 ++ > drivers/net/octeon_ep/otx2_ep_vf.c| 45 ++--- > drivers/net/octeon_ep/otx2_ep_vf.h| 14 +++ > drivers/net/octeon_ep/otx_ep_common.h | 16 > drivers/net/octeon_ep/otx_ep_ethdev.c | 36 + > drivers/net/octeon_ep/otx_ep_rxtx.c | 56 +-- > 7 files changed, 194 insertions(+), 20 deletions(-) > > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.c > b/drivers/net/octeon_ep/cnxk_ep_vf.c > index 1a92887109..a437ae68cb 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.c > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.c > @@ -2,11 +2,12 @@ > * Copyright(C) 2022 Marvell. 
> */ > > +#include > #include > > #include > #include > - > +#include > #include "cnxk_ep_vf.h" > > static void > @@ -85,6 +86,7 @@ cnxk_ep_vf_setup_iq_regs(struct otx_ep_device *otx_ep, > uint32_t iq_no) > struct otx_ep_instr_queue *iq = otx_ep->instr_queue[iq_no]; > int loop = OTX_EP_BUSY_LOOP_COUNT; > volatile uint64_t reg_val = 0ull; > + uint64_t ism_addr; > > reg_val = oct_ep_read64(otx_ep->hw_addr + > CNXK_EP_R_IN_CONTROL(iq_no)); > > @@ -132,6 +134,19 @@ cnxk_ep_vf_setup_iq_regs(struct otx_ep_device *otx_ep, > uint32_t iq_no) > */ > oct_ep_write64(OTX_EP_CLEAR_SDP_IN_INT_LVLS, >otx_ep->hw_addr + CNXK_EP_R_IN_INT_LEVELS(iq_no)); > + /* Set up IQ ISM registers and structures */ > + ism_addr = (otx_ep->ism_buffer_mz->iova | CNXK_EP_ISM_EN > + | CNXK_EP_ISM_MSIX_DIS) > + + CNXK_EP_IQ_ISM_OFFSET(iq_no); > + rte_write64(ism_addr, (uint8_t *)otx_ep->hw_addr + > + CNXK_EP_R_IN_CNTS_ISM(iq_no)); > + iq->inst_cnt_ism = > + (uint32_t *)((uint8_t *)otx_ep->ism_buffer_mz->addr > ++ CNXK_EP_IQ_ISM_OFFSET(iq_no)); > + otx_ep_err("SDP_R[%d] INST Q ISM virt: %p, dma: 0x%" PRIX64, iq_no, > + (void *)iq->inst_cnt_ism, ism_addr); > + *iq->inst_cnt_ism = 0; > + iq->inst_cnt_ism_prev = 0; > return 0; > } > > @@ -142,6 +157,7 @@ cnxk_ep_vf_setup_oq_regs(struct otx_ep_device *otx_ep, > uint32_t oq_no) > uint64_t oq_ctl = 0ull; > int loop = OTX_EP_BUSY_LOOP_COUNT; > struct otx_ep_droq *droq = otx_ep->droq[oq_no]; > + uint64_t ism_addr; > > /* Wait on IDLE to set to 1, supposed to configure BADDR > * as long as IDLE is 0 > @@ -201,9 +217,22 @@ cnxk_ep_vf_setup_oq_regs(struct otx_ep_device *otx_ep, > uint32_t oq_no) > rte_write32((uint32_t)reg_val, droq->pkts_sent_reg); > > otx_ep_dbg("SDP_R[%d]_sent: %x", oq_no, > rte_read32(droq->pkts_sent_reg)); > - loop = OTX_EP_BUSY_LOOP_COUNT; > + /* Set up ISM registers and structures */ > + ism_addr = (otx_ep->ism_buffer_mz->iova | CNXK_EP_ISM_EN > + | CNXK_EP_ISM_MSIX_DIS) > + + CNXK_EP_OQ_ISM_OFFSET(oq_no); > + rte_write64(ism_addr, (uint8_t 
*)otx_ep->hw_addr + > + CNXK_EP_R_OUT_CNTS_ISM(oq_no)); > + droq->pkts_sent_ism = > + (uint32_t *)((uint8_t *)otx_ep->ism_buffer_mz->addr > ++ CNXK_EP_OQ_ISM_OFFSET(oq_no)); > + otx_ep_err("SDP_R[%d] OQ ISM virt: %p dma: 0x%" PRIX64, > + oq_no, (void *)droq->pkts_sent_ism, ism_addr); > + *droq->pkts_sent_ism = 0; > + droq->pkts_sent_ism_prev = 0; > > - while (((rte_read32(droq->pkts_sent_reg)) != 0ull)) { > + loop = OTX_EP_BUSY_LOOP_COUNT; > + while (((rte_read32(droq->pkts_sent_reg)) != 0ull) && loop--) { > reg_val = rte_read32(droq->pkts_sent_reg); > rte_write32((uint32_t)reg_val, droq->pkts_sent_reg); > rte_delay_ms
Re: [PATCH v2 08/10] net/octeon_ep: support Mailbox between VF and PF
On Wed, Apr 5, 2023 at 7:56 PM Sathesh Edara wrote: > > This patch adds the mailbox communication between > VF and PF and supports the following mailbox > messages. > - Get and set MAC address > - Get link information > - Get stats > - Get and set link status > - Set and get MTU > - Send notification to PF > > Signed-off-by: Sathesh Edara 1) Change "Mailbox" to "mailbox" in subject line 2) Please cross check, Do you need to update new items in doc/guides/nics/features/octeon_ep.ini by adding this new features. See doc/guides/nics/features.rst for list of features. > --- > drivers/net/octeon_ep/cnxk_ep_vf.c| 1 + > drivers/net/octeon_ep/cnxk_ep_vf.h| 12 +- > drivers/net/octeon_ep/meson.build | 1 + > drivers/net/octeon_ep/otx_ep_common.h | 26 +++ > drivers/net/octeon_ep/otx_ep_ethdev.c | 143 +++- > drivers/net/octeon_ep/otx_ep_mbox.c | 309 ++ > drivers/net/octeon_ep/otx_ep_mbox.h | 163 ++ > 7 files changed, 642 insertions(+), 13 deletions(-) > create mode 100644 drivers/net/octeon_ep/otx_ep_mbox.c > create mode 100644 drivers/net/octeon_ep/otx_ep_mbox.h > > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.c > b/drivers/net/octeon_ep/cnxk_ep_vf.c > index a437ae68cb..cadb4ecbf9 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.c > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.c > @@ -8,6 +8,7 @@ > #include > #include > #include > +#include "otx_ep_common.h" > #include "cnxk_ep_vf.h" > > static void > diff --git a/drivers/net/octeon_ep/cnxk_ep_vf.h > b/drivers/net/octeon_ep/cnxk_ep_vf.h > index 072b38ea15..86277449ea 100644 > --- a/drivers/net/octeon_ep/cnxk_ep_vf.h > +++ b/drivers/net/octeon_ep/cnxk_ep_vf.h > @@ -5,7 +5,7 @@ > #define _CNXK_EP_VF_H_ > > #include > -#include "otx_ep_common.h" > + > #define CNXK_CONFIG_XPANSION_BAR 0x38 > #define CNXK_CONFIG_PCIE_CAP 0x70 > #define CNXK_CONFIG_PCIE_DEVCAP 0x74 > @@ -92,6 +92,10 @@ > #define CNXK_EP_R_OUT_BYTE_CNT_START 0x10190 > #define CNXK_EP_R_OUT_CNTS_ISM_START 0x10510 > > +#define CNXK_EP_R_MBOX_PF_VF_DATA_START0x10210 > 
+#define CNXK_EP_R_MBOX_VF_PF_DATA_START0x10230 > +#define CNXK_EP_R_MBOX_PF_VF_INT_START 0x10220 > + > #define CNXK_EP_R_OUT_CNTS(ring)\ > (CNXK_EP_R_OUT_CNTS_START + ((ring) * CNXK_EP_RING_OFFSET)) > > @@ -125,6 +129,12 @@ > #define CNXK_EP_R_OUT_CNTS_ISM(ring) \ > (CNXK_EP_R_OUT_CNTS_ISM_START + ((ring) * CNXK_EP_RING_OFFSET)) > > +#define CNXK_EP_R_MBOX_VF_PF_DATA(ring) \ > + (CNXK_EP_R_MBOX_VF_PF_DATA_START + ((ring) * CNXK_EP_RING_OFFSET)) > + > +#define CNXK_EP_R_MBOX_PF_VF_INT(ring) \ > + (CNXK_EP_R_MBOX_PF_VF_INT_START + ((ring) * CNXK_EP_RING_OFFSET)) > + > /*-- R_OUT Masks */ > #define CNXK_EP_R_OUT_INT_LEVELS_BMODE (1ULL << 63) > #define CNXK_EP_R_OUT_INT_LEVELS_TIMET (32) > diff --git a/drivers/net/octeon_ep/meson.build > b/drivers/net/octeon_ep/meson.build > index a267b60290..e698bf9792 100644 > --- a/drivers/net/octeon_ep/meson.build > +++ b/drivers/net/octeon_ep/meson.build > @@ -8,4 +8,5 @@ sources = files( > 'otx_ep_vf.c', > 'otx2_ep_vf.c', > 'cnxk_ep_vf.c', > +'otx_ep_mbox.c', > ) > diff --git a/drivers/net/octeon_ep/otx_ep_common.h > b/drivers/net/octeon_ep/otx_ep_common.h > index 3beec71968..0bf5454a39 100644 > --- a/drivers/net/octeon_ep/otx_ep_common.h > +++ b/drivers/net/octeon_ep/otx_ep_common.h > @@ -4,6 +4,7 @@ > #ifndef _OTX_EP_COMMON_H_ > #define _OTX_EP_COMMON_H_ > > +#include > > #define OTX_EP_NW_PKT_OP 0x1220 > #define OTX_EP_NW_CMD_OP 0x1221 > @@ -67,6 +68,9 @@ > #define oct_ep_read64(addr) rte_read64_relaxed((void *)(addr)) > #define oct_ep_write64(val, addr) rte_write64_relaxed((val), (void *)(addr)) > > +/* Mailbox maximum data size */ > +#define MBOX_MAX_DATA_BUF_SIZE 320 > + > /* Input Request Header format */ > union otx_ep_instr_irh { > uint64_t u64; > @@ -488,6 +492,18 @@ struct otx_ep_device { > > /* DMA buffer for SDP ISM messages */ > const struct rte_memzone *ism_buffer_mz; > + > + /* Mailbox lock */ > + rte_spinlock_t mbox_lock; > + > + /* Mailbox data */ > + uint8_t mbox_data_buf[MBOX_MAX_DATA_BUF_SIZE]; > + > + /* 
Mailbox data index */ > + int32_t mbox_data_index; > + > + /* Mailbox receive message length */ > + int32_t mbox_rcv_message_len; > }; > > int otx_ep_setup_iqs(struct otx_ep_device *otx_ep, uint32_t iq_no, > @@ -541,6 +557,16 @@ struct otx_ep_buf_free_info { > #define OTX_EP_CLEAR_SLIST_DBELL 0x > #define OTX_EP_CLEAR_SDP_OUT_PKT_CNT 0xF > > +/* Max overhead includes > + * - Ethernet hdr > + * - CRC > + * - nested VLANs > + * - octeon rx info > + */ > +#define OTX_EP_ETH_OVERHEAD \ > + (RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + \ > +(2
Re: [PATCH v2 10/10] net/octeon_ep: set secondary process dev ops
On Wed, Apr 5, 2023 at 7:57 PM Sathesh Edara wrote: > > This patch sets the dev ops and transmit/receive > callbacks for secondary process. Change the commit message to "fix ..." and add a Fixes: tag if this is just a bug fix. BTW, "Multiprocess aware" is missing in doc/guides/nics/features/octeon_ep.ini > > Signed-off-by: Sathesh Edara > --- > drivers/net/octeon_ep/otx_ep_ethdev.c | 22 +++--- > 1 file changed, 19 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c > b/drivers/net/octeon_ep/otx_ep_ethdev.c > index 885fbb475f..a9868909f8 100644 > --- a/drivers/net/octeon_ep/otx_ep_ethdev.c > +++ b/drivers/net/octeon_ep/otx_ep_ethdev.c > @@ -527,9 +527,17 @@ otx_ep_dev_stats_get(struct rte_eth_dev *eth_dev, > static int > otx_ep_dev_close(struct rte_eth_dev *eth_dev) > { > - struct otx_ep_device *otx_epvf = OTX_EP_DEV(eth_dev); > + struct otx_ep_device *otx_epvf; > uint32_t num_queues, q_no; > > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = NULL; > + eth_dev->rx_pkt_burst = NULL; > + eth_dev->tx_pkt_burst = NULL; > + return 0; > + } > + > + otx_epvf = OTX_EP_DEV(eth_dev); > otx_ep_mbox_send_dev_exit(eth_dev); > otx_epvf->fn_list.disable_io_queues(otx_epvf); > num_queues = otx_epvf->nb_rx_queues; > @@ -593,8 +601,12 @@ static const struct eth_dev_ops otx_ep_eth_dev_ops = { > static int > otx_ep_eth_dev_uninit(struct rte_eth_dev *eth_dev) > { > - if (rte_eal_process_type() != RTE_PROC_PRIMARY) > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = NULL; > + eth_dev->rx_pkt_burst = NULL; > + eth_dev->tx_pkt_burst = NULL; > return 0; > + } > > eth_dev->dev_ops = NULL; > eth_dev->rx_pkt_burst = NULL; > @@ -642,8 +654,12 @@ otx_ep_eth_dev_init(struct rte_eth_dev *eth_dev) > struct rte_ether_addr vf_mac_addr; > > /* Single process support */ > - if (rte_eal_process_type() != RTE_PROC_PRIMARY) > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + eth_dev->dev_ops = &otx_ep_eth_dev_ops; > + 
eth_dev->rx_pkt_burst = &otx_ep_recv_pkts; > + eth_dev->tx_pkt_burst = &otx2_ep_xmit_pkts; > return 0; > + } > > rte_eth_copy_pci_info(eth_dev, pdev); > otx_epvf->eth_dev = eth_dev; > -- > 2.31.1 >
Re: [PATCH v2] eventdev/timer: fix timeout event wait behavior
On Thu, Apr 13, 2023 at 1:31 AM Carrillo, Erik G wrote: > > > -Original Message- > > From: Shijith Thotton > > Sent: Tuesday, March 21, 2023 12:20 AM > > To: Carrillo, Erik G ; jer...@marvell.com > > Cc: Shijith Thotton ; dev@dpdk.org; > > pbhagavat...@marvell.com; sta...@dpdk.org > > Subject: [PATCH v2] eventdev/timer: fix timeout event wait behavior > > > > Improved the accuracy and consistency of timeout event wait behavior by > > refactoring it. Previously, the delay function used for waiting could be > > inaccurate, leading to inconsistent results. This commit updates the wait > > behavior to use a timeout-based approach, enabling the wait for the exact > > number of timer ticks before proceeding. > > > > The new function timeout_event_dequeue mimics the behavior of the > > tested systems closely. It dequeues timer expiry events until either the > > expected number of events have been dequeued or the specified time has > > elapsed. The WAIT_TICKS macro defines the waiting behavior based on the > > type of timer being used (software or hardware). > > > > Fixes: d1f3385d0076 ("test: add event timer adapter auto-test") > > > > Signed-off-by: Shijith Thotton > Thanks for the update. > > Acked-by: Erik Gabriel Carrillo Applied to dpdk-next-net-eventdev/for-main. Thanks
[PATCH 0/4] app: introduce testgraph application
This patch series introduces testgraph application that verifies graph architecture, it provides an infra to verify the graph & node libraries and scale the test coverage by adding newer configurations to exercise various graph topologies & graph-walk models required by the DPDK applications. Also this series adds two new nodes (punt_kernel & kernel_recv) to the node library. Vamsi Attunuru (4): node: add pkt punt to kernel node node: add a node to receive pkts from kernel node: remove hardcoded node next details app: add testgraph application app/meson.build |1 + app/test-graph/cmdline.c| 212 + app/test-graph/cmdline_graph.c | 297 ++ app/test-graph/cmdline_graph.h | 19 + app/test-graph/meson.build | 17 + app/test-graph/parameters.c | 157 app/test-graph/testgraph.c | 1309 +++ app/test-graph/testgraph.h | 92 ++ doc/guides/prog_guide/graph_lib.rst | 17 + doc/guides/tools/index.rst |1 + doc/guides/tools/testgraph.rst | 131 +++ lib/node/ethdev_rx.c|2 - lib/node/kernel_recv.c | 277 ++ lib/node/kernel_recv_priv.h | 74 ++ lib/node/meson.build|2 + lib/node/punt_kernel.c | 125 +++ lib/node/punt_kernel_priv.h | 36 + 17 files changed, 2767 insertions(+), 2 deletions(-) create mode 100644 app/test-graph/cmdline.c create mode 100644 app/test-graph/cmdline_graph.c create mode 100644 app/test-graph/cmdline_graph.h create mode 100644 app/test-graph/meson.build create mode 100644 app/test-graph/parameters.c create mode 100644 app/test-graph/testgraph.c create mode 100644 app/test-graph/testgraph.h create mode 100644 doc/guides/tools/testgraph.rst create mode 100644 lib/node/kernel_recv.c create mode 100644 lib/node/kernel_recv_priv.h create mode 100644 lib/node/punt_kernel.c create mode 100644 lib/node/punt_kernel_priv.h -- 2.25.1
[PATCH 1/4] node: add pkt punt to kernel node
Patch adds a node to punt the packets to kernel over a raw socket. Signed-off-by: Vamsi Attunuru --- doc/guides/prog_guide/graph_lib.rst | 10 +++ lib/node/meson.build| 1 + lib/node/punt_kernel.c | 125 lib/node/punt_kernel_priv.h | 36 4 files changed, 172 insertions(+) diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst index 1cfdc86433..b3b5b14827 100644 --- a/doc/guides/prog_guide/graph_lib.rst +++ b/doc/guides/prog_guide/graph_lib.rst @@ -392,3 +392,13 @@ null This node ignores the set of objects passed to it and reports that all are processed. + +punt_kernel +~~~ +This node punts the packets to kernel using a raw socket interface. For sending +the received packets, raw socket uses the packet's destination IP address in +sockaddr_in address structure and node uses ``sendto`` function to send data +on the raw socket. + +Aftering sending the burst of packets to kernel, this node redirects the same +objects to pkt_drop node to free up the packet buffers. diff --git a/lib/node/meson.build b/lib/node/meson.build index dbdf673c86..48c2da73f7 100644 --- a/lib/node/meson.build +++ b/lib/node/meson.build @@ -17,6 +17,7 @@ sources = files( 'null.c', 'pkt_cls.c', 'pkt_drop.c', +'punt_kernel.c', ) headers = files('rte_node_ip4_api.h', 'rte_node_eth_api.h') # Strict-aliasing rules are violated by uint8_t[] to context size casts. diff --git a/lib/node/punt_kernel.c b/lib/node/punt_kernel.c new file mode 100644 index 00..e5dd15b759 --- /dev/null +++ b/lib/node/punt_kernel.c @@ -0,0 +1,125 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "node_private.h" +#include "punt_kernel_priv.h" + +static __rte_always_inline void +punt_kernel_process_mbuf(struct rte_node *node, struct rte_mbuf **mbufs, uint16_t cnt) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + struct sockaddr_in sin = {0}; + struct rte_ipv4_hdr *ip4; + size_t len; + char *buf; + int i; + + for (i = 0; i < cnt; i++) { + ip4 = rte_pktmbuf_mtod(mbufs[i], struct rte_ipv4_hdr *); + len = rte_pktmbuf_data_len(mbufs[i]); + buf = (char *)ip4; + + sin.sin_family = AF_INET; + sin.sin_port = 0; + sin.sin_addr.s_addr = ip4->dst_addr; + + if (sendto(ctx->sock, buf, len, 0, (struct sockaddr *)&sin, sizeof(sin)) < 0) + node_err("punt_kernel", "Unable to send packets: %s\n", strerror(errno)); + } +} + +static uint16_t +punt_kernel_node_process(struct rte_graph *graph __rte_unused, struct rte_node *node, void **objs, +uint16_t nb_objs) +{ + struct rte_mbuf **pkts = (struct rte_mbuf **)objs; + uint16_t obj_left = nb_objs; + +#define PREFETCH_CNT 4 + + while (obj_left >= 12) { + /* Prefetch next-next mbufs */ + rte_prefetch0(pkts[8]); + rte_prefetch0(pkts[9]); + rte_prefetch0(pkts[10]); + rte_prefetch0(pkts[11]); + + /* Prefetch next mbuf data */ + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[4], void *, pkts[4]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[5], void *, pkts[5]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[6], void *, pkts[6]->l2_len)); + rte_prefetch0(rte_pktmbuf_mtod_offset(pkts[7], void *, pkts[7]->l2_len)); + + punt_kernel_process_mbuf(node, pkts, PREFETCH_CNT); + + obj_left -= PREFETCH_CNT; + pkts += PREFETCH_CNT; + } + + while (obj_left > 0) { + punt_kernel_process_mbuf(node, pkts, 1); + + obj_left--; + pkts++; + } + + rte_node_next_stream_move(graph, node, PUNT_KERNEL_NEXT_PKT_DROP); + + return nb_objs; +} + +static int +punt_kernel_node_init(const struct rte_graph *graph 
__rte_unused, struct rte_node *node) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + + ctx->sock = socket(AF_INET, SOCK_RAW, IPPROTO_RAW); + if (ctx->sock < 0) + node_err("punt_kernel", "Unable to open RAW socket\n"); + + return 0; +} + +static void +punt_kernel_node_fini(const struct rte_graph *graph __rte_unused, struct rte_node *node) +{ + punt_kernel_node_ctx_t *ctx = (punt_kernel_node_ctx_t *)node->ctx; + + if (ctx->sock >= 0) { + close(ctx->sock); + ctx->sock = -1; + } +} + +static struct rte_node_register punt_kernel_node_base = { + .process = punt_kernel_node_process, + .name = "punt_kernel", + + .init =
[PATCH 2/4] node: add a node to receive pkts from kernel
Patch adds a node to receive packets from kernel over a raw socket. Signed-off-by: Vamsi Attunuru --- doc/guides/prog_guide/graph_lib.rst | 7 + lib/node/kernel_recv.c | 277 lib/node/kernel_recv_priv.h | 74 lib/node/meson.build| 1 + 4 files changed, 359 insertions(+) diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst index b3b5b14827..1057f16de8 100644 --- a/doc/guides/prog_guide/graph_lib.rst +++ b/doc/guides/prog_guide/graph_lib.rst @@ -402,3 +402,10 @@ on the raw socket. Aftering sending the burst of packets to kernel, this node redirects the same objects to pkt_drop node to free up the packet buffers. + +kernel_recv +~~~ +This node receives packets from kernel over a raw socket interface. Uses ``poll`` +function to poll on the socket fd for ``POLLIN`` events to read the packets from +raw socket to stream buffer and does ``rte_node_next_stream_move()`` when there +are received packets. diff --git a/lib/node/kernel_recv.c b/lib/node/kernel_recv.c new file mode 100644 index 00..361dcc3b5f --- /dev/null +++ b/lib/node/kernel_recv.c @@ -0,0 +1,277 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. 
+ */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ethdev_rx_priv.h" +#include "kernel_recv_priv.h" +#include "node_private.h" + +static struct kernel_recv_node_main kernel_recv_main; + +static inline struct rte_mbuf * +alloc_rx_mbuf(kernel_recv_node_ctx_t *ctx) +{ + kernel_recv_info_t *rx = ctx->recv_info; + + if (rx->idx >= rx->cnt) { + uint16_t cnt; + + rx->idx = 0; + rx->cnt = 0; + + cnt = rte_pktmbuf_alloc_bulk(ctx->pktmbuf_pool, rx->rx_bufs, KERN_RECV_CACHE_COUNT); + if (cnt <= 0) + return NULL; + + rx->cnt = cnt; + } + + return rx->rx_bufs[rx->idx++]; +} + +static inline void +mbuf_update(struct rte_mbuf **mbufs, uint16_t nb_pkts) +{ + struct rte_net_hdr_lens hdr_lens; + struct rte_mbuf *m; + int i; + + for (i = 0; i < nb_pkts; i++) { + m = mbufs[i]; + + m->packet_type = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK); + + m->ol_flags = 0; + m->tx_offload = 0; + + m->l2_len = hdr_lens.l2_len; + m->l3_len = hdr_lens.l3_len; + m->l4_len = hdr_lens.l4_len; + } +} + +static uint16_t +recv_pkt_parse(void **objs, uint16_t nb_pkts) +{ + uint16_t pkts_left = nb_pkts; + struct rte_mbuf **pkts; + int i; + + pkts = (struct rte_mbuf **)objs; + + if (pkts_left >= 4) { + for (i = 0; i < 4; i++) + rte_prefetch0(rte_pktmbuf_mtod(pkts[i], void *)); + } + + while (pkts_left >= 12) { + /* Prefetch next-next mbufs */ + rte_prefetch0(pkts[8]); + rte_prefetch0(pkts[9]); + rte_prefetch0(pkts[10]); + rte_prefetch0(pkts[11]); + + /* Prefetch next mbuf data */ + rte_prefetch0(rte_pktmbuf_mtod(pkts[4], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[5], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[6], void *)); + rte_prefetch0(rte_pktmbuf_mtod(pkts[7], void *)); + + /* Extract ptype of mbufs */ + mbuf_update(pkts, 4); + + pkts += 4; + pkts_left -= 4; + } + + if (pkts_left > 0) + mbuf_update(pkts, pkts_left); + + return nb_pkts; +} + +static uint16_t 
+kernel_recv_node_do(struct rte_graph *graph, struct rte_node *node, kernel_recv_node_ctx_t *ctx) +{ + kernel_recv_info_t *rx; + uint16_t next_index; + int fd; + + rx = ctx->recv_info; + next_index = rx->cls_next; + + fd = rx->sock; + if (fd > 0) { + struct rte_mbuf **mbufs; + uint16_t len = 0, count = 0; + int nb_cnt, i; + + nb_cnt = (node->size >= RTE_GRAPH_BURST_SIZE) ? RTE_GRAPH_BURST_SIZE : node->size; + + mbufs = (struct rte_mbuf **)node->objs; + for (i = 0; i < nb_cnt; i++) { + struct rte_mbuf *m = alloc_rx_mbuf(ctx); + + if (!m) + break; + + len = read(fd, rte_pktmbuf_mtod(m, char *), rte_pktmbuf_tailroom(m)); + if (len == 0 || len == 0x) { + rte_pktmbuf_free(m); + if (rx->idx <= 0) + node_dbg("kernel_recv", "rx_mbuf array is empty\n"); +
[PATCH 3/4] node: remove hardcoded node next details
For the ethdev_rx node, node_next details can be populated at node cloning time, and the same get assigned to the node context structure during node initialization. The patch removes the overriding of node_next details in node init(). Signed-off-by: Vamsi Attunuru --- lib/node/ethdev_rx.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/lib/node/ethdev_rx.c b/lib/node/ethdev_rx.c index a19237b42f..85816c489c 100644 --- a/lib/node/ethdev_rx.c +++ b/lib/node/ethdev_rx.c @@ -194,8 +194,6 @@ ethdev_rx_node_init(const struct rte_graph *graph, struct rte_node *node) RTE_VERIFY(elem != NULL); - ctx->cls_next = ETHDEV_RX_NEXT_PKT_CLS; - /* Check and setup ptype */ return ethdev_ptype_setup(ctx->port_id, ctx->queue_id); } -- 2.25.1
[PATCH 4/4] app: add testgraph application
Patch adds test-graph application to validate graph and node libraries. Signed-off-by: Vamsi Attunuru --- app/meson.build|1 + app/test-graph/cmdline.c | 212 ++ app/test-graph/cmdline_graph.c | 297 app/test-graph/cmdline_graph.h | 19 + app/test-graph/meson.build | 17 + app/test-graph/parameters.c| 157 app/test-graph/testgraph.c | 1309 app/test-graph/testgraph.h | 92 +++ doc/guides/tools/index.rst |1 + doc/guides/tools/testgraph.rst | 131 10 files changed, 2236 insertions(+) diff --git a/app/meson.build b/app/meson.build index 74d2420f67..6c7b24e604 100644 --- a/app/meson.build +++ b/app/meson.build @@ -22,6 +22,7 @@ apps = [ 'test-eventdev', 'test-fib', 'test-flow-perf', +'test-graph', 'test-gpudev', 'test-mldev', 'test-pipeline', diff --git a/app/test-graph/cmdline.c b/app/test-graph/cmdline.c new file mode 100644 index 00..a07a8a24f9 --- /dev/null +++ b/app/test-graph/cmdline.c @@ -0,0 +1,212 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2023 Marvell International Ltd. + */ + +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cmdline_graph.h" +#include "testgraph.h" + +static struct cmdline *testgraph_cl; +static cmdline_parse_ctx_t *main_ctx; + +/* *** Help command with introduction. 
*** */ +struct cmd_help_brief_result { + cmdline_fixed_string_t help; +}; + +static void +cmd_help_brief_parsed(__rte_unused void *parsed_result, struct cmdline *cl, __rte_unused void *data) +{ + cmdline_printf(cl, + "\n" + "Help is available for the following sections:\n\n" + "help control: Start and stop graph walk.\n" + "help display: Displaying port, stats and config " + "information.\n" + "help config : Configuration information.\n" + "help all: All of the above sections.\n\n"); +} + +static cmdline_parse_token_string_t cmd_help_brief_help = + TOKEN_STRING_INITIALIZER(struct cmd_help_brief_result, help, "help"); + +static cmdline_parse_inst_t cmd_help_brief = { + .f = cmd_help_brief_parsed, + .data = NULL, + .help_str = "help: Show help", + .tokens = { + (void *)&cmd_help_brief_help, + NULL, + }, +}; + +/* *** Help command with help sections. *** */ +struct cmd_help_long_result { + cmdline_fixed_string_t help; + cmdline_fixed_string_t section; +}; + +static void +cmd_help_long_parsed(void *parsed_result, struct cmdline *cl, __rte_unused void *data) +{ + int show_all = 0; + struct cmd_help_long_result *res = parsed_result; + + if (!strcmp(res->section, "all")) + show_all = 1; + + if (show_all || !strcmp(res->section, "control")) { + + cmdline_printf(cl, "\n" + "Control forwarding:\n" + "---\n\n" + + "start graph_walk\n" + " Start graph_walk on worker threads.\n\n" + + "stop graph_walk\n" + " Stop worker threads from running graph_walk.\n\n" + + "quit\n" + "Quit to prompt.\n\n"); + } + + if (show_all || !strcmp(res->section, "display")) { + + cmdline_printf(cl, + "\n" + "Display:\n" + "\n\n" + + "show node_list\n" + " Display the list of supported nodes.\n\n" + + "show graph_stats\n" + " Display the node statistics of graph cluster.\n\n"); + } + + if (show_all || !strcmp(res->section, "config")) { + cmdline_printf(cl, "\n" + "Configuration:\n" + "--\n" + "set lcore_config (port_id0,rxq0,lcore_idX),..." 
+ ".,(port_idX,rxqX,lcoreidY)\n" + " Set lcore configuration.\n\n" + + "create_graph (node0_name,node1_name,...,nodeX_name)\n" + " Create graph instances using the provided node details.\n\n" + + "destroy_graph\n" + " Destroy the graph instances.\n\n"); + } +} + +static cmdline_parse_tok
[PATCH] net/cpfl: update the doc of CPFL PMD
This patch updates cpfl.rst doc, adjusting the order of chapters referring to IDPF PMD doc. Signed-off-by: Mingxia Liu --- doc/guides/nics/cpfl.rst | 44 +--- 1 file changed, 23 insertions(+), 21 deletions(-) diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst index 91dbec306d..d25db088eb 100644 --- a/doc/guides/nics/cpfl.rst +++ b/doc/guides/nics/cpfl.rst @@ -20,27 +20,6 @@ Follow the DPDK :doc:`../linux_gsg/index` to setup the basic DPDK environment. To get better performance on Intel platforms, please follow the :doc:`../linux_gsg/nic_perf_intel_platform`. -Features - - -Vector PMD -~~ - -Vector path for Rx and Tx path are selected automatically. -The paths are chosen based on 2 conditions: - -- ``CPU`` - - On the x86 platform, the driver checks if the CPU supports AVX512. - If the CPU supports AVX512 and EAL argument ``--force-max-simd-bitwidth`` - is set to 512, AVX512 paths will be chosen. - -- ``Offload features`` - - The supported HW offload features are described in the document cpfl.ini, - A value "P" means the offload feature is not supported by vector path. - If any not supported features are used, cpfl vector PMD is disabled - and the scalar paths are chosen. Configuration - @@ -104,3 +83,26 @@ Driver compilation and testing -- Refer to the document :doc:`build_and_test` for details. + + +Features + + +Vector PMD +~~ + +Vector path for Rx and Tx path are selected automatically. +The paths are chosen based on 2 conditions: + +- ``CPU`` + + On the x86 platform, the driver checks if the CPU supports AVX512. + If the CPU supports AVX512 and EAL argument ``--force-max-simd-bitwidth`` + is set to 512, AVX512 paths will be chosen. + +- ``Offload features`` + + The supported HW offload features are described in the document cpfl.ini, + A value "P" means the offload feature is not supported by vector path. + If any not supported features are used, cpfl vector PMD is disabled + and the scalar paths are chosen. -- 2.34.1
NVIDIA roadmap for 23.07
Please find below the NVIDIA roadmap for the 23.07 release: A. rte_flow new APIs = [1] Update an existing rule's actions in a flow template API table. Value: The user can update an existing flow's actions in flight directly, without removing the old rule entry and then inserting a new one. The action update can have a different action list. Updating the actions of a given flow entry supports all action types, but only with the optimize-by-index matcher. ethdev: add flow rule actions update API: https://patchwork.dpdk.org/project/dpdk/patch/20230418195807.352514-1-akozy...@nvidia.com/ [2] Support Quota flow action and item Value: allows one or multiple flows to share a volume quota, in which traffic usage can be monitored by the application to ensure usage stays within a predefined limit. The Quota action limits traffic according to a pre-defined configuration. The quota action updates the ‘quota’ value and sets the packet quota state (PASS or BLOCK). The quota item matches on the flow quota state. ethdev: add quota flow action and item: https://patches.dpdk.org/project/dpdk/patch/20221221073547.988-2-getel...@nvidia.com/ [3] Add IPv6 extension push/remove. app/testpmd: add IPv6 extension push remove cli: https://patchwork.dpdk.org/project/dpdk/patch/20230417092540.2617450-3-rongw...@nvidia.com/ ethdev: add IPv6 extension push remove action: https://patchwork.dpdk.org/project/dpdk/patch/20230417022630.2377505-2-rongw...@nvidia.com/ Add new flow actions to support pushing/removing an IPv6 extension header. [4] Flow template API Geneve plus options support Value: Already supported in the non-template API; this adds support to the template API. Users need to support more than one TLV option header in their networks. The private and dedicated APIs are used to handle the parsers on CX-* and BF-*. This will not only provide compatibility but also extend the functionality compared to the non-template API.
E.g., more than one TLV option header can be supported, more fields can be modified, and the source and destination can both be option headers. This supports the standard and customized Geneve and Geneve options. ethdev: extend modify field API (for MPLS and GENEVE): https://patchwork.dpdk.org/project/dpdk/cover/20230420092145.522389-1-michae...@nvidia.com/ [5] Local / remote mirroring support in the flow template API Value: Parity of the mirroring support with the non-template API. In addition, the template API also supports multiple-port mirroring. Multiple destinations can be supported, and local and remote mirroring can both be in the same rule. This provides more diagnostic and lawful-interception abilities to cloud infrastructure applications. ethdev: add indirect list flow action: https://patches.dpdk.org/project/dpdk/patch/20230418172144.24365-1-getel...@nvidia.com/ [6] vRoCE feature: be able to monitor cloud guest RoCE (RDMA over Converged Ethernet) stats on the cloud provider side (ECN/CNP) Value: Guest RoCE traffic (UDP dport 4791) needs matching and monitoring support on the provider application side. With the new item support, RoCE traffic with specific patterns can be counted with the COUNT action, and the statistics are visible to the provider on the host or bare-metal DPU side. More actions can be supported as well, not only the counter. Users can count the number of RoCE packets. ethdev: add flow item for RoCE infiniband BT: http://patches.dpdk.org/project/dpdk/patch/20230324032615.4141031-1-dongz...@nvidia.com/ B. Net/mlx5 PMD updates = [1] Protection against non-ascending burst order. Value: With accurate scheduling, packets may be rescheduled before sending; it is the user’s responsibility to ensure the timestamps of packets to be rescheduled are in ascending order when pushing the WQE, or else the hardware would not be able to perform scheduling correctly. A software counter has been added to record application errors in such cases.
It will give more insight and help with debugging when such an error occurs. net/mlx5: introduce Tx datapath tracing: http://patches.dpdk.org/project/dpdk/cover/20230420100803.494-1-viachesl...@nvidia.com/ [2] Add a flow offload action to route packets to the kernel. Value: Parity with the support in the non-template API. It allows an application to re-route packets directly to the kernel without software involvement. net/mlx5/hws: support dest root table action: https://patches.dpdk.org/project/dpdk/patch/20230320141229.104748-1-hamd...@nvidia.com/ [3] Forward to SW packets that are too big for encap (match on size > X) Value: In some customer environments it is not possible to control the MTU size, and if packets are about to be encapsulated, their final length might exceed the MTU. This can be used to identify packets that are longer than a predefined size. Add support for IP length range matching (IPv4/IPv6) in the flow template API.