RE: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Hemant Agrawal

On 08-Jan-24 1:28 PM, jer...@marvell.com wrote:
> From: Jerin Jacob 
>
> Define qualification criteria for external library
> based on a techboard meeting minutes [1] and past
> learnings from mailing list discussion.
>
> [1]
> http://mails.dpdk.org/archives/dev/2019-June/135847.html
> https://mails.dpdk.org/archives/dev/2024-January/284849.html
>
> Signed-off-by: Jerin Jacob 
> Acked-by: Thomas Monjalon 
> ---
>  doc/guides/contributing/index.rst |  1 +
>  .../contributing/library_dependency.rst   | 52 +++
>  2 files changed, 53 insertions(+)
>  create mode 100644 doc/guides/contributing/library_dependency.rst
>
> v5:
> - Added "Dependency nature" section based on Stephen's input
>
> v4:
> - Address Thomas comments from
https://patches.dpdk.org/project/dpdk/patch/20240105121215.3950532-1-jerinj@
marvell.com/
>
> v3:
> - Updated the content based on TB discussion which is documented at
> https://mails.dpdk.org/archives/dev/2024-January/284849.html
>
> v2:
> - Added "Meson build integration" and "Code readability" sections.
>
>
> diff --git a/doc/guides/contributing/index.rst
b/doc/guides/contributing/index.rst
> index dcb9b1fbf0..e5a8c2b0a3 100644
> --- a/doc/guides/contributing/index.rst
> +++ b/doc/guides/contributing/index.rst
> @@ -15,6 +15,7 @@ Contributor's Guidelines
>  documentation
>  unit_test
>  new_library
> +library_dependency
>  patches
>  vulnerability
>  stable
> diff --git a/doc/guides/contributing/library_dependency.rst
b/doc/guides/contributing/library_dependency.rst
> new file mode 100644
> index 00..94025fdf60
> --- /dev/null
> +++ b/doc/guides/contributing/library_dependency.rst
> @@ -0,0 +1,52 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2024 Marvell.
> +
> +External Library dependency
> +===
> +
> +This document defines the qualification criteria for external libraries
that may be
> +used as dependencies in DPDK drivers or libraries.
> +
> +#. **Documentation:**
> +
> +   - Must have adequate documentation for the steps to build it.
> +   - Must have clear license documentation on distribution and usage
aspects of external library.
> +
> +#. **Free availability:**
> +
> +   - The library must be freely available to build in either source or
binary form.
> +   - It shall be downloadable from a direct link. There shall not be any
requirement to explicitly
> + login or sign a user agreement.
> +
> +#. **Usage License:**
> +
> +   - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g.,
GPLv3) licenses are acceptable.
> +   - In the case of a permissive license, automatic inclusion in the
build process is assumed.
> + For non-permissive licenses, an additional build configuration
option is required.
> +
> +#. **Distributions License:**
> +
> +   - No specific constraints beyond documentation.

Though we are not mandatory open distribution. However we should ask for the
defining the distribution aspect clearly in the library.

>
> +
> +#. **Compiler compatibility:**
> +
> +   - The library must be able to compile with a DPDK supported compiler
for the given execution
> + environment.
> + For example, for Linux, the library must be able to compile with GCC
and/or clang.
> +   - Library may be limited to a specific OS.
> +
> +#. **Meson build integration:**
> +
> +   - The library must have standard method like ``pkg-config`` for
seamless integration with
> + DPDK's build environment.
> +
> +#. **Code readability:**
> +
> +   - Optional dependencies should use stubs to minimize ``ifdef``
clutter, promoting improved
> + code readability.
> +
> +#. **Dependency nature:**
> +
> +   - The external library dependency should be optional.
> + i.e Missing external library must not impact the core functionality
of the DPDK, specific
> + library and/or driver will not built if dependencies are not meet.


smime.p7s
Description: S/MIME cryptographic signature


[PATCH] app/dma-perf: support bi-directional transfer

2024-01-08 Thread Amit Prakash Shukla
Adds bi-directional DMA transfer support to test performance.

Signed-off-by: Amit Prakash Shukla 
---
Depends-on: series-30357 ("PCI Dev and SG copy support")

 app/test-dma-perf/benchmark.c | 89 +--
 app/test-dma-perf/config.ini  |  5 ++
 app/test-dma-perf/main.c  | 21 +++--
 app/test-dma-perf/main.h  |  1 +
 4 files changed, 88 insertions(+), 28 deletions(-)

diff --git a/app/test-dma-perf/benchmark.c b/app/test-dma-perf/benchmark.c
index 4530bd98ce..91ba0f4718 100644
--- a/app/test-dma-perf/benchmark.c
+++ b/app/test-dma-perf/benchmark.c
@@ -144,12 +144,19 @@ cache_flush_buf(__rte_unused struct rte_mbuf **array,
 
 static int
 vchan_data_populate(uint32_t dev_id, struct rte_dma_vchan_conf *qconf,
-   struct test_configure *cfg)
+   struct test_configure *cfg, uint16_t dev_num)
 {
struct rte_dma_info info;
 
qconf->direction = cfg->transfer_dir;
 
+   /* If its a bi-directional test, configure odd device for inbound dma
+* transfer and even device for outbound dma transfer.
+*/
+   if (cfg->is_bidir)
+   qconf->direction = (dev_num % 2) ? RTE_DMA_DIR_MEM_TO_DEV :
+  RTE_DMA_DIR_DEV_TO_MEM;
+
rte_dma_info_get(dev_id, &info);
if (!(RTE_BIT64(qconf->direction) & info.dev_capa))
return -1;
@@ -181,14 +188,15 @@ vchan_data_populate(uint32_t dev_id, struct 
rte_dma_vchan_conf *qconf,
 
 /* Configuration of device. */
 static void
-configure_dmadev_queue(uint32_t dev_id, struct test_configure *cfg, uint8_t 
ptrs_max)
+configure_dmadev_queue(uint32_t dev_id, struct test_configure *cfg, uint8_t 
ptrs_max,
+  uint16_t dev_num)
 {
uint16_t vchan = 0;
struct rte_dma_info info;
struct rte_dma_conf dev_config = { .nb_vchans = 1 };
struct rte_dma_vchan_conf qconf = { 0 };
 
-   if (vchan_data_populate(dev_id, &qconf, cfg) != 0)
+   if (vchan_data_populate(dev_id, &qconf, cfg, dev_num) != 0)
rte_exit(EXIT_FAILURE, "Error with vchan data populate.\n");
 
if (rte_dma_configure(dev_id, &dev_config) != 0)
@@ -235,7 +243,7 @@ config_dmadevs(struct test_configure *cfg)
}
 
ldm->dma_ids[i] = dev_id;
-   configure_dmadev_queue(dev_id, cfg, ptrs_max);
+   configure_dmadev_queue(dev_id, cfg, ptrs_max, nb_dmadevs);
++nb_dmadevs;
}
 
@@ -433,7 +441,8 @@ setup_memory_env(struct test_configure *cfg,
 struct rte_dma_sge **src_sges, struct rte_dma_sge 
**dst_sges)
 {
static struct rte_mbuf_ext_shared_info *ext_buf_info;
-   unsigned int buf_size = cfg->buf_size.cur;
+   unsigned int cur_buf_size = cfg->buf_size.cur;
+   unsigned int buf_size = cur_buf_size + RTE_PKTMBUF_HEADROOM;
unsigned int nr_sockets;
uint32_t nr_buf = cfg->nr_buf;
uint32_t i;
@@ -449,7 +458,7 @@ setup_memory_env(struct test_configure *cfg,
nr_buf,
0,
0,
-   buf_size + RTE_PKTMBUF_HEADROOM,
+   buf_size,
cfg->src_numa_node);
if (src_pool == NULL) {
PRINT_ERR("Error with source mempool creation.\n");
@@ -460,7 +469,7 @@ setup_memory_env(struct test_configure *cfg,
nr_buf,
0,
0,
-   buf_size + RTE_PKTMBUF_HEADROOM,
+   buf_size,
cfg->dst_numa_node);
if (dst_pool == NULL) {
PRINT_ERR("Error with destination mempool creation.\n");
@@ -490,8 +499,8 @@ setup_memory_env(struct test_configure *cfg,
}
 
for (i = 0; i < nr_buf; i++) {
-   memset(rte_pktmbuf_mtod((*srcs)[i], void *), rte_rand(), 
buf_size);
-   memset(rte_pktmbuf_mtod((*dsts)[i], void *), 0, buf_size);
+   memset(rte_pktmbuf_mtod((*srcs)[i], void *), rte_rand(), 
cur_buf_size);
+   memset(rte_pktmbuf_mtod((*dsts)[i], void *), 0, cur_buf_size);
}
 
if (cfg->transfer_dir == RTE_DMA_DIR_DEV_TO_MEM ||
@@ -503,24 +512,38 @@ setup_memory_env(struct test_configure *cfg,
}
}
 
-   if (cfg->transfer_dir == RTE_DMA_DIR_DEV_TO_MEM) {
+   if (cfg->transfer_dir == RTE_DMA_DIR_DEV_TO_MEM && !cfg->is_bidir) {
+   ext_buf_info->free_cb = dummy_free_ext_buf;
+   ext_buf_info->fcb_opaque = NULL;
+   for (i = 0; i < nr_buf; i++) {
+   /* Using mbuf structure to hold remote iova address. */
+   rte_pktmbuf_attach_extbuf((*srcs)[i], (void 
*)(cfg->raddr + (i * buf_size)),
+ (rte_iova_t)(cfg->raddr + (i 
* buf_size)), 0,
+

Re: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Jerin Jacob
On Mon, Jan 8, 2024 at 1:47 PM Hemant Agrawal  wrote:
>
>
> On 08-Jan-24 1:28 PM, jer...@marvell.com wrote:
> > From: Jerin Jacob 
> >
> > Define qualification criteria for external library
> > based on a techboard meeting minutes [1] and past
> > learnings from mailing list discussion.
> >
> > [1]
> > http://mails.dpdk.org/archives/dev/2019-June/135847.html
> > https://mails.dpdk.org/archives/dev/2024-January/284849.html
> >
> > Signed-off-by: Jerin Jacob 
> > Acked-by: Thomas Monjalon 

> > +#. **Distributions License:**
> > +
> > +   - No specific constraints beyond documentation.
>
> Though we are not mandatory open distribution. However we should ask for the
> defining the distribution aspect clearly in the library.

How about following then,

No specific constraints, but clear documentation on distribution usage
aspects is required.

If not, please suggest the exact wording.


RE: [PATCH v1] net/axgbe: read and save the port property register

2024-01-08 Thread Sebastian, Selwin
[AMD Official Use Only - General]

Acked-by: Selwin Sebastian

-Original Message-
From: Ande, Venkat Kumar 
Sent: Friday, January 5, 2024 5:03 PM
To: dev@dpdk.org
Cc: Sebastian, Selwin ; Ande, Venkat Kumar 

Subject: [PATCH v1] net/axgbe: read and save the port property register

From: Venkat Kumar Ande 

Read and save the port property registers once during the device probe and then 
use the saved values as they are needed.

Signed-off-by: Venkat Kumar Ande 
---
 drivers/net/axgbe/axgbe_ethdev.c   | 21 +
 drivers/net/axgbe/axgbe_ethdev.h   |  7 +++
 drivers/net/axgbe/axgbe_phy_impl.c | 68 --
 3 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index f174d46143..3450374535 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -2342,23 +2342,28 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
pdata->arcache = AXGBE_DMA_OS_ARCACHE;
pdata->awcache = AXGBE_DMA_OS_AWCACHE;

+   /* Read the port property registers */
+   pdata->pp0 = XP_IOREAD(pdata, XP_PROP_0);
+   pdata->pp1 = XP_IOREAD(pdata, XP_PROP_1);
+   pdata->pp2 = XP_IOREAD(pdata, XP_PROP_2);
+   pdata->pp3 = XP_IOREAD(pdata, XP_PROP_3);
+   pdata->pp4 = XP_IOREAD(pdata, XP_PROP_4);
+
/* Set the maximum channels and queues */
-   reg = XP_IOREAD(pdata, XP_PROP_1);
-   pdata->tx_max_channel_count = XP_GET_BITS(reg, XP_PROP_1, MAX_TX_DMA);
-   pdata->rx_max_channel_count = XP_GET_BITS(reg, XP_PROP_1, MAX_RX_DMA);
-   pdata->tx_max_q_count = XP_GET_BITS(reg, XP_PROP_1, MAX_TX_QUEUES);
-   pdata->rx_max_q_count = XP_GET_BITS(reg, XP_PROP_1, MAX_RX_QUEUES);
+   pdata->tx_max_channel_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, 
MAX_TX_DMA);
+   pdata->rx_max_channel_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, 
MAX_RX_DMA);
+   pdata->tx_max_q_count = XP_GET_BITS(pdata->pp1, XP_PROP_1, 
MAX_TX_QUEUES);
+   pdata->rx_max_q_count = XP_GET_BITS(pdata->pp1, XP_PROP_1,
+MAX_RX_QUEUES);

/* Set the hardware channel and queue counts */
axgbe_set_counts(pdata);

/* Set the maximum fifo amounts */
-   reg = XP_IOREAD(pdata, XP_PROP_2);
-   pdata->tx_max_fifo_size = XP_GET_BITS(reg, XP_PROP_2, TX_FIFO_SIZE);
+   pdata->tx_max_fifo_size = XP_GET_BITS(pdata->pp2, XP_PROP_2,
+TX_FIFO_SIZE);
pdata->tx_max_fifo_size *= 16384;
pdata->tx_max_fifo_size = RTE_MIN(pdata->tx_max_fifo_size,
  pdata->vdata->tx_max_fifo_size);
-   pdata->rx_max_fifo_size = XP_GET_BITS(reg, XP_PROP_2, RX_FIFO_SIZE);
+   pdata->rx_max_fifo_size = XP_GET_BITS(pdata->pp2, XP_PROP_2,
+RX_FIFO_SIZE);
pdata->rx_max_fifo_size *= 16384;
pdata->rx_max_fifo_size = RTE_MIN(pdata->rx_max_fifo_size,
  pdata->vdata->rx_max_fifo_size); diff 
--git a/drivers/net/axgbe/axgbe_ethdev.h b/drivers/net/axgbe/axgbe_ethdev.h
index 7f19321d88..df5d63c493 100644
--- a/drivers/net/axgbe/axgbe_ethdev.h
+++ b/drivers/net/axgbe/axgbe_ethdev.h
@@ -539,6 +539,13 @@ struct axgbe_port {
void *xprop_regs;   /* AXGBE property registers */
void *xi2c_regs;/* AXGBE I2C CSRs */

+   /* Port property registers */
+   unsigned int pp0;
+   unsigned int pp1;
+   unsigned int pp2;
+   unsigned int pp3;
+   unsigned int pp4;
+
bool cdr_track_early;
/* XPCS indirect addressing lock */
unsigned int xpcs_window_def_reg;
diff --git a/drivers/net/axgbe/axgbe_phy_impl.c 
b/drivers/net/axgbe/axgbe_phy_impl.c
index d97fbbfddd..44ff28517c 100644
--- a/drivers/net/axgbe/axgbe_phy_impl.c
+++ b/drivers/net/axgbe/axgbe_phy_impl.c
@@ -1709,40 +1709,35 @@ static int axgbe_phy_link_status(struct axgbe_port 
*pdata, int *an_restart)  static void axgbe_phy_sfp_gpio_setup(struct 
axgbe_port *pdata)  {
struct axgbe_phy_data *phy_data = pdata->phy_data;
-   unsigned int reg;
-
-   reg = XP_IOREAD(pdata, XP_PROP_3);

phy_data->sfp_gpio_address = AXGBE_GPIO_ADDRESS_PCA9555 +
-   XP_GET_BITS(reg, XP_PROP_3, GPIO_ADDR);
+   XP_GET_BITS(pdata->pp3, XP_PROP_3, GPIO_ADDR);

-   phy_data->sfp_gpio_mask = XP_GET_BITS(reg, XP_PROP_3, GPIO_MASK);
+   phy_data->sfp_gpio_mask = XP_GET_BITS(pdata->pp3, XP_PROP_3,
+GPIO_MASK);

-   phy_data->sfp_gpio_rx_los = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_rx_los = XP_GET_BITS(pdata->pp3, XP_PROP_3,
GPIO_RX_LOS);
-   phy_data->sfp_gpio_tx_fault = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_tx_fault = XP_GET_BITS(pdata->pp3, XP_PROP_3,
  GPIO_TX_FAULT);
-   phy_data->sfp_gpio_mod_absent = XP_GET_BITS(reg, XP_PROP_3,
+   phy_data->sfp_gpio_mod_absent = XP_GET_BITS(pdata->pp3, XP_PRO

RE: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Morten Brørup
> From: jer...@marvell.com [mailto:jer...@marvell.com]
> Sent: Monday, 8 January 2024 08.59
> 
> Define qualification criteria for external library
> based on a techboard meeting minutes [1] and past
> learnings from mailing list discussion.

According to the DPDK project charter, the Governing Board deals with legal and 
licensing issues, so we need their approval before publishing this.
Perhaps the Governing Board should be invited to join the discussion?

> 
> [1]
> http://mails.dpdk.org/archives/dev/2019-June/135847.html
> https://mails.dpdk.org/archives/dev/2024-January/284849.html
> 
> Signed-off-by: Jerin Jacob 
> Acked-by: Thomas Monjalon 
> ---
>  doc/guides/contributing/index.rst |  1 +
>  .../contributing/library_dependency.rst   | 52 +++
>  2 files changed, 53 insertions(+)
>  create mode 100644 doc/guides/contributing/library_dependency.rst
> 
> v5:
> - Added "Dependency nature" section based on Stephen's input
> 
> v4:
> - Address Thomas comments from
> https://patches.dpdk.org/project/dpdk/patch/20240105121215.3950532-1-
> jer...@marvell.com/
> 
> v3:
> - Updated the content based on TB discussion which is documented at
> https://mails.dpdk.org/archives/dev/2024-January/284849.html
> 
> v2:
> - Added "Meson build integration" and "Code readability" sections.
> 
> 
> diff --git a/doc/guides/contributing/index.rst
> b/doc/guides/contributing/index.rst
> index dcb9b1fbf0..e5a8c2b0a3 100644
> --- a/doc/guides/contributing/index.rst
> +++ b/doc/guides/contributing/index.rst
> @@ -15,6 +15,7 @@ Contributor's Guidelines
>  documentation
>  unit_test
>  new_library
> +library_dependency
>  patches
>  vulnerability
>  stable
> diff --git a/doc/guides/contributing/library_dependency.rst
> b/doc/guides/contributing/library_dependency.rst
> new file mode 100644
> index 00..94025fdf60
> --- /dev/null
> +++ b/doc/guides/contributing/library_dependency.rst
> @@ -0,0 +1,52 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> +   Copyright(c) 2024 Marvell.
> +
> +External Library dependency
> +===
> +
> +This document defines the qualification criteria for external
> libraries that may be
> +used as dependencies in DPDK drivers or libraries.

More background information could be added here, for context.

Although DPDK is a BSD licensed project, we want to open the door for non-BSD 
licensed external libraries in those drivers and libraries, where the developer 
has the choice to omit them at build time. But not in the core parts of DPDK, 
which must remain fully BSD licensed.

Stephen shared some concerns about source code availability, so DPDK doesn't 
become a shim for a bunch of binary blobs, like some other "open" project (I 
cannot remember the name of the project he mentioned). We are allowing binary 
blobs, but it would be nice if we could somehow state our intentions in this 
regard.

> +
> +#. **Documentation:**
> +
> +   - Must have adequate documentation for the steps to build it.
> +   - Must have clear license documentation on distribution and usage
> aspects of external library.
> +
> +#. **Free availability:**
> +
> +   - The library must be freely available to build in either source or
> binary form.
> +   - It shall be downloadable from a direct link. There shall not be
> any requirement to explicitly
> + login or sign a user agreement.

Remove "explicitly".

> +
> +#. **Usage License:**
> +
> +   - Both permissive (e.g., BSD-3 or Apache) and non-permissive (e.g.,
> GPLv3) licenses are acceptable.
> +   - In the case of a permissive license, automatic inclusion in the
> build process is assumed.
> + For non-permissive licenses, an additional build configuration
> option is required.

We must ensure that automatic inclusion only applies to libraries with a 
license that allows both free usage and free distribution. IANAL, so please 
confirm that this bullet covers both?

The default DPDK build should not include anything that is not freely 
distributable, comes with an EULA or any other limitations or restrictions. 
Optimally, the default DPDK build should remain 100 % compatible with the BSD 
license. DPDK is a BSD licensed project, so any deviations from the BSD license 
must be explicitly selected (opt-in) at build time.

> +
> +#. **Distributions License:**

Distributions -> Distribution

> +
> +   - No specific constraints beyond documentation.
> +
> +#. **Compiler compatibility:**
> +
> +   - The library must be able to compile with a DPDK supported
> compiler for the given execution
> + environment.

We should consider cross build requirements, or at least use cross build 
terminology.

E.g. "execution environment" -> "target environment".

> + For example, for Linux, the library must be able to compile with
> GCC and/or clang.
> +   - Library may be limited to a specific OS.

Since we allow limiting to a specific OS, we should probably also allow 
limiting to specific architecture a

RE: [PATCH] net/ice: fix memory leak

2024-01-08 Thread Zhang, Qi Z



> -Original Message-
> From: Wu, Wenjun1 
> Sent: Monday, January 8, 2024 3:27 PM
> To: Zhang, Qi Z ; Yang, Qiming
> 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH] net/ice: fix memory leak
> 
> 
> 
> > -Original Message-
> > From: Zhang, Qi Z 
> > Sent: Monday, January 8, 2024 4:49 AM
> > To: Yang, Qiming ; Wu, Wenjun1
> > 
> > Cc: dev@dpdk.org; Zhang, Qi Z ; sta...@dpdk.org
> > Subject: [PATCH] net/ice: fix memory leak
> >
> > Free memory for AQ buffer at icd_move_recfg_lan_txq Free memory for
> > profile list at ice_tm_conf_uninit
> >
> > Fixes: 8c481c3bb65b ("net/ice: support queue and queue group bandwidth
> > limit")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Qi Zhang 
> > ---
> >  drivers/net/ice/ice_tm.c | 12 
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index
> > b570798f07..c00ecb6a97 100644
> > --- a/drivers/net/ice/ice_tm.c
> > +++ b/drivers/net/ice/ice_tm.c
> > @@ -59,8 +59,15 @@ void
> >  ice_tm_conf_uninit(struct rte_eth_dev *dev)  {
> > struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data-
> >dev_private);
> > +   struct ice_tm_shaper_profile *shaper_profile;
> > struct ice_tm_node *tm_node;
> >
> > +   /* clear profile */
> > +   while ((shaper_profile = TAILQ_FIRST(&pf-
> > >tm_conf.shaper_profile_list))) {
> > +   TAILQ_REMOVE(&pf->tm_conf.shaper_profile_list,
> > shaper_profile, node);
> > +   rte_free(shaper_profile);
> > +   }
> > +
> > /* clear node configuration */
> > while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
> > TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
> @@ -636,6
> > +643,8 @@ static int ice_move_recfg_lan_txq(struct rte_eth_dev *dev,
> > uint16_t buf_size = ice_struct_size(buf, txqs, 1);
> >
> > buf = (struct ice_aqc_move_txqs_data *)ice_malloc(hw, sizeof(*buf));
> > +   if (buf == NULL)
> > +   return -ENOMEM;
> >
> > queue_parent_node = queue_sched_node->parent;
> > buf->src_teid = queue_parent_node->info.node_teid;
> > @@ -647,6 +656,7 @@ static int ice_move_recfg_lan_txq(struct
> > rte_eth_dev *dev,
> > NULL, buf, buf_size, &txqs_moved,
> NULL);
> > if (ret || txqs_moved == 0) {
> > PMD_DRV_LOG(ERR, "move lan queue %u failed", queue_id);
> > +   rte_free(buf);
> > return ICE_ERR_PARAM;
> > }
> >
> > @@ -656,12 +666,14 @@ static int ice_move_recfg_lan_txq(struct
> > rte_eth_dev *dev,
> > } else {
> > PMD_DRV_LOG(ERR, "invalid children number %d for
> queue %u",
> > queue_parent_node->num_children, queue_id);
> > +   rte_free(buf);
> > return ICE_ERR_PARAM;
> > }
> > dst_node->children[dst_node->num_children++] =
> queue_sched_node;
> > queue_sched_node->parent = dst_node;
> > ice_sched_query_elem(hw, queue_sched_node->info.node_teid,
> > &queue_sched_node->info);
> >
> > +   rte_free(buf);
> > return ret;
> >  }
> >
> > --
> > 2.31.1
> 
> Acked-by: Wenjun Wu 

Applied to dpdk-next-net-intel.

Thanks
Qi


RE: unnecessary rx callbacks when zero packets

2024-01-08 Thread Morten Brørup
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Sunday, 7 January 2024 21.57
> 
> > From: Stephen Hemminger 
> > Sent: Sunday, January 7, 2024 11:37 AM
> >
> > I noticed while looking at packet capture that currently the receive
> callbacks
> > get called even if there are no packets. This seems unnecessary since
> if nb_rx is
> > zero, then there are no packets to look at.  My one concern is that
> an
> > application could be using callbacks as some form of scheduling
> mechanism
> > which would be broken.
> Is it possible that the call back functions are maintaining statistics
> on zero packet polls?

I agree with this concern. The primary argument for introducing the callbacks 
(instead of the application simply calling the same functions at RX and TX 
time) was to provide instrumentation outside of the application itself. And for 
instrumentation purposes, zero-packet calls may be relevant.

TX also calls its callback with zero packets. The callbacks treatment should be 
the same for both RX and TX: Either always call, or only call if non-zero 
packets.

So: NAK.

Perhaps the packet capture library can be optimized for zero packets instead.

> 
> >
> > The change would be:
> >
> >
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > 21e3a21903ec..f64bf977c46e 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6077,7 +6077,7 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t
> > queue_id,
> > nb_rx = p->rx_pkt_burst(qd, rx_pkts, nb_pkts);
> >
> >  #ifdef RTE_ETHDEV_RXTX_CALLBACKS
> > -   {
> > +   if (nb_rx > 0) {
> > void *cb;
> >
> > /* rte_memory_order_release memory order was used
> when the


RE: Issues around packet capture when secondary process is doing rx/tx

2024-01-08 Thread Morten Brørup
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, 8 January 2024 02.59
> 
> I have been looking at a problem reported by Sandesh
> where packet capture does not work if rx/tx burst is done in secondary
> process.
> 
> The root cause is that existing rx/tx callback model just doesn't work
> unless the process doing the rx/tx burst calls is the same one that
> registered the callbacks.

So, callbacks don't work across processes, because code might differ across 
processes.

If process A is running, and RX'ing and TX'ing, and process B wants to install 
its own callbacks (e.g. packet capture) on RX and RX, we basically want process 
A to execute code residing in process B, which is impossible.

An alternative could be to pass the packets through a ring in shared memory. 
However, this method would add the ring processing latency of process B to the 
RX/TX latency of process A.

I think we can conclude that callbacks are one of the things that don't work 
with secondary processes.

With this decided, we can then consider how to best add packet capture. The 
concept of passing "data" (instead of calling functions) across processes 
obviously applies to this use case.

> 
> An example sequence would be:
>   1. dumpcap (or pdump) as secondary tells pdump in primary to
> register callback
>   2. secondary process calls rx_burst.
>   3. rx_burst sees the callback but it has pointer pdump_rx which
> is not necessarily
>  at same location in primary and secondary process.
>   4. indirect function call in secondary to bad location likely
> causes crash.
> 
> Some possible workarounds.
>   1. Keep callback list per-process: messy, but won't crash.
> Capture won't work
>without other changes. In this primary would register
> callback, but secondaries
>would not use them in rx/tx burst.
> 
>   2. Replace use of rx/tx callback in pdump with change to
> rte_ethdev to have
>a capture flag. (i.e. don't use indirection).  Likely ABI
> problems.
>Basically, ignore the rx/tx callback mechanism. This is my
> preferred
>  solution.
> 
>   3. Some fix up mechanism (in EAL mp support?) to have each
> process fixup
>its callback mechanism.
> 
>   4. Do something in pdump_init to register the callback in same
> process context
>  (probably need callbacks to be per-process). Would mean
> callback is always
>on independent of capture being enabled.
> 
> 5. Get rid of indirect function call pointer, and replace it by
> index into
>a static table of callback functions. Every process would
> have same code
>(in this case pdump_rx) but at different address.  Requires
> all callbacks
>to be statically defined at build time.
> 
> The existing rx/tx callback is not safe id rx/tx burst is called from
> different process
> than where callback is registered.
> 



Re: [PATCH v2 0/3] remove __typeof__ from expansion of per lcore macros

2024-01-08 Thread Bruce Richardson
On Tue, Jan 02, 2024 at 03:44:59PM -0800, Tyler Retzlaff wrote:
> The design of the macros requires a type to be provided to the macro.
> 
> By expanding the type parameter inside of typeof it also inadvertently
> allows an expression to be used which appears not to have been intended
> after evaluating the parameter name and existing macro use.
> 
> Technically this is an API break but only for applications that were
> using these macros outside of the original design intent.
> 
> v2:
>   * add additional patch to adjust usage for crypto/ipsec_mb 
> 
Series-acked-by: Bruce Richardson 


Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-08 Thread Bruce Richardson
On Tue, Dec 19, 2023 at 10:59:48PM +0530, jer...@marvell.com wrote:
> From: Jerin Jacob 
> 
> Introduce a new API to retrieve the number of available free descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob 
> ---

Hi Jerin,

while I don't strongly object to this patch, I wonder if it encourages
sub-optimal code implementations. To determine the number of free
descriptors in a ring, the driver in many cases will need to do a scan of
the descriptor ring to see how many are free/used. However, I suspect that
in most cases we will see something like the following being done:

count = rte_eth_rx_free_count();
if (count > X) {
/* Do something */
}

For instances like this, scanning the entire ring is wasteful of resources.
Instead it would be better to just check the descriptor at position X
explicitly. Going to the trouble of checking the exact descriptor count is
unnecessary in this case.

Out of interest, are you aware of a case where an app would need to know
exactly the number of free descriptors, and where the result would not be
just compared to one or more threshold values? Do we see cases where it
would be used in a computation, perhaps?

/Bruce


Re: [PATCH v2 1/2] telemetry: correct json empty dictionaries

2024-01-08 Thread Bruce Richardson
On Sun, Dec 24, 2023 at 05:02:00PM -0500, Jonathan Erb wrote:
> Fix to allow telemetry to handle empty dictionaries correctly.
> 
> This patch resolves an issue where empty dictionaries are reported
> by telemetry as '[]' rather than '{}'. Initializing the output
> buffer based on the container type resolves the issue.
> 
> Signed-off-by: Jonathan Erb 

One minor comment below.

Acked-by: Bruce Richardson 

> ---
>  .mailmap  | 2 +-
>  lib/telemetry/telemetry.c | 6 +-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/.mailmap b/.mailmap
> index ab0742a382..a3302ba7a1 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -675,7 +675,7 @@ John Ousterhout 
>  John Romein 
>  John W. Linville 
>  Jonas Pfefferle  
> -Jonathan Erb  
> +Jonathan Erb 

FYI, it's advisable to keep old email addresses in this file, since it is
then used to map the older email addresses to the new one. Just add your
new correct email address as the first one on the line.

>  Jonathan Tsai 
>  Jon DeVree 
>  Jon Loeliger 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 92982842a8..0788a32210 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -169,7 +169,11 @@ container_to_json(const struct rte_tel_data *d, char 
> *out_buf, size_t buf_len)
>   d->type != TEL_ARRAY_INT && d->type != TEL_ARRAY_STRING)
>   return snprintf(out_buf, buf_len, "null");
>  
> - used = rte_tel_json_empty_array(out_buf, buf_len, 0);
> + if (d->type == RTE_TEL_DICT)
> + used = rte_tel_json_empty_obj(out_buf, buf_len, 0);
> + else
> + used = rte_tel_json_empty_array(out_buf, buf_len, 0);
> +
>   if (d->type == TEL_ARRAY_UINT)
>   for (i = 0; i < d->data_len; i++)
>   used = rte_tel_json_add_array_uint(out_buf,
> -- 
> 2.34.1
> 


Re: [PATCH v2 2/2] telemetry: correct json empty dictionaries

2024-01-08 Thread Bruce Richardson
On Sun, Dec 24, 2023 at 05:02:01PM -0500, Jonathan Erb wrote:
> Fix use of incorrect enum name.
> 
> Signed-off-by: Jonathan Erb 

Please merge this fix into patch 1, so we only have a single commit to the
repo.

Thanks,
/Bruce

> ---
>  lib/telemetry/telemetry.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
> index 0788a32210..eef4ac7bb7 100644
> --- a/lib/telemetry/telemetry.c
> +++ b/lib/telemetry/telemetry.c
> @@ -169,7 +169,7 @@ container_to_json(const struct rte_tel_data *d, char 
> *out_buf, size_t buf_len)
>   d->type != TEL_ARRAY_INT && d->type != TEL_ARRAY_STRING)
>   return snprintf(out_buf, buf_len, "null");
>  
> - if (d->type == RTE_TEL_DICT)
> + if (d->type == TEL_DICT)
>   used = rte_tel_json_empty_obj(out_buf, buf_len, 0);
>   else
>   used = rte_tel_json_empty_array(out_buf, buf_len, 0);
> -- 
> 2.34.1
> 


Re: vmxnet3 no longer functional on DPDK 21.11

2024-01-08 Thread Lewis Donzis
Good morning.

I just wanted to mention that this problem still persists in 22.11.3, and we 
still have to patch the vmxnet3 driver every time we upgrade.

As mentioned before, ixgbe_ethdev.c is an example of a driver that ifdef's out 
the attempt to use interrupts on FreeBSD.

Thanks,
lew


- On Jun 3, 2022, at 8:19 AM, Lewis Donzis l...@perftech.com wrote:

> Hi, all.
> 
> Resurrecting this thread from six months ago, I apologize for not having more
> time to dig into it, but in light of recent findings, I see numerous other
> drivers and other parts of the code that have comments to the effect that
> "FreeBSD doesn't support interrupts" and they effectively #ifdef out the
> attempt.
> 
> Could this be as simple as needing to do the same in vmxnet3?  Empirically,
> ignoring the error from rte_intr_enable() allows the driver to work normally,
> for what that's worth.
> 
> Thanks,
> lew
> 
> - On Dec 6, 2021, at 6:08 AM, Konstantin Ananyev
> konstantin.anan...@intel.com wrote:
> 
>>> -Original Message-
>>> From: Richardson, Bruce 
>>> Sent: Monday, December 6, 2021 9:17 AM
>>> To: Lewis Donzis 
>>> Cc: dev ; Wang, Yong ; Ananyev, 
>>> Konstantin
>>> 
>>> Subject: Re: vmxnet3 no longer functional on DPDK 21.11
>>> 
>>> On Sun, Dec 05, 2021 at 07:52:33PM -0600, Lewis Donzis wrote:
>>> >
>>> >
>>> > - On Nov 30, 2021, at 7:42 AM, Bruce Richardson 
>>> > bruce.richard...@intel.com
>>> > wrote:
>>> >
>>> > > On Mon, Nov 29, 2021 at 02:45:15PM -0600, Lewis Donzis wrote:
>>> > >>Hello.
>>> > >>We just upgraded from 21.08 to 21.11 and it's rather astounding the
>>> > >>number of incompatible changes in three months.  Not a big deal, 
>>> > >> just
>>> > >>kind of a surprise, that's all.
>>> > >>Anyway, the problem is that the vmxnet3 driver is no longer 
>>> > >> functional
>>> > >>on FreeBSD.
>>> > >>In drivers/net/vmxnet3/vmxnet3_ethdev.c, vmxnet3_dev_start() gets an
>>> > >>error calling rte_intr_enable().  So it logs "interrupt enable 
>>> > >> failed"
>>> > >>and returns an error.
>>> > >>In lib/eal/freebsd/eal_interrupts.c, rte_intr_enable() is returning 
>>> > >> an
>>> > >>error because rte_intr_dev_fd_get(intr_handle) is returning -1.
>>> > >>I don't see how that could ever return anything other than -1 since 
>>> > >> it
>>> > >>appears that there is no code that ever calls rte_intr_dev_fd_set()
>>> > >>with a value other than -1 on FreeBSD.  Also weird to me is that 
>>> > >> even
>>> > >>if it didn't get an error, the switch statement that follows looks 
>>> > >> like
>>> > >>it will return an error in every case.
>>> > >>Nonetheless, it worked in 21.08, and I can't quite see why the
>>> > >>difference, so I must be missing something.
>>> > >>For the moment, I just commented the "return -EIO" in 
>>> > >> vmxnet3_ethdev.c,
>>> > >>and it's now working again, but that's obviously not the correct
>>> > >>solution.
>>> > >>Can someone who's knowledgable about this mechanism perhaps explain 
>>> > >> a
>>> > >>little bit about what's going on?  I'll be happy to help 
>>> > >> troubleshoot.
>>> > >>It seems like it must be something simple, but I just don't see it 
>>> > >> yet.
>>> > >
>>> > > Hi
>>> > >
>>> > > if you have the chance, it would be useful if you could use "git 
>>> > > bisect" to
>>> > > identify the commit in 21.11 that broke this driver. Looking through the
>>> > > logs for 21.11 I can't identify any particular likely-looking commit, so
>>> > > bisect is likely a good way to start looking into this.
>>> > >
>>> > > Regards,
>>> > > /Bruce
>>> >
>>> > Hi, Bruce.  git bisect is very time-consuming and very cool!
>>> >
>>> > I went back to 21.08, about 1100 commits, and worked through the process, 
>>> > but
>>> > then I realized that I had forgotten to run ninja on one of
>>> the steps, so I did it again.
>>> >
>>> > I also re-checked it after the bisect, just to make sure that
>>> > c87d435a4d79739c0cec2ed280b94b41cb908af7 is good, and
>>> 7a0935239b9eb817c65c03554a9954ddb8ea5044 is bad.
>>> >
>>> > Thanks,
>>> > lew
>>> >
>>> 
>>> Many thanks for taking the time to do this. Adding Konstantin to thread as
>>> author of the commit you identified. Konstantin, any thoughts on this
>>> issue?
>> 
>> Hmm, that's looks really strange to me.
>> So to clarify, it fails at:
>> static int
>> vmxnet3_dev_start(struct rte_eth_dev *dev)
>> {
>>  ...
>> line 1695:
>>  if (rte_intr_enable(dev->intr_handle) < 0) {
>>PMD_INIT_LOG(ERR, "interrupt enable failed");
>>return -EIO;
>>}
>> 
>> Right?
>> 
>> The strange thing here is that 7a0935239b9e
>> doesn't change dev_start or rte_intr code in any way.
>> All it does - change rte_eth_rx_burst/rte_eth_tx_burst and other fast-path
>> functions.
>> Anyway, if git blames that commit, let's try to figure out what is going on.
>> Unfortunately, I don't have freebsd with vmxnet3, so will need to rely on 

why DPDK reassembles IP fragment packets with AF_PACKET

2024-01-08 Thread
Hi All,


Recently I debug ovs-dpdk with AF_PACKET mode.  When IP fragment packets are 
received via DPDK, the IP fragment packets are reassembled by DPDK. After 
reassembly, the packet length is over 1518. They are discarded by OVS because 
of oversize packets.


I don't understandy why  PACKET_FLAG_DEFRAG is set for AF_PACKET mode.


Can you help explain design at your convenience?


I would appreciate your kindly help.


Best regards,


Matthew

Re: [PATCH v2] build: set rte toolchain macros from predefined macros

2024-01-08 Thread Bruce Richardson
On Tue, Jan 02, 2024 at 04:11:15PM -0800, Tyler Retzlaff wrote:
> Stop writing RTE_TOOLCHAIN_XXX macros to rte_build_config.h. When an
> application builds it doesn't necessarily use the same toolchain that
> DPDK was built with.
> 
> Instead evaluate toolchain predefined macros and define
> RTE_TOOLCHAIN_XXX macros as appropriate each time rte_config.h is
> preprocessed.
> 
> Signed-off-by: Tyler Retzlaff 

I don't see an issue with doing this.

Acked-by: Bruce Richardson 

> ---
> 
> v2:
>   * use defined(macro) to correctly test for predefined macros
> 
>  config/meson.build  |  2 --
>  config/rte_config.h | 11 +++
>  2 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/config/meson.build b/config/meson.build
> index a9ccd56..0c3550e 100644
> --- a/config/meson.build
> +++ b/config/meson.build
> @@ -180,8 +180,6 @@ if not is_ms_compiler
>  endif
>  
>  toolchain = cc.get_id()
> -dpdk_conf.set_quoted('RTE_TOOLCHAIN', toolchain)
> -dpdk_conf.set('RTE_TOOLCHAIN_' + toolchain.to_upper().underscorify(), 1)
>  
>  dpdk_conf.set('RTE_ARCH_64', cc.sizeof('void *') == 8)
>  dpdk_conf.set('RTE_ARCH_32', cc.sizeof('void *') == 4)
> diff --git a/config/rte_config.h b/config/rte_config.h
> index da265d7..d743a5c 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -14,6 +14,17 @@
>  
>  #include 
>  
> +#if defined(__clang__)
> +#define RTE_TOOLCHAIN "clang"
> +#define RTE_TOOLCHAIN_CLANG 1
> +#elif defined(__GNUC__)
> +#define RTE_TOOLCHAIN "gcc"
> +#define RTE_TOOLCHAIN_GCC 1
> +#elif defined(_MSC_VER)
> +#define RTE_TOOLCHAIN "msvc"
> +#define RTE_TOOLCHAIN_MSVC 1
> +#endif
> +
>  /* legacy defines */
>  #ifdef RTE_EXEC_ENV_LINUX
>  #define RTE_EXEC_ENV_LINUXAPP 1
> -- 
> 1.8.3.1
> 


Re: [PATCH v6 1/7] dts: add startup verification and forwarding modes to testpmd shell

2024-01-08 Thread Juraj Linkeš
On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Added commonly used methods in testpmd such as starting and stopping
> packet forwarding, changing forward modes, and verifying link status of
> ports so that developers can configure testpmd and start forwarding
> through the provided class rather than sending commands to the testpmd
> session directly.
>
> Signed-off-by: Jeremy Spewock 
> ---
>  dts/framework/exception.py|   7 +
>  dts/framework/remote_session/testpmd_shell.py | 149 +-
>  2 files changed, 155 insertions(+), 1 deletion(-)
>
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> index 658eee2c38..cce1e0231a 100644
> --- a/dts/framework/exception.py
> +++ b/dts/framework/exception.py

> @@ -65,9 +108,66 @@ class TestPmdShell(InteractiveShell):
>  _command_extra_chars: ClassVar[str] = "\n"
>
>  def _start_application(self, get_privileged_command: Callable[[str], 
> str] | None) -> None:
> -self._app_args += " -- -i"
> +"""Overrides :meth:`~.interactive_shell._start_application`.
> +
> +Add flags for starting testpmd in interactive mode and disabling 
> messages for link state
> +change events before starting the application. Link state is 
> verified before starting
> +packet forwarding and the messages create unexpected newlines in the 
> terminal which
> +complicates output collection.
> +

We should adjust the collection so that it can handle the newlines.
Also, can you explain exactly why we are disabling the initial link
state messages?

> +Also find the number of pci addresses which were allowed on the 
> command line when the app
> +was started.
> +"""
> +self._app_args += " -- -i --mask-event intr_lsc"
> +self.number_of_ports = self._app_args.count("-a ")
>  super()._start_application(get_privileged_command)
>
> +def start(self, verify: bool = True) -> None:
> +"""Start packet forwarding with the current configuration.
> +
> +Args:
> +verify: If :data:`True` , a second start command will be sent in 
> an attempt to verify
> +packet forwarding started as expected.
> +
> +Raises:
> +InteractiveCommandExecutionError: If `verify` is :data:`True` 
> and forwarding fails to
> +start or ports fail to come up.
> +"""
> +self.send_command("start")
> +if verify:
> +# If forwarding was already started, sending "start" again 
> should tell us
> +start_cmd_output = self.send_command("start")
> +if "Packet forwarding already started" not in start_cmd_output:
> +self._logger.debug(f"Failed to start packet forwarding: 
> \n{start_cmd_output}")
> +raise InteractiveCommandExecutionError("Testpmd failed to 
> start packet forwarding.")
> +
> +for port_id in range(self.number_of_ports):
> +if not self.wait_link_status_up(port_id):
> +raise InteractiveCommandExecutionError(
> +"Not all ports came up after starting packet 
> forwarding in testpmd."
> +)
> +
> +def stop(self, verify: bool = True) -> None:
> +"""Stop packet forwarding.
> +
> +Args:
> +verify: If :data:`True` , the output of the stop command is 
> scanned to verify that
> +forwarding was stopped successfully or not started. If 
> neither is found, it is
> +considered an error.
> +
> +Raises:
> +InteractiveCommandExecutionError: If `verify` is :data:`True` 
> and the command to stop
> +forwarding results in an error.
> +"""
> +stop_cmd_output = self.send_command("stop")
> +if verify:
> +if (
> +"Done." not in stop_cmd_output
> +and "Packet forwarding not started" not in stop_cmd_output
> +):

I want to make sure I understand this condition. When none of these
appear, it's an error. When just "Done." appears, we successfully
stopped ongoing forwarding and when "Packet forwarding not started"
appears, we're trying to stop forwarding that didn't start (or isn't
ongoing - it could've stopped in the meantime)?
I'm thinking about false failures here (Is there a string that would
indicate a failure even if one of the strings is printed?) - we're
basically looking at "not success" instead of looking for strings
telling us about a failure explicitly. Does the stop command not
produce such output? Or do we not know all of the failure strings or
is looking for the above two strings sufficient to rule out false
failures?

> +self._logger.debug(f"Failed to stop packet forwarding: 
> \n{stop_cmd_output}")
> +raise InteractiveCommandExecutionError("Testpmd failed to 
> stop packet forwarding.")
> +
>  def get

[PATCH v4 1/2] doc: updated incorrect value for IP frag max fragments

2024-01-08 Thread Euan Bourke
Docs for IP Fragment said RTE_LIBRTE_IP_FRAG_MAX_FRAG was 4 by default,
however this was changed to 8.

Documentation has been updated to account for this, including a
snippet of code where RTE_LIBRTE_IP_FRAG_MAX_FRAG is defined to
ensure documentation stays up to date.

Fixes: f8e0f8ce9030 ("ip_frag: increase default maximum of fragments")
Cc: sta...@dpdk.org

Signed-off-by: Euan Bourke 
---
 .mailmap | 1 +
 doc/guides/prog_guide/ip_fragment_reassembly_lib.rst | 8 +++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index ab0742a382..528bc68a30 100644
--- a/.mailmap
+++ b/.mailmap
@@ -379,6 +379,7 @@ Eric Zhang 
 Erik Gabriel Carrillo 
 Erik Ziegenbalg 
 Erlu Chen 
+Euan Bourke 
 Eugenio Pérez 
 Eugeny Parshutin 
 Evan Swanson 
diff --git a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst 
b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
index 314d4adbb8..f3ed90d700 100644
--- a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
+++ b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
@@ -43,7 +43,13 @@ Note that all update/lookup operations on Fragment Table are 
not thread safe.
 So if different execution contexts (threads/processes) will access the same 
table simultaneously,
 then some external syncing mechanism have to be provided.
 
-Each table entry can hold information about packets consisting of up to 
RTE_LIBRTE_IP_FRAG_MAX (by default: 4) fragments.
+Each table entry can hold information about packets of up to 
``RTE_LIBRTE_IP_FRAG_MAX_FRAG`` fragments,
+where ``RTE_LIBRTE_IP_FRAG_MAX_FRAG`` defaults to:
+
+.. literalinclude:: ../../../config/rte_config.h
+:language: c
+:start-at: #define RTE_LIBRTE_IP_FRAG_MAX_FRAG
+:lines: 1
 
 Code example, that demonstrates creation of a new Fragment table:
 
-- 
2.34.1



[PATCH v4 2/2] ip_frag: updated name for IP frag define

2024-01-08 Thread Euan Bourke
Removed LIBRTE from name as its an old prefix.

Signed-off-by: Euan Bourke 
---
 app/test/test_reassembly_perf.c  | 2 +-
 config/rte_config.h  | 2 +-
 doc/guides/prog_guide/ip_fragment_reassembly_lib.rst | 8 
 doc/guides/sample_app_ug/ip_reassembly.rst   | 4 ++--
 examples/ip_fragmentation/main.c | 2 +-
 examples/ip_reassembly/main.c| 2 +-
 examples/ipsec-secgw/ipsec_worker.h  | 2 +-
 lib/ip_frag/ip_reassembly.h  | 2 +-
 lib/ip_frag/rte_ip_frag.h| 2 +-
 9 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/app/test/test_reassembly_perf.c b/app/test/test_reassembly_perf.c
index 3912179022..805ae2fe9d 100644
--- a/app/test/test_reassembly_perf.c
+++ b/app/test/test_reassembly_perf.c
@@ -20,7 +20,7 @@
 #define MAX_FLOWS  (1024 * 32)
 #define MAX_BKTS   MAX_FLOWS
 #define MAX_ENTRIES_PER_BKT 16
-#define MAX_FRAGMENTS  RTE_LIBRTE_IP_FRAG_MAX_FRAG
+#define MAX_FRAGMENTS  RTE_IP_FRAG_MAX_FRAG
 #define MIN_FRAGMENTS  2
 #define MAX_PKTS   (MAX_FLOWS * MAX_FRAGMENTS)
 
diff --git a/config/rte_config.h b/config/rte_config.h
index da265d7dd2..e2fa2a58fa 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -85,7 +85,7 @@
 #define RTE_RAWDEV_MAX_DEVS 64
 
 /* ip_fragmentation defines */
-#define RTE_LIBRTE_IP_FRAG_MAX_FRAG 8
+#define RTE_IP_FRAG_MAX_FRAG 8
 // RTE_LIBRTE_IP_FRAG_TBL_STAT is not set
 
 /* rte_power defines */
diff --git a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst 
b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
index f3ed90d700..d826854890 100644
--- a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
+++ b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
@@ -43,12 +43,12 @@ Note that all update/lookup operations on Fragment Table 
are not thread safe.
 So if different execution contexts (threads/processes) will access the same 
table simultaneously,
 then some external syncing mechanism have to be provided.
 
-Each table entry can hold information about packets of up to 
``RTE_LIBRTE_IP_FRAG_MAX_FRAG`` fragments,
-where ``RTE_LIBRTE_IP_FRAG_MAX_FRAG`` defaults to:
+Each table entry can hold information about packets of up to 
``RTE_IP_FRAG_MAX_FRAG`` fragments,
+where ``RTE_IP_FRAG_MAX_FRAG`` defaults to:
 
 .. literalinclude:: ../../../config/rte_config.h
 :language: c
-:start-at: #define RTE_LIBRTE_IP_FRAG_MAX_FRAG
+:start-at: #define RTE_IP_FRAG_MAX_FRAG
 :lines: 1
 
 Code example, that demonstrates creation of a new Fragment table:
@@ -69,7 +69,7 @@ Also, entries that resides in the table longer then 
 are considered
 and could be removed/replaced by the new ones.
 
 Note that reassembly demands a lot of mbuf's to be allocated.
-At any given time up to (2 \* bucket_entries \* RTE_LIBRTE_IP_FRAG_MAX \* 
)
+At any given time up to (2 \* bucket_entries \* RTE_IP_FRAG_MAX_FRAG \* 
)
 can be stored inside Fragment Table waiting for remaining fragments.
 
 Packet Reassembly
diff --git a/doc/guides/sample_app_ug/ip_reassembly.rst 
b/doc/guides/sample_app_ug/ip_reassembly.rst
index 5280bf4ea0..44684a39e2 100644
--- a/doc/guides/sample_app_ug/ip_reassembly.rst
+++ b/doc/guides/sample_app_ug/ip_reassembly.rst
@@ -135,7 +135,7 @@ Fragment table maintains information about already received 
fragments of the pac
 Each IP packet is uniquely identified by triple , 
, .
 To avoid lock contention, each RX queue has its own Fragment Table,
 e.g. the application can't handle the situation when different fragments of 
the same packet arrive through different RX queues.
-Each table entry can hold information about packet consisting of up to 
RTE_LIBRTE_IP_FRAG_MAX_FRAGS fragments.
+Each table entry can hold information about packet consisting of up to 
RTE_IP_FRAG_MAX_FRAG fragments.
 
 .. literalinclude:: ../../../examples/ip_reassembly/main.c
 :language: c
@@ -147,7 +147,7 @@ Mempools Initialization
 ~~~
 
 The reassembly application demands a lot of mbuf's to be allocated.
-At any given time up to (2 \* max_flow_num \* RTE_LIBRTE_IP_FRAG_MAX_FRAGS \* 
)
+At any given time up to (2 \* max_flow_num \* RTE_IP_FRAG_MAX_FRAG \* )
 can be stored inside Fragment Table waiting for remaining fragments.
 To keep mempool size under reasonable limits and to avoid situation when one 
RX queue can starve other queues,
 each RX queue uses its own mempool.
diff --git a/examples/ip_fragmentation/main.c b/examples/ip_fragmentation/main.c
index 744a1aa9b4..1e4471891b 100644
--- a/examples/ip_fragmentation/main.c
+++ b/examples/ip_fragmentation/main.c
@@ -71,7 +71,7 @@
 /*
  * Max number of fragments per packet expected - defined by config file.
  */
-#defineMAX_PACKET_FRAG RTE_LIBRTE_IP_FRAG_MAX_FRAG
+#defineMAX_PACKET_FRAG RTE_IP_FRAG_MAX_FRAG
 
 #define NB_MBUF   8192
 
diff --git a/examples/ip_reassembly/main.c b/example

Re: [PATCH v6 2/7] dts: limit EAL parameters to DPDK apps and add parameters to all apps

2024-01-08 Thread Juraj Linkeš
On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Changed the factory method for creating interactive apps in the SUT Node
> so that EAL parameters would only be passed into DPDK apps since
> non-DPDK apps wouldn't be able to process them. Also modified
> interactive apps to allow for the ability to pass parameters into the
> app on startup so that the applications can be started with certain
> configuration steps passed on the command line.
>
> Signed-off-by: Jeremy Spewock 
> ---
>
> I ended up reverting part of this back to making the argument for
> eal_parameters allowed to be a string. This was because it was casuing
> mypy errors where the method signatures of sut_node did not match with
> that of node.
>

This is because the signatures don't actually match :-)

The eal_parameters parameter is added on not top of what's in the base
methods. I suggest we move eal_parameters to the end and then we don't
need to allow str for eal_parameters.

>  dts/framework/remote_session/testpmd_shell.py |  2 +-
>  dts/framework/testbed_model/sut_node.py   | 14 +-
>  2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/dts/framework/remote_session/testpmd_shell.py 
> b/dts/framework/remote_session/testpmd_shell.py
> index f310705fac..8f40e8f40e 100644
> --- a/dts/framework/remote_session/testpmd_shell.py
> +++ b/dts/framework/remote_session/testpmd_shell.py
> @@ -118,7 +118,7 @@ def _start_application(self, get_privileged_command: 
> Callable[[str], str] | None
>  Also find the number of pci addresses which were allowed on the 
> command line when the app
>  was started.
>  """
> -self._app_args += " -- -i --mask-event intr_lsc"
> +self._app_args += " -i --mask-event intr_lsc"
>  self.number_of_ports = self._app_args.count("-a ")
>  super()._start_application(get_privileged_command)
>
> diff --git a/dts/framework/testbed_model/sut_node.py 
> b/dts/framework/testbed_model/sut_node.py
> index c4acea38d1..4df18bc183 100644
> --- a/dts/framework/testbed_model/sut_node.py
> +++ b/dts/framework/testbed_model/sut_node.py
> @@ -431,6 +431,7 @@ def create_interactive_shell(
>  timeout: float = SETTINGS.timeout,
>  privileged: bool = False,
>  eal_parameters: EalParameters | str | None = None,
> +app_parameters: str = "",

What I meant above is we should move app_parameters before
eal_parameters and then we can remove the str type of eal_parameters.

>  ) -> InteractiveShellType:
>  """Extend the factory for interactive session handlers.
>
> @@ -449,20 +450,23 @@ def create_interactive_shell(
>  eal_parameters: List of EAL parameters to use to launch the app. 
> If this
>  isn't provided or an empty string is passed, it will default 
> to calling
>  :meth:`create_eal_parameters`.
> +app_parameters: Additional arguments to pass into the 
> application on the
> +command-line.
>
>  Returns:
>  An instance of the desired interactive application shell.
>  """
> -if not eal_parameters:
> -eal_parameters = self.create_eal_parameters()
> -
> -# We need to append the build directory for DPDK apps
> +# We need to append the build directory and add EAL parameters for 
> DPDK apps
>  if shell_cls.dpdk_app:
> +if not eal_parameters:
> +eal_parameters = self.create_eal_parameters()
> +app_parameters = f"{eal_parameters} -- {app_parameters}"
> +
>  shell_cls.path = self.main_session.join_remote_path(
>  self.remote_dpdk_build_dir, shell_cls.path
>  )
>
> -return super().create_interactive_shell(shell_cls, timeout, 
> privileged, str(eal_parameters))
> +return super().create_interactive_shell(shell_cls, timeout, 
> privileged, app_parameters)
>
>  def bind_ports_to_driver(self, for_dpdk: bool = True) -> None:
>  """Bind all ports on the SUT to a driver.
> --
> 2.43.0
>


[PATCH v4 0/3] simplified to 3 layer Tx scheduler

2024-01-08 Thread Qi Zhang
Remove dummy layers, code refactor, complete document

v4:
- rebase.

v3:
- fix tm_node memory free.
- fix corrupt when slibling node deletion is not in a reversed order.

v2:
- fix typos.

Qi Zhang (3):
  net/ice: hide port and TC layer in Tx sched tree
  net/ice: refactor tm config data structure
  doc: update ice document for qos

 doc/guides/nics/ice.rst  |  19 +++
 drivers/net/ice/ice_ethdev.h |  12 +-
 drivers/net/ice/ice_tm.c | 317 +--
 3 files changed, 134 insertions(+), 214 deletions(-)

-- 
2.31.1



[PATCH v4 1/3] net/ice: hide port and TC layer in Tx sched tree

2024-01-08 Thread Qi Zhang
In currently 5 layer tree implementation, the port and tc layer
is not configurable, so its not necessary to expose them to application.

The patch hides the top 2 layers and represented the root of the tree at
VSI layer. From application's point of view, its a 3 layer scheduler tree:

Port -> Queue Group -> Queue.

Signed-off-by: Qi Zhang 
Acked-by: Wenjun Wu 
---
 drivers/net/ice/ice_ethdev.h |  7 
 drivers/net/ice/ice_tm.c | 79 
 2 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index fa4981ed14..ae22c29ffc 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -470,7 +470,6 @@ struct ice_tm_shaper_profile {
 struct ice_tm_node {
TAILQ_ENTRY(ice_tm_node) node;
uint32_t id;
-   uint32_t tc;
uint32_t priority;
uint32_t weight;
uint32_t reference_count;
@@ -484,8 +483,6 @@ struct ice_tm_node {
 /* node type of Traffic Manager */
 enum ice_tm_node_type {
ICE_TM_NODE_TYPE_PORT,
-   ICE_TM_NODE_TYPE_TC,
-   ICE_TM_NODE_TYPE_VSI,
ICE_TM_NODE_TYPE_QGROUP,
ICE_TM_NODE_TYPE_QUEUE,
ICE_TM_NODE_TYPE_MAX,
@@ -495,12 +492,8 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list tc_list; /* node list for all the TCs */
-   struct ice_tm_node_list vsi_list; /* node list for all the VSIs */
struct ice_tm_node_list qgroup_list; /* node list for all the queue 
groups */
struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_tc_node;
-   uint32_t nb_vsi_node;
uint32_t nb_qgroup_node;
uint32_t nb_queue_node;
bool committed;
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index c00ecb6a97..d67783c77e 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.tc_list);
-   TAILQ_INIT(&pf->tm_conf.vsi_list);
TAILQ_INIT(&pf->tm_conf.qgroup_list);
TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_tc_node = 0;
-   pf->tm_conf.nb_vsi_node = 0;
pf->tm_conf.nb_qgroup_node = 0;
pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
@@ -79,16 +75,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev)
rte_free(tm_node);
}
pf->tm_conf.nb_qgroup_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_vsi_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_tc_node = 0;
if (pf->tm_conf.root) {
rte_free(pf->tm_conf.root);
pf->tm_conf.root = NULL;
@@ -100,8 +86,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
uint32_t node_id, enum ice_tm_node_type *node_type)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list;
-   struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list;
struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
struct ice_tm_node *tm_node;
@@ -111,20 +95,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
return pf->tm_conf.root;
}
 
-   TAILQ_FOREACH(tm_node, tc_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_TC;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, vsi_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_VSI;
-   return tm_node;
-   }
-   }
-
TAILQ_FOREACH(tm_node, qgroup_list, node) {
if (tm_node->id == node_id) {
*node_type = ICE_TM_NODE_TYPE_QGROUP;
@@ -378,6 +348,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev,
return 0;
 }
 
+#define MAX_QUEUE_PER_GROUP8
+
 static int
 ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
  uint32_t parent_node_id, uint32_t priority,
@@ -391,8 +363,6 @@ ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
struct ice_tm_shaper_profile *shaper_profile = NULL;
struct ice_tm_node *tm_node;
struct ice

[PATCH v4 2/3] net/ice: refactor tm config data structure

2024-01-08 Thread Qi Zhang
Simplified struct ice_tm_conf by removing per level node list.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |   5 +-
 drivers/net/ice/ice_tm.c | 248 ---
 2 files changed, 113 insertions(+), 140 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index ae22c29ffc..008a7a23b9 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -472,6 +472,7 @@ struct ice_tm_node {
uint32_t id;
uint32_t priority;
uint32_t weight;
+   uint32_t level;
uint32_t reference_count;
struct ice_tm_node *parent;
struct ice_tm_node **children;
@@ -492,10 +493,6 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list qgroup_list; /* node list for all the queue 
groups */
-   struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_qgroup_node;
-   uint32_t nb_queue_node;
bool committed;
bool clear_on_fail;
 };
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index d67783c77e..fbab0b8808 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -6,6 +6,9 @@
 #include "ice_ethdev.h"
 #include "ice_rxtx.h"
 
+#define MAX_CHILDREN_PER_SCHED_NODE8
+#define MAX_CHILDREN_PER_TM_NODE   256
+
 static int ice_hierarchy_commit(struct rte_eth_dev *dev,
 int clear_on_fail,
 __rte_unused struct rte_tm_error *error);
@@ -43,20 +46,28 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.qgroup_list);
-   TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_qgroup_node = 0;
-   pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
pf->tm_conf.clear_on_fail = false;
 }
 
+static void free_node(struct ice_tm_node *root)
+{
+   uint32_t i;
+
+   if (root == NULL)
+   return;
+
+   for (i = 0; i < root->reference_count; i++)
+   free_node(root->children[i]);
+
+   rte_free(root);
+}
+
 void
 ice_tm_conf_uninit(struct rte_eth_dev *dev)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct ice_tm_shaper_profile *shaper_profile;
-   struct ice_tm_node *tm_node;
 
/* clear profile */
while ((shaper_profile = 
TAILQ_FIRST(&pf->tm_conf.shaper_profile_list))) {
@@ -64,52 +75,8 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev)
rte_free(shaper_profile);
}
 
-   /* clear node configuration */
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_queue_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_qgroup_node = 0;
-   if (pf->tm_conf.root) {
-   rte_free(pf->tm_conf.root);
-   pf->tm_conf.root = NULL;
-   }
-}
-
-static inline struct ice_tm_node *
-ice_tm_node_search(struct rte_eth_dev *dev,
-   uint32_t node_id, enum ice_tm_node_type *node_type)
-{
-   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
-   struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
-   struct ice_tm_node *tm_node;
-
-   if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_PORT;
-   return pf->tm_conf.root;
-   }
-
-   TAILQ_FOREACH(tm_node, qgroup_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QGROUP;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, queue_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QUEUE;
-   return tm_node;
-   }
-   }
-
-   return NULL;
+   free_node(pf->tm_conf.root);
+   pf->tm_conf.root = NULL;
 }
 
 static int
@@ -202,11 +169,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id,
return 0;
 }
 
+static struct ice_tm_node *
+find_node(struct ice_tm_node *root, uint32_t id)
+{
+   uint32_t i;
+
+   if (root == NULL || root->id == id)
+   return root;
+
+   for (i = 0; i < root->reference_count; i++) {
+   struct ice_tm_node *node = find_node(root->children[i], id);
+
+ 

[PATCH v4 3/3] doc: update ice document for qos

2024-01-08 Thread Qi Zhang
Add description for ice PMD's rte_tm capabilities.

Signed-off-by: Qi Zhang 
Acked-by: Wenjun Wu 
---
 doc/guides/nics/ice.rst | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index bafb3ba022..163d6b8bb6 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -352,6 +352,25 @@ queue 3 using a raw pattern::
 
 Currently, raw pattern support is limited to the FDIR and Hash engines.
 
+Traffic Management Support
+~~
+
+The ice PMD provides support for the Traffic Management API (RTE_TM), allow
+users to offload a 3-layers Tx scheduler on the E810 NIC:
+
+- ``Port Layer``
+
+  This is the root layer, support peak bandwidth configuration, max to 32 
children.
+
+- ``Queue Group Layer``
+
+  The middel layer, support peak / committed bandwidth, weight, priority 
configurations,
+  max to 8 children.
+
+- ``Queue Layer``
+
+  The leaf layer, support peak / committed bandwidth, weight, priority 
configurations.
+
 Additional Options
 ++
 
-- 
2.31.1



Re: [PATCH v6 3/7] dts: add optional packet filtering to scapy sniffer

2024-01-08 Thread Juraj Linkeš
On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Added the options to filter out LLDP and ARP packets when
> sniffing for packets with scapy. This was done using BPF filters to
> ensure that the noise these packets provide does not interfere with test
> cases.
>
> Signed-off-by: Jeremy Spewock 
> ---
>  dts/framework/test_suite.py   | 15 +--
>  dts/framework/testbed_model/tg_node.py| 14 --
>  .../traffic_generator/__init__.py |  7 -
>  .../capturing_traffic_generator.py| 22 ++-
>  .../testbed_model/traffic_generator/scapy.py  | 27 +++
>  5 files changed, 79 insertions(+), 6 deletions(-)
>



> diff --git 
> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
>  
> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> index 0246590333..c1c9facedd 100644
> --- 
> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> +++ 
> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py

> @@ -26,6 +27,19 @@ def _get_default_capture_name() -> str:
>  return str(uuid.uuid4())
>
>
> +@dataclass(slots=True)

This should also be frozen. If we need a different filter, it's better
to create a new object I think.

> +class PacketFilteringConfig:
> +"""The supported filtering options for 
> :class:`CapturingTrafficGenerator`.
> +
> +Attributes:
> +no_lldp: If :data:`True`, LLDP packets will be filtered out when 
> capturing.
> +no_arp: If :data:`True`, ARP packets will be filtered out when 
> capturing.
> +"""
> +
> +no_lldp: bool = True
> +no_arp: bool = True
> +
> +
>  class CapturingTrafficGenerator(TrafficGenerator):
>  """Capture packets after sending traffic.
>

> diff --git a/dts/framework/testbed_model/traffic_generator/scapy.py 
> b/dts/framework/testbed_model/traffic_generator/scapy.py
> index 5b60f66237..505de0be94 100644
> --- a/dts/framework/testbed_model/traffic_generator/scapy.py
> +++ b/dts/framework/testbed_model/traffic_generator/scapy.py

> @@ -260,11 +263,34 @@ def _send_packets(self, packets: list[Packet], port: 
> Port) -> None:
>  packets = [packet.build() for packet in packets]
>  self.rpc_server_proxy.scapy_send_packets(packets, port.logical_name)
>
> +def _create_packet_filter(self, filter_config: PacketFilteringConfig) -> 
> str:
> +"""Combines filter settings from `filter_config` into a BPF that 
> scapy can use.
> +
> +Scapy allows for the use of Berkeley Packet Filters (BPFs) to filter 
> what packets are
> +collected based on various attributes of the packet.
> +
> +Args:
> +filter_config: Config class that specifies which filters should 
> be applied.
> +
> +Returns:
> +A string representing the combination of BPF filters to be 
> passed to scapy. For
> +example:
> +
> +"ether[12:2] != 0x88cc && ether[12:2] != 0x0806"
> +"""
> +bpf_filter: list[str] = []

The type hint here is not needed, so let's make this consistent with
the rest of the code - we don't specify local type hints if they're
not necessary.

> +if filter_config.no_arp:
> +bpf_filter.append("ether[12:2] != 0x0806")
> +if filter_config.no_lldp:
> +bpf_filter.append("ether[12:2] != 0x88cc")
> +return " && ".join(bpf_filter)
> +
>  def _send_packets_and_capture(
>  self,
>  packets: list[Packet],
>  send_port: Port,
>  receive_port: Port,
> +filter_config: PacketFilteringConfig,
>  duration: float,
>  capture_name: str = _get_default_capture_name(),
>  ) -> list[Packet]:


Re: DTS testpmd and SCAPY integration

2024-01-08 Thread Luca Vizzarro

Hi Gregory,

Your proposal sounds rather interesting. Certainly enabling DTS to 
accept YAML-written tests sounds more developer-friendly and should 
enable quicker test-writing. As this is an extra feature though – and a 
nice-to-have, it should definitely be discussed in the DTS meetings as 
Honnappa suggested already.


Another thing, I am not sure that the intention is that Scapy will be 
the only traffic generator supported. Could be very wrong here. In the 
case we'd support others too, how would you tackle this problem? We can 
discuss this in the meeting as well if needed.


Best,
Luca


Re: DTS testpmd and SCAPY integration

2024-01-08 Thread Luca Vizzarro

Hi Honnappa,


YAML has wide support built around it. By using our own text format, we will 
have to build the parsing support etc ourselves.

However, YAML is supposed to be easy to read and understand. Is it just a 
matter for getting used to it?


I may be wrong, please feel free to correct me, but I believe what 
Gregory meant is to write raw Scapy and testpmd commands directly in 
YAML-written tests. As in, we wouldn't be using our own text format, 
we'd just need to feed the raw commands from YAML straight to the Scapy 
and testpmd shells. Finally, validate their responses to what was set in 
the YAML test. In Gregory's example this is just comparing stdout.


Best,
Luca


RE: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Hemant Agrawal

On 08-Jan-24 2:01 PM, Jerin Jacob wrote:
> On Mon, Jan 8, 2024 at 1:47 PM Hemant Agrawal  wrote:
>>
>>
>> On 08-Jan-24 1:28 PM, jer...@marvell.com wrote:
>>> From: Jerin Jacob 
>>>
>>> Define qualification criteria for external library
>>> based on a techboard meeting minutes [1] and past
>>> learnings from mailing list discussion.
>>>
>>> [1]
>>> http://mails.dpdk.org/archives/dev/2019-June/135847.html
>>> https://mails.dpdk.org/archives/dev/2024-January/284849.html
>>>
>>> Signed-off-by: Jerin Jacob 
>>> Acked-by: Thomas Monjalon 
>
>>> +#. **Distributions License:**
>>> +
>>> +   - No specific constraints beyond documentation.
>>
>> Though we are not mandatory open distribution. However we should ask for the
>> defining the distribution aspect clearly in the library.
>
> How about following then,
>
> No specific constraints, but clear documentation on distribution usage
> aspects is required.
>
> If not, please suggest the exact wording.

I think above is ok.


>


smime.p7s
Description: S/MIME cryptographic signature


RE: [PATCH v2] lib/dmadev: get DMA device using device ID

2024-01-08 Thread Anoob Joseph
> 
> DMA library has a function to get DMA device based on device name but
> there is no function to get DMA device using device id.
> 
> Added a function that lookup for the dma device using device id and returns
> the pointer to the same.
> 
> Signed-off-by: Amit Prakash Shukla 
> Acked-by: Chengwen Feng 

Acked-by: Anoob Joseph 




Re: [PATCH v6 4/7] dts: add pci addresses to EAL parameters

2024-01-08 Thread Juraj Linkeš
Reviewed-by: Juraj Linkeš 

On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Added allow list to the EAL parameters created in DTS to ensure that
> only the relevant PCI devices are considered when launching DPDK
> applications.
>
> Signed-off-by: Jeremy Spewock 
> ---
>  dts/framework/testbed_model/sut_node.py | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/dts/framework/testbed_model/sut_node.py 
> b/dts/framework/testbed_model/sut_node.py
> index 4df18bc183..cc894fb07d 100644
> --- a/dts/framework/testbed_model/sut_node.py
> +++ b/dts/framework/testbed_model/sut_node.py
> @@ -30,6 +30,7 @@
>  from .cpu import LogicalCoreCount, LogicalCoreList
>  from .node import Node
>  from .os_session import InteractiveShellType, OSSession
> +from .port import Port
>  from .virtual_device import VirtualDevice
>
>
> @@ -46,6 +47,7 @@ def __init__(
>  prefix: str,
>  no_pci: bool,
>  vdevs: list[VirtualDevice],
> +ports: list[Port],
>  other_eal_param: str,
>  ):
>  """Initialize the parameters according to inputs.
> @@ -63,6 +65,7 @@ def __init__(
>  VirtualDevice('net_ring0'),
>  VirtualDevice('net_ring1')
>  ]
> +ports: The list of ports to allow.
>  other_eal_param: user defined DPDK EAL parameters, e.g.:
>  ``other_eal_param='--single-file-segments'``
>  """
> @@ -73,6 +76,7 @@ def __init__(
>  self._prefix = f"--file-prefix={prefix}"
>  self._no_pci = "--no-pci" if no_pci else ""
>  self._vdevs = " ".join(f"--vdev {vdev}" for vdev in vdevs)
> +self._ports = " ".join(f"-a {port.pci}" for port in ports)
>  self._other_eal_param = other_eal_param
>
>  def __str__(self) -> str:
> @@ -83,6 +87,7 @@ def __str__(self) -> str:
>  f"{self._prefix} "
>  f"{self._no_pci} "
>  f"{self._vdevs} "
> +f"{self._ports} "
>  f"{self._other_eal_param}"
>  )
>
> @@ -347,6 +352,7 @@ def create_eal_parameters(
>  append_prefix_timestamp: bool = True,
>  no_pci: bool = False,
>  vdevs: list[VirtualDevice] | None = None,
> +ports: list[Port] | None = None,
>  other_eal_param: str = "",
>  ) -> "EalParameters":
>  """Compose the EAL parameters.
> @@ -370,6 +376,8 @@ def create_eal_parameters(
>  VirtualDevice('net_ring0'),
>  VirtualDevice('net_ring1')
>  ]
> +ports: The list of ports to allow. If :data:`None`, all ports 
> listed in `self.ports`
> +will be allowed.
>  other_eal_param: user defined DPDK EAL parameters, e.g.:
>  ``other_eal_param='--single-file-segments'``.
>
> @@ -388,12 +396,16 @@ def create_eal_parameters(
>  if vdevs is None:
>  vdevs = []
>
> +if ports is None:
> +ports = self.ports
> +
>  return EalParameters(
>  lcore_list=lcore_list,
>  memory_channels=self.config.memory_channels,
>  prefix=prefix,
>  no_pci=no_pci,
>  vdevs=vdevs,
> +ports=ports,
>  other_eal_param=other_eal_param,
>  )
>
> --
> 2.43.0
>


Re: [PATCH v6 5/7] dts: allow configuring MTU of ports

2024-01-08 Thread Juraj Linkeš
Reviewed-by: Juraj Linkeš 

On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Adds methods in both os_session and linux session to allow for setting
> MTU of port interfaces so that suites that require the sending and
> receiving of packets of a specific size, or the rejection of packets
> over a certain size, can configure this maximum as needed.
>
> Signed-off-by: Jeremy Spewock 


Re: [PATCH v6 6/7] dts: add scatter to the yaml schema

2024-01-08 Thread Juraj Linkeš
Reviewed-by: Juraj Linkeš 

On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> Allow for scatter to be specified in the configuration file.
>
> Signed-off-by: Jeremy Spewock 


RE: Issues around packet capture when secondary process is doing rx/tx

2024-01-08 Thread Konstantin Ananyev



> I have been looking at a problem reported by Sandesh
> where packet capture does not work if rx/tx burst is done in secondary 
> process.
> 
> The root cause is that existing rx/tx callback model just doesn't work
> unless the process doing the rx/tx burst calls is the same one that
> registered the callbacks.
> 
> An example sequence would be:
>   1. dumpcap (or pdump) as secondary tells pdump in primary to register 
> callback
>   2. secondary process calls rx_burst.
>   3. rx_burst sees the callback but it has pointer pdump_rx which is not 
> necessarily
>  at same location in primary and secondary process.
>   4. indirect function call in secondary to bad location likely causes 
> crash.

As I remember, RX/TX callbacks were never intended to work over multiple 
processes.
Right now RX/TX callbacks are private for the process, different process simply 
should not
see/execute them.
I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the 
rte_eth_dev.data that is shared
between processes.
It should be normal, wehn for the same port/queue you will end-up with 
different list of callbacks
for different processes.  
So, unless I am missing something, I don't see how we can end-up with 3) and 4) 
from above:
>From my understanding secondary process will never see/call primary's 
>callbacks.

About pdump itself, it was a while when I looked at it last time, but as I 
remember to start it to work,
server process has to call rte_pdump_init() which in terns register PDUMP_MP 
handler.
I suppose for the secondary process to act as a 'pdump server' it needs to call 
rte_pdump_init() itself,
though I am not sure such option is supported right now. 
 
> 
> Some possible workarounds.
>   1. Keep callback list per-process: messy, but won't crash. Capture 
> won't work
>without other changes. In this primary would register callback, 
> but secondaries
>would not use them in rx/tx burst.
> 
>   2. Replace use of rx/tx callback in pdump with change to rte_ethdev to 
> have
>a capture flag. (i.e. don't use indirection).  Likely ABI problems.
>Basically, ignore the rx/tx callback mechanism. This is my 
> preferred
>  solution.

It is not only the capture flag, it is also what to do with the captured packets
(copy? If yes, then where to? examine? drop?, do something else?).
It is probably not the best choice to add all these things into ethdev API.

>   3. Some fix up mechanism (in EAL mp support?) to have each process fixup
>its callback mechanism.
 
Probably the easiest way to fix that - pass to rte_pdump_enable() extra 
information
that  would allow it to distinguish on what exact process (local, remote)
we want to enable pdump functionality. Then it could act accordingly.

> 
>   4. Do something in pdump_init to register the callback in same process 
> context
>  (probably need callbacks to be per-process). Would mean callback is 
> always
>on independent of capture being enabled.
> 
> 5. Get rid of indirect function call pointer, and replace it by index 
> into
>a static table of callback functions. Every process would have 
> same code
>(in this case pdump_rx) but at different address.  Requires all 
> callbacks
>to be statically defined at build time.

Doesn't look like a good approach - it will break many things. 
 
> The existing rx/tx callback is not safe id rx/tx burst is called from 
> different process
> than where callback is registered.
 



RE: unnecessary rx callbacks when zero packets

2024-01-08 Thread Konstantin Ananyev


rx callbacks when zero packets
> 
> I noticed while looking at packet capture that currently the receive callbacks
> get called even if there are no packets. This seems unnecessary since if
> nb_rx is zero, then there are no packets to look at.  My one concern is that
> an application could be using callbacks as some form of scheduling mechanism
> which would be broken.

As I remember, original idea was to allow callbacks to inject new packets if 
needed.

> 
> The change would be:
> 
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903ec..f64bf977c46e 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6077,7 +6077,7 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> nb_rx = p->rx_pkt_burst(qd, rx_pkts, nb_pkts);
> 
>  #ifdef RTE_ETHDEV_RXTX_CALLBACKS
> -   {
> +   if (nb_rx > 0) {
> void *cb;
> 
> /* rte_memory_order_release memory order was used when the


Re: [PATCH 2/3] net/nfp: fix free resource problem

2024-01-08 Thread Ferruh Yigit
On 12/18/2023 1:50 AM, Chaoyong He wrote:
>> On 12/14/2023 10:24 AM, Chaoyong He wrote:
>>> From: Long Wu 
>>>
>>> Set the representor array to NULL to avoid that close interface does
>>> not free some resource.
>>>
>>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower firmware")
>>> Cc: chaoyong...@corigine.com
>>> Cc: sta...@dpdk.org
>>>
>>> Signed-off-by: Long Wu 
>>> Reviewed-by: Chaoyong He 
>>> Reviewed-by: Peng Zhang 
>>> ---
>>>  drivers/net/nfp/flower/nfp_flower_representor.c | 15 ++-
>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c
>>> b/drivers/net/nfp/flower/nfp_flower_representor.c
>>> index 27ea3891bd..5f7c1fa737 100644
>>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c
>>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
>>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void *tx_queue,  static
>>> int  nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev)  {
>>> +   uint16_t index;
>>> struct nfp_flower_representor *repr;
>>>
>>> repr = eth_dev->data->dev_private;
>>> rte_ring_free(repr->ring);
>>>
>>> +   if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
>>> +   index = NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr-
>>> port_id);
>>> +   repr->app_fw_flower->phy_reprs[index] = NULL;
>>> +   } else {
>>> +   index = repr->vf_id;
>>> +   repr->app_fw_flower->vf_reprs[index] = NULL;
>>> +   }
>>> +
>>> return 0;
>>>  }
>>>
>>>  static int
>>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev *eth_dev)
>>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev)
>>>  {
>>> +   struct nfp_flower_representor *repr = eth_dev->data->dev_private;
>>> +
>>> +   repr->app_fw_flower->pf_repr = NULL;
>>>
>>
>> Here it is assigned to NULL but is it freed? If freed, why not set to NULL 
>> where
>> it is freed?
>>
>> Same for above phy_reprs & vf_reprs.
> 
> The whole invoke view:
> rte_eth_dev_close()
> --> nfp_flower_repr_dev_close()
> --> nfp_flower_repr_free()
> --> nfp_flower_pf_repr_uninit()
> --> nfp_flower_repr_uninit()
>// In these two functions, we just assigned to NULL but not freed 
> yet.
>// It is still refer by the `eth_dev->data->dev_private`.
> --> rte_eth_dev_release_port()
> --> rte_free(eth_dev->data->dev_private);
> // And here it is really freed (by the rte framework).
> 

'rte_eth_dev_release_port()' frees the device private data, but not all
pointers, like 'repr->app_fw_flower->pf_repr', in the struct are freed,
it is dev_close() or unint() functions responsibility.

Can you please double check if
'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not?



Re: [PATCH v6 7/7] dts: add pmd_buffer_scatter test suite

2024-01-08 Thread Juraj Linkeš
On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>
> From: Jeremy Spewock 
>
> This test suite provides testing of the support of scattered packets by
> Poll Mode Drivers using testpmd, verifying the ability to receive and
> transmit scattered multi-segment packets made up of multiple
> non-contiguous memory buffers. This is tested through 5 different cases
> in which the length of the packets sent are less than the mbuf size,
> equal to the mbuf size, and 1, 4, and 5 bytes greater than the mbuf size
> in order to show both the CRC and the packet data are capable of
> existing in the first, second, or both buffers.
>
> Naturally, if the PMD is capable of forwarding scattered packets which
> it receives as input, this shows it is capable of both receiving and
> transmitting scattered packets.
>
> Signed-off-by: Jeremy Spewock 
> ---
>  dts/tests/TestSuite_pmd_buffer_scatter.py | 126 ++
>  1 file changed, 126 insertions(+)
>  create mode 100644 dts/tests/TestSuite_pmd_buffer_scatter.py
>
> diff --git a/dts/tests/TestSuite_pmd_buffer_scatter.py 
> b/dts/tests/TestSuite_pmd_buffer_scatter.py
> new file mode 100644
> index 00..8838c3404f
> --- /dev/null
> +++ b/dts/tests/TestSuite_pmd_buffer_scatter.py
> @@ -0,0 +1,126 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023-2024 University of New Hampshire
> +
> +"""Multi-segment packet scattering testing suite.

Test suite - I guess this is a copy-paste?

> +
> +This testing suite tests the support of transmitting and receiving scattered 
> packets. This is shown
> +by the Poll Mode Driver being able to forward scattered multi-segment 
> packets composed of multiple
> +non-contiguous memory buffers. To ensure the receipt of scattered packets, 
> the DMA rings of the
> +port's RX queues must be configured with mbuf data buffers whose size is 
> less than the maximum
> +length.
> +
> +If it is the case that the Poll Mode Driver can forward scattered packets 
> which it receives, then
> +this suffices to show the Poll Mode Driver is capable of both receiving and 
> transmitting scattered
> +packets.
> +"""

We have a newline between the docstring and the import everywhere.

> +import struct
> +
> +from scapy.layers.inet import IP  # type: ignore[import]
> +from scapy.layers.l2 import Ether  # type: ignore[import]
> +from scapy.packet import Raw  # type: ignore[import]
> +from scapy.utils import hexstr  # type: ignore[import]
> +
> +from framework.remote_session.testpmd_shell import TestPmdForwardingModes, 
> TestPmdShell
> +from framework.test_suite import TestSuite
> +
> +
> +class PmdBufferScatter(TestSuite):
> +"""DPDK PMD packet scattering test suite.
> +
> +Configure the Rx queues to have mbuf data buffers whose sizes are 
> smaller than the maximum
> +packet size. Specifically, set mbuf data buffers to have a size of 2048 
> to fit a full 1512-byte
> +(CRC included) ethernet frame in a mono-segment packet. The testing of 
> scattered packets is
> +done by sending a packet whose length is greater than the size of the 
> configured size of mbuf
> +data buffers. There are a total of 5 packets sent within test cases 
> which have lengths less
> +than, equal to, and greater than the mbuf size. There are multiple 
> packets sent with lengths
> +greater than the mbuf size in order to test cases such as:
> +
> +1. A single byte of the CRC being in a second buffer while the remaining 
> 3 bytes are stored in
> +the first buffer alongside packet data.
> +2. The entire CRC being stored in a second buffer while all of the 
> packet data is stored in the
> +first.
> +3. Most of the packet data being stored in the first buffer and a single 
> byte of packet data
> +stored in a second buffer alongside the CRC.
> +"""
> +
> +def set_up_suite(self) -> None:
> +"""Set up the test suite.
> +
> +Setup:
> +Verify they we have at least 2 port links in the current 
> execution and increase the MTU

Typo - they.

> +of both ports on the tg_node to 9000 to support larger packet 
> sizes.

The description should be code agnostic, so let's use traffic
generator node instead of tg_node.

> +"""
> +self.verify(
> +len(self._port_links) > 1,
> +"Must have at least two port links to run scatter",

I'd like this to be at least "Must have at least two port links to run
the scatter test suite" so that it's immediately obvious where this
comes from. I'm also debating which of these is better: "Must have at
least" or "There must be at least", but I'm not sure.

> +)
> +
> +self.tg_node.main_session.configure_port_mtu(9000, 
> self._tg_port_egress)
> +self.tg_node.main_session.configure_port_mtu(9000, 
> self._tg_port_ingress)
> +
> +def scatter_pktgen_send_packet(self, pktsize: int) -> str:
> +"""Generate and send packet to the SUT.

send a packet

But this also captures a

[PATCH v12] gro: fix reordering of packets in GRO layer

2024-01-08 Thread Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set which is how the Linux GRO is also implemented

Signed-off-by: Kumara Parameshwaran 
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow, if present we indulge
the received packet also in the coalescing logic and update the flags
of the last recived packet to the entire segment which would avoid
re-ordering.

Lets say a case where P1(PSH), P2(ACK), P3(ACK)  are received in burst 
mode,
P1 contains PSH flag and since it does not contain any prior packets in 
the flow
we copy it to unprocess_packets and P2(ACK) and P3(ACK) are merged 
together.
In the existing case the  P2,P3 would be delivered as single segment 
first and the
unprocess_packets will be copied later which will cause reordering. 
With the patch
copy the unprocess packets first and then the packets from the GRO 
table.

Testing done
The csum test-pmd was modified to support the following
GET request of 10MB from client to server via test-pmd (static arp 
entries added in client
and server). Enable GRO and TSO in test-pmd where the packets recived 
from the client mac
would be sent to server mac and vice versa.
In above testing, without the patch the client observerd re-ordering of 
25 packets
and with the patch there were no packet re-ordering observerd.

v2: 
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST 
flag.

v3:
Fix warnings.

v4:
Rebase with master.

v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code 
and handle only special flags PSH,FIN

v7:
Fix warnings and errors

v8:
Fix warnings and errors

v9:
Fix commit message 

v10:
Update tcp header flags and address review comments 

v11: 
Fix warnings

v12:
Fix nit review comments

 lib/gro/gro_tcp.h  |  9 +++
 lib/gro/gro_tcp4.c | 48 --
 lib/gro/gro_tcp_internal.h |  2 +-
 lib/gro/gro_vxlan_tcp4.c   |  5 ++--
 4 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
 #define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
 
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | 
RTE_TCP_FIN_FLAG)
+
 struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+   uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
 {
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+   struct rte_tcp_hdr *tcp_hdr;
 
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+   if (tcp_flags != RTE_TCP_ACK_FLAG) {
+   tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+   l2_offset + pkt_head->l2_len + 
pkt_head->l3_len);
+   tcp_hdr->tcp_flags |= tcp_flags;
+   }
 
return 1;
 }
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..ad9cf04dbe 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+   uint32_t item_start_idx;
 
/*
 * Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-   /*
-* Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-* or CWR set.
-*/
-   if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+   /* Return early if the TCP flags are not handled in GRO layer */
+   if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
return -1;
 
/* trim the

Re: [BUG] [bonding] bonding member delete bug

2024-01-08 Thread Ferruh Yigit
On 12/18/2023 6:37 AM, Simon Jones wrote:
> Oh, it's fixed by 0911d4ec and f5e72e8e
>

Thanks Simon for reporting.

Do you know if the above fixes backported to the 21.11.x LTS release?


> 
> Simon Jones
> 
> 
> Simon Jones mailto:batmanu...@gmail.com>> 于2023
> 年12月18日周一 10:51写道:
> 
> Hi all,
> 
> I'm using DPDK-21.11 in ovs-dpdk.
> 
> I found a "bonding member delete bug" .
> 
> 1. How to reproduce
> 
> ```
> NOTICE: bondctl is a tool I develop, it's to control DPDK.
> 
> ### step 1, Add bonding device bond0.
> bondctl add bond0 mode active-backup
> 
> ### step 2, Add member m1 into bond0.
> bondctl set :00:0a.0 master bond0 
> 
> ### step 3, Add bond0 into ovs bridge.
> ovs-vsctl add-port brp0 bond0 -- set interface bond0 type=dpdk
> options:dpdk-devargs=net_bonding-bond0
> (this command call @bond_ethdev_start at last.)
> 
> ### step 4, Delete bond0 from ovs bridge.
> ovs-vsctl del-port br-phy bond0
> (this command call @bond_ethdev_stop at last.)
> 
> ### step 5, Delete m1 from bond0.
> bondctl set :00:0a.0 nomaster
> 
> ### step 6, Delete bond0.
> bondctl del bond0
> 
> ### step 7, Add bond0.
> bondctl add bond0 mode active-backup
> 
> ### step 8, Add member m1 into bond0.
> bondctl set :00:0a.0 master bond0
> (this command call @bond_ethdev_start at last.)
> 
> ### Then got error message.
> 2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to
> allow configurr
> ation
> 2023-12-15T08:24:04.153Z|00018|dpdk|ERR|bond_cmd_set_master(581) -
> can not confii
> g slave :00:0a.0!
> ```
> 
> 2. Debug
> 
> I found the reason is, when member port is DOWN, then add operation
> will call "eth_dev->data->dev_started = 1;", but no one add active
> member port, so when delete bond0, will NOT call @rte_eth_dev_stop,
> then add bond0 again, got error. Detail is:
> ```
> ### After step 1-3, add bond0 into ovs-dpdk
> bond_ethdev_start
>     eth_dev->data->dev_started = 1;
>     for (i = 0; i < internals->slave_count; i++) {
>         if (slave_configure(eth_dev, slave_ethdev) != 0) {
>         if (slave_start(eth_dev, slave_ethdev) != 0) {
>             rte_eth_dev_start
> 
> ### NOTICE, as member port is DOWN, so will NOT
> call @activate_slave, so @active_slave_count is 0.
> bond_ethdev_lsc_event_callback
>     activate_slave(bonded_eth_dev, port_id);
> 
> ### After step 4, delete bond0 from ovs-dpdk, NOTICE,
> as @active_slave_count is 0, so will NOT call @rte_eth_dev_stop
> bond_ethdev_stop
>     for (i = 0; i < internals->slave_count; i++) {
>         if (find_slave_by_id(internals->active_slaves,
>                 internals->active_slave_count, slave_id) !=
>                         internals->active_slave_count) {
>             ret = rte_eth_dev_stop(slave_id);
> 
> ### After step 5-7, delete bond0 and then add bond0
> 
> ### After step 8, add bond0, as it's NOT call @rte_eth_dev_stop, so
> call @rte_eth_dev_start again will got error.
> 2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to
> allow configurr
> ation
> 
> ```
> 
> 3. My question
> 
> Is this bug fixed ? Which commit ?
> 
> If NOT, how to fix this bug? I think it's better to
> call @rte_eth_dev_stop for every member, even it's DOWN. How about this?
> 
> Thanks~
> 
> 
> 
> Simon Jones
> 



[PATCH v13] gro: fix reordering of packets in GRO layer

2024-01-08 Thread Kumara Parameshwaran
In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set which is how the Linux GRO is also implemented

Signed-off-by: Kumara Parameshwaran 
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow, if present we indulge
the received packet also in the coalescing logic and update the flags
of the last recived packet to the entire segment which would avoid
re-ordering.

Lets say a case where P1(PSH), P2(ACK), P3(ACK)  are received in burst 
mode,
P1 contains PSH flag and since it does not contain any prior packets in 
the flow
we copy it to unprocess_packets and P2(ACK) and P3(ACK) are merged 
together.
In the existing case the  P2,P3 would be delivered as single segment 
first and the
unprocess_packets will be copied later which will cause reordering. 
With the patch
copy the unprocess packets first and then the packets from the GRO 
table.

Testing done
The csum test-pmd was modified to support the following
GET request of 10MB from client to server via test-pmd (static arp 
entries added in client
and server). Enable GRO and TSO in test-pmd where the packets recived 
from the client mac
would be sent to server mac and vice versa.
In above testing, without the patch the client observerd re-ordering of 
25 packets
and with the patch there were no packet re-ordering observerd.

v2: 
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST 
flag.

v3:
Fix warnings.

v4:
Rebase with master.

v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code 
and handle only special flags PSH,FIN

v7:
Fix warnings and errors

v8:
Fix warnings and errors

v9:
Fix commit message 

v10:
Update tcp header flags and address review comments 

v11:  
Fix warnings

v12:
Fix nit review comments

v13:
Fix warnings 

 lib/gro/gro_tcp.h  |  9 +
 lib/gro/gro_tcp4.c | 36 +++-
 lib/gro/gro_tcp_internal.h |  2 +-
 lib/gro/gro_vxlan_tcp4.c   |  5 +++--
 4 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..2c68b5f23e 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -19,6 +19,8 @@
 #define INVALID_TCP_HDRLEN(len) \
(((len) < sizeof(struct rte_tcp_hdr)) || ((len) > MAX_TCP_HLEN))
 
+#define VALID_GRO_TCP_FLAGS (RTE_TCP_ACK_FLAG | RTE_TCP_PSH_FLAG | 
RTE_TCP_FIN_FLAG)
+
 struct cmn_tcp_key {
struct rte_ether_addr eth_saddr;
struct rte_ether_addr eth_daddr;
@@ -81,11 +83,13 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
struct rte_mbuf *pkt,
int cmp,
uint32_t sent_seq,
+   uint8_t tcp_flags,
uint16_t ip_id,
uint16_t l2_offset)
 {
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t hdr_len, l2_len;
+   struct rte_tcp_hdr *tcp_hdr;
 
if (cmp > 0) {
pkt_head = item->firstseg;
@@ -128,6 +132,11 @@ merge_two_tcp_packets(struct gro_tcp_item *item,
/* update MBUF metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
pkt_head->pkt_len += pkt_tail->pkt_len;
+   if (tcp_flags != RTE_TCP_ACK_FLAG) {
+   tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *,
+   l2_offset + pkt_head->l2_len + 
pkt_head->l3_len);
+   tcp_hdr->tcp_flags |= tcp_flags;
+   }
 
return 1;
 }
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..c8b8d7990c 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+   uint32_t item_start_idx;
 
/*
 * Don't process the packet whose TCP header length is greater
@@ -139,11 +140,8 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
 
-   /*
-* Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-* or CWR set.
-*/
-   if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
+   /* Return early if the TCP flags are not handled in GRO layer */
+   if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
return

Re: [BUG] [bonding] bonding member delete bug

2024-01-08 Thread Kevin Traynor
On 08/01/2024 15:55, Ferruh Yigit wrote:
> On 12/18/2023 6:37 AM, Simon Jones wrote:
>> Oh, it's fixed by 0911d4ec and f5e72e8e
>>
> 
> Thanks Simon for reporting.
> 
> Do you know if the above fixes backported to the 21.11.x LTS release?
> 

Yes, 0911d4ec as part of 18.11 [0] and f5e72e8e backported to 21.11
branch since v21.11.2 [1]

[0]
https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=0911d4ec01839c9149a0df5758d00d9d57a47cea

[1]
https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=5a8afc69afabd3c69efbc1b0c048f31d06f7d875

thanks,
Kevin.

> 
>> 
>> Simon Jones
>>
>>
>> Simon Jones mailto:batmanu...@gmail.com>> 于2023
>> 年12月18日周一 10:51写道:
>>
>> Hi all,
>>
>> I'm using DPDK-21.11 in ovs-dpdk.
>>
>> I found a "bonding member delete bug" .
>>
>> 1. How to reproduce
>>
>> ```
>> NOTICE: bondctl is a tool I develop, it's to control DPDK.
>>
>> ### step 1, Add bonding device bond0.
>> bondctl add bond0 mode active-backup
>>
>> ### step 2, Add member m1 into bond0.
>> bondctl set :00:0a.0 master bond0 
>>
>> ### step 3, Add bond0 into ovs bridge.
>> ovs-vsctl add-port brp0 bond0 -- set interface bond0 type=dpdk
>> options:dpdk-devargs=net_bonding-bond0
>> (this command call @bond_ethdev_start at last.)
>>
>> ### step 4, Delete bond0 from ovs bridge.
>> ovs-vsctl del-port br-phy bond0
>> (this command call @bond_ethdev_stop at last.)
>>
>> ### step 5, Delete m1 from bond0.
>> bondctl set :00:0a.0 nomaster
>>
>> ### step 6, Delete bond0.
>> bondctl del bond0
>>
>> ### step 7, Add bond0.
>> bondctl add bond0 mode active-backup
>>
>> ### step 8, Add member m1 into bond0.
>> bondctl set :00:0a.0 master bond0
>> (this command call @bond_ethdev_start at last.)
>>
>> ### Then got error message.
>> 2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to
>> allow configurr
>> ation
>> 2023-12-15T08:24:04.153Z|00018|dpdk|ERR|bond_cmd_set_master(581) -
>> can not confii
>> g slave :00:0a.0!
>> ```
>>
>> 2. Debug
>>
>> I found the reason is, when member port is DOWN, then add operation
>> will call "eth_dev->data->dev_started = 1;", but no one add active
>> member port, so when delete bond0, will NOT call @rte_eth_dev_stop,
>> then add bond0 again, got error. Detail is:
>> ```
>> ### After step 1-3, add bond0 into ovs-dpdk
>> bond_ethdev_start
>>     eth_dev->data->dev_started = 1;
>>     for (i = 0; i < internals->slave_count; i++) {
>>         if (slave_configure(eth_dev, slave_ethdev) != 0) {
>>         if (slave_start(eth_dev, slave_ethdev) != 0) {
>>             rte_eth_dev_start
>>
>> ### NOTICE, as member port is DOWN, so will NOT
>> call @activate_slave, so @active_slave_count is 0.
>> bond_ethdev_lsc_event_callback
>>     activate_slave(bonded_eth_dev, port_id);
>>
>> ### After step 4, delete bond0 from ovs-dpdk, NOTICE,
>> as @active_slave_count is 0, so will NOT call @rte_eth_dev_stop
>> bond_ethdev_stop
>>     for (i = 0; i < internals->slave_count; i++) {
>>         if (find_slave_by_id(internals->active_slaves,
>>                 internals->active_slave_count, slave_id) !=
>>                         internals->active_slave_count) {
>>             ret = rte_eth_dev_stop(slave_id);
>>
>> ### After step 5-7, delete bond0 and then add bond0
>>
>> ### After step 8, add bond0, as it's NOT call @rte_eth_dev_stop, so
>> call @rte_eth_dev_start again will got error.
>> 2023-12-15T08:24:04.153Z|00017|dpdk|ERR|Port 0 must be stopped to
>> allow configurr
>> ation
>>
>> ```
>>
>> 3. My question
>>
>> Is this bug fixed ? Which commit ?
>>
>> If NOT, how to fix this bug? I think it's better to
>> call @rte_eth_dev_stop for every member, even it's DOWN. How about this?
>>
>> Thanks~
>>
>>
>> 
>> Simon Jones
>>
> 



Re: [PATCH v11] gro: fix reordering of packets in GRO layer

2024-01-08 Thread kumaraparameshwaran rathinavel
On Sun, Jan 7, 2024 at 10:50 PM Stephen Hemminger <
step...@networkplumber.org> wrote:

> On Sun,  7 Jan 2024 16:59:20 +0530
> Kumara Parameshwaran  wrote:
>
> > + /* Return early if the TCP flags are not handled in GRO layer */
> > + if (tcp_hdr->tcp_flags & (~(VALID_GRO_TCP_FLAGS)))
>
> Nit, lots of extra paren here. Could be:
> if (tcp_hdr->tcp_flags & ~VALID_GRO_TCP_FLAGS)
>
>> Done.
>>
> > + if (find == 1) {
> > + /*
> > +  * Any packet with additional flags like PSH,FIN should be
> processed
> > +  * and flushed immediately.
> > +  * Hence marking the start time to 0, so that the packets
> will be flushed
> > +  * immediately in timer mode.
> > +  */
> > + if (tcp_hdr->tcp_flags & (RTE_TCP_ACK_FLAG |
> RTE_TCP_PSH_FLAG | RTE_TCP_FIN_FLAG)) {
> > + if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
> > + tbl->items[item_start_idx].start_time = 0;
> > + return process_tcp_item(pkt, tcp_hdr, tcp_dl,
> tbl->items,
> > + tbl->flows[i].start_index,
> > + &tbl->item_num,
> tbl->max_item_num,
> > + ip_id, is_atomic,
> start_time);
> > + } else {
> > + return -1;
> > + }
> > + }
>
> Reordering this conditional would keep code from being so indented.
>
>> Doing this reordering as suggested by Jiyau since the find == 1 would be
>> likely in most cases.
>>
> > - delete_tcp_item(tbl->items, item_idx,
> &tbl->item_num, INVALID_ARRAY_INDEX);
> > + delete_tcp_item(tbl->items, item_idx,
> &tbl->item_num,
> > + INVALID_ARRAY_INDEX);
> >   return -1;
>
> This change is unnecessary, max line length in DPDK is 100 characters for
> readability.
>
>> Done.
>>
> >   return 0;
> > + } else {
> > + return -1;
> >   }
> >
> > - return process_tcp_item(pkt, tcp_hdr, tcp_dl, tbl->items,
> tbl->flows[i].start_index,
> > - &tbl->item_num,
> tbl->max_item_num,
> > - ip_id, is_atomic,
> start_time);
> > + return -1;
> >  }
>
> Since end of else and end of function both return -1, the else clause is
> unnecessary.
>
>> Done.
>>
>


Re: Depends-on patchseries support via git-pw or patchwork

2024-01-08 Thread Ferruh Yigit
On 12/22/2023 5:26 PM, Patrick Robb wrote:
> Hi all,
> 
> As some of you know from discussions at DPDK CI meetings, Adam from UNH
> is writing a script which leverages git-pw, and takes as arguments a
> patch series patchwork id, patchwork project, and pw token, and produces
> a project artifact for CI testing purposes. Starting in January we will
> use it for applying patches to DPDK and creating our dpdk.tar.gz
> artifacts for testing. And, we will submit it to the dpdk-ci repo. 
> 
> Anyways, when we originally discussed the idea, Thomas suggested that we
> implement the depends-on functionality by contributing to the git-pw
> project, as opposed to implementing the depend-on support in the create
> artifact script itself. Adam did create a github issue on the git-pw
> project in order to poll the community for interest in this feature, and
> one of the patchwork maintainers chimed in to suggest that rather than
> implementing the feature on the client side via git-pw, it should simply
> be implemented for patchwork itself. That way if it's patchwork server
> side and exposed via the api, other client side tools like pwclient can
> also receive the benefits.
> 
> I just wanted to flag this on the ci mailing list so that anyone with
> thoughts could submit them on the Github issue, which you can find
> here: https://github.com/getpatchwork/git-pw/issues/71
> 
> 
> Thanks Adam for pushing this effort forward. 
>

Thanks Patrick for the update and thanks Adam for driving this.

Implementing support to patchwork sounds good to me, is anything
expected from our end for this?



Re: [PATCH v6 1/7] dts: add startup verification and forwarding modes to testpmd shell

2024-01-08 Thread Jeremy Spewock
On Mon, Jan 8, 2024 at 6:35 AM Juraj Linkeš 
wrote:

> On Wed, Jan 3, 2024 at 11:33 PM  wrote:
> >
> > From: Jeremy Spewock 
> >
> > Added commonly used methods in testpmd such as starting and stopping
> > packet forwarding, changing forward modes, and verifying link status of
> > ports so that developers can configure testpmd and start forwarding
> > through the provided class rather than sending commands to the testpmd
> > session directly.
> >
> > Signed-off-by: Jeremy Spewock 
> > ---
> >  dts/framework/exception.py|   7 +
> >  dts/framework/remote_session/testpmd_shell.py | 149 +-
> >  2 files changed, 155 insertions(+), 1 deletion(-)
> >
> > diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> > index 658eee2c38..cce1e0231a 100644
> > --- a/dts/framework/exception.py
> > +++ b/dts/framework/exception.py
> 
> > @@ -65,9 +108,66 @@ class TestPmdShell(InteractiveShell):
> >  _command_extra_chars: ClassVar[str] = "\n"
> >
> >  def _start_application(self, get_privileged_command:
> Callable[[str], str] | None) -> None:
> > -self._app_args += " -- -i"
> > +"""Overrides :meth:`~.interactive_shell._start_application`.
> > +
> > +Add flags for starting testpmd in interactive mode and
> disabling messages for link state
> > +change events before starting the application. Link state is
> verified before starting
> > +packet forwarding and the messages create unexpected newlines
> in the terminal which
> > +complicates output collection.
> > +
>
> We should adjust the collection so that it can handle the newlines.
> Also, can you explain exactly why we are disabling the initial link
> state messages?
>

The problem really comes from the newlines causing the prompt to exist in
the buffer before any command is sent. So, what ends up happening is after
starting the application these link state change events happen at some
point, and they cause an empty "testpmd>" line to exist in the buffer and
the next time you send a command it will stop as soon as it encounters that
line. An additional issue with this prompt is it is put in the buffer
before the link state change event occurs, and there is no prompt that
appears after the event messages, just an empty line. This makes it much
more difficult to detect when the link state change event occurs and
consume it because the event isn't captured the next time you collect
output, all that is consumed is a line containing the prompt.. So, this
makes you essentially one command's worth of output behind because the next
time you send a command you will consume what you were supposed to get from
the last command where you stopped early, and this causes false positives
for things like the link state detection method and failures in output
verification.

This puts you in a position where the only way you can really detect that
one of these events happened is either assuming that only getting an empty
prompt means one of these events happened, or trying to consume output
twice and looking ahead to see if one of these events happened. However,
because we wouldn't be doing anything with these events and we verify link
status before starting anyway, it seemed like the less complex but still
functional solution would just be to mask these events.


>
> > +Also find the number of pci addresses which were allowed on the
> command line when the app
> > +was started.
> > +"""
> > +self._app_args += " -- -i --mask-event intr_lsc"
> > +self.number_of_ports = self._app_args.count("-a ")
> >  super()._start_application(get_privileged_command)
> >
> > +def start(self, verify: bool = True) -> None:
> > +"""Start packet forwarding with the current configuration.
> > +
> > +Args:
> > +verify: If :data:`True` , a second start command will be
> sent in an attempt to verify
> > +packet forwarding started as expected.
> > +
> > +Raises:
> > +InteractiveCommandExecutionError: If `verify` is
> :data:`True` and forwarding fails to
> > +start or ports fail to come up.
> > +"""
> > +self.send_command("start")
> > +if verify:
> > +# If forwarding was already started, sending "start" again
> should tell us
> > +start_cmd_output = self.send_command("start")
> > +if "Packet forwarding already started" not in
> start_cmd_output:
> > +self._logger.debug(f"Failed to start packet forwarding:
> \n{start_cmd_output}")
> > +raise InteractiveCommandExecutionError("Testpmd failed
> to start packet forwarding.")
> > +
> > +for port_id in range(self.number_of_ports):
> > +if not self.wait_link_status_up(port_id):
> > +raise InteractiveCommandExecutionError(
> > +"Not all ports came up after starting packet
> forwarding 

Re: [PATCH v6 2/7] dts: limit EAL parameters to DPDK apps and add parameters to all apps

2024-01-08 Thread Jeremy Spewock
On Mon, Jan 8, 2024 at 6:52 AM Juraj Linkeš 
wrote:

> On Wed, Jan 3, 2024 at 11:33 PM  wrote:
> >
> > From: Jeremy Spewock 
> >
> > Changed the factory method for creating interactive apps in the SUT Node
> > so that EAL parameters would only be passed into DPDK apps since
> > non-DPDK apps wouldn't be able to process them. Also modified
> > interactive apps to allow for the ability to pass parameters into the
> > app on startup so that the applications can be started with certain
> > configuration steps passed on the command line.
> >
> > Signed-off-by: Jeremy Spewock 
> > ---
> >
> > I ended up reverting part of this back to making the argument for
> > eal_parameters allowed to be a string. This was because it was casuing
> > mypy errors where the method signatures of sut_node did not match with
> > that of node.
> >
>
> This is because the signatures don't actually match :-)
>
> The eal_parameters parameter is added on not top of what's in the base
> methods. I suggest we move eal_parameters to the end and then we don't
> need to allow str for eal_parameters.
>
> >  dts/framework/remote_session/testpmd_shell.py |  2 +-
> >  dts/framework/testbed_model/sut_node.py   | 14 +-
> >  2 files changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/dts/framework/remote_session/testpmd_shell.py
> b/dts/framework/remote_session/testpmd_shell.py
> > index f310705fac..8f40e8f40e 100644
> > --- a/dts/framework/remote_session/testpmd_shell.py
> > +++ b/dts/framework/remote_session/testpmd_shell.py
> > @@ -118,7 +118,7 @@ def _start_application(self, get_privileged_command:
> Callable[[str], str] | None
> >  Also find the number of pci addresses which were allowed on the
> command line when the app
> >  was started.
> >  """
> > -self._app_args += " -- -i --mask-event intr_lsc"
> > +self._app_args += " -i --mask-event intr_lsc"
> >  self.number_of_ports = self._app_args.count("-a ")
> >  super()._start_application(get_privileged_command)
> >
> > diff --git a/dts/framework/testbed_model/sut_node.py
> b/dts/framework/testbed_model/sut_node.py
> > index c4acea38d1..4df18bc183 100644
> > --- a/dts/framework/testbed_model/sut_node.py
> > +++ b/dts/framework/testbed_model/sut_node.py
> > @@ -431,6 +431,7 @@ def create_interactive_shell(
> >  timeout: float = SETTINGS.timeout,
> >  privileged: bool = False,
> >  eal_parameters: EalParameters | str | None = None,
> > +app_parameters: str = "",
>
> What I meant above is we should move app_parameters before
> eal_parameters and then we can remove the str type of eal_parameters.
>

Sounds good to me, I'll move these around in the next version.


>
> >  ) -> InteractiveShellType:
> >  """Extend the factory for interactive session handlers.
> >
> > @@ -449,20 +450,23 @@ def create_interactive_shell(
> >  eal_parameters: List of EAL parameters to use to launch the
> app. If this
> >  isn't provided or an empty string is passed, it will
> default to calling
> >  :meth:`create_eal_parameters`.
> > +app_parameters: Additional arguments to pass into the
> application on the
> > +command-line.
> >
> >  Returns:
> >  An instance of the desired interactive application shell.
> >  """
> > -if not eal_parameters:
> > -eal_parameters = self.create_eal_parameters()
> > -
> > -# We need to append the build directory for DPDK apps
> > +# We need to append the build directory and add EAL parameters
> for DPDK apps
> >  if shell_cls.dpdk_app:
> > +if not eal_parameters:
> > +eal_parameters = self.create_eal_parameters()
> > +app_parameters = f"{eal_parameters} -- {app_parameters}"
> > +
> >  shell_cls.path = self.main_session.join_remote_path(
> >  self.remote_dpdk_build_dir, shell_cls.path
> >  )
> >
> > -return super().create_interactive_shell(shell_cls, timeout,
> privileged, str(eal_parameters))
> > +return super().create_interactive_shell(shell_cls, timeout,
> privileged, app_parameters)
> >
> >  def bind_ports_to_driver(self, for_dpdk: bool = True) -> None:
> >  """Bind all ports on the SUT to a driver.
> > --
> > 2.43.0
> >
>


Re: [PATCH v6 3/7] dts: add optional packet filtering to scapy sniffer

2024-01-08 Thread Jeremy Spewock
On Mon, Jan 8, 2024 at 7:01 AM Juraj Linkeš 
wrote:

> On Wed, Jan 3, 2024 at 11:33 PM  wrote:
> >
> > From: Jeremy Spewock 
> >
> > Added the options to filter out LLDP and ARP packets when
> > sniffing for packets with scapy. This was done using BPF filters to
> > ensure that the noise these packets provide does not interfere with test
> > cases.
> >
> > Signed-off-by: Jeremy Spewock 
> > ---
> >  dts/framework/test_suite.py   | 15 +--
> >  dts/framework/testbed_model/tg_node.py| 14 --
> >  .../traffic_generator/__init__.py |  7 -
> >  .../capturing_traffic_generator.py| 22 ++-
> >  .../testbed_model/traffic_generator/scapy.py  | 27 +++
> >  5 files changed, 79 insertions(+), 6 deletions(-)
> >
>
> 
>
> > diff --git
> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> > index 0246590333..c1c9facedd 100644
> > ---
> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> > +++
> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
> 
> > @@ -26,6 +27,19 @@ def _get_default_capture_name() -> str:
> >  return str(uuid.uuid4())
> >
> >
> > +@dataclass(slots=True)
>
> This should also be frozen. If we need a different filter, it's better
> to create a new object I think.
>
> > +class PacketFilteringConfig:
> > +"""The supported filtering options for
> :class:`CapturingTrafficGenerator`.
> > +
> > +Attributes:
> > +no_lldp: If :data:`True`, LLDP packets will be filtered out
> when capturing.
> > +no_arp: If :data:`True`, ARP packets will be filtered out when
> capturing.
> > +"""
> > +
> > +no_lldp: bool = True
> > +no_arp: bool = True
> > +
> > +
> >  class CapturingTrafficGenerator(TrafficGenerator):
> >  """Capture packets after sending traffic.
> >
> 
> > diff --git a/dts/framework/testbed_model/traffic_generator/scapy.py
> b/dts/framework/testbed_model/traffic_generator/scapy.py
> > index 5b60f66237..505de0be94 100644
> > --- a/dts/framework/testbed_model/traffic_generator/scapy.py
> > +++ b/dts/framework/testbed_model/traffic_generator/scapy.py
> 
> > @@ -260,11 +263,34 @@ def _send_packets(self, packets: list[Packet],
> port: Port) -> None:
> >  packets = [packet.build() for packet in packets]
> >  self.rpc_server_proxy.scapy_send_packets(packets,
> port.logical_name)
> >
> > +def _create_packet_filter(self, filter_config:
> PacketFilteringConfig) -> str:
> > +"""Combines filter settings from `filter_config` into a BPF
> that scapy can use.
> > +
> > +Scapy allows for the use of Berkeley Packet Filters (BPFs) to
> filter what packets are
> > +collected based on various attributes of the packet.
> > +
> > +Args:
> > +filter_config: Config class that specifies which filters
> should be applied.
> > +
> > +Returns:
> > +A string representing the combination of BPF filters to be
> passed to scapy. For
> > +example:
> > +
> > +"ether[12:2] != 0x88cc && ether[12:2] != 0x0806"
> > +"""
> > +bpf_filter: list[str] = []
>
> The type hint here is not needed, so let's make this consistent with
> the rest of the code - we don't specify local type hints if they're
> not necessary.
>

Good catch, this might have been left over from when I was experimenting
with formatting the filters, I'll remove it.


>
> > +if filter_config.no_arp:
> > +bpf_filter.append("ether[12:2] != 0x0806")
> > +if filter_config.no_lldp:
> > +bpf_filter.append("ether[12:2] != 0x88cc")
> > +return " && ".join(bpf_filter)
> > +
> >  def _send_packets_and_capture(
> >  self,
> >  packets: list[Packet],
> >  send_port: Port,
> >  receive_port: Port,
> > +filter_config: PacketFilteringConfig,
> >  duration: float,
> >  capture_name: str = _get_default_capture_name(),
> >  ) -> list[Packet]:
>


Re: [PATCH v6 3/7] dts: add optional packet filtering to scapy sniffer

2024-01-08 Thread Jeremy Spewock
On Mon, Jan 8, 2024 at 11:39 AM Jeremy Spewock  wrote:

>
>
> On Mon, Jan 8, 2024 at 7:01 AM Juraj Linkeš 
> wrote:
>
>> On Wed, Jan 3, 2024 at 11:33 PM  wrote:
>> >
>> > From: Jeremy Spewock 
>> >
>> > Added the options to filter out LLDP and ARP packets when
>> > sniffing for packets with scapy. This was done using BPF filters to
>> > ensure that the noise these packets provide does not interfere with test
>> > cases.
>> >
>> > Signed-off-by: Jeremy Spewock 
>> > ---
>> >  dts/framework/test_suite.py   | 15 +--
>> >  dts/framework/testbed_model/tg_node.py| 14 --
>> >  .../traffic_generator/__init__.py |  7 -
>> >  .../capturing_traffic_generator.py| 22 ++-
>> >  .../testbed_model/traffic_generator/scapy.py  | 27 +++
>> >  5 files changed, 79 insertions(+), 6 deletions(-)
>> >
>>
>> 
>>
>> > diff --git
>> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
>> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
>> > index 0246590333..c1c9facedd 100644
>> > ---
>> a/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
>> > +++
>> b/dts/framework/testbed_model/traffic_generator/capturing_traffic_generator.py
>> 
>> > @@ -26,6 +27,19 @@ def _get_default_capture_name() -> str:
>> >  return str(uuid.uuid4())
>> >
>> >
>> > +@dataclass(slots=True)
>>
>> This should also be frozen. If we need a different filter, it's better
>> to create a new object I think.
>>
>> This is also a good point, I'll make this change as well.


Re: why DPDK reassembles IP fragment packets with AF_PACKET

2024-01-08 Thread Stephen Hemminger
On Mon, 8 Jan 2024 19:07:20 +0800 (CST)
钟  wrote:

> Hi All,
> 
> 
> Recently I debug ovs-dpdk with AF_PACKET mode.  When IP fragment packets are 
> received via DPDK, the IP fragment packets are reassembled by DPDK. After 
> reassembly, the packet length is over 1518. They are discarded by OVS because 
> of oversize packets.
> 
> 
> I don't understandy why  PACKET_FLAG_DEFRAG is set for AF_PACKET mode.
> 
> 

Not sure, but it looks like the af_packet wants to hash packets for fanout.
Hashin of fragments won't work correctly, and packet will arrive on different 
queues if fragmented.


Re: [PATCH v6 7/7] dts: add pmd_buffer_scatter test suite

2024-01-08 Thread Jeremy Spewock
On Mon, Jan 8, 2024 at 10:47 AM Juraj Linkeš 
wrote:

> On Wed, Jan 3, 2024 at 11:33 PM  wrote:
> >
> > From: Jeremy Spewock 
> >
> > This test suite provides testing of the support of scattered packets by
> > Poll Mode Drivers using testpmd, verifying the ability to receive and
> > transmit scattered multi-segment packets made up of multiple
> > non-contiguous memory buffers. This is tested through 5 different cases
> > in which the length of the packets sent are less than the mbuf size,
> > equal to the mbuf size, and 1, 4, and 5 bytes greater than the mbuf size
> > in order to show both the CRC and the packet data are capable of
> > existing in the first, second, or both buffers.
> >
> > Naturally, if the PMD is capable of forwarding scattered packets which
> > it receives as input, this shows it is capable of both receiving and
> > transmitting scattered packets.
> >
> > Signed-off-by: Jeremy Spewock 
> > ---
> >  dts/tests/TestSuite_pmd_buffer_scatter.py | 126 ++
> >  1 file changed, 126 insertions(+)
> >  create mode 100644 dts/tests/TestSuite_pmd_buffer_scatter.py
> >
> > diff --git a/dts/tests/TestSuite_pmd_buffer_scatter.py
> b/dts/tests/TestSuite_pmd_buffer_scatter.py
> > new file mode 100644
> > index 00..8838c3404f
> > --- /dev/null
> > +++ b/dts/tests/TestSuite_pmd_buffer_scatter.py
> > @@ -0,0 +1,126 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2023-2024 University of New Hampshire
> > +
> > +"""Multi-segment packet scattering testing suite.
>
> Test suite - I guess this is a copy-paste?
>

Good catch, it probably was.


> > +
> > +This testing suite tests the support of transmitting and receiving
> scattered packets. This is shown
> > +by the Poll Mode Driver being able to forward scattered multi-segment
> packets composed of multiple
> > +non-contiguous memory buffers. To ensure the receipt of scattered
> packets, the DMA rings of the
> > +port's RX queues must be configured with mbuf data buffers whose size
> is less than the maximum
> > +length.
> > +
> > +If it is the case that the Poll Mode Driver can forward scattered
> packets which it receives, then
> > +this suffices to show the Poll Mode Driver is capable of both receiving
> and transmitting scattered
> > +packets.
> > +"""
>
> We have a newline between the docstring and the import everywhere.
>

You're right, I must have missed it here, I'll add that.


> > +import struct
> > +
> > +from scapy.layers.inet import IP  # type: ignore[import]
> > +from scapy.layers.l2 import Ether  # type: ignore[import]
> > +from scapy.packet import Raw  # type: ignore[import]
> > +from scapy.utils import hexstr  # type: ignore[import]
> > +
> > +from framework.remote_session.testpmd_shell import
> TestPmdForwardingModes, TestPmdShell
> > +from framework.test_suite import TestSuite
> > +
> > +
> > +class PmdBufferScatter(TestSuite):
> > +"""DPDK PMD packet scattering test suite.
> > +
> > +Configure the Rx queues to have mbuf data buffers whose sizes are
> smaller than the maximum
> > +packet size. Specifically, set mbuf data buffers to have a size of
> 2048 to fit a full 1512-byte
> > +(CRC included) ethernet frame in a mono-segment packet. The testing
> of scattered packets is
> > +done by sending a packet whose length is greater than the size of
> the configured size of mbuf
> > +data buffers. There are a total of 5 packets sent within test cases
> which have lengths less
> > +than, equal to, and greater than the mbuf size. There are multiple
> packets sent with lengths
> > +greater than the mbuf size in order to test cases such as:
> > +
> > +1. A single byte of the CRC being in a second buffer while the
> remaining 3 bytes are stored in
> > +the first buffer alongside packet data.
> > +2. The entire CRC being stored in a second buffer while all of the
> packet data is stored in the
> > +first.
> > +3. Most of the packet data being stored in the first buffer and a
> single byte of packet data
> > +stored in a second buffer alongside the CRC.
> > +"""
> > +
> > +def set_up_suite(self) -> None:
> > +"""Set up the test suite.
> > +
> > +Setup:
> > +Verify they we have at least 2 port links in the current
> execution and increase the MTU
>
> Typo - they.
>

Oops, thank you!


>
> > +of both ports on the tg_node to 9000 to support larger
> packet sizes.
>
> The description should be code agnostic, so let's use traffic
> generator node instead of tg_node.
>

Good point, I'll update this.


>
> > +"""
> > +self.verify(
> > +len(self._port_links) > 1,
> > +"Must have at least two port links to run scatter",
>
> I'd like this to be at least "Must have at least two port links to run
> the scatter test suite" so that it's immediately obvious where this
> comes from. I'm also debating which of these is better: "Must have at
> least" or "There must be at l

Re: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Stephen Hemminger
On Mon, 8 Jan 2024 14:01:37 +0530
Jerin Jacob  wrote:

> On Mon, Jan 8, 2024 at 1:47 PM Hemant Agrawal  wrote:
> >
> >
> > On 08-Jan-24 1:28 PM, jer...@marvell.com wrote:  
> > > From: Jerin Jacob 
> > >
> > > Define qualification criteria for external library
> > > based on a techboard meeting minutes [1] and past
> > > learnings from mailing list discussion.
> > >
> > > [1]
> > > http://mails.dpdk.org/archives/dev/2019-June/135847.html
> > > https://mails.dpdk.org/archives/dev/2024-January/284849.html
> > >
> > > Signed-off-by: Jerin Jacob 
> > > Acked-by: Thomas Monjalon   
> 
> > > +#. **Distributions License:**
> > > +
> > > +   - No specific constraints beyond documentation.  
> >
> > Though we are not mandatory open distribution. However we should ask for the
> > defining the distribution aspect clearly in the library.  
> 
> How about following then,
> 
> No specific constraints, but clear documentation on distribution usage
> aspects is required.
> 
> If not, please suggest the exact wording.

The wording specifies the intent here, and that is not what matters.
This is not a legal document where someone will take us to court
if the board doesn't accept a library.


Re: DTS testpmd and SCAPY integration

2024-01-08 Thread Etelson, Gregory

Hello Luca,


Your proposal sounds rather interesting. Certainly enabling DTS to
accept YAML-written tests sounds more developer-friendly and should
enable quicker test-writing. As this is an extra feature though – and a
nice-to-have, it should definitely be discussed in the DTS meetings as
Honnappa suggested already.

Another thing, I am not sure that the intention is that Scapy will be
the only traffic generator supported. Could be very wrong here. In the
case we'd support others too, how would you tackle this problem? We can
discuss this in the meeting as well if needed.



The proposed design works with any DPDK, traffic generator, sniffer or debugger 
application that ether accepts input from STDIN or dumps output to SDTOUT.

I used testpmd / scapy in the example because that was the most obvious combo.

Regards,
Gregory

RE: DTS testpmd and SCAPY integration

2024-01-08 Thread Honnappa Nagarahalli


> -Original Message-
> From: Luca Vizzarro 
> Sent: Monday, January 8, 2024 6:18 AM
> To: Honnappa Nagarahalli ; Etelson,
> Gregory ; tho...@monjalon.net; Juraj Linkeš
> ; Paul Szczepanek ;
> Yoan Picchi ; Jeremy Spewock
> ; Patrick Robb ; c...@dpdk.org;
> dev@dpdk.org
> Subject: Re: DTS testpmd and SCAPY integration
> 
> Hi Honnappa,
> 
> > YAML has wide support built around it. By using our own text format, we will
> have to build the parsing support etc ourselves.
> >
> > However, YAML is supposed to be easy to read and understand. Is it just a
> matter for getting used to it?
> 
> I may be wrong, please feel free to correct me, but I believe what Gregory
> meant is to write raw Scapy and testpmd commands directly in YAML-written
> tests. As in, we wouldn't be using our own text format, we'd just need to feed
> the raw commands from YAML straight to the Scapy and testpmd shells.
> Finally, validate their responses to what was set in the YAML test. In 
> Gregory's
> example this is just comparing stdout.
Yes, I have misunderstood Gregory's proposal. I am all for using YAML. Let us 
discuss more in the DTS meetings and hear what others have to say.

> 
> Best,
> Luca


RE: DTS testpmd and SCAPY integration

2024-01-08 Thread Honnappa Nagarahalli


> 
> Hello Honnappa,
> 
> [snip]
> 
> > Hi Gregory,
> >I do not fully understand your proposal, it will be helpful to join 
> > the DTS
> meetings to discuss this further.
> >
> 
> Agree, let's discuss the proposal details during the DTS meeting.
> 
> > YAML has wide support built around it. By using our own text format, we will
> have to build the parsing support etc ourselves.
> >
> > However, YAML is supposed to be easy to read and understand. Is it just a
> matter for getting used to it?
> >
> 
> I selected YAML for 2 reasons:
>* Plain and intuitive YAML format minimized test meta data.
>  By the meta data I refer to control tags and markup characters
>  that are not test commands.
>* YAML has Python parser.
I have mis-understood your proposal. I agree with your above comments.
+1 for the proposal.

> 
> Regards,
> Gregory
> 
> 



Re: Issues around packet capture when secondary process is doing rx/tx

2024-01-08 Thread Stephen Hemminger
On Mon, 8 Jan 2024 15:13:25 +
Konstantin Ananyev  wrote:

> > I have been looking at a problem reported by Sandesh
> > where packet capture does not work if rx/tx burst is done in secondary 
> > process.
> > 
> > The root cause is that existing rx/tx callback model just doesn't work
> > unless the process doing the rx/tx burst calls is the same one that
> > registered the callbacks.
> > 
> > An example sequence would be:
> > 1. dumpcap (or pdump) as secondary tells pdump in primary to register 
> > callback
> > 2. secondary process calls rx_burst.
> > 3. rx_burst sees the callback but it has pointer pdump_rx which is not 
> > necessarily
> >at same location in primary and secondary process.
> > 4. indirect function call in secondary to bad location likely causes 
> > crash.  
> 
> As I remember, RX/TX callbacks were never intended to work over multiple 
> processes.
> Right now RX/TX callbacks are private for the process, different process 
> simply should not
> see/execute them.
> I.E. it callbacks list is part of 'struct rte_eth_dev' itself, not the 
> rte_eth_dev.data that is shared
> between processes.
> It should be normal, wehn for the same port/queue you will end-up with 
> different list of callbacks
> for different processes.  
> So, unless I am missing something, I don't see how we can end-up with 3) and 
> 4) from above:
> From my understanding secondary process will never see/call primary's 
> callbacks.
> 
> About pdump itself, it was a while when I looked at it last time, but as I 
> remember to start it to work,
> server process has to call rte_pdump_init() which in terns register PDUMP_MP 
> handler.
> I suppose for the secondary process to act as a 'pdump server' it needs to 
> call rte_pdump_init() itself,
> though I am not sure such option is supported right now. 

Maybe the simplest would be just to make sure that rte_pdump_init() is called
in the process that does rx/tx burst. That might be made to work.
Still won't work for case where there are multiple secondary processes and some
the ethdev ports are used differently in each one, but would work better than 
now.



Re: [PATCH] net/hns3: don't support QinQ insert for VF

2024-01-08 Thread Ferruh Yigit
On 12/28/2023 12:14 PM, Jie Hai wrote:
> From: Chengwen Feng 
> 
> In the HIP08 platform, the PF driver will notify VF driver to update
> the PVID state [1], and VF will declare support QinQ insert when PVID
> is disabled.
> 
> In the later platform (e.g. HIP09), the hardware has been improved,
> so the PF driver will NOT notify VF driver to update the PVID state.
> 
> However, the later platform still have constraint: PVID and QinQ insert
> cannot be enabled at the same time, otherwise, the hardware discards
> packets and reports an error interrupt.
> 
> Plus, as far as we known, VF driver's users don't use the QinQ insert.
> 
> Therefore, we declare that the VF driver don't support QinQ insert.
> 
> [1] commit b4e4d7ac9f09 ("net/hns3: support setting VF PVID by PF driver")
> 
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Chengwen Feng 
> Signed-off-by: Jie Hai 
> 

Applied to dpdk-next-net/main, thanks.



Re: [RFC PATCH v1 0/5] test case blocking and logging

2024-01-08 Thread Jeremy Spewock
Definitely worth-while changes. I looked them over and it all looks good so
far.

+1


RE: [dpdk-dev] [v5] doc: define qualification criteria for external library

2024-01-08 Thread Morten Brørup
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Monday, 8 January 2024 18.19
> 
> The wording specifies the intent here, and that is not what matters.
> This is not a legal document where someone will take us to court
> if the board doesn't accept a library.

The introduction/preamble of the page should mention this, or we risk upsetting 
people very much if we don't accept a compliant library. E.g.:

This document serves as an overall description of what is expected of dependent 
libraries.
The final decision to accept or reject is at the discretion of the DPDK 
Project's Technical Board.



RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-08 Thread Morten Brørup
> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
> Sent: Friday, 5 January 2024 12.13
> 
> > From: Thomas Monjalon 
> > Sent: Friday, January 5, 2024 10:04 AM
> >
> > 05/01/2024 10:57, Jerin Jacob:
> > > On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon
>  wrote:
> > > >
> > > > 04/01/2024 15:21, Konstantin Ananyev:
> > > > >
> > > > > > > > Introduce a new API to retrieve the number of available
> free descriptors
> > > > > > > > in a Tx queue. Applications can leverage this API in the
> fast path to
> > > > > > > > inspect the Tx queue occupancy and take appropriate
> actions based on the
> > > > > > > > available free descriptors.
> > > > > > > >
> > > > > > > > A notable use case could be implementing Random Early
> Discard (RED)
> > > > > > > > in software based on Tx queue occupancy.
> > > > > > > >
> > > > > > > > Signed-off-by: Jerin Jacob 
> > > > > > >
> > > > > > > I think having an API to get the number of free descriptors
> per queue is a good idea. Why have it only for TX queues and not
> > for RX
> > > > > > queues as well?
> > > > > >
> > > > > > I see no harm in adding for Rx as well. I think, it is better
> to have
> > > > > > separate API for each instead of adding argument as it is
> fast path
> > > > > > API.
> > > > > > If so, we could add a new API when there is any PMD
> implementation or
> > > > > > need for this.
> > > > >
> > > > > I think for RX we already have similar one:
> > > > > /** @internal Get number of used descriptors on a receive
> queue. */
> > > > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> > > >
> > > > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > > > rte_eth_rx_descriptor_status() gives the status of one Rx
> descriptor
> > > > rte_eth_tx_descriptor_status() gives the status of one Tx
> descriptor
> > > >
> > > > This patch is adding a function to get Tx available descriptors,
> > > > rte_eth_tx_queue_free_desc_get().
> > > > I can see a symmetry with rte_eth_rx_queue_count().
> > > > For consistency I would rename it to
> rte_eth_tx_queue_free_count().

For consistency with rte_eth_rx_queue_count() regarding both naming and 
behavior of the API, I would prefer:

rte_eth_tx_queue_count(), returning the number of used descriptors.

> > > >
> > > > Should we add rte_eth_tx_queue_count() and
> rte_eth_rx_queue_free_count()?
> > >
> > > IMO, rte_eth_rx_queue_free_count() is enough as
> > > used count =  total desc number(configured via nb_tx_desc with
> > > rte_eth_tx_queue_setup())  - free count
> >
> > I'm fine with that.
> >
> 
> Yep, agree.
> If we ever need  rte_eth_rx_queue_free_count() and
> rte_eth_tx_queue_used_count(),
> it could be done via slow-path as Jerin outlined above, no need to
> waste entries in fp_ops
> for that.

Yes, rte_eth_rx/tx_queue_avail_count() could be added in the ethdev library, 
without driver callbacks, but simply getting data from configured descriptor 
ring sizes and rte_eth_rx/tx_queue_count().

PS: For naming conventions, I sought inspiration from the mempool library. 
Also, I don't see any need to use "descs" as part of the names of the proposed 
functions.

The driver callback can be either "free count" or "used count", whichever is 
easier for the drivers to implement, or (preferably) whichever is likelier to 
be faster to execute in the most common case. I would assume that the TX 
descriptor "used count" is relatively low most of the time.



RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-08 Thread Morten Brørup
> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> Sent: Monday, 8 January 2024 11.54
> 
> On Tue, Dec 19, 2023 at 10:59:48PM +0530, jer...@marvell.com wrote:
> > From: Jerin Jacob 
> >
> > Introduce a new API to retrieve the number of available free
> descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on
> the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> 
> Hi Jerin,
> 
> while I don't strongly object to this patch, I wonder if it encourages
> sub-optimal code implementations. To determine the number of free
> descriptors in a ring, the driver in many cases will need to do a scan
> of
> the descriptor ring to see how many are free/used. However, I suspect
> that
> in most cases we will see something like the following being done:
> 
> count = rte_eth_rx_free_count();

Typo: rte_eth_rx_free_count() -> rte_eth_tx_free_count()

> if (count > X) {
> /* Do something */
> }
> 
> For instances like this, scanning the entire ring is wasteful of
> resources.
> Instead it would be better to just check the descriptor at position X
> explicitly. Going to the trouble of checking the exact descriptor count
> is
> unnecessary in this case.

Yes, it introduces such a risk.
All DPDK examples simply call tx_burst() without checking free space first, so 
I think the probability (of the simple case) is low.
And the probability for the case comparing to X could be mitigated by referring 
to rte_eth_tx_descriptor_status() in the function description.

> 
> Out of interest, are you aware of a case where an app would need to
> know
> exactly the number of free descriptors, and where the result would not
> be
> just compared to one or more threshold values? Do we see cases where it
> would be used in a computation, perhaps?

Yes: RED. When exceeding the minimum threshold, the probability of 
marking/dropping a packet increases linearly with the queue's fill level.
E.g.:
0-900 packets in queue: don't drop,
901-1000 packets in queue: probability of dropping the packet is 1-100 % (e.g. 
980 packets in queue = 80 % drop probability).




[PATCH v4 2/2] eal: initialize shared plugins on Windows

2024-01-08 Thread Tyler Retzlaff
When EAL is built with MSVC it is possible to dynamically load plugins
on Windows. Hook eal_plugins_init into rte_eal_init if built with MSVC
and provide code to load plugins on Windows.

Signed-off-by: Tyler Retzlaff 
---
 lib/eal/common/eal_common_options.c | 90 ++---
 lib/eal/windows/eal.c   |  8 
 2 files changed, 83 insertions(+), 15 deletions(-)

diff --git a/lib/eal/common/eal_common_options.c 
b/lib/eal/common/eal_common_options.c
index d974807..d519a55 100644
--- a/lib/eal/common/eal_common_options.c
+++ b/lib/eal/common/eal_common_options.c
@@ -18,9 +18,7 @@
 #include 
 #endif
 #include 
-#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
-#endif
 
 #include 
 #include 
@@ -123,10 +121,8 @@ struct shared_driver {
 static struct shared_driver_list solib_list =
 TAILQ_HEAD_INITIALIZER(solib_list);
 
-#ifndef RTE_EXEC_ENV_WINDOWS
 /* Default path of external loadable drivers */
 static const char *default_solib_dir = RTE_EAL_PMD_PATH;
-#endif
 
 /*
  * Stringified version of solib path used by dpdk-pmdinfo.py
@@ -371,12 +367,12 @@ struct device_option {
 }
 
 #ifdef RTE_EXEC_ENV_WINDOWS
-int
-eal_plugins_init(void)
-{
-   return 0;
-}
+#define SOEXT".dll"
 #else
+#define SOEXT".so"
+#endif
+
+#define SOABIEXT  SOEXT"."ABI_VERSION
 
 static int
 eal_plugindir_init(const char *path)
@@ -397,12 +393,14 @@ struct device_option {
 
while ((dent = readdir(d)) != NULL) {
struct stat sb;
-   int nlen = strnlen(dent->d_name, sizeof(dent->d_name));
+   size_t nlen = strnlen(dent->d_name, sizeof(dent->d_name));
 
-   /* check if name ends in .so or .so.ABI_VERSION */
-   if (strcmp(&dent->d_name[nlen - 3], ".so") != 0 &&
-   strcmp(&dent->d_name[nlen - 4 - strlen(ABI_VERSION)],
-  ".so."ABI_VERSION) != 0)
+   if (nlen < strlen(SOABIEXT))
+   continue;
+
+   /* check if name ends in SOEXT or SOABIEXT */
+   if (strcmp(&dent->d_name[nlen - strlen(SOEXT)], SOEXT) != 0 &&
+   strcmp(&dent->d_name[nlen - strlen(SOABIEXT)], SOABIEXT) != 
0)
continue;
 
snprintf(sopath, sizeof(sopath), "%s/%s", path, dent->d_name);
@@ -420,6 +418,68 @@ struct device_option {
return (dent == NULL) ? 0 : -1;
 }
 
+#ifdef RTE_EXEC_ENV_WINDOWS
+static void*
+eal_dlopen(const char *pathname)
+{
+   void *retval = NULL;
+   struct stat pathstat;
+   char *fullpath = _fullpath(NULL, pathname, 0);
+
+   const char *loadpath = fullpath;
+   DWORD loadflags = 0;
+
+   if (fullpath == NULL) {
+   RTE_LOG(ERR, EAL, "Error expanding full path for %s, %s\n",
+   pathname, strerror(errno));
+   } else {
+   /* Verify that the path exists */
+   if ((stat(fullpath, &pathstat) != 0) && (errno == ENOENT)) {
+   /* not a full or relative path, try a load from default 
dirs */
+   loadpath = pathname;
+   loadflags = LOAD_LIBRARY_SEARCH_DEFAULT_DIRS;
+   }
+
+   retval = LoadLibraryExA(loadpath, NULL, loadflags);
+   if (retval == NULL)
+   RTE_LOG(ERR, EAL, "Error loading %s, error code: %lu\n",
+   loadpath, GetLastError());
+   }
+
+   free(fullpath);
+   return retval;
+}
+
+static int
+is_shared_build(void)
+{
+   int shared = 0;
+   HMODULE apphandle = NULL;
+   HMODULE libhandle = NULL;
+
+   /* if fail to get handle, assume statically linked */
+   apphandle = GetModuleHandleA(NULL);
+   if (apphandle == NULL) {
+   RTE_LOG(ERR, EAL, "Cannot get handle to the app\n");
+   goto out;
+   }
+
+   /* if fail to get handle, assume statically linked */
+   if (GetModuleHandleExA(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
+   GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
+   (LPCSTR)&eal_plugins_init,
+   &libhandle)) {
+   if (apphandle != libhandle) {
+   /* lib and app handles are different. */
+   /* Therefore lib is dynamically linked */
+   shared = 1;
+   }
+   }
+
+out:
+   return shared;
+}
+#else
 static int
 verify_perms(const char *dirpath)
 {
@@ -527,6 +587,7 @@ struct device_option {
EAL_LOG(INFO, "Detected static linkage of DPDK");
return 0;
 }
+#endif
 
 int
 eal_plugins_init(void)
@@ -565,7 +626,6 @@ struct device_option {
}
return 0;
 }
-#endif
 
 /*
  * Parse the coremask given as argument (hexadecimal string) and fill
diff --git a/lib/eal/windows/eal.c b/lib/eal/windows/eal.c
index 52f0e74..dc2bcb7 100644
--- a/lib/eal/windows/eal.c
+++ b/lib/eal/windows/eal.c
@@ -305,6 +305,14 @@ enum rte_proc_type_t
if (fctret < 0)
  

[PATCH v4 0/2] eal: initialize shared plugins on Windows

2024-01-08 Thread Tyler Retzlaff
When EAL is built with MSVC it is possible to dynamically load plugins
on Windows. Hook eal_plugins_init into rte_eal_init if built with MSVC
and provide code to load plugins on Windows.

v4:
  * include winipfamily.h header for WINAPI_FAMILY macros and provide
definition for PHONE_APP if mingw winipfamily.h doesn't supply it

v3:
  * revert use of PRIu32 from previous patch just use %lu to make
unsigned long format happy

v2:
  * revert unintended / unrelated whitespace change
  * include inttypes.h for use of PRIu32 in log format string

Tyler Retzlaff (2):
  windows: include winapifamily header for macros
  eal: initialize shared plugins on Windows

 lib/eal/common/eal_common_options.c | 90 ++---
 lib/eal/windows/eal.c   |  8 
 lib/eal/windows/include/dirent.h|  6 +++
 3 files changed, 89 insertions(+), 15 deletions(-)

-- 
1.8.3.1



[PATCH v4 1/2] windows: include winapifamily header for macros

2024-01-08 Thread Tyler Retzlaff
Include winapifamily.h for WINAPI_FAMILY macro and provide a definition
of WINAPI_FAMILY_PHONE_APP if not present (happens compiling under
mingw)

Signed-off-by: Tyler Retzlaff 
---
 lib/eal/windows/include/dirent.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/eal/windows/include/dirent.h b/lib/eal/windows/include/dirent.h
index b522424..80bc6c3 100644
--- a/lib/eal/windows/include/dirent.h
+++ b/lib/eal/windows/include/dirent.h
@@ -8,6 +8,12 @@
 #ifndef DIRENT_H
 #define DIRENT_H
 
+#include 
+
+#ifndef WINAPI_FAMILY_PHONE_APP
+#define WINAPI_FAMILY_PHONE_APP 3
+#endif
+
 /*
  * Include windows.h without Windows Sockets 1.1 to prevent conflicts with
  * Windows Sockets 2.0.
-- 
1.8.3.1



RE: Issues around packet capture when secondary process is doing rx/tx

2024-01-08 Thread Honnappa Nagarahalli



> -Original Message-
> From: Stephen Hemminger 
> Sent: Sunday, January 7, 2024 7:59 PM
> To: dev@dpdk.org
> Cc: arshdeep.k...@intel.com; Gowda, Sandesh ;
> Reshma Pattan 
> Subject: Issues around packet capture when secondary process is doing rx/tx
> 
> I have been looking at a problem reported by Sandesh where packet capture
> does not work if rx/tx burst is done in secondary process.
> 
> The root cause is that existing rx/tx callback model just doesn't work unless 
> the
> process doing the rx/tx burst calls is the same one that registered the 
> callbacks.
This is not specific to packet capture. This is a generic problem and we should 
look to solve it generically.

> 
> An example sequence would be:
>   1. dumpcap (or pdump) as secondary tells pdump in primary to register
> callback
>   2. secondary process calls rx_burst.
>   3. rx_burst sees the callback but it has pointer pdump_rx which is not
> necessarily
>  at same location in primary and secondary process.
>   4. indirect function call in secondary to bad location likely causes 
> crash.
> 
> Some possible workarounds.
>   1. Keep callback list per-process: messy, but won't crash. Capture won't
> work
>without other changes. In this primary would register callback, but
> secondaries
>would not use them in rx/tx burst.
> 
>   2. Replace use of rx/tx callback in pdump with change to rte_ethdev to
> have
>a capture flag. (i.e. don't use indirection).  Likely ABI problems.
>Basically, ignore the rx/tx callback mechanism. This is my 
> preferred
>  solution.
> 
>   3. Some fix up mechanism (in EAL mp support?) to have each process
> fixup
>its callback mechanism.
Yes, would prefer this. Let the application call additional APIs to register 
the call backs in secondary process.

> 
>   4. Do something in pdump_init to register the callback in same process
> context
>  (probably need callbacks to be per-process). Would mean callback is
> always
>on independent of capture being enabled.
> 
> 5. Get rid of indirect function call pointer, and replace it by index 
> into
>a static table of callback functions. Every process would have 
> same code
>(in this case pdump_rx) but at different address.  Requires all 
> callbacks
>to be statically defined at build time.
> 
> The existing rx/tx callback is not safe id rx/tx burst is called from 
> different
> process than where callback is registered.
> 



Re: [PATCH] ethdev: add dump regs for telemetry

2024-01-08 Thread Jie Hai

On 2023/12/14 20:49, Ferruh Yigit wrote:

On 12/14/2023 1:56 AM, Jie Hai wrote:

The ethdev library now registers a telemetry command for
dump regs.

An example usage is shown below:
--> /ethdev/regs,test
{
   "/ethdev/regs": {
 "regs_offset": 0,
 "regs_length": 3192,
 "regs_width": 4,
 "device_version": "0x1080f00",
 "regs_file": "port_0_regs_test"
   }
}




Above code writes register data to a file.

I am not sure about this kind of usage of telemetry command, that it
cause data to be written to a file.

My understanding is, telemetry usage is based on what telemetry client
receives.
What do you think just keep the 'reg_info' fields excluding data to the
file?

.Hi, Ferruh


I tried to write all register information to telemetry data,
but gave up because some drivers had too many registers (eg.ixgbe)
to carry. Therefore, the writing data to file approach is selected.

When we query a register, the register content is the key.
The information such as the width and length is only auxiliary
information. If the register data cannot be obtained, the auxiliary 
information is optional. So I don't think register data should be removed.


In my opinion, writing a file is a more appropriate way to do it.
I wonder if there's a better way.

Best regards,
Thanks


Re: [PATCH] ethdev: add dump regs for telemetry

2024-01-08 Thread Jie Hai

On 2024/1/9 10:19, Jie Hai wrote:

On 2023/12/14 20:49, Ferruh Yigit wrote:

On 12/14/2023 1:56 AM, Jie Hai wrote:

The ethdev library now registers a telemetry command for
dump regs.

An example usage is shown below:
--> /ethdev/regs,test
{
   "/ethdev/regs": {
 "regs_offset": 0,
 "regs_length": 3192,
 "regs_width": 4,
 "device_version": "0x1080f00",
 "regs_file": "port_0_regs_test"
   }
}




Above code writes register data to a file.

I am not sure about this kind of usage of telemetry command, that it
cause data to be written to a file.

My understanding is, telemetry usage is based on what telemetry client
receives.
What do you think just keep the 'reg_info' fields excluding data to the
file?

.Hi, Ferruh


I tried to write all register information to telemetry data,
but gave up because some drivers had too many registers (eg.ixgbe)

Sorry, correct it. It is i40e who has the most registers.

to carry. Therefore, the writing data to file approach is selected.

When we query a register, the register content is the key.
The information such as the width and length is only auxiliary
information. If the register data cannot be obtained, the auxiliary 
information is optional. So I don't think register data should be removed.


In my opinion, writing a file is a more appropriate way to do it.
I wonder if there's a better way.

Best regards,
Thanks


RE: [PATCH v2 2/3] net/ice: refactor tm config data structure

2024-01-08 Thread Wu, Wenjun1
> -Original Message-
> From: Zhang, Qi Z 
> Sent: Friday, January 5, 2024 10:11 PM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v2 2/3] net/ice: refactor tm config data structure
> 
> Simplified struct ice_tm_conf by removing per level node list.
> 
> Signed-off-by: Qi Zhang 
> ---
>  drivers/net/ice/ice_ethdev.h |   5 +-
>  drivers/net/ice/ice_tm.c | 210 +++
>  2 files changed, 88 insertions(+), 127 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index
> ae22c29ffc..008a7a23b9 100644
> --- a/drivers/net/ice/ice_ethdev.h
> +++ b/drivers/net/ice/ice_ethdev.h
> @@ -472,6 +472,7 @@ struct ice_tm_node {
>   uint32_t id;
>   uint32_t priority;
>   uint32_t weight;
> + uint32_t level;
>   uint32_t reference_count;
>   struct ice_tm_node *parent;
>   struct ice_tm_node **children;
> @@ -492,10 +493,6 @@ enum ice_tm_node_type {  struct ice_tm_conf {
>   struct ice_shaper_profile_list shaper_profile_list;
>   struct ice_tm_node *root; /* root node - port */
> - struct ice_tm_node_list qgroup_list; /* node list for all the queue
> groups */
> - struct ice_tm_node_list queue_list; /* node list for all the queues */
> - uint32_t nb_qgroup_node;
> - uint32_t nb_queue_node;
>   bool committed;
>   bool clear_on_fail;
>  };
> diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index
> 7ae68c683b..7c662f8a85 100644
> --- a/drivers/net/ice/ice_tm.c
> +++ b/drivers/net/ice/ice_tm.c
> @@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
>   /* initialize node configuration */
>   TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
>   pf->tm_conf.root = NULL;
> - TAILQ_INIT(&pf->tm_conf.qgroup_list);
> - TAILQ_INIT(&pf->tm_conf.queue_list);
> - pf->tm_conf.nb_qgroup_node = 0;
> - pf->tm_conf.nb_queue_node = 0;
>   pf->tm_conf.committed = false;
>   pf->tm_conf.clear_on_fail = false;
>  }
> 
> -void
> -ice_tm_conf_uninit(struct rte_eth_dev *dev)
> +static void free_node(struct ice_tm_node *root)
>  {
> - struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> - struct ice_tm_node *tm_node;
> + uint32_t i;
> 
> - /* clear node configuration */
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_queue_node = 0;
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_qgroup_node = 0;
> - if (pf->tm_conf.root) {
> - rte_free(pf->tm_conf.root);
> - pf->tm_conf.root = NULL;
> - }
> + if (root == NULL)
> + return;
> +
> + for (i = 0; i < root->reference_count; i++)
> + free_node(root->children[i]);
> +
> + rte_free(root);
>  }
> 
> -static inline struct ice_tm_node *
> -ice_tm_node_search(struct rte_eth_dev *dev,
> - uint32_t node_id, enum ice_tm_node_type *node_type)
> +void
> +ice_tm_conf_uninit(struct rte_eth_dev *dev)
>  {
>   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> - struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
> - struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
> - struct ice_tm_node *tm_node;
> -
> - if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_PORT;
> - return pf->tm_conf.root;
> - }
> 
> - TAILQ_FOREACH(tm_node, qgroup_list, node) {
> - if (tm_node->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_QGROUP;
> - return tm_node;
> - }
> - }
> -
> - TAILQ_FOREACH(tm_node, queue_list, node) {
> - if (tm_node->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_QUEUE;
> - return tm_node;
> - }
> - }
> -
> - return NULL;
> + free_node(pf->tm_conf.root);
> + pf->tm_conf.root = NULL;
>  }
> 
>  static int
> @@ -195,11 +159,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t
> node_id,
>   return 0;
>  }
> 
> +static struct ice_tm_node *
> +find_node(struct ice_tm_node *root, uint32_t id) {
> + uint32_t i;
> +
> + if (root == NULL || root->id == id)
> + return root;
> +
> + for (i = 0; i < root->reference_count; i++) {
> + struct ice_tm_node *node = find_node(root->children[i], id);
> +
> + if (node)
> + return node;
> + }
> +
> + return NULL;
> +}
> +
>  static int
>  ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id,
>  int *is_leaf, struct rte_tm_error *error)  {
> -  

RE: [PATCH v4 2/3] net/ice: refactor tm config data structure

2024-01-08 Thread Wu, Wenjun1
> -Original Message-
> From: Zhang, Qi Z 
> Sent: Tuesday, January 9, 2024 4:22 AM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v4 2/3] net/ice: refactor tm config data structure
> 
> Simplified struct ice_tm_conf by removing per level node list.
> 
> Signed-off-by: Qi Zhang 
> ---
>  drivers/net/ice/ice_ethdev.h |   5 +-
>  drivers/net/ice/ice_tm.c | 248 ---
>  2 files changed, 113 insertions(+), 140 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index
> ae22c29ffc..008a7a23b9 100644
> --- a/drivers/net/ice/ice_ethdev.h
> +++ b/drivers/net/ice/ice_ethdev.h
> @@ -472,6 +472,7 @@ struct ice_tm_node {
>   uint32_t id;
>   uint32_t priority;
>   uint32_t weight;
> + uint32_t level;
>   uint32_t reference_count;
>   struct ice_tm_node *parent;
>   struct ice_tm_node **children;
> @@ -492,10 +493,6 @@ enum ice_tm_node_type {  struct ice_tm_conf {
>   struct ice_shaper_profile_list shaper_profile_list;
>   struct ice_tm_node *root; /* root node - port */
> - struct ice_tm_node_list qgroup_list; /* node list for all the queue
> groups */
> - struct ice_tm_node_list queue_list; /* node list for all the queues */
> - uint32_t nb_qgroup_node;
> - uint32_t nb_queue_node;
>   bool committed;
>   bool clear_on_fail;
>  };
> diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index
> d67783c77e..fbab0b8808 100644
> --- a/drivers/net/ice/ice_tm.c
> +++ b/drivers/net/ice/ice_tm.c
> @@ -6,6 +6,9 @@
>  #include "ice_ethdev.h"
>  #include "ice_rxtx.h"
> 
> +#define MAX_CHILDREN_PER_SCHED_NODE  8
> +#define MAX_CHILDREN_PER_TM_NODE 256
> +
>  static int ice_hierarchy_commit(struct rte_eth_dev *dev,
>int clear_on_fail,
>__rte_unused struct rte_tm_error *error);
> @@ -43,20 +46,28 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
>   /* initialize node configuration */
>   TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
>   pf->tm_conf.root = NULL;
> - TAILQ_INIT(&pf->tm_conf.qgroup_list);
> - TAILQ_INIT(&pf->tm_conf.queue_list);
> - pf->tm_conf.nb_qgroup_node = 0;
> - pf->tm_conf.nb_queue_node = 0;
>   pf->tm_conf.committed = false;
>   pf->tm_conf.clear_on_fail = false;
>  }
> 
> +static void free_node(struct ice_tm_node *root) {
> + uint32_t i;
> +
> + if (root == NULL)
> + return;
> +
> + for (i = 0; i < root->reference_count; i++)
> + free_node(root->children[i]);
> +
> + rte_free(root);
> +}
> +
>  void
>  ice_tm_conf_uninit(struct rte_eth_dev *dev)  {
>   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>   struct ice_tm_shaper_profile *shaper_profile;
> - struct ice_tm_node *tm_node;
> 
>   /* clear profile */
>   while ((shaper_profile = TAILQ_FIRST(&pf-
> >tm_conf.shaper_profile_list))) { @@ -64,52 +75,8 @@
> ice_tm_conf_uninit(struct rte_eth_dev *dev)
>   rte_free(shaper_profile);
>   }
> 
> - /* clear node configuration */
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_queue_node = 0;
> - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
> - TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
> - rte_free(tm_node);
> - }
> - pf->tm_conf.nb_qgroup_node = 0;
> - if (pf->tm_conf.root) {
> - rte_free(pf->tm_conf.root);
> - pf->tm_conf.root = NULL;
> - }
> -}
> -
> -static inline struct ice_tm_node *
> -ice_tm_node_search(struct rte_eth_dev *dev,
> - uint32_t node_id, enum ice_tm_node_type *node_type)
> -{
> - struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> - struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
> - struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
> - struct ice_tm_node *tm_node;
> -
> - if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_PORT;
> - return pf->tm_conf.root;
> - }
> -
> - TAILQ_FOREACH(tm_node, qgroup_list, node) {
> - if (tm_node->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_QGROUP;
> - return tm_node;
> - }
> - }
> -
> - TAILQ_FOREACH(tm_node, queue_list, node) {
> - if (tm_node->id == node_id) {
> - *node_type = ICE_TM_NODE_TYPE_QUEUE;
> - return tm_node;
> - }
> - }
> -
> - return NULL;
> + free_node(pf->tm_conf.root);
> + pf->tm_conf.root = NULL;
>  }
> 
>  static int
> @@ -202,11 +169,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t
> node_id,
> 

RE: [PATCH v4 0/3] simplified to 3 layer Tx scheduler

2024-01-08 Thread Zhang, Qi Z



> -Original Message-
> From: Zhang, Qi Z 
> Sent: Tuesday, January 9, 2024 4:22 AM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH v4 0/3] simplified to 3 layer Tx scheduler
> 
> Remove dummy layers, code refactor, complete document
> 
> v4:
> - rebase.
> 
> v3:
> - fix tm_node memory free.
> - fix corrupt when slibling node deletion is not in a reversed order.
> 
> v2:
> - fix typos.
> 
> Qi Zhang (3):
>   net/ice: hide port and TC layer in Tx sched tree
>   net/ice: refactor tm config data structure
>   doc: update ice document for qos
> 
>  doc/guides/nics/ice.rst  |  19 +++
>  drivers/net/ice/ice_ethdev.h |  12 +-
>  drivers/net/ice/ice_tm.c | 317 +--
>  3 files changed, 134 insertions(+), 214 deletions(-)
> 
> --
> 2.31.1

Applied to dpdk-next-net-intel.

Thanks
Qi


Re: why DPDK reassembles IP fragment packets with AF_PACKET

2024-01-08 Thread
Hi Stephen,


Thank you for immediate response.


When PACKET_FANOUT_FLAG_DEFRAG flag is removed in AF_PACKET mode, IP fragment 
packets cannot be reassembled. That is what we expected or OVS expected.


The file drivers/net/af_packet/rte_net_af_packet.c +770
#if defined(PACKET_FANOUT) fanout_arg = (getpid() ^ (*internals)->if_index) & 
0x; fanout_arg |= (PACKET_FANOUT_HASH | PACKET_FANOUT_FLAG_DEFRAG) << 16; 
==> fanout_arg |= (PACKET_FANOUT_HASH) << 16; #if 
defined(PACKET_FANOUT_FLAG_ROLLOVER) fanout_arg |= PACKET_FANOUT_FLAG_ROLLOVER 
<< 16; #endif #endif


Best regards,


Matthew














At 2024-01-09 00:42:32, "Stephen Hemminger"  wrote:
>On Mon, 8 Jan 2024 19:07:20 +0800 (CST)
>钟  wrote:
>
>> Hi All,
>> 
>> 
>> Recently I debug ovs-dpdk with AF_PACKET mode.  When IP fragment packets are 
>> received via DPDK, the IP fragment packets are reassembled by DPDK. After 
>> reassembly, the packet length is over 1518. They are discarded by OVS 
>> because of oversize packets.
>> 
>> 
>> I don't understandy why  PACKET_FLAG_DEFRAG is set for AF_PACKET mode.
>> 
>> 
>
>Not sure, but it looks like the af_packet wants to hash packets for fanout.
>Hashin of fragments won't work correctly, and packet will arrive on different 
>queues if fragmented.


RE: [PATCH 2/3] net/nfp: fix free resource problem

2024-01-08 Thread Chaoyong He
> On 12/18/2023 1:50 AM, Chaoyong He wrote:
> >> On 12/14/2023 10:24 AM, Chaoyong He wrote:
> >>> From: Long Wu 
> >>>
> >>> Set the representor array to NULL to avoid that close interface does
> >>> not free some resource.
> >>>
> >>> Fixes: a135bc1644d6 ("net/nfp: fix resource leak for flower
> >>> firmware")
> >>> Cc: chaoyong...@corigine.com
> >>> Cc: sta...@dpdk.org
> >>>
> >>> Signed-off-by: Long Wu 
> >>> Reviewed-by: Chaoyong He 
> >>> Reviewed-by: Peng Zhang 
> >>> ---
> >>>  drivers/net/nfp/flower/nfp_flower_representor.c | 15
> >>> ++-
> >>>  1 file changed, 14 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/net/nfp/flower/nfp_flower_representor.c
> >>> b/drivers/net/nfp/flower/nfp_flower_representor.c
> >>> index 27ea3891bd..5f7c1fa737 100644
> >>> --- a/drivers/net/nfp/flower/nfp_flower_representor.c
> >>> +++ b/drivers/net/nfp/flower/nfp_flower_representor.c
> >>> @@ -294,17 +294,30 @@ nfp_flower_repr_tx_burst(void *tx_queue,
> >>> static int  nfp_flower_repr_uninit(struct rte_eth_dev *eth_dev)  {
> >>> + uint16_t index;
> >>>   struct nfp_flower_representor *repr;
> >>>
> >>>   repr = eth_dev->data->dev_private;
> >>>   rte_ring_free(repr->ring);
> >>>
> >>> + if (repr->repr_type == NFP_REPR_TYPE_PHYS_PORT) {
> >>> + index = NFP_FLOWER_CMSG_PORT_PHYS_PORT_NUM(repr-
> >>> port_id);
> >>> + repr->app_fw_flower->phy_reprs[index] = NULL;
> >>> + } else {
> >>> + index = repr->vf_id;
> >>> + repr->app_fw_flower->vf_reprs[index] = NULL;
> >>> + }
> >>> +
> >>>   return 0;
> >>>  }
> >>>
> >>>  static int
> >>> -nfp_flower_pf_repr_uninit(__rte_unused struct rte_eth_dev *eth_dev)
> >>> +nfp_flower_pf_repr_uninit(struct rte_eth_dev *eth_dev)
> >>>  {
> >>> + struct nfp_flower_representor *repr = eth_dev->data->dev_private;
> >>> +
> >>> + repr->app_fw_flower->pf_repr = NULL;
> >>>
> >>
> >> Here it is assigned to NULL but is it freed? If freed, why not set to
> >> NULL where it is freed?
> >>
> >> Same for above phy_reprs & vf_reprs.
> >
> > The whole invoke view:
> > rte_eth_dev_close()
> > --> nfp_flower_repr_dev_close()
> > --> nfp_flower_repr_free()
> > --> nfp_flower_pf_repr_uninit()
> > --> nfp_flower_repr_uninit()
> >// In these two functions, we just assigned to NULL but not 
> > freed yet.
> >// It is still refer by the `eth_dev->data->dev_private`.
> > --> rte_eth_dev_release_port()
> > --> rte_free(eth_dev->data->dev_private);
> > // And here it is really freed (by the rte framework).
> >
> 
> 'rte_eth_dev_release_port()' frees the device private data, but not all 
> pointers,
> like 'repr->app_fw_flower->pf_repr', in the struct are freed, it is 
> dev_close() or
> unint() functions responsibility.
> 
> Can you please double check if
> 'eth_dev->data->dev_private->app_fw_flower->pf_repr' freed or not?

(gdb) b nfp_flower_repr_dev_close
Breakpoint 1 at 0x7f839a4ad37f: file 
../drivers/net/nfp/flower/nfp_flower_representor.c, line 356.
(gdb) c
Continuing.

Thread 1 "dpdk-testpmd" hit Breakpoint 1, nfp_flower_repr_dev_close 
(dev=0x7f839aed2340 )
at ../drivers/net/nfp/flower/nfp_flower_representor.c:356
356 if (rte_eal_process_type() != RTE_PROC_PRIMARY)
(gdb) n
359 repr = dev->data->dev_private;
(gdb)
360 app_fw_flower = repr->app_fw_flower;
(gdb)
361 hw = app_fw_flower->pf_hw;
(gdb)
362 pf_dev = hw->pf_dev;
(gdb)
368 nfp_net_disable_queues(dev);
(gdb) p repr
$1 = (struct nfp_flower_representor *) 0x17c49c800
(gdb) p dev->data->dev_private
$2 = (void *) 0x17c49c800
(gdb) p repr->app_fw_flower->pf_repr
$3 = (struct nfp_flower_representor *) 0x17c49c800

As we can see, these three pointers point the same block of memory.

(gdb) until 384
nfp_flower_repr_dev_close (dev=0x7f839aed2340 )
at ../drivers/net/nfp/flower/nfp_flower_representor.c:384
384 nfp_flower_repr_free(repr, repr->repr_type);
(gdb) s
nfp_flower_repr_free (repr=0x17c49c800, repr_type=NFP_REPR_TYPE_PF)
at ../drivers/net/nfp/flower/nfp_flower_representor.c:328
328 switch (repr_type) {
(gdb) n
333 nfp_flower_pf_repr_uninit(repr->eth_dev);
(gdb) s
nfp_flower_pf_repr_uninit (eth_dev=0x7f839aed2340 )
at ../drivers/net/nfp/flower/nfp_flower_representor.c:317
317 struct nfp_flower_representor *repr = 
eth_dev->data->dev_private;
(gdb) n
319 repr->app_fw_flower->pf_repr = NULL;
(gdb) p eth_dev->data->dev_private
$4 = (void *) 0x17c49c800
(gdb) p repr
$5 = (struct nfp_flower_representor *) 0x17c49c800
(gdb) p repr->app_fw_flower->pf_repr
$6 = (struct nfp_flower_representor *) 0x17c49c800

As what I said, although I assign NULL to one of the pointers 
`repr->app_fw_flower->pf_repr`, it still can be access by the other one 
`eth_dev->data->dev_private`.

(gdb) fin
Run till exit from #0  nfp_flower_pf_repr_uninit (eth_dev=0x7f839aed2340 
)
at ../drive

Re: [PATCH] event/cnxk: use WFE LDP loop for getwork routine

2024-01-08 Thread Jerin Jacob
On Fri, Jan 5, 2024 at 9:24 AM  wrote:
>
> From: Pavan Nikhilesh 
>
> Use WFE LDP loop while polling for GETWORK completion for better
> power savings.
> Disabled by default and can be enabled by setting
> `RTE_ARM_USE_WFE` to `true` in `config/arm/meson.build`
>
> Signed-off-by: Pavan Nikhilesh 
> ---
>  doc/guides/eventdevs/cnxk.rst |  9 ++
>  drivers/event/cnxk/cn10k_worker.h | 52 +--
>  2 files changed, 52 insertions(+), 9 deletions(-)
>
> diff --git a/doc/guides/eventdevs/cnxk.rst b/doc/guides/eventdevs/cnxk.rst
> index cccb8a0304..d62c143c77 100644
> --- a/doc/guides/eventdevs/cnxk.rst
> +++ b/doc/guides/eventdevs/cnxk.rst
> @@ -198,6 +198,15 @@ Runtime Config Options
>
>  -a 0002:0e:00.0,tim_eclk_freq=12288-10-0
>
> +Power Savings on CN10K
> +--
> +
> +ARM cores can additionally use WFE when polling for transactions on SSO bus
> +to save power i.e., in the event dequeue call ARM core can enter WFE and exit
> +when either work has been scheduled or dequeue timeout has reached.
> +This can be enabled by setting ``RTE_ARM_USE_WFE`` to ``true`` in
> +``config/arm/meson.build``.

+ ARM maintainers

IMO, Updating config/arm/meson.build for enabling RTE_ARM_USE_WFE,
needs to improved.
Could you push a patch for enabling via -D... or via -Dc_args=...