Re: [PATCH] net/mlx5: fix data access race condition for shared Rx queue

2024-07-18 Thread Raslan Darawsheh
Hi,

From: Jiawei(Jonny) Wang 
Sent: Friday, July 5, 2024 4:05 PM
To: Bing Zhao; Slava Ovsiienko; Dariusz Sosnowski; Ori Kam; Suanming Mou; Matan 
Azrad; Alexander Kozyrev
Cc: dev@dpdk.org; Raslan Darawsheh; sta...@dpdk.org
Subject: [PATCH] net/mlx5: fix data access race condition for shared Rx queue

The rxq_data resources were shared for shared Rx queue with the same
group and queue ID.
The cq_ci:24 of rxq_data was unalignment with other fields in the one
32-bit data, like the dynf_meta and delay_drop.

  32bit:   xxxI      IIIx
  ^     ...^
  |  cq_ci |

The issue is that while the control thread updates the dynf_meta:1 or
delay_drop:1 value during port start, another data thread updates the
cq_ci at the same time, it causes the bytes race condition with
different thread, and cq_ci value may be overwritten and updated the
abnormal value into HW CQ DB.

This patch separates the cq_ci from the configuration data spaces, and
adds checking for delay_drop and dynf_meta if shared Rx queue if
started.

Fixes: 02a6195cbe ("net/mlx5: support enhanced CQE compression in Rx burst")
Cc: sta...@dpdk.org

Signed-off-by: Jiawei Wang 
Acked-by: Bing Zhao 
Acked-by: Viacheslav Ovsiienko 

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh



Re: [PATCH] net/mlx5: fix indexed pool resize

2024-07-18 Thread Raslan Darawsheh
Hi,

From: Gregory Etelson 
Sent: Sunday, July 7, 2024 12:48 PM
To: dev@dpdk.org
Cc: Gregory Etelson; Maayan Kashani; Raslan Darawsheh; Dariusz Sosnowski; Slava 
Ovsiienko; Bing Zhao; Ori Kam; Suanming Mou; Matan Azrad
Subject: [PATCH] net/mlx5: fix indexed pool resize

On success, indexed pool resize sets maximal pool entries number to
the `num_entries` parameter value.

The patch fixes maximal pool entries assignment.

The patch also adds `error` parameter to log error types.

Fixes: 89578504edd9 ("net/mlx5: add ipool resize function")

Signed-off-by: Gregory Etelson 
Acked-by: Dariusz Sosnowski 


Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh


Re: [PATCH] net/mlx5: fix IPv6-in-IPv6 tunnel recognition

2024-07-18 Thread Raslan Darawsheh
Hi,

From: Gregory Etelson 
Sent: Sunday, July 7, 2024 1:13 PM
To: dev@dpdk.org
Cc: Gregory Etelson; Maayan Kashani; Raslan Darawsheh; Dariusz Sosnowski; Slava 
Ovsiienko; Bing Zhao; Ori Kam; Suanming Mou; Matan Azrad
Subject: [PATCH] net/mlx5: fix IPv6-in-IPv6 tunnel recognition

The PMD did not recognize IPv6-in-IPv6 tunnel if IPv6 routing
extension was placed between IPv6 outer and inner headers.

The patch fixes IPv6-in-IPv6 tunnel recognition.

Fixes: 80c676259a04 ("net/mlx5: validate HWS template items")

Signed-off-by: Gregory Etelson 
Acked-by: Dariusz Sosnowski 

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh



Re: [PATCH] net/mlx5: fix compilation warning in GCC-9.1

2024-07-18 Thread Raslan Darawsheh
Hi,

From: Gregory Etelson 
Sent: Sunday, July 7, 2024 12:57 PM
To: dev@dpdk.org
Cc: Gregory Etelson; Maayan Kashani; Raslan Darawsheh; sta...@dpdk.org; Dariusz 
Sosnowski; Slava Ovsiienko; Bing Zhao; Ori Kam; Suanming Mou; Matan Azrad
Subject: [PATCH] net/mlx5: fix compilation warning in GCC-9.1

GCC has introduced a bugfix in 9.1 that changed GCC ABI in ARM setups:
https://gcc.gnu.org/gcc-9/changes.html
```
On Arm targets (arm*-*-*), a bug in the implementation of the
procedure call standard (AAPCS) in the GCC 6, 7 and 8 releases
has been fixed: a structure containing a bit-field based on a 64-bit
integral type and where no other element in a structure required
64-bit alignment could be passed incorrectly to functions.
This is an ABI change. If the option -Wpsabi is enabled
(on by default) the compiler will emit a diagnostic note for code
that might be affected.
```

The patch fixes PMD compilation in the INTEGRITY flow item.

Fixes: 23b0a8b298b1 ("net/mlx5: fix integrity item validation and translation")

Cc: sta...@dpdk.org

Signed-off-by: Gregory Etelson 
Acked-by: Dariusz Sosnowski 

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh



Re: [PATCH] net/mlx5: fix MTU configuration

2024-07-18 Thread Raslan Darawsheh
Hi,

From: Dariusz Sosnowski 
Sent: Monday, July 8, 2024 1:59 PM
To: Slava Ovsiienko; Bing Zhao; Ori Kam; Suanming Mou; Matan Azrad
Cc: dev@dpdk.org; sta...@dpdk.org
Subject: [PATCH] net/mlx5: fix MTU configuration

Apply provided MTU, derived from rte_eth_conf.rxmode.mtu,
on port configuration.

Bugzilla ID: 1483
Fixes: e60fbd5b24fc ("mlx5: add device configure/start/stop")
Cc: sta...@dpdk.org

Signed-off-by: Dariusz Sosnowski 
Acked-by: Viacheslav Ovsiienko 

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh



Re: [PATCH 0/8] HW steering team updates

2024-07-18 Thread Raslan Darawsheh
Hi  Itamar,

From: Itamar Gozlan 
Sent: Tuesday, July 9, 2024 3:30 PM
To: Itamar Gozlan; Erez Shitrit; Hamdan Agbariya; Yevgeny Kliteynik; Alex 
Vesker; Slava Ovsiienko; NBU-Contact-Thomas Monjalon (EXTERNAL); Suanming Mou
Cc: dev@dpdk.org; Ori Kam
Subject: [PATCH 0/8] HW steering team updates

This patch series contains 8 commits from the HW steering team,
addressing
various improvements and fixes in the DPDK project.

This is the second version of this submission. The previous version
erroneously included two unnecessary commits, which have been removed in
this iteration.

Alex Vesker (1):
  net/mlx5/hws: fix incorrect port ID on root item convert

Erez Shitrit (6):
  net/mlx5/hws: set eswitch owner vhc ID valid accordingly
  net/mlx5/hws: fix memory leak in modify header free
  net/mlx5/hws: fix deletion of action vport
  net/mlx5/hws: take out not needed variable
  net/mlx5/hws: fix NAT64 csum issue
  net/mlx5/hws: fix NA64 copy TOS field instead of TTL

Itamar Gozlan (1):
  net/mlx5/hws: strictly range templates check fix

 drivers/net/mlx5/hws/mlx5dr_action.c  | 213 ++
 drivers/net/mlx5/hws/mlx5dr_action.h  |  17 +-
 drivers/net/mlx5/hws/mlx5dr_cmd.c |   6 +-
 drivers/net/mlx5/hws/mlx5dr_cmd.h |   1 +
 drivers/net/mlx5/hws/mlx5dr_definer.c |  11 +-
 drivers/net/mlx5/hws/mlx5dr_matcher.c |  20 +--
 drivers/net/mlx5/hws/mlx5dr_pat_arg.h |   1 -
 drivers/net/mlx5/hws/mlx5dr_rule.c|  22 +--
 8 files changed, 216 insertions(+), 75 deletions(-)

--
2.39.3


Series applied to next-net-mlx,

Kindest regards
Raslan Darawsheh


Re: [PATCH] kni: fix build with Linux 6.10

2024-07-18 Thread Thomas Monjalon
Adding sta...@dpdk.org and LTS maintainers as Cc.
It must be merged directly in LTS branches.


16/07/2024 10:44, Jiri Slaby:
> 6.10 removed the "support" (it was never supported [1]) of separate
> source and build dirs for out of tree modules.
> 
> KNI uses "src=" hack for that purpose.
> 
> Instead, copy sources to the build dir and don't rely upon the
> unsupported...
> 
> Intended esp. for stable/22.11. It should go wherever kni is still in
> the tree.
> 
> [1] 
> https://lore.kernel.org/all/CAK7LNAQ47bZpE6c6Yoz-jQS78uU611oZwU8bH+7e=p5zsya...@mail.gmail.com/
> 
> Signed-off-by: Jiri Slaby 
> ---
>  kernel/linux/kni/meson.build | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/linux/kni/meson.build b/kernel/linux/kni/meson.build
> index 4c90069e..39ca2ade 100644
> --- a/kernel/linux/kni/meson.build
> +++ b/kernel/linux/kni/meson.build
> @@ -11,8 +11,8 @@ if run_cmd.stdout().contains('txqueue') == true
> kmod_cflags = '-DHAVE_ARG_TX_QUEUE'
>  endif
>  
> -
> -kni_mkfile = custom_target('rte_kni_makefile',
> +kni_deps = []
> +kni_deps += custom_target('rte_kni_makefile',
>  output: 'Makefile',
>  command: ['touch', '@OUTPUT@'])
>  
> @@ -22,12 +22,15 @@ kni_sources = files(
>  'Kbuild',
>  )
>  
> +foreach file : kni_sources
> +  kni_deps += fs.copyfile(file)
> +endforeach
> +
>  custom_target('rte_kni',
>  input: kni_sources,
>  output: 'rte_kni.ko',
>  command: ['make', '-j4', '-C', kernel_build_dir,
>  'M=' + meson.current_build_dir(),
> -'src=' + meson.current_source_dir(),
>  ' '.join(['MODULE_CFLAGS=', kmod_cflags,'-include '])
>  + dpdk_source_root + '/config/rte_config.h' +
>  ' -I' + dpdk_source_root + '/lib/eal/include' +
> @@ -35,7 +38,7 @@ custom_target('rte_kni',
>  ' -I' + dpdk_build_root +
>  ' -I' + meson.current_source_dir(),
>  'modules'] + cross_args,
> -depends: kni_mkfile,
> +depends: kni_deps,
>  install: install,
>  install_dir: kernel_install_dir,
>  build_by_default: get_option('enable_kmods'))






DPDK Release Status Meeting 2024-07-18

2024-07-18 Thread Mcnamara, John
Release status meeting minutes 2024-07-18
=

Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

Participants:
* AMD
* ARM
* Debian/Microsoft
* Intel
* Marvell
* Nvidia


Release Dates
-

The following are the current/updated working dates for 24.07:

- Proposal deadline (RFC/v1 patches): 26 April 2024
- API freeze (-rc1): 14 June 2024
- PMD features freeze (-rc2): 12 July 2024
- Builtin applications features freeze (-rc3): 22 July 2024
- Release: 30/31 July 2023


https://core.dpdk.org/roadmap/#dates


Subtrees


* next-net
  * Looking at Napatech driver updates post-RC2.
  * Adding some fixes.
  * In this release there is a new driver from ZTE and two from Realtek.

* next-net-intel
  * Main base code changes merged.
  * Some other fix patches under review.

* next-net-mlx
  * Merging some fixes and minor changes.

* next-net-mvl
  * No update this week.

* next-eventdev
  * No update this week.

* next-baseband
  * No update this week.

* next-virtio
  * No update this week.

* next-crypto
  * Around 9 fix patches in Patchwork.

* main
  * RC2 released 12 July.
  * Looking at merging DTS patches and fixes.
  * RC3 should close Monday 22 July
  * RC4 should be Friday 26 July
  * Release on 30/31 July


The following are the proposed dates for 24.11:

- Proposal deadline (RFC/v1 patches): 7 September 2024
- API freeze (-rc1): 7 October 2024
- PMD features freeze (-rc2): 28 October 2024
- Builtin applications features freeze (-rc3): 4 November 2024
- Release: 18 November 2023


LTS
---

Status of the current LTSes

* 23.11.2 - In progress.
* 22.11.6 - In progress.
* 21.11.8 - In progress.

* 20.11.10 - Will only be updated with CVE and critical fixes.
* 19.11.15 - Will only be updated with CVE and critical fixes.


* Distros
  * Debian 12 contains DPDK v22.11
  * Ubuntu 24.04 contains DPDK v23.11
  * Ubuntu 23.04 contains DPDK v22.11
  * RHEL 8/9 contains DPDK 23.11

Defects
---

* Bugzilla links, 'Bugs',  added for hosted projects
  * https://www.dpdk.org/hosted-projects/



DPDK Release Status Meetings


The DPDK Release Status Meeting is intended for DPDK Committers to discuss the
status of the master tree and sub-trees, and for project managers to track
progress or milestone dates.

The meeting occurs on every Thursday at 9:30 UTC over Jitsi on 
https://meet.jit.si/DPDK

You don't need an invite to join the meeting but if you want a calendar 
reminder just
send an email to "John McNamara john.mcnam...@intel.com" for the invite.


Re: [PATCH] kni: fix build with Linux 6.10

2024-07-18 Thread Luca Boccassi
Hi Jiri,

Please follow the process described in the "stable release" paragraph at:

https://core.dpdk.org/contribute/

ie, one patch per affected LTS branch must be prepared, tested and
sent individually, and then we'll apply it. Thanks.

On Thu, 18 Jul 2024 at 10:16, Thomas Monjalon  wrote:
>
> Adding sta...@dpdk.org and LTS maintainers as Cc.
> It must be merged directly in LTS branches.
>
>
> 16/07/2024 10:44, Jiri Slaby:
> > 6.10 removed the "support" (it was never supported [1]) of separate
> > source and build dirs for out of tree modules.
> >
> > KNI uses "src=" hack for that purpose.
> >
> > Instead, copy sources to the build dir and don't rely upon the
> > unsupported...
> >
> > Intended esp. for stable/22.11. It should go wherever kni is still in
> > the tree.
> >
> > [1] 
> > https://lore.kernel.org/all/CAK7LNAQ47bZpE6c6Yoz-jQS78uU611oZwU8bH+7e=p5zsya...@mail.gmail.com/
> >
> > Signed-off-by: Jiri Slaby 
> > ---
> >  kernel/linux/kni/meson.build | 11 +++
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/linux/kni/meson.build b/kernel/linux/kni/meson.build
> > index 4c90069e..39ca2ade 100644
> > --- a/kernel/linux/kni/meson.build
> > +++ b/kernel/linux/kni/meson.build
> > @@ -11,8 +11,8 @@ if run_cmd.stdout().contains('txqueue') == true
> > kmod_cflags = '-DHAVE_ARG_TX_QUEUE'
> >  endif
> >
> > -
> > -kni_mkfile = custom_target('rte_kni_makefile',
> > +kni_deps = []
> > +kni_deps += custom_target('rte_kni_makefile',
> >  output: 'Makefile',
> >  command: ['touch', '@OUTPUT@'])
> >
> > @@ -22,12 +22,15 @@ kni_sources = files(
> >  'Kbuild',
> >  )
> >
> > +foreach file : kni_sources
> > +  kni_deps += fs.copyfile(file)
> > +endforeach
> > +
> >  custom_target('rte_kni',
> >  input: kni_sources,
> >  output: 'rte_kni.ko',
> >  command: ['make', '-j4', '-C', kernel_build_dir,
> >  'M=' + meson.current_build_dir(),
> > -'src=' + meson.current_source_dir(),
> >  ' '.join(['MODULE_CFLAGS=', kmod_cflags,'-include '])
> >  + dpdk_source_root + '/config/rte_config.h' +
> >  ' -I' + dpdk_source_root + '/lib/eal/include' +
> > @@ -35,7 +38,7 @@ custom_target('rte_kni',
> >  ' -I' + dpdk_build_root +
> >  ' -I' + meson.current_source_dir(),
> >  'modules'] + cross_args,
> > -depends: kni_mkfile,
> > +depends: kni_deps,
> >  install: install,
> >  install_dir: kernel_install_dir,
> >  build_by_default: get_option('enable_kmods'))
>
>
>
>


[PATCH 0/3] net/mlx5: E-Switch and validation fixes

2024-07-18 Thread Dariusz Sosnowski
Patch 1 - Fixes a bug with fdb_def_rule_en device argument,
used to control default E-Switch flow rules created by mlx5 PMD.

Patches 2-3 - Fixes for flow rule validation in async flow API,
which were found during testing.

Dariusz Sosnowski (3):
  net/mlx5: fix disabling E-Switch default flow rules
  net/mlx5: fix action configuration validation
  net/mlx5: fix RSS and queue action validation

 drivers/net/mlx5/mlx5_flow_hw.c | 182 +---
 drivers/net/mlx5/mlx5_trigger.c |   4 +-
 drivers/net/mlx5/mlx5_txq.c |  13 ++-
 3 files changed, 107 insertions(+), 92 deletions(-)

--
2.39.2



[PATCH 1/3] net/mlx5: fix disabling E-Switch default flow rules

2024-07-18 Thread Dariusz Sosnowski
`fdb_def_rule_en` devarg controls whether mlx5 PMD creates default
E-Switch flow rules for:

- Transferring traffic from wire, VFs and SFs to group 1 (default jump).
- Providing default behavior for application traffic (default SQ miss
  flow rules).

With these flow rules, applications effectively create transfer flow
rules in group 1 and higher (application group is translated to one
higher) allowing for faster insertion on all groups and providing
ability to forward to VF, SF and wire on any group.

By default, these rules are created (`fdb_def_rule_en` == 1).

When these default flow rules are disabled (`fdb_def_rule_en` == 0)
with HW Steering flow engine (`dv_flow_en` == 2) only creation of
default jump rules was disabled. Also, necessary template table and
pattern/actions templates were created as well,
but they were never used.
SQ miss flow rules were still created.
This is a bug, because with `fdb_def_rule_en` == 0, application should
not expect any default E-Switch flow rules.

This patch fixes that by disabling all default E-Switch flow rules
creation and disabling creating templates for these flow rules,
when `fdb_def_rule_en` == 0.
If an application needs to run with these flow rules disabled,
and requires flow rules providing SQ miss flow rules functionality,
then application must explicitly create similar flow rules.

Fixes: 1939eb6f660c ("net/mlx5: support flow port action with HWS")
Cc: sta...@dpdk.org

Signed-off-by: Dariusz Sosnowski 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_hw.c | 142 ++--
 drivers/net/mlx5/mlx5_trigger.c |   4 +-
 drivers/net/mlx5/mlx5_txq.c |  13 ++-
 3 files changed, 91 insertions(+), 68 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index e7d8c251a0..fe7df7305f 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -10580,6 +10580,7 @@ flow_hw_create_ctrl_tables(struct rte_eth_dev *dev, 
struct rte_flow_error *error
struct mlx5_flow_hw_ctrl_fdb *hw_ctrl_fdb;
uint32_t xmeta = priv->sh->config.dv_xmeta_en;
uint32_t repr_matching = priv->sh->config.repr_matching;
+   uint32_t fdb_def_rule = priv->sh->config.fdb_def_rule;
 
MLX5_ASSERT(priv->hw_ctrl_fdb == NULL);
hw_ctrl_fdb = mlx5_malloc(MLX5_MEM_ZERO, sizeof(*hw_ctrl_fdb), 0, 
SOCKET_ID_ANY);
@@ -10590,70 +10591,79 @@ flow_hw_create_ctrl_tables(struct rte_eth_dev *dev, 
struct rte_flow_error *error
goto err;
}
priv->hw_ctrl_fdb = hw_ctrl_fdb;
-   /* Create templates and table for default SQ miss flow rules - root 
table. */
-   hw_ctrl_fdb->esw_mgr_items_tmpl = 
flow_hw_create_ctrl_esw_mgr_pattern_template(dev, error);
-   if (!hw_ctrl_fdb->esw_mgr_items_tmpl) {
-   DRV_LOG(ERR, "port %u failed to create E-Switch Manager item"
-   " template for control flows", dev->data->port_id);
-   goto err;
-   }
-   hw_ctrl_fdb->regc_jump_actions_tmpl = 
flow_hw_create_ctrl_regc_jump_actions_template
-   (dev, error);
-   if (!hw_ctrl_fdb->regc_jump_actions_tmpl) {
-   DRV_LOG(ERR, "port %u failed to create REG_C set and jump 
action template"
-   " for control flows", dev->data->port_id);
-   goto err;
-   }
-   hw_ctrl_fdb->hw_esw_sq_miss_root_tbl = 
flow_hw_create_ctrl_sq_miss_root_table
-   (dev, hw_ctrl_fdb->esw_mgr_items_tmpl, 
hw_ctrl_fdb->regc_jump_actions_tmpl,
-error);
-   if (!hw_ctrl_fdb->hw_esw_sq_miss_root_tbl) {
-   DRV_LOG(ERR, "port %u failed to create table for default sq 
miss (root table)"
-   " for control flows", dev->data->port_id);
-   goto err;
-   }
-   /* Create templates and table for default SQ miss flow rules - non-root 
table. */
-   hw_ctrl_fdb->regc_sq_items_tmpl = 
flow_hw_create_ctrl_regc_sq_pattern_template(dev, error);
-   if (!hw_ctrl_fdb->regc_sq_items_tmpl) {
-   DRV_LOG(ERR, "port %u failed to create SQ item template for"
-   " control flows", dev->data->port_id);
-   goto err;
-   }
-   hw_ctrl_fdb->port_actions_tmpl = 
flow_hw_create_ctrl_port_actions_template(dev, error);
-   if (!hw_ctrl_fdb->port_actions_tmpl) {
-   DRV_LOG(ERR, "port %u failed to create port action template"
-   " for control flows", dev->data->port_id);
-   goto err;
-   }
-   hw_ctrl_fdb->hw_esw_sq_miss_tbl = flow_hw_create_ctrl_sq_miss_table
-   (dev, hw_ctrl_fdb->regc_sq_items_tmpl, 
hw_ctrl_fdb->port_actions_tmpl,
-error);
-   if (!hw_ctrl_fdb->hw_esw_sq_miss_tbl) {
-   DRV_LOG(ERR, "port %u failed to create table for default sq 
miss (non-root table)"
-   " for control flows", de

[PATCH 3/3] net/mlx5: fix RSS and queue action validation

2024-07-18 Thread Dariusz Sosnowski
mlx5 PMD supports configuration where
Rx queues managed by DPDK are not set up.
Externally allocated RQs can be used by mapping them to some
DPDK Rx queue indexes using rte_pmd_mlx5_external_rx_queue_id_map()
API. In this case, mlx5 PMD will allow creating flow rules which
reference such external RQ.

HWS validation of RSS and QUEUE unmasked flow actions in actions
templates worked by constructing a "mock" action which was then checked.
This procedure incorrectly assumed that queue index 0 can be used as
"always valid queue", which is not the case in scenario mentioned above,
because queue 0 was not set up

This patch fixes that by removing "mock" actions, since there's no real
data available for validation. RSS and QUEUE validation in unmasked
action case only checks flow attributes.

Fixes: d6dc072aeb12 ("net/mlx5: validate flow actions in table creation")

Signed-off-by: Dariusz Sosnowski 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_hw.c | 37 +
 1 file changed, 14 insertions(+), 23 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 39d1cd96d4..d243b59b71 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -6806,8 +6806,6 @@ mlx5_hw_validate_action_mark(struct rte_eth_dev *dev,
  &attr, error);
 }
 
-#define MLX5_FLOW_DEFAULT_INGRESS_QUEUE 0
-
 static int
 mlx5_hw_validate_action_queue(struct rte_eth_dev *dev,
  const struct rte_flow_action *template_action,
@@ -6817,22 +6815,22 @@ mlx5_hw_validate_action_queue(struct rte_eth_dev *dev,
  struct rte_flow_error *error)
 {
const struct rte_flow_action_queue *queue_mask = template_mask->conf;
-   const struct rte_flow_action *action =
-   queue_mask && queue_mask->index ? template_action :
-   &(const struct rte_flow_action) {
-   .type = RTE_FLOW_ACTION_TYPE_QUEUE,
-   .conf = &(const struct rte_flow_action_queue) {
-   .index = MLX5_FLOW_DEFAULT_INGRESS_QUEUE
-   }
-   };
const struct rte_flow_attr attr = {
.ingress = template_attr->ingress,
.egress = template_attr->egress,
.transfer = template_attr->transfer
};
+   bool masked = queue_mask != NULL && queue_mask->index;
 
-   return mlx5_flow_validate_action_queue(action, action_flags,
-  dev, &attr, error);
+   if (template_attr->egress || template_attr->transfer)
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_ATTR, NULL,
+ "QUEUE action supported for ingress 
only");
+   if (masked)
+   return mlx5_flow_validate_action_queue(template_action, 
action_flags, dev,
+  &attr, error);
+   else
+   return 0;
 }
 
 static int
@@ -6844,22 +6842,15 @@ mlx5_hw_validate_action_rss(struct rte_eth_dev *dev,
  struct rte_flow_error *error)
 {
const struct rte_flow_action_rss *mask = template_mask->conf;
-   const struct rte_flow_action *action = mask ? template_action :
-   &(const struct rte_flow_action) {
-   .type = RTE_FLOW_ACTION_TYPE_RSS,
-   .conf = &(const struct rte_flow_action_rss) {
-   .queue_num = 1,
-   .queue = (uint16_t [1]) {
-   MLX5_FLOW_DEFAULT_INGRESS_QUEUE
-   }
-   }
-   };
 
if (template_attr->egress || template_attr->transfer)
return rte_flow_error_set(error, EINVAL,
  RTE_FLOW_ERROR_TYPE_ATTR, NULL,
  "RSS action supported for ingress 
only");
-   return mlx5_validate_action_rss(dev, action, error);
+   if (mask != NULL)
+   return mlx5_validate_action_rss(dev, template_action, error);
+   else
+   return 0;
 }
 
 static int
-- 
2.39.2



[PATCH 2/3] net/mlx5: fix action configuration validation

2024-07-18 Thread Dariusz Sosnowski
Checking if action configuration is required should be checked based on
action type recorded in the actions template, not on user action.

Also, adds a missing internal RSS action type to configuration check
skip list.

Fixes: 57c7b94301ee ("net/mlx5: add async flow operation validation")

Signed-off-by: Dariusz Sosnowski 
Acked-by: Suanming Mou 
---
 drivers/net/mlx5/mlx5_flow_hw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index fe7df7305f..39d1cd96d4 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -16388,10 +16388,11 @@ flow_hw_validate_rule_actions(struct rte_eth_dev *dev,
user_action = &actions[act_data->action_src];
 
/* Skip actions which do not require conf. */
-   switch ((int)user_action->type) {
+   switch ((int)act_data->type) {
case RTE_FLOW_ACTION_TYPE_COUNT:
case MLX5_RTE_FLOW_ACTION_TYPE_COUNT:
case MLX5_RTE_FLOW_ACTION_TYPE_METER_MARK:
+   case MLX5_RTE_FLOW_ACTION_TYPE_RSS:
continue;
default:
break;
-- 
2.39.2



RE: [PATCH 3/3] examples/l3fwd: fix maximum acceptable port ID in routes

2024-07-18 Thread Konstantin Ananyev



> > > Application is accepting routes for port ID up to UINT8_MAX for LPM
> > > amd EM routes on parsing the given rule file, but only up to 32 ports
> > > can be enabled as per the variable enabled_port_mask which is defined
> > > as uint32_t.
> > >
> > > This patch restricts the rules parsing code to accept routes for port
> > > ID up to 31 only to avoid any unnecessary maintenance of rules which
> > > will never be used.
> >
> > If we want to add this extra check, probably better to do it in setup_lpm().
> > Where we already check that port is enabled, and If not, then this route 
> > rule will
> > be skipped:
> >
> > /* populate the LPM table */
> > for (i = 0; i < route_num_v4; i++) {
> > struct in_addr in;
> >
> > /* skip unused ports */
> > if ((1 << route_base_v4[i].if_out &
> > enabled_port_mask) == 0)
> > continue;
> >
> > Same for EM.
> I am trying to update the check for MAX if_out value in rules config file 
> parsing which will be before setup_lpm().
> The reason is, restricting and adding only those rules which can be used by 
> the application
> while populating the route_base_v4/v6 at first step and avoid unnecessary 
> memory allocation
> for local variables to store more not required rules.

Hmm... but why it is a problem?
 
> 
> > ((1 << route_base_v4[i].if_out &
> > enabled_port_mask)
> By looking into this check, it seems restriction to maximum 31 port ID while 
> parsing rule file becomes
> more valid as this check can pass due to overflow in case value of 
> route_base_v4[i].if_out
> Is 31+.

Agree, I think we need both, and it probably need to be in setup_lpm().
Something like:

if (route_base_v4[i].if_out >= sizeof(enabled_port_mask) * CHAR_BIT ||
   ((1 << route_base_v4[i].if_out & enabled_port_mask) == 0) {
 /* print some error message here*/
 rte_exiit(...);  /* or return an error */
} 
   
> 
> > Another question here - why we just silently skip the rule with invalid 
> > port?
> In read_config_files_lpm() we are calling the rte_exit in case port ID is 31+.
> In setup_lpm, skipping the rules for the ports which are not enabled and not 
> giving error,
> I guess probably because of ease of use.
> e.g. user has only single ipv4_routes config file with route rules for port 
> ID 0,1,2,3,4
> and want to use same file for multiple test cases like
> 1. when only port 0 enabled
> 2. when only port 0 and 1 enabled and so on.
> In this case, user can avoid to have separate route files for each of the 
> test case.

The problem as I see it - we are not consistent here.
In some cases we just silently skip rules with invalid (or disabled) port 
numbers,
in other cases we generate an error and exit.
For me it would be better, if we follow one simple policy (abort with error) 
here for all cases.
 
> 
> > Probably need to fail with error... that what ACL code-path does.
> >
> > > Fixes: e7e6dd643092 ("examples/l3fwd: support config file for EM")
> > > Fixes: 52def963fc1c ("examples/l3fwd: support config file for
> > > LPM/FIB")
> > > Cc: sean.morris...@intel.com
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Gagandeep Singh 
> > > ---
> > >  examples/l3fwd/em_route_parse.c  | 6 --
> > > examples/l3fwd/lpm_route_parse.c | 6 --
> > >  2 files changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/examples/l3fwd/em_route_parse.c
> > > b/examples/l3fwd/em_route_parse.c index 8b534de5f1..65c71cd1ba 100644
> > > --- a/examples/l3fwd/em_route_parse.c
> > > +++ b/examples/l3fwd/em_route_parse.c
> > > @@ -65,7 +65,8 @@ em_parse_v6_rule(char *str, struct em_rule *v)
> > >   /* protocol. */
> > >   GET_CB_FIELD(in[CB_FLD_PROTO], v->v6_key.proto, 0, UINT8_MAX, 0);
> > >   /* out interface. */
> > > - GET_CB_FIELD(in[CB_FLD_IF_OUT], v->if_out, 0, UINT8_MAX, 0);
> > > + GET_CB_FIELD(in[CB_FLD_IF_OUT], v->if_out, 0,
> > > + (sizeof(enabled_port_mask) * CHAR_BIT) - 1, 0);
> > >
> > >   return 0;
> > >  }
> > > @@ -102,7 +103,8 @@ em_parse_v4_rule(char *str, struct em_rule *v)
> > >   /* protocol. */
> > >   GET_CB_FIELD(in[CB_FLD_PROTO], v->v4_key.proto, 0, UINT8_MAX, 0);
> > >   /* out interface. */
> > > - GET_CB_FIELD(in[CB_FLD_IF_OUT], v->if_out, 0, UINT8_MAX, 0);
> > > + GET_CB_FIELD(in[CB_FLD_IF_OUT], v->if_out, 0,
> > > + (sizeof(enabled_port_mask) * CHAR_BIT) - 1, 0);
> > >
> > >   return 0;
> > >  }
> > > diff --git a/examples/l3fwd/lpm_route_parse.c
> > > b/examples/l3fwd/lpm_route_parse.c
> > > index f27b66e838..357c12d9fe 100644
> > > --- a/examples/l3fwd/lpm_route_parse.c
> > > +++ b/examples/l3fwd/lpm_route_parse.c
> > > @@ -110,7 +110,8 @@ lpm_parse_v6_rule(char *str, struct lpm_route_rule
> > > *v)
> > >
> > >   rc = lpm_parse_v6_net(in[CB_FLD_DST_ADDR], v->ip_32, &v->depth);
> > >
> > > - GET_CB_FIELD(in[CB_FLD_IF_OUT], v->if_out, 0, UINT8_MAX, 0);
> > > + GET_CB_FIELD(in[C

[PATCH v2 1/3] ethdev: add description for KEEP CRC offload

2024-07-18 Thread Jie Hai
From: Dengdui Huang 

The data execeed the pkt_len in mbuf is inavailable for user.
When KEEP CRC offload is enabled, CRC field length should be
included pkt_len in mbuf. However, almost of drivers supported
KEEP CRC feature didn't add the CRC data length to pkt_len.
So it is very necessary to add a coments for this.

Fixes: 70815c9ecadd ("ethdev: add new offload flag to keep CRC")
Cc: sta...@dpdk.org

Signed-off-by: Dengdui Huang 
---
 lib/ethdev/rte_ethdev.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 548fada1c7..3d44673161 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -1550,6 +1550,12 @@ struct rte_eth_conf {
  */
 #define RTE_ETH_RX_OFFLOAD_TIMESTAMPRTE_BIT64(14)
 #define RTE_ETH_RX_OFFLOAD_SECURITY RTE_BIT64(15)
+/*
+ * Keep CRC data in packet.
+ *
+ * Note: If this offload is enabled, the pkt_len in mbuf should contain
+ * the CRC data length.
+ */
 #define RTE_ETH_RX_OFFLOAD_KEEP_CRC RTE_BIT64(16)
 #define RTE_ETH_RX_OFFLOAD_SCTP_CKSUM   RTE_BIT64(17)
 #define RTE_ETH_RX_OFFLOAD_OUTER_UDP_CKSUM  RTE_BIT64(18)
-- 
2.33.0



[PATCH v2 0/3] bugfix about KEEP CRC offload

2024-07-18 Thread Jie Hai
From: Dengdui Huang 

For hns3 NIC, when KEEP_CRC offload is enabled, the CRC data is still be 
stripped in rare scenarios.
Some users of hns3 are already using this feature. So driver has to recaculate 
packet CRC.

In addition, in the mbuf, the data that exceeds the length specified by pkt_len 
is invalid.
Therefore, if the packet contains CRC data, pkt_len should contain the CRC data 
length.
However, almost of drivers supported KEEP CRC feature didn't add the CRC data 
length to pkt_len.
This patch adds description for KEEP CRC offload.

Dengdui Huang (3):
  ethdev: add description for KEEP CRC offload
  net/hns3: fix packet length do not contain CRC data length
  net/hns3: fix Rx packet without CRC data

 drivers/net/hns3/hns3_ethdev.c|   5 +
 drivers/net/hns3/hns3_ethdev.h|  23 +
 drivers/net/hns3/hns3_rxtx.c  | 134 --
 drivers/net/hns3/hns3_rxtx.h  |   3 +
 drivers/net/hns3/hns3_rxtx_vec.c  |   3 +-
 drivers/net/hns3/hns3_rxtx_vec_neon.h |  19 
 drivers/net/hns3/hns3_rxtx_vec_sve.c  |   3 +-
 lib/ethdev/rte_ethdev.h   |   6 ++
 8 files changed, 124 insertions(+), 72 deletions(-)

-- 
2.33.0



[PATCH v2 2/3] net/hns3: fix packet length do not contain CRC data length

2024-07-18 Thread Jie Hai
From: Dengdui Huang 

In the HNS3 driver, pkt_len in mbuf do not contain the CRC length.
This patch fix it.

Fixes: 8973d7c4ca12 ("net/hns3: support keeping CRC")
Cc: sta...@dpdk.org

Signed-off-by: Dengdui Huang 
---
 drivers/net/hns3/hns3_rxtx.c  | 53 +++
 drivers/net/hns3/hns3_rxtx_vec_neon.h | 19 --
 drivers/net/hns3/hns3_rxtx_vec_sve.c  |  3 +-
 3 files changed, 7 insertions(+), 68 deletions(-)

diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 5941b966e0..39ba9080ea 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -1951,7 +1951,6 @@ hns3_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
idx, uint16_t nb_desc,
memset(&rxq->err_stats, 0, sizeof(struct hns3_rx_bd_errors_stats));
memset(&rxq->dfx_stats, 0, sizeof(struct hns3_rx_dfx_stats));
 
-   /* CRC len set here is used for amending packet length */
if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_KEEP_CRC)
rxq->crc_len = RTE_ETHER_CRC_LEN;
else
@@ -2383,23 +2382,6 @@ hns3_rxd_to_vlan_tci(struct hns3_rx_queue *rxq, struct 
rte_mbuf *mb,
}
 }
 
-static inline void
-recalculate_data_len(struct rte_mbuf *first_seg, struct rte_mbuf *last_seg,
-   struct rte_mbuf *rxm, struct hns3_rx_queue *rxq,
-   uint16_t data_len)
-{
-   uint8_t crc_len = rxq->crc_len;
-
-   if (data_len <= crc_len) {
-   rte_pktmbuf_free_seg(rxm);
-   first_seg->nb_segs--;
-   last_seg->data_len = (uint16_t)(last_seg->data_len -
-   (crc_len - data_len));
-   last_seg->next = NULL;
-   } else
-   rxm->data_len = (uint16_t)(data_len - crc_len);
-}
-
 static inline struct rte_mbuf *
 hns3_rx_alloc_buffer(struct hns3_rx_queue *rxq)
 {
@@ -2503,8 +2485,7 @@ hns3_recv_pkts_simple(void *rx_queue,
rxdp->rx.bd_base_info = 0;
 
rxm->data_off = RTE_PKTMBUF_HEADROOM;
-   rxm->pkt_len = (uint16_t)(rte_le_to_cpu_16(rxd.rx.pkt_len)) -
-   rxq->crc_len;
+   rxm->pkt_len = (uint16_t)(rte_le_to_cpu_16(rxd.rx.pkt_len));
rxm->data_len = rxm->pkt_len;
rxm->port = rxq->port_id;
rxm->hash.rss = rte_le_to_cpu_32(rxd.rx.rss_hash);
@@ -2531,8 +2512,8 @@ hns3_recv_pkts_simple(void *rx_queue,
 
hns3_rxd_to_vlan_tci(rxq, rxm, l234_info, &rxd);
 
-   /* Increment bytes counter  */
-   rxq->basic_stats.bytes += rxm->pkt_len;
+   /* All byte-related statistics do not include Ethernet FCS */
+   rxq->basic_stats.bytes += rxm->pkt_len - rxq->crc_len;
 
rx_pkts[nb_rx++] = rxm;
continue;
@@ -2695,10 +2676,10 @@ hns3_recv_scattered_pkts(void *rx_queue,
 
rxm->data_off = RTE_PKTMBUF_HEADROOM;
rxm->data_len = rte_le_to_cpu_16(rxd.rx.size);
+   rxm->next = NULL;
 
if (!(bd_base_info & BIT(HNS3_RXD_FE_B))) {
last_seg = rxm;
-   rxm->next = NULL;
continue;
}
 
@@ -2706,30 +2687,8 @@ hns3_recv_scattered_pkts(void *rx_queue,
if (unlikely(bd_base_info & BIT(HNS3_RXD_TS_VLD_B)))
hns3_rx_ptp_timestamp_handle(rxq, first_seg, timestamp);
 
-   /*
-* The last buffer of the received packet. packet len from
-* buffer description may contains CRC len, packet len should
-* subtract it, same as data len.
-*/
first_seg->pkt_len = rte_le_to_cpu_16(rxd.rx.pkt_len);
 
-   /*
-* This is the last buffer of the received packet. If the CRC
-* is not stripped by the hardware:
-*  - Subtract the CRC length from the total packet length.
-*  - If the last buffer only contains the whole CRC or a part
-*  of it, free the mbuf associated to the last buffer. If part
-*  of the CRC is also contained in the previous mbuf, subtract
-*  the length of that CRC part from the data length of the
-*  previous mbuf.
-*/
-   rxm->next = NULL;
-   if (unlikely(rxq->crc_len > 0)) {
-   first_seg->pkt_len -= rxq->crc_len;
-   recalculate_data_len(first_seg, last_seg, rxm, rxq,
-   rxm->data_len);
-   }
-
first_seg->port = rxq->port_id;
first_seg->hash.rss = rte_le_to_cpu_32(rxd.rx.rss_hash);
first_seg->ol_flags |= RTE_MBUF_F_RX_RSS_HASH;
@@ -2761,8 +2720,8 @@ hns3_recv_scattered_pkts(void *rx_queue,
 
hns3_rxd_to_vlan_tci(rxq, first_seg, l234_info, &rxd);
 
- 

[PATCH v2 3/3] net/hns3: fix Rx packet without CRC data

2024-07-18 Thread Jie Hai
From: Dengdui Huang 

When KEEP_CRC offload is enabled, the CRC data is still stripped
in following cases:
1. For HIP08 network engine, the packet type is TCP and the length
   is less than or equal to 60B.
2. For HIP09 network engine, the packet type is IP and the length
   is less than or equal to 60B.

So driver has to recaculate packet CRC for this rare scenarios.

In addition, to avoid impacting performance, KEEP_CRC is not
supported when NEON or SVE algorithm is used.

Fixes: 8973d7c4ca12 ("net/hns3: support keeping CRC")
Cc: sta...@dpdk.org

Signed-off-by: Dengdui Huang 
---
 drivers/net/hns3/hns3_ethdev.c   |  5 ++
 drivers/net/hns3/hns3_ethdev.h   | 23 +
 drivers/net/hns3/hns3_rxtx.c | 81 ++--
 drivers/net/hns3/hns3_rxtx.h |  3 ++
 drivers/net/hns3/hns3_rxtx_vec.c |  3 +-
 5 files changed, 111 insertions(+), 4 deletions(-)

diff --git a/drivers/net/hns3/hns3_ethdev.c b/drivers/net/hns3/hns3_ethdev.c
index ec1251cb7e..5fdabba547 100644
--- a/drivers/net/hns3/hns3_ethdev.c
+++ b/drivers/net/hns3/hns3_ethdev.c
@@ -2739,6 +2739,7 @@ hns3_get_capability(struct hns3_hw *hw)
hw->udp_cksum_mode = HNS3_SPECIAL_PORT_SW_CKSUM_MODE;
pf->support_multi_tc_pause = false;
hw->rx_dma_addr_align = HNS3_RX_DMA_ADDR_ALIGN_64;
+   hw->strip_crc_ptype = HNS3_STRIP_CRC_PTYPE_TCP;
return 0;
}
 
@@ -2760,6 +2761,10 @@ hns3_get_capability(struct hns3_hw *hw)
hw->udp_cksum_mode = HNS3_SPECIAL_PORT_HW_CKSUM_MODE;
pf->support_multi_tc_pause = true;
hw->rx_dma_addr_align = HNS3_RX_DMA_ADDR_ALIGN_128;
+   if (hw->revision == PCI_REVISION_ID_HIP09_A)
+   hw->strip_crc_ptype = HNS3_STRIP_CRC_PTYPE_IP;
+   else
+   hw->strip_crc_ptype = HNS3_STRIP_CRC_PTYPE_NONE;
 
return 0;
 }
diff --git a/drivers/net/hns3/hns3_ethdev.h b/drivers/net/hns3/hns3_ethdev.h
index 799b61038a..801411d690 100644
--- a/drivers/net/hns3/hns3_ethdev.h
+++ b/drivers/net/hns3/hns3_ethdev.h
@@ -56,6 +56,10 @@
 #define HNS3_SPECIAL_PORT_SW_CKSUM_MODE 0
 #define HNS3_SPECIAL_PORT_HW_CKSUM_MODE 1
 
+#define HNS3_STRIP_CRC_PTYPE_NONE 0
+#define HNS3_STRIP_CRC_PTYPE_TCP  1
+#define HNS3_STRIP_CRC_PTYPE_IP   2
+
 #define HNS3_UC_MACADDR_NUM128
 #define HNS3_VF_UC_MACADDR_NUM 48
 #define HNS3_MC_MACADDR_NUM128
@@ -657,6 +661,25 @@ struct hns3_hw {
 */
uint8_t udp_cksum_mode;
 
+   /*
+* When KEEP_CRC offload is enabled, the CRC data of some type packets
+* whose length is less than or equal to HNS3_KEEP_CRC_OK_MIN_PKT_LEN
+* is still be stripped on some network engine. So here has to use this
+* field to distinguish they difference between different network 
engines.
+* value range:
+*  - HNS3_STRIP_CRC_PTYPE_TCP
+* This value for HIP08 network engine.
+* Indicates that only the IP-TCP packet type is stripped.
+*
+*  - HNS3_STRIP_CRC_PTYPE_IP
+* This value for HIP09 network engine.
+* Indicates that all IP packet types are stripped.
+*
+*  - HNS3_STRIP_CRC_PTYPE_NONE
+* Indicates that all packet types are not stripped.
+*/
+   uint8_t strip_crc_ptype;
+
struct hns3_port_base_vlan_config port_base_vlan_cfg;
 
pthread_mutex_t flows_lock; /* rte_flow ops lock */
diff --git a/drivers/net/hns3/hns3_rxtx.c b/drivers/net/hns3/hns3_rxtx.c
index 39ba9080ea..e936c0799f 100644
--- a/drivers/net/hns3/hns3_rxtx.c
+++ b/drivers/net/hns3/hns3_rxtx.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #if defined(RTE_ARCH_ARM64)
 #include 
 #include 
@@ -1766,8 +1767,9 @@ hns3_rx_buf_len_calc(struct rte_mempool *mp, uint16_t 
*rx_buf_len)
 }
 
 static int
-hns3_rxq_conf_runtime_check(struct hns3_hw *hw, uint16_t buf_size,
-   uint16_t nb_desc)
+hns3_rxq_conf_runtime_check(struct hns3_hw *hw,
+   const struct rte_eth_rxconf *conf,
+   uint16_t buf_size, uint16_t nb_desc)
 {
struct rte_eth_dev *dev = &rte_eth_devices[hw->data->port_id];
eth_rx_burst_t pkt_burst = dev->rx_pkt_burst;
@@ -1800,6 +1802,14 @@ hns3_rxq_conf_runtime_check(struct hns3_hw *hw, uint16_t 
buf_size,
return -EINVAL;
}
}
+
+   if ((conf->offloads & RTE_ETH_RX_OFFLOAD_KEEP_CRC) &&
+   pkt_burst != hns3_recv_pkts_simple &&
+   pkt_burst != hns3_recv_scattered_pkts) {
+   hns3_err(hw, "KEEP_CRC offload is not supported in the current 
rx function.");
+   return -EINVAL;
+   }
+
return 0;
 }
 
@@ -1836,7 +1846,7 @@ hns3_rx_queue_conf_check(struct hns3_hw *hw, const struct 
rte_eth_rxconf *conf,
}
 
if (hw->data->dev_started) {
-   ret = hns3_rxq

RE: [PATCH v2 1/3] ethdev: add description for KEEP CRC offload

2024-07-18 Thread Morten Brørup
> From: Jie Hai [mailto:haij...@huawei.com]
> Sent: Thursday, 18 July 2024 13.48
> 
> From: Dengdui Huang 
> 
> The data execeed the pkt_len in mbuf is inavailable for user.
> When KEEP CRC offload is enabled, CRC field length should be
> included pkt_len in mbuf. However, almost of drivers supported
> KEEP CRC feature didn't add the CRC data length to pkt_len.
> So it is very necessary to add a coments for this.
> 
> Fixes: 70815c9ecadd ("ethdev: add new offload flag to keep CRC")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Dengdui Huang 
> ---
>  lib/ethdev/rte_ethdev.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 548fada1c7..3d44673161 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -1550,6 +1550,12 @@ struct rte_eth_conf {
>   */
>  #define RTE_ETH_RX_OFFLOAD_TIMESTAMPRTE_BIT64(14)
>  #define RTE_ETH_RX_OFFLOAD_SECURITY RTE_BIT64(15)
> +/*

Suggest making this a Doxygen comment: /**

> + * Keep CRC data in packet.
> + *
> + * Note: If this offload is enabled, the pkt_len in mbuf should contain

"should contain" -> "must include"

> + * the CRC data length.
> + */
>  #define RTE_ETH_RX_OFFLOAD_KEEP_CRC RTE_BIT64(16)
>  #define RTE_ETH_RX_OFFLOAD_SCTP_CKSUM   RTE_BIT64(17)
>  #define RTE_ETH_RX_OFFLOAD_OUTER_UDP_CKSUM  RTE_BIT64(18)
> --
> 2.33.0

With above modifications,
Acked-by: Morten Brørup 



RE: [PATCH] app/testpmd: add postpone option to async flow destroy

2024-07-18 Thread Dariusz Sosnowski
> -Original Message-
> From: Alexander Kozyrev 
> Sent: Wednesday, July 17, 2024 16:20
> To: dev@dpdk.org
> Cc: sta...@dpdk.org; Raslan Darawsheh ; Slava Ovsiienko
> ; Matan Azrad ; Dariusz
> Sosnowski ; Bing Zhao ; Ori Kam
> ; Suanming Mou 
> Subject: [PATCH] app/testpmd: add postpone option to async flow destroy
> 
> The potpone option is not available in the async flow destroy CLI.

Typo: s/potpone/postpone/

> Only flow creation can be postponed in the testpmd application.
> Insert this option into the async flow destroy CLI before the rule ID.
> 
> Fixes: ecdc927b99 ("app/testpmd: add async flow create/destroy operations")

Could you please add Cc: sta...@dpdk.org?
It's better that we keep this tag in git log. It helps LTS maintainers with 
backports.

> 
> Signed-off-by: Alexander Kozyrev 
> ---
>  app/test-pmd/cmdline_flow.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index
> a76b44bf39..fb6a552863 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -3703,7 +3703,7 @@ static const struct token token_list[] = {
>   [QUEUE_DESTROY] = {
>   .name = "destroy",
>   .help = "destroy a flow rule",
> - .next = NEXT(NEXT_ENTRY(QUEUE_DESTROY_ID),
> + .next = NEXT(NEXT_ENTRY(QUEUE_DESTROY_POSTPONE),
>NEXT_ENTRY(COMMON_QUEUE_ID)),
>   .args = ARGS(ARGS_ENTRY(struct buffer, queue)),
>   .call = parse_qo_destroy,
> --
> 2.18.2

Best regards,
Dariusz Sosnowski


Re: [PATCH v2 0/3] bugfix about KEEP CRC offload

2024-07-18 Thread lihuisong (C)

For series.
Acked-by: Huisong Li 


在 2024/7/18 19:48, Jie Hai 写道:

From: Dengdui Huang 

For hns3 NIC, when KEEP_CRC offload is enabled, the CRC data is still be 
stripped in rare scenarios.
Some users of hns3 are already using this feature. So driver has to recaculate 
packet CRC.

In addition, in the mbuf, the data that exceeds the length specified by pkt_len 
is invalid.
Therefore, if the packet contains CRC data, pkt_len should contain the CRC data 
length.
However, almost of drivers supported KEEP CRC feature didn't add the CRC data 
length to pkt_len.
This patch adds description for KEEP CRC offload.

Dengdui Huang (3):
   ethdev: add description for KEEP CRC offload
   net/hns3: fix packet length do not contain CRC data length
   net/hns3: fix Rx packet without CRC data

  drivers/net/hns3/hns3_ethdev.c|   5 +
  drivers/net/hns3/hns3_ethdev.h|  23 +
  drivers/net/hns3/hns3_rxtx.c  | 134 --
  drivers/net/hns3/hns3_rxtx.h  |   3 +
  drivers/net/hns3/hns3_rxtx_vec.c  |   3 +-
  drivers/net/hns3/hns3_rxtx_vec_neon.h |  19 
  drivers/net/hns3/hns3_rxtx_vec_sve.c  |   3 +-
  lib/ethdev/rte_ethdev.h   |   6 ++
  8 files changed, 124 insertions(+), 72 deletions(-)



RE: [PATCH] doc: deprecate graph data structures

2024-07-18 Thread Pavan Nikhilesh Bhagavatula
Thomas, 

Ping can this deprecation notice be merged


> Acked-by: Zhirun Yan 
> 
> > -Original Message-
> > From: Nithin Dabilpuram 
> > Sent: Tuesday, March 19, 2024 1:21 PM
> > To: pbhagavat...@marvell.com
> > Cc: jer...@marvell.com; ndabilpu...@marvell.com;
> kirankum...@marvell.com;
> > Yan, Zhirun ; dev@dpdk.org
> > Subject: Re: [PATCH] doc: deprecate graph data structures
> >
> > Acked-by: Nithin Dabilpuram 
> >
> > On Wed, Feb 21, 2024 at 9:50 PM  wrote:
> > >
> > > From: Pavan Nikhilesh 
> > >
> > > Deprecate rte_node, rte_node_register and rte_graph_cluster_node_stats
> > > structures as will be extended to include node specific error counters
> > > and error description.
> > >
> > > Signed-off-by: Pavan Nikhilesh 
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 5 +
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/doc/guides/rel_notes/deprecation.rst
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index 10630ba255..b3dfd06ed6 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -142,3 +142,8 @@ Deprecation Notices
> > >will be deprecated and subsequently removed in DPDK 24.11 release.
> > >Before this, the new port library API (functions rte_swx_port_*)
> > >will gradually transition from experimental to stable status.
> > > +
> > > +* graph: The graph library data structures will be modified to
> > > +  support node specific errors, the structures ``rte_node``,
> > > +  ``rte_node_register`` and ``rte_graph_cluster_node_stats`` will be
> > > +  extended to include node error counters and error description.
> > > --
> > > 2.25.1
> > >


[PATCH v2 0/4] Fix spelling mistakes

2024-07-18 Thread Joel Kavanagh
Fix up crypto spelling mistakes.

v2: Added more typo fixes to patchset

Joel Kavanagh (4):
  crypto/aesni_mb: fix typo in error message
  app/test: fix typo in error message allocation
  crypto/qat: fix typo in log message
  doc: fix typo in l2fwd-crypto guide

 .mailmap   | 1 +
 app/test/test_cryptodev.c  | 6 +++---
 doc/guides/sample_app_ug/l2_forward_crypto.rst | 2 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c | 2 +-
 drivers/crypto/qat/qat_sym.c   | 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

-- 
2.34.1



[PATCH v2 1/4] crypto/aesni_mb: fix typo in error message

2024-07-18 Thread Joel Kavanagh
This patch fixes a typo in the log message for error allocation.
The typo incorrectly spelled 'allocating' as 'allocationg' in the
log message for error allocation.

Fixes: f9dfb59edbcc ("crypto/ipsec_mb: support remaining SGL")
Cc: sta...@dpdk.org

Signed-off-by: Joel Kavanagh 
Acked-by: Brian Dooley 
---
v2: Added more typo fixes to patchset
---
 .mailmap   | 1 +
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index 9d0e1380cf..6822c54376 100644
--- a/.mailmap
+++ b/.mailmap
@@ -678,6 +678,7 @@ Jin Yu 
 Jiri Slaby 
 Job Abraham 
 Jochen Behrens  
+Joel Kavanagh 
 Joey Xing 
 Johan Faltstrom 
 Johan Källström 
diff --git a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c 
b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
index d74946c180..ef4228bd38 100644
--- a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
+++ b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
@@ -1534,7 +1534,7 @@ aesni_mb_digest_appended_in_src(struct rte_crypto_op *op, 
IMB_JOB *job,
  *
  * @return
  * - 0 on success, the IMB_JOB will be filled
- * - -1 if invalid session or errors allocationg SGL linear buffer,
+ * - -1 if invalid session or errors allocating SGL linear buffer,
  *   IMB_JOB will not be filled
  */
 static inline int
-- 
2.34.1



[PATCH v2 2/4] app/test: fix typo in error message allocation

2024-07-18 Thread Joel Kavanagh
This patch fixes a typo in the `test_cryptodev.c` file where "out-op" was
incorrectly used instead of "out-of-place" on three separate occassions.

Fixes: f3dbf94be60c ("app/test: check SGL on QAT")
Fixes: 43220096d66a ("test/crypto: add PDCP cases for scatter gather")
Cc: sta...@dpdk.org

Signed-off-by: Joel Kavanagh 
---
 app/test/test_cryptodev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 6042db36a4..c846b26ed1 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -9428,7 +9428,7 @@ static int test_pdcp_proto(int i, int oop, enum 
rte_crypto_cipher_operation opc,
/* Out of place support */
if (oop) {
/*
-* For out-op-place we need to alloc another mbuf
+* For out-of-place we need to alloc another mbuf
 */
ut_params->obuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
rte_pktmbuf_append(ut_params->obuf, output_vec_len);
@@ -9637,7 +9637,7 @@ test_pdcp_proto_SGL(int i, int oop,
/* Out of place support */
if (oop) {
/*
-* For out-op-place we need to alloc another mbuf
+* For out-of-place we need to alloc another mbuf
 */
ut_params->obuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
rte_pktmbuf_append(ut_params->obuf, frag_size_oop);
@@ -16831,7 +16831,7 @@ test_authenticated_encryption_SGL(const struct 
aead_test_data *tdata,
}
 
/*
-* For out-op-place we need to alloc another mbuf
+* For out-of-place we need to alloc another mbuf
 */
if (oop) {
ut_params->obuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
-- 
2.34.1



[PATCH v2 3/4] crypto/qat: fix typo in log message

2024-07-18 Thread Joel Kavanagh
This patch fixes a typo in the log message for rte_security support.
The typo incorrectly spelled 'enabled' as 'ensabled' in the
log message indicating that rte_security support is enabled.

Fixes: fb3b9f492205 ("crypto/qat: rework burst data path")
Cc: sta...@dpdk.org

Signed-off-by: Joel Kavanagh 
---
 drivers/crypto/qat/qat_sym.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/qat/qat_sym.c b/drivers/crypto/qat/qat_sym.c
index b41d1b1def..d979ae6489 100644
--- a/drivers/crypto/qat/qat_sym.c
+++ b/drivers/crypto/qat/qat_sym.c
@@ -291,7 +291,7 @@ qat_sym_dev_create(struct qat_pci_device *qat_pci_dev)
}
 
cryptodev->feature_flags |= RTE_CRYPTODEV_FF_SECURITY;
-   QAT_LOG(INFO, "Device %s rte_security support ensabled", name);
+   QAT_LOG(INFO, "Device %s rte_security support enabled", name);
} else {
QAT_LOG(INFO, "Device %s rte_security support disabled", name);
}
-- 
2.34.1



[PATCH v2 4/4] doc: fix typo in l2fwd-crypto guide

2024-07-18 Thread Joel Kavanagh
This patch fixes a typo in the l2fwd-crypto documentation where
the l2fwd-crypt was changed to l2fwd-crypto.

Fixes: 7cacb0565539 ("doc: add generic build instructions for sample apps")
Cc: sta...@dpdk.org

Signed-off-by: Joel Kavanagh 
---
 doc/guides/sample_app_ug/l2_forward_crypto.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/sample_app_ug/l2_forward_crypto.rst 
b/doc/guides/sample_app_ug/l2_forward_crypto.rst
index ce49eab96f..7ff304d05c 100644
--- a/doc/guides/sample_app_ug/l2_forward_crypto.rst
+++ b/doc/guides/sample_app_ug/l2_forward_crypto.rst
@@ -30,7 +30,7 @@ Compiling the Application
 
 To compile the sample application see :doc:`compiling`.
 
-The application is located in the ``l2fwd-crypt`` sub-directory.
+The application is located in the ``l2fwd-crypto`` sub-directory.
 
 Running the Application
 ---
-- 
2.34.1



FDIR packet distribution with specific multiple RX queues.

2024-07-18 Thread Raghavan V
Hi Team,

Is there any way to distribute packets evenly (Like RSS) to specific multiple 
RX queues in RTE_FLOW_ACTION_TYPE_QUEUE DPDK Flow director?

Desired action:

uint16_t queue_indices[] = {10, 11, 12, 13, 14, 15};
struct rte_flow_action_queue queue = {.index = queue_indices};
struct rte_flow_action action[]={
[0]={.type = RTE_FLOW_ACTION_TYPE_QUEUE,.conf = &queue},
[1]={.type = RTE_FLOW_ACTION_TYPE_END}
};
Is this action limited to drivers specific?

Example:
I have 40 RX queues I need to match an IP address pattern and the matching 
packets should direct to 1-10 RX queue indices.


If not, please suggest a way to direct packets that matches a particular 
pattern to specific multiple RX queues without the use of RSS.
I need the matching packets to distribute evenly to specific RX queues and not 
the duplicates. I'm using Intel x710 NIC i40e and DPDK 20.11.

Can anyone help on this query.

Thanks,
Raghavan V


IPv6 APIs rework

2024-07-18 Thread Robin Jarry

Hi folks,

while working on IPv6 support for grout [1], I noticed that all DPDK 
IPv6 APIs used fixed sized arrays in the route lookup functions [2].


int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
 uint8_t ips[][RTE_FIB6_IPV6_ADDR_SIZE],
 uint64_t *next_hops,
 int n);

If I'm not mistaken, using sized arrays in function signatures is only 
for documentation purposes and does not result in any specific compiler 
checks. In the above example, the ips parameter is considered as a plain 
old `uint8_t **` pointer.


Also, not having a dedicated type for IPv6 addresses requires obscure 
pointer arithmetic [3] and casting [4].


I'd like to introduce a real IPv6 address structure that has the same 
alignment than a dumb `uint8_t *` pointer but has an union to ease 
casting and most importantly presents the whole thing as an explicit 
typed structure:


   #define RTE_IPV6_ADDR_SIZE 16

   struct rte_ipv6_addr {
   union {
   uint8_t u8[RTE_IPV6_ADDR_SIZE];
   uint16_t u16[RTE_IPV6_ADDR_SIZE / sizeof(uint16_t)];
   uint32_t u32[RTE_IPV6_ADDR_SIZE / sizeof(uint32_t)];
   uint64_t u64[RTE_IPV6_ADDR_SIZE / sizeof(uint64_t)];
   };
   } __rte_packed __rte_aligned(1);

This would require some breakage of the APIs but I think it would 
benefit code readability and maintainability in the long term.


int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
 const struct rte_ipv6_addr *ips,
 uint64_t *next_hops,
 int n);

I already have a semi-working draft and am in the process of splitting 
the changes into small chunks to make them easier to review.


https://github.com/DPDK/dpdk/compare/main...rjarry:dpdk:ipv6-address-rework

Is that something that would be of interest? If yes, I would like to 
announce API breakage before the release of 24.07 so that the changes 
can be integrated into 24.11.


Cheers!

[1] https://github.com/rjarry/grout
[2] https://doc.dpdk.org/api/rte__fib6_8h.html#a924678410ccb9551cda3e75d742a11e3
[3] https://git.dpdk.org/dpdk/tree/lib/fib/trie_avx512.c?h=v24.07-rc2#n340
[4] https://git.dpdk.org/dpdk/tree/lib/hash/rte_thash.h?h=v24.07-rc2#n156

--
Robin



RE: [PATCH v1] crypto/qat: add fix for Gen4 WRITE

2024-07-18 Thread Akhil Goyal
> Acked-by: Arkadiusz Kusztal 
> > All generations of QAT use the same Gen1 raw datapath. Gen4 needs a 
> > different
> > WRITE function than other generations. Added separation for configuration of
> > the raw ctx for Gen4 from the Gen1 codepath.
> >
> > Fixes: 85fec6fd9674 ("crypto/qat: unify raw data path functions")
> > Cc: kai...@intel.com
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Brian Dooley 
Applied to dpdk-next-crypto


RE: [EXTERNAL] [PATCH v2 1/4] crypto/aesni_mb: fix typo in error message

2024-07-18 Thread Akhil Goyal
> Subject: [EXTERNAL] [PATCH v2 1/4] crypto/aesni_mb: fix typo in error message
> 
> This patch fixes a typo in the log message for error allocation.
> The typo incorrectly spelled 'allocating' as 'allocationg' in the
> log message for error allocation.
> 
> Fixes: f9dfb59edbcc ("crypto/ipsec_mb: support remaining SGL")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Joel Kavanagh 
> Acked-by: Brian Dooley 
Series Acked-by: Akhil Goyal 
Applied to dpdk-next-crypto

Thanks for the fixes.


RE: [PATCH v1] test/crypto: remove unused stats in test setup

2024-07-18 Thread Akhil Goyal
> Subject: [PATCH v1] test/crypto: remove unused stats in test setup
> 
> Remove unused stats in test setup.
> 
> Coverity issue: 373869
> Fixes: 2c6dab9cd93 ("test/crypto: add RSA and Mod tests")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Gowrishankar Muthukrishnan 
Applied to dpdk-next-crypto
Thanks.


RE: [PATCH v1] test/crypto: fix asymmetric capability test

2024-07-18 Thread Akhil Goyal
> Subject: [PATCH v1] test/crypto: fix asymmetric capability test
> 
> Fix asymmetric capability test for below:
>  * Skip test if asymmetric crypto feature is not supported by device.
>  * Assert return value of RTE function to get asymmetric capability.
> 
> Coverity issue: 373365
> Fixes: 2c6dab9cd93 ("test/crypto: add RSA and Mod tests")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Gowrishankar Muthukrishnan 
Applied to dpdk-next-crypto
Thanks.


Re: FDIR packet distribution with specific multiple RX queues.

2024-07-18 Thread Stephen Hemminger
On Thu, 18 Jul 2024 16:39:35 +
Raghavan V  wrote:

> Hi Stephen,
> Thanks for your response.
> 
> As our application has limitations while using RSS,
> I would prefer a similar approach to RTE_ACTION_TYPE_QUEUE.

There is no action like this since hardware does not support it.
Multi queue support comes from the original queue support which was
done as collaboration between Intel and Microsoft (for Windows).
Then Linux got multi-queue (RSS) support.
Then DPDK got similar multi-queue.

Packet spraying across queues would be bad since it would create
out of order packets, and also lots of lock contention on any shared
per flow resource.


[PATCH] test: fix 32 bit overflow in pcapng test

2024-07-18 Thread Stephen Hemminger
The conversion from seconds to nanoseconds in the pcapng test
would overflow on 32 bit platforms leading to this test failing.

Reported-by: Luca Boccassi 
Signed-off-by: Stephen Hemminger 
---
 app/test/test_pcapng.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
index 89535efad0..2665b08c76 100644
--- a/app/test/test_pcapng.c
+++ b/app/test/test_pcapng.c
@@ -235,7 +235,7 @@ parse_pcap_packet(u_char *user, const struct pcap_pkthdr *h,
 * but the file is open in nanonsecond mode therefore
 * the timestamp is really in timespec (ie. nanoseconds).
 */
-   ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
+   ns = (uint64_t)h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
if (ns < ctx->start_ns || ns > ctx->end_ns) {
char tstart[128], tend[128];
 
-- 
2.43.0



Re: [PATCH] net/gve: Update Rx/Tx functions for RTE_PROC_SECONDARY

2024-07-18 Thread Joshua Washington
Acked-by: Joshua Washington 

Thanks!


[PATCH 1/2] doc: add note about CPU 0

2024-07-18 Thread Stephen Hemminger
On Linux (and probably BSD), CPU 0 can not be fully isolated
because it receives timer interrupts and is used for other
kernel related functions. The DPDK documentation should
be updated to tell users to avoid polling on that CPU.

Signed-off-by: Stephen Hemminger 
---
 doc/guides/linux_gsg/enable_func.rst | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/doc/guides/linux_gsg/enable_func.rst 
b/doc/guides/linux_gsg/enable_func.rst
index 5511640cb8..dc33ffc718 100644
--- a/doc/guides/linux_gsg/enable_func.rst
+++ b/doc/guides/linux_gsg/enable_func.rst
@@ -131,9 +131,14 @@ from running on those cores, it is possible to use
 the Linux kernel parameters ``isolcpus``, ``nohz_full``, ``irqaffinity``
 to isolate them from the general Linux scheduler tasks.
 
+.. note::
+
+It is not recommended to use CPU core 0 for DPDK polling applications
+because it can not be isolated from other system and kernel activity.
+
 For example, if a given CPU has 0-7 cores
-and DPDK applications are to run on logical cores 2, 4 and 6,
-the following should be added to the kernel parameter list:
+and DPDK applications are to run on logical cores 2, 4 and 6.
+The following should be added to the kernel parameter list:
 
 .. code-block:: console
 
-- 
2.43.0



[PATCH 2/2] doc: remove use of -n 4 option in documentation

2024-07-18 Thread Stephen Hemminger
Many places in the documentation are using -n 4 to set
the number of memory channels. This should not be recommended
since it is not always right and the default should be used
instead.

Signed-off-by: Stephen Hemminger 
---
 doc/guides/contributing/documentation.rst |  2 +-
 doc/guides/cryptodevs/aesni_gcm.rst   |  2 +-
 doc/guides/cryptodevs/aesni_mb.rst|  2 +-
 doc/guides/cryptodevs/ccp.rst |  4 ++--
 doc/guides/cryptodevs/kasumi.rst  |  2 +-
 doc/guides/cryptodevs/null.rst|  2 +-
 doc/guides/cryptodevs/openssl.rst |  2 +-
 doc/guides/cryptodevs/snow3g.rst  |  2 +-
 doc/guides/cryptodevs/zuc.rst |  2 +-
 doc/guides/howto/lm_bond_virtio_sriov.rst |  2 +-
 doc/guides/howto/lm_virtio_vhost_user.rst |  2 +-
 doc/guides/howto/packet_capture_framework.rst |  2 +-
 doc/guides/howto/pvp_reference_benchmark.rst  |  4 ++--
 doc/guides/howto/vfd.rst  |  4 ++--
 .../virtio_user_for_container_networking.rst  |  4 ++--
 doc/guides/linux_gsg/build_sample_apps.rst|  2 +-
 doc/guides/linux_gsg/linux_drivers.rst|  4 ++--
 doc/guides/nics/build_and_test.rst|  2 +-
 doc/guides/nics/cpfl.rst  |  2 +-
 doc/guides/nics/cxgbe.rst |  2 +-
 doc/guides/nics/fail_safe.rst |  8 
 doc/guides/nics/hns3.rst  |  4 ++--
 doc/guides/nics/i40e.rst  |  6 +++---
 doc/guides/nics/ice.rst   |  4 ++--
 doc/guides/nics/intel_vf.rst  |  2 +-
 doc/guides/nics/ipn3ke.rst|  4 ++--
 doc/guides/nics/mlx4.rst  |  2 +-
 doc/guides/nics/mlx5.rst  |  2 +-
 doc/guides/nics/null.rst  |  6 +++---
 doc/guides/nics/pcap_ring.rst | 18 -
 doc/guides/nics/qede.rst  |  2 +-
 doc/guides/nics/softnic.rst   |  2 +-
 doc/guides/nics/thunderx.rst  |  2 +-
 doc/guides/nics/vhost.rst |  2 +-
 doc/guides/nics/virtio.rst|  2 +-
 .../link_bonding_poll_mode_drv_lib.rst| 10 +-
 doc/guides/prog_guide/overview.rst| 11 +-
 doc/guides/sample_app_ug/cmd_line.rst |  2 +-
 doc/guides/sample_app_ug/dist_app.rst |  2 +-
 doc/guides/sample_app_ug/hello_world.rst  |  2 +-
 doc/guides/sample_app_ug/ipsec_secgw.rst  |  4 ++--
 doc/guides/sample_app_ug/keep_alive.rst   |  2 +-
 doc/guides/sample_app_ug/l2_forward_cat.rst   |  4 ++--
 doc/guides/sample_app_ug/l2_forward_event.rst |  8 
 .../sample_app_ug/l2_forward_job_stats.rst|  2 +-
 .../sample_app_ug/l2_forward_real_virtual.rst |  4 ++--
 doc/guides/sample_app_ug/l3_forward.rst   |  8 
 doc/guides/sample_app_ug/l3_forward_graph.rst |  6 +++---
 doc/guides/sample_app_ug/link_status_intr.rst |  2 +-
 doc/guides/sample_app_ug/multi_process.rst| 20 +--
 doc/guides/sample_app_ug/ptpclient.rst|  2 +-
 doc/guides/sample_app_ug/qos_scheduler.rst|  4 ++--
 doc/guides/sample_app_ug/rxtx_callbacks.rst   |  2 +-
 doc/guides/sample_app_ug/skeleton.rst |  2 +-
 doc/guides/sample_app_ug/timer.rst|  2 +-
 doc/guides/sample_app_ug/vdpa.rst |  2 +-
 doc/guides/sample_app_ug/vhost.rst|  2 +-
 .../sample_app_ug/vm_power_management.rst |  6 +++---
 .../sample_app_ug/vmdq_dcb_forwarding.rst |  2 +-
 doc/guides/sample_app_ug/vmdq_forwarding.rst  |  2 +-
 doc/guides/testpmd_app_ug/run_app.rst |  2 +-
 doc/guides/tools/flow-perf.rst|  2 +-
 62 files changed, 117 insertions(+), 116 deletions(-)

diff --git a/doc/guides/contributing/documentation.rst 
b/doc/guides/contributing/documentation.rst
index 68454ae0d5..ff7f0aca98 100644
--- a/doc/guides/contributing/documentation.rst
+++ b/doc/guides/contributing/documentation.rst
@@ -253,7 +253,7 @@ Line Length and Wrapping
   Long literal command lines can be shown wrapped with backslashes. For
   example::
 
- dpdk-testpmd -l 2-3 -n 4 \
+ dpdk-testpmd -l 2-3 \
  --vdev=virtio_user0,path=/dev/vhost-net,queues=2,queue_size=1024 \
  -- -i --tx-offloads=0x002c --enable-lro --txq=2 --rxq=2 \
  --txd=1024 --rxd=1024
diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index 3af1486553..ae2c3fbcbe 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -103,5 +103,5 @@ Example:
 
 .. code-block:: console
 
-./dpdk-l2fwd-crypto -l 1 -n 4 
--vdev="crypto_aesni_gcm,socket_id=0,max_nb_sessions=128" \
+./dpdk-l2fwd-crypto -l 1 
--vdev="crypto_aesni_gcm,socket_id=0,max_nb_sessions=128" \
 -- -p 1 --cdev SW --chain AEAD --aead_algo "aes-gcm"
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs

RE: [RFC v2] ethdev: an API for cache stashing hints

2024-07-18 Thread Wathsala Wathawana Vithanage
> 
> My initial reaction is negative on this. The DPDK does not need more nerd
> knobs for performance. If it is a performance win, it should be automatic and
> handled by the driver.
> 
> If you absolutely have to have another flag, then it should be in existing 
> config
> (yes, extend the ABI) rather than adding more flags and calls in ethdev.


Thanks, Steve, for the feedback. My thesis is that in a DPDK-based packet 
processing system,
the application is more knowledgeable of memory buffer (packets) usage than the 
generic
underlying hardware or the PMD (I have provided some examples below with the 
hint they
would map into). Recognizing such cases, PCI SIG introduced TLP Packet 
Processing Hints (TPH).
Consequently, many interconnect designers enabled support for TPH in their 
interconnects so
that based on steering tags provided by an application to a NIC, which sets 
them in the TLP
header, memory buffers can be targeted toward a CPU at the desired level in the 
cache hierarchy.
With this proposed API, applications provide cache-stashing hints to ethernet 
devices to improve
memory access latencies from the CPU and the NIC to improve system performance.

Listed below are some use cases.

- A run-to-completion application may not need the next packet immediately in 
L1D. It may rather
issue a prefetch and do other work with packet and application data already in 
L1D before it needs
the next packet. A generic PMD will not know such subtleties in the application 
endpoint, and it
would resolve to stash buffers into the L1D indiscriminately or not do it at 
all. But, with a hint from
the application that buffers of the packets will be stashed at a cache level 
suitable for the
application. (like UNIX MADV_DONOTNEED but for mbufs at cache line granularity)

- Similarly, a pipelined application may use a hint that advice the buffers are 
needed in L1D as soon
as they arrive. (parallels MADV_WILLNEED)

- Let's call the time between a mbuf being allocated into an Rx queue, freed 
back into mempool in
the Tx path, and once again reallocated back in the Same Rx queue the "buffer 
recycle window". 
The length of the buffer recycle window is a function of the application in 
question; the PMD or the
NIC has no prior knowledge of this property of an application. A buffer may 
stay in the L1D of a CPU
throughout the entire recycle window if the window is short enough for that 
application.
An application with a short buffer recycle window may hint to the platform that 
the Tx buffer is not
needed anytime soon in the CPU cache via a hint to avoid unnecessary cache 
invalidations when
the buffer gets written by the Rx packet for the second time. (parallels 
MADV_DONOTNEED)


Re: [PATCH] test: fix 32 bit overflow in pcapng test

2024-07-18 Thread Luca Boccassi
On Thu, 18 Jul 2024 at 18:43, Stephen Hemminger
 wrote:
>
> The conversion from seconds to nanoseconds in the pcapng test
> would overflow on 32 bit platforms leading to this test failing.
>
> Reported-by: Luca Boccassi 
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test/test_pcapng.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c
> index 89535efad0..2665b08c76 100644
> --- a/app/test/test_pcapng.c
> +++ b/app/test/test_pcapng.c
> @@ -235,7 +235,7 @@ parse_pcap_packet(u_char *user, const struct pcap_pkthdr 
> *h,
>  * but the file is open in nanonsecond mode therefore
>  * the timestamp is really in timespec (ie. nanoseconds).
>  */
> -   ns = h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
> +   ns = (uint64_t)h->ts.tv_sec * NS_PER_S + h->ts.tv_usec;
> if (ns < ctx->start_ns || ns > ctx->end_ns) {
> char tstart[128], tend[128];

Thanks this fixes the issue, the build is now green

Tested-by: Luca Boccassi 


[PATCH v3 0/3] Mac Filter Port to New DTS

2024-07-18 Thread Nicholas Pratte
v3:
  * Minor adjustments to doc strings
  * Return labels added for testpmd methods

Nicholas Pratte (3):
  dts: add boolean to adjust addresses
  dts: add methods for setting mac and multicast addresses
  dts: mac filter test suite refactored for new dts

 dts/framework/config/conf_yaml_schema.json|   3 +-
 dts/framework/remote_session/testpmd_shell.py | 179 ++
 dts/framework/test_suite.py   |   7 +-
 dts/tests/TestSuite_mac_filter.py | 223 ++
 4 files changed, 410 insertions(+), 2 deletions(-)
 create mode 100644 dts/tests/TestSuite_mac_filter.py

-- 
2.44.0



[PATCH 1/4] test: update alarm test

2024-07-18 Thread Stephen Hemminger
This test should be using the TEST_ASSERT macros, and can be
run as part of the fast test suite now.

Signed-off-by: Stephen Hemminger 
---
 app/test/test_alarm.c | 53 ---
 1 file changed, 20 insertions(+), 33 deletions(-)

diff --git a/app/test/test_alarm.c b/app/test/test_alarm.c
index 70e97a3109..4ba8aa1af2 100644
--- a/app/test/test_alarm.c
+++ b/app/test/test_alarm.c
@@ -10,7 +10,8 @@
 
 #include "test.h"
 
-#ifndef RTE_EXEC_ENV_WINDOWS
+#define US_PER_SEC 100
+
 static volatile int flag;
 
 static void
@@ -19,46 +20,32 @@ test_alarm_callback(void *cb_arg)
flag = 1;
printf("Callback setting flag - OK. [cb_arg = %p]\n", cb_arg);
 }
-#endif
 
 static int
 test_alarm(void)
 {
-#ifdef RTE_EXEC_ENV_FREEBSD
-   printf("The alarm API is not supported on FreeBSD\n");
-   return 0;
-#endif
+   int ret;
+
+   ret = rte_eal_alarm_set(0, test_alarm_callback, NULL);
+   TEST_ASSERT_FAIL(ret, "should not be succeed with 0 us value");
+
+   ret = rte_eal_alarm_set(UINT64_MAX - 1, test_alarm_callback, NULL);
+   TEST_ASSERT_FAIL(ret, "should not be succeed with (UINT64_MAX-1) us 
value");
+
+   ret = rte_eal_alarm_set(10, NULL, NULL);
+   TEST_ASSERT_FAIL(ret, "should not succeed with null callback 
parameter");
 
-#ifndef RTE_EXEC_ENV_WINDOWS
-   /* check if it will fail to set alarm with wrong us value */
-   printf("check if it will fail to set alarm with wrong ms values\n");
-   if (rte_eal_alarm_set(0, test_alarm_callback,
-   NULL) >= 0) {
-   printf("should not be successful with 0 us value\n");
-   return -1;
-   }
-   if (rte_eal_alarm_set(UINT64_MAX - 1, test_alarm_callback,
-   NULL) >= 0) {
-   printf("should not be successful with (UINT64_MAX-1) us 
value\n");
-   return -1;
-   }
-#endif
+   ret = rte_eal_alarm_cancel(NULL, NULL);
+   TEST_ASSERT_FAIL(ret, "should not succeed to remove alarm with null 
callback parameter");
 
-   /* check if it will fail to set alarm with null callback parameter */
-   printf("check if it will fail to set alarm with null callback 
parameter\n");
-   if (rte_eal_alarm_set(10 /* ms */, NULL, NULL) >= 0) {
-   printf("should not be successful to set alarm with null 
callback parameter\n");
-   return -1;
-   }
+   ret = rte_eal_alarm_set(US_PER_SEC, test_alarm_callback, NULL);
+   TEST_ASSERT_SUCCESS(ret, "could not set an alarm");
 
-   /* check if it will fail to remove alarm with null callback parameter */
-   printf("check if it will fail to remove alarm with null callback 
parameter\n");
-   if (rte_eal_alarm_cancel(NULL, NULL) == 0) {
-   printf("should not be successful to remove alarm with null 
callback parameter");
-   return -1;
-   }
+   ret = rte_eal_alarm_cancel(test_alarm_callback, NULL);
+   /* return is the number of the alarm set (or 0 if none or -1 if error) 
*/
+   TEST_ASSERT(ret > 0, "could not cancel an alarm: %d", ret);
 
return 0;
 }
 
-REGISTER_TEST_COMMAND(alarm_autotest, test_alarm);
+REGISTER_FAST_TEST(alarm_autotest, true, true, test_alarm);
-- 
2.43.0



[PATCH 0/4] Enable more unit tests

2024-07-18 Thread Stephen Hemminger
Several of the unit tests were not listed in any test suite
and therefore were rarely run. When running build this was reported as
errors:
 WARNING: Test "alarm_autotest" is not defined in any test suite
 WARNING: Test "cksum_perf_autotest" is not defined in any test suite

Put the alarm, cksum, and timer tests into the appropriate test suite.
More tests should be updated (later).

Stephen Hemminger (4):
  test: update alarm test
  test: run cksum tests as part of perf test suite
  test: make red test part of fast suite
  test: run timer secondary tests as part of fast suite

 app/test/test_alarm.c   | 53 +
 app/test/test_cksum_perf.c  |  3 +-
 app/test/test_red.c |  2 +-
 app/test/test_timer_secondary.c |  2 +-
 4 files changed, 23 insertions(+), 37 deletions(-)

-- 
2.43.0



[PATCH 2/4] test: run cksum tests as part of perf test suite

2024-07-18 Thread Stephen Hemminger
The cksum tests would not get run since not part of one
of the test suites. Meson complains with:

 WARNING: Test "cksum_perf_autotest" is not defined in any test suite

Signed-off-by: Stephen Hemminger 
---
 app/test/test_cksum_perf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
index 1f296cae34..0b919cd59f 100644
--- a/app/test/test_cksum_perf.c
+++ b/app/test/test_cksum_perf.c
@@ -113,5 +113,4 @@ test_cksum_perf(void)
return TEST_SUCCESS;
 }
 
-
-REGISTER_TEST_COMMAND(cksum_perf_autotest, test_cksum_perf);
+REGISTER_PERF_TEST(cksum_perf_autotest, test_cksum_perf);
-- 
2.43.0



[PATCH 3/4] test: make red test part of fast suite

2024-07-18 Thread Stephen Hemminger
The red tests were not run because not part of any suite.
Meson warning is:
 WARNING: Test "red_autotest" is not defined in any test suite

Signed-off-by: Stephen Hemminger 
---
 app/test/test_red.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_red.c b/app/test/test_red.c
index aa7538d51a..4bb17dce7a 100644
--- a/app/test/test_red.c
+++ b/app/test/test_red.c
@@ -1877,6 +1877,6 @@ test_red_all(void)
 
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
-REGISTER_TEST_COMMAND(red_autotest, test_red);
+REGISTER_FAST_TEST(red_autotest, true, true, test_red);
 REGISTER_PERF_TEST(red_perf, test_red_perf);
 REGISTER_PERF_TEST(red_all, test_red_all);
-- 
2.43.0



[PATCH 4/4] test: run timer secondary tests as part of fast suite

2024-07-18 Thread Stephen Hemminger
The tests were not part of any suite so not run normally.
Meson warning is:
WARNING: Test "timer_secondary_autotest" is not defined in any test suite

Signed-off-by: Stephen Hemminger 
---
 app/test/test_timer_secondary.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index 4e220559b4..2f98c177cd 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -224,4 +224,4 @@ test_timer_secondary(void)
 
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
-REGISTER_TEST_COMMAND(timer_secondary_autotest, test_timer_secondary);
+REGISTER_FAST_TEST(timer_secondary_autotest, false, true, 
test_timer_secondary);
-- 
2.43.0



[PATCH v3 1/3] dts: add boolean to adjust addresses

2024-07-18 Thread Nicholas Pratte
Various test cases in the mac filter test suite called for granular
manipulation of destination mac addresses to properly test mac address
filtering functionality. To compensate, there is now an
adjust_addresses boolean which the user can toggle if they wish to send
their own addressing; the boolean is true by default.

Bugzilla ID: 1454
Signed-off-by: Nicholas Pratte 
---
 dts/framework/test_suite.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dts/framework/test_suite.py b/dts/framework/test_suite.py
index 694b2eba65..551a587525 100644
--- a/dts/framework/test_suite.py
+++ b/dts/framework/test_suite.py
@@ -185,6 +185,7 @@ def send_packet_and_capture(
 packet: Packet,
 filter_config: PacketFilteringConfig = PacketFilteringConfig(),
 duration: float = 1,
+adjust_addresses: bool = True,
 ) -> list[Packet]:
 """Send and receive `packet` using the associated TG.
 
@@ -195,11 +196,15 @@ def send_packet_and_capture(
 packet: The packet to send.
 filter_config: The filter to use when capturing packets.
 duration: Capture traffic for this amount of time after sending 
`packet`.
+adjust_addresses: If :data:'True', adjust addresses of the 
egressing packet with
+a default addressing scheme. If :data:'False', do not adjust 
the addresses of
+egressing packet.
 
 Returns:
 A list of received packets.
 """
-packet = self._adjust_addresses(packet)
+if adjust_addresses:
+packet = self._adjust_addresses(packet)
 return self.tg_node.send_packet_and_capture(
 packet,
 self._tg_port_egress,
-- 
2.44.0



[PATCH v3 2/3] dts: add methods for setting mac and multicast addresses

2024-07-18 Thread Nicholas Pratte
Several new methods have been added to TestPMDShell in order to produce
the mac filter's individual test cases:
 - set_mac_addr
 - set_multicast_mac_addr
 - rx_vlan_add
 - rx_vlan_rm
 - vlan_filter_set_on
 - vlan_filter_set_off
 - set_promisc

set_mac_addr and set_multicast_addr were created for the mac filter test
suite, enabling users to both add or remove mac and multicast
addresses based on a boolean 'add or remove' parameter. The success or
failure of each call can be verified if a user deems it necessary.

The other methods listed are implemented in other respective test
suites, and their implementations have been copied, but are subject to
change; they are not the focus of this patch.

Bugzilla ID: 1454
Signed-off-by: Nicholas Pratte 
---
 dts/framework/remote_session/testpmd_shell.py | 179 ++
 1 file changed, 179 insertions(+)

diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index ec22f72221..8122457ad1 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -767,6 +767,185 @@ def show_port_info(self, port_id: int) -> TestPmdPort:
 
 return TestPmdPort.parse(output)
 
+def set_mac_addr(self, port_id: int, mac_address: str, add: bool, verify: 
bool = True) -> None:
+"""Add or remove a mac address on a given port's Allowlist.
+
+Args:
+port_id: The port ID the mac address is set on.
+mac_address: The mac address to be added or removed to the 
specified port.
+add: If :data:`True`, add the specified mac address. If 
:data:`False`, remove specified
+mac address.
+verify: If :data:'True', assert that the 'mac_addr' operation was 
successful. If
+:data:'False', run the command and skip this assertion.
+
+Raises:
+InteractiveCommandExecutionError: If the set mac address operation 
fails.
+"""
+mac_cmd = "add" if add else "remove"
+output = self.send_command(f"mac_addr {mac_cmd} {port_id} 
{mac_address}")
+if "Bad arguments" in output:
+self._logger.debug("Invalid argument provided to mac_addr")
+raise InteractiveCommandExecutionError("Invalid argument provided")
+
+if verify:
+if "mac_addr_cmd error:" in output:
+self._logger.debug(f"Failed to {mac_cmd} {mac_address} on port 
{port_id}")
+raise InteractiveCommandExecutionError(
+f"Failed to {mac_cmd} {mac_address} on port {port_id} 
\n{output}"
+)
+
+def set_multicast_mac_addr(
+self, port_id: int, multi_addr: str, add: bool, verify: bool = True
+) -> None:
+"""Add or remove multicast mac address to a specified port's filter.
+
+Args:
+port_id: The port ID the multicast address is set on.
+multi_addr: The multicast address to be added to the filter.
+add: If :data:'True', add the specified multicast address to the 
port filter.
+If :data:'False', remove the specified multicast address from 
the port filter.
+verify: If :data:'True', assert that the 'mcast_addr' operations 
was successful.
+If :data:'False', execute the 'mcast_addr' operation and skip 
the assertion.
+
+Raises:
+InteractiveCommandExecutionError: If either the 'add' or 'remove' 
operations fails.
+"""
+mcast_cmd = "add" if add else "remove"
+output = self.send_command(f"mcast_addr {mcast_cmd} {port_id} 
{multi_addr}")
+if "Bad arguments" in output:
+self._logger.debug("Invalid arguments provided to mcast_addr")
+raise InteractiveCommandExecutionError("Invalid argument provided")
+
+if verify:
+if (
+"Invalid multicast_addr" in output
+or f'multicast address {"already" if add else "not"} filtered 
by port' in output
+):
+self._logger.debug(f"Failed to {mcast_cmd} {multi_addr} on 
port {port_id}")
+raise InteractiveCommandExecutionError(
+f"Failed to {mcast_cmd} {multi_addr} on port {port_id} 
\n{output}"
+)
+
+def rx_vlan_add(self, vlan: int, port: int, verify: bool = True) -> None:
+"""Add specified vlan tag to the filter list on a port.
+
+Args:
+vlan: The vlan tag to add, should be within 1-1005, 1-4094 
extended.
+port: The port number to add the tag on, should be within 0-32.
+verify: If :data:`True`, the output of the command is scanned to 
verify that
+the vlan tag was added to the filter list on the specified 
port. If not, it is
+considered an error.
+
+Raises:
+InteractiveCommandExecutionError: If `verify` is :data:`True` and 
the tag
+is not added

[PATCH] net/netvsc: use rte_eth_dev_set_mtu to set VF MTU

2024-07-18 Thread longli
From: Stephen Hemminger 

The current code uses unnecessary locking to set VF MTU, resulting in
deadlock on hot add/remove path. Fix this by using rte_eth_dev_set_mtu()
to set VF MTU.

Signed-off-by: Stephen Hemminger 

Fixes: 45c83603087e ("net/netvsc: support MTU set")
Cc: sta...@dpdk.org
Signed-off-by: Long Li 
---
 drivers/net/netvsc/hn_vf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
index 6b3d0eb0c8..b664beaa5d 100644
--- a/drivers/net/netvsc/hn_vf.c
+++ b/drivers/net/netvsc/hn_vf.c
@@ -264,7 +264,7 @@ int hn_vf_add(struct rte_eth_dev *dev, struct hn_data *hv)
goto exit;
}
 
-   ret = hn_vf_mtu_set(dev, dev->data->mtu);
+   ret = rte_eth_dev_set_mtu(port, dev->data->mtu);
if (ret) {
PMD_DRV_LOG(ERR, "Failed to set VF MTU");
goto exit;
@@ -796,7 +796,7 @@ int hn_vf_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
rte_rwlock_read_lock(&hv->vf_lock);
vf_dev = hn_get_vf_dev(hv);
if (hv->vf_ctx.vf_vsc_switched && vf_dev)
-   ret = vf_dev->dev_ops->mtu_set(vf_dev, mtu);
+   ret = rte_eth_dev_set_mtu(vf_dev->data->port_id, mtu);
rte_rwlock_read_unlock(&hv->vf_lock);
 
return ret;
-- 
2.43.0



[PATCH] eal: fix device unregister for event registered with device_name NULL

2024-07-18 Thread longli
From: Malcolm Bumgardner 

In the device event unregister code, it unconditionally remove all
callbacks which are registered with device_name set to NULL. This results
in many callbacks uncorrectly removed.

Fix this by only removing callbacks with matching cb_fn and cb_arg.

Signed-off-by: Malcolm Bumgardner 

Fixes: a753e53d517b ("eal: add device event monitor framework")
Cc: sta...@dpdk.org
Signed-off-by: Long Li 
---
 lib/eal/common/eal_common_dev.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/lib/eal/common/eal_common_dev.c b/lib/eal/common/eal_common_dev.c
index a99252b02f..70aa04dcd9 100644
--- a/lib/eal/common/eal_common_dev.c
+++ b/lib/eal/common/eal_common_dev.c
@@ -550,16 +550,17 @@ rte_dev_event_callback_unregister(const char *device_name,
next = TAILQ_NEXT(event_cb, next);
 
if (device_name != NULL && event_cb->dev_name != NULL) {
-   if (!strcmp(event_cb->dev_name, device_name)) {
-   if (event_cb->cb_fn != cb_fn ||
-   (cb_arg != (void *)-1 &&
-   event_cb->cb_arg != cb_arg))
-   continue;
-   }
+   if (strcmp(event_cb->dev_name, device_name))
+   continue;
} else if (device_name != NULL) {
continue;
}
 
+   /* Remove only matching callback with arg */
+   if (event_cb->cb_fn != cb_fn ||
+   (cb_arg != (void *)-1 && event_cb->cb_arg != cb_arg))
+   continue;
+
/*
 * if this callback is not executing right now,
 * then remove it.
-- 
2.43.0



[PATCH v3 3/3] dts: mac filter test suite refactored for new dts

2024-07-18 Thread Nicholas Pratte
The mac address filter test suite, whose test cases are based on old
DTS's test cases, has been refactored to interface with the new DTS
framework.

In porting over this test suite into the new framework, some
adjustments were made, namely in the EAL and TestPMD parameter provided
before executing the application. While the original test plan was
referenced, by and large, only for the individual test cases, I'll leave
the parameters the original test plan was asking for below for the sake
of discussion:

--burst=1 --rxpt=0 --rxht=0 --rxwt=0 --txpt=36 --txht=0 --txwt=0
--txfreet=32 --rxfreet=64 --mbcache=250 --portmask=0x3

Bugzilla ID: 1454
Signed-off-by: Nicholas Pratte 

---
v2:
 * Refactored the address pool capacity tests to use all available
   octets in the mac address.
 * Change the payload to 'X' characters instead of 'P' characters.
---
 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_mac_filter.py  | 223 +
 2 files changed, 225 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_mac_filter.py

diff --git a/dts/framework/config/conf_yaml_schema.json 
b/dts/framework/config/conf_yaml_schema.json
index f02a310bb5..ad1f3757f7 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -187,7 +187,8 @@
   "enum": [
 "hello_world",
 "os_udp",
-"pmd_buffer_scatter"
+"pmd_buffer_scatter",
+"mac_filter"
   ]
 },
 "test_target": {
diff --git a/dts/tests/TestSuite_mac_filter.py 
b/dts/tests/TestSuite_mac_filter.py
new file mode 100644
index 00..53a3331224
--- /dev/null
+++ b/dts/tests/TestSuite_mac_filter.py
@@ -0,0 +1,223 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023-2024 University of New Hampshire
+"""Mac address filtering test suite.
+
+This test suite ensures proper and expected behavior of Allowlist filtering 
via mac
+addresses on devices bound to the Poll Mode Driver. If a packet received on a 
device
+contains a mac address not contained with its mac address pool, the packet 
should
+be dropped. Alternatively, if a packet is received that contains a destination 
mac
+within the devices address pool, the packet should be accepted and forwarded. 
This
+behavior should remain consistent across all packets, namely those containing 
dot1q
+tags or otherwise.
+
+The following test suite assesses behaviors based on the aforementioned logic.
+Additionally, testing is done within the PMD itself to ensure that the mac 
address
+allow list is behaving as expected.
+"""
+
+from time import sleep
+
+from scapy.layers.inet import IP  # type: ignore[import-untyped]
+from scapy.layers.l2 import Dot1Q, Ether  # type: ignore[import-untyped]
+from scapy.packet import Raw  # type: ignore[import-untyped]
+
+from framework.exception import InteractiveCommandExecutionError
+from framework.remote_session.testpmd_shell import TestPmdShell
+from framework.test_suite import TestSuite
+
+
+class TestMacFilter(TestSuite):
+"""Mac address allowlist filtering test suite.
+
+Configure mac address filtering on a given port, and test the port's 
filtering behavior
+using both a given port's hardware address as well as dummy addresses. If 
a port accepts
+a packet that is not contained within its mac address allowlist, then a 
given test case
+fails. Alternatively, if a port drops a packet that is designated within 
its mac address
+allowlist, a given test case will fail.
+
+Moreover, a given port should demonstrate proper behavior when bound to 
the Poll Mode
+Driver. A port should not have a mac address allowlist that exceeds its 
designated size.
+A port's default hardware address should not be removed from its address 
pool, and invalid
+addresses should not be included in the allowlist. If a port abides by the 
above rules, the
+test case passes.
+"""
+
+def send_packet_and_verify(
+self,
+mac_address: str,
+add_vlan: bool = False,
+should_receive: bool = True,
+) -> None:
+"""Generate, send, and verify a packet based on specified parameters.
+
+Test cases within this suite utilize this method to create, send, and 
verify
+packets based on criteria relating to the packet's destination mac 
address,
+vlan tag, and whether or not the packet should be received or not. 
Packets
+are verified using an inserted payload. Assuming the test case expects 
to
+receive a specified packet, if the list of received packets contains 
this
+payload within any of its packets, the test case passes. 
Alternatively, if
+the designed packet should not be received, and the packet payload is 
not,
+received, then the test case fails. Each call with this method sends 
exactly
+one packet.
+
+Args:
+mac_address: The destination mac address of the packet being sent.
+ 

RE: [PATCH v2] app/test-crypto-perf: add throughput OOP decryption

2024-07-18 Thread Akhil Goyal
> Hi Brian,
> 
> Since Ciara is no longer available and you are the new maintainer, can you
> investigate this patch?
> There were some concerns which Ciara highlighted. Can you check?
> 

Any update on this patch?


> > > Subject: [PATCH v2] app/test-crypto-perf: add throughput OOP decryption
> > >
> > > During throughput running, re-filling the test data will impact the 
> > > performance
> > > test result. So for now, to run decrypt throughput testing is not 
> > > supported
> since
> > > the test data is not filled.
> > >
> > > But if user requires OOP(out-of-place) mode, the test data from source 
> > > mbuf
> > will
> > > never be modified, and if the test data can be prepared out of the running
> loop,
> > > the decryption test should be fine.
> > >
> > > This commit adds the support of out-of-place decryption testing for
> throughput.
> > >
> > > [1]:
> > > https://urldefense.proofpoint.com/v2/url?u=http-
> > 3A__mails.dpdk.org_archives_dev_2023-
> >
> 2DJuly_273328.html&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=DnL7Si2wl_P
> > RwpZ9TWey3eu68gBzn7DkPwuqhd6WNyo&m=eTj0O7iYH-
> > xiTQ6dNUZpsOXPqnyC1O_-
> > _IKt0j_yQ_N__vy0wIBLb_QyMQtodUrr&s=eDz_NLjqkUH2cYMilKEtdWImOPj5f-
> > CVKV5UW8P9frk&e=
> > >
> > > Signed-off-by: Suanming Mou 
> > > ---
> > >  app/test-crypto-perf/cperf_ops.c |  5 ++-
> > >  app/test-crypto-perf/cperf_options_parsing.c |  8 +  app/test-crypto-
> > > perf/cperf_test_throughput.c | 34 +---
> > >  3 files changed, 41 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-
> > perf/cperf_ops.c
> > > index d3fd115bc0..714616c697 100644
> > > --- a/app/test-crypto-perf/cperf_ops.c
> > > +++ b/app/test-crypto-perf/cperf_ops.c
> > > @@ -644,7 +644,10 @@ cperf_set_ops_aead(struct rte_crypto_op **ops,
> > >   }
> > >
> > >   if ((options->test == CPERF_TEST_TYPE_VERIFY) ||
> > > - (options->test == CPERF_TEST_TYPE_LATENCY)) {
> > > + (options->test == CPERF_TEST_TYPE_LATENCY) ||
> > > + (options->test == CPERF_TEST_TYPE_THROUGHPUT &&
> > > +  (options->aead_op == RTE_CRYPTO_AEAD_OP_DECRYPT ||
> > > +   options->cipher_op == RTE_CRYPTO_CIPHER_OP_DECRYPT))) {
> > >   for (i = 0; i < nb_ops; i++) {
> > >   uint8_t *iv_ptr = rte_crypto_op_ctod_offset(ops[i],
> > >   uint8_t *, iv_offset);
> > > diff --git a/app/test-crypto-perf/cperf_options_parsing.c 
> > > b/app/test-crypto-
> > > perf/cperf_options_parsing.c
> > > index 8c20974273..90526e676f 100644
> > > --- a/app/test-crypto-perf/cperf_options_parsing.c
> > > +++ b/app/test-crypto-perf/cperf_options_parsing.c
> > > @@ -1341,6 +1341,14 @@ cperf_options_check(struct cperf_options
> > > *options)
> > >   }
> > >   }
> > >
> > > + if (options->test == CPERF_TEST_TYPE_THROUGHPUT &&
> > > + (options->aead_op == RTE_CRYPTO_AEAD_OP_DECRYPT ||
> > > +  options->auth_op == RTE_CRYPTO_AUTH_OP_VERIFY) &&
> > > + !options->out_of_place) {
> > > + RTE_LOG(ERR, USER1, "Only out-of-place is allowed in
> > > throughput decryption.\n");
> > > + return -EINVAL;
> > > + }
> >
> > Not totally following some of this, why do we only want to add this for OOP
> > mode?
> >
> > For example an inplace command I can use before this patch but not after:
> > ./build/app/dpdk-test-crypto-perf -l 2,3 -- --ptest throughput --optype 
> > aead --
> > aead-algo aes-gcm --aead-op decrypt --devtype crypto_qat --aead-key-sz 16
> >
> > I get an error;
> > USER1: Only out-of-place is allowed in throughput decryption.
> > USER1: Checking one or more user options failed
> >
> > Do we want to always force the user to use OOP + test vector file for these
> > throughput decryption tests?
> > Or should we just add a warning that the throughput may not be reflecting 
> > the
> > "success" verify path in PMD if using inplace and the dummy data.
> >
> > I am not sure.
> > If we do want to add the limitation on the throughput tests, these changes I
> think
> > are ok for that.
> >
> > Thanks,
> > Ciara
> >
> > > +
> > >   if (options->op_type == CPERF_CIPHER_ONLY ||
> > >   options->op_type == CPERF_CIPHER_THEN_AUTH ||
> > >   options->op_type == CPERF_AUTH_THEN_CIPHER) { diff
> > > --git a/app/test-crypto-perf/cperf_test_throughput.c b/app/test-crypto-
> > > perf/cperf_test_throughput.c
> > > index e3d266d7a4..b347baa913 100644
> > > --- a/app/test-crypto-perf/cperf_test_throughput.c
> > > +++ b/app/test-crypto-perf/cperf_test_throughput.c
> > > @@ -99,6 +99,26 @@ cperf_throughput_test_constructor(struct
> rte_mempool
> > > *sess_mp,
> > >   return NULL;
> > >  }
> > >
> > > +static void
> > > +cperf_verify_init_ops(struct rte_mempool *mp __rte_unused,
> > > +   void *opaque_arg,
> > > +   void *obj,
> > > +   __rte_unused unsigned int i)
> > > +{
> > > + uint16_t iv_offset = sizeof(struct rte_crypto_op) +
> > > + s

RE: IPv6 APIs rework

2024-07-18 Thread Morten Brørup
> From: Robin Jarry [mailto:rja...@redhat.com]
> 
> Hi folks,
> 
> while working on IPv6 support for grout [1], I noticed that all DPDK
> IPv6 APIs used fixed sized arrays in the route lookup functions [2].
> 
>  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
>   uint8_t ips[][RTE_FIB6_IPV6_ADDR_SIZE],
>   uint64_t *next_hops,
>   int n);
> 
> If I'm not mistaken, using sized arrays in function signatures is only
> for documentation purposes and does not result in any specific compiler
> checks. In the above example, the ips parameter is considered as a plain
> old `uint8_t **` pointer.
> 
> Also, not having a dedicated type for IPv6 addresses requires obscure
> pointer arithmetic [3] and casting [4].
> 
> I'd like to introduce a real IPv6 address structure that has the same
> alignment than a dumb `uint8_t *` pointer but has an union to ease
> casting and most importantly presents the whole thing as an explicit
> typed structure:
> 
> #define RTE_IPV6_ADDR_SIZE 16
> 
> struct rte_ipv6_addr {
> union {
> uint8_t u8[RTE_IPV6_ADDR_SIZE];
> uint16_t u16[RTE_IPV6_ADDR_SIZE / sizeof(uint16_t)];
> uint32_t u32[RTE_IPV6_ADDR_SIZE / sizeof(uint32_t)];
> uint64_t u64[RTE_IPV6_ADDR_SIZE / sizeof(uint64_t)];
> };
> } __rte_packed __rte_aligned(1);
> 
> This would require some breakage of the APIs but I think it would
> benefit code readability and maintainability in the long term.

In short: Although I like the idea of a unified IPv6 address type very much, 
I'm not sure consensus can be reached about the optimal alignment of such a 
type.

The long version:

Please consider this proposal in a broader perspective.

The IPv4 FIB lookup takes an uint32_t array, so the IPv4 address type here is 4 
byte aligned: uint32_t *ips
Generally, uint32_t or rte_be32_t is used for IPv4 addresses, and both these 
types are 4 byte aligned. In other words: IPv4 addresses are considered 4 byte 
aligned by DPDK.

I don't think it is similarly simple for IPv6 addresses.

The alignment of IPv6 addresses may depend on how/where they are used, e.g.:
1. For the FIB library, it might be good for vector implementations to have the 
IPv6 addresses naturally aligned (i.e. 16 byte aligned), like the 
uint128_t/__int128/__m128i type (or the rte_xmm_t type [XMM]). Furthermore, a 
simple integer type (uint128_t equivalent) might be preferable in this API.
2. In the IPv6 packet header, the IPv6 addresses are not 16 byte aligned, they 
are 8 byte aligned. So we cannot make the IPv6 address type 16 byte aligned.

I fear that broadly dumbing down the IPv6 address type to always use 1 byte 
alignment could potentially introduce unwanted performance penalties (now or in 
the future). We didn't do it for IPv4 addresses, so let's not do it for IPv6 
addresses.

Perhaps we could use the lowest "non-exotic" (considering the use of IPv6 
addresses) alignment, which I would guess is 8 byte - as in the IPv6 packet 
header.
For reference, Ethernet addresses are defined as 2 byte aligned [ETH].

[XMM]: 
https://elixir.bootlin.com/dpdk/v24.03/source/lib/eal/x86/include/rte_vect.h#L42
[ETH]: https://elixir.bootlin.com/dpdk/v24.07-rc2/source/lib/net/rte_ether.h#L74

> 
>  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
>   const struct rte_ipv6_addr *ips,
>   uint64_t *next_hops,
>   int n);
> 
> I already have a semi-working draft and am in the process of splitting
> the changes into small chunks to make them easier to review.
> 
> https://github.com/DPDK/dpdk/compare/main...rjarry:dpdk:ipv6-address-
> rework
> 
> Is that something that would be of interest? If yes, I would like to
> announce API breakage before the release of 24.07 so that the changes
> can be integrated into 24.11.
> 
> Cheers!
> 
> [1] https://github.com/rjarry/grout
> [2]
> https://doc.dpdk.org/api/rte__fib6_8h.html#a924678410ccb9551cda3e75d742a
> 11e3
> [3] https://git.dpdk.org/dpdk/tree/lib/fib/trie_avx512.c?h=v24.07-
> rc2#n340
> [4] https://git.dpdk.org/dpdk/tree/lib/hash/rte_thash.h?h=v24.07-
> rc2#n156
> 
> --
> Robin



Re: IPv6 APIs rework

2024-07-18 Thread Stephen Hemminger
On Thu, 18 Jul 2024 22:27:03 +0200
Morten Brørup  wrote:

> > From: Robin Jarry [mailto:rja...@redhat.com]
> > 
> > Hi folks,
> > 
> > while working on IPv6 support for grout [1], I noticed that all DPDK
> > IPv6 APIs used fixed sized arrays in the route lookup functions [2].
> > 
> >  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
> >   uint8_t ips[][RTE_FIB6_IPV6_ADDR_SIZE],
> >   uint64_t *next_hops,
> >   int n);
> > 
> > If I'm not mistaken, using sized arrays in function signatures is only
> > for documentation purposes and does not result in any specific compiler
> > checks. In the above example, the ips parameter is considered as a plain
> > old `uint8_t **` pointer.
> > 
> > Also, not having a dedicated type for IPv6 addresses requires obscure
> > pointer arithmetic [3] and casting [4].
> > 
> > I'd like to introduce a real IPv6 address structure that has the same
> > alignment than a dumb `uint8_t *` pointer but has an union to ease
> > casting and most importantly presents the whole thing as an explicit
> > typed structure:
> > 
> > #define RTE_IPV6_ADDR_SIZE 16
> > 
> > struct rte_ipv6_addr {
> > union {
> > uint8_t u8[RTE_IPV6_ADDR_SIZE];
> > uint16_t u16[RTE_IPV6_ADDR_SIZE / sizeof(uint16_t)];
> > uint32_t u32[RTE_IPV6_ADDR_SIZE / sizeof(uint32_t)];
> > uint64_t u64[RTE_IPV6_ADDR_SIZE / sizeof(uint64_t)];
> > };
> > } __rte_packed __rte_aligned(1);
> > 
> > This would require some breakage of the APIs but I think it would
> > benefit code readability and maintainability in the long term.  
> 
> In short: Although I like the idea of a unified IPv6 address type very much, 
> I'm not sure consensus can be reached about the optimal alignment of such a 
> type.
> 
> The long version:
> 
> Please consider this proposal in a broader perspective.
> 
> The IPv4 FIB lookup takes an uint32_t array, so the IPv4 address type here is 
> 4 byte aligned: uint32_t *ips
> Generally, uint32_t or rte_be32_t is used for IPv4 addresses, and both these 
> types are 4 byte aligned. In other words: IPv4 addresses are considered 4 
> byte aligned by DPDK.
> 
> I don't think it is similarly simple for IPv6 addresses.
> 
> The alignment of IPv6 addresses may depend on how/where they are used, e.g.:
> 1. For the FIB library, it might be good for vector implementations to have 
> the IPv6 addresses naturally aligned (i.e. 16 byte aligned), like the 
> uint128_t/__int128/__m128i type (or the rte_xmm_t type [XMM]). Furthermore, a 
> simple integer type (uint128_t equivalent) might be preferable in this API.
> 2. In the IPv6 packet header, the IPv6 addresses are not 16 byte aligned, 
> they are 8 byte aligned. So we cannot make the IPv6 address type 16 byte 
> aligned.
> 
> I fear that broadly dumbing down the IPv6 address type to always use 1 byte 
> alignment could potentially introduce unwanted performance penalties (now or 
> in the future). We didn't do it for IPv4 addresses, so let's not do it for 
> IPv6 addresses.
> 
> Perhaps we could use the lowest "non-exotic" (considering the use of IPv6 
> addresses) alignment, which I would guess is 8 byte - as in the IPv6 packet 
> header.
> For reference, Ethernet addresses are defined as 2 byte aligned [ETH].
> 
> [XMM]: 
> https://elixir.bootlin.com/dpdk/v24.03/source/lib/eal/x86/include/rte_vect.h#L42
> [ETH]: 
> https://elixir.bootlin.com/dpdk/v24.07-rc2/source/lib/net/rte_ether.h#L74
> 
> > 
> >  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
> >   const struct rte_ipv6_addr *ips,
> >   uint64_t *next_hops,
> >   int n);
> > 
> > I already have a semi-working draft and am in the process of splitting
> > the changes into small chunks to make them easier to review.
> > 
> > https://github.com/DPDK/dpdk/compare/main...rjarry:dpdk:ipv6-address-
> > rework
> > 
> > Is that something that would be of interest? If yes, I would like to
> > announce API breakage before the release of 24.07 so that the changes
> > can be integrated into 24.11.
> > 
> > Cheers!
> > 
> > [1] https://github.com/rjarry/grout
> > [2]
> > https://doc.dpdk.org/api/rte__fib6_8h.html#a924678410ccb9551cda3e75d742a
> > 11e3
> > [3] https://git.dpdk.org/dpdk/tree/lib/fib/trie_avx512.c?h=v24.07-
> > rc2#n340
> > [4] https://git.dpdk.org/dpdk/tree/lib/hash/rte_thash.h?h=v24.07-
> > rc2#n156
> > 
> > --
> > Robin  
> 

If you look at the standard netinet/in.h the storage of IPv6 addresses
is in in6_addr. DPDK has always wanted to do its own thing...

The in6_addr is a union with no explicit alignment.

struct in6_addr
  {
union
  {
uint8_t __u6_addr8[16];
uint16_t __u6_addr16[8];
uint32_t __u6_addr32[4];
  } __in6_u;

Better to not have explicit alignment and not have 64 bit value.



Re: IPv6 APIs rework

2024-07-18 Thread Vladimir Medvedkin
Hi Robin,

Thanks, that is a good idea.




чт, 18 июл. 2024 г. в 21:27, Morten Brørup :

> > From: Robin Jarry [mailto:rja...@redhat.com]
> >
> > Hi folks,
> >
> > while working on IPv6 support for grout [1], I noticed that all DPDK
> > IPv6 APIs used fixed sized arrays in the route lookup functions [2].
> >
> >  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
> >   uint8_t ips[][RTE_FIB6_IPV6_ADDR_SIZE],
> >   uint64_t *next_hops,
> >   int n);
> >
> > If I'm not mistaken, using sized arrays in function signatures is only
> > for documentation purposes and does not result in any specific compiler
> > checks. In the above example, the ips parameter is considered as a plain
> > old `uint8_t **` pointer.
> >
> > Also, not having a dedicated type for IPv6 addresses requires obscure
> > pointer arithmetic [3] and casting [4].
> >
> > I'd like to introduce a real IPv6 address structure that has the same
> > alignment than a dumb `uint8_t *` pointer but has an union to ease
> > casting and most importantly presents the whole thing as an explicit
> > typed structure:
> >
> > #define RTE_IPV6_ADDR_SIZE 16
> >
> > struct rte_ipv6_addr {
> > union {
> > uint8_t u8[RTE_IPV6_ADDR_SIZE];
> > uint16_t u16[RTE_IPV6_ADDR_SIZE / sizeof(uint16_t)];
> > uint32_t u32[RTE_IPV6_ADDR_SIZE / sizeof(uint32_t)];
> > uint64_t u64[RTE_IPV6_ADDR_SIZE / sizeof(uint64_t)];
> > };
> > } __rte_packed __rte_aligned(1);
> >
> > This would require some breakage of the APIs but I think it would
> > benefit code readability and maintainability in the long term.
>
> In short: Although I like the idea of a unified IPv6 address type very
> much, I'm not sure consensus can be reached about the optimal alignment of
> such a type.
>
> The long version:
>
> Please consider this proposal in a broader perspective.
>
> The IPv4 FIB lookup takes an uint32_t array, so the IPv4 address type here
> is 4 byte aligned: uint32_t *ips
> Generally, uint32_t or rte_be32_t is used for IPv4 addresses, and both
> these types are 4 byte aligned. In other words: IPv4 addresses are
> considered 4 byte aligned by DPDK.
>
> I don't think it is similarly simple for IPv6 addresses.
>
> The alignment of IPv6 addresses may depend on how/where they are used,
> e.g.:
> 1. For the FIB library, it might be good for vector implementations to
> have the IPv6 addresses naturally aligned (i.e. 16 byte aligned), like the
> uint128_t/__int128/__m128i type (or the rte_xmm_t type [XMM]). Furthermore,
> a simple integer type (uint128_t equivalent) might be preferable in this
> API.
>

I think alignment should be 1 since in FIB6 users usually don't copy IPv6
address and just provide a pointer to the memory inside the packet. Current
vector implementation loads IPv6 addresses using unaligned access (
_mm512_loadu_si512) so it doesn't rely on alignment.


> 2. In the IPv6 packet header, the IPv6 addresses are not 16 byte aligned,
> they are 8 byte aligned. So we cannot make the IPv6 address type 16 byte
> aligned.
>
Not necessary, if Ethernet frame in mbuf starts on 8b aligned address, then
IPv6 is aligned only by 2 bytes.


> I fear that broadly dumbing down the IPv6 address type to always use 1
> byte alignment could potentially introduce unwanted performance penalties
> (now or in the future). We didn't do it for IPv4 addresses, so let's not do
> it for IPv6 addresses.
>
> Perhaps we could use the lowest "non-exotic" (considering the use of IPv6
> addresses) alignment, which I would guess is 8 byte - as in the IPv6 packet
> header.
> For reference, Ethernet addresses are defined as 2 byte aligned [ETH].
>
> [XMM]:
> https://elixir.bootlin.com/dpdk/v24.03/source/lib/eal/x86/include/rte_vect.h#L42
> [ETH]:
> https://elixir.bootlin.com/dpdk/v24.07-rc2/source/lib/net/rte_ether.h#L74
>
> >
> >  int rte_fib6_lookup_bulk(struct rte_fib6 *fib,
> >   const struct rte_ipv6_addr *ips,
> >   uint64_t *next_hops,
> >   int n);
> >
> > I already have a semi-working draft and am in the process of splitting
> > the changes into small chunks to make them easier to review.
> >
> > https://github.com/DPDK/dpdk/compare/main...rjarry:dpdk:ipv6-address-
> > rework
> >
> > Is that something that would be of interest? If yes, I would like to
> > announce API breakage before the release of 24.07 so that the changes
> > can be integrated into 24.11.
> >
> > Cheers!
> >
> > [1] https://github.com/rjarry/grout
> > [2]
> > https://doc.dpdk.org/api/rte__fib6_8h.html#a924678410ccb9551cda3e75d742a
> > 11e3
> > [3] https://git.dpdk.org/dpdk/tree/lib/fib/trie_avx512.c?h=v24.07-
> > rc2#n340
> > [4] https://git.dpdk.org/dpdk/tree/lib/hash/rte_thash.h?h=v24.07-
> > rc2#n156
> >
> > --
> > Robin
>
>

-- 
Regards,
Vladimir


Re: IPv6 APIs rework

2024-07-18 Thread Robin Jarry

Vladimir Medvedkin, Jul 18, 2024 at 23:25:

I think alignment should be 1 since in FIB6 users usually don't copy IPv6
address and just provide a pointer to the memory inside the packet. Current
vector implementation loads IPv6 addresses using unaligned access (
_mm512_loadu_si512) so it doesn't rely on alignment.


Yes, my intention was exactly that, being able to map that structure 
directly in packets without copying them on the stack.



> 2. In the IPv6 packet header, the IPv6 addresses are not 16 byte aligned,
> they are 8 byte aligned. So we cannot make the IPv6 address type 16 byte
> aligned.



Not necessary, if Ethernet frame in mbuf starts on 8b aligned address, then
IPv6 is aligned only by 2 bytes.


We probably could safely say that aligning on 2 bytes would be OK. But 
is there any benefit, performance wise, in doing so? Keeping the same 
alignment as before the change would at least make it ABI compatible.




Re: IPv6 APIs rework

2024-07-18 Thread Robin Jarry

Stephen Hemminger, Jul 18, 2024 at 23:15:

If you look at the standard netinet/in.h the storage of IPv6 addresses
is in in6_addr. DPDK has always wanted to do its own thing...

The in6_addr is a union with no explicit alignment.

struct in6_addr
  {
union
  {
uint8_t __u6_addr8[16];
uint16_t __u6_addr16[8];
uint32_t __u6_addr32[4];
  } __in6_u;

Better to not have explicit alignment and not have 64 bit value.


The main reason why I didn't use the standard POSIX type is that it has 
an alignment of 4 which means it cannot always be mapped directly to 
packets in memory depending on the encapsulating protocol.


Also, ip->__in6_u.__u6_addr8 is really ugly as a field name, even if 
the "helper" macros (ip->s6_addr8) make them a bit better :)


What do you have against adding a 64 bit value in the union?



Re: [PATCH v10 15/21] net/ntnic: add link management skeleton

2024-07-18 Thread Ferruh Yigit
On 7/17/2024 2:33 PM, Serhii Iliushyk wrote:
> @@ -373,6 +645,15 @@ nthw_pci_dev_init(struct rte_pci_device *pci_dev)
>   return -1;
>   }
>  
> + /* connect structs */
> + internals->p_drv = p_drv;
> + eth_dev->data->dev_private = internals;
> + eth_dev->data->mac_addrs = rte_malloc(NULL,
> + NUM_MAC_ADDRS_PER_PORT * sizeof(struct 
> rte_ether_addr), 0);
> + rte_memcpy(ð_dev->data->mac_addrs[0],
> + &internals->eth_addrs[0], 
> RTE_ETHER_ADDR_LEN);
> +
> +
>

DPDK has 'rte_ether_addr_copy()' API for MAC address copy, this is not a
change request for this patch series, but you can use it when this code
is updated in the future.



Re: [PATCH v10 01/21] net/ntnic: add ethdev and makes PMD available

2024-07-18 Thread Ferruh Yigit
On 7/17/2024 2:32 PM, Serhii Iliushyk wrote:
> Add initial ntnic ethdev skeleton and register PCI probe functions
> Update documentation: Device description and feature list
> 
> Signed-off-by: Serhii Iliushyk 
>

For series,
Reviewed-by: Ferruh Yigit 

Series applied to dpdk-next-net/main, thanks.



Re: [PATCH v5] net/ark: fix index arithmetic bug

2024-07-18 Thread Ferruh Yigit
On 7/17/2024 9:38 PM, Ed Czeck wrote:
> Behavior for signed integer overflow is not defined
> which can causes undesired behavior at values near
> max and min bounds.
> The used of unsigned is defined as to use modulo arithmetic
> which is the desired behavior.
> This patch replaces int32_t with uint32_t except for
> necessary casts.
> 
> Fixes: 9ee9e0d3b85e ("net/ark: update to reflect FPGA updates")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ed Czeck 
>

Acked-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.


Re: [PATCH] net/gve: Update Rx/Tx functions for RTE_PROC_SECONDARY

2024-07-18 Thread Joshua Washington
My ack might have been a bit premature. This patch seems to have a
number of style errors, and does not seem to apply to HEAD of
dpdk-next-net. See http://dpdk.org/patch/142498 for more detail.

The code supplied in the patch does not seem to have any indentation.
Please note that DPDK uses tabs (assumed to take up 8 spaces for the
purposes of wrapping long lines) instead of spaces. The style guide
can be found at
https://doc.dpdk.org/guides/contributing/coding_style.html.

The dpdk-next-net repository can be found at
http://git.dpdk.org/next/dpdk-next-net.


Re: [PATCH] net/gve: Update Rx/Tx functions for RTE_PROC_SECONDARY

2024-07-18 Thread Tathagat Priyadarshi
Thanks Joshua, I will fix all the nits in my next email.

On Fri, Jul 19, 2024 at 8:53 AM Joshua Washington 
wrote:

> My ack might have been a bit premature. This patch seems to have a
> number of style errors, and does not seem to apply to HEAD of
> dpdk-next-net. See http://dpdk.org/patch/142498 for more detail.
>
> The code supplied in the patch does not seem to have any indentation.
> Please note that DPDK uses tabs (assumed to take up 8 spaces for the
> purposes of wrapping long lines) instead of spaces. The style guide
> can be found at
> https://doc.dpdk.org/guides/contributing/coding_style.html.
>
> The dpdk-next-net repository can be found at
> http://git.dpdk.org/next/dpdk-next-net.
>


[PATCH] net/gve: Update Rx/Tx functions for RTE_PROC_SECONDARY

2024-07-18 Thread priyadarshitathagat
The RSS support for GVE allows multiple CPU cores to
handle the rx/tx queues as pollers. This requires initializing
the eth_dev_ops and updating the RX/TX functions for these pollers.

Signed-off-by: Tathagat Priyadarshi 
Acked-by: Rushil Gupta 
Acked-by: Joshua Washington 
---
 drivers/net/gve/gve_ethdev.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index ca92277..2d8ef6f 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -1173,8 +1173,18 @@ struct gve_queue_page_list *
rte_be32_t *db_bar;
int err;
 
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   if (gve_is_gqi(priv)) {
+   gve_set_rx_function(eth_dev);
+   gve_set_tx_function(eth_dev);
+   eth_dev->dev_ops = &gve_eth_dev_ops;
+   } else {
+   gve_set_rx_function_dqo(eth_dev);
+   gve_set_tx_function_dqo(eth_dev);
+   eth_dev->dev_ops = &gve_eth_dev_ops_dqo;
+   }
return 0;
+   }
 
pci_dev = RTE_DEV_TO_PCI(eth_dev->device);
 
-- 
1.8.3.1



RE: [PATCH] net/netvsc: use rte_eth_dev_set_mtu to set VF MTU

2024-07-18 Thread Wei Hu
> Subject: [PATCH] net/netvsc: use rte_eth_dev_set_mtu to set VF MTU
> 
> From: Stephen Hemminger 
> 
> The current code uses unnecessary locking to set VF MTU, resulting in
> deadlock on hot add/remove path. Fix this by using rte_eth_dev_set_mtu() to
> set VF MTU.
> 
> Signed-off-by: Stephen Hemminger 
> 
> Fixes: 45c83603087e ("net/netvsc: support MTU set")
> Cc: sta...@dpdk.org
> Signed-off-by: Long Li 

Reviewed-by: Wei Hu 

> ---
>  drivers/net/netvsc/hn_vf.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c index
> 6b3d0eb0c8..b664beaa5d 100644
> --- a/drivers/net/netvsc/hn_vf.c
> +++ b/drivers/net/netvsc/hn_vf.c
> @@ -264,7 +264,7 @@ int hn_vf_add(struct rte_eth_dev *dev, struct
> hn_data *hv)
>   goto exit;
>   }
> 
> - ret = hn_vf_mtu_set(dev, dev->data->mtu);
> + ret = rte_eth_dev_set_mtu(port, dev->data->mtu);
>   if (ret) {
>   PMD_DRV_LOG(ERR, "Failed to set VF MTU");
>   goto exit;
> @@ -796,7 +796,7 @@ int hn_vf_mtu_set(struct rte_eth_dev *dev, uint16_t
> mtu)
>   rte_rwlock_read_lock(&hv->vf_lock);
>   vf_dev = hn_get_vf_dev(hv);
>   if (hv->vf_ctx.vf_vsc_switched && vf_dev)
> - ret = vf_dev->dev_ops->mtu_set(vf_dev, mtu);
> + ret = rte_eth_dev_set_mtu(vf_dev->data->port_id, mtu);
>   rte_rwlock_read_unlock(&hv->vf_lock);
> 
>   return ret;
> --
> 2.43.0



[PATCH v2] net/gve: Update Rx/Tx functions for RTE_PROC_SECONDARY

2024-07-18 Thread priyadarshitathagat
The RSS support for GVE allows multiple CPU cores to
handle the rx/tx queues as pollers. This requires initializing
the eth_dev_ops and updating the RX/TX functions for these pollers.

Signed-off-by: Tathagat Priyadarshi 
Acked-by: Rushil Gupta 
Acked-by: Joshua Washington 
---
 drivers/net/gve/gve_ethdev.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index ca92277..2d8ef6f 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -1173,8 +1173,18 @@ struct gve_queue_page_list *
rte_be32_t *db_bar;
int err;
 
-   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   if (gve_is_gqi(priv)) {
+   gve_set_rx_function(eth_dev);
+   gve_set_tx_function(eth_dev);
+   eth_dev->dev_ops = &gve_eth_dev_ops;
+   } else {
+   gve_set_rx_function_dqo(eth_dev);
+   gve_set_tx_function_dqo(eth_dev);
+   eth_dev->dev_ops = &gve_eth_dev_ops_dqo;
+   }
return 0;
+   }
 
pci_dev = RTE_DEV_TO_PCI(eth_dev->device);
 
-- 
1.8.3.1