Re: [DPDK/other Bug 1562] dumpcap captures all available network interfaces when specifying any PCI network interface

2025-03-25 Thread Navin Srinivas
Oh, Thank you, Can you point me to those patches?
we require it on top of 22.11.1, I would like to get those patches back
ported on top 22.11.1

On Wed, Feb 12, 2025 at 8:32 PM Stephen Hemminger <
step...@networkplumber.org> wrote:

> On Wed, 12 Feb 2025 13:29:57 +0530
> Navin Srinivas  wrote:
>
> > Hi,
> >
> > Is this backported to 22.11.1?
> >
> > Thanks,
> > Navn Srinivas
>
> No, it required a couple of other patches related to management of network
> interface list.
>


Re: [PATCH v4 00/11] remove component-specific logic for AVX builds

2025-03-25 Thread David Marchand
Hello Bruce,

On Wed, Mar 19, 2025 at 7:09 PM Bruce Richardson
 wrote:
>
> On Wed, Mar 19, 2025 at 05:29:30PM +, Bruce Richardson wrote:
> > A number of libs and drivers had special optimized AVX2 and AVX512 code
> > paths for performance reasons, and these tended to have copy-pasted
> > logic to build those files. Centralise that logic in the main
> > drivers/ and lib/ meson.build files to avoid duplication.
> >
> > v4: rebase on latest main branch
> > minor fixes following feedback
> > limit use of -march=skylake-avx512 to when we don't already have a
> >   -march flag supporting AVX512.
> > v3: add patch for event/dlb2 AVX512 handling.
> > add common code for libraries as well as drivers.
> > v2: add patch 4 to remove use of unnecessary CC_AVX2_SUPPORT flag
> >
> A related follow-up to this patchset. Checking with "godbolt.org", it
> appears that both clang 3.6[1] and gcc 5[2] (the minimum called out compiler
> versions in our docs[1]) support the set of AVX-512 compiler flags we use.
> Therefore, it seems we can simplify our code further by removing the
> "cc_has_avx512" variable.

What about https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 ?

You'll need to send a new revision for this series in any case, since
patch 9 broke the crc stuff in the net library.
https://inbox.dpdk.org/dev/CAJFAV8w9wYPN+30Hv=batMvP=0m4momkzgmndfixxbd-9u8...@mail.gmail.com/


-- 
David Marchand



RE: [PATCH] mempool: micro optimizations

2025-03-25 Thread Morten Brørup
PING for review.

@Bruce, you seemed to acknowledge this, but never sent a formal Ack.

Med venlig hilsen / Kind regards,
-Morten Brørup

> -Original Message-
> From: Morten Brørup [mailto:m...@smartsharesystems.com]
> Sent: Wednesday, 26 February 2025 16.59
> To: Andrew Rybchenko; dev@dpdk.org
> Cc: Morten Brørup
> Subject: [PATCH] mempool: micro optimizations
> 
> The comparisons lcore_id < RTE_MAX_LCORE and lcore_id != LCORE_ID_ANY
> are
> equivalent, but the latter compiles to fewer bytes of code space.
> Similarly for lcore_id >= RTE_MAX_LCORE and lcore_id == LCORE_ID_ANY.
> 
> The rte_mempool_get_ops() function is also used in the fast path, so
> RTE_VERIFY() was replaced by RTE_ASSERT().
> 
> Compilers implicitly consider comparisons of variable == 0 likely, so
> unlikely() was added to the check for no mempool cache (mp->cache_size
> ==
> 0) in the rte_mempool_default_cache() function.
> 
> The rte_mempool_do_generic_put() function for adding objects to a
> mempool
> was refactored as follows:
> - The comparison for the request itself being too big, which is
> considered
>   unlikely, was moved down and out of the code path where the cache has
>   sufficient room for the added objects, which is considered the most
>   likely code path.
> - Added __rte_assume() about the cache length, size and threshold, for
>   compiler optimization when "n" is compile time constant.
> - Added __rte_assume() about "ret" being zero, so other functions using
>   the value returned by this function can be potentially optimized by
> the
>   compiler; especially when it merges multiple sequential code paths of
>   inlined code depending on the return value being either zero or
>   negative.
> - The refactored source code (with comments) made the separate comment
>   describing the cache flush/add algorithm superfluous, so it was
> removed.
> 
> A few more likely()/unlikely() were added.
> 
> A few comments were improved for readability.
> 
> Some assertions, RTE_ASSERT(), were added. Most importantly to assert
> that
> the return values of the mempool drivers' enqueue and dequeue
> operations
> are API compliant, i.e. 0 (for success) or negative (for failure), and
> never positive.
> 
> Signed-off-by: Morten Brørup 
> ---
>  lib/mempool/rte_mempool.h | 67 ++-
>  1 file changed, 38 insertions(+), 29 deletions(-)
> 
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index c495cc012f..aedc100964 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -334,7 +334,7 @@ struct __rte_cache_aligned rte_mempool {
>  #ifdef RTE_LIBRTE_MEMPOOL_STATS
>  #define RTE_MEMPOOL_STAT_ADD(mp, name, n) do {
> \
>   unsigned int __lcore_id = rte_lcore_id();
> \
> - if (likely(__lcore_id < RTE_MAX_LCORE))
> \
> + if (likely(__lcore_id != LCORE_ID_ANY))
> \
>   (mp)->stats[__lcore_id].name += (n);
> \
>   else
> \
>   rte_atomic_fetch_add_explicit(&((mp)-
> >stats[RTE_MAX_LCORE].name),  \
> @@ -751,7 +751,7 @@ extern struct rte_mempool_ops_table
> rte_mempool_ops_table;
>  static inline struct rte_mempool_ops *
>  rte_mempool_get_ops(int ops_index)
>  {
> - RTE_VERIFY((ops_index >= 0) && (ops_index <
> RTE_MEMPOOL_MAX_OPS_IDX));
> + RTE_ASSERT((ops_index >= 0) && (ops_index <
> RTE_MEMPOOL_MAX_OPS_IDX));
> 
>   return &rte_mempool_ops_table.ops[ops_index];
>  }
> @@ -791,7 +791,8 @@ rte_mempool_ops_dequeue_bulk(struct rte_mempool
> *mp,
>   rte_mempool_trace_ops_dequeue_bulk(mp, obj_table, n);
>   ops = rte_mempool_get_ops(mp->ops_index);
>   ret = ops->dequeue(mp, obj_table, n);
> - if (ret == 0) {
> + RTE_ASSERT(ret <= 0);
> + if (likely(ret == 0)) {
>   RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_bulk, 1);
>   RTE_MEMPOOL_STAT_ADD(mp, get_common_pool_objs, n);
>   }
> @@ -816,11 +817,14 @@ rte_mempool_ops_dequeue_contig_blocks(struct
> rte_mempool *mp,
>   void **first_obj_table, unsigned int n)
>  {
>   struct rte_mempool_ops *ops;
> + int ret;
> 
>   ops = rte_mempool_get_ops(mp->ops_index);
>   RTE_ASSERT(ops->dequeue_contig_blocks != NULL);
>   rte_mempool_trace_ops_dequeue_contig_blocks(mp, first_obj_table,
> n);
> - return ops->dequeue_contig_blocks(mp, first_obj_table, n);
> + ret = ops->dequeue_contig_blocks(mp, first_obj_table, n);
> + RTE_ASSERT(ret <= 0);
> + return ret;
>  }
> 
>  /**
> @@ -848,6 +852,7 @@ rte_mempool_ops_enqueue_bulk(struct rte_mempool
> *mp, void * const *obj_table,
>   rte_mempool_trace_ops_enqueue_bulk(mp, obj_table, n);
>   ops = rte_mempool_get_ops(mp->ops_index);
>   ret = ops->enqueue(mp, obj_table, n);
> + RTE_ASSERT(ret <= 0);
>  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
>   if (unlikely(ret < 0))
>   RTE_MEMPOOL_LOG(CRIT, "cannot enqueue %u objects to mempool
> %s",
> @@ -1333,10 +1338,10 @@ rte_mem

[PATCH v5 11/11] member: use common AVX512 build support

2025-03-25 Thread Bruce Richardson
Use the support for building AVX512 code present in lib/meson.build
rather than reimplementing it in the library meson.build file.

Signed-off-by: Bruce Richardson 
---
 lib/member/meson.build | 46 +++---
 1 file changed, 7 insertions(+), 39 deletions(-)

diff --git a/lib/member/meson.build b/lib/member/meson.build
index 4341b424df..07f9afaed9 100644
--- a/lib/member/meson.build
+++ b/lib/member/meson.build
@@ -20,44 +20,12 @@ sources = files(
 
 deps += ['hash', 'ring']
 
-# compile AVX512 version if:
-if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
-# compile AVX512 version if either:
-# a. we have AVX512 supported in minimum instruction set
-#baseline
-# b. it's not minimum instruction set, but supported by
-#compiler
-#
-# in former case, just add avx512 C file to files list
-# in latter case, compile c file to static lib, using correct
-# compiler flags, and then have the .o file from static lib
-# linked into main lib.
-
-member_avx512_args = cc_avx512_flags
-if not is_ms_compiler
-member_avx512_args += '-mavx512ifma'
-endif
-
-# check if all required flags already enabled
-sketch_avx512_flags = ['__AVX512F__', '__AVX512DQ__', '__AVX512IFMA__']
-
-sketch_avx512_on = true
-foreach f:sketch_avx512_flags
-if cc.get_define(f, args: machine_args) == ''
-sketch_avx512_on = false
-endif
-endforeach
-
-if sketch_avx512_on == true
-cflags += ['-DCC_AVX512_SUPPORT']
-sources += files('rte_member_sketch_avx512.c')
-elif cc.has_multi_arguments(member_avx512_args)
-sketch_avx512_tmp = static_library('sketch_avx512_tmp',
-'rte_member_sketch_avx512.c',
-include_directories: includes,
-dependencies: [static_rte_eal, static_rte_hash],
-c_args: cflags + member_avx512_args)
-objs += sketch_avx512_tmp.extract_objects('rte_member_sketch_avx512.c')
-cflags += ['-DCC_AVX512_SUPPORT']
+# compile AVX512 version if we have avx512 on MSVC or the 'ifma' flag on 
GCC/Clang
+if dpdk_conf.has('RTE_ARCH_X86_64')
+if is_ms_compiler
+sources_avx512 += files('rte_member_sketch_avx512.c')
+elif cc.has_argument('-mavx512ifma')
+sources_avx512 += files('rte_member_sketch_avx512.c')
+cflags_avx512 += '-mavx512ifma'
 endif
 endif
-- 
2.45.2



Re: [PATCH] drivers/net/mlx5: fix mlx5 send packet failed

2025-03-25 Thread Stephen Hemminger
On Tue, 25 Mar 2025 18:39:00 +0800
Wenbo Liu  wrote:

> Test Environment: ARM architecture, OpenEuler operating system
> CPU: HUAWEI Kunpeng 920 5220, BIOS Vendor ID: HiSilicon
> Network Card: Mellanox Technologies MT27800 Family [ConnectX-5]
> DPDK program sending self-encapsulated packets with MAC, IP, and UDP headers
> continuously prints the following errors and ceases packet transmission
> 
> mlx5_common: Failed to modify SQ using DevX
> mlx5_net: Cannot change the Tx SQ state to RESET Remote I/O error
> 
> Signed-off-by: Wenbo Liu 

Patch has compile failures in CI and coding indent issue reported by checkpatch.


[PATCH v5 00/11] remove component-specific logic for AVX builds

2025-03-25 Thread Bruce Richardson
A number of libs and drivers had special optimized AVX2 and AVX512 code
paths for performance reasons, and these tended to have copy-pasted
logic to build those files. Centralise that logic in the main
drivers/ and lib/ meson.build files to avoid duplication.

v5: fix RTE_ARCH_X86 macro, which broke crc library
v4: rebase on latest main branch
minor fixes following feedback
limit use of -march=skylake-avx512 to when we don't already have a
  -march flag supporting AVX512.
v3: add patch for event/dlb2 AVX512 handling.
add common code for libraries as well as drivers.
v2: add patch 4 to remove use of unnecessary CC_AVX2_SUPPORT flag


Bruce Richardson (11):
  drivers: add generalized AVX build handling
  net/intel: use common AVX build code
  drivers/net: build use common AVX handling
  drivers/net: remove AVX2 build-time define
  event/dlb2: build using common AVX handling
  lib: add generalized AVX build handling
  acl: use common AVX build handling
  fib: use common AVX build handling
  net: simplify build-time logic for x86
  net: use common AVX512 build code
  member: use common AVX512 build support

 drivers/event/dlb2/dlb2_sse.c |  4 ++
 drivers/event/dlb2/meson.build| 16 +---
 drivers/meson.build   | 30 ++
 drivers/net/bnxt/bnxt_ethdev.c|  2 -
 drivers/net/bnxt/meson.build  | 10 +
 drivers/net/enic/meson.build  | 10 +
 drivers/net/intel/i40e/meson.build| 26 +---
 drivers/net/intel/iavf/meson.build| 25 +---
 drivers/net/intel/ice/meson.build | 25 +---
 drivers/net/intel/idpf/meson.build| 25 +---
 drivers/net/nfp/meson.build   | 10 +
 drivers/net/octeon_ep/meson.build | 13 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c |  4 --
 drivers/net/virtio/meson.build|  9 +
 lib/acl/meson.build   | 54 ++---
 lib/fib/dir24_8.c |  6 +--
 lib/fib/meson.build   | 18 +
 lib/fib/trie.c|  6 +--
 lib/member/meson.build| 46 -
 lib/meson.build   | 34 +++-
 lib/net/meson.build   | 58 +++
 lib/net/rte_net_crc.c | 16 
 22 files changed, 114 insertions(+), 333 deletions(-)

--
2.45.2



[PATCH v5 01/11] drivers: add generalized AVX build handling

2025-03-25 Thread Bruce Richardson
Add support to the top-level driver build file for AVX2 and AVX512
specific sources. This should simplify driver builds by avoiding the
need to constantly reimplement the same build logic

Signed-off-by: Bruce Richardson 
---
 drivers/meson.build | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/meson.build b/drivers/meson.build
index 05391a575d..c15319dc24 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -126,6 +126,8 @@ foreach subpath:subdirs
 name = drv
 annotate_locks = true
 sources = []
+sources_avx2 = []
+sources_avx512 = []
 headers = []
 driver_sdk_headers = [] # public headers included by drivers
 objs = []
@@ -235,6 +237,34 @@ foreach subpath:subdirs
 dpdk_includes += include_directories(drv_path)
 endif
 
+# handle avx2 and avx512 source files
+if arch_subdir == 'x86'
+if sources_avx2.length() > 0
+avx2_lib = static_library(lib_name + '_avx2_lib',
+sources_avx2,
+dependencies: static_deps,
+include_directories: includes,
+c_args: [cflags, cc_avx2_flags])
+objs += avx2_lib.extract_objects(sources_avx2)
+endif
+if sources_avx512.length() > 0 and cc_has_avx512
+cflags += '-DCC_AVX512_SUPPORT'
+avx512_args = [cflags, cc_avx512_flags]
+if not target_has_avx512 and 
cc.has_argument('-march=skylake-avx512')
+avx512_args += '-march=skylake-avx512'
+if cc.has_argument('-Wno-overriding-option')
+avx512_args += '-Wno-overriding-option'
+endif
+endif
+avx512_lib = static_library(lib_name + '_avx512_lib',
+sources_avx512,
+dependencies: static_deps,
+include_directories: includes,
+c_args: avx512_args)
+objs += avx512_lib.extract_objects(sources_avx512)
+endif
+endif
+
 # generate pmdinfo sources by building a temporary
 # lib and then running pmdinfogen on the contents of
 # that lib. The final lib reuses the object files and
-- 
2.45.2



[PATCH v5 02/11] net/intel: use common AVX build code

2025-03-25 Thread Bruce Richardson
Remove driver-specific build instructions for the AVX2 and AVX-512 code,
and rely instead on the generic driver build file.

Signed-off-by: Bruce Richardson 
---
 drivers/net/intel/i40e/meson.build | 26 ++
 drivers/net/intel/iavf/meson.build | 25 ++---
 drivers/net/intel/ice/meson.build  | 25 ++---
 drivers/net/intel/idpf/meson.build | 25 ++---
 4 files changed, 8 insertions(+), 93 deletions(-)

diff --git a/drivers/net/intel/i40e/meson.build 
b/drivers/net/intel/i40e/meson.build
index 15993393fb..dae61222cf 100644
--- a/drivers/net/intel/i40e/meson.build
+++ b/drivers/net/intel/i40e/meson.build
@@ -40,31 +40,9 @@ includes += include_directories('base')
 
 if arch_subdir == 'x86'
 sources += files('i40e_rxtx_vec_sse.c')
+sources_avx2 += files('i40e_rxtx_vec_avx2.c')
+sources_avx512 += files('i40e_rxtx_vec_avx512.c')
 
-i40e_avx2_lib = static_library('i40e_avx2_lib',
-'i40e_rxtx_vec_avx2.c',
-dependencies: [static_rte_ethdev, static_rte_kvargs, 
static_rte_hash],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
-objs += i40e_avx2_lib.extract_objects('i40e_rxtx_vec_avx2.c')
-
-if cc_has_avx512
-cflags += ['-DCC_AVX512_SUPPORT']
-avx512_args = cflags + cc_avx512_flags
-if cc.has_argument('-march=skylake-avx512')
-avx512_args += '-march=skylake-avx512'
-if cc.has_argument('-Wno-overriding-option')
-avx512_args += '-Wno-overriding-option'
-endif
-endif
-i40e_avx512_lib = static_library('i40e_avx512_lib',
-'i40e_rxtx_vec_avx512.c',
-dependencies: [static_rte_ethdev,
-static_rte_kvargs, static_rte_hash],
-include_directories: includes,
-c_args: avx512_args)
-objs += i40e_avx512_lib.extract_objects('i40e_rxtx_vec_avx512.c')
-endif
 elif arch_subdir == 'ppc'
sources += files('i40e_rxtx_vec_altivec.c')
 elif arch_subdir == 'arm'
diff --git a/drivers/net/intel/iavf/meson.build 
b/drivers/net/intel/iavf/meson.build
index 833a63e6c8..1ca500c43c 100644
--- a/drivers/net/intel/iavf/meson.build
+++ b/drivers/net/intel/iavf/meson.build
@@ -28,30 +28,9 @@ includes += include_directories('base')
 
 if arch_subdir == 'x86'
 sources += files('iavf_rxtx_vec_sse.c')
+sources_avx2 += files('iavf_rxtx_vec_avx2.c')
+sources_avx512 += files('iavf_rxtx_vec_avx512.c')
 
-iavf_avx2_lib = static_library('iavf_avx2_lib',
-'iavf_rxtx_vec_avx2.c',
-dependencies: [static_rte_ethdev],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
-objs += iavf_avx2_lib.extract_objects('iavf_rxtx_vec_avx2.c')
-
-if cc_has_avx512
-cflags += ['-DCC_AVX512_SUPPORT']
-avx512_args = cflags + cc_avx512_flags
-if cc.has_argument('-march=skylake-avx512')
-avx512_args += '-march=skylake-avx512'
-if cc.has_argument('-Wno-overriding-option')
-avx512_args += '-Wno-overriding-option'
-endif
-endif
-iavf_avx512_lib = static_library('iavf_avx512_lib',
-'iavf_rxtx_vec_avx512.c',
-dependencies: [static_rte_ethdev],
-include_directories: includes,
-c_args: avx512_args)
-objs += iavf_avx512_lib.extract_objects('iavf_rxtx_vec_avx512.c')
-endif
 elif arch_subdir == 'arm'
 sources += files('iavf_rxtx_vec_neon.c')
 endif
diff --git a/drivers/net/intel/ice/meson.build 
b/drivers/net/intel/ice/meson.build
index 4d8f71cd4a..fa6c505450 100644
--- a/drivers/net/intel/ice/meson.build
+++ b/drivers/net/intel/ice/meson.build
@@ -34,30 +34,9 @@ endif
 
 if arch_subdir == 'x86'
 sources += files('ice_rxtx_vec_sse.c')
+sources_avx2 += files('ice_rxtx_vec_avx2.c')
+sources_avx512 += files('ice_rxtx_vec_avx512.c')
 
-ice_avx2_lib = static_library('ice_avx2_lib',
-'ice_rxtx_vec_avx2.c',
-dependencies: [static_rte_ethdev, static_rte_hash],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
-objs += ice_avx2_lib.extract_objects('ice_rxtx_vec_avx2.c')
-
-if cc_has_avx512
-cflags += ['-DCC_AVX512_SUPPORT']
-avx512_args = cflags + cc_avx512_flags
-if cc.has_argument('-march=skylake-avx512')
-avx512_args += '-march=skylake-avx512'
-if cc.has_argument('-Wno-overriding-option')
-avx512_args += '-Wno-overriding-option'
-endif
-endif
-ice_avx512_lib = static_library('ice_avx512_lib',
-'ice_rxtx_vec_avx512.c',
-dependencies: [static_rte_ethdev, static_rte_hash],
-include_directories: includes,
-c_args: avx512_args)
-objs += ice_avx512_lib.

[PATCH v5 03/11] drivers/net: build use common AVX handling

2025-03-25 Thread Bruce Richardson
Remove from remaining net drivers the special-case code to handle AVX2
or AVX512 specific files. These can be built instead using
drivers/meson.build.

Signed-off-by: Bruce Richardson 
---
 drivers/net/bnxt/meson.build  | 10 +-
 drivers/net/enic/meson.build  | 10 +-
 drivers/net/nfp/meson.build   | 10 +-
 drivers/net/octeon_ep/meson.build | 14 ++
 drivers/net/virtio/meson.build|  9 +
 5 files changed, 6 insertions(+), 47 deletions(-)

diff --git a/drivers/net/bnxt/meson.build b/drivers/net/bnxt/meson.build
index fd82d0c409..dcca7df916 100644
--- a/drivers/net/bnxt/meson.build
+++ b/drivers/net/bnxt/meson.build
@@ -58,15 +58,7 @@ subdir('hcapi/cfa_v3')
 
 if arch_subdir == 'x86'
 sources += files('bnxt_rxtx_vec_sse.c')
-# build AVX2 code with instruction set explicitly enabled for runtime 
selection
-bnxt_avx2_lib = static_library('bnxt_avx2_lib',
-'bnxt_rxtx_vec_avx2.c',
-dependencies: [static_rte_ethdev,
-static_rte_bus_pci,
-static_rte_kvargs, static_rte_hash],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
- objs += bnxt_avx2_lib.extract_objects('bnxt_rxtx_vec_avx2.c')
+sources_avx2 = files('bnxt_rxtx_vec_avx2.c')
 elif arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64')
 sources += files('bnxt_rxtx_vec_neon.c')
 endif
diff --git a/drivers/net/enic/meson.build b/drivers/net/enic/meson.build
index cfe5ec170a..2b3052fae8 100644
--- a/drivers/net/enic/meson.build
+++ b/drivers/net/enic/meson.build
@@ -29,17 +29,9 @@ sources = files(
 deps += ['hash']
 includes += include_directories('base')
 
-# Build the avx2 handler for 64-bit X86 targets, even though 'machine'
-# may not. This is to support users who build for the min supported machine
-# and need to run the binary on newer CPUs too.
 if dpdk_conf.has('RTE_ARCH_X86_64')
 cflags += '-DENIC_RXTX_VEC'
-enic_avx2_lib = static_library('enic_avx2_lib',
-'enic_rxtx_vec_avx2.c',
-dependencies: [static_rte_ethdev, static_rte_bus_pci],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
-objs += enic_avx2_lib.extract_objects('enic_rxtx_vec_avx2.c')
+sources_avx2 = files('enic_rxtx_vec_avx2.c')
 endif
 
 annotate_locks = false
diff --git a/drivers/net/nfp/meson.build b/drivers/net/nfp/meson.build
index 0a12b7dce7..a98b584042 100644
--- a/drivers/net/nfp/meson.build
+++ b/drivers/net/nfp/meson.build
@@ -52,19 +52,11 @@ cflags += no_wvla_cflag
 if arch_subdir == 'x86'
 includes += include_directories('../../common/nfp')
 
-avx2_sources = files(
+sources_avx2 = files(
 'nfdk/nfp_nfdk_vec_avx2_dp.c',
 'nfp_rxtx_vec_avx2.c',
 )
 
-nfp_avx2_lib = static_library('nfp_avx2_lib',
-avx2_sources,
-dependencies: [static_rte_ethdev, static_rte_bus_pci],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags]
-)
-
-objs += nfp_avx2_lib.extract_all_objects(recursive: true)
 else
 sources += files(
 'nfp_rxtx_vec_stub.c',
diff --git a/drivers/net/octeon_ep/meson.build 
b/drivers/net/octeon_ep/meson.build
index 1b34db3edc..9bf4627894 100644
--- a/drivers/net/octeon_ep/meson.build
+++ b/drivers/net/octeon_ep/meson.build
@@ -15,18 +15,8 @@ sources = files(
 
 if arch_subdir == 'x86'
 sources += files('cnxk_ep_rx_sse.c')
-if cc.get_define('__AVX2__', args: machine_args) != ''
-cflags += ['-DCC_AVX2_SUPPORT']
-sources += files('cnxk_ep_rx_avx.c')
-elif cc.has_multi_arguments(cc_avx2_flags)
-cflags += ['-DCC_AVX2_SUPPORT']
-otx_ep_avx2_lib = static_library('otx_ep_avx2_lib',
-'cnxk_ep_rx_avx.c',
-dependencies: [static_rte_ethdev, static_rte_pci, 
static_rte_bus_pci],
-include_directories: includes,
-c_args: [cflags, cc_avx2_flags])
-objs += otx_ep_avx2_lib.extract_objects('cnxk_ep_rx_avx.c')
-endif
+cflags += ['-DCC_AVX2_SUPPORT']
+sources_avx2 = files('cnxk_ep_rx_avx.c')
 endif
 
 if arch_subdir == 'arm'
diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build
index c1c4a85bea..01bfb3c47d 100644
--- a/drivers/net/virtio/meson.build
+++ b/drivers/net/virtio/meson.build
@@ -27,15 +27,8 @@ cflags += no_wvla_cflag
 
 if arch_subdir == 'x86'
 if cc_has_avx512
-cflags += ['-DCC_AVX512_SUPPORT']
 cflags += ['-DVIRTIO_RXTX_PACKED_VEC']
-virtio_avx512_lib = static_library('virtio_avx512_lib',
-'virtio_rxtx_packed.c',
-dependencies: [static_rte_ethdev,
-static_rte_kvargs, static_rte_bus_pci],
-include_directories: includes,
-c_args: cflags + cc_avx512_flags)
-o

[PATCH v5 07/11] acl: use common AVX build handling

2025-03-25 Thread Bruce Richardson
remove custom logic for building AVX2 and AVX-512 files.

Signed-off-by: Bruce Richardson 
---
 lib/acl/meson.build | 54 -
 1 file changed, 4 insertions(+), 50 deletions(-)

diff --git a/lib/acl/meson.build b/lib/acl/meson.build
index a80c172812..87e9f25f8e 100644
--- a/lib/acl/meson.build
+++ b/lib/acl/meson.build
@@ -15,57 +15,11 @@ headers = files('rte_acl.h', 'rte_acl_osdep.h')
 
 if dpdk_conf.has('RTE_ARCH_X86')
 sources += files('acl_run_sse.c')
-
-avx2_tmplib = static_library('avx2_tmp',
-'acl_run_avx2.c',
-dependencies: static_rte_eal,
-c_args: [cflags, cc_avx2_flags])
-objs += avx2_tmplib.extract_objects('acl_run_avx2.c')
-
-# compile AVX512 version if:
-# we are building 64-bit binary AND binutils can generate proper code
-
-if dpdk_conf.has('RTE_ARCH_X86_64') and binutils_ok
-
-# compile AVX512 version if either:
-# a. we have AVX512 supported in minimum instruction set
-#baseline
-# b. it's not minimum instruction set, but supported by
-#compiler
-#
-# in former case, just add avx512 C file to files list
-# in latter case, compile c file to static lib, using correct
-# compiler flags, and then have the .o file from static lib
-# linked into main lib.
-
-# check if all required flags already enabled (variant a).
-acl_avx512_flags = ['__AVX512F__', '__AVX512VL__',
-'__AVX512CD__', '__AVX512BW__']
-
-acl_avx512_on = true
-foreach f:acl_avx512_flags
-
-if cc.get_define(f, args: machine_args) == ''
-acl_avx512_on = false
-endif
-endforeach
-
-if acl_avx512_on == true
-
-sources += files('acl_run_avx512.c')
-cflags += '-DCC_AVX512_SUPPORT'
-
-elif cc_has_avx512
-avx512_tmplib = static_library('avx512_tmp',
-'acl_run_avx512.c',
-dependencies: static_rte_eal,
-c_args: cflags + cc_avx512_flags)
-objs += avx512_tmplib.extract_objects(
-'acl_run_avx512.c')
-cflags += '-DCC_AVX512_SUPPORT'
-endif
+sources_avx2 += files('acl_run_avx2.c')
+# AVX512 is only supported on 64-bit builds
+if dpdk_conf.has('RTE_ARCH_X86_64')
+sources_avx512 += files('acl_run_avx512.c')
 endif
-
 elif dpdk_conf.has('RTE_ARCH_ARM')
 cflags += '-flax-vector-conversions'
 sources += files('acl_run_neon.c')
-- 
2.45.2



[PATCH v5 06/11] lib: add generalized AVX build handling

2025-03-25 Thread Bruce Richardson
Add support to the top-level lib build file for AVX2 and AVX512
specific sources. This should simplify library builds by avoiding the
need to constantly reimplement the same build logic

Signed-off-by: Bruce Richardson 
---
 lib/meson.build | 34 +-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/lib/meson.build b/lib/meson.build
index ce92cb5537..e2605e7d68 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -122,6 +122,9 @@ foreach l:libraries
 use_function_versioning = false
 annotate_locks = true
 sources = []
+sources_avx2 = []
+sources_avx512 = []
+cflags_avx512 = [] # extra cflags for the avx512 code, e.g. extra avx512 
feature flags
 headers = []
 indirect_headers = [] # public headers not directly included by apps
 driver_sdk_headers = [] # public headers included by drivers
@@ -242,7 +245,36 @@ foreach l:libraries
 cflags += '-Wthread-safety'
 endif
 
-# first build static lib
+# handle avx2 and avx512 source files
+if arch_subdir == 'x86'
+if sources_avx2.length() > 0
+avx2_lib = static_library(libname + '_avx2_lib',
+sources_avx2,
+dependencies: static_deps,
+include_directories: includes,
+c_args: [cflags, cc_avx2_flags])
+objs += avx2_lib.extract_objects(sources_avx2)
+endif
+if sources_avx512.length() > 0 and cc_has_avx512
+cflags += '-DCC_AVX512_SUPPORT'
+avx512_args = [cflags, cflags_avx512, cc_avx512_flags]
+if not target_has_avx512 and 
cc.has_argument('-march=skylake-avx512')
+avx512_args += '-march=skylake-avx512'
+if cc.has_argument('-Wno-overriding-option')
+avx512_args += '-Wno-overriding-option'
+endif
+endif
+avx512_lib = static_library(libname + '_avx512_lib',
+sources_avx512,
+dependencies: static_deps,
+include_directories: includes,
+c_args: avx512_args)
+objs += avx512_lib.extract_objects(sources_avx512)
+endif
+endif
+
+
+# build static lib
 static_lib = static_library(libname,
 sources,
 objects: objs,
-- 
2.45.2



[PATCH v5 04/11] drivers/net: remove AVX2 build-time define

2025-03-25 Thread Bruce Richardson
Since all supported compilers can generate AVX2 code, we will always
enable the build of the AVX2 files on x86. This means that
CC_AVX2_SUPPORT is always true on x86, so it can be removed and a
regular "#ifdef RTE_ARCH_x86" used in its place.

Signed-off-by: Bruce Richardson 
Acked-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c| 2 --
 drivers/net/octeon_ep/meson.build | 1 -
 drivers/net/octeon_ep/otx_ep_ethdev.c | 4 
 3 files changed, 7 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index a0e3cd8bbe..2f37f5aa10 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3258,8 +3258,6 @@ static const struct {
 #if defined(RTE_ARCH_X86)
{bnxt_crx_pkts_vec, "Vector SSE"},
{bnxt_recv_pkts_vec,"Vector SSE"},
-#endif
-#if defined(RTE_ARCH_X86) && defined(CC_AVX2_SUPPORT)
{bnxt_crx_pkts_vec_avx2,"Vector AVX2"},
{bnxt_recv_pkts_vec_avx2,   "Vector AVX2"},
 #endif
diff --git a/drivers/net/octeon_ep/meson.build 
b/drivers/net/octeon_ep/meson.build
index 9bf4627894..a4a7663d1d 100644
--- a/drivers/net/octeon_ep/meson.build
+++ b/drivers/net/octeon_ep/meson.build
@@ -15,7 +15,6 @@ sources = files(
 
 if arch_subdir == 'x86'
 sources += files('cnxk_ep_rx_sse.c')
-cflags += ['-DCC_AVX2_SUPPORT']
 sources_avx2 = files('cnxk_ep_rx_avx.c')
 endif
 
diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c 
b/drivers/net/octeon_ep/otx_ep_ethdev.c
index 8b14734b0c..10f2f8a2e0 100644
--- a/drivers/net/octeon_ep/otx_ep_ethdev.c
+++ b/drivers/net/octeon_ep/otx_ep_ethdev.c
@@ -91,11 +91,9 @@ otx_ep_set_rx_func(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &cnxk_ep_recv_pkts;
 #ifdef RTE_ARCH_X86
eth_dev->rx_pkt_burst = &cnxk_ep_recv_pkts_sse;
-#ifdef CC_AVX2_SUPPORT
if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256 &&
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1)
eth_dev->rx_pkt_burst = &cnxk_ep_recv_pkts_avx;
-#endif
 #elif defined(RTE_ARCH_ARM64)
eth_dev->rx_pkt_burst = &cnxk_ep_recv_pkts_neon;
 #endif
@@ -105,11 +103,9 @@ otx_ep_set_rx_func(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &cn9k_ep_recv_pkts;
 #ifdef RTE_ARCH_X86
eth_dev->rx_pkt_burst = &cn9k_ep_recv_pkts_sse;
-#ifdef CC_AVX2_SUPPORT
if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256 &&
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1)
eth_dev->rx_pkt_burst = &cn9k_ep_recv_pkts_avx;
-#endif
 #elif defined(RTE_ARCH_ARM64)
eth_dev->rx_pkt_burst = &cn9k_ep_recv_pkts_neon;
 #endif
-- 
2.45.2



[PATCH v5 05/11] event/dlb2: build using common AVX handling

2025-03-25 Thread Bruce Richardson
remove special-case handling for AVX512, and rely on mechanisms in
the drivers meson.build file.

Signed-off-by: Bruce Richardson 
---
 drivers/event/dlb2/dlb2_sse.c  |  4 
 drivers/event/dlb2/meson.build | 16 ++--
 2 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/event/dlb2/dlb2_sse.c b/drivers/event/dlb2/dlb2_sse.c
index f2e1f9fb7e..06474d61dd 100644
--- a/drivers/event/dlb2/dlb2_sse.c
+++ b/drivers/event/dlb2/dlb2_sse.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#ifndef CC_AVX512_SUPPORT
+
 #include "dlb2_priv.h"
 #include "dlb2_iface.h"
 #include "dlb2_inline_fns.h"
@@ -226,3 +228,5 @@ dlb2_event_build_hcws(struct dlb2_port *qm_port,
break;
}
 }
+
+#endif /* no CC_AVX512_SUPPORT */
diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build
index c024edb311..13d0fa544e 100644
--- a/drivers/event/dlb2/meson.build
+++ b/drivers/event/dlb2/meson.build
@@ -20,22 +20,10 @@ sources = files(
 'pf/base/dlb2_resource.c',
 'rte_pmd_dlb2.c',
 'dlb2_selftest.c',
+'dlb2_sse.c',
 )
 
-if target_has_avx512
-cflags += '-DCC_AVX512_SUPPORT'
-sources += files('dlb2_avx512.c')
-
-elif cc_has_avx512
-cflags += '-DCC_AVX512_SUPPORT'
-avx512_tmplib = static_library('avx512_tmp',
-   'dlb2_avx512.c',
-   dependencies: [static_rte_eal, 
static_rte_eventdev],
-   c_args: cflags + cc_avx512_flags)
-objs += avx512_tmplib.extract_objects('dlb2_avx512.c')
-else
-sources += files('dlb2_sse.c')
-endif
+sources_avx512 += files('dlb2_avx512.c')
 
 headers = files('rte_pmd_dlb2.h')
 
-- 
2.45.2



[PATCH v5 08/11] fib: use common AVX build handling

2025-03-25 Thread Bruce Richardson
Remove custom logic for building AVX2 and AVX-512 files. Within the C
code this requires some renaming of build macros to use the standard
defines.

Signed-off-by: Bruce Richardson 
---
 lib/fib/dir24_8.c   |  6 +++---
 lib/fib/meson.build | 18 +-
 lib/fib/trie.c  |  6 +++---
 3 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/lib/fib/dir24_8.c b/lib/fib/dir24_8.c
index c48d962d41..2ba7e93511 100644
--- a/lib/fib/dir24_8.c
+++ b/lib/fib/dir24_8.c
@@ -16,11 +16,11 @@
 #include "dir24_8.h"
 #include "fib_log.h"
 
-#ifdef CC_DIR24_8_AVX512_SUPPORT
+#ifdef CC_AVX512_SUPPORT
 
 #include "dir24_8_avx512.h"
 
-#endif /* CC_DIR24_8_AVX512_SUPPORT */
+#endif /* CC_AVX512_SUPPORT */
 
 #define DIR24_8_NAMESIZE   64
 
@@ -63,7 +63,7 @@ get_scalar_fn_inlined(enum rte_fib_dir24_8_nh_sz nh_sz, bool 
be_addr)
 static inline rte_fib_lookup_fn_t
 get_vector_fn(enum rte_fib_dir24_8_nh_sz nh_sz, bool be_addr)
 {
-#ifdef CC_DIR24_8_AVX512_SUPPORT
+#ifdef CC_AVX512_SUPPORT
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) <= 0 ||
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512DQ) <= 0 ||
rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_512)
diff --git a/lib/fib/meson.build b/lib/fib/meson.build
index 0c19cc8201..55d9ddcee9 100644
--- a/lib/fib/meson.build
+++ b/lib/fib/meson.build
@@ -15,21 +15,5 @@ deps += ['rcu']
 deps += ['net']
 
 if dpdk_conf.has('RTE_ARCH_X86_64')
-if target_has_avx512
-cflags += ['-DCC_DIR24_8_AVX512_SUPPORT', '-DCC_TRIE_AVX512_SUPPORT']
-sources += files('dir24_8_avx512.c', 'trie_avx512.c')
-
-elif cc_has_avx512
-cflags += ['-DCC_DIR24_8_AVX512_SUPPORT', '-DCC_TRIE_AVX512_SUPPORT']
-dir24_8_avx512_tmp = static_library('dir24_8_avx512_tmp',
-'dir24_8_avx512.c',
-dependencies: [static_rte_eal, static_rte_rcu],
-c_args: cflags + cc_avx512_flags)
-objs += dir24_8_avx512_tmp.extract_objects('dir24_8_avx512.c')
-trie_avx512_tmp = static_library('trie_avx512_tmp',
-'trie_avx512.c',
-dependencies: [static_rte_eal, static_rte_rcu, static_rte_net],
-c_args: cflags + cc_avx512_flags)
-objs += trie_avx512_tmp.extract_objects('trie_avx512.c')
-endif
+sources_avx512 = files('dir24_8_avx512.c', 'trie_avx512.c')
 endif
diff --git a/lib/fib/trie.c b/lib/fib/trie.c
index 4893f6c636..6c20057ac5 100644
--- a/lib/fib/trie.c
+++ b/lib/fib/trie.c
@@ -14,11 +14,11 @@
 #include 
 #include "trie.h"
 
-#ifdef CC_TRIE_AVX512_SUPPORT
+#ifdef CC_AVX512_SUPPORT
 
 #include "trie_avx512.h"
 
-#endif /* CC_TRIE_AVX512_SUPPORT */
+#endif /* CC_AVX512_SUPPORT */
 
 #define TRIE_NAMESIZE  64
 
@@ -45,7 +45,7 @@ get_scalar_fn(enum rte_fib_trie_nh_sz nh_sz)
 static inline rte_fib6_lookup_fn_t
 get_vector_fn(enum rte_fib_trie_nh_sz nh_sz)
 {
-#ifdef CC_TRIE_AVX512_SUPPORT
+#ifdef CC_AVX512_SUPPORT
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) <= 0 ||
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512DQ) <= 0 ||
rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) <= 0 ||
-- 
2.45.2



[PATCH v5 10/11] net: use common AVX512 build code

2025-03-25 Thread Bruce Richardson
Use the common support for AVX512 code present in lib/meson.build,
rather than hard-coding it. The only complication is an extra check for
the "-mvpclmulqdq" command-line flag before adding the AVX512 sources.

Signed-off-by: Bruce Richardson 
---
 lib/net/meson.build   | 12 
 lib/net/rte_net_crc.c |  8 
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/lib/net/meson.build b/lib/net/meson.build
index cd49b4d758..7a6c419f40 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -44,14 +44,10 @@ use_function_versioning = true
 if dpdk_conf.has('RTE_ARCH_X86_64')
 sources += files('net_crc_sse.c')
 cflags += ['-mpclmul', '-maes']
-if cc.has_argument('-mvpclmulqdq') and cc_has_avx512
-cflags += ['-DCC_X86_64_AVX512_VPCLMULQDQ_SUPPORT']
-net_crc_avx512_lib = static_library(
-'net_crc_avx512_lib',
-'net_crc_avx512.c',
-dependencies: static_rte_eal,
-c_args: [cflags, cc_avx512_flags, '-mvpclmulqdq'])
-objs += net_crc_avx512_lib.extract_objects('net_crc_avx512.c')
+# only build AVX-512 support if we also have PCLMULQDQ support
+if cc.has_argument('-mvpclmulqdq')
+sources_avx512 += files('net_crc_avx512.c')
+cflags_avx512 += ['-mvpclmulqdq']
 endif
 
 elif (dpdk_conf.has('RTE_ARCH_ARM64') and
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 8d11515712..b5a5dabd4f 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -60,7 +60,7 @@ static const rte_net_crc_handler handlers_scalar[] = {
[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_handler,
[RTE_NET_CRC32_ETH] = rte_crc32_eth_handler,
 };
-#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+#ifdef CC_AVX512_SUPPORT
 static const rte_net_crc_handler handlers_avx512[] = {
[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_avx512_handler,
[RTE_NET_CRC32_ETH] = rte_crc32_eth_avx512_handler,
@@ -185,7 +185,7 @@ rte_crc32_eth_handler(const uint8_t *data, uint32_t 
data_len)
 static const rte_net_crc_handler *
 avx512_vpclmulqdq_get_handlers(void)
 {
-#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+#ifdef CC_AVX512_SUPPORT
if (AVX512_VPCLMULQDQ_CPU_SUPPORTED &&
max_simd_bitwidth >= RTE_VECT_SIMD_512)
return handlers_avx512;
@@ -197,7 +197,7 @@ avx512_vpclmulqdq_get_handlers(void)
 static void
 avx512_vpclmulqdq_init(void)
 {
-#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+#ifdef CC_AVX512_SUPPORT
if (AVX512_VPCLMULQDQ_CPU_SUPPORTED)
rte_net_crc_avx512_init();
 #endif
@@ -305,7 +305,7 @@ handlers_init(enum rte_net_crc_alg alg)
 
switch (alg) {
case RTE_NET_CRC_AVX512:
-#ifdef CC_X86_64_AVX512_VPCLMULQDQ_SUPPORT
+#ifdef CC_AVX512_SUPPORT
if (AVX512_VPCLMULQDQ_CPU_SUPPORTED) {
handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
rte_crc16_ccitt_avx512_handler;
-- 
2.45.2



[PATCH v5 09/11] net: simplify build-time logic for x86

2025-03-25 Thread Bruce Richardson
All DPDK-supported versions of clang and gcc have the "-mpclmul" and
"-maes" flags, so we never need to check for those. This allows the SSE
code path to be unconditionally built on x86.

For the AVX512 code path, simplify it by only checking for the
build-time support, and always doing a separate build with AVX512
support when that compiler support is present.

Signed-off-by: Bruce Richardson 
---
 lib/net/meson.build   | 52 +--
 lib/net/rte_net_crc.c |  8 +++
 2 files changed, 9 insertions(+), 51 deletions(-)

diff --git a/lib/net/meson.build b/lib/net/meson.build
index c9b34afc98..cd49b4d758 100644
--- a/lib/net/meson.build
+++ b/lib/net/meson.build
@@ -42,57 +42,15 @@ deps += ['mbuf']
 use_function_versioning = true
 
 if dpdk_conf.has('RTE_ARCH_X86_64')
-net_crc_sse42_cpu_support = (cc.get_define('__PCLMUL__', args: 
machine_args) != '')
-net_crc_avx512_cpu_support = (
-target_has_avx512 and
-cc.get_define('__VPCLMULQDQ__', args: machine_args) != ''
-)
-
-net_crc_sse42_cc_support = (cc.has_argument('-mpclmul') and 
cc.has_argument('-maes'))
-net_crc_avx512_cc_support = (cc.has_argument('-mvpclmulqdq') and 
cc_has_avx512)
-
-build_static_net_crc_sse42_lib = 0
-build_static_net_crc_avx512_lib = 0
-
-if net_crc_sse42_cpu_support == true
-sources += files('net_crc_sse.c')
-cflags += ['-DCC_X86_64_SSE42_PCLMULQDQ_SUPPORT']
-if net_crc_avx512_cpu_support == true
-sources += files('net_crc_avx512.c')
-cflags += ['-DCC_X86_64_AVX512_VPCLMULQDQ_SUPPORT']
-elif net_crc_avx512_cc_support == true
-build_static_net_crc_avx512_lib = 1
-net_crc_avx512_lib_cflags = cc_avx512_flags + ['-mvpclmulqdq']
-cflags += ['-DCC_X86_64_AVX512_VPCLMULQDQ_SUPPORT']
-endif
-elif net_crc_sse42_cc_support == true
-build_static_net_crc_sse42_lib = 1
-net_crc_sse42_lib_cflags = ['-mpclmul', '-maes']
-cflags += ['-DCC_X86_64_SSE42_PCLMULQDQ_SUPPORT']
-if net_crc_avx512_cc_support == true
-build_static_net_crc_avx512_lib = 1
-net_crc_avx512_lib_cflags = cc_avx512_flags + ['-mvpclmulqdq', 
'-mpclmul']
-cflags += ['-DCC_X86_64_AVX512_VPCLMULQDQ_SUPPORT']
-endif
-endif
-
-if build_static_net_crc_sse42_lib == 1
-net_crc_sse42_lib = static_library(
-'net_crc_sse42_lib',
-'net_crc_sse.c',
-dependencies: static_rte_eal,
-c_args: [cflags,
-net_crc_sse42_lib_cflags])
-objs += net_crc_sse42_lib.extract_objects('net_crc_sse.c')
-endif
-
-if build_static_net_crc_avx512_lib == 1
+sources += files('net_crc_sse.c')
+cflags += ['-mpclmul', '-maes']
+if cc.has_argument('-mvpclmulqdq') and cc_has_avx512
+cflags += ['-DCC_X86_64_AVX512_VPCLMULQDQ_SUPPORT']
 net_crc_avx512_lib = static_library(
 'net_crc_avx512_lib',
 'net_crc_avx512.c',
 dependencies: static_rte_eal,
-c_args: [cflags,
-net_crc_avx512_lib_cflags])
+c_args: [cflags, cc_avx512_flags, '-mvpclmulqdq'])
 objs += net_crc_avx512_lib.extract_objects('net_crc_avx512.c')
 endif
 
diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c
index 2fb3eec231..8d11515712 100644
--- a/lib/net/rte_net_crc.c
+++ b/lib/net/rte_net_crc.c
@@ -66,7 +66,7 @@ static const rte_net_crc_handler handlers_avx512[] = {
[RTE_NET_CRC32_ETH] = rte_crc32_eth_avx512_handler,
 };
 #endif
-#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+#ifdef RTE_ARCH_X86_64
 static const rte_net_crc_handler handlers_sse42[] = {
[RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_sse42_handler,
[RTE_NET_CRC32_ETH] = rte_crc32_eth_sse42_handler,
@@ -211,7 +211,7 @@ avx512_vpclmulqdq_init(void)
 static const rte_net_crc_handler *
 sse42_pclmulqdq_get_handlers(void)
 {
-#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+#ifdef RTE_ARCH_X86_64
if (SSE42_PCLMULQDQ_CPU_SUPPORTED &&
max_simd_bitwidth >= RTE_VECT_SIMD_128)
return handlers_sse42;
@@ -223,7 +223,7 @@ sse42_pclmulqdq_get_handlers(void)
 static void
 sse42_pclmulqdq_init(void)
 {
-#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+#ifdef RTE_ARCH_X86_64
if (SSE42_PCLMULQDQ_CPU_SUPPORTED)
rte_net_crc_sse42_init();
 #endif
@@ -316,7 +316,7 @@ handlers_init(enum rte_net_crc_alg alg)
 #endif
/* fall-through */
case RTE_NET_CRC_SSE42:
-#ifdef CC_X86_64_SSE42_PCLMULQDQ_SUPPORT
+#ifdef RTE_ARCH_X86_64
if (SSE42_PCLMULQDQ_CPU_SUPPORTED) {
handlers_dpdk26[alg].f[RTE_NET_CRC16_CCITT] =
rte_crc16_ccitt_sse42_handler;
-- 
2.45.2



Re: [PATCH] doc: add tested platforms with NVIDIA NICs

2025-03-25 Thread Thomas Monjalon
20/03/2025 09:44, Raslan Darawsheh:
> Add tested platforms with NVIDIA NICs to the 25.03 release notes.
> 
> Signed-off-by: Raslan Darawsheh 

Applied, thanks.






RE: [PATCH] mempool perf test: test random bulk sizes

2025-03-25 Thread Morten Brørup
Second PING for review.

Med venlig hilsen / Kind regards,
-Morten Brørup


> From: Morten Brørup [mailto:m...@smartsharesystems.com]
> Sent: Thursday, 13 March 2025 09.23
> 
> PING for review.
> 
> This could still make it into 25.03-rc3 (deadline: 14 March 2025).
> 
> Med venlig hilsen / Kind regards,
> -Morten Brørup
> 
> 
> > From: Morten Brørup [mailto:m...@smartsharesystems.com]
> > Sent: Friday, 28 February 2025 17.49
> >
> > Bulk requests to get or put objects in a mempool often vary in size.
> > A series of tests with pseudo random request sizes, to mitigate the
> > benefits of the CPU's dynamic branch predictor, was added.
> >
> > Also, various other minor changes:
> > - Improved the output formatting for readability.
> > - Added test for the "default" mempool with cache.
> > - Skip the tests for the "default" mempool, if it happens to use the
> > same
> >   driver (i.e. operations) as already tested.
> > - Replaced bare use of "unsigned" with "unsigned int",
> >   to make checkpatches happy.
> >
> > Signed-off-by: Morten Brørup 
> > ---
> >  app/test/test_mempool_perf.c | 219 +++--
> --
> >  1 file changed, 172 insertions(+), 47 deletions(-)
> >
> > diff --git a/app/test/test_mempool_perf.c
> > b/app/test/test_mempool_perf.c
> > index 4dd74ef75a..5e29797f02 100644
> > --- a/app/test/test_mempool_perf.c
> > +++ b/app/test/test_mempool_perf.c
> > @@ -33,6 +33,13 @@
> >   * Mempool performance
> >   * ===
> >   *
> > + *Each core get *n_keep* objects per bulk of a pseudorandom
> number
> > + *between 1 and *n_max_bulk*.
> > + *Objects are put back in the pool per bulk of a similar
> > pseudorandom number.
> > + *Note: The very low entropy of the randomization algorithm is
> > harmless, because
> > + *  the sole purpose of randomization is to prevent the
> CPU's
> > dynamic branch
> > + *  predictor from enhancing the test results.
> > + *
> >   *Each core get *n_keep* objects per bulk of *n_get_bulk*. Then,
> >   *objects are put back in the pool per bulk of *n_put_bulk*.
> >   *
> > @@ -52,7 +59,12 @@
> >   *  - Two cores with user-owned cache
> >   *  - Max. cores with user-owned cache
> >   *
> > - *- Bulk size (*n_get_bulk*, *n_put_bulk*)
> > + *- Pseudorandom max bulk size (*n_max_bulk*)
> > + *
> > + *  - Max bulk from CACHE_LINE_BURST to 256, and
> > RTE_MEMPOOL_CACHE_MAX_SIZE,
> > + *where CACHE_LINE_BURST is the number of pointers fitting
> > into one CPU cache line.
> > + *
> > + *- Fixed bulk size (*n_get_bulk*, *n_put_bulk*)
> >   *
> >   *  - Bulk get from 1 to 256, and RTE_MEMPOOL_CACHE_MAX_SIZE
> >   *  - Bulk put from 1 to 256, and RTE_MEMPOOL_CACHE_MAX_SIZE
> > @@ -89,16 +101,19 @@
> > } while (0)
> >
> >  static int use_external_cache;
> > -static unsigned external_cache_size = RTE_MEMPOOL_CACHE_MAX_SIZE;
> > +static unsigned int external_cache_size =
> RTE_MEMPOOL_CACHE_MAX_SIZE;
> >
> >  static RTE_ATOMIC(uint32_t) synchro;
> >
> > +/* max random number of objects in one bulk operation (get and put)
> */
> > +static unsigned int n_max_bulk;
> > +
> >  /* number of objects in one bulk operation (get or put) */
> > -static unsigned n_get_bulk;
> > -static unsigned n_put_bulk;
> > +static unsigned int n_get_bulk;
> > +static unsigned int n_put_bulk;
> >
> >  /* number of objects retrieved from mempool before putting them back
> > */
> > -static unsigned n_keep;
> > +static unsigned int n_keep;
> >
> >  /* true if we want to test with constant n_get_bulk and n_put_bulk
> */
> >  static int use_constant_values;
> > @@ -118,7 +133,7 @@ static struct mempool_test_stats
> > stats[RTE_MAX_LCORE];
> >   */
> >  static void
> >  my_obj_init(struct rte_mempool *mp, __rte_unused void *arg,
> > -   void *obj, unsigned i)
> > +   void *obj, unsigned int i)
> >  {
> > uint32_t *objnum = obj;
> > memset(obj, 0, mp->elt_size);
> > @@ -159,11 +174,55 @@ test_loop(struct rte_mempool *mp, struct
> > rte_mempool_cache *cache,
> > return 0;
> >  }
> >
> > +static __rte_always_inline int
> > +test_loop_random(struct rte_mempool *mp, struct rte_mempool_cache
> > *cache,
> > + unsigned int x_keep, unsigned int x_max_bulk)
> > +{
> > +   alignas(RTE_CACHE_LINE_SIZE) void *obj_table[MAX_KEEP];
> > +   unsigned int idx;
> > +   unsigned int i;
> > +   unsigned int r = 0;
> > +   unsigned int x_bulk;
> > +   int ret;
> > +
> > +   for (i = 0; likely(i < (N / x_keep)); i++) {
> > +   /* get x_keep objects by bulk of random [1 .. x_max_bulk]
> > */
> > +   for (idx = 0; idx < x_keep; idx += x_bulk, r++) {
> > +   /* Generate a pseudorandom number [1 .. x_max_bulk].
> > */
> > +   x_bulk = ((r ^ (r >> 2) ^ (r << 3)) & (x_max_bulk -
> > 1)) + 1;
> > +   if (unlikely(idx + x_bulk > x_keep))
> > +   x_bulk = x_keep - idx;
> > +   ret = rte_mempool_generic_get(mp,
> > +  

Re: release candidate 25.03-rc3

2025-03-25 Thread Wael Abualrub

 Hi,

> -Original Message-
> From: Thomas Monjalon 
> Date: Wednesday, 19 March 2025 at 6:35
> To: annou...@dpdk.org 
> Subject: release candidate 25.03-rc3
>A new DPDK release candidate is ready for testing:
>   https://git.dpdk.org/dpdk/tag/?id=v25.03-rc3
>
> There are 71 new patches in this snapshot.
>
> Release notes:
>https://doc.dpdk.org/guides/rel_notes/release_25_03.html
>
> Please test and report new issues on https://bugs.dpdk.org
>
> If no major blocker is discovered,
> this release cycle may be closed at the end of the week.
>
> Thank you everyone
The following is a list of tests that we ran on NVIDIA hardware this release:
Note: all tests are passed, and no critical issues were found.

- Basic functionality:
   Send and receive multiple types of traffic.
- testpmd xstats counter test.
- testpmd timestamp test.
- Changing/checking link status through testpmd.
- RTE flow tests:
https://doc.dpdk.org/guides/nics/mlx5.html#supported-hardware-offloads
- RSS testing.
- VLAN filtering, stripping and insertion tests.
- Checksum and TSO tests.
- ptype reporting.
- link status interrupt using the example application link_status_interrupt 
tests.
- Interrupt mode using l3fwd-power example application tests.
- Multi process testing using multi process example applications.
- Hardware LRO tests.
- Buffer Split.
- Tx scheduling.

We don't see new issues caused by changes in this release.

Kindest Regards,
Wael Abualrub


[RFC PATCH] build: reduce use of AVX compiler flags

2025-03-25 Thread Bruce Richardson
When doing a build for a target that already has the instruction sets
for AVX2/AVX512 enabled, skip emitting the AVX compiler flags, or the
skylake-avx512 '-march' flags, as they are unnecessary. Instead, when
the default flags produce the desired output, just use them unmodified.

Depends-on: series-34915 ("remove component-specific logic for AVX builds")

Signed-off-by: Bruce Richardson 
---

This patchset depends on the previous AVX rework. However, sending it
separately as a new RFC because it effectively increases the minimum
compiler versions needed for x86 builds - from GCC 5 to 6, and
Clang 3.6 to 3.9.

For now, I've just documented that as an additional note in the GSG that
these versions are recommended, but it would be simpler if we could just
set them as the required minimum baseline (at least in the docs).

Feedback on these compiler version requirements welcome.

---
 config/x86/meson.build| 31 ---
 doc/guides/linux_gsg/sys_reqs.rst |  8 
 drivers/meson.build   |  9 +
 lib/meson.build   |  9 +
 4 files changed, 30 insertions(+), 27 deletions(-)

diff --git a/config/x86/meson.build b/config/x86/meson.build
index c3564b0011..97f790b0d4 100644
--- a/config/x86/meson.build
+++ b/config/x86/meson.build
@@ -4,11 +4,13 @@
 if is_ms_compiler
 cc_avx2_flags = ['/arch:AVX2']
 else
-cc_avx2_flags = ['-mavx2']
+cc_avx2_flags = []
+if cc.get_define('__AVX2__', args: machine_args) == ''
+cc_avx2_flags = ['-mavx2']
+endif
 endif

 cc_has_avx512 = false
-target_has_avx512 = false

 dpdk_conf.set('RTE_ARCH_X86', 1)
 if dpdk_conf.get('RTE_ARCH_64')
@@ -65,26 +67,33 @@ if is_linux or cc.get_id() == 'gcc'
 endif
 endif

-cc_avx512_flags = ['-mavx512f', '-mavx512vl', '-mavx512dq', '-mavx512bw', 
'-mavx512cd']
-if (binutils_ok and cc.has_multi_arguments(cc_avx512_flags)
+avx512_march_flag = '-march=skylake-avx512'
+cc_avx512_flags = []
+if (binutils_ok and cc.has_argument(avx512_march_flag)
 and '-mno-avx512f' not in get_option('c_args'))
 # check if compiler is working with _mm512_extracti64x4_epi64
 # Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82887
 code = '''#include 
 void test(__m512i zmm){
 __m256i ymm = _mm512_extracti64x4_epi64(zmm, 0);}'''
-result = cc.compiles(code, args : cc_avx512_flags, name : 'AVX512 
checking')
+result = cc.compiles(code, args : [avx512_march_flag], name : 'AVX512 
checking')
 if result == false
 machine_args += '-mno-avx512f'
 warning('Broken _mm512_extracti64x4_epi64, disabling AVX512 support')
 else
 cc_has_avx512 = true
-target_has_avx512 = (
-cc.get_define('__AVX512F__', args: machine_args) != '' and
-cc.get_define('__AVX512BW__', args: machine_args) != '' and
-cc.get_define('__AVX512DQ__', args: machine_args) != '' and
-cc.get_define('__AVX512VL__', args: machine_args) != ''
-)
+if cc.get_define('__AVX512F__', args: machine_args) == ''
+cc_avx512_flags = [avx512_march_flag]
+if cc.has_argument('-Wno-overriding-option')
+cc_avx512_args += '-Wno-overriding-option'
+endif
+endif
+endif
+endif
+if developer_mode
+message('Extra C flags needed for AVX2 output: @0@'.format(cc_avx2_flags))
+if cc_has_avx512
+message('Extra C flags needed for AVX512 output: 
@0@'.format(cc_avx512_flags))
 endif
 endif

diff --git a/doc/guides/linux_gsg/sys_reqs.rst 
b/doc/guides/linux_gsg/sys_reqs.rst
index 5a7d9e4a43..55e9fe4724 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -35,6 +35,14 @@ Compilation of the DPDK
 * For Ubuntu/Debian systems these can be installed using ``apt install 
build-essential``
 * For Alpine Linux, ``apk add alpine-sdk bsd-compat-headers``

+.. note::
+
+   When compiling for x86 platforms,
+   GCC version 6.1 or higher,
+   or Clang version 3.9 or higher is recommended.
+   Earlier versions of these compilers do not support the compiler flags used 
by DPDK for AVX-512 code.
+   As such, any builds using earlier compilers will be missing AVX-512 support.
+
 .. note::

pkg-config 0.27, supplied with RHEL-7,
diff --git a/drivers/meson.build b/drivers/meson.build
index c15319dc24..bb33b0a7a0 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -249,18 +249,11 @@ foreach subpath:subdirs
 endif
 if sources_avx512.length() > 0 and cc_has_avx512
 cflags += '-DCC_AVX512_SUPPORT'
-avx512_args = [cflags, cc_avx512_flags]
-if not target_has_avx512 and 
cc.has_argument('-march=skylake-avx512')
-avx512_args += '-march=skylake-avx512'
-if cc.has_argument('-Wno-overriding-option')
-avx512_args += '-Wno-overriding-option'

Re: [PATCH v4 00/11] remove component-specific logic for AVX builds

2025-03-25 Thread Bruce Richardson
On Tue, Mar 25, 2025 at 08:46:35AM +0100, David Marchand wrote:
> Hello Bruce,
> 
> On Wed, Mar 19, 2025 at 7:09 PM Bruce Richardson
>  wrote:
> >
> > On Wed, Mar 19, 2025 at 05:29:30PM +, Bruce Richardson wrote:
> > > A number of libs and drivers had special optimized AVX2 and AVX512 code
> > > paths for performance reasons, and these tended to have copy-pasted
> > > logic to build those files. Centralise that logic in the main
> > > drivers/ and lib/ meson.build files to avoid duplication.
> > >
> > > v4: rebase on latest main branch
> > > minor fixes following feedback
> > > limit use of -march=skylake-avx512 to when we don't already have a
> > >   -march flag supporting AVX512.
> > > v3: add patch for event/dlb2 AVX512 handling.
> > > add common code for libraries as well as drivers.
> > > v2: add patch 4 to remove use of unnecessary CC_AVX2_SUPPORT flag
> > >
> > A related follow-up to this patchset. Checking with "godbolt.org", it
> > appears that both clang 3.6[1] and gcc 5[2] (the minimum called out compiler
> > versions in our docs[1]) support the set of AVX-512 compiler flags we use.
> > Therefore, it seems we can simplify our code further by removing the
> > "cc_has_avx512" variable.
> 
> What about https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 ?
> 

Yep, still needs to be handled, something I only realised after sending the
email.

> You'll need to send a new revision for this series in any case, since
> patch 9 broke the crc stuff in the net library.
> https://inbox.dpdk.org/dev/CAJFAV8w9wYPN+30Hv=batMvP=0m4momkzgmndfixxbd-9u8...@mail.gmail.com/
> 
Yes, I saw that and just started looking at it last evening. Will hopefully
get a new revision out soon.

/Bruce


Re: [PATCH] doc/bluefield: add comparison between BlueField versions

2025-03-25 Thread Thomas Monjalon
19/03/2025 13:13, Raslan Darawsheh:
> Updated BlueField-3 documentation to include a detailed comparison
> with BlueField-2 and added notes on compiler requirements.
> 
> Signed-off-by: Raslan Darawsheh 

Applied, thanks.





Re: [PATCH 2/2] app/dma-perf: fix infinite loop

2025-03-25 Thread Stephen Hemminger
On Fri, 21 Mar 2025 12:03:16 +0800
Dengdui Huang  wrote:

> When a core that is not used by the rte is specified in the config
> for testing, the problem of infinite loop occurs. The root cause
> is that the program waits for the completion of the test task when
> the test worker fails to be started on the lcore. This patch fix it.
> 
> Fixes: 533d7e7f66f3 ("app/dma-perf: support config per device")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Dengdui Huang 
> ---
>  app/test-dma-perf/benchmark.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-dma-perf/benchmark.c b/app/test-dma-perf/benchmark.c
> index 6d617ea200..351c1c966e 100644
> --- a/app/test-dma-perf/benchmark.c
> +++ b/app/test-dma-perf/benchmark.c
> @@ -751,7 +751,10 @@ mem_copy_benchmark(struct test_configure *cfg)
>   goto out;
>   }
>  
> - rte_eal_remote_launch(get_work_function(cfg), (void 
> *)(lcores[i]), lcore_id);
> + if (rte_eal_remote_launch(get_work_function(cfg), (void 
> *)(lcores[i]), lcore_id)) {
> + printf("Error: Fail to start the test on lcore %d\n", 
> lcore_id);

Convention is to log errors on stderr and lcore_id is unsigned not signed value.