RE: [PATCH v1 1/1] crypto/ipsec_mb: unify some IPsec MB PMDs

2024-06-20 Thread Akhil Goyal
Hi Pablo,
> > Subject: [PATCH v1 1/1] crypto/ipsec_mb: unify some IPsec MB PMDs
> >
> > Currently IPsec MB provides both the JOB API and direct API.
> > AESNI_MB PMD is using the JOB API codepath while KASUMI and
> > CHACHA20_POLY1305 are using the direct API.
> > Instead of using the direct API for these PMDs, they should now make use of
> > the JOB API codepath. This would remove all use of the IPsec MB direct API
> > for these PMDs.
> >
> > Signed-off-by: Brian Dooley 
> > ---
> 
> Acked-by: Pablo de Lara 

A v2 has been sent for this patch. Please check that too.
https://patches.dpdk.org/project/dpdk/patch/20240614140759.1123505-2-brian.doo...@intel.com/


Re: [PATCH v5 5/5] doc: update release notes for 24.07

2024-06-20 Thread David Marchand
On Wed, Jun 19, 2024 at 11:02 PM Abdullah Sevincer
 wrote:
>
> Update release notes for new DLB features.
>
> Signed-off-by: Abdullah Sevincer 
> ---
>  doc/guides/rel_notes/release_24_07.rst | 32 ++
>  1 file changed, 32 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_24_07.rst 
> b/doc/guides/rel_notes/release_24_07.rst
> index 7c88de381b..b4eb819503 100644
> --- a/doc/guides/rel_notes/release_24_07.rst
> +++ b/doc/guides/rel_notes/release_24_07.rst
> @@ -144,6 +144,38 @@ New Features
>
>Added an API that allows the user to reclaim the defer queue with RCU.
>
> +* **Added API to support HW delayed token feature for DLB 2.5 device.**
> +
> +  * Added API ``rte_pmd_dlb2_set_port_params`` to support delayed token
> +feature for DLB 2.5 device. The feature will resume CQ scheduling
> +when the number of pending completions fall below a configured
> +threshold.
> +
> +* **Introduced dynamic HL (History List) feature for DLB device.**
> +
> +  * Users can configure history list entries dynamically by passing
> +parameters ``use_default_hl`` and ``alloc_hl_entries``.
> +
> +  * When 'use_default_hl = 1', Per port HL is set to
> +DLB2_FIXED_CQ_HL_SIZE (32) and command line parameter
> +alloc_hl_entries is ignored.
> +
> +  * When 'use_default_hl = 0', Per LDB port HL = 2 * CQ depth and per
> +port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE.
> +
> +* **DLB credit handling scenario improvements.**
> +
> +  * When ports hold on to credits but can't release them due to insufficient
> +accumulation (less than 2 * credit quanta) deadlocks may occur.
> +Improvement made for worker ports to release all accumulated credits when
> +back-to-back zero poll count reaches preset threshold and producer ports
> +release all accumulated credits if enqueue fails for a consecutive number
> +of retries.
> +
> +  * New meson options are provided for handling credits. Valid options
> +are ``bypass_fence``, ``hw_credits_checks``, ``sw_credits_checks`` and
> +``type_check``. These options need to be provided in meson in comma
> +separated form.
>

Those 3 entries can be gathered under a single item about the dlb2 driver.
Like:

* **Updated dlb2 eventdev driver.**

  * Added API ``rte_pmd_dlb2_set_port_params`` to support delayed token...
...

  * Introduced dynamic HL (History List) feature for DLB device...
...
etc...


Besides, those doc updates should be split and go with the patches
that introduce the features.
This comment applies to the previous doc patch too.

Thanks.


-- 
David Marchand



RE: [PATCH v1 0/4] test/crypto: enhance modex tests

2024-06-20 Thread Akhil Goyal
> Subject: [PATCH v1 0/4] test/crypto: enhance modex tests
> 
> This patch series enhances modex tests to:
>  * use common test function in existing test vectors
>  * add test for zero padded operands
> 
> Gowrishankar Muthukrishnan (4):
>   test/crypto: validate modex result from first nonzero value
>   test/crypto: remove unused variable in modex test data
>   test/crypto: use common test function for mod tests
>   test/crypto: add modex tests for zero padded operands
> 
>  app/test/test_cryptodev_asym.c | 279 
>  app/test/test_cryptodev_asym_util.h|  18 --
>  app/test/test_cryptodev_mod_test_vectors.h | 287 ++---
>  3 files changed, 241 insertions(+), 343 deletions(-)
> 
Please check compilation issues reported by CI.
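
For readers skimming the series: the idea of validating a modex result "from first nonzero value" can be pictured with a small comparison helper. This is only a sketch under assumptions. The helper names (skip_leading_zeros, bignum_equal) are hypothetical and are not taken from app/test; operands are assumed to be big-endian byte arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: skip leading zero bytes of a big-endian integer.
 * Returns the number of significant bytes and stores their start in *start. */
static size_t
skip_leading_zeros(const uint8_t *buf, size_t len, const uint8_t **start)
{
	size_t i = 0;

	while (i < len && buf[i] == 0)
		i++;
	*start = buf + i;
	return len - i;
}

/* Hypothetical helper: return 1 if both buffers encode the same big-endian
 * value, regardless of how much zero padding each one carries. */
static int
bignum_equal(const uint8_t *a, size_t a_len, const uint8_t *b, size_t b_len)
{
	const uint8_t *a_sig, *b_sig;
	size_t a_n = skip_leading_zeros(a, a_len, &a_sig);
	size_t b_n = skip_leading_zeros(b, b_len, &b_sig);

	if (a_n != b_n)
		return 0;
	/* Two all-zero operands are equal; otherwise compare significant bytes. */
	return a_n == 0 || memcmp(a_sig, b_sig, a_n) == 0;
}
```

With a comparison of this shape, a device that returns a zero-padded result and one that returns a trimmed result would both match the same expected vector.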


RE: [PATCH v5] net/af_xdp: parse umem map info from mempool range api

2024-06-20 Thread Morten Brørup
> From: Frank Du [mailto:frank...@intel.com]
> 
> The current calculation assumes that the mbufs are contiguous. However,
> this assumption is incorrect when the mbuf memory spans across huge pages.
> 
> Correct this by reading the range directly with the mempool get range API.
> 
> Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Frank Du 
> 
> ---
> v2:
> * Add virtual contiguous detection for multiple memhdrs
> v3:
> * Use RTE_ALIGN_FLOOR to get the aligned addr
> * Add check on the first memhdr of memory chunks
> v4:
> * Replace the iterating with simple nb_mem_chunks check
> v5:
> * Use rte_mempool_get_mem_range to query the mempool range
> ---

Acked-by: Morten Brørup 
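
As a rough illustration of what querying a "memory range" of a pool involves, the sketch below computes the lowest address, the total span, and a contiguity flag over a set of memory chunks, plus the align-down step mentioned in the v3 changelog. Everything here is an assumption for illustration: struct mem_chunk, struct mem_range, pool_mem_range() and align_floor() are hypothetical stand-ins, not the DPDK mempool API or the af_xdp driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory chunk backing a pool. */
struct mem_chunk {
	uintptr_t addr;
	size_t len;
};

/* Hypothetical result type: span covered by all chunks. */
struct mem_range {
	uintptr_t start;
	size_t length;
	int is_contiguous;
};

/* Compute [lowest start, highest end) over all chunks. The span is treated as
 * contiguous when the chunk lengths add up to the span (assumes the chunks do
 * not overlap each other). Returns -1 when there are no chunks. */
static int
pool_mem_range(const struct mem_chunk *chunks, size_t n, struct mem_range *out)
{
	uintptr_t lo, hi;
	size_t total = 0;
	size_t i;

	if (n == 0)
		return -1;

	lo = chunks[0].addr;
	hi = chunks[0].addr + chunks[0].len;
	for (i = 0; i < n; i++) {
		uintptr_t end = chunks[i].addr + chunks[i].len;

		if (chunks[i].addr < lo)
			lo = chunks[i].addr;
		if (end > hi)
			hi = end;
		total += chunks[i].len;
	}

	out->start = lo;
	out->length = hi - lo;
	out->is_contiguous = (total == out->length);
	return 0;
}

/* Round an address down to a power-of-two alignment (e.g. a page size). */
static uintptr_t
align_floor(uintptr_t addr, uintptr_t align)
{
	return addr & ~(align - 1);
}
```

The point of the sketch is that the umem base and size come from the actual chunk layout rather than from an assumption that consecutive mbufs are adjacent in memory.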



[PATCH v2 0/6] Optionally have rte_memcpy delegate to compiler memcpy

2024-06-20 Thread Mattias Rönnblom
This patch set makes DPDK library, driver, and application code use the
compiler/libc memcpy() by default when functions in  are
invoked.

The various custom DPDK rte_memcpy() implementations may be retained
by means of a build-time option.
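
A minimal sketch of the delegation idea, for illustration only. The macro EXAMPLE_USE_CC_MEMCPY and the functions example_memcpy() and example_custom_memcpy() are hypothetical names chosen for this sketch; they are not the identifiers used in the patch, and the byte loop merely stands in for a handcrafted, per-architecture implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical switch standing in for a build-time option. It is defined here
 * so that the sketch takes the compiler/libc path by default. */
#ifndef EXAMPLE_USE_CC_MEMCPY
#define EXAMPLE_USE_CC_MEMCPY 1
#endif

/* Placeholder for a handcrafted copy routine (a plain byte loop here). */
static inline void *
example_custom_memcpy(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Single entry point: callers do not change, only the backend does. */
static inline void *
example_memcpy(void *dst, const void *src, size_t n)
{
#if EXAMPLE_USE_CC_MEMCPY
	/* The compiler sees a plain memcpy() and may inline or vectorise it. */
	return memcpy(dst, src, n);
#else
	return example_custom_memcpy(dst, src, n);
#endif
}
```

Because the wrapper is a static inline function, the compiler can still apply its usual memcpy() optimisations (and diagnostics) at each call site when the delegating path is selected.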

This patch set only makes a difference on x86, PPC and ARM. Loongarch
and RISCV already used compiler/libc memcpy().

This patch set includes a number of fixes in drivers and libraries
which erroneously relied on  including other header
files (e.g., ) required by its implementation.

Mattias Rönnblom (6):
  net/fm10k: add missing intrinsic include
  event/dlb2: include headers for vector and memory copy APIs
  net/octeon_ep: properly include vector API header file
  distributor: properly include vector API header file
  fib: properly include vector API header file
  eal: provide option to use compiler memcpy instead of RTE

 config/meson.build |  1 +
 doc/guides/rel_notes/release_24_07.rst | 21 +
 drivers/event/dlb2/dlb2.c  |  2 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  1 +
 drivers/net/octeon_ep/otx_ep_ethdev.c  |  2 +
 lib/distributor/rte_distributor.c  |  1 +
 lib/eal/arm/include/rte_memcpy.h   | 10 +
 lib/eal/include/generic/rte_memcpy.h   | 61 +++---
 lib/eal/loongarch/include/rte_memcpy.h | 53 ++
 lib/eal/ppc/include/rte_memcpy.h   | 10 +
 lib/eal/riscv/include/rte_memcpy.h | 53 ++
 lib/eal/x86/include/meson.build|  1 +
 lib/eal/x86/include/rte_memcpy.h   | 11 -
 lib/fib/trie.c |  1 +
 meson_options.txt  |  2 +
 15 files changed, 124 insertions(+), 106 deletions(-)

-- 
2.34.1



[PATCH v2 1/6] net/fm10k: add missing intrinsic include

2024-06-20 Thread Mattias Rönnblom
Add missing  include, to get the _mm_cvtsi128_si64
prototype.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2b6914b1da..d417b31bbb 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -10,6 +10,7 @@
 #include "base/fm10k_type.h"
 
 #include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
-- 
2.34.1



[PATCH v2 3/6] net/octeon_ep: properly include vector API header file

2024-06-20 Thread Mattias Rönnblom
The octeon_ep driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/octeon_ep/otx_ep_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c 
b/drivers/net/octeon_ep/otx_ep_ethdev.c
index 46211361a0..b069216629 100644
--- a/drivers/net/octeon_ep/otx_ep_ethdev.c
+++ b/drivers/net/octeon_ep/otx_ep_ethdev.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include 
+
 #include "otx_ep_common.h"
 #include "otx_ep_vf.h"
 #include "otx2_ep_vf.h"
-- 
2.34.1



[PATCH v2 2/6] event/dlb2: include headers for vector and memory copy APIs

2024-06-20 Thread Mattias Rönnblom
The DLB2 PMD depended on  being included as a side-effect
of  being included.

In addition, DLB2 used rte_memcpy() but did not include ,
but rather depended on other include files to do so.

This patch addresses both of those issues.

Signed-off-by: Mattias Rönnblom 
---
 drivers/event/dlb2/dlb2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 0b91f03956..19f90b8f8d 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -25,11 +25,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "dlb2_priv.h"
 #include "dlb2_iface.h"
-- 
2.34.1



[PATCH v2 5/6] fib: properly include vector API header file

2024-06-20 Thread Mattias Rönnblom
The trie implementation of the fib library relied on , but
failed to provide a direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 lib/fib/trie.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/fib/trie.c b/lib/fib/trie.c
index 09470e7287..74db8863df 100644
--- a/lib/fib/trie.c
+++ b/lib/fib/trie.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-- 
2.34.1



[PATCH v2 4/6] distributor: properly include vector API header file

2024-06-20 Thread Mattias Rönnblom
The distributor library relied on , but failed to provide
a direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 lib/distributor/rte_distributor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/distributor/rte_distributor.c 
b/lib/distributor/rte_distributor.c
index e58727cdc2..1389efc03f 100644
--- a/lib/distributor/rte_distributor.c
+++ b/lib/distributor/rte_distributor.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_distributor.h"
 #include "rte_distributor_single.h"
-- 
2.34.1



[PATCH v2 6/6] eal: provide option to use compiler memcpy instead of RTE

2024-06-20 Thread Mattias Rönnblom
Provide build option to have functions in  delegate to
the standard compiler/libc memcpy(), instead of using the various
custom DPDK, handcrafted, per-architecture rte_memcpy()
implementations.

A new meson build option 'use_cc_memcpy' is added. By default,
the compiler/libc memcpy() is used.

The performance benefits of the custom DPDK rte_memcpy()
implementations have been diminishing with every compiler release, and
with current toolchains the use of a custom memcpy() implementation
may even be a liability.

This patch leaves an option to stay on the custom DPDK implementations,
would that prove beneficial for certain applications or architectures.

An additional benefit of this change is that compilers and static
analysis tools have an easier time detecting incorrect usage of
rte_memcpy() (e.g., buffer overruns, or overlapping source and
destination buffers).
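
To make the "overlapping source and destination" case concrete, here is a small sketch. Standard memcpy() has undefined behaviour for overlapping buffers, whereas memmove() is defined for them. The helpers buffers_overlap() and checked_copy() are hypothetical names for this illustration and are not part of DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 if [a, a + n) and [b, b + n) share any byte. */
static int
buffers_overlap(const void *a, const void *b, size_t n)
{
	uintptr_t pa = (uintptr_t)a;
	uintptr_t pb = (uintptr_t)b;

	if (n == 0)
		return 0;
	/* Equal-length ranges overlap when their starts are less than n apart. */
	return pa < pb ? (pb - pa) < n : (pa - pb) < n;
}

/* Hypothetical helper: fall back to memmove() when the buffers overlap,
 * since memcpy() is only defined for non-overlapping buffers. */
static void *
checked_copy(void *dst, const void *src, size_t n)
{
	if (buffers_overlap(dst, src, n))
		return memmove(dst, src, n);
	return memcpy(dst, src, n);
}
```

A run-time check like this is only a teaching device; the benefit described above is that tools can flag such misuse statically once the call is a plain memcpy().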

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 

---

PATCH:
 o Add entry in release notes.
 o Update meson help text.

RFC v3:
 o Fix missing #endif on loongarch.
 o PPC and RISCV now implemented, meaning all architectures are supported.
 o Unnecessary  include is removed from .

RFC v2:
 * Fix bug where rte_memcpy.h was not installed on x86.
 * Made attempt to make Loongarch compile.
---
 config/meson.build |  1 +
 doc/guides/rel_notes/release_24_07.rst | 21 +
 lib/eal/arm/include/rte_memcpy.h   | 10 +
 lib/eal/include/generic/rte_memcpy.h   | 61 +++---
 lib/eal/loongarch/include/rte_memcpy.h | 53 ++
 lib/eal/ppc/include/rte_memcpy.h   | 10 +
 lib/eal/riscv/include/rte_memcpy.h | 53 ++
 lib/eal/x86/include/meson.build|  1 +
 lib/eal/x86/include/rte_memcpy.h   | 11 -
 meson_options.txt  |  2 +
 10 files changed, 117 insertions(+), 106 deletions(-)

diff --git a/config/meson.build b/config/meson.build
index 8c8b019c25..456056628e 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -353,6 +353,7 @@ endforeach
 # set other values pulled from the build options
 dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports'))
 dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet'))
+dpdk_conf.set('RTE_USE_CC_MEMCPY', get_option('use_cc_memcpy'))
 dpdk_conf.set('RTE_ENABLE_STDATOMIC', get_option('enable_stdatomic'))
 dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp'))
 dpdk_conf.set('RTE_PKTMBUF_HEADROOM', get_option('pkt_mbuf_headroom'))
diff --git a/doc/guides/rel_notes/release_24_07.rst 
b/doc/guides/rel_notes/release_24_07.rst
index a69f24cf99..4b6eafa86e 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -24,6 +24,27 @@ DPDK Release 24.07
 New Features
 
 
+* **Compiler memcpy replaces custom DPDK implementation.**
+
+  The memory copy functions of  now delegates to the
+  standard memcpy() function, implemented by the compiler and the C
+  runtime (e.g., libc).
+
+  In this release of DPDK, the handcrafted, per-architecture memory
+  copy implementations are still available, and may be reactivated by
+  setting the new ``use_cc_memcpy`` build option to false.
+
+  The performance benefits of the custom DPDK rte_memcpy()
+  implementations have been diminishing with every new compiler
+  release, and with current toolchains the use of a custom memcpy()
+  implementation may even result in worse performance than the
+  standard memcpy().
+
+  An additional benefit of this change is that compilers and static
+  analysis tools have an easier time detecting incorrect usage of
+  rte_memcpy() (e.g., buffer overruns, or overlapping source and
+  destination buffers).
+
 .. This section should contain new features added in this release.
Sample format:
 
diff --git a/lib/eal/arm/include/rte_memcpy.h b/lib/eal/arm/include/rte_memcpy.h
index 47dea9a8cc..e8aff722df 100644
--- a/lib/eal/arm/include/rte_memcpy.h
+++ b/lib/eal/arm/include/rte_memcpy.h
@@ -5,10 +5,20 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#include 
+
+#ifdef RTE_USE_CC_MEMCPY
+
+#include 
+
+#else
+
 #ifdef RTE_ARCH_64
 #include 
 #else
 #include 
 #endif
 
+#endif /* RTE_USE_CC_MEMCPY */
+
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/eal/include/generic/rte_memcpy.h 
b/lib/eal/include/generic/rte_memcpy.h
index e7f0f8eaa9..cae06117fb 100644
--- a/lib/eal/include/generic/rte_memcpy.h
+++ b/lib/eal/include/generic/rte_memcpy.h
@@ -5,12 +5,19 @@
 #ifndef _RTE_MEMCPY_H_
 #define _RTE_MEMCPY_H_
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * @file
  *
  * Functions for vectorised implementation of memcpy().
  */
 
+#include 
+#include 
+
 /**
  * Copy 16 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
@@ -35,8 +42,6 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov32(uint8_t *dst, const uint8_t *src);
 
-#ifdef __DOXYGEN__
-
 /**
  * Copy 

RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM IPsec operation

2024-06-20 Thread Akhil Goyal
> Hi,
> 
> > -Original Message-
> > From: Akhil Goyal 
> > Sent: Friday, June 14, 2024 5:07 PM
> > To: Suanming Mou ; Matan Azrad
> > 
> > Cc: dev@dpdk.org
> > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM IPsec
> > operation
> >
> > > Hi Akhil,
> > >
> > > > -Original Message-
> > > > From: Akhil Goyal 
> > > > Sent: Friday, June 14, 2024 2:49 PM
> > > > To: Suanming Mou ; Matan Azrad
> > > > 
> > > > Cc: dev@dpdk.org
> > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> > > > IPsec operation
> > > >
> > > > > To optimize AES-GCM IPsec operation within crypto/mlx5, the DPDK
> > > > > API typically supplies AES_GCM AAD/Payload/Digest in separate
> > > > > locations, potentially disrupting their contiguous layout. In
> > > > > cases where the memory layout fails to meet hardware (HW)
> > > > > requirements, an UMR WQE is initiated ahead of the GCM's GGA WQE
> > > > > to establish a continuous AAD/Payload/Digest virtual memory space for
> the
> > HW MMU.
> > > > >
> > > > > For IPsec scenarios, where the memory layout consistently adheres
> > > > > to the fixed order of AAD/IV/Payload/Digest, directly shrinking
> > > > > memory for AAD proves more efficient than preparing a UMR WQE. To
> > > > > address this, a new devarg "crypto_mode" with mode "ipsec_opt" is
> > > > > introduced in the commit, offering an optimization hint
> > > > > specifically for IPsec cases. When enabled, the PMD copies AAD
> > > > > directly before Payload in the enqueue_burst function instead of
> > > > > employing the UMR WQE. Subsequently, in the dequeue_burst
> > > > > function, the overridden IV before Payload is restored from the
> > > > > GGA WQE. It's crucial for users to avoid utilizing the input mbuf data
> during
> > processing.
> > > >
> > > > This seems very specific to mlx5 and is not as per the expectations
> > > > of cryptodev APIs.
> > > >
> > > > It seems you are asking to alter the user application to accommodate
> > > > this to support IPsec.
> > > >
> > > > Cryptodev APIs are for generic crypto processing of data as defined
> > > > in rte_crypto_op.
> > > > With your proposed changes, it seems the behavior of the crypto APIs
> > > > will be different in case of mlx5 which I believe is not correct.
> > > >
> > > > Is it not possible for you to use rte_security IPsec path?
> > > >
> > >
> > > Sorry I don't understand why that conflicts the API, IIUC crypto API
> > > only just defines the AAD/Payload/Digest in struct rte_crypto_sym_op,
> > > but not restrict the sequence, and AAD/Payload/Digest may come from
> > difference memory space.
> > > Am I missing something here?
> >
> > Yes you are correct that there is no restriction there.
> >
> > > The input requirements from mlx5 HW is AAD/Payload/Digest sequence, if
> > > the input memory of AAD/Payload/Digest does not meet the requirements,
> > > PMD will try to combine the memory address space with UMR WQE as that
> > > commit does by software shrink.
> >
> > And here, you are adding a restriction for IPsec case.
> > I believe you need a way to identify IPsec case with non-ipsec case in data
> > path.
> > For that, instead of using a devarg(which is a specific case for mlx5), you
> > can use generic rte_security session with action type
> > RTE_SECURITY_ACTION_TYPE_NONE.
> 
> Just to emphasize, this is not a restriction, we don't restrict user must use
> that devarg for IPSEC case.
> The way to identify or apply that optimization is user's devarg of
> "ipsec_opt".
> Without that hint from devarg, pmd will work in UMR mode to combine the
> memory addresses.

Even if it is an optional thing.
After adding the devarg, the user is expected to use the buffers the way your
PMD is expecting. So, this is a restriction. Right?

What would be the behavior if devarg is set but the buffers are configured the 
same way as before?

> I agree move to other API will also make the hint work. But if providing one
> hint devarg here does not conflict the API and bring better compatibility, it
> does not hurt.

I do not understand how it is bringing better compatibility.

The devarg that is added for ipsec_opt seems redundant.
We should use standard APIs when they are available.
Devargs are added to pass on additional run time configuration
which is not part of standard API set and is specific to a particular PMD.
But in this case, we do have rte_security and rte_ipsec APIs to configure
IPsec specific requirements.
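
To make the buffer expectation under discussion concrete, the sketch below mimics the layout trick from the quoted commit message on a plain byte array: with an AAD/IV/Payload/Digest layout, the AAD is copied directly in front of the payload (overwriting the IV) before processing, and the IV is put back afterwards. This is an illustration under assumptions, not the mlx5 PMD code: the function names are hypothetical, and the 8-byte AAD and 8-byte IV lengths are assumed values typical for AES-GCM ESP without extended sequence numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_AAD_LEN 8 /* assumed: ESP SPI + sequence number */
#define EX_IV_LEN  8 /* assumed: AES-GCM ESP IV */

/* Enqueue side (hypothetical): save the IV, then copy the AAD so that it ends
 * exactly where the payload begins. Returns the start of the now contiguous
 * AAD + payload (+ digest) region. */
static uint8_t *
ex_pack_aad(uint8_t *pkt, uint8_t saved_iv[EX_IV_LEN])
{
	uint8_t *iv = pkt + EX_AAD_LEN;
	uint8_t *payload = iv + EX_IV_LEN;

	memcpy(saved_iv, iv, EX_IV_LEN);
	/* memmove() because source and destination may overlap when the AAD is
	 * longer than the IV. */
	memmove(payload - EX_AAD_LEN, pkt, EX_AAD_LEN);
	return payload - EX_AAD_LEN;
}

/* Dequeue side (hypothetical): restore the IV that was overwritten above. */
static void
ex_restore_iv(uint8_t *pkt, const uint8_t saved_iv[EX_IV_LEN])
{
	memcpy(pkt + EX_AAD_LEN, saved_iv, EX_IV_LEN);
}
```

Between the two calls the packet no longer holds its original IV, which is why the commit message warns against using the input mbuf data during processing, and why the buffers must really follow the assumed layout when the hint is set.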

> 
> >
> > You may also benefit from rte_ipsec library APIs and test framework, for
> > processing of protocol specific things which are specifically written for
> > RTE_SECURITY_ACTION_TYPE_NONE case.
> And again, thanks for the suggestion, I assume we will also consider 
> supporting
> that next for rte_security as well if possible, to provide more choice for 
> user.
> 
> >
> > > And the most important thing is that "ipsec_opt" is not mandatory,
> > > only if user have such case of layout and a

RE: Coding Style for local variables

2024-06-20 Thread Morten Brørup
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> 
> 10/06/2024 18:31, Konstantin Ananyev:
> > Morten said:
> > > The coding style guide says:
> > >
> > > "Variables should be declared at the start of a block of code rather than
> in the middle. The exception to this is when the variable is
> > > const in which case the declaration must be at the point of first
> use/assignment. Declaring variable inside a for loop is OK."
> > >
> > > Since DPDK switched to C11, variables can be declared where they are used,
> which reduces the risk of using effectively uninitialized
> > > variables. "Effectively uninitialized" means initialized to 0 or NULL
> where declared, to silence any compiler warnings about the use of
> > > uninitialized variables.
> > >
> > > Can we please agree to remove the recommendation/requirement to declare
> variables at the start of a block of code?
> >
> > I know that modern C standards allow to define variable in the middle.
> > But I am strongly opposed to allow that in DPDK coding style.
> > Such practice makes code much harder to read and understand (at least for
> me).
> 
> Yes it is convenient to know that all variables are described
> in a known place, just after function parameters.
> 
> There is also a consistency concern.
> 
> Old contributors like to be in a comfort zone,
>   and we don't want to lose old contributors.
> New contributors may be refrained by old rules,
>   and we would like to get more new contributors.
> 
> So that's a tricky decision.
> 

Independent research shows that readability is improved by declaring local 
variables as close as possible to their first use:
https://barrgroup.com/72-initialization#footnote12

Old people (like myself) need to unlearn their bad old habits (originating from 
limitations in old C standards), and embrace modern methods to reduce the risk 
of introducing bugs.
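
A small sketch of the two styles being debated, for illustration only. Both functions are hypothetical examples (they are not DPDK code) and compute the same result; mixed declarations and statements are valid from C99 onward, hence also in C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Style A: all declarations at the start of the block. 'avg' gets a dummy
 * initialiser long before its real value is known, which is the "effectively
 * uninitialized" situation described in this thread. */
static uint32_t
average_top_of_block(const uint32_t *v, size_t n)
{
	uint64_t sum = 0;
	uint32_t avg = 0; /* placeholder value, overwritten below */
	size_t i;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		sum += v[i];
	avg = (uint32_t)(sum / n);
	return avg;
}

/* Style B: declarations at first use. 'avg' is initialised with its real
 * value and can be const, so it can never be read before it is meaningful. */
static uint32_t
average_first_use(const uint32_t *v, size_t n)
{
	if (n == 0)
		return 0;

	uint64_t sum = 0;
	for (size_t i = 0; i < n; i++)
		sum += v[i];

	const uint32_t avg = (uint32_t)(sum / n);
	return avg;
}
```

In style A, a later edit that reads 'avg' before the assignment would compile without a warning because of the dummy initialiser; in style B the variable does not exist before it has its real value.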



RE: Coding Style for local variables

2024-06-20 Thread Konstantin Ananyev


> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> >
> > 10/06/2024 18:31, Konstantin Ananyev:
> > > Morten said:
> > > > The coding style guide says:
> > > >
> > > > "Variables should be declared at the start of a block of code rather 
> > > > than
> > in the middle. The exception to this is when the variable is
> > > > const in which case the declaration must be at the point of first
> > use/assignment. Declaring variable inside a for loop is OK."
> > > >
> > > > Since DPDK switched to C11, variables can be declared where they are 
> > > > used,
> > which reduces the risk of using effectively uninitialized
> > > > variables. "Effectively uninitialized" means initialized to 0 or NULL
> > where declared, to silence any compiler warnings about the use of
> > > > uninitialized variables.
> > > >
> > > > Can we please agree to remove the recommendation/requirement to declare
> > variables at the start of a block of code?
> > >
> > > I know that modern C standards allow to define variable in the middle.
> > > But I am strongly opposed to allow that in DPDK coding style.
> > > Such practice makes code much harder to read and understand (at least for
> > me).
> >
> > Yes it is convenient to know that all variables are described
> > in a known place, just after function parameters.
> >
> > There is also a consistency concern.
> >
> > Old contributors like to be in a comfort zone,
> > and we don't want to lose old contributors.
> > New contributors may be refrained by old rules,
> > and we would like to get more new contributors.
> >
> > So that's a tricky decision.
> >
> 
> Independent research shows that readability is improved by declaring local 
> variables as close as possible to their first use:
> https://barrgroup.com/72-initialization#footnote12

Hmm... it seems they don't provide any data to back up their statements.
This one in particular sounds weird to me:
"Too many programmers assume the C run-time will watch out for them, e.g., by
zeroing the value of uninitialized variables on system startup."
Why on earth would people assume that?
And what exactly does 'too many' mean? 1%? 10%? 90%?

> 
> Old people (like myself) need to unlearn their bad old habits (originating 
> from limitations in old C standards), and embrace modern
> methods to reduce the risk of introducing bugs.

Allowing variables to be defined in the middle of the code wouldn't by itself
prevent the use of uninitialized variables.
On the other hand, compilers are quite good these days at catching such bugs.
So I don't think it is a compelling argument.
 




RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM IPsec operation

2024-06-20 Thread Suanming Mou


> -Original Message-
> From: Akhil Goyal 
> Sent: Thursday, June 20, 2024 3:40 PM
> To: Suanming Mou ; Matan Azrad
> 
> Cc: dev@dpdk.org
> Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> IPsec operation
> 
> > Hi,
> >
> > > -Original Message-
> > > From: Akhil Goyal 
> > > Sent: Friday, June 14, 2024 5:07 PM
> > > To: Suanming Mou ; Matan Azrad
> > > 
> > > Cc: dev@dpdk.org
> > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> > > IPsec operation
> > >
> > > > Hi Akhil,
> > > >
> > > > > -Original Message-
> > > > > From: Akhil Goyal 
> > > > > Sent: Friday, June 14, 2024 2:49 PM
> > > > > To: Suanming Mou ; Matan Azrad
> > > > > 
> > > > > Cc: dev@dpdk.org
> > > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize
> > > > > AES-GCM IPsec operation
> > > > >
> > > > > > To optimize AES-GCM IPsec operation within crypto/mlx5, the
> > > > > > DPDK API typically supplies AES_GCM AAD/Payload/Digest in
> > > > > > separate locations, potentially disrupting their contiguous
> > > > > > layout. In cases where the memory layout fails to meet
> > > > > > hardware (HW) requirements, an UMR WQE is initiated ahead of
> > > > > > the GCM's GGA WQE to establish a continuous AAD/Payload/Digest
> > > > > > virtual memory space for
> > the
> > > HW MMU.
> > > > > >
> > > > > > For IPsec scenarios, where the memory layout consistently
> > > > > > adheres to the fixed order of AAD/IV/Payload/Digest, directly
> > > > > > shrinking memory for AAD proves more efficient than preparing
> > > > > > a UMR WQE. To address this, a new devarg "crypto_mode" with
> > > > > > mode "ipsec_opt" is introduced in the commit, offering an
> > > > > > optimization hint specifically for IPsec cases. When enabled,
> > > > > > the PMD copies AAD directly before Payload in the
> > > > > > enqueue_burst function instead of employing the UMR WQE.
> > > > > > Subsequently, in the dequeue_burst function, the overridden IV
> > > > > > before Payload is restored from the GGA WQE. It's crucial for
> > > > > > users to avoid utilizing the input mbuf data
> > during
> > > processing.
> > > > >
> > > > > This seems very specific to mlx5 and is not as per the
> > > > > expectations of cryptodev APIs.
> > > > >
> > > > > It seems you are asking to alter the user application to
> > > > > accommodate this to support IPsec.
> > > > >
> > > > > Cryptodev APIs are for generic crypto processing of data as
> > > > > defined in rte_crypto_op.
> > > > > With your proposed changes, it seems the behavior of the crypto
> > > > > APIs will be different in case of mlx5 which I believe is not correct.
> > > > >
> > > > > Is it not possible for you to use rte_security IPsec path?
> > > > >
> > > >
> > > > Sorry I don't understand why that conflicts the API, IIUC crypto
> > > > API only just defines the AAD/Payload/Digest in struct
> > > > rte_crypto_sym_op, but not restrict the sequence, and
> > > > AAD/Payload/Digest may come from
> > > difference memory space.
> > > > Am I missing something here?
> > >
> > > Yes you are correct that there is no restriction there.
> > >
> > > > The input requirements from mlx5 HW is AAD/Payload/Digest
> > > > sequence, if the input memory of AAD/Payload/Digest does not meet
> > > > the requirements, PMD will try to combine the memory address space
> > > > with UMR WQE as that commit does by software shrink.
> > >
> > > And here, you are adding a restriction for IPsec case.
> > > I believe you need a way to identify IPsec case with non-ipsec case in 
> > > data
> path.
> > > For that, instead of using a devarg(which is a specific case for
> > > mlx5), you can use generic rte_security session with action type
> > > RTE_SECURITY_ACTION_TYPE_NONE.
> >
> > Just to emphasize, this is not a restriction, we don't restrict user
> > must use that devarg for IPSEC case.
> > The way to identify or apply that optimization is user's devarg of
> "ipsec_opt".
> > Without that hint from devarg, pmd will work in UMR mode to combine
> > the memory addresses.
> 
> Even if it is an optional thing.
> After adding the devarg, the user is expected to use the buffers the way your
> PMD is expecting. So, this is a restriction. Right?

The devarg is not enabled by default. If the user adds the devarg, that means
the user knows what they are doing and the input is suitable for that
optimization.
The PMD doesn't require the user to use that hint to handle the IPsec case; the
user will still be able to handle IPsec operations without that devarg.
If the user has mixed cases, just leave the devarg out. Does that make sense?

> 
> What would be the behavior if devarg is set but the buffers are configured the
> same way as before?
> 
> > I agree move to other API will also make the hint work. But if
> > providing one hint devarg here does not conflict the API and bring
> > better compatibility, it does not hurt.
> 
> I do not understand how it is bringing better compatibility.
> 
> The devarg that is added for ipsec_

Re: FW: compilation|FAILURE| pw(141419) sid(32237) job(PER_PATCH_BUILD12332)[v2,6/6] eal: provide option to use compiler memcpy instead of RTE

2024-06-20 Thread Mattias Rönnblom

On 2024-06-20 10:11, Mattias Rönnblom wrote:



-Original Message-
From: sys_...@intel.com 
Sent: Thursday, 20 June 2024 09:55
To: test-rep...@dpdk.org; Mattias Rönnblom 
Subject: compilation|FAILURE| pw(141419) sid(32237) 
job(PER_PATCH_BUILD12332)[v2,6/6] eal: provide option to use compiler memcpy 
instead of RTE


Test-Label: Intel-compilation
Test-Status: FAILURE
http://dpdk.org/patch/141419

_Compilation issues_

Submitter: Mattias Rönnblom 
Date: 2024-06-20 07:24:52
Reply_mail: <20240620072452.420029-7-mattias.ronnb...@ericsson.com>

DPDK git baseline: Repo:dpdk, CommitID: 4a44d97f0a52a76258c6a6cb6a713f4380a8ab1f


Meson Build Summary: 23 Builds Done, 22 Successful, 1 Failures, 0 Blocked

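+-------------------+------------+--------------+------------+------------+-----------+----------+------------+
| os                | gcc-static | clang-static | icc-static | gcc-shared | gcc-debug | document | gcc-16byte |
+-------------------+------------+--------------+------------+------------+-----------+----------+------------+
| OpenAnolis8.8-64  | pass       |              |            |            |           |          |            |
| FreeBSD14-64      | pass       | pass         |            | pass       | pass      |          |            |
| RHEL94-64         | pass       | pass         |            | pass       | pass      |          |            |
| SUSE15-64         | pass       | pass         |            |            |           |          |            |
| CBL-Mariner2.0-64 | pass       |              |            |            |           |          |            |
| UB2404-32         | fail       |              |            |            |           |          |            |
| RHEL93-64         | pass       |              |            |            |           |          |            |
| UB2404-64         | pass       | pass         |            |            |           | pass     | pass       |
| RHEL94-64Rt       | pass       |              |            |            |           |          |            |
| UB2204-64         | pass       |              |            |            |           |          |            |
| FC40-64           | pass       | pass         |            |            |           |          |            |
| UB2404-64Rt       | pass       |              |            |            |           |          |            |
+-------------------+------------+--------------+------------+------------+-----------+----------+------------+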

Comments:
Because of DPDK bug (https://bugs.dpdk.org/show_bug.cgi?id=928),
all the dpdk-next-* branches add the `Ddisable_drivers=event/cnxk` option when
building with the ICC compiler.

Test environment and configuration as below:


OS: OpenAnolis8.8-64
 Kernel Version: 5.10.134-13.an8.x86_64
 GCC Version: gcc (GCC) 8.5.0 20210514 (Anolis 8.5.0-10.0.3)
 Clang Version: 13.0.1 (Anolis 13.0.1-2.0.2.module+an8.7.0+10996+1588f068)
 x86_64-native-linuxapp-gcc

OS: FreeBSD14-64
 Kernel Version: 14.0-RELEASE
 GCC Version: gcc (FreeBSD Ports Collection) 12.2.0
 Clang Version: 16.0.6 (https://github.com/llvm/llvm-project.git llvmorg-16.0.6-0-g7cbf1a259152)
 x86_64-native-bsdapp-gcc
 x86_64-native-bsdapp-clang
 x86_64-native-bsdapp-gcc+shared
 x86_64-native-bsdapp-gcc+debug

OS: RHEL94-64
 Kernel Version: 5.14.0-427.13.1.el9_4.x86_64
 GCC Version: gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
 Clang Version: 17.0.6 (Red Hat, Inc. 17.0.6-5.el9)
 x86_64-native-linuxapp-gcc
 x86_64-native-linuxapp-clang
 x86_64-native-linuxapp-gcc+shared
 x86_64-native-linuxapp-gcc+debug

OS: SUSE15-64
 Kernel Version: 5.14.21-150500.53-default
 GCC Version: gcc (SUSE Linux) 7.5.0
 Clang Version: 15.0.7
 x86_64-native-linuxapp-clang
 x86_64-native-linuxapp-gcc

OS: CBL-Mariner2.0-64
 Kernel Version: 5.15.55.1_2e9a4f9+
 GCC Version: gcc (GCC) 11.2.0
 Clang Version: NA
 x86_64-native-linuxapp-gcc

OS: UB2404-32
 Kernel Version: 6.8.0-31-generic
 GCC Version: gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0
 Clang Version: NA
 i686-native-linuxapp-gcc

OS: RHEL93-64
 Kernel Version: 5.14.0-362.8.1.el9_3.x86_64
 GCC Version: gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
 Clang Version: 17.0.6 (Red Hat, Inc. 17.0.6-5.el9)
 x86_64-native-linuxapp-gcc

OS: UB2404-64
 Kernel Version: 6.8.0-31-generic
 GCC Version: gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0
 Clang Version: 18.1.3 (1)
 x86_64-native-linuxapp-gcc+16byte
 x86_64-native-linuxapp-gcc
 x86_64-native-linuxapp-clang
 x86_64-native-linuxapp-doc

OS: RHEL94-64Rt
 Kernel Version: 5.14.0-427.13.1.el9_4.x86_64+rt
 GCC Version: gcc (GCC) 11.4.1 20231218 (Red Hat 11.4.1-3)
 Clang Version: 17.0.6 (Re

Re: [PATCH v2] config/arm: add Ampere AmpereOneX platform

2024-06-20 Thread Ruifeng Wang



On 2024/4/11 5:23 PM, Yutang Jiang wrote:
> Signed-off-by: Yutang Jiang 
> ---
>   config/arm/arm64_ampereonex_linux_gcc | 16 
>   config/arm/meson.build| 19 +++
>   2 files changed, 35 insertions(+)
>   create mode 100644 config/arm/arm64_ampereonex_linux_gcc
>
> diff --git a/config/arm/arm64_ampereonex_linux_gcc 
> b/config/arm/arm64_ampereonex_linux_gcc
> new file mode 100644
> index 00..c5c334fdb7
> --- /dev/null
> +++ b/config/arm/arm64_ampereonex_linux_gcc
> @@ -0,0 +1,16 @@
> +[binaries]
> +c = ['ccache', 'aarch64-linux-gnu-gcc']
> +cpp = ['ccache', 'aarch64-linux-gnu-g++']
> +ar = 'aarch64-linux-gnu-gcc-ar'
> +strip = 'aarch64-linux-gnu-strip'
> +pkgconfig = 'aarch64-linux-gnu-pkg-config'

This variable has been changed to remove a meson warning [1]. Please update
accordingly.

With suggested change:
Acked-by: Ruifeng Wang 

PS: you can carry over the ack received on the previous version.

[1] https://patches.dpdk.org/project/dpdk/patch/20240617151458.1005103-1-david.march...@redhat.com/

Thanks.

> +pcap-config = ''
> +
> +[host_machine]



Re: [PATCH v1] crypto/ipsec_mb: use new ipad/opad calculation API

2024-06-20 Thread Dooley, Brian
Recheck-request: iol-unit-arm64-testing


From: Dooley, Brian 
Sent: Monday, June 10, 2024 9:19 AM
To: Ji, Kai ; De Lara Guarch, Pablo 

Cc: dev@dpdk.org ; gak...@marvell.com 
Subject: Re: [PATCH v1] crypto/ipsec_mb: use new ipad/opad calculation API

Recheck-request: iol-unit-amd64-testing


From: Dooley, Brian 
Sent: Wednesday, June 5, 2024 9:48 AM
To: Ji, Kai ; De Lara Guarch, Pablo 

Cc: dev@dpdk.org ; gak...@marvell.com ; 
Dooley, Brian 
Subject: [PATCH v1] crypto/ipsec_mb: use new ipad/opad calculation API

From: Pablo de Lara 

IPSec Multi-buffer library v1.4 added a new API to
calculate inner/outer padding for HMAC-SHAx/MD5.

Signed-off-by: Pablo de Lara 
Signed-off-by: Brian Dooley 
---
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c | 34 +-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c 
b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
index 69a546697b..b3fdea02ff 100644
--- a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
+++ b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
@@ -13,6 +13,7 @@ struct aesni_mb_op_buf_data {
 uint32_t offset;
 };

+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 /**
  * Calculate the authentication pre-computes
  *
@@ -55,6 +56,7 @@ calculate_auth_precomputes(hash_one_block_t one_block_hash,
 memset(ipad_buf, 0, blocksize);
 memset(opad_buf, 0, blocksize);
 }
+#endif

 static inline int
 is_aead_algo(IMB_HASH_ALG hash_alg, IMB_CIPHER_MODE cipher_mode)
@@ -66,12 +68,14 @@ is_aead_algo(IMB_HASH_ALG hash_alg, IMB_CIPHER_MODE 
cipher_mode)

 /** Set session authentication parameters */
 static int
-aesni_mb_set_session_auth_parameters(const IMB_MGR *mb_mgr,
+aesni_mb_set_session_auth_parameters(IMB_MGR *mb_mgr,
 struct aesni_mb_session *sess,
 const struct rte_crypto_sym_xform *xform)
 {
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 hash_one_block_t hash_oneblock_fn = NULL;
 unsigned int key_larger_block_size = 0;
+#endif
 uint8_t hashed_key[HMAC_MAX_BLOCK_SIZE] = { 0 };
 uint32_t auth_precompute = 1;

@@ -267,18 +271,24 @@ aesni_mb_set_session_auth_parameters(const IMB_MGR 
*mb_mgr,
 switch (xform->auth.algo) {
 case RTE_CRYPTO_AUTH_MD5_HMAC:
 sess->template_job.hash_alg = IMB_AUTH_MD5;
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 hash_oneblock_fn = mb_mgr->md5_one_block;
+#endif
 break;
 case RTE_CRYPTO_AUTH_SHA1_HMAC:
 sess->template_job.hash_alg = IMB_AUTH_HMAC_SHA_1;
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 hash_oneblock_fn = mb_mgr->sha1_one_block;
+#endif
 if (xform->auth.key.length > get_auth_algo_blocksize(
 IMB_AUTH_HMAC_SHA_1)) {
 IMB_SHA1(mb_mgr,
 xform->auth.key.data,
 xform->auth.key.length,
 hashed_key);
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 key_larger_block_size = 1;
+#endif
 }
 break;
 case RTE_CRYPTO_AUTH_SHA1:
@@ -287,14 +297,18 @@ aesni_mb_set_session_auth_parameters(const IMB_MGR 
*mb_mgr,
 break;
 case RTE_CRYPTO_AUTH_SHA224_HMAC:
 sess->template_job.hash_alg = IMB_AUTH_HMAC_SHA_224;
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 hash_oneblock_fn = mb_mgr->sha224_one_block;
+#endif
 if (xform->auth.key.length > get_auth_algo_blocksize(
 IMB_AUTH_HMAC_SHA_224)) {
 IMB_SHA224(mb_mgr,
 xform->auth.key.data,
 xform->auth.key.length,
 hashed_key);
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 key_larger_block_size = 1;
+#endif
 }
 break;
 case RTE_CRYPTO_AUTH_SHA224:
@@ -303,14 +317,18 @@ aesni_mb_set_session_auth_parameters(const IMB_MGR 
*mb_mgr,
 break;
 case RTE_CRYPTO_AUTH_SHA256_HMAC:
 sess->template_job.hash_alg = IMB_AUTH_HMAC_SHA_256;
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 hash_oneblock_fn = mb_mgr->sha256_one_block;
+#endif
 if (xform->auth.key.length > get_auth_algo_blocksize(
 IMB_AUTH_HMAC_SHA_256)) {
 IMB_SHA256(mb_mgr,
 xform->auth.key.data,
 xform->auth.key.length,
 hashed_key);
+#if IMB_VERSION(1, 3, 0) >= IMB_VERSION_NUM
 key_larger_block_size = 1;
+#endif
 }
 break;
 case RTE_CRYPTO_AUTH_SHA256:
@@ -319,14 +337,18 @@ aesni_mb_se
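
For context on the "inner/outer padding" mentioned in the commit message: HMAC (RFC 2104) XORs the key, zero-padded to the hash block size, with the constants 0x36 (ipad) and 0x5c (opad). A driver can then hash one block of each result once per session. The following is a minimal sketch of the XOR step only. The name hmac_xor_pads() is hypothetical; it is neither the PMD's calculate_auth_precomputes() nor the IPsec MB library API, and it assumes the key has already been reduced to at most one block.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HMAC_IPAD_VALUE 0x36 /* inner pad constant from RFC 2104 */
#define HMAC_OPAD_VALUE 0x5c /* outer pad constant from RFC 2104 */

/*
 * Hypothetical helper: fill ipad_buf and opad_buf (each 'blocksize' bytes)
 * with the zero-padded key XORed with the RFC 2104 constants.
 * Returns 0 on success, -1 if the key does not fit in one block
 * (such keys have to be hashed down by the caller first).
 */
static int
hmac_xor_pads(const uint8_t *key, size_t key_len, size_t blocksize,
		uint8_t *ipad_buf, uint8_t *opad_buf)
{
	if (key_len > blocksize)
		return -1;

	/* XOR with a zero-padded key leaves the constant in the tail bytes. */
	memset(ipad_buf, HMAC_IPAD_VALUE, blocksize);
	memset(opad_buf, HMAC_OPAD_VALUE, blocksize);

	for (size_t i = 0; i < key_len; i++) {
		ipad_buf[i] ^= key[i];
		opad_buf[i] ^= key[i];
	}
	return 0;
}
```

The two buffers would then each go through a single-block hash to produce the per-session precomputes; that step depends on the hash algorithm and is left out here.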

RE: Coding Style for local variables

2024-06-20 Thread Morten Brørup
> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
> 
> > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > >
> > > 10/06/2024 18:31, Konstantin Ananyev:
> > > > Morten said:
> > > > > The coding style guide says:
> > > > >
> > > > > "Variables should be declared at the start of a block of code rather
> > > > > than in the middle. The exception to this is when the variable is
> > > > > const in which case the declaration must be at the point of first
> > > > > use/assignment. Declaring variable inside a for loop is OK."
> > > > >
> > > > > Since DPDK switched to C11, variables can be declared where they are
> > > > > used, which reduces the risk of using effectively uninitialized
> > > > > variables. "Effectively uninitialized" means initialized to 0 or NULL
> > > > > where declared, to silence any compiler warnings about the use of
> > > > > uninitialized variables.
> > > > >
> > > > > Can we please agree to remove the recommendation/requirement to
> > > > > declare variables at the start of a block of code?
> > > >
> > > > I know that modern C standards allow to define variable in the middle.
> > > > But I am strongly opposed to allow that in DPDK coding style.
> > > > Such practice makes code much harder to read and understand (at least
> > > > for me).
> > >
> > > Yes it is convenient to know that all variables are described
> > > in a known place, just after function parameters.
> > >
> > > There is also a consistency concern.
> > >
> > > Old contributors like to be in a comfort zone,
> > >   and we don't want to lose old contributors.
> > > New contributors may be refrained by old rules,
> > >   and we would like to get more new contributors.
> > >
> > > So that's a tricky decision.
> > >
> >
> > Independent research shows that readability is improved by declaring local
> > variables as close as possible to their first use:
> > https://barrgroup.com/72-initialization#footnote12

The footnote refers to [Uwano], which can be found here:
[Uwano]: https://www.cs.kent.edu/~jmaletic/Prog-Comp/Papers/Uwano06.pdf

> 
> Hmm... seems  they don't provide any data to back up their statements.
> Specially that one sounds weird for me:
> " Too many programmers assume the C run-time will watch out for them, e.g., by
> zeroing the value of uninitialized variables on system startup."
> Why on earth people would assume that?

Not all programmers remember all the rules all the time, especially junior
developers.

> And what exactly means 'too many? 1%? 10%? 90%?

I guess that "too many" means that it is a statistically significant cause of
bugs.

PS:
I like your way of reasoning.
I guess the Barr Group is trying to keep it short in their handbook, omitting
the details from the underlying research.
It's a shame Jack Ganssle stopped giving his "How to Develop Better Firmware
Faster" seminar (https://www.ganssle.com/classes.htm). All his "rule-of-thumb"
guidelines are backed with hard data from references and experiments!

> 
> >
> > Old people (like myself) need to unlearn their bad old habits (originating
> > from limitations in old C standards), and embrace modern
> > methods to reduce the risk of introducing bugs.
> 
> Allowing to define variables in the middle of the code by itself wouldn't
> prevent of use of un-initialized variables.
> From other side - compilers are quite good these days to catch such bugs.
> So I don't think it is a completing argument..

Please note that I am talking about "effectively uninitialized" variables,
meaning variables that have been initialized with dummy values like NULL, 0
or -1, only to make the "use of uninitialized variable" compiler warnings go
away.

Initializing variables with dummy values effectively disables the compiler's
ability to catch bugs where a variable is being used before it has been
assigned a (correct) value, because the compiler cannot know that the variable
has been initialized with a dummy value.

The advantages of declaring the variable where it is used the first time are:
- The developer is much likelier to assign it the correct value to begin with.
- The reviewer is much likelier to spot if it is initialized with an incorrect
  value.
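
The contrast described above can be illustrated with a minimal sketch. Everything in it is hypothetical (lookup_name() and both name_len_*() functions are made up for illustration and are not DPDK code); it only shows the two declaration styles side by side.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup helper, used only for illustration. */
static const char *
lookup_name(int id)
{
	return id == 1 ? "eth0" : NULL;
}

/*
 * Top-of-block style: both variables get dummy values so that no
 * "may be used uninitialized" warning can fire. If the real assignment
 * below were removed or skipped by mistake, the compiler would stay
 * silent, because as far as it can tell the variables are initialized.
 */
static size_t
name_len_top_of_block(int id)
{
	const char *name = NULL; /* dummy value */
	size_t len = 0;          /* dummy value */

	name = lookup_name(id);
	if (name != NULL)
		len = strlen(name);
	return len;
}

/*
 * Declare-at-first-use style: each variable is created together with its
 * real value, so there is no window in which it holds a placeholder, and
 * it can be const.
 */
static size_t
name_len_first_use(int id)
{
	const char *const name = lookup_name(id);
	if (name == NULL)
		return 0;

	const size_t len = strlen(name);
	return len;
}
```

Both functions return the same results; the difference is only in how much help the compiler and a reviewer get when the code is later modified.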



Re: [PATCH v2 1/6] net/fm10k: add missing intrinsic include

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 09:24:47AM +0200, Mattias Rönnblom wrote:
> Add missing  include, to get the _mm_cvtsi128_si64
> prototype.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
Acked-by: Bruce Richardson 


Re: [PATCH v2 2/6] event/dlb2: include headers for vector and memory copy APIs

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 09:24:48AM +0200, Mattias Rönnblom wrote:
> The DLB2 PMD depended on  being included as a side-effect
> of  being included.
> 
> In addition, DLB2 used rte_memcpy() but did not include ,
> but rather depended on other include files to do so.
> 
> This patch addresses both of those issues.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
Acked-by: Bruce Richardson 


RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM IPsec operation

2024-06-20 Thread Akhil Goyal
> > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> > IPsec operation
> >
> > > Hi,
> > >
> > > > -Original Message-
> > > > From: Akhil Goyal 
> > > > Sent: Friday, June 14, 2024 5:07 PM
> > > > To: Suanming Mou ; Matan Azrad
> > > > 
> > > > Cc: dev@dpdk.org
> > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> > > > IPsec operation
> > > >
> > > > > Hi Akhil,
> > > > >
> > > > > > -Original Message-
> > > > > > From: Akhil Goyal 
> > > > > > Sent: Friday, June 14, 2024 2:49 PM
> > > > > > To: Suanming Mou ; Matan Azrad
> > > > > > 
> > > > > > Cc: dev@dpdk.org
> > > > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize
> > > > > > AES-GCM IPsec operation
> > > > > >
> > > > > > > To optimize AES-GCM IPsec operation within crypto/mlx5, the
> > > > > > > DPDK API typically supplies AES_GCM AAD/Payload/Digest in
> > > > > > > separate locations, potentially disrupting their contiguous
> > > > > > > layout. In cases where the memory layout fails to meet
> > > > > > > hardware (HW) requirements, an UMR WQE is initiated ahead of
> > > > > > > the GCM's GGA WQE to establish a continuous AAD/Payload/Digest
> > > > > > > virtual memory space for
> > > the
> > > > HW MMU.
> > > > > > >
> > > > > > > For IPsec scenarios, where the memory layout consistently
> > > > > > > adheres to the fixed order of AAD/IV/Payload/Digest, directly
> > > > > > > shrinking memory for AAD proves more efficient than preparing
> > > > > > > a UMR WQE. To address this, a new devarg "crypto_mode" with
> > > > > > > mode "ipsec_opt" is introduced in the commit, offering an
> > > > > > > optimization hint specifically for IPsec cases. When enabled,
> > > > > > > the PMD copies AAD directly before Payload in the
> > > > > > > enqueue_burst function instead of employing the UMR WQE.
> > > > > > > Subsequently, in the dequeue_burst function, the overridden IV
> > > > > > > before Payload is restored from the GGA WQE. It's crucial for
> > > > > > > users to avoid utilizing the input mbuf data
> > > during
> > > > processing.
> > > > > >
> > > > > > This seems very specific to mlx5 and is not as per the
> > > > > > expectations of cryptodev APIs.
> > > > > >
> > > > > > It seems you are asking to alter the user application to
> > > > > > accommodate this to support IPsec.
> > > > > >
> > > > > > Cryptodev APIs are for generic crypto processing of data as
> > > > > > defined in rte_crypto_op.
> > > > > > With your proposed changes, it seems the behavior of the crypto
> > > > > > APIs will be different in case of mlx5 which I believe is not 
> > > > > > correct.
> > > > > >
> > > > > > Is it not possible for you to use rte_security IPsec path?
> > > > > >
> > > > >
> > > > > Sorry I don't understand why that conflicts the API, IIUC crypto
> > > > > API only just defines the AAD/Payload/Digest in struct
> > > > > rte_crypto_sym_op, but not restrict the sequence, and
> > > > > AAD/Payload/Digest may come from
> > > > difference memory space.
> > > > > Am I missing something here?
> > > >
> > > > Yes you are correct that there is no restriction there.
> > > >
> > > > > The input requirements from mlx5 HW is AAD/Payload/Digest
> > > > > sequence, if the input memory of AAD/Payload/Digest does not meet
> > > > > the requirements, PMD will try to combine the memory address space
> > > > > with UMR WQE as that commit does by software shrink.
> > > >
> > > > And here, you are adding a restriction for IPsec case.
> > > > I believe you need a way to identify IPsec case with non-ipsec case in 
> > > > data
> > path.
> > > > For that, instead of using a devarg(which is a specific case for
> > > > mlx5), you can use generic rte_security session with action type
> > > > RTE_SECURITY_ACTION_TYPE_NONE.
> > >
> > > Just to emphasize, this is not a restriction, we don't restrict user
> > > must use that devarg for IPSEC case.
> > > The way to identify or apply that optimization is user's devarg of
> > "ipsec_opt".
> > > Without that hint from devarg, pmd will work in UMR mode to combine
> > > the memory addresses.
> >
> > Even if it is an optional thing.
> > After adding the devarg, the user is expected to use the buffers the way 
> > your
> > PMD is expecting. So, this is a restriction. Right?
> 
> The devarg is not enabled by default, if user adds the devarg, that means user
> know what he is doing, and the input is suitable for that optimization.
> PMD doesn't restrict user must use that hint to handle IPsec case, user will 
> still be
> able to handle IPsec operation without that devarg.

The devarg is optional and not enabled by default. That was clear to me from
the start.

The point is that a PMD devarg can dictate the behavior of the PMD, but NOT
that of the user. The standard APIs are what the user must adhere to.

You cannot expect users to change their code when they want to use the devarg
optimization.

> If user has mixed cases, just leave the devarg away, does that make sense?
> 
> >
> > 
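
To make the buffer handling under discussion concrete, here is a minimal sketch of the idea described in the quoted commit message: with a fixed AAD/IV/Payload/Digest layout, the AAD is copied so that it sits directly before the payload, and the bytes it overwrites are put back afterwards. All names and sizes are hypothetical. In particular, struct saved_iv stands in for wherever the driver keeps the overwritten bytes (the commit message says the IV is restored from the GGA WQE), and this is not the mlx5 PMD code.

```c
#include <stdint.h>
#include <string.h>

#define AAD_LEN 8 /* illustrative size, not taken from the patch */
#define IV_LEN  8 /* illustrative size, not taken from the patch */

/* Hypothetical per-operation scratch space for the overwritten bytes. */
struct saved_iv {
	uint8_t bytes[AAD_LEN];
};

/*
 * Buffer layout on entry: [AAD][IV][Payload]..., all contiguous.
 * Copy the AAD so that it ends where the payload begins, saving the
 * bytes it overwrites. Returns a pointer to the relocated AAD, which is
 * now contiguous with the payload.
 */
static uint8_t *
aad_move_before_payload(uint8_t *buf, struct saved_iv *save)
{
	uint8_t *payload = buf + AAD_LEN + IV_LEN;
	uint8_t *dst = payload - AAD_LEN;

	memcpy(save->bytes, dst, AAD_LEN); /* keep what gets overwritten */
	memmove(dst, buf, AAD_LEN);        /* source and target may overlap */
	return dst;
}

/* Undo aad_move_before_payload() once the operation has completed. */
static void
aad_restore(uint8_t *buf, const struct saved_iv *save)
{
	uint8_t *payload = buf + AAD_LEN + IV_LEN;

	memcpy(payload - AAD_LEN, save->bytes, AAD_LEN);
}
```

Between the two calls the bytes in front of the payload no longer hold the IV, which is the window the commit message refers to when it asks users not to touch the input mbuf data during processing.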

Re: FW: compilation|FAILURE| pw(141419) sid(32237) job(PER_PATCH_BUILD12332)[v2,6/6] eal: provide option to use compiler memcpy instead of RTE

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 10:20:42AM +0200, Mattias Rönnblom wrote:
> On 2024-06-20 10:11, Mattias Rönnblom wrote:
> > 
> > 
> > -Original Message- From: sys_...@intel.com 
> > Sent: Thursday, 20 June 2024 09:55 To: test-rep...@dpdk.org; Mattias
> > Rönnblom  Subject: compilation|FAILURE|
> > pw(141419) sid(32237) job(PER_PATCH_BUILD12332)[v2,6/6] eal: provide
> > option to use compiler memcpy instead of RTE
> > 
> > 



> > *Build Failed #1: OS: UB2404-32 Target: i686-native-linuxapp-gcc
> > FAILED: drivers/libtmp_rte_net_fm10k.a.p/net_fm10k_fm10k_rxtx_vec.c.o
> > gcc -Idrivers/libtmp_rte_net_fm10k.a.p -Idrivers -I../drivers
> > -Idrivers/net/fm10k -I../drivers/net/fm10k -Idrivers/net/fm10k/base
> > -I../drivers/net/fm10k/base -Ilib/ethdev -I../lib/ethdev -I. -I..
> > -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include
> > -Ilib/eal/linux/include -I../lib/eal/linux/include
> > -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common
> > -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs
> > -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry
> > -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf
> > -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter
> > -I../lib/meter -Idrivers/bus/pci -I../drivers/bus/pci
> > -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vdev
> > -I../drivers/bus/vdev -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64
> > -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O3 -include rte_config.h
> > -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security
> > -Wmissing-declarations -Wmissing-prototypes -Wnested-externs
> > -Wold-style-definition -Wpointer-arith -Wsign-compare
> > -Wstrict-prototypes -Wundef -Wwrite-strings
> > -Wno-address-of-packed-member -Wno-packed-not-aligned
> > -Wno-missing-field-initializers -Wno-zero-length-bounds
> > -Wno-pointer-to-int-cast -D_GNU_SOURCE -m32 -fPIC -march=native -mrtm
> > -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
> > -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.fm10k -MD -MQ
> > drivers/libtmp_rte_net_fm10k.a.p/net_fm10k_fm10k_rxtx_vec.c.o -MF
> > drivers/libtmp_rte_net_fm10k.a.p/net_fm10k_fm10k_rxtx_vec.c.o.d -o
> > drivers/libtmp_rte_net_fm10k.a.p/net_fm10k_fm10k_rxtx_vec.c.o -c
> > ../drivers/net/fm10k/fm10k_rxtx_vec.c
> > ../drivers/net/fm10k/fm10k_rxtx_vec.c: In function
> > ‘fm10k_desc_to_olflags_v’:
> > ../drivers/net/fm10k/fm10k_rxtx_vec.c:132:21: error: implicit
> > declaration of function ‘_mm_cvtsi128_si64’; did you mean
> > ‘_mm_cvtsi128_si16’? [-Werror=implicit-function-declaration]
> >   132 | vol.dword = _mm_cvtsi128_si64(vtag1);
> 
> From what I can tell, _mm_cvtsi128_si64() is only available on 64-bit
> x86. I fail to understand how this code could ever compile on 32-bit.
> 
> A somewhat unrelated question: why are there no maintainers listed for
> many of the Intel drivers?
> 

I can certainly answer this last question :-) A number of the DPDK team in PRC
who were our driver maintainers are no longer working on DPDK, and so
removed themselves from the maintainers file. Those of us based in Ireland
and India are ramping up on the drivers over time and should step up
officially as maintainers - especially for the most active drivers - in the
near future. The drivers are not so much unmaintained, as that we don't
have a single "best" name to put against them just yet.

/Bruce


Re: [PATCH v2 4/6] distributor: properly include vector API header file

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 09:24:50AM +0200, Mattias Rönnblom wrote:
> The distributor library relied on , but failed to provide
> a direct include of this file.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
Acked-by: Bruce Richardson 


Re: [PATCH v2 5/6] fib: properly include vector API header file

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 09:24:51AM +0200, Mattias Rönnblom wrote:
> The trie implementation of the fib library relied on , but
> failed to provide a direct include of this file.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
Acked-by: Bruce Richardson 


Re: [PATCH v2 1/6] net/fm10k: add missing intrinsic include

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 09:24:47AM +0200, Mattias Rönnblom wrote:
> Add missing  include, to get the _mm_cvtsi128_si64
> prototype.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
>  drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 2b6914b1da..d417b31bbb 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -10,6 +10,7 @@
>  #include "base/fm10k_type.h"
>  
>  #include 
> +#include 
>  
Beyond my ack of this patch, a small suggestion is to just include
rte_vect.h rather than trying to include specific x86-intrinsics headers.

My ack remains with or without taking on board this suggestion.

/Bruce


RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM IPsec operation

2024-06-20 Thread Suanming Mou


> -Original Message-
> From: Akhil Goyal 
> Sent: Thursday, June 20, 2024 5:07 PM
> To: Suanming Mou ; Matan Azrad
> 
> Cc: dev@dpdk.org
> Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> IPsec operation
> 
> > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize AES-GCM
> > > IPsec operation
> > >
> > > > Hi,
> > > >
> > > > > -Original Message-
> > > > > From: Akhil Goyal 
> > > > > Sent: Friday, June 14, 2024 5:07 PM
> > > > > To: Suanming Mou ; Matan Azrad
> > > > > 
> > > > > Cc: dev@dpdk.org
> > > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize
> > > > > AES-GCM IPsec operation
> > > > >
> > > > > > Hi Akhil,
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Akhil Goyal 
> > > > > > > Sent: Friday, June 14, 2024 2:49 PM
> > > > > > > To: Suanming Mou ; Matan Azrad
> > > > > > > 
> > > > > > > Cc: dev@dpdk.org
> > > > > > > Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto/mlx5: optimize
> > > > > > > AES-GCM IPsec operation
> > > > > > >
> > > > > > > > To optimize AES-GCM IPsec operation within crypto/mlx5,
> > > > > > > > the DPDK API typically supplies AES_GCM AAD/Payload/Digest
> > > > > > > > in separate locations, potentially disrupting their
> > > > > > > > contiguous layout. In cases where the memory layout fails
> > > > > > > > to meet hardware (HW) requirements, an UMR WQE is
> > > > > > > > initiated ahead of the GCM's GGA WQE to establish a
> > > > > > > > continuous AAD/Payload/Digest virtual memory space for
> > > > the
> > > > > HW MMU.
> > > > > > > >
> > > > > > > > For IPsec scenarios, where the memory layout consistently
> > > > > > > > adheres to the fixed order of AAD/IV/Payload/Digest,
> > > > > > > > directly shrinking memory for AAD proves more efficient
> > > > > > > > than preparing a UMR WQE. To address this, a new devarg
> > > > > > > > "crypto_mode" with mode "ipsec_opt" is introduced in the
> > > > > > > > commit, offering an optimization hint specifically for
> > > > > > > > IPsec cases. When enabled, the PMD copies AAD directly
> > > > > > > > before Payload in the enqueue_burst function instead of
> employing the UMR WQE.
> > > > > > > > Subsequently, in the dequeue_burst function, the
> > > > > > > > overridden IV before Payload is restored from the GGA WQE.
> > > > > > > > It's crucial for users to avoid utilizing the input mbuf
> > > > > > > > data
> > > > during
> > > > > processing.
> > > > > > >
> > > > > > > This seems very specific to mlx5 and is not as per the
> > > > > > > expectations of cryptodev APIs.
> > > > > > >
> > > > > > > It seems you are asking to alter the user application to
> > > > > > > accommodate this to support IPsec.
> > > > > > >
> > > > > > > Cryptodev APIs are for generic crypto processing of data as
> > > > > > > defined in rte_crypto_op.
> > > > > > > With your proposed changes, it seems the behavior of the
> > > > > > > crypto APIs will be different in case of mlx5 which I believe is 
> > > > > > > not
> correct.
> > > > > > >
> > > > > > > Is it not possible for you to use rte_security IPsec path?
> > > > > > >
> > > > > >
> > > > > > Sorry I don't understand why that conflicts the API, IIUC
> > > > > > crypto API only just defines the AAD/Payload/Digest in struct
> > > > > > rte_crypto_sym_op, but not restrict the sequence, and
> > > > > > AAD/Payload/Digest may come from
> > > > > difference memory space.
> > > > > > Am I missing something here?
> > > > >
> > > > > Yes you are correct that there is no restriction there.
> > > > >
> > > > > > The input requirements from mlx5 HW is AAD/Payload/Digest
> > > > > > sequence, if the input memory of AAD/Payload/Digest does not
> > > > > > meet the requirements, PMD will try to combine the memory
> > > > > > address space with UMR WQE as that commit does by software
> shrink.
> > > > >
> > > > > And here, you are adding a restriction for IPsec case.
> > > > > I believe you need a way to identify IPsec case with non-ipsec
> > > > > case in data
> > > path.
> > > > > For that, instead of using a devarg(which is a specific case for
> > > > > mlx5), you can use generic rte_security session with action type
> > > > > RTE_SECURITY_ACTION_TYPE_NONE.
> > > >
> > > > Just to emphasize, this is not a restriction, we don't restrict
> > > > user must use that devarg for IPSEC case.
> > > > The way to identify or apply that optimization is user's devarg of
> > > "ipsec_opt".
> > > > Without that hint from devarg, pmd will work in UMR mode to
> > > > combine the memory addresses.
> > >
> > > Even if it is an optional thing.
> > > After adding the devarg, the user is expected to use the buffers the
> > > way your PMD is expecting. So, this is a restriction. Right?
> >
> > The devarg is not enabled by default, if user adds the devarg, that
> > means user know what he is doing, and the input is suitable for that
> optimization.
> > PMD doesn't restrict user must use that hint to handle IPsec case,
> > user will still be 

Re: [PATCH v2 037/148] net/ice/base: fix NVM feature check

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:00:31PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Add defines required by NVM feature check. Although not directly used in this
> patch this change is required in order to better match upstream.
> 
> Signed-off-by: Ian Stokes 
> ---
>  drivers/net/ice/base/ice_adminq_cmd.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ice/base/ice_adminq_cmd.h 
> b/drivers/net/ice/base/ice_adminq_cmd.h
> index 6add57d797..89f565d09f 100644
> --- a/drivers/net/ice/base/ice_adminq_cmd.h
> +++ b/drivers/net/ice/base/ice_adminq_cmd.h
> @@ -131,6 +131,8 @@ struct ice_aqc_list_caps_elem {
>  #define ICE_AQC_CAPS_NAC_TOPOLOGY0x0087
>  #define ICE_AQC_CAPS_OROM_RECOVERY_UPDATE0x0090
>  #define ICE_AQC_CAPS_ROCEV2_LAG  0x0092
> +#define ICE_AQC_BIT_ROCEV2_LAG   0x01
> +#define ICE_AQC_BIT_SRIOV_LAG0x02
>  
For defines like this that are just for alignment with the base code, I'd
merge to a single patch, or into other patches that add other defines.
Similar defines are in patch 40 here, so at minimum those can be merged.

/Bruce


Re: [PATCH v2 040/148] net/ice/base: add FW load status mask

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:00:34PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Add a mask used to extract FW load status from GL_MNG_FWSM.
> 
> Signed-off-by: Jan Sokolowski 
> Signed-off-by: Ian Stokes 
> ---
>  drivers/net/ice/base/ice_hw_autogen.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/ice/base/ice_hw_autogen.h 
> b/drivers/net/ice/base/ice_hw_autogen.h
> index 3d5d8950bf..fde5f9d86f 100644
> --- a/drivers/net/ice/base/ice_hw_autogen.h
> +++ b/drivers/net/ice/base/ice_hw_autogen.h
> @@ -5474,6 +5474,7 @@
>  #define GL_MNG_FW_RAM_STAT_MNG_MEM_ECC_ERR_S 1
>  #define GL_MNG_FW_RAM_STAT_MNG_MEM_ECC_ERR_M BIT(1)
>  #define GL_MNG_FWSM  0x000B6134 /* Reset Source: POR 
> */
> +#define GL_MNG_FWSM_FW_LOADING_M BIT(30)
>  #define GL_MNG_FWSM_FW_MODES_S   0
>  #define GL_MNG_FWSM_FW_MODES_M   MAKEMASK(0x7, 0)
>  #define GL_MNG_FWSM_RSV0_S   3
> -- 
This can be merged into another patch. Either patch 37 as I suggested in
comment on it, or perhaps better in patch 3.

/Bruce
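
As a small illustration of how a single-bit mask such as the one added above is typically applied, here is a sketch under stated assumptions. BIT() is defined locally as a stand-in for the base code's macro, fw_is_loading() is a hypothetical helper, and treating a set bit as "firmware is loading" is an assumption; the patch only says the mask extracts the FW load status.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1UL << (n))              /* local stand-in for the base code macro */
#define GL_MNG_FWSM_FW_LOADING_M BIT(30) /* value as in the quoted patch */

/*
 * Hypothetical helper: 'fwsm' is a value previously read from the
 * GL_MNG_FWSM register. Interpreting a set bit as "loading in progress"
 * is an assumption made for this sketch.
 */
static bool
fw_is_loading(uint32_t fwsm)
{
	return (fwsm & GL_MNG_FWSM_FW_LOADING_M) != 0;
}
```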


DPDK Release Status Meeting 2024-06-20

2024-06-20 Thread Mcnamara, John
Release status meeting minutes 2024-06-20
=========================================

Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

Participants:
* AMD
* ARM
* Intel
* Marvell
* Nvidia
* Red Hat


Release Dates
-

The following are the current/updated working dates for 24.07:

- Proposal deadline (RFC/v1 patches): 26 April 2024
- API freeze (-rc1): 14 June 2024
- PMD features freeze (-rc2): 5 July 2024
- Builtin applications features freeze (-rc3): Mid July 2024 (TBC)
- Release: End July 2024 (TBC)


https://core.dpdk.org/roadmap/#dates


Subtrees
--------

* next-net
  * Merged most of the ethdev patches.
  * GENEVE one is remaining, I guess that is for next release.
  * Napatech PMD reviewed, will wait for new version.
  * No other updates this week.

* next-net-intel
  * Large ICE driver base code update under review.

* next-net-mlx
  * No update.

* next-net-mvl
  * No update.

* next-eventdev
  * No update.

* next-baseband
  * Some of the main series merged for RC1.
    Working on remaining series.

* next-virtio
  * Some series merged for RC1.
    More patches need reviews and will go to rc2.

* next-crypto
  * 54 patches (mainly to test apps).
  * Reviews needed on OpenSSL driver.
  * Awaiting some updates as requested from submitters.

* main
  * RC1 is out. Awaiting test.
  * Highlights of 24.07-rc1:
    - pointer compression library
    - AMD Pensando ionic crypto driver
    - UADK compress driver
    - Marvell Odyssey ODM DMA driver
    - more cleanups to prepare MSVC build

LTS
---

Please add acks to confirm validation support for a 3 year LTS window:
http://inbox.dpdk.org/dev/20240117161804.223582-1-ktray...@redhat.com/

* 23.11.1 - Released.
* 22.11.5 - Released.
* 21.11.7 - Released.

* 20.11.10 - Will only be updated with CVE and critical fixes.
* 19.11.15 - Will only be updated with CVE and critical fixes.


* Distros
  * Debian 12 contains DPDK v22.11
  * Ubuntu 24.04 contains DPDK v23.11
  * Ubuntu 23.04 contains DPDK v22.11
  * RHEL 8/9 contains DPDK 23.11

Defects
-------

* Bugzilla links, 'Bugs', added for hosted projects
  * https://www.dpdk.org/hosted-projects/



DPDK Release Status Meetings
----------------------------

The DPDK Release Status Meeting is intended for DPDK Committers to discuss the
status of the master tree and sub-trees, and for project managers to track
progress or milestone dates.

The meeting occurs every Thursday at 9:30 UTC over Jitsi on
https://meet.jit.si/DPDK

You don't need an invite to join the meeting, but if you want a calendar
reminder just send an email to "John McNamara john.mcnam...@intel.com" for the
invite.


Re: [PATCH v2 1/6] net/fm10k: add missing intrinsic include

2024-06-20 Thread Mattias Rönnblom

On 2024-06-20 11:28, Bruce Richardson wrote:
> On Thu, Jun 20, 2024 at 09:24:47AM +0200, Mattias Rönnblom wrote:
> > Add missing  include, to get the _mm_cvtsi128_si64
> > prototype.
> >
> > Signed-off-by: Mattias Rönnblom 
> > ---
> >   drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> > b/drivers/net/fm10k/fm10k_rxtx_vec.c
> > index 2b6914b1da..d417b31bbb 100644
> > --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> > +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> > @@ -10,6 +10,7 @@
> >   #include "base/fm10k_type.h"
> >   #include 
> > +#include 
>
> Beyond my ack of this patch, a small suggestion is to just include
> rte_vect.h rather than trying to include specific x86-intrinsics headers.
>
> My ack remains with or without taking on board this suggestion.
>
> /Bruce


I will do that, and hope it will magically solve the 
_mm_cvtsi128_si64-on-32-bit-x86 issue.


Re: [PATCH v2 1/6] net/fm10k: add missing intrinsic include

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 01:40:42PM +0200, Mattias Rönnblom wrote:
> On 2024-06-20 11:28, Bruce Richardson wrote:
> > On Thu, Jun 20, 2024 at 09:24:47AM +0200, Mattias Rönnblom wrote:
> > > Add missing  include, to get the _mm_cvtsi128_si64
> > > prototype.
> > > 
> > > Signed-off-by: Mattias Rönnblom 
> > > ---
> > >   drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> > > b/drivers/net/fm10k/fm10k_rxtx_vec.c
> > > index 2b6914b1da..d417b31bbb 100644
> > > --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> > > +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> > > @@ -10,6 +10,7 @@
> > >   #include "base/fm10k_type.h"
> > >   #include 
> > > +#include 
> > Beyond my ack of this patch, a small suggestion is to just include
> > rte_vect.h rather than trying to include specific x86-intrinsics headers.
> > 
> > My ack remains with or without taking on board this suggestion.
> > 
> > /Bruce
> 
> I will do that, and hope it will magically solve the
> _mm_cvtsi128_si64-on-32-bit-x86 issue.

I was looking at that, and it does solve it in my testing. There are a lot
of drivers that have just "tmmintrin.h" included. Changing all of those to
rte_vect.h allows 32bit to build with your other changes applied.

/Bruce
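
For readers following the 32-bit discussion above: compilers typically
declare _mm_cvtsi128_si64() only for 64-bit x86 targets, which is why a
bare intrinsics include fails on 32-bit builds. The sketch below is not
the rte_vect.h implementation (that header is not shown in this thread);
it only illustrates, under that assumption, one portable way to read the
low 64 bits of a 16-byte vector. The helper name low64_of_vec16 is
hypothetical.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical helper (not a DPDK API): return the low 64 bits of a
 * 16-byte vector held in memory.
 */
static inline int64_t
low64_of_vec16(const uint8_t bytes[16])
{
#if defined(__SSE2__) && defined(__x86_64__)
	/* 64-bit x86: the intrinsic is available directly. */
	__m128i v = _mm_loadu_si128((const __m128i *)bytes);

	return _mm_cvtsi128_si64(v);
#elif defined(__SSE2__)
	/* 32-bit x86: store the low 64-bit lane to memory instead. */
	__m128i v = _mm_loadu_si128((const __m128i *)bytes);
	int64_t out;

	_mm_storel_epi64((__m128i *)&out, v);
	return out;
#else
	/* Non-x86 fallback: plain byte copy of the first 8 bytes. */
	int64_t out;

	memcpy(&out, bytes, sizeof(out));
	return out;
#endif
}
```

All three branches return the first eight bytes of the buffer in native
byte order, so callers do not need to know which one was compiled in.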


[PATCH v3 0/6] Optionally have rte_memcpy delegate to compiler memcpy

2024-06-20 Thread Mattias Rönnblom
This patch set makes DPDK library, driver, and application code use the
compiler/libc memcpy() by default when functions in  are
invoked.

The various custom DPDK rte_memcpy() implementations may be retained
by means of a build-time option.

This patch set only makes a difference on x86, PPC and ARM. Loongarch
and RISCV already used compiler/libc memcpy().

This patch set includes a number of fixes in drivers and libraries
which erroneously relied on  including header files
(i.e., ) required by its implementation.

Mattias Rönnblom (6):
  net/fm10k: add missing vector API header include
  event/dlb2: include headers for vector and memory copy APIs
  net/octeon_ep: add missing vector API header include
  distributor: add missing vector API header include
  fib: add missing vector API header include
  eal: provide option to use compiler memcpy instead of RTE

 config/meson.build |  1 +
 doc/guides/rel_notes/release_24_07.rst | 21 +
 drivers/event/dlb2/dlb2.c  |  2 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  1 +
 drivers/net/octeon_ep/otx_ep_ethdev.c  |  2 +
 lib/distributor/rte_distributor.c  |  1 +
 lib/eal/arm/include/rte_memcpy.h   | 10 +
 lib/eal/include/generic/rte_memcpy.h   | 61 +++---
 lib/eal/loongarch/include/rte_memcpy.h | 53 ++
 lib/eal/ppc/include/rte_memcpy.h   | 10 +
 lib/eal/riscv/include/rte_memcpy.h | 53 ++
 lib/eal/x86/include/meson.build|  1 +
 lib/eal/x86/include/rte_memcpy.h   | 11 -
 lib/fib/trie.c |  1 +
 meson_options.txt  |  2 +
 15 files changed, 124 insertions(+), 106 deletions(-)

-- 
2.34.1
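
A minimal sketch of the idea described in the cover letter above, for
illustration only: a build-time switch selects between delegating to the
compiler/libc memcpy() and a custom copy routine. The macro
SKETCH_USE_CC_MEMCPY and the function sketch_memcpy() are hypothetical
stand-ins (the series itself uses RTE_USE_CC_MEMCPY and rte_memcpy()),
and the byte loop merely takes the place of the handcrafted
per-architecture implementations.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: stands in for the build-time define set by meson. */
#ifndef SKETCH_USE_CC_MEMCPY
#define SKETCH_USE_CC_MEMCPY 1
#endif

#if SKETCH_USE_CC_MEMCPY

/* Default path: hand the copy to the compiler/libc memcpy(). */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

#else

/* Placeholder for a custom implementation; a simple byte loop here. */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

#endif /* SKETCH_USE_CC_MEMCPY */
```

With the delegating variant, the compiler sees an ordinary memcpy() call,
which is what lets its built-in diagnostics and static analysis tools
reason about buffer sizes and overlap.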



[PATCH v3 1/6] net/fm10k: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The fm10k PMD relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2b6914b1da..62119de373 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 #include "fm10k.h"
 #include "base/fm10k_type.h"
 
-- 
2.34.1



[PATCH v3 3/6] net/octeon_ep: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The octeon_ep driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/octeon_ep/otx_ep_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c 
b/drivers/net/octeon_ep/otx_ep_ethdev.c
index 46211361a0..b069216629 100644
--- a/drivers/net/octeon_ep/otx_ep_ethdev.c
+++ b/drivers/net/octeon_ep/otx_ep_ethdev.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include 
+
 #include "otx_ep_common.h"
 #include "otx_ep_vf.h"
 #include "otx2_ep_vf.h"
-- 
2.34.1



[PATCH v3 4/6] distributor: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The distributor library relied on , but failed to provide
a direct include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 lib/distributor/rte_distributor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/distributor/rte_distributor.c 
b/lib/distributor/rte_distributor.c
index e58727cdc2..1389efc03f 100644
--- a/lib/distributor/rte_distributor.c
+++ b/lib/distributor/rte_distributor.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_distributor.h"
 #include "rte_distributor_single.h"
-- 
2.34.1



[PATCH v3 6/6] eal: provide option to use compiler memcpy instead of RTE

2024-06-20 Thread Mattias Rönnblom
Provide build option to have functions in  delegate to
the standard compiler/libc memcpy(), instead of using the various
custom DPDK, handcrafted, per-architecture rte_memcpy()
implementations.

A new meson build option 'use_cc_memcpy' is added. By default,
the compiler/libc memcpy() is used.

The performance benefits of the custom DPDK rte_memcpy()
implementations have been diminishing with every compiler release, and
with current toolchains the use of a custom memcpy() implementation
may even be a liability.

This patch leaves an option to stay on the custom DPDK implementations,
should that prove beneficial for certain applications or architectures.

An additional benefit of this change is that compilers and static
analysis tools have an easier time detecting incorrect usage of
rte_memcpy() (e.g., buffer overruns, or overlapping source and
destination buffers).

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 

---

PATCH:
 o Add entry in release notes.
 o Update meson help text.

RFC v3:
 o Fix missing #endif on loongarch.
 o PPC and RISCV now implemented, meaning all architectures are supported.
 o Unnecessary  include is removed from .

RFC v2:
 * Fix bug where rte_memcpy.h was not installed on x86.
 * Made attempt to make Loongarch compile.
---
 config/meson.build |  1 +
 doc/guides/rel_notes/release_24_07.rst | 21 +
 lib/eal/arm/include/rte_memcpy.h   | 10 +
 lib/eal/include/generic/rte_memcpy.h   | 61 +++---
 lib/eal/loongarch/include/rte_memcpy.h | 53 ++
 lib/eal/ppc/include/rte_memcpy.h   | 10 +
 lib/eal/riscv/include/rte_memcpy.h | 53 ++
 lib/eal/x86/include/meson.build|  1 +
 lib/eal/x86/include/rte_memcpy.h   | 11 -
 meson_options.txt  |  2 +
 10 files changed, 117 insertions(+), 106 deletions(-)

diff --git a/config/meson.build b/config/meson.build
index 8c8b019c25..456056628e 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -353,6 +353,7 @@ endforeach
 # set other values pulled from the build options
 dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports'))
 dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet'))
+dpdk_conf.set('RTE_USE_CC_MEMCPY', get_option('use_cc_memcpy'))
 dpdk_conf.set('RTE_ENABLE_STDATOMIC', get_option('enable_stdatomic'))
 dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp'))
 dpdk_conf.set('RTE_PKTMBUF_HEADROOM', get_option('pkt_mbuf_headroom'))
diff --git a/doc/guides/rel_notes/release_24_07.rst 
b/doc/guides/rel_notes/release_24_07.rst
index 7c88de381b..ebe0085d8b 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -24,6 +24,27 @@ DPDK Release 24.07
 New Features
 
 
+* **Compiler memcpy replaces custom DPDK implementation.**
+
+  The memory copy functions of  now delegates to the
+  standard memcpy() function, implemented by the compiler and the C
+  runtime (e.g., libc).
+
+  In this release of DPDK, the handcrafted, per-architecture memory
+  copy implementations are still available, and may be reactivated by
+  setting the new ``use_cc_memcpy`` build option to false.
+
+  The performance benefits of the custom DPDK rte_memcpy()
+  implementations have been diminishing with every new compiler
+  release, and with current toolchains the use of a custom memcpy()
+  implementation may even result in worse performance than the
+  standard memcpy().
+
+  An additional benefit of this change is that compilers and static
+  analysis tools have an easier time detecting incorrect usage of
+  rte_memcpy() (e.g., buffer overruns, or overlapping source and
+  destination buffers).
+
 .. This section should contain new features added in this release.
Sample format:
 
diff --git a/lib/eal/arm/include/rte_memcpy.h b/lib/eal/arm/include/rte_memcpy.h
index 47dea9a8cc..e8aff722df 100644
--- a/lib/eal/arm/include/rte_memcpy.h
+++ b/lib/eal/arm/include/rte_memcpy.h
@@ -5,10 +5,20 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#include 
+
+#ifdef RTE_USE_CC_MEMCPY
+
+#include 
+
+#else
+
 #ifdef RTE_ARCH_64
 #include 
 #else
 #include 
 #endif
 
+#endif /* RTE_USE_CC_MEMCPY */
+
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/eal/include/generic/rte_memcpy.h 
b/lib/eal/include/generic/rte_memcpy.h
index e7f0f8eaa9..cae06117fb 100644
--- a/lib/eal/include/generic/rte_memcpy.h
+++ b/lib/eal/include/generic/rte_memcpy.h
@@ -5,12 +5,19 @@
 #ifndef _RTE_MEMCPY_H_
 #define _RTE_MEMCPY_H_
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * @file
  *
  * Functions for vectorised implementation of memcpy().
  */
 
+#include 
+#include 
+
 /**
  * Copy 16 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
@@ -35,8 +42,6 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov32(uint8_t *dst, const uint8_t *src);
 
-#ifdef __DOXYGEN__
-
 /**
  * Copy 

[PATCH v3 5/6] fib: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The trie implementation of the fib library relied on , but
failed to provide a direct include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 lib/fib/trie.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/fib/trie.c b/lib/fib/trie.c
index 09470e7287..74db8863df 100644
--- a/lib/fib/trie.c
+++ b/lib/fib/trie.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-- 
2.34.1



[PATCH v3 2/6] event/dlb2: include headers for vector and memory copy APIs

2024-06-20 Thread Mattias Rönnblom
The DLB2 PMD depended on  being included as a side-effect
of  being included.

In addition, DLB2 used rte_memcpy() but did not include ,
but rather depended on other include files to do so.

This patch addresses both of those issues.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 drivers/event/dlb2/dlb2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 0b91f03956..19f90b8f8d 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -25,11 +25,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "dlb2_priv.h"
 #include "dlb2_iface.h"
-- 
2.34.1



Re: [PATCH v5 1/5] event/dlb2: add support for HW delayed token

2024-06-20 Thread Jerin Jacob
On Thu, Jun 20, 2024 at 2:37 AM Abdullah Sevincer
 wrote:
>
> In DLB 2.5, hardware assist is available, complementing the Delayed
> token POP software implementation. When it is enabled, the feature
> works as follows:
>
> It stops CQ scheduling when the inflight limit associated with the CQ
> is reached. So the feature is activated only if the core is
> congested. If the core can handle multiple atomic flows, DLB will not
> try to switch them. This is an improvement over SW implementation
> which always switches the flows.
>
> The feature will resume CQ scheduling when the number of pending
> completions fall below a configured threshold. To emulate older 2.0
> behavior, this threshold is set to 1 by old APIs. SW sets CQ to
> auto-pop mode for token return, as tokens withholding is not
> necessary now. As HW counts completions and not tokens, events equal
> to HL (History List) entries will be scheduled to DLB before the
> feature activates and stops CQ scheduling.
>
> Signed-off-by: Abdullah Sevincer 

>
> +/** Set inflight threshold for flow migration */
> +#define RTE_PMD_DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0)
> +
> +/** Set port history list */
> +#define RTE_PMD_DLB2_SET_PORT_HL RTE_BIT64(1)
> +

Missing Doxygen comment

> +struct rte_pmd_dlb2_port_params {
> +   uint16_t inflight_threshold : 12;

Missing Doxygen comment

> +};
> +
> +/*!
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Configure various port parameters.
> + * AUTO_POP. This function must be called before calling 
> rte_event_port_setup()
> + * for the port, but after calling rte_event_dev_configure().
> + *
> + * @param dev_id
> + *The identifier of the event device.
> + * @param port_id
> + *The identifier of the event port.
> + * @param flags
> + *Bitmask of the parameters being set.
> + * @param params
> + *Structure coantaining the values of parameters being set.
> + *
> + * @return
> + * - 0: Success
> + * - EINVAL: Invalid dev_id, port_id, or mode
> + * - EINVAL: The DLB2 is not configured, is already running, or the port is
> + *   already setup
> + */
> +__rte_experimental
> +int
> +rte_pmd_dlb2_set_port_params(uint8_t dev_id,
> +   uint8_t port_id,
> +   uint64_t flags,
> +   struct  rte_pmd_dlb2_port_params *params);
> +
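
The stop/resume behaviour described in the quoted commit message can be
pictured with a small software model. This is only a toy sketch under
stated assumptions, not the DLB hardware logic or driver code: the
struct and function names are hypothetical, and only the
inflight_threshold field name echoes the quoted structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one CQ; all names here are hypothetical. */
struct cq_model {
	uint16_t inflight_limit;     /* scheduling stops at this many in-flight events */
	uint16_t inflight_threshold; /* scheduling resumes below this many */
	uint16_t inflight;           /* events scheduled but not yet completed */
	bool scheduling;             /* is the CQ currently being scheduled to? */
};

static inline void
cq_model_init(struct cq_model *cq, uint16_t limit, uint16_t threshold)
{
	cq->inflight_limit = limit;
	cq->inflight_threshold = threshold;
	cq->inflight = 0;
	cq->scheduling = true;
}

/* Try to schedule one event; returns true if the CQ accepted it. */
static inline bool
cq_model_schedule(struct cq_model *cq)
{
	if (!cq->scheduling)
		return false;
	cq->inflight++;
	/* Stop scheduling once the in-flight limit is reached. */
	if (cq->inflight >= cq->inflight_limit)
		cq->scheduling = false;
	return true;
}

/* Record one completion; may re-enable scheduling. */
static inline void
cq_model_complete(struct cq_model *cq)
{
	if (cq->inflight > 0)
		cq->inflight--;
	/* Resume only when pending completions fall below the threshold. */
	if (!cq->scheduling && cq->inflight < cq->inflight_threshold)
		cq->scheduling = true;
}
```

In this model an uncongested core never reaches the limit, so scheduling
is never interrupted, which matches the "activated only if the core is
congested" description.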


Re: [PATCH v2 047/148] net/ice/base: added informational message for NAC topology

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:00:41PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Use proper bitmask to verify primary/secondary mode instead of whole 'mode'
> field, which also includes other information. In the result, for Mode 2a for
> example, 'secondary' mode was always reported which was misleading.
> 

Commit log is not accurate for this, I think. Commit title alone is
probably sufficient for such a simple change, or at most a one-line
description of it.

/Bruce

> Signed-off-by: Prathisna Padmasanan 
> Signed-off-by: Ian Stokes 
> ---
>  drivers/net/ice/base/ice_common.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/ice/base/ice_common.c 
> b/drivers/net/ice/base/ice_common.c
> index 45fea193da..62c68b6d73 100644
> --- a/drivers/net/ice/base/ice_common.c
> +++ b/drivers/net/ice/base/ice_common.c
> @@ -2886,6 +2886,10 @@ ice_parse_nac_topo_dev_caps(struct ice_hw *hw, struct 
> ice_hw_dev_caps *dev_p,
>   dev_p->nac_topo.mode = LE32_TO_CPU(cap->number);
>   dev_p->nac_topo.id = LE32_TO_CPU(cap->phys_id) & ICE_NAC_TOPO_ID_M;
>  
> + ice_info(hw, "PF is configured in %s mode with IP instance ID %d\n",
> +  (dev_p->nac_topo.mode & ICE_NAC_TOPO_PRIMARY_M) ?
> +  "primary" : "secondary", dev_p->nac_topo.id);
> +
>   ice_debug(hw, ICE_DBG_INIT, "dev caps: nac topology is_primary = %d\n",
> !!(dev_p->nac_topo.mode & ICE_NAC_TOPO_PRIMARY_M));
>   ice_debug(hw, ICE_DBG_INIT, "dev caps: nac topology is_dual = %d\n",
> -- 
> 2.43.0
> 
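
To illustrate the quoted commit log's point about testing a bitmask
rather than the whole 'mode' field, here is a minimal sketch. The mask
values and names below are assumptions made up for the example; the real
ICE_NAC_TOPO_* definitions live in the driver headers and are not shown
in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define SKETCH_TOPO_PRIMARY_M (1u << 0)
#define SKETCH_TOPO_DUAL_M    (1u << 1)

/* Naive check: compares the whole field, so any extra bit breaks it. */
static inline bool
sketch_is_primary_naive(uint32_t mode)
{
	return mode == SKETCH_TOPO_PRIMARY_M;
}

/* Masked check: looks only at the primary bit. */
static inline bool
sketch_is_primary(uint32_t mode)
{
	return (mode & SKETCH_TOPO_PRIMARY_M) != 0;
}
```

With both bits set, the naive comparison reports "secondary" even though
the primary bit is present, which is the kind of misreport the commit
log describes.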


Re: [PATCH v5 3/5] event/dlb2: enhance DLB credit handling

2024-06-20 Thread Jerin Jacob
On Thu, Jun 20, 2024 at 2:31 AM Abdullah Sevincer
 wrote:
>
> This commit improves DLB credit handling scenarios when
> ports hold on to credits but can't release them due to insufficient
> accumulation (less than 2 * credit quanta).
>
> Worker ports now release all accumulated credits when back-to-back
> zero poll count reaches preset threshold.
>
> Producer ports release all accumulated credits if enqueue fails for a
> consecutive number of retries.
>
> All newly introduced compilation flags are in the fastpath.
>
> Signed-off-by: Abdullah Sevincer 
> ---
>  drivers/event/dlb2/dlb2.c  | 322 +++--
>  drivers/event/dlb2/dlb2_priv.h |   1 +
>  drivers/event/dlb2/meson.build |  40 
>  meson_options.txt  |   2 +

+ @Richardson, Bruce  @Thomas Monjalon  @David Marchand @Ferruh Yigit

It is not allowed to add PMD specific build options in generic DPDK
build options.  Please check with Bruce.

You may use a scheme like
https://patches.dpdk.org/project/dpdk/patch/20240522192139.3016-1-pbhagavat...@marvell.com/

or, if we think we need to standardize the PMD compilation options,
then we can do that as well.
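
As an aside on the worker-port behaviour described in the quoted commit
message, the "release all accumulated credits after a run of empty
polls" rule could be modelled as below. This is a hedged sketch, not the
dlb2 driver code: the struct, its fields and the function name are all
hypothetical.

```c
#include <stdint.h>

/* Toy model of a worker port; all names are hypothetical. */
struct port_model {
	uint32_t cached_credits;      /* credits held locally by the port */
	uint32_t zero_polls;          /* consecutive polls that returned no events */
	uint32_t zero_poll_threshold; /* release credits after this many empty polls */
};

/*
 * Account for one dequeue attempt. Returns the number of credits
 * released back to the shared pool (0 if none were released).
 */
static inline uint32_t
port_model_on_poll(struct port_model *p, uint32_t events_dequeued)
{
	uint32_t released;

	if (events_dequeued > 0) {
		/* Any successful poll breaks the back-to-back run. */
		p->zero_polls = 0;
		return 0;
	}
	if (++p->zero_polls < p->zero_poll_threshold)
		return 0;

	/* Threshold reached: hand back everything accumulated so far. */
	released = p->cached_credits;
	p->cached_credits = 0;
	p->zero_polls = 0;
	return released;
}
```

The point of the rule is that an idle port does not sit on credits that
a busy port could use, even when it holds fewer than the normal release
quantum.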


[PATCH] net/*: replace intrinsic header include with rte_vect

2024-06-20 Thread Bruce Richardson
Rather than having the SSE code in each driver include tmmintrin.h,
which often does not contain all needed intrinsics, e.g.
_mm_cvtsi128_si64() for 32-bit x86 builds, we can just replace the
include of ?mmintrin.h with rte_vect.h for all network drivers.

Signed-off-by: Bruce Richardson 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c  | 2 +-
 drivers/net/i40e/i40e_rxtx_vec_sse.c| 2 +-
 drivers/net/iavf/iavf_rxtx_vec_sse.c| 2 +-
 drivers/net/ice/ice_rxtx_vec_sse.c  | 2 +-
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 2 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h| 2 +-
 drivers/net/ngbe/ngbe_rxtx_vec_sse.c| 2 +-
 drivers/net/txgbe/txgbe_rxtx_vec_sse.c  | 2 +-
 drivers/net/virtio/virtio_rxtx_simple_sse.c | 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 62119de373..9a84775cb1 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -10,7 +10,7 @@
 #include "fm10k.h"
 #include "base/fm10k_type.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c 
b/drivers/net/i40e/i40e_rxtx_vec_sse.c
index 2d4480a765..ad560d2b6b 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
@@ -12,7 +12,7 @@
 #include "i40e_rxtx.h"
 #include "i40e_rxtx_vec_common.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
diff --git a/drivers/net/iavf/iavf_rxtx_vec_sse.c 
b/drivers/net/iavf/iavf_rxtx_vec_sse.c
index 96f187f511..0db6fa8bd4 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_sse.c
+++ b/drivers/net/iavf/iavf_rxtx_vec_sse.c
@@ -10,7 +10,7 @@
 #include "iavf_rxtx.h"
 #include "iavf_rxtx_vec_common.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c 
b/drivers/net/ice/ice_rxtx_vec_sse.c
index 9a1b7e3e51..c01d8ede29 100644
--- a/drivers/net/ice/ice_rxtx_vec_sse.c
+++ b/drivers/net/ice/ice_rxtx_vec_sse.c
@@ -4,7 +4,7 @@
 
 #include "ice_rxtx_vec_common.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
index f60808d576..a77370cdb7 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
@@ -10,7 +10,7 @@
 #include "ixgbe_rxtx.h"
 #include "ixgbe_rxtx_vec_common.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h 
b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
index 2bdd1f676d..93d6d1b5f0 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
@@ -9,7 +9,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include 
 #include 
diff --git a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c 
b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
index f703d0ea15..b128bd3a67 100644
--- a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
+++ b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
@@ -11,7 +11,7 @@
 #include "ngbe_rxtx.h"
 #include "ngbe_rxtx_vec_common.h"
 
-#include 
+#include 
 
 static inline void
 ngbe_rxq_rearm(struct ngbe_rx_queue *rxq)
diff --git a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c 
b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
index 12eb4aeef5..1a3f2ce3cd 100644
--- a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
+++ b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
@@ -10,7 +10,7 @@
 #include "txgbe_rxtx.h"
 #include "txgbe_rxtx_vec_common.h"
 
-#include 
+#include 
 
 static inline void
 txgbe_rxq_rearm(struct txgbe_rx_queue *rxq)
diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c 
b/drivers/net/virtio/virtio_rxtx_simple_sse.c
index 6a18741b6d..d53acc4fd6 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
@@ -8,7 +8,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #include 
 #include 
-- 
2.43.0



Re: [PATCH v3 1/6] net/fm10k: add missing vector API header include

2024-06-20 Thread Bruce Richardson
On Thu, Jun 20, 2024 at 01:50:22PM +0200, Mattias Rönnblom wrote:
> The fm10k PMD relied on , but failed to provide a direct
> include of this file.
> 
> Signed-off-by: Mattias Rönnblom 
> Acked-by: Bruce Richardson 
> ---
>  drivers/net/fm10k/fm10k_rxtx_vec.c | 1 +
>  1 file changed, 1 insertion(+)
> 
To fix 32-bit builds, more than just this driver needs to be fixed. See
https://patches.dpdk.org/project/dpdk/patch/20240620123218.1936250-1-bruce.richard...@intel.com/

Feel free to include this patch in new revisions of your patchset, if it
simplifies things for you.

/Bruce


Re: [PATCH v2 072/148] net/ice/base: update strict status when assigning BW limits

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:06PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> In the BW configuration performed by DCF functions, the strict/WFQ and 
> priority
> field (referred to as Generic in the EAS) is not updated in the FW.  This 
> needs
> to be updated so as to not incorrectly allocate BW credits in the traffic
> shaping Tx scheduler.
> 
> Call the function "ice_sched_replay_node_prio" in the configuration flow once
> the node has been determined and the value of strict verified.
> 
That change isn't in this patch.


Re: [PATCH v2 073/148] net/ice/base: remove unused define

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:07PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> In a previous patch a define was added that is not used. This is causing 
> issues
> with CI builds.
> 
> Signed-off-by: Dave Ertman 
> Signed-off-by: Ian Stokes 
> ---
>  drivers/net/ice/base/ice_sched.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/ice/base/ice_sched.c 
> b/drivers/net/ice/base/ice_sched.c
> index d39106173b..cb6131f69d 100644
> --- a/drivers/net/ice/base/ice_sched.c
> +++ b/drivers/net/ice/base/ice_sched.c
> @@ -4660,7 +4660,6 @@ ice_sched_save_tc_node_bw(struct ice_port_info *pi, u8 
> tc,
>  }
>  
>  #define ICE_SCHED_GENERIC_STRICT_MODEBIT(4)
> -#define ICE_SCHED_GENERIC_PRIO_M 0xE
>  #define ICE_SCHED_GENERIC_PRIO_S 1
>  
This was just added in the previous patch, p72, so squash 72 & 73 together
for v3.


Re: [PATCH 2/4] dts: Use First Core Logic Change

2024-06-20 Thread Nicholas Pratte
On Fri, Jun 14, 2024 at 2:09 PM Jeremy Spewock  wrote:
>
> On Thu, Jun 13, 2024 at 4:21 PM Nicholas Pratte  wrote:
> >
> > Removed use_first_core from the conf.yaml in favor of determining this
> > within the framework. use_first_core continue to serve a purpose in that
> > it is only enabled when core 0 is explicitly provided in the
> > configuration. Any other configuration, including "" or "any," will
> > omit core 0.
> >
> > Documentation reworks are included to reflect the changes made.
> >
> > Bugzilla ID: 1360
> > Signed-off-by: Nicholas Pratte 
> >
> > ---
> >  doc/guides/tools/dts.rst   |  9 +++--
> >  dts/conf.yaml  |  3 +--
> >  dts/framework/config/__init__.py   | 11 +++
> >  dts/framework/config/conf_yaml_schema.json |  6 +-
> >  dts/framework/testbed_model/node.py|  9 +
> >  5 files changed, 21 insertions(+), 17 deletions(-)
> >
> > diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
> > index da85295db9..fbb5c6f17b 100644
> > --- a/doc/guides/tools/dts.rst
> > +++ b/doc/guides/tools/dts.rst
> > @@ -546,15 +546,12 @@ involved in the testing. These can be defined with 
> > the following mappings:
> > 
> > +---+---+
> > | ``os``| The operating system of this node. See `OS`_ 
> > for supported values.|
> > 
> > +---+---+
> > -   | ``lcores``| | (*optional*, defaults to 1) *string* – 
> > Comma-separated list of logical  |
> > -   |   | | cores to use. An empty string means use all 
> > lcores. |
> > +   | ``lcores``| | (*optional*, defaults to 1 if not used) 
> > *string* – Comma-separated list of logical  |
>
> I think just leaving this as "defaults to 1" is fine here. It's more
> explicit to say "if it isn't used", but just saying it defaults I
> think implies that enough.

Good point, and you're right. '*optional*' and 'if not used' are redundant.

>
> > +   |   | | cores to use. An empty string means use all 
> > lcores except core 0. core 0 is used|
> > +   |   | | only when explicitly specified  
> > |
> > |   |   
> > |
> > |   | **Example**: ``1,2,3,4,5,18-22``  
> > |
> > 
> > +---+---+
> > -   | ``use_first_core``| (*optional*, defaults to ``false``) *boolean* 
> > |
> > -   |   |   
> > |
> > -   |   | Indicates whether DPDK should use only the 
> > first physical core or not.|
> > -   
> > +---+---+
> > | ``memory_channels``   | (*optional*, defaults to 1) *integer* 
> > |
> > |   |   
> > |
> > |   | The number of the memory channels to use. 
> > |
> 
> > diff --git a/dts/framework/config/__init__.py 
> > b/dts/framework/config/__init__.py
> > index 5a127a1207..6bc290a56a 100644
> > --- a/dts/framework/config/__init__.py
> > +++ b/dts/framework/config/__init__.py
> > @@ -245,6 +245,9 @@ def from_dict(
> >  hugepage_config_dict["force_first_numa"] = False
> >  hugepage_config = HugepageConfiguration(**hugepage_config_dict)
> >
> > +lcores = "1" if "lcores" not in d else d["lcores"] if "any" not in 
> > d["lcores"] else ""
> > +use_first_core = "0" in lcores
> > +
> >  # The calls here contain duplicated code which is here because 
> > Mypy doesn't
> >  # properly support dictionary unpacking with TypedDicts
> >  if "traffic_generator" in d:
> > @@ -255,8 +258,8 @@ def from_dict(
> >  password=d.get("password"),
> >  arch=Architecture(d["arch"]),
> >  os=OS(d["os"]),
> > -lcores=d.get("lcores", "1"),
> > -use_first_core=d.get("use_first_core", False),
> > +lcores=lcores,
> > +use_first_core=use_first_core,
>
> I wonder if we could completely remove the "use_first_core" attribute
> from the 

rte_bitmap_free() Re: DPDK Coverity issue 426433

2024-06-20 Thread Boyer, Andrew
Hello John,
While Coverity is correct that this is a useless call, that's an internal
implementation detail of rte_bitmap_free() - not really something the caller
should know about.

Can we annotate rte_bitmap_free() in some way to eliminate these? This is not
the first PMD that's had this issue reported against it.

Thanks,
Andrew

> On Jun 20, 2024, at 7:18 AM, John McNamara  wrote:
> 
> Caution: This message originated from an External Source. Use proper caution 
> when opening attachments, clicking links, or responding.
> 
> 
> Hi Andrew,
> 
> This is an automated email in relation to a new Coverity static code analysis
> issue in DPDK. Details of the issue are below.
> 
> The email has been sent to you because you have been identified as the author
> or maintainer of the code where the defect appears.
> 
> There are several possible scenarios:
> 
> * The defect identified isn't a real issue: In this case you can edit the
>  defect online and change the defect "Classification" to "False Positive" or
>  "Intentional" and change the "Action" to "Ignore". You should also update
>  the "Severity", add yourself as the "Owner" and add a comment note.
> 
> * The defect is a real issue: In this case you should submit a patch to fix
>  the issues. The patch should include the following information in addition
>  to the usual comments and signoff:
> 
>Coverity issue: 426433
>Fixes: 6bc7f2cf6687 ("crypto/ionic: support sessions")
> 
>  In Coverity you should update the Classification, Severity, Action (to "Fix
>  required" or "Fix Submitted"), Owner and a Comment if necessary.
> 
> * The defect wasn't introduced by you. The line where the defect occurs may
>  not be the source of the defect. If this is the case then let the actual
>  author of the defect know by forwarding this email with a note or reply to
>  the sender of this automated email: 
> 
> You can review the defects online at:
> 
>http://scan.coverity.com/projects/dpdk-data-plane-development-kit
> 
> If you aren't registered for the DPDK Coverity you can do so here:
> 
>http://scan.coverity.com/users/sign_up
> 
> Git commit data and Coverity defect information below.
> 
> Commit data
> ===
> 
> Commit: crypto/ionic: support sessions
> Id: 6bc7f2cf6687126e265d848bcb83743a68f96ad6
> Author: Andrew Boyer
> Email:  andrew.bo...@amd.com
> Date:   Fri Jun  7 14:27:37 2024 -0700
> 
> Defect information
> ==
> 
> /drivers/crypto/ionic/ionic_crypto_main.c: 816 in iocpt_free_objs()
> *** CID 426433:  Incorrect expression  (USELESS_CALL)
> 810 for (i = 0; i < dev->crypto_dev->data->nb_queue_pairs; i++) {
> 811 iocpt_cryptoq_free(queue_pairs[i]);
> 812 queue_pairs[i] = NULL;
> 813 }
> 814
> 815 if (dev->sess_bm != NULL) {
> CID 426433:  Incorrect expression  (USELESS_CALL)
> Calling "rte_bitmap_free(dev->sess_bm)" is only useful for its return
> value, which is ignored.
> 816 rte_bitmap_free(dev->sess_bm);
> 817 rte_free(dev->sess_bm);
> 818 dev->sess_bm = NULL;
> 819 }
> 820
> 821 if (dev->adminq != NULL) {
> 
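
A small sketch of why a checker might flag the call above, assuming (as
the Coverity message implies) that the free function has no side effects
and only returns a status. The type and functions below are hypothetical
stand-ins, not the rte_bitmap API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bitmap whose storage the caller owns. */
struct sketch_bitmap {
	unsigned long bits;
};

/*
 * Stand-in "free": it only validates its argument and releases nothing,
 * so ignoring the return value makes the call a no-op.
 */
static inline int
sketch_bitmap_free(struct sketch_bitmap *bmp)
{
	return bmp == NULL ? -1 : 0;
}

/* Caller pattern: keep the API call, but consume its result. */
static inline int
sketch_release(struct sketch_bitmap **bmp)
{
	int ret = sketch_bitmap_free(*bmp);

	free(*bmp); /* the caller releases the backing memory itself */
	*bmp = NULL;
	return ret;
}
```

Consuming the return value keeps the caller independent of what the free
function does internally, which is the concern raised in the reply above.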



RE: [EXTERNAL] [PATCH v2 1/2] crypto: fix build issues on unsetting crypto callbacks macro

2024-06-20 Thread Kundapura, Ganapati
Hi Akhil,

> -Original Message-
> From: Akhil Goyal 
> Sent: Thursday, June 13, 2024 11:34 PM
> To: Kundapura, Ganapati ; dev@dpdk.org;
> Gujjar, Abhinandan S ; Mcnamara, John
> ; Richardson, Bruce
> 
> Cc: Morten Brørup ; ferruh.yi...@amd.com;
> fanzhang@gmail.com; tho...@monjalon.net
> Subject: RE: [EXTERNAL] [PATCH v2 1/2] crypto: fix build issues on unsetting
> crypto callbacks macro
> 
> > > > From: Kundapura, Ganapati [mailto:ganapati.kundap...@intel.com]
> > > > Sent: Thursday, 30 May 2024 16.22
> > > >
> > > > Hi,
> > > >
> > > > > From: Akhil Goyal 
> > > > > Sent: Thursday, May 30, 2024 5:17 PM
> > > > >
> > > > > > > > #if may not be needed in application.
> > > > > > > > Test should be skipped if API is not available/supported.
> > > > > > > >
> > > > > > It's needed otherwise application developer has to check the
> > > > > > implementation for supported/not supported or else run the
> > > > > > application to get to know whether api is supported or not.
> > > > > >
> > > > >
> > > > > Application is always required to check the return value or else
> > > > > it will
> > > > miss the
> > > > > other errors that the API can return.
> > > > Currently RTE_CRYPTO_CALLBACKS is enabled by default and test
> > > > application checks the return value of the APIs. This patch fixes
> > > > build issues on compiling the DPDK with unsetting
> > > > RTE_CRYPTO_CALLBACKS.
> > > > >
> > > > > > > > > diff --git a/lib/cryptodev/rte_cryptodev.c
> > > > > > > > > b/lib/cryptodev/rte_cryptodev.c index 886eb7a..2e0890f
> > > > > > > > > 100644
> > > > > > > > > --- a/lib/cryptodev/rte_cryptodev.c
> > > > > > > > > +++ b/lib/cryptodev/rte_cryptodev.c
> > > > > > > > > @@ -628,6 +628,7 @@
> > > > > > > rte_cryptodev_asym_xform_capability_check_hash(
> > > > > > > > >   return ret;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +#if RTE_CRYPTO_CALLBACKS
> > > > > > > > >  /* spinlock for crypto device enq callbacks */  static
> > > > > > > > > rte_spinlock_t rte_cryptodev_callback_lock =
> > > > > > > > RTE_SPINLOCK_INITIALIZER;
> > > > > > > > >
> > > > > > > > > @@ -744,6 +745,7 @@ cryptodev_cb_init(struct
> > > > > > > > > rte_cryptodev
> > *dev)
> > > > > > > > >   cryptodev_cb_cleanup(dev);
> > > > > > > > >   return -ENOMEM;
> > > > > > > > >  }
> > > > > > > > > +#endif /* RTE_CRYPTO_CALLBACKS */
> > > > > > > >
> > > > > > > >
> > > > > > > > > @@ -1485,6 +1491,7 @@
> > > > > > > > > rte_cryptodev_queue_pair_setup(uint8_t
> > > > > > > dev_id,
> > > > > > > > > uint16_t queue_pair_id,
> > > > > > > > >   socket_id);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +#if RTE_CRYPTO_CALLBACKS
> > > > > > > > >  struct rte_cryptodev_cb *
> > > > > > > > > rte_cryptodev_add_enq_callback(uint8_t dev_id,
> > > > > > > > >  uint16_t qp_id, @@ -1763,6 
> > > > > > > > > +1770,7 @@
> > > > > rte_cryptodev_remove_deq_callback(uint8_t
> > > > > > > dev_id,
> > > > > > > > >   rte_spinlock_unlock(&rte_cryptodev_callback_lock);
> > > > > > > > >   return ret;
> > > > > > > > >  }
> > > > > > > > > +#endif /* RTE_CRYPTO_CALLBACKS */
> > > > > > > >
> > > > > > > > There is an issue here.
> > > > > > > > The APIs are visible in .h file and are available for
> > > > > > > > application to
> > > > use.
> > > > > > > > But the API implementation is compiled out.
> > > > > > > > > Rather, you should add a return ENOTSUP from the beginning
> > > > > > > > > of the APIs if RTE_CRYPTO_CALLBACKS is not enabled.
> > > > > > > > > With this approach application will not need to put #if in its
> > > > > > > > > code.
> > > > > > API declarations wrapped under the macro changes in next patch.
> > > > >
> > > > > No, that is not the correct way. Application should check the return
> value.
> > > > > And we cannot force it to add ifdefs.
> > > > Test application is indeed checking the return value. Ifdefs are
> > > > added to avoid build issues when compiling with RTE_CRYPTO_CALLBACKS
> > > > turned off; it is on by default.
> > >
> > > The test application should be able to build and run, regardless if
> > > the DPDK
> > library
> > > was built with RTE_CRYPTO_CALLBACKS defined or not.
> > >
> > > The test application should not assume that the DPDK library was
> > > built with the same RTE_CRYPTO_CALLBACKS configuration (i.e. defined
> > > or not) as the test application.
> > >
> > > > Even ethdev callbacks also doesn't return -ENOTSUP on
> > > > setting/unsetting RTE_ETHDEV_RXTX_CALLBACKS config.
> > >
> > > That would be a bug in the ethdev library.
> > > I just checked the ethdev source code
> > > (/source/lib/ethdev/rte_ethdev.c), and
> > all
> > > the add/remove rx/tx callback functions fail with ENOTSUP if
> > > RTE_ETHDEV_RXTX_CALLBACKS is not defined.
> > > Please note that some ethdev callbacks are not rx/tx callbacks, and
> > > thus are not gated by RTE_ETHDEV_RXTX_CALLBACKS.
> >
> > Hi Ganapati,
> > Can you send a new version incorpora
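
For illustration, the approach argued for in this thread (keep the API
declared and compiled in, and fail at run time when the feature is built
out) could be sketched roughly as below. This is a minimal sketch under
stated assumptions, not the cryptodev code: DEMO_CALLBACKS stands in for
RTE_CRYPTO_CALLBACKS, demo_add_callback() is a hypothetical function, and
it returns an int for brevity, whereas the real add-callback functions
return a pointer.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical feature macro standing in for RTE_CRYPTO_CALLBACKS.
 * Build with -DDEMO_CALLBACKS to compile the feature in.
 */

/* The declaration stays visible whether or not the feature is built. */
int demo_add_callback(int dev_id, void (*cb)(void));

int
demo_add_callback(int dev_id, void (*cb)(void))
{
#ifndef DEMO_CALLBACKS
	/* Feature compiled out: report it at run time, not at link time. */
	(void)dev_id;
	(void)cb;
	return -ENOTSUP;
#else
	/* Feature compiled in: validate arguments as usual. */
	(void)dev_id;
	if (cb == NULL)
		return -EINVAL;
	return 0;
#endif
}
```

A caller built against either configuration can then check for -ENOTSUP
and skip the test, with no #if in application code.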

Re: [PATCH v2 5/6] fib: properly include vector API header file

2024-06-20 Thread Stephen Hemminger
On Thu, 20 Jun 2024 10:14:18 +0100
Bruce Richardson  wrote:

> On Thu, Jun 20, 2024 at 09:24:51AM +0200, Mattias Rönnblom wrote:
> > The trie implementation of the fib library relied on , but
> > failed to provide a direct include of this file.
> > 
> > Signed-off-by: Mattias Rönnblom 
> > ---  
> Acked-by: Bruce Richardson 

Acked-by: Stephen Hemminger 


Re: [PATCH v2 084/148] net/ice/base: add function to read SDP section from NVM

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:18PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Add API and definitions related to reading SDP section from NVM, related to
> PTP pins assignment.
> 

Not familiar with the acronym here, so checked datasheet:
SDP == Software Definable Pin???

/Bruce



Re: Coding Style for local variables

2024-06-20 Thread Stephen Hemminger
On Thu, 20 Jun 2024 11:02:21 +0200
Morten Brørup  wrote:

> > From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
> >   
> > > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > >
> > > > 10/06/2024 18:31, Konstantin Ananyev:  
> > > > > Morten said:  
> > > > > > The coding style guide says:
> > > > > >
> > > > > > "Variables should be declared at the start of a block of code
> > > > > > rather than in the middle. The exception to this is when the
> > > > > > variable is const in which case the declaration must be at the
> > > > > > point of first use/assignment. Declaring variable inside a for
> > > > > > loop is OK."
> > > > > >
> > > > > > Since DPDK switched to C11, variables can be declared where they
> > > > > > are used, which reduces the risk of using effectively
> > > > > > uninitialized variables. "Effectively uninitialized" means
> > > > > > initialized to 0 or NULL where declared, to silence any compiler
> > > > > > warnings about the use of uninitialized variables.
> > > > > >
> > > > > > Can we please agree to remove the recommendation/requirement to
> > > > > > declare variables at the start of a block of code?
> > > > >
> > > > > I know that modern C standards allow to define variable in the middle.
> > > > > But I am strongly opposed to allow that in DPDK coding style.
> > > > > Such practice makes code much harder to read and understand (at least
> > > > > for me).
> > > >
> > > > Yes it is convenient to know that all variables are described
> > > > in a known place, just after function parameters.
> > > >
> > > > There is also a consistency concern.
> > > >
> > > > Old contributors like to be in a comfort zone,
> > > > and we don't want to lose old contributors.
> > > > New contributors may be refrained by old rules,
> > > > and we would like to get more new contributors.
> > > >
> > > > So that's a tricky decision.

Either way looks ok to me. See no need for hard and fast rules in this area.
But please no patches to change existing code.
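
To make the two styles under discussion concrete, here is a small sketch
of the same function written both ways. The function names are made up
for this example and do not come from DPDK.

```c
#include <stddef.h>

/* Style the guide currently recommends: declarations at block start. */
static int
sum_block_start(const int *v, size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* Style valid since C99: declare at first use, const where possible. */
static int
sum_point_of_use(const int *v, size_t n)
{
	if (v == NULL)
		return 0;

	int total = 0;
	for (size_t i = 0; i < n; i++) {
		const int x = v[i];
		total += x;
	}
	return total;
}
```

Both compile as C11 and return the same result; the difference is only
where a reader has to look to find each variable.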


Re: [PATCH v2 3/6] net/octeon_ep: properly include vector API header file

2024-06-20 Thread Stephen Hemminger
On Thu, 20 Jun 2024 09:24:49 +0200
Mattias Rönnblom  wrote:

> The octeon_ep driver relied on , but failed to provide a
> direct include of this file.
> 
> Signed-off-by: Mattias Rönnblom 
> ---

Acked-by: Stephen Hemminger 


[PATCH v2 0/7] Improvements and new test cases

2024-06-20 Thread Aakash Sasidharan
v2:
* Remove unused variables from tests for padding corruption.

Adding new test cases and improvements to test application.

Aakash Sasidharan (4):
  test/crypto: add combined mode cases for TLS 1.3
  test/security: add TLS 1.3 data walkthrough tests
  test/security: add out of place sgl tests for TLS
  test/security: use single session in data walkthrough test

Vidya Sagar Velumuri (3):
  test/crypto: unit tests for padding for TLS-1.3
  test/crypto: verify padding corruption in TLS-1.2
  test/crypto: verify padding corruption in DTLS-1.2

 app/test/test_cryptodev.c | 214 --
 app/test/test_cryptodev_security_tls_record.c |   7 +
 app/test/test_cryptodev_security_tls_record.h |   2 +
 3 files changed, 201 insertions(+), 22 deletions(-)

-- 
2.25.1



[PATCH v2 1/7] test/crypto: unit tests for padding for TLS-1.3

2024-06-20 Thread Aakash Sasidharan
From: Vidya Sagar Velumuri 

Add unit tests to verify the padding for TLS-1.3.

Signed-off-by: Vidya Sagar Velumuri 
---
 app/test/test_cryptodev.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 94438c587a..61ee43327a 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12740,6 +12740,25 @@ test_tls_1_3_record_proto_zero_len_non_app(void)
 
return test_tls_record_proto_all(&flags);
 }
+
+static int
+test_tls_1_3_record_proto_dm_opt_padding(void)
+{
+   return test_tls_record_proto_opt_padding(6, 0, RTE_SECURITY_VERSION_TLS_1_3);
+}
+
+static int
+test_tls_1_3_record_proto_sg_opt_padding(void)
+{
+   return test_tls_record_proto_opt_padding(25, 5, RTE_SECURITY_VERSION_TLS_1_3);
+}
+
+static int
+test_tls_1_3_record_proto_sg_opt_padding_1(void)
+{
+   return test_tls_record_proto_opt_padding(25, 4, RTE_SECURITY_VERSION_TLS_1_3);
+}
+
 #endif
 
 static int
@@ -18168,6 +18187,18 @@ static struct unit_test_suite tls13_record_proto_testsuite  = {
"TLS-1.3 record with zero len and content type as ctrl",
ut_setup_security, ut_teardown,
test_tls_1_3_record_proto_zero_len_non_app),
+   TEST_CASE_NAMED_ST(
+   "TLS-1.3 record DM mode with optional padding",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_dm_opt_padding),
+   TEST_CASE_NAMED_ST(
+   "TLS-1.3 record SG mode with optional padding - 1",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_sg_opt_padding),
+   TEST_CASE_NAMED_ST(
+   "TLS-1.3 record SG mode with optional padding",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_sg_opt_padding_1),
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
-- 
2.25.1



[PATCH v2 2/7] test/crypto: add combined mode cases for TLS 1.3

2024-06-20 Thread Aakash Sasidharan
Add cases to try TLS 1.3 record write(encrypt) + read(decrypt)
operations. This is used for testing TLS 1.3 record features with
all algorithms supported by the security device.

Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 61ee43327a..c5244db883 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12680,6 +12680,19 @@ test_dtls_1_2_record_proto_sg_opt_padding_max(void)
return test_tls_record_proto_opt_padding(33, 4, RTE_SECURITY_VERSION_DTLS_1_2);
 }
 
+static int
+test_tls_1_3_record_proto_display_list(void)
+{
+   struct tls_record_test_flags flags;
+
+   memset(&flags, 0, sizeof(flags));
+
+   flags.display_alg = true;
+   flags.tls_version = RTE_SECURITY_VERSION_TLS_1_3;
+
+   return test_tls_record_proto_all(&flags);
+}
+
 static int
 test_tls_1_3_record_proto_corrupt_pkt(void)
 {
@@ -18199,6 +18212,10 @@ static struct unit_test_suite tls13_record_proto_testsuite  = {
"TLS-1.3 record SG mode with optional padding",
ut_setup_security, ut_teardown,
test_tls_1_3_record_proto_sg_opt_padding_1),
+   TEST_CASE_NAMED_ST(
+   "Combined test alg list",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_display_list),
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
-- 
2.25.1



[PATCH v2 3/7] test/security: add TLS 1.3 data walkthrough tests

2024-06-20 Thread Aakash Sasidharan
Add combined mode data walkthrough test and multi-segmented
packet data walkthrough test for TLS 1.3.

Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 41 +++
 1 file changed, 41 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index c5244db883..f3145abfee 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12772,6 +12772,39 @@ test_tls_1_3_record_proto_sg_opt_padding_1(void)
return test_tls_record_proto_opt_padding(25, 4, RTE_SECURITY_VERSION_TLS_1_3);
 }
 
+static int
+test_tls_1_3_record_proto_data_walkthrough(void)
+{
+   struct tls_record_test_flags flags;
+
+   memset(&flags, 0, sizeof(flags));
+
+   flags.data_walkthrough = true;
+   flags.tls_version = RTE_SECURITY_VERSION_TLS_1_3;
+
+   return test_tls_record_proto_all(&flags);
+}
+
+static int
+test_tls_1_3_record_proto_sgl_data_walkthrough(void)
+{
+   struct tls_record_test_flags flags = {
+   .nb_segs_in_mbuf = 5,
+   .tls_version = RTE_SECURITY_VERSION_TLS_1_3,
+   .data_walkthrough = true
+   };
+   struct crypto_testsuite_params *ts_params = &testsuite_params;
+   struct rte_cryptodev_info dev_info;
+
+   rte_cryptodev_info_get(ts_params->valid_devs[0], &dev_info);
+   if (!(dev_info.feature_flags & RTE_CRYPTODEV_FF_IN_PLACE_SGL)) {
+   printf("Device doesn't support in-place scatter-gather. Test Skipped.\n");
+   return TEST_SKIPPED;
+   }
+
+   return test_tls_record_proto_all(&flags);
+}
+
 #endif
 
 static int
@@ -18216,6 +18249,14 @@ static struct unit_test_suite tls13_record_proto_testsuite  = {
"Combined test alg list",
ut_setup_security, ut_teardown,
test_tls_1_3_record_proto_display_list),
+   TEST_CASE_NAMED_ST(
+   "Data walkthrough combined test alg list",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_data_walkthrough),
+   TEST_CASE_NAMED_ST(
+   "Multi-segmented mode data walkthrough",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_sgl_data_walkthrough),
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
-- 
2.25.1



[PATCH v2 4/7] test/crypto: verify padding corruption in TLS-1.2

2024-06-20 Thread Aakash Sasidharan
From: Vidya Sagar Velumuri 

Add unit test to verify corrupted padding bytes in TLS-1.2 record

Signed-off-by: Vidya Sagar Velumuri 
Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 18 +-
 app/test/test_cryptodev_security_tls_record.c |  7 +++
 app/test/test_cryptodev_security_tls_record.h |  1 +
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index f3145abfee..da8d7bf109 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12173,7 +12173,7 @@ test_tls_record_proto_all(const struct tls_record_test_flags *flags)
if (ret == TEST_SKIPPED)
continue;
 
-   if (flags->pkt_corruption) {
+   if (flags->pkt_corruption || flags->padding_corruption) {
if (ret == TEST_SUCCESS)
return TEST_FAILED;
} else {
@@ -12404,6 +12404,18 @@ test_tls_record_proto_sg_opt_padding_max(void)
return test_tls_record_proto_opt_padding(33, 4, RTE_SECURITY_VERSION_TLS_1_2);
 }
 
+static int
+test_tls_record_proto_sg_opt_padding_corrupt(void)
+{
+   struct tls_record_test_flags flags = {
+   .opt_padding = 8,
+   .padding_corruption = true,
+   .nb_segs_in_mbuf = 4,
+   };
+
+   return test_tls_record_proto_all(&flags);
+}
+
 static int
 test_dtls_1_2_record_proto_data_walkthrough(void)
 {
@@ -17997,6 +18009,10 @@ static struct unit_test_suite tls12_record_proto_testsuite  = {
"TLS record SG mode with optional padding > max range",
ut_setup_security, ut_teardown,
test_tls_record_proto_sg_opt_padding_max),
+   TEST_CASE_NAMED_ST(
+   "TLS record SG mode with padding corruption",
+   ut_setup_security, ut_teardown,
+   test_tls_record_proto_sg_opt_padding_corrupt),
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
diff --git a/app/test/test_cryptodev_security_tls_record.c b/app/test/test_cryptodev_security_tls_record.c
index 03d9efefc3..1ba9609e1b 100644
--- a/app/test/test_cryptodev_security_tls_record.c
+++ b/app/test/test_cryptodev_security_tls_record.c
@@ -215,6 +215,13 @@ test_tls_record_td_update(struct tls_record_test_data td_inb[],
if (flags->pkt_corruption)
td_inb[i].input_text.data[0] = ~td_inb[i].input_text.data[0];
 
+   /* Corrupt a byte in the last but one block */
+   if (flags->padding_corruption) {
+   int offset = td_inb[i].input_text.len - TLS_RECORD_PAD_CORRUPT_OFFSET;
+
+   td_inb[i].input_text.data[offset] = ~td_inb[i].input_text.data[offset];
+   }
+
/* Clear outbound specific flags */
td_inb[i].tls_record_xform.options.iv_gen_disable = 0;
}
diff --git a/app/test/test_cryptodev_security_tls_record.h b/app/test/test_cryptodev_security_tls_record.h
index 18a90c6ff6..acb7f15f1c 100644
--- a/app/test/test_cryptodev_security_tls_record.h
+++ b/app/test/test_cryptodev_security_tls_record.h
@@ -41,6 +41,7 @@ static_assert(TLS_1_3_RECORD_PLAINTEXT_MAX_LEN <= TEST_SEC_CLEARTEXT_MAX_LEN,
  "TEST_SEC_CLEARTEXT_MAX_LEN should be at least RECORD MAX LEN!");
 
 #define TLS_RECORD_PLAINTEXT_MIN_LEN   (1u)
+#define TLS_RECORD_PAD_CORRUPT_OFFSET  20
 
 enum tls_record_test_content_type {
TLS_RECORD_TEST_CONTENT_TYPE_APP,
-- 
2.25.1
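
As a standalone illustration of the corruption step in the patch above
(invert one byte at a fixed distance from the end of the record so that
verification on the inbound side is expected to fail), something like the
following could be used. It is only a sketch: PAD_CORRUPT_OFFSET mirrors
the value of TLS_RECORD_PAD_CORRUPT_OFFSET, corrupt_padding_byte() is a
hypothetical helper, and the length check is an addition that the patch
itself does not contain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TLS_RECORD_PAD_CORRUPT_OFFSET in the patch. */
#define PAD_CORRUPT_OFFSET 20

/*
 * Invert the byte located PAD_CORRUPT_OFFSET bytes before the end of buf.
 * Returns 0 on success, -1 if buf is NULL or shorter than the offset.
 */
static int
corrupt_padding_byte(uint8_t *buf, size_t len)
{
	if (buf == NULL || len < PAD_CORRUPT_OFFSET)
		return -1;

	size_t offset = len - PAD_CORRUPT_OFFSET;

	/* Bitwise NOT guarantees the byte differs from its original value. */
	buf[offset] = (uint8_t)~buf[offset];
	return 0;
}
```

Inverting the byte rather than writing a constant means the data always
changes, whatever the original value was.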



[PATCH v2 5/7] test/crypto: verify padding corruption in DTLS-1.2

2024-06-20 Thread Aakash Sasidharan
From: Vidya Sagar Velumuri 

Add unit test to verify corrupted padding bytes in DTLS-1.2 record

Signed-off-by: Vidya Sagar Velumuri 
Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index da8d7bf109..dd8880ed87 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12705,6 +12705,19 @@ test_tls_1_3_record_proto_display_list(void)
return test_tls_record_proto_all(&flags);
 }
 
+static int
+test_dtls_1_2_record_proto_sg_opt_padding_corrupt(void)
+{
+   struct tls_record_test_flags flags = {
+   .opt_padding = 8,
+   .padding_corruption = true,
+   .nb_segs_in_mbuf = 4,
+   .tls_version = RTE_SECURITY_VERSION_DTLS_1_2
+   };
+
+   return test_tls_record_proto_all(&flags);
+}
+
 static int
 test_tls_1_3_record_proto_corrupt_pkt(void)
 {
@@ -18200,6 +18213,10 @@ static struct unit_test_suite dtls12_record_proto_testsuite  = {
"DTLS record SG mode with optional padding > max range",
ut_setup_security, ut_teardown,
test_dtls_1_2_record_proto_sg_opt_padding_max),
+   TEST_CASE_NAMED_ST(
+   "DTLS record SG mode with padding corruption",
+   ut_setup_security, ut_teardown,
+   test_dtls_1_2_record_proto_sg_opt_padding_corrupt),
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
-- 
2.25.1



[PATCH v2 6/7] test/security: add out of place sgl tests for TLS

2024-06-20 Thread Aakash Sasidharan
Add multi segmented test for TLS 1.3 and multi segmented out of place
tests for DTLS 1.2 and TLS 1.3.

Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 69 ++-
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index dd8880ed87..e6ef5a13e0 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -12224,11 +12224,11 @@ test_tls_1_2_record_proto_display_list(void)
 }
 
 static int
-test_tls_1_2_record_proto_sgl(void)
+test_tls_record_proto_sgl(enum rte_security_tls_version tls_version)
 {
struct tls_record_test_flags flags = {
.nb_segs_in_mbuf = 5,
-   .tls_version = RTE_SECURITY_VERSION_TLS_1_2
+   .tls_version = tls_version
};
struct crypto_testsuite_params *ts_params = &testsuite_params;
struct rte_cryptodev_info dev_info;
@@ -12242,6 +12242,12 @@ test_tls_1_2_record_proto_sgl(void)
return test_tls_record_proto_all(&flags);
 }
 
+static int
+test_tls_1_2_record_proto_sgl(void)
+{
+   return test_tls_record_proto_sgl(RTE_SECURITY_VERSION_TLS_1_2);
+}
+
 static int
 test_tls_record_proto_sgl_data_walkthrough(enum rte_security_tls_version tls_version)
 {
@@ -12573,20 +12579,7 @@ test_dtls_1_2_record_proto_antireplay4096(void)
 static int
 test_dtls_1_2_record_proto_sgl(void)
 {
-   struct tls_record_test_flags flags = {
-   .nb_segs_in_mbuf = 5,
-   .tls_version = RTE_SECURITY_VERSION_DTLS_1_2
-   };
-   struct crypto_testsuite_params *ts_params = &testsuite_params;
-   struct rte_cryptodev_info dev_info;
-
-   rte_cryptodev_info_get(ts_params->valid_devs[0], &dev_info);
-   if (!(dev_info.feature_flags & RTE_CRYPTODEV_FF_IN_PLACE_SGL)) {
-   printf("Device doesn't support in-place scatter-gather. Test Skipped.\n");
-   return TEST_SKIPPED;
-   }
-
-   return test_tls_record_proto_all(&flags);
+   return test_tls_record_proto_sgl(RTE_SECURITY_VERSION_DTLS_1_2);
 }
 
 static int
@@ -12595,6 +12588,12 @@ test_dtls_1_2_record_proto_sgl_data_walkthrough(void)
return test_tls_record_proto_sgl_data_walkthrough(RTE_SECURITY_VERSION_DTLS_1_2);
 }
 
+static int
+test_dtls_1_2_record_proto_sgl_oop(void)
+{
+   return test_tls_record_proto_sgl_oop(RTE_SECURITY_VERSION_DTLS_1_2);
+}
+
 static int
 test_dtls_1_2_record_proto_corrupt_pkt(void)
 {
@@ -12811,23 +12810,21 @@ test_tls_1_3_record_proto_data_walkthrough(void)
 }
 
 static int
-test_tls_1_3_record_proto_sgl_data_walkthrough(void)
+test_tls_1_3_record_proto_sgl(void)
 {
-   struct tls_record_test_flags flags = {
-   .nb_segs_in_mbuf = 5,
-   .tls_version = RTE_SECURITY_VERSION_TLS_1_3,
-   .data_walkthrough = true
-   };
-   struct crypto_testsuite_params *ts_params = &testsuite_params;
-   struct rte_cryptodev_info dev_info;
+   return test_tls_record_proto_sgl(RTE_SECURITY_VERSION_TLS_1_3);
+}
 
-   rte_cryptodev_info_get(ts_params->valid_devs[0], &dev_info);
-   if (!(dev_info.feature_flags & RTE_CRYPTODEV_FF_IN_PLACE_SGL)) {
-   printf("Device doesn't support in-place scatter-gather. Test Skipped.\n");
-   return TEST_SKIPPED;
-   }
+static int
+test_tls_1_3_record_proto_sgl_data_walkthrough(void)
+{
+   return test_tls_record_proto_sgl_data_walkthrough(RTE_SECURITY_VERSION_TLS_1_3);
+}
 
-   return test_tls_record_proto_all(&flags);
+static int
+test_tls_1_3_record_proto_sgl_oop(void)
+{
+   return test_tls_record_proto_sgl_oop(RTE_SECURITY_VERSION_TLS_1_3);
 }
 
 #endif
@@ -18145,6 +18142,10 @@ static struct unit_test_suite dtls12_record_proto_testsuite  = {
"Multi-segmented mode data walkthrough",
ut_setup_security, ut_teardown,
test_dtls_1_2_record_proto_sgl_data_walkthrough),
+   TEST_CASE_NAMED_ST(
+   "Multi-segmented mode out of place",
+   ut_setup_security, ut_teardown,
+   test_dtls_1_2_record_proto_sgl_oop),
TEST_CASE_NAMED_ST(
"Packet corruption",
ut_setup_security, ut_teardown,
@@ -18286,10 +18287,18 @@ static struct unit_test_suite tls13_record_proto_testsuite  = {
"Data walkthrough combined test alg list",
ut_setup_security, ut_teardown,
test_tls_1_3_record_proto_data_walkthrough),
+   TEST_CASE_NAMED_ST(
+   "Multi-segmented mode",
+   ut_setup_security, ut_teardown,
+   test_tls_1_3_record_proto_sgl),
TEST_CASE_NAMED_ST(
"Multi-segmented mode data walkthrough",
ut_setup_security, ut_teardown,

[PATCH v2 7/7] test/security: use single session in data walkthrough test

2024-06-20 Thread Aakash Sasidharan
The existing data walkthrough test creates a new session
for each test packet size. Enhance the test to use a single
session instead.

Signed-off-by: Aakash Sasidharan 
---
 app/test/test_cryptodev.c | 49 +--
 app/test/test_cryptodev_security_tls_record.h |  1 +
 2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index e6ef5a13e0..75f98b6744 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -11968,9 +11968,12 @@ test_tls_record_proto_process(const struct tls_record_test_data td[],
}
}
 
-   /* Create security session */
-   ut_params->sec_session = rte_security_session_create(ctx, &sess_conf,
-   ts_params->session_mpool);
+   if (ut_params->sec_session == NULL) {
+   /* Create security session */
+   ut_params->sec_session = rte_security_session_create(ctx, &sess_conf,
+   ts_params->session_mpool);
+   }
+
if (ut_params->sec_session == NULL)
return TEST_SKIPPED;
 
@@ -12075,9 +12078,10 @@ test_tls_record_proto_process(const struct tls_record_test_data td[],
rte_pktmbuf_free(ut_params->ibuf);
ut_params->ibuf = NULL;
 
-   if (ut_params->sec_session)
+   if (ut_params->sec_session != NULL && !flags->skip_sess_destroy) {
rte_security_session_destroy(ctx, ut_params->sec_session);
-   ut_params->sec_session = NULL;
+   ut_params->sec_session = NULL;
+   }
 
RTE_SET_USED(flags);
 
@@ -12121,8 +12125,11 @@ static int
 test_tls_record_proto_all(const struct tls_record_test_flags *flags)
 {
unsigned int i, nb_pkts = 1, pass_cnt = 0, payload_len, max_payload_len;
+   struct crypto_unittest_params *ut_params = &unittest_params;
struct tls_record_test_data td_outb[TEST_SEC_PKTS_MAX];
struct tls_record_test_data td_inb[TEST_SEC_PKTS_MAX];
+   void *sec_session_outb = NULL;
+   void *sec_session_inb = NULL;
int ret;
 
switch (flags->tls_version) {
@@ -12152,10 +12159,16 @@ test_tls_record_proto_all(const struct tls_record_test_flags *flags)
if (ret == TEST_SKIPPED)
continue;
 
+   if (flags->skip_sess_destroy)
+   ut_params->sec_session = sec_session_outb;
+
ret = test_tls_record_proto_process(td_outb, td_inb, nb_pkts, true, flags);
if (ret == TEST_SKIPPED)
continue;
 
+   if (flags->skip_sess_destroy && sec_session_outb == NULL)
+   sec_session_outb = ut_params->sec_session;
+
if (flags->zero_len &&
((flags->content_type == TLS_RECORD_TEST_CONTENT_TYPE_HANDSHAKE) ||
(flags->content_type == TLS_RECORD_TEST_CONTENT_TYPE_HANDSHAKE) ||
@@ -12169,10 +12182,16 @@ test_tls_record_proto_all(const struct tls_record_test_flags *flags)
 
test_tls_record_td_update(td_inb, td_outb, nb_pkts, flags);
 
+   if (flags->skip_sess_destroy)
+   ut_params->sec_session = sec_session_inb;
+
ret = test_tls_record_proto_process(td_inb, NULL, nb_pkts, true, flags);
if (ret == TEST_SKIPPED)
continue;
 
+   if (flags->skip_sess_destroy && sec_session_inb == NULL)
+   sec_session_inb = ut_params->sec_session;
+
if (flags->pkt_corruption || flags->padding_corruption) {
if (ret == TEST_SUCCESS)
return TEST_FAILED;
@@ -12188,6 +12207,22 @@ test_tls_record_proto_all(const struct tls_record_test_flags *flags)
if (flags->display_alg)
test_sec_alg_display(sec_alg_list[i].param1, sec_alg_list[i].param2);
 
+   if (flags->skip_sess_destroy) {
+   uint8_t dev_id = testsuite_params.valid_devs[0];
+   struct rte_security_ctx *ctx;
+
+   ctx = rte_cryptodev_get_sec_ctx(dev_id);
+   if (sec_session_inb != NULL) {
+   rte_security_session_destroy(ctx, sec_session_inb);
+   sec_session_inb = NULL;
+   }
+   if (sec_session_outb != NULL) {
+   rte_security_session_destroy(ctx, sec_session_outb);
+   sec_session_outb = NULL;
+   }
+   ut_params->sec_session = NULL;
+   }
+
pass_cnt++;
}
 
@@ -12205,6 +12240,7 @@ test_tls_1_2_record_proto_data_walkthrough(void)
memset(&flags, 0, sizeof(flags));
 
flags.data_walkthrough = true;
+   flags.skip_sess_destroy = true;
flags.t

Re: [PATCH v2 087/148] net/ice/base: allow skipping PF clear

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:21PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> As per updated data sheet, add 'skip_clear_pf' field to ice_hw structure,
> which can be used to skip call to ice_clear_pf_cfg() in ice_init_hw().
> 
> Also, make 'fw_vsi_num' field of ice_hw structure visible to every component
> using shared code, as well as make ice_init_fltr_mgmt_struct() and
> ice_cleanup_fltr_mgmt_struct() non-static.

Change to make functions non-static not in this patch.



[PATCH 00/12] fixes and improvements to CNXK crypto PMD

2024-06-20 Thread Aakash Sasidharan
This series adds improvements to CNXK crypto PMD and fixes aes-gcm zero
length input failure.

Aakash Sasidharan (1):
  crypto/cnxk: fix aes-gcm zero len input cases

Anoob Joseph (11):
  common/cnxk: add comments to denote skipped entries
  crypto/cnxk: update version map file with PMD APIs
  common/cnxk: make inline dev PF func get as idev API
  crypto/cnxk: add flow control in Rx inject path
  crypto/cnxk: use SSO PF func of inline device in inst
  crypto/cnxk: use NEON for Rx inject inst preparation
  crypto/cnxk: remove init of CPT result field in packet
  crypto/cnxk: add dual submission in Rx inject
  crypto/cnxk: update sess pointer for next iteration
  crypto/cnxk: make pack IV variable as const
  crypto/cnxk: enable dual submission to CPT

 drivers/common/cnxk/roc_ae.c  |   6 +-
 drivers/common/cnxk/roc_ae_fpm_tables.c   |   6 +-
 drivers/common/cnxk/roc_cpt.c |  17 +-
 drivers/common/cnxk/roc_cpt.h |  51 +++--
 drivers/common/cnxk/roc_idev.c|   6 +
 drivers/common/cnxk/roc_idev.h|   2 +
 drivers/common/cnxk/roc_nix_inl.h |   1 -
 drivers/common/cnxk/roc_nix_inl_dev.c |   6 -
 drivers/common/cnxk/version.map   |   2 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 231 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.h |  60 +-
 drivers/crypto/cnxk/cnxk_cryptodev.h  |   2 +-
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  40 ++--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.h  |   2 +
 drivers/crypto/cnxk/cnxk_se.h |  55 +++---
 drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h |   2 +
 drivers/crypto/cnxk/version.map   |   8 +
 drivers/event/cnxk/cnxk_eventdev_adptr.c  |   4 +-
 drivers/net/cnxk/cn10k_ethdev_sec.c   |   2 +-
 drivers/net/cnxk/cnxk_ethdev_telemetry.c  |   3 +-
 20 files changed, 272 insertions(+), 234 deletions(-)

-- 
2.25.1



[PATCH 01/12] common/cnxk: add comments to denote skipped entries

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Add comments to denote unused table entries.

Signed-off-by: Anoob Joseph 
---
 drivers/common/cnxk/roc_ae.c| 6 +++---
 drivers/common/cnxk/roc_ae_fpm_tables.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/common/cnxk/roc_ae.c b/drivers/common/cnxk/roc_ae.c
index e6a013d7c4..7ef0efe2b3 100644
--- a/drivers/common/cnxk/roc_ae.c
+++ b/drivers/common/cnxk/roc_ae.c
@@ -151,9 +151,9 @@ const struct roc_ae_ec_group ae_ec_grp[ROC_AE_EC_ID_PMAX] = {
 0x3F, 0x00},
.length = 66},
},
-   {},
-   {},
-   {},
+   { /* ROC_AE_EC_ID_P160 */ },
+   { /* ROC_AE_EC_ID_P320 */ },
+   { /* ROC_AE_EC_ID_P512 */ },
{
.prime = {.data = {0xFF, 0xFF, 0xFF, 0xFE, 0xFF, 0xFF, 0xFF,
   0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
diff --git a/drivers/common/cnxk/roc_ae_fpm_tables.c b/drivers/common/cnxk/roc_ae_fpm_tables.c
index ead3128e7f..942657b56a 100644
--- a/drivers/common/cnxk/roc_ae_fpm_tables.c
+++ b/drivers/common/cnxk/roc_ae_fpm_tables.c
@@ -1261,9 +1261,9 @@ const struct ae_fpm_entry ae_fpm_tbl_scalar[ROC_AE_EC_ID_PMAX] = {
.data = ae_fpm_tbl_p521,
.len = sizeof(ae_fpm_tbl_p521)
},
-   {},
-   {},
-   {},
+   { /* ROC_AE_EC_ID_P160 */ },
+   { /* ROC_AE_EC_ID_P320 */ },
+   { /* ROC_AE_EC_ID_P512 */ },
{
.data = ae_fpm_tbl_p256_sm2,
.len = sizeof(ae_fpm_tbl_p256_sm2)
-- 
2.25.1



[PATCH 02/12] crypto/cnxk: update version map file with PMD APIs

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Update version map with details of PMD APIs added.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h | 2 ++
 drivers/crypto/cnxk/version.map   | 8 
 2 files changed, 10 insertions(+)

diff --git a/drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h b/drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h
index 8b0a5ba0f2..eab1243065 100644
--- a/drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h
+++ b/drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h
@@ -23,6 +23,7 @@
  * @return
  *   Pointer to queue pair structure that would be the input to submit APIs.
  */
+__rte_experimental
 void *rte_pmd_cnxk_crypto_qptr_get(uint8_t dev_id, uint16_t qp_id);
 
 /**
@@ -41,6 +42,7 @@ void *rte_pmd_cnxk_crypto_qptr_get(uint8_t dev_id, uint16_t qp_id);
  * @param nb_inst
  *   Number of instructions.
  */
+__rte_experimental
 void rte_pmd_cnxk_crypto_submit(void *qptr, void *inst, uint16_t nb_inst);
 
 #endif /* _PMD_CNXK_CRYPTO_H_ */
diff --git a/drivers/crypto/cnxk/version.map b/drivers/crypto/cnxk/version.map
index 5789a6bfc9..7a77607774 100644
--- a/drivers/crypto/cnxk/version.map
+++ b/drivers/crypto/cnxk/version.map
@@ -1,3 +1,11 @@
+EXPERIMENTAL {
+   global:
+
+   # added in 24.03
+   rte_pmd_cnxk_crypto_submit;
+   rte_pmd_cnxk_crypto_qptr_get;
+};
+
 INTERNAL {
global:
 
-- 
2.25.1



[PATCH 03/12] common/cnxk: make inline dev PF func get as idev API

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Inline PF FUNC would be required to set SSO_PF_FUNC in the instruction
for cryptodev Rx inject. Move the API to idev to allow usage of the
same.

Signed-off-by: Anoob Joseph 
---
 drivers/common/cnxk/roc_idev.c   | 6 ++
 drivers/common/cnxk/roc_idev.h   | 2 ++
 drivers/common/cnxk/roc_nix_inl.h| 1 -
 drivers/common/cnxk/roc_nix_inl_dev.c| 6 --
 drivers/common/cnxk/version.map  | 2 +-
 drivers/net/cnxk/cn10k_ethdev_sec.c  | 2 +-
 drivers/net/cnxk/cnxk_ethdev_telemetry.c | 3 +--
 7 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/common/cnxk/roc_idev.c b/drivers/common/cnxk/roc_idev.c
index d0307c666c..0778d51d1e 100644
--- a/drivers/common/cnxk/roc_idev.c
+++ b/drivers/common/cnxk/roc_idev.c
@@ -374,3 +374,9 @@ roc_idev_nix_rx_chan_set(uint16_t port, uint16_t chan)
if (idev != NULL && port < PLT_MAX_ETHPORTS)
__atomic_store_n(&idev->inl_rx_inj_cfg.chan[port], chan, 
__ATOMIC_RELEASE);
 }
+
+uint16_t
+roc_idev_nix_inl_dev_pffunc_get(void)
+{
+   return nix_inl_dev_pffunc_get();
+}
diff --git a/drivers/common/cnxk/roc_idev.h b/drivers/common/cnxk/roc_idev.h
index 00664eaed6..fc0f7db54e 100644
--- a/drivers/common/cnxk/roc_idev.h
+++ b/drivers/common/cnxk/roc_idev.h
@@ -27,4 +27,6 @@ uint8_t __roc_api roc_idev_nix_rx_inject_get(uint16_t port);
 void __roc_api roc_idev_nix_rx_inject_set(uint16_t port, uint8_t enable);
 uint16_t *__roc_api roc_idev_nix_rx_chan_base_get(void);
 void __roc_api roc_idev_nix_rx_chan_set(uint16_t port, uint16_t chan);
+
+uint16_t __roc_api roc_idev_nix_inl_dev_pffunc_get(void);
 #endif /* _ROC_IDEV_H_ */
diff --git a/drivers/common/cnxk/roc_nix_inl.h 
b/drivers/common/cnxk/roc_nix_inl.h
index ab0965e512..1a4bf8808c 100644
--- a/drivers/common/cnxk/roc_nix_inl.h
+++ b/drivers/common/cnxk/roc_nix_inl.h
@@ -112,7 +112,6 @@ void __roc_api roc_nix_inl_dev_lock(void);
 void __roc_api roc_nix_inl_dev_unlock(void);
 int __roc_api roc_nix_inl_dev_xaq_realloc(uint64_t aura_handle);
 int __roc_api roc_nix_inl_dev_stats_get(struct roc_nix_stats *stats);
-uint16_t __roc_api roc_nix_inl_dev_pffunc_get(void);
 int __roc_api roc_nix_inl_dev_cpt_setup(bool use_inl_dev_sso);
 int __roc_api roc_nix_inl_dev_cpt_release(void);
 bool __roc_api roc_nix_inl_dev_is_multi_channel(void);
diff --git a/drivers/common/cnxk/roc_nix_inl_dev.c 
b/drivers/common/cnxk/roc_nix_inl_dev.c
index 60e6a43033..e2bbe3a67b 100644
--- a/drivers/common/cnxk/roc_nix_inl_dev.c
+++ b/drivers/common/cnxk/roc_nix_inl_dev.c
@@ -34,12 +34,6 @@ nix_inl_dev_pffunc_get(void)
return 0;
 }
 
-uint16_t
-roc_nix_inl_dev_pffunc_get(void)
-{
-   return nix_inl_dev_pffunc_get();
-}
-
 static void
 nix_inl_selftest_work_cb(uint64_t *gw, void *args, uint32_t soft_exp_event)
 {
diff --git a/drivers/common/cnxk/version.map b/drivers/common/cnxk/version.map
index eac2ea9ff8..f98738d07e 100644
--- a/drivers/common/cnxk/version.map
+++ b/drivers/common/cnxk/version.map
@@ -112,6 +112,7 @@ INTERNAL {
roc_idev_npa_nix_get;
roc_idev_num_lmtlines_get;
roc_idev_nix_inl_meta_aura_get;
+   roc_idev_nix_inl_dev_pffunc_get;
roc_idev_nix_list_get;
roc_idev_nix_rx_chan_base_get;
roc_idev_nix_rx_chan_set;
@@ -244,7 +245,6 @@ INTERNAL {
roc_nix_inl_dev_is_probed;
roc_nix_inl_dev_stats_get;
roc_nix_inl_dev_lock;
-   roc_nix_inl_dev_pffunc_get;
roc_nix_inl_dev_rq;
roc_nix_inl_dev_rq_get;
roc_nix_inl_dev_rq_put;
diff --git a/drivers/net/cnxk/cn10k_ethdev_sec.c 
b/drivers/net/cnxk/cn10k_ethdev_sec.c
index b8b0da5ea9..5e509e97d4 100644
--- a/drivers/net/cnxk/cn10k_ethdev_sec.c
+++ b/drivers/net/cnxk/cn10k_ethdev_sec.c
@@ -1360,7 +1360,7 @@ cn10k_eth_sec_rx_inject_config(void *device, uint16_t 
port_id, bool enable)
inj_cfg->io_addr = inl_lf->io_addr;
inj_cfg->lmt_base = nix->lmt_base;
channel = roc_nix_get_base_chan(nix);
-   pf_func = roc_nix_inl_dev_pffunc_get();
+   pf_func = roc_idev_nix_inl_dev_pffunc_get();
inj_cfg->cmd_w0 = pf_func << 48 | inj_match_id << 32 | channel << 4;
 
return 0;
diff --git a/drivers/net/cnxk/cnxk_ethdev_telemetry.c 
b/drivers/net/cnxk/cnxk_ethdev_telemetry.c
index 3027ca4735..a1958185f2 100644
--- a/drivers/net/cnxk/cnxk_ethdev_telemetry.c
+++ b/drivers/net/cnxk/cnxk_ethdev_telemetry.c
@@ -65,8 +65,7 @@ ethdev_tel_handle_info(const char *cmd __rte_unused,
info = &eth_info.info;
dev = cnxk_eth_pmd_priv(eth_dev);
if (dev) {
-   info->inl_dev_pf_func =
-   roc_nix_inl_dev_pffunc_get();
+   info->inl_dev_pf_func = 
roc_idev_nix_inl_dev_pffunc_get();
info->pf_func = roc_nix_get_pf_func(&dev->nix);
info->max_mac_entries = dev->max_mac_entries;
   

[PATCH 04/12] crypto/cnxk: add flow control in Rx inject path

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

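Add flow control in the Rx inject path to avoid over-submission to CPT.
Submission stops when the queue size read from the flow control address
exceeds the configured threshold.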

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 720b756001..9f1c074925 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -1400,8 +1400,10 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
struct rte_cryptodev *cdev = dev;
union cpt_res_s *hw_res = NULL;
struct cpt_inst_s *inst;
+   union cpt_fc_write_s fc;
struct cnxk_cpt_vf *vf;
struct rte_mbuf *m;
+   uint64_t *fc_addr;
uint64_t dptr;
int i;
 
@@ -1413,13 +1415,24 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
 
lmt_base = vf->rx_inj_lmtline.lmt_base;
io_addr = vf->rx_inj_lmtline.io_addr;
+   fc_addr = vf->rx_inj_lmtline.fc_addr;
 
ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
pf_func = vf->rx_inj_pf_func;
 
+   const uint32_t fc_thresh = vf->rx_inj_lmtline.fc_thresh;
+
 again:
+   fc.u64[0] =
+   rte_atomic_load_explicit((RTE_ATOMIC(uint64_t) *)fc_addr, 
rte_memory_order_relaxed);
inst = (struct cpt_inst_s *)lmt_base;
-   for (i = 0; i < RTE_MIN(CN10K_PKTS_PER_LOOP, nb_pkts); i++) {
+
+   i = 0;
+
+   if (unlikely(fc.s.qsize > fc_thresh))
+   goto exit;
+
+   for (; i < RTE_MIN(CN10K_PKTS_PER_LOOP, nb_pkts); i++) {
 
m = pkts[i];
sec_sess = (struct cn10k_sec_session *)sess[i];
@@ -1487,6 +1500,7 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
goto again;
}
 
+exit:
return count + i;
 }
 
-- 
2.25.1
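
For readers skimming the diff, the flow control check added here can be
sketched in portable C as below. This is only an illustration under
stated assumptions: struct fc_state and rx_inject_burst_size() are
hypothetical names, C11 atomics stand in for rte_atomic_load_explicit(),
and the instruction build and submit step is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the flow control word that hardware updates. */
struct fc_state {
	_Atomic uint64_t qsize; /* instructions currently queued */
};

/*
 * Return how many packets of this burst may be submitted: 0 when the
 * queue is above the threshold, otherwise min(nb_pkts, per_loop).
 */
static uint16_t
rx_inject_burst_size(struct fc_state *fc, uint32_t fc_thresh,
		     uint16_t nb_pkts, uint16_t per_loop)
{
	uint64_t qsize;

	assert(per_loop > 0);

	/* Relaxed load, matching the memory order used in the patch. */
	qsize = atomic_load_explicit(&fc->qsize, memory_order_relaxed);
	if (qsize > fc_thresh)
		return 0;

	return nb_pkts < per_loop ? nb_pkts : per_loop;
}
```

A queue size equal to the threshold still allows submission, since the
comparison in the patch is a strict "greater than".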



[PATCH 05/12] crypto/cnxk: use SSO PF func of inline device in inst

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

The RVU PF FUNC of the CPT LF need not be set as the hardware would
determine that. Instead, the SSO PF FUNC needs to be set to that of the
inline device so that critical errors reach the inline device.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 2 +-
 drivers/crypto/cnxk/cnxk_cryptodev.h  | 2 +-
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 9f1c074925..f2980399c5 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -1418,7 +1418,7 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
fc_addr = vf->rx_inj_lmtline.fc_addr;
 
ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
-   pf_func = vf->rx_inj_pf_func;
+   pf_func = vf->rx_inj_sso_pf_func;
 
const uint32_t fc_thresh = vf->rx_inj_lmtline.fc_thresh;
 
diff --git a/drivers/crypto/cnxk/cnxk_cryptodev.h 
b/drivers/crypto/cnxk/cnxk_cryptodev.h
index fffc4a47b4..4000e84a7e 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev.h
+++ b/drivers/crypto/cnxk/cnxk_cryptodev.h
@@ -22,7 +22,7 @@
  */
 struct cnxk_cpt_vf {
struct roc_cpt_lmtline rx_inj_lmtline;
-   uint16_t rx_inj_pf_func;
+   uint16_t rx_inj_sso_pf_func;
uint16_t *rx_chan_base;
struct roc_cpt cpt;
struct rte_cryptodev_capabilities crypto_caps[CNXK_CPT_MAX_CAPS];
diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c 
b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
index d7f5780637..51369309c5 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
@@ -483,7 +483,7 @@ cnxk_cpt_queue_pair_setup(struct rte_cryptodev *dev, 
uint16_t qp_id,
goto exit;
}
 
-   vf->rx_inj_pf_func = qp->lf.pf_func;
+   vf->rx_inj_sso_pf_func = roc_idev_nix_inl_dev_pffunc_get();
 
/* Block the queue for other submissions */
qp->pend_q.pq_mask = 0;
-- 
2.25.1



[PATCH 06/12] crypto/cnxk: use NEON for Rx inject inst preparation

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Use NEON instructions for Rx inject instruction preparation. On builds
other than ARM64, the Rx inject function is compiled as a stub that
returns 0.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 57 +--
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index f2980399c5..d36516735a 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -1390,15 +1391,17 @@ cn10k_cpt_dequeue_burst(void *qptr, struct 
rte_crypto_op **ops, uint16_t nb_ops)
return i;
 }
 
+#if defined(RTE_ARCH_ARM64)
 uint16_t __rte_hot
 cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct rte_mbuf **pkts,
  struct rte_security_session **sess, uint16_t 
nb_pkts)
 {
-   uint16_t l2_len, pf_func, lmt_id, count = 0;
-   uint64_t lmt_base, lmt_arg, io_addr;
+   uint64_t lmt_base, lmt_arg, io_addr, u64_0, u64_1, l2_len, pf_func;
+   uint64x2_t inst_01, inst_23, inst_45, inst_67;
struct cn10k_sec_session *sec_sess;
struct rte_cryptodev *cdev = dev;
union cpt_res_s *hw_res = NULL;
+   uint16_t lmt_id, count = 0;
struct cpt_inst_s *inst;
union cpt_fc_write_s fc;
struct cnxk_cpt_vf *vf;
@@ -1456,26 +1459,38 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
hw_res = RTE_PTR_ALIGN_CEIL(hw_res, 16);
 
/* Prepare CPT instruction */
-   inst->w0.u64 = 0;
-   inst->w2.u64 = 0;
-   inst->w2.s.rvu_pf_func = pf_func;
-   inst->w3.u64 = (((uint64_t)m + sizeof(struct rte_mbuf)) >> 3) 
<< 3 | 1;
 
-   inst->w4.u64 = sec_sess->inst.w4 | (rte_pktmbuf_pkt_len(m));
+   /* Word 0 and 1 */
+   u64_0 = pf_func << 48 | *(vf->rx_chan_base + m->port) << 4 | 
(l2_len - 2) << 24 |
+   l2_len << 16;
+   inst_01 = vsetq_lane_u64(u64_0, inst_01, 0);
+   inst_01 = vsetq_lane_u64((uint64_t)hw_res, inst_01, 1);
+   vst1q_u64(&inst->w0.u64, inst_01);
+
+   /* Word 2 and 3 */
+   inst_23 = vdupq_n_u64(0);
+   u64_1 = (((uint64_t)m + sizeof(struct rte_mbuf)) >> 3) << 3 | 1;
+   inst_23 = vsetq_lane_u64(u64_1, inst_23, 1);
+   vst1q_u64(&inst->w2.u64, inst_23);
+
+   /* Word 4 and 5 */
+   u64_0 = sec_sess->inst.w4 | (rte_pktmbuf_pkt_len(m));
+   inst_45 = vsetq_lane_u64(u64_0, inst_45, 0);
dptr = (uint64_t)rte_pktmbuf_iova(m);
-   inst->dptr = dptr;
-   inst->rptr = dptr;
+   u64_1 = dptr;
+   inst_45 = vsetq_lane_u64(u64_1, inst_45, 1);
+   vst1q_u64(&inst->w4.u64, inst_45);
 
-   inst->w0.hw_s.chan = *(vf->rx_chan_base + m->port);
-   inst->w0.hw_s.l2_len = l2_len;
-   inst->w0.hw_s.et_offset = l2_len - 2;
+   /* Word 6 and 7 */
+   u64_0 = dptr;
+   u64_1 = sec_sess->inst.w7;
+   inst_67 = vsetq_lane_u64(u64_0, inst_67, 0);
+   inst_67 = vsetq_lane_u64(u64_1, inst_67, 1);
+   vst1q_u64(&inst->w6.u64, inst_67);
 
-   inst->res_addr = (uint64_t)hw_res;
rte_atomic_store_explicit((unsigned long __rte_atomic 
*)&hw_res->u64[0], res.u64[0],
  rte_memory_order_relaxed);
 
-   inst->w7.u64 = sec_sess->inst.w7;
-
inst += 2;
}
 
@@ -1503,6 +1518,18 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
 exit:
return count + i;
 }
+#else
+uint16_t __rte_hot
+cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct rte_mbuf **pkts,
+ struct rte_security_session **sess, uint16_t 
nb_pkts)
+{
+   RTE_SET_USED(dev);
+   RTE_SET_USED(pkts);
+   RTE_SET_USED(sess);
+   RTE_SET_USED(nb_pkts);
+   return 0;
+}
+#endif
 
 void
 cn10k_cpt_set_enqdeq_fns(struct rte_cryptodev *dev, struct cnxk_cpt_vf *vf)
-- 
2.25.1
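
As a side note on the "Word 0 and 1" step above, the scalar packing of
word 0 can be sketched as follows. The shift positions (48, 24, 16, 4)
are taken from the diff; the function name rx_inject_w0() and the
12-bit limit asserted for the channel are assumptions made for this
example only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the first instruction word the way the patch does before moving
 * it into a NEON register. Every operand is widened to 64 bits before
 * shifting: shifting a 16-bit value (promoted to int) by 48 would be
 * undefined behaviour, which is presumably why the patch declares
 * pf_func and l2_len as uint64_t.
 */
static inline uint64_t
rx_inject_w0(uint16_t pf_func, uint16_t chan, uint8_t l2_len)
{
	assert(l2_len >= 2);     /* the ethertype offset is l2_len - 2 */
	assert(chan < (1 << 12)); /* assumed width; keeps chan below bit 16 */

	return (uint64_t)pf_func << 48 |
	       (uint64_t)(l2_len - 2) << 24 |
	       (uint64_t)l2_len << 16 |
	       (uint64_t)chan << 4;
}
```

For example, pf_func 0x1234, channel 0x800 and an L2 length of 14 pack
to 0x123400000C0E8000.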



[PATCH 07/12] crypto/cnxk: remove init of CPT result field in packet

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

The packet would be posted to CPT only when there is a valid result,
so skip initializing the CPT result field in the packet.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index d36516735a..1108a8a1da 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -1410,10 +1410,6 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
uint64_t dptr;
int i;
 
-   const union cpt_res_s res = {
-   .cn10k.compcode = CPT_COMP_NOT_DONE,
-   };
-
vf = cdev->data->dev_private;
 
lmt_base = vf->rx_inj_lmtline.lmt_base;
@@ -1488,9 +1484,6 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
inst_67 = vsetq_lane_u64(u64_1, inst_67, 1);
vst1q_u64(&inst->w6.u64, inst_67);
 
-   rte_atomic_store_explicit((unsigned long __rte_atomic 
*)&hw_res->u64[0], res.u64[0],
- rte_memory_order_relaxed);
-
inst += 2;
}
 
-- 
2.25.1



[PATCH 08/12] crypto/cnxk: add dual submission in Rx inject

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Add dual submission to CPT in Rx inject path.

Signed-off-by: Anoob Joseph 
Signed-off-by: Vidya Sagar Velumuri 
---
 drivers/common/cnxk/roc_cpt.h | 43 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 70 +--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  9 +++
 3 files changed, 90 insertions(+), 32 deletions(-)

diff --git a/drivers/common/cnxk/roc_cpt.h b/drivers/common/cnxk/roc_cpt.h
index 3721fa08c0..8ef9062ae0 100644
--- a/drivers/common/cnxk/roc_cpt.h
+++ b/drivers/common/cnxk/roc_cpt.h
@@ -30,23 +30,36 @@
 /* Vector of sizes in the burst of 16 CPT inst except first in 63:19 of
  * APT_LMT_ARG_S
  */
-#define ROC_CN10K_CPT_LMT_ARG  
\
-   (ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 0) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 1) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 2) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 3) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 4) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 5) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 6) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 7) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 8) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 9) |\
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 10) |   \
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 11) |   \
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 12) |   \
-ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 13) |   \
+#define ROC_CN10K_CPT_LMT_ARG  
\
+   (ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 0) | ROC_CN10K_CPT_INST_DW_M1 << 
(19 + 3 * 1) | \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 2) | ROC_CN10K_CPT_INST_DW_M1 << 
(19 + 3 * 3) | \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 4) | ROC_CN10K_CPT_INST_DW_M1 << 
(19 + 3 * 5) | \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 6) | ROC_CN10K_CPT_INST_DW_M1 << 
(19 + 3 * 7) | \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 8) | ROC_CN10K_CPT_INST_DW_M1 << 
(19 + 3 * 9) | \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 10) | ROC_CN10K_CPT_INST_DW_M1 
<< (19 + 3 * 11) |   \
+ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 12) | ROC_CN10K_CPT_INST_DW_M1 
<< (19 + 3 * 13) |   \
 ROC_CN10K_CPT_INST_DW_M1 << (19 + 3 * 14))
 
+/* Vector of sizes in the burst of 2 * 16 CPT inst except first in 63:19 of
+ * APT_LMT_ARG_S
+ */
+#define ROC_CN10K_DUAL_CPT_LMT_ARG 
\
+   (ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 0) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 1) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 2) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 3) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 4) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 5) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 6) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 7) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 8) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 9) | 
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 10) |
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 11) |
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 12) |
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 13) |
   \
+ROC_CN10K_TWO_CPT_INST_DW_M1 << (19 + 3 * 14))
+
 /* CPT helper macros */
 #define ROC_CPT_AH_HDR_LEN 12
 #define ROC_CPT_AES_GCM_IV_LEN 8
diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 1108a8a1da..3fd002d549 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -55,6 +55,54 @@ struct vec_request {
uint64_t w2;
 };
 
+static __rte_always_inline void __rte_hot
+cn10k_cpt_lmtst_dual_submit(uint64_t *io_addr, const uint16_t lmt_id, int *i)
+{
+   uint64_t lmt_arg;
+
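
The two LMT_ARG macros in roc_cpt.h above spell out fifteen 3-bit size
fields by hand, starting at bit 19. A loop that builds the same bit
pattern is sketched below; lmt_arg_size_vector() is a hypothetical
helper for illustration, and the values passed to it are examples
rather than the real values of ROC_CN10K_CPT_INST_DW_M1 or
ROC_CN10K_TWO_CPT_INST_DW_M1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build the "vector of sizes" that occupies bits 63:19 of the LMT
 * argument: 15 fields of 3 bits each, all carrying the same size value.
 */
static inline uint64_t
lmt_arg_size_vector(uint64_t dw_m1)
{
	uint64_t arg = 0;

	assert(dw_m1 <= 0x7); /* each field is 3 bits wide */

	for (unsigned int k = 0; k < 15; k++)
		arg |= dw_m1 << (19 + 3 * k);

	return arg;
}
```

With a size value of 7 every bit from 19 to 63 is set, and the low 19
bits stay clear for the remaining fields of the argument.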

[PATCH 09/12] crypto/cnxk: update sess pointer for next iteration

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Advance the sess pointer together with pkts when moving on to the next
set of packets.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 3fd002d549..0afd623990 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -1460,6 +1460,8 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
 
vf = cdev->data->dev_private;
 
+   const int nb_pkts_per_loop = 2 * CN10K_PKTS_PER_LOOP;
+
lmt_base = vf->rx_inj_lmtline.lmt_base;
io_addr = vf->rx_inj_lmtline.io_addr;
fc_addr = vf->rx_inj_lmtline.fc_addr;
@@ -1479,7 +1481,7 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
if (unlikely(fc.s.qsize > fc_thresh))
goto exit;
 
-   for (; i < RTE_MIN(2 * CN10K_PKTS_PER_LOOP, nb_pkts); i++) {
+   for (; i < RTE_MIN(nb_pkts_per_loop, nb_pkts); i++) {
 
m = pkts[i];
sec_sess = (struct cn10k_sec_session *)sess[i];
@@ -1537,10 +1539,11 @@ cn10k_cryptodev_sec_inb_rx_inject(void *dev, struct 
rte_mbuf **pkts,
 
cn10k_cpt_lmtst_dual_submit(&io_addr, lmt_id, &i);
 
-   if (nb_pkts - i > 0 && i == 2 * CN10K_PKTS_PER_LOOP) {
-   nb_pkts -= i;
-   pkts += i;
-   count += i;
+   if (nb_pkts - i > 0 && i == nb_pkts_per_loop) {
+   nb_pkts -= nb_pkts_per_loop;
+   pkts += nb_pkts_per_loop;
+   count += nb_pkts_per_loop;
+   sess += nb_pkts_per_loop;
goto again;
}
 
-- 
2.25.1
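
The issue addressed here can be reproduced with a small stand-alone
sketch: two parallel arrays are walked in chunks, and the second one
must be advanced together with the first. The function
process_in_chunks() and its advance_sess switch are hypothetical and
exist only to contrast the two behaviours; they are not part of the
driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Walk pkts[] and sess[] in chunks of per_loop entries and count how
 * many packets end up paired with the wrong session. Entry i of pkts
 * is expected to be paired with entry i of sess.
 */
static uint16_t
process_in_chunks(const int *pkts, const int *sess, uint16_t nb_pkts,
		  uint16_t per_loop, bool advance_sess, int *mismatches)
{
	uint16_t count = 0;

	assert(per_loop > 0);

	while (nb_pkts > 0) {
		uint16_t n = nb_pkts < per_loop ? nb_pkts : per_loop;

		for (uint16_t i = 0; i < n; i++)
			if (pkts[i] != sess[i])
				(*mismatches)++;

		nb_pkts -= n;
		pkts += n;
		count += n;
		if (advance_sess)
			sess += n; /* the step added by this patch */
	}

	return count;
}
```

With five packets and a chunk size of two, skipping the sess update
pairs the last three packets with the wrong sessions, while advancing
both pointers gives no mismatch.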



[PATCH 10/12] crypto/cnxk: fix aes-gcm zero len input cases

2024-06-20 Thread Aakash Sasidharan
For AES-GCM (AEAD) zero-length input, the SG code path is taken, unlike
the digest-only cases, because AAD is treated as a separate input
component. Fix the zero-length case in the SG path by skipping the
gather component only when the algorithm is not AEAD. Also add an SG
version check, as the fix only applies to a specific model.
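
The new early-return condition in prepare_iov_from_pkt() can be read as
a small predicate. The sketch below restates it with stand-in types:
struct pkt_stub replaces struct rte_mbuf, and skip_gather() and
check_skip_gather() are hypothetical names used only in this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the mbuf field the check looks at. */
struct pkt_stub {
	uint16_t data_len;
};

/*
 * True when no gather entries should be prepared: there is no packet,
 * or the input is zero length on the SG version 2 path and the
 * algorithm is not AEAD.
 */
static inline bool
skip_gather(const struct pkt_stub *pkt, bool is_aead, bool is_sg_ver2)
{
	return pkt == NULL ||
	       (is_sg_ver2 && pkt->data_len == 0 && !is_aead);
}

/* The cases the commit message distinguishes, written as assertions. */
static void
check_skip_gather(void)
{
	const struct pkt_stub empty = { .data_len = 0 };
	const struct pkt_stub full = { .data_len = 64 };

	assert(skip_gather(NULL, true, true));    /* no packet at all */
	assert(skip_gather(&empty, false, true)); /* digest only, SG v2 */
	assert(!skip_gather(&empty, true, true)); /* AEAD keeps the entry */
	assert(!skip_gather(&empty, false, false)); /* other SG version */
	assert(!skip_gather(&full, false, true)); /* non-empty input */
}
```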

Fixes: 4d8166d64988 ("crypto/cnxk: enable digest for zero length input")

Signed-off-by: Aakash Sasidharan 
---
 drivers/crypto/cnxk/cnxk_se.h | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/cnxk/cnxk_se.h b/drivers/crypto/cnxk/cnxk_se.h
index 6374718a82..63dbef4411 100644
--- a/drivers/crypto/cnxk/cnxk_se.h
+++ b/drivers/crypto/cnxk/cnxk_se.h
@@ -2468,13 +2468,14 @@ fill_sess_gmac(struct rte_crypto_sym_xform *xform, 
struct cnxk_se_sess *sess)
 }
 
 static __rte_always_inline uint32_t
-prepare_iov_from_pkt(struct rte_mbuf *pkt, struct roc_se_iov_ptr *iovec, 
uint32_t start_offset)
+prepare_iov_from_pkt(struct rte_mbuf *pkt, struct roc_se_iov_ptr *iovec, 
uint32_t start_offset,
+const bool is_aead, const bool is_sg_ver2)
 {
uint16_t index = 0;
void *seg_data = NULL;
int32_t seg_size = 0;
 
-   if (!pkt || pkt->data_len == 0) {
+   if (!pkt || (is_sg_ver2 && (pkt->data_len == 0) && !is_aead)) {
iovec->buf_cnt = 0;
return 0;
}
@@ -2619,13 +2620,13 @@ fill_sm_params(struct rte_crypto_op *cop, struct 
cnxk_se_sess *sess,
fc_params.dst_iov = (void *)dst;
 
/* Store SG I/O in the api for reuse */
-   if (prepare_iov_from_pkt(m_src, fc_params.src_iov, 0)) {
+   if (prepare_iov_from_pkt(m_src, fc_params.src_iov, 0, false, 
is_sg_ver2)) {
plt_dp_err("Prepare src iov failed");
ret = -EINVAL;
goto err_exit;
}
 
-   if (prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 0)) {
+   if (prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 0, false, 
is_sg_ver2)) {
plt_dp_err("Prepare dst iov failed for m_dst %p", 
m_dst);
ret = -EINVAL;
goto err_exit;
@@ -2816,14 +2817,15 @@ fill_fc_params(struct rte_crypto_op *cop, struct 
cnxk_se_sess *sess,
fc_params.dst_iov = (void *)dst;
 
/* Store SG I/O in the api for reuse */
-   if (prepare_iov_from_pkt(m_src, fc_params.src_iov, 0)) {
+   if (prepare_iov_from_pkt(m_src, fc_params.src_iov, 0, is_aead, 
is_sg_ver2)) {
plt_dp_err("Prepare src iov failed");
ret = -EINVAL;
goto err_exit;
}
 
if (unlikely(m_dst != NULL)) {
-   if (prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 0)) {
+   if (prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 0, 
is_aead,
+is_sg_ver2)) {
plt_dp_err("Prepare dst iov failed for "
   "m_dst %p",
   m_dst);
@@ -2957,13 +2959,15 @@ fill_pdcp_params(struct rte_crypto_op *cop, struct 
cnxk_se_sess *sess,
fc_params.dst_iov = (void *)dst;
 
/* Store SG I/O in the api for reuse */
-   if (unlikely(prepare_iov_from_pkt(m_src, fc_params.src_iov, 
0))) {
+   if (unlikely(
+   prepare_iov_from_pkt(m_src, fc_params.src_iov, 0, 
false, is_sg_ver2))) {
plt_dp_err("Prepare src iov failed");
ret = -EINVAL;
goto err_exit;
}
 
-   if (unlikely(prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 
0))) {
+   if (unlikely(
+   prepare_iov_from_pkt(m_dst, fc_params.dst_iov, 0, 
false, is_sg_ver2))) {
plt_dp_err("Prepare dst iov failed for m_dst %p", 
m_dst);
ret = -EINVAL;
goto err_exit;
@@ -3080,14 +3084,16 @@ fill_pdcp_chain_params(struct rte_crypto_op *cop, 
struct cnxk_se_sess *sess,
fc_params.dst_iov = (void *)dst;
 
/* Store SG I/O in the api for reuse */
-   if (unlikely(prepare_iov_from_pkt(m_src, fc_params.src_iov, 
0))) {
+   if (unlikely(
+   prepare_iov_from_pkt(m_src, fc_params.src_iov, 0, 
false, is_sg_ver2))) {
plt_dp_err("Could not prepare src iov");
ret = -EINVAL;
goto err_exit;
}
 
if (unlikely(m_dst != NULL)) {
-   if (unlikely(prepare_iov_from_pkt(m_dst, 
fc_params.dst_iov, 0))) {
+   if (unlikely(prepare_iov_from_pkt(m_dst, 

[PATCH 11/12] crypto/cnxk: make pack IV variable as const

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Make the 'pack_iv' variable const to avoid multiple checks.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cnxk_se.h | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/cnxk/cnxk_se.h b/drivers/crypto/cnxk/cnxk_se.h
index 63dbef4411..dbd36a8a54 100644
--- a/drivers/crypto/cnxk/cnxk_se.h
+++ b/drivers/crypto/cnxk/cnxk_se.h
@@ -105,7 +105,7 @@ cpt_pack_iv(uint8_t *iv_src, uint8_t *iv_dst)
 }
 
 static inline void
-pdcp_iv_copy(uint8_t *iv_d, const uint8_t *iv_s, const uint8_t pdcp_alg_type, 
uint8_t pack_iv)
+pdcp_iv_copy(uint8_t *iv_d, const uint8_t *iv_s, const uint8_t pdcp_alg_type, 
const bool pack_iv)
 {
const uint32_t *iv_s_temp;
uint32_t iv_temp[4];
@@ -261,7 +261,7 @@ cpt_mac_len_verify(struct rte_crypto_auth_xform *auth)
 
 static __rte_always_inline int
 sg_inst_prep(struct roc_se_fc_params *params, struct cpt_inst_s *inst, 
uint64_t offset_ctrl,
-const uint8_t *iv_s, int iv_len, uint8_t pack_iv, uint8_t 
pdcp_alg_type,
+const uint8_t *iv_s, int iv_len, const bool pack_iv, uint8_t 
pdcp_alg_type,
 int32_t inputlen, int32_t outputlen, uint32_t passthrough_len, 
uint32_t req_flags,
 int pdcp_flag, int decrypt)
 {
@@ -457,7 +457,7 @@ sg_inst_prep(struct roc_se_fc_params *params, struct 
cpt_inst_s *inst, uint64_t
 
 static __rte_always_inline int
 sg2_inst_prep(struct roc_se_fc_params *params, struct cpt_inst_s *inst, 
uint64_t offset_ctrl,
- const uint8_t *iv_s, int iv_len, uint8_t pack_iv, uint8_t 
pdcp_alg_type,
+ const uint8_t *iv_s, int iv_len, const bool pack_iv, uint8_t 
pdcp_alg_type,
  int32_t inputlen, int32_t outputlen, uint32_t passthrough_len, 
uint32_t req_flags,
  int pdcp_flag, int decrypt)
 {
@@ -882,7 +882,7 @@ static inline int
 pdcp_chain_sg1_prep(struct roc_se_fc_params *params, struct roc_se_ctx 
*cpt_ctx,
struct cpt_inst_s *inst, union cpt_inst_w4 w4, int32_t 
inputlen,
uint8_t hdr_len, uint64_t offset_ctrl, uint32_t req_flags,
-   const uint8_t *cipher_iv, const uint8_t *auth_iv, const int 
pack_iv,
+   const uint8_t *cipher_iv, const uint8_t *auth_iv, const 
bool pack_iv,
const uint8_t pdcp_ci_alg, const uint8_t pdcp_auth_alg)
 {
struct roc_sglist_comp *scatter_comp, *gather_comp;
@@ -991,7 +991,7 @@ static inline int
 pdcp_chain_sg2_prep(struct roc_se_fc_params *params, struct roc_se_ctx 
*cpt_ctx,
struct cpt_inst_s *inst, union cpt_inst_w4 w4, int32_t 
inputlen,
uint8_t hdr_len, uint64_t offset_ctrl, uint32_t req_flags,
-   const uint8_t *cipher_iv, const uint8_t *auth_iv, const int 
pack_iv,
+   const uint8_t *cipher_iv, const uint8_t *auth_iv, const 
bool pack_iv,
const uint8_t pdcp_ci_alg, const uint8_t pdcp_auth_alg)
 {
struct roc_sg2list_comp *gather_comp, *scatter_comp;
@@ -1528,7 +1528,6 @@ cpt_pdcp_chain_alg_prep(uint32_t req_flags, uint64_t 
d_offs, uint64_t d_lens,
struct roc_se_ctx *se_ctx;
uint64_t *offset_vaddr;
uint64_t offset_ctrl;
-   uint8_t pack_iv = 0;
int32_t inputlen;
void *dm_vaddr;
uint8_t *iv_d;
@@ -1606,10 +1605,10 @@ cpt_pdcp_chain_alg_prep(uint32_t req_flags, uint64_t 
d_offs, uint64_t d_lens,
cpt_inst_w4.s.dlen = inputlen + ROC_SE_OFF_CTRL_LEN;
 
iv_d = ((uint8_t *)offset_vaddr + ROC_SE_OFF_CTRL_LEN);
-   pdcp_iv_copy(iv_d, cipher_iv, pdcp_ci_alg, pack_iv);
+   pdcp_iv_copy(iv_d, cipher_iv, pdcp_ci_alg, false);
 
iv_d = ((uint8_t *)offset_vaddr + ROC_SE_OFF_CTRL_LEN + 
pdcp_iv_off);
-   pdcp_iv_copy(iv_d, auth_iv, pdcp_auth_alg, pack_iv);
+   pdcp_iv_copy(iv_d, auth_iv, pdcp_auth_alg, false);
 
inst->w4.u64 = cpt_inst_w4.u64;
return 0;
@@ -1618,11 +1617,11 @@ cpt_pdcp_chain_alg_prep(uint32_t req_flags, uint64_t 
d_offs, uint64_t d_lens,
if (is_sg_ver2)
return pdcp_chain_sg2_prep(params, se_ctx, inst, 
cpt_inst_w4, inputlen,
   hdr_len, offset_ctrl, 
req_flags, cipher_iv,
-  auth_iv, pack_iv, 
pdcp_ci_alg, pdcp_auth_alg);
+  auth_iv, false, pdcp_ci_alg, 
pdcp_auth_alg);
else
return pdcp_chain_sg1_prep(params, se_ctx, inst, 
cpt_inst_w4, inputlen,
   hdr_len, offset_ctrl, 
req_flags, cipher_iv,
-  auth_iv, pack_iv, 
pdcp_ci_alg, pdcp_auth_alg);
+  auth_iv, false, pdcp_ci_alg, 
pdcp_auth_alg);
}
 }
 
@@ -1647,9 

[PATCH 12/12] crypto/cnxk: enable dual submission to CPT

2024-06-20 Thread Aakash Sasidharan
From: Anoob Joseph 

Submit two instructions in one LMTLINE.

Signed-off-by: Anoob Joseph 
---
 drivers/common/cnxk/roc_cpt.c |  17 +-
 drivers/common/cnxk/roc_cpt.h |   8 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 182 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.h |  60 ++-
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  47 ++
 drivers/crypto/cnxk/cnxk_cryptodev_ops.h  |   2 +
 drivers/event/cnxk/cnxk_eventdev_adptr.c  |   4 +-
 7 files changed, 124 insertions(+), 196 deletions(-)

diff --git a/drivers/common/cnxk/roc_cpt.c b/drivers/common/cnxk/roc_cpt.c
index 9f283ceb2e..aba2a49d19 100644
--- a/drivers/common/cnxk/roc_cpt.c
+++ b/drivers/common/cnxk/roc_cpt.c
@@ -1135,8 +1135,8 @@ roc_cpt_iq_enable(struct roc_cpt_lf *lf)
 }
 
 int
-roc_cpt_lmtline_init(struct roc_cpt *roc_cpt, struct roc_cpt_lmtline *lmtline,
-int lf_id)
+roc_cpt_lmtline_init(struct roc_cpt *roc_cpt, struct roc_cpt_lmtline *lmtline, 
int lf_id,
+bool is_dual)
 {
struct roc_cpt_lf *lf;
 
@@ -1145,12 +1145,19 @@ roc_cpt_lmtline_init(struct roc_cpt *roc_cpt, struct 
roc_cpt_lmtline *lmtline,
return -ENOTSUP;
 
lmtline->io_addr = lf->io_addr;
-   if (roc_model_is_cn10k())
-   lmtline->io_addr |= ROC_CN10K_CPT_INST_DW_M1 << 4;
+   lmtline->fc_thresh = lf->nb_desc - CPT_LF_FC_MIN_THRESHOLD;
+
+   if (roc_model_is_cn10k()) {
+   if (is_dual) {
+   lmtline->io_addr |= ROC_CN10K_TWO_CPT_INST_DW_M1 << 4;
+   lmtline->fc_thresh = lf->nb_desc -  2 * 
CPT_LF_FC_MIN_THRESHOLD;
+   } else {
+   lmtline->io_addr |= ROC_CN10K_CPT_INST_DW_M1 << 4;
+   }
+   }
 
lmtline->fc_addr = lf->fc_addr;
lmtline->lmt_base = lf->lmt_base;
-   lmtline->fc_thresh = lf->nb_desc - CPT_LF_FC_MIN_THRESHOLD;
 
return 0;
 }
diff --git a/drivers/common/cnxk/roc_cpt.h b/drivers/common/cnxk/roc_cpt.h
index 8ef9062ae0..e2e919f80f 100644
--- a/drivers/common/cnxk/roc_cpt.h
+++ b/drivers/common/cnxk/roc_cpt.h
@@ -200,12 +200,12 @@ int __roc_api roc_cpt_afs_print(struct roc_cpt *roc_cpt);
 int __roc_api roc_cpt_lfs_print(struct roc_cpt *roc_cpt);
 void __roc_api roc_cpt_iq_disable(struct roc_cpt_lf *lf);
 void __roc_api roc_cpt_iq_enable(struct roc_cpt_lf *lf);
-int __roc_api roc_cpt_lmtline_init(struct roc_cpt *roc_cpt,
-  struct roc_cpt_lmtline *lmtline, int lf_id);
+int __roc_api roc_cpt_lmtline_init(struct roc_cpt *roc_cpt, struct 
roc_cpt_lmtline *lmtline,
+  int lf_id, bool is_dual);
 
 void __roc_api roc_cpt_parse_hdr_dump(FILE *file, const struct cpt_parse_hdr_s 
*cpth);
-int __roc_api roc_cpt_ctx_write(struct roc_cpt_lf *lf, void *sa_dptr,
-   void *sa_cptr, uint16_t sa_len);
+int __roc_api roc_cpt_ctx_write(struct roc_cpt_lf *lf, void *sa_dptr, void 
*sa_cptr,
+   uint16_t sa_len);
 
 void __roc_api roc_cpt_int_misc_cb_register(roc_cpt_int_misc_cb_t cb, void 
*args);
 int __roc_api roc_cpt_int_misc_cb_unregister(roc_cpt_int_misc_cb_t cb, void 
*args);
diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 0afd623990..f46379b43e 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -12,11 +12,6 @@
 #include 
 
 #include "roc_cpt.h"
-#if defined(__aarch64__)
-#include "roc_io.h"
-#else
-#include "roc_io_generic.h"
-#endif
 #include "roc_idev.h"
 #include "roc_sso.h"
 #include "roc_sso_dp.h"
@@ -40,8 +35,8 @@
 
 /* Holds information required to send crypto operations in one burst */
 struct ops_burst {
-   struct rte_crypto_op *op[CN10K_PKTS_PER_LOOP];
-   uint64_t w2[CN10K_PKTS_PER_LOOP];
+   struct rte_crypto_op *op[CN10K_CPT_PKTS_PER_LOOP];
+   uint64_t w2[CN10K_CPT_PKTS_PER_LOOP];
struct cn10k_sso_hws *ws;
struct cnxk_cpt_qp *qp;
uint16_t nb_ops;
@@ -55,54 +50,6 @@ struct vec_request {
uint64_t w2;
 };
 
-static __rte_always_inline void __rte_hot
-cn10k_cpt_lmtst_dual_submit(uint64_t *io_addr, const uint16_t lmt_id, int *i)
-{
-   uint64_t lmt_arg;
-
-   /* Check if the total number of instructions is odd or even. */
-   const int flag_odd = *i & 0x1;
-
-   /* Reduce i by 1 when odd number of instructions.*/
-   *i -= flag_odd;
-
-   if (*i > 2 * CN10K_PKTS_PER_STEORL) {
-   lmt_arg = ROC_CN10K_DUAL_CPT_LMT_ARG | (CN10K_PKTS_PER_STEORL - 
1) << 12 |
- (uint64_t)lmt_id;
-   roc_lmt_submit_steorl(lmt_arg, *io_addr);
-   lmt_arg = ROC_CN10K_DUAL_CPT_LMT_ARG | (*i / 2 - 
CN10K_PKTS_PER_STEORL - 1) << 12 |
- (uint64_t)(lmt_id + CN10K_PKTS_PER_STEORL);
-   roc_lmt_submit_steorl(lmt_arg, *io_addr);
-   if (fla

RE: [PATCH] net/*: replace intrinsic header include with rte_vect

2024-06-20 Thread Konstantin Ananyev


> Rather than having the SSE code in each driver include tmmintrin.h,
> which often does not contain all needed intrinsics, e.g.
> _mm_cvtsi128_si64() for 32-bit x86 builds, we can just replace the
> include of ?mmintrin.h with rte_vect.h for all network drivers.
> 
> Signed-off-by: Bruce Richardson 
> ---
>  drivers/net/fm10k/fm10k_rxtx_vec.c  | 2 +-
>  drivers/net/i40e/i40e_rxtx_vec_sse.c| 2 +-
>  drivers/net/iavf/iavf_rxtx_vec_sse.c| 2 +-
>  drivers/net/ice/ice_rxtx_vec_sse.c  | 2 +-
>  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  | 2 +-
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h| 2 +-
>  drivers/net/ngbe/ngbe_rxtx_vec_sse.c| 2 +-
>  drivers/net/txgbe/txgbe_rxtx_vec_sse.c  | 2 +-
>  drivers/net/virtio/virtio_rxtx_simple_sse.c | 2 +-
>  9 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
> b/drivers/net/fm10k/fm10k_rxtx_vec.c
> index 62119de373..9a84775cb1 100644
> --- a/drivers/net/fm10k/fm10k_rxtx_vec.c
> +++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
> @@ -10,7 +10,7 @@
>  #include "fm10k.h"
>  #include "base/fm10k_type.h"
> 
> -#include 
> +#include 
> 
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c 
> b/drivers/net/i40e/i40e_rxtx_vec_sse.c
> index 2d4480a765..ad560d2b6b 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
> @@ -12,7 +12,7 @@
>  #include "i40e_rxtx.h"
>  #include "i40e_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> diff --git a/drivers/net/iavf/iavf_rxtx_vec_sse.c 
> b/drivers/net/iavf/iavf_rxtx_vec_sse.c
> index 96f187f511..0db6fa8bd4 100644
> --- a/drivers/net/iavf/iavf_rxtx_vec_sse.c
> +++ b/drivers/net/iavf/iavf_rxtx_vec_sse.c
> @@ -10,7 +10,7 @@
>  #include "iavf_rxtx.h"
>  #include "iavf_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c 
> b/drivers/net/ice/ice_rxtx_vec_sse.c
> index 9a1b7e3e51..c01d8ede29 100644
> --- a/drivers/net/ice/ice_rxtx_vec_sse.c
> +++ b/drivers/net/ice/ice_rxtx_vec_sse.c
> @@ -4,7 +4,7 @@
> 
>  #include "ice_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c 
> b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> index f60808d576..a77370cdb7 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
> @@ -10,7 +10,7 @@
>  #include "ixgbe_rxtx.h"
>  #include "ixgbe_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h 
> b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 2bdd1f676d..93d6d1b5f0 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -9,7 +9,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include <rte_vect.h>
> 
>  #include 
>  #include 
> diff --git a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c 
> b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
> index f703d0ea15..b128bd3a67 100644
> --- a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
> +++ b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
> @@ -11,7 +11,7 @@
>  #include "ngbe_rxtx.h"
>  #include "ngbe_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  static inline void
>  ngbe_rxq_rearm(struct ngbe_rx_queue *rxq)
> diff --git a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c 
> b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
> index 12eb4aeef5..1a3f2ce3cd 100644
> --- a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
> +++ b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
> @@ -10,7 +10,7 @@
>  #include "txgbe_rxtx.h"
>  #include "txgbe_rxtx_vec_common.h"
> 
> -#include 
> +#include <rte_vect.h>
> 
>  static inline void
>  txgbe_rxq_rearm(struct txgbe_rx_queue *rxq)
> diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c 
> b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> index 6a18741b6d..d53acc4fd6 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
> @@ -8,7 +8,7 @@
>  #include 
>  #include 
> 
> -#include 
> +#include <rte_vect.h>
> 
>  #include 
>  #include 
> --

Acked-by: Konstantin Ananyev 

> 2.43.0



Re: [PATCH v2 093/148] net/ice/base: allow different FW API versions based on MAC type

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:27PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Allow the driver to be compatible with different FW API versions based
> on the device's MAC type. Currently, E810 is only compatible with one
> FW API version. Now the driver can be compatible with different FW API
> versions for both E810 and E830.
> 
> Signed-off-by: Dan Nowlin 
> Signed-off-by: Ian Stokes 
> ---
>  drivers/net/ice/base/ice_controlq.c | 17 ++---
>  drivers/net/ice/base/ice_controlq.h | 22 +++---
>  2 files changed, 29 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ice/base/ice_controlq.c 
> b/drivers/net/ice/base/ice_controlq.c
> index c2cf747b65..edc068481e 100644
> --- a/drivers/net/ice/base/ice_controlq.c
> +++ b/drivers/net/ice/base/ice_controlq.c
> @@ -479,24 +479,27 @@ ice_shutdown_sq(struct ice_hw *hw, struct 
> ice_ctl_q_info *cq)
>   */
>  static bool ice_aq_ver_check(struct ice_hw *hw)
>  {
> - if (hw->api_maj_ver > EXP_FW_API_VER_MAJOR) {
> + u8 exp_fw_api_ver_major = EXP_FW_API_VER_MAJOR_BY_MAC(hw);
> + u8 exp_fw_api_ver_minor = EXP_FW_API_VER_MINOR_BY_MAC(hw);
> +
> +if (hw->api_maj_ver > exp_fw_api_ver_major) {

Let's fix the indentation on this line.



Re: [PATCH v2 095/148] net/ice/base: add E830 debug dump cluster ID values

2024-06-20 Thread Bruce Richardson
On Wed, Jun 12, 2024 at 04:01:29PM +0100, Anatoly Burakov wrote:
> From: Ian Stokes 
> 
> Add E830 debug dump cluster ID values, which are different than the
> values for E810. Rename E810 cluster IDs to make it easier to
> distinguish which values are which.Add E830 debug dump cluster ID
> values, which are different than the values for E810.
> Rename E810 cluster IDs to make it easier to
> distinguish which values are which.
> 

A commit log so good it's worth repeating? :-)

/Bruce


DTS WG Meeting Minutes - June 20, 2024

2024-06-20 Thread Patrick Robb
#
Attendees
* Patrick Robb
* Juraj Linkeš
* Paul Szczepanek
* Jeremy Spewock
* Nicholas Pratte
* Dean Marx
* Luca Vizzarro

#
Minutes

=
General Announcements
* Thomas has merged some patchseries:
   * Testpmd params:
https://patchwork.dpdk.org/project/dpdk/list/?series=32231&state=*
   * Mypy: https://patchwork.dpdk.org/project/dpdk/list/?series=32026&state=*
  * UNH can roll out the CI testing for mypy early next week
   * Error and usage improvements:
https://patchwork.dpdk.org/project/dpdk/list/?series=32038&state=*
   * Testpmd show port info/stats:
https://patchwork.dpdk.org/project/dpdk/list/?series=32112&state=*
   * Rename execution to test run:
https://patchwork.dpdk.org/project/dpdk/patch/20240607083858.58906-1-juraj.lin...@pantheon.tech/
   * Node and inheritance improvements:
https://patchwork.dpdk.org/project/dpdk/list/?series=32230&state=*
   * Hugepage configuration refactor:
https://patchwork.dpdk.org/project/dpdk/list/?series=32129&state=*
  * Thomas found an issue with Nick's patchset: some parts were
not in the right patches. The affected parts are in the DPDK docs, and
each individual commit must be valid, which was not the case with
Nick's patchset
 * Each individual patch must build docs successfully. The
renaming needs to be grouped into one patch instead of split out.
 * Lines must be broken logically, like after a dot or comma.
But, there is a line limitation and this may restrict our ability to
write freely. There isn’t a hard requirement that the line breaks with
some punctuation (although this is helpful), a logical separation in
the sentence structure is fine too.
   * Improved interactive shell output gathering not merged as there
were some unresolved comments
   * Everyone please rebase
* DPDK Summit Montreal
   * Possible CFP topics for DTS:
  * How to write a testsuite. This could be a simple lightning
talk which shows how we wrote one of the testsuites we are working on
now.
  * Luca is interested in giving this talk
* Thomas mentioned: We may discuss creating a git tree for you (for DTS)
   * This would help Juraj and the DTS group have more control over
merging patches
   * This is like next-* branches on DPDK. So it would be like a next-dts.
   * Patrick Robb can also add it to the agenda.
  * Juraj should join to share his thoughts as DTS maintainer
  * Meeting info here: https://core.dpdk.org/techboard/
* Do we want to switch to Ubuntu 24.04 as the distro of choice? This
would prompt a Python update, which we could use for dataclasses
(@typing.dataclass_transform) and other things
* RC2 should be 2 weeks out. 5th of July.

=
Patch discussions
* Improve interactive shell output gathering/logging
   * There are some unresolved comments from Nick

=
Bugzilla discussions
* Vlan filter ethdev bug added: https://bugs.dpdk.org/show_bug.cgi?id=1464

=
Any other business
* Next meeting Jul 3, 2024


RE: [PATCH] examples/fips_validation: fix coverity issues

2024-06-20 Thread Dooley, Brian
Hey Gowrishankar,

> -Original Message-
> From: Gowrishankar Muthukrishnan 
> Sent: Saturday, June 15, 2024 12:31 PM
> To: dev@dpdk.org; Dooley, Brian ; Gowrishankar
> Muthukrishnan 
> Cc: Anoob Joseph ; sta...@dpdk.org
> Subject: [PATCH] examples/fips_validation: fix coverity issues
> 
> Fix NULL dereference, out-of-bound, bad bit shift issues reported by coverity
> scan.
> 
> Coverity issue: 384440, 384435, 384433, 384429
> Fixes: 36128a67c27 ("examples/fips_validation: add asymmetric validation")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Gowrishankar Muthukrishnan 
> ---
>  examples/fips_validation/fips_validation_rsa.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/examples/fips_validation/fips_validation_rsa.c
> b/examples/fips_validation/fips_validation_rsa.c
> index f675b51051..55f81860a0 100644
> --- a/examples/fips_validation/fips_validation_rsa.c
> +++ b/examples/fips_validation/fips_validation_rsa.c
> @@ -328,6 +328,9 @@ parse_test_rsa_json_interim_writeback(struct
> fips_val *val)
>   if (prepare_vec_rsa() < 0)
>   return -1;
> 
> + if (!vec.rsa.e.val)
> + return -1;
> +
>   writeback_hex_str("", info.one_line_text, &vec.rsa.n);
>   obj = json_string(info.one_line_text);
>   json_object_set_new(json_info.json_write_group, "n", obj);
> @@ -474,7 +477,7 @@ fips_test_randomize_message(struct fips_val *msg,
> struct fips_val *rand)
>   uint16_t rv_len;
> 
>   if (!msg->val || !rand->val || rand->len > RV_BUF_LEN
> - || msg->len > FIPS_TEST_JSON_BUF_LEN)
> + || msg->len > (FIPS_TEST_JSON_BUF_LEN - 1))
>   return -EINVAL;
> 
>   memset(rv, 0, sizeof(rv));
> @@ -503,7 +506,7 @@ fips_test_randomize_message(struct fips_val *msg,
> struct fips_val *rand)
>   m[i + j] ^= rv[j];
> 
>   m[i + j] = ((uint8_t *)&rv_bitlen)[0];
> - m[i + j + 1] = (((uint8_t *)&rv_bitlen)[1] >> 8) & 0xFF;
> + m[i + j + 1] = ((uint8_t *)&rv_bitlen)[1];
> 
>   rte_free(msg->val);
>   msg->len = (rv_bitlen + m_bitlen + 16) / 8;
> --
> 2.25.1

Acked-by: Brian Dooley 



Community CI Meeting Minutes - June 13, 2024

2024-06-20 Thread Patrick Robb
#
Attendees
1. Patrick Robb
2. Juraj Linkeš
3. Aaron Conole
4. Dean Marx
5. Jeremy Spewock
6. Manit Mahajan
7. Nicholas Pratte
8. Paul Szczepanek
9. Tomas Durovec

#
Minutes

=
General Announcements
* DPDK Summit in Montreal will be September 24-25:
https://www.dpdk.org/event/dpdk-summit-2024/
   * CFP closes July 21
* Next Wednesday is a federal holiday in the United States, so the UNH
folks other than Patrick (Jeremy, Nick, Dean) wouldn’t be able to
attend the DTS meeting
   * Rescheduled to 13:00 UTC next Thursday
* Depends-on:
   * Adam from UNH has begun development for the patchwork server /
git-pw support for Depends-on. The project maintainer has approved his
approach and he addressed questions from a couple other community
members on the github issue.
  * Aaron requested that we start tracking which labs support
depends-on on the dpdk.org testing page, like how we do with retests.
https://patchwork.dpdk.org/project/web/patch/20240531220110.5159-1-pr...@iol.unh.edu/

=
CI Status

-
UNH-IOL Community Lab
* Environments added for Compile/Unit tests:
   * Fedora 40
   * RHEL 9
* Template Engine:
   * New versions of python/pip do not allow pip installs outside
of a virtual environment without an override, per PEP 668
https://peps.python.org/pep-0668/. UNH team proposes to shift to using
--break-system-packages to allow for pip installs outside of venv, in
modern distros
  * For these ephemeral containers which only exist to run some
DPDK build, unit tests, etc. the concern of breaking system python
dependencies did not seem very salient
  * Flagging that this also adds a venv requirement to the
linux-setup.sh script, although many users were probably already
running from venv to begin with:
https://git.dpdk.org/dpdk/tree/.ci/linux-setup.sh
* Zlib vdev compression test is now running on an ARM tx2 server in
the lab, and reporting results. So, now we have coverage for both x86
and arm arch.
* SPDK compilation is running in CI:
https://mails.dpdk.org/archives/test-report/2024-June/692342.html
   * This is only for x86_64. We tried arm64 but spdk support is poor
on that side, and almost all distros compile with warnings.
* Dashboard updates:
   * Now posting a retest counter for a patchseries on the report
detail page (reminder, once you hit 3/3, you can no longer request
retests on a series)
* We introduced a bug to our reporting Monday afternoon which resulted
in some missing patchwork contexts. This was fixed yesterday and all
reports re-queued. It appears that we have backfilled all results now.
* Pending:
   * 
https://patchwork.dpdk.org/project/ci/patch/20240523215945.16468-1-pr...@iol.unh.edu/
  * Ali acked v1, but Patrick Robb will ping him for final
confirmation this is good
   * UNH has staged all the code internally to support this, so we
should be good to go
-
Intel Lab
* None

-
Github Actions
* No news - it seems to be stable after the fixes implemented a couple weeks ago
* New features like cirrus and retests v2 are forthcoming
* No missing results were reported from the downtime mentioned at the
last meeting

-
Loongarch Lab
* Zhoumin indicated that he would be interested in supporting
depends-on for his CI project, once it is supported by the patchwork
project

=
DTS Improvements & Test Development
* New Testsuites written at UNH:
   * Vlan
  * Tests vlan filtering, vlan insertion, vlan stripping
  * Filtering works normally on Mellanox CX5. i40e and bnxt_en NICs
are forwarding packets whose VLAN IDs differ from the VLAN
assigned to the rx_port
   * Jumboframes
  * Runs fine
   * Dynamic_queue
  * Is working for rx side.
  * When you change testpmd to tx forwarding mode, it is supposed
to immediately send “a bunch of packets”. This happens on the first
start for i40e NICs but not on subsequent starts, whereas it happens
every time on CX5.
  * Debugging this behavior, though it may necessitate a bugzilla ticket
   * Queue_start_stop
  * Cannot start/stop queues in vectorized RX mode. Need to
determine whether another burst mode is allowed for this NIC, or
whether start/stop is simply invalid on Mellanox.
* Juraj pinged Thomas about merging the 7 patches which are ready, and
he indicated he will be able to take a look at them on Monday.
   * update mypy and clean up:
https://patches.dpdk.org/project/dpdk/list/?series=32026
   * error and usage impr

[PATCH v4 0/3] Improve interactive shell output gathering and logging

2024-06-20 Thread jspewock
From: Jeremy Spewock 

v4:
  * rebase on top of rc1.
  * address comments and fix typos in the added docstring example.

Jeremy Spewock (3):
  dts: Improve output gathering in interactive shells
  dts: Add missing docstring from XML-RPC server
  dts: Improve logging for interactive shells

 dts/framework/exception.py| 66 ---
 dts/framework/remote_session/dpdk_shell.py|  3 +-
 .../remote_session/interactive_shell.py   | 58 +++-
 dts/framework/remote_session/testpmd_shell.py |  2 +
 .../testbed_model/traffic_generator/scapy.py  | 50 +-
 5 files changed, 139 insertions(+), 40 deletions(-)

-- 
2.45.2



[PATCH v4 1/3] dts: Improve output gathering in interactive shells

2024-06-20 Thread jspewock
From: Jeremy Spewock 

The current implementation of consuming output from interactive shells
relies on being able to find an expected prompt somewhere within the
output buffer after sending the command. This is useful in situations
where the prompt does not appear in the output itself, but in some
practical cases (such as the starting of an XML-RPC server for scapy)
the prompt exists in one of the commands sent to the shell and this can
cause the command to exit early and creates a race condition between the
server starting and the first command being sent to the server.

This patch addresses this problem by searching for a line that strictly
ends with the provided prompt, rather than one that simply contains it,
so that the detection that a command is finished is more consistent. It
also adds a catch to detect when a command times out before finding the
prompt or the underlying SSH session dies so that the exception can be
wrapped into a more explicit one and be more consistent with the
non-interactive shells.
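
As a rough illustration of the matching rule described above (this is not
the code in the patch), the sketch below gathers output until a line ends
with the prompt, and raises an explicit error if that never happens. The
names collect_until_prompt and PromptTimeoutError are hypothetical, and
the assumption that the reader signals a timeout with TimeoutError is
mine, not something the framework guarantees.

```python
from typing import Iterable


class PromptTimeoutError(Exception):
    """Hypothetical stand-in for an interactive-shell timeout error."""


def collect_until_prompt(lines: Iterable[str], prompt: str) -> str:
    """Return all output up to and including the first line ending with prompt.

    Trailing whitespace is ignored when matching, so a prompt followed by a
    space or newline still counts. A line that merely contains the prompt
    somewhere in the middle does not end the collection.
    """
    collected = []
    try:
        for line in lines:
            collected.append(line)
            if line.rstrip().endswith(prompt):
                return "".join(collected)
    except TimeoutError as e:
        # Assumption: the underlying reader signals a timeout with TimeoutError.
        raise PromptTimeoutError(f"no line ended with {prompt!r} in time") from e
    raise PromptTimeoutError(f"output ended before a line ended with {prompt!r}")


output = [
    'print("XMLRPC OK >>> not a prompt")\n',  # contains the prompt mid-line
    "XMLRPC OK >>> not a prompt\n",           # still only contains it
    ">>> \n",                                 # the real prompt line
]
result = collect_until_prompt(output, ">>>")
```

A check based on "prompt in line" would have stopped at the first echoed
line here, which is the early exit the commit message describes.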

Bugzilla ID: 1359
Fixes: 88489c0501af ("dts: add smoke tests")

Signed-off-by: Jeremy Spewock 
---
 dts/framework/exception.py| 66 ---
 .../remote_session/interactive_shell.py   | 49 ++
 2 files changed, 80 insertions(+), 35 deletions(-)

diff --git a/dts/framework/exception.py b/dts/framework/exception.py
index 74fd2af3b6..f45f789825 100644
--- a/dts/framework/exception.py
+++ b/dts/framework/exception.py
@@ -51,26 +51,6 @@ class DTSError(Exception):
 severity: ClassVar[ErrorSeverity] = ErrorSeverity.GENERIC_ERR
 
 
-class SSHTimeoutError(DTSError):
-"""The SSH execution of a command timed out."""
-
-#:
-severity: ClassVar[ErrorSeverity] = ErrorSeverity.SSH_ERR
-_command: str
-
-def __init__(self, command: str):
-"""Define the meaning of the first argument.
-
-Args:
-command: The executed command.
-"""
-self._command = command
-
-def __str__(self) -> str:
-"""Add some context to the string representation."""
-return f"{self._command} execution timed out."
-
-
 class SSHConnectionError(DTSError):
 """An unsuccessful SSH connection."""
 
@@ -98,8 +78,42 @@ def __str__(self) -> str:
 return message
 
 
-class SSHSessionDeadError(DTSError):
-"""The SSH session is no longer alive."""
+class _SSHTimeoutError(DTSError):
+"""The execution of a command via SSH timed out.
+
+This class is private and meant to be raised as its interactive and 
non-interactive variants.
+"""
+
+#:
+severity: ClassVar[ErrorSeverity] = ErrorSeverity.SSH_ERR
+_command: str
+
+def __init__(self, command: str):
+"""Define the meaning of the first argument.
+
+Args:
+command: The executed command.
+"""
+self._command = command
+
+def __str__(self) -> str:
+"""Add some context to the string representation."""
+return f"{self._command} execution timed out."
+
+
+class SSHTimeoutError(_SSHTimeoutError):
+"""The execution of a command on a non-interactive SSH session timed 
out."""
+
+
+class InteractiveSSHTimeoutError(_SSHTimeoutError):
+"""The execution of a command on an interactive SSH session timed out."""
+
+
+class _SSHSessionDeadError(DTSError):
+"""The SSH session is no longer alive.
+
+This class is private and meant to be raised as its interactive and 
non-interactive variants.
+"""
 
 #:
 severity: ClassVar[ErrorSeverity] = ErrorSeverity.SSH_ERR
@@ -118,6 +132,14 @@ def __str__(self) -> str:
 return f"SSH session with {self._host} has died."
 
 
+class SSHSessionDeadError(_SSHSessionDeadError):
+"""Non-interactive SSH session has died."""
+
+
+class InteractiveSSHSessionDeadError(_SSHSessionDeadError):
+"""Interactive SSH session has died."""
+
+
 class ConfigurationError(DTSError):
 """An invalid configuration."""
 
diff --git a/dts/framework/remote_session/interactive_shell.py 
b/dts/framework/remote_session/interactive_shell.py
index 254aa29f89..6aa5281d6a 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -21,6 +21,10 @@
 
 from paramiko import Channel, channel  # type: ignore[import-untyped]
 
+from framework.exception import (
+InteractiveSSHSessionDeadError,
+InteractiveSSHTimeoutError,
+)
 from framework.logger import DTSLogger
 from framework.params import Params
 from framework.settings import SETTINGS
@@ -53,7 +57,10 @@ class InteractiveShell(ABC):
 
 #: Extra characters to add to the end of every command
 #: before sending them. This is often overridden by subclasses and is
-#: most commonly an additional newline character.
+#: most commonly an additional newline character. This additional newline
+#: character is used to force the line that is currently awaiting input
+#: into the stdout buffer so that it can be consume

[PATCH v4 2/3] dts: Add missing docstring from XML-RPC server

2024-06-20 Thread jspewock
From: Jeremy Spewock 

When this XML-RPC server implementation was added, the docstring had to
be shortened in order to reduce the chances of this race condition being
encountered. Now that this race condition issue is resolved, the full
docstring can be restored.

Signed-off-by: Jeremy Spewock 
---
 .../testbed_model/traffic_generator/scapy.py  | 46 ++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/dts/framework/testbed_model/traffic_generator/scapy.py 
b/dts/framework/testbed_model/traffic_generator/scapy.py
index bf58ad1c5e..c3648134fc 100644
--- a/dts/framework/testbed_model/traffic_generator/scapy.py
+++ b/dts/framework/testbed_model/traffic_generator/scapy.py
@@ -128,9 +128,53 @@ def scapy_send_packets(xmlrpc_packets: 
list[xmlrpc.client.Binary], send_iface: s
 
 
 class QuittableXMLRPCServer(SimpleXMLRPCServer):
-"""Basic XML-RPC server.
+r"""Basic XML-RPC server.
 
 The server may be augmented by functions serializable by the 
:mod:`marshal` module.
+
+Example:
+::
+
+def hello_world():
+# to be sent to the XML-RPC server
+print("Hello World!")
+
+# start the XML-RPC server on the remote node
+# this is done by starting a Python shell on the remote node
+from framework.remote_session import PythonShell
+# the example assumes you're already connected to a tg_node
+session = tg_node.create_interactive_shell(PythonShell, timeout=5, 
privileged=True)
+
+# then importing the modules needed to run the server
+# and the modules for any functions later added to the server
+session.send_command("import xmlrpc")
+session.send_command("from xmlrpc.server import 
SimpleXMLRPCServer")
+
+# sending the source code of this class to the Python shell
+from xmlrpc.server import SimpleXMLRPCServer
+src = inspect.getsource(QuittableXMLRPCServer)
+src = "\n".join([l for l in src.splitlines() if not l.isspace() 
and l != ""])
+spacing = "\n" * 4
+session.send_command(spacing + src + spacing)
+
+# then starting the server with:
+command = "s = QuittableXMLRPCServer(('0.0.0.0', 
{listen_port}));s.serve_forever()"
+session.send_command(command, "XMLRPC OK")
+
+# now the server is running on the remote node and we can add 
functions to it
+# first connect to the server from the execution node
+import xmlrpc.client
+server_url = f"http://{tg_node.config.hostname}:8000"
+rpc_server_proxy = xmlrpc.client.ServerProxy(server_url)
+
+# get the function bytes to send
+import marshal
+function_bytes = marshal.dumps(hello_world.__code__)
+rpc_server_proxy.add_rpc_function(hello_world.__name__, 
function_bytes)
+
+# now we can execute the function on the server
+xmlrpc_binary_recv: xmlrpc.client.Binary = 
rpc_server_proxy.hello_world()
+print(str(xmlrpc_binary_recv))
 """
 
 def __init__(self, *args, **kwargs):
-- 
2.45.2



[PATCH v4 3/3] dts: Improve logging for interactive shells

2024-06-20 Thread jspewock
From: Jeremy Spewock 

The messages being logged by interactive shells currently are using the
same logger as the node they were created from. Because of this, when
sending interactive commands, the logs make no distinction between when
you are sending a command directly to the host and when you are using an
interactive shell on the host. This change adds names to interactive
shells so that they are able to use their own loggers with distinct
names.
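
The naming scheme can be sketched with the standard logging module alone.
This is only an illustration under assumptions: the Node, Shell and
PythonShell classes below are hypothetical minimal stand-ins, and
logging.getLogger is used in place of the framework's own logger factory.

```python
import logging
from typing import Optional


class Node:
    """Hypothetical minimal node; only its name matters for this sketch."""

    def __init__(self, name: str) -> None:
        self.name = name


class Shell:
    """Hypothetical shell that derives its logger name from node and shell name."""

    def __init__(self, node: Node, name: Optional[str] = None) -> None:
        if name is None:
            # Fall back to the class name when no explicit name is given.
            name = type(self).__name__
        # Dotted names make the shell logger a child of the node's logger.
        self.logger = logging.getLogger(f"{node.name}.{name}")


class PythonShell(Shell):
    """Hypothetical subclass used to show the default name."""


tg = Node("tg_node")
default_shell = PythonShell(tg)
named_shell = PythonShell(tg, name="ScapyXMLRPCServer")
```

With the standard library, a dotted logger name also places the shell
logger below the node's logger in the hierarchy, so handlers attached to
the node logger still receive the shell's records unless propagation is
disabled.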

Signed-off-by: Jeremy Spewock 
---
 dts/framework/remote_session/dpdk_shell.py | 3 ++-
 dts/framework/remote_session/interactive_shell.py  | 9 +++--
 dts/framework/remote_session/testpmd_shell.py  | 2 ++
 dts/framework/testbed_model/traffic_generator/scapy.py | 4 +++-
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/dts/framework/remote_session/dpdk_shell.py 
b/dts/framework/remote_session/dpdk_shell.py
index 296639f37d..9b4ff47334 100644
--- a/dts/framework/remote_session/dpdk_shell.py
+++ b/dts/framework/remote_session/dpdk_shell.py
@@ -81,6 +81,7 @@ def __init__(
 append_prefix_timestamp: bool = True,
 start_on_init: bool = True,
 app_params: EalParams = EalParams(),
+name: str | None = None,
 ) -> None:
 """Extends :meth:`~.interactive_shell.InteractiveShell.__init__`.
 
@@ -95,7 +96,7 @@ def __init__(
 append_prefix_timestamp,
 )
 
-super().__init__(node, privileged, timeout, start_on_init, app_params)
+super().__init__(node, privileged, timeout, start_on_init, app_params, 
name)
 
 def _update_real_path(self, path: PurePath) -> None:
 """Extends 
:meth:`~.interactive_shell.InteractiveShell._update_real_path`.
diff --git a/dts/framework/remote_session/interactive_shell.py 
b/dts/framework/remote_session/interactive_shell.py
index 6aa5281d6a..c92fdbfcdf 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -25,7 +25,7 @@
 InteractiveSSHSessionDeadError,
 InteractiveSSHTimeoutError,
 )
-from framework.logger import DTSLogger
+from framework.logger import DTSLogger, get_dts_logger
 from framework.params import Params
 from framework.settings import SETTINGS
 from framework.testbed_model.node import Node
@@ -73,6 +73,7 @@ def __init__(
 timeout: float = SETTINGS.timeout,
 start_on_init: bool = True,
 app_params: Params = Params(),
+name: str | None = None,
 ) -> None:
 """Create an SSH channel during initialization.
 
@@ -84,9 +85,13 @@ def __init__(
 and no output is gathered within the timeout, an exception is 
thrown.
 start_on_init: Start interactive shell automatically after object 
initialisation.
 app_params: The command line parameters to be passed to the 
application on startup.
+name: Name for the interactive shell to use for logging. This name 
will be appended to
+the name of the underlying node which it is running on.
 """
 self._node = node
-self._logger = node._logger
+if name is None:
+name = type(self).__name__
+self._logger = get_dts_logger(f"{node.name}.{name}")
 self._app_params = app_params
 self._privileged = privileged
 self._timeout = timeout
diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index ec22f72221..1f00556187 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -605,6 +605,7 @@ def __init__(
 ascending_cores: bool = True,
 append_prefix_timestamp: bool = True,
 start_on_init: bool = True,
+name: str | None = None,
 **app_params: Unpack[TestPmdParamsDict],
 ) -> None:
 """Overrides :meth:`~.dpdk_shell.DPDKShell.__init__`. Changes 
app_params to kwargs."""
@@ -617,6 +618,7 @@ def __init__(
 append_prefix_timestamp,
 start_on_init,
 TestPmdParams(**app_params),
+name,
 )
 
 def start(self, verify: bool = True) -> None:
diff --git a/dts/framework/testbed_model/traffic_generator/scapy.py 
b/dts/framework/testbed_model/traffic_generator/scapy.py
index c3648134fc..ca0ea6aca3 100644
--- a/dts/framework/testbed_model/traffic_generator/scapy.py
+++ b/dts/framework/testbed_model/traffic_generator/scapy.py
@@ -261,7 +261,9 @@ def __init__(self, tg_node: Node, config: 
ScapyTrafficGeneratorConfig):
 self._tg_node.config.os == OS.linux
 ), "Linux is the only supported OS for scapy traffic generation"
 
-self.session = PythonShell(self._tg_node, timeout=5, privileged=True)
+self.session = PythonShell(
+self._tg_node, timeout=5, privileged=True, name="ScapyXMLRPCServer"
+)
 
 # import libs in remote python console
 for import_statement in SCAPY_RPC_SE

[PATCH v4 02/13] net/iavf: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The iavf driver relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/iavf/iavf_rxtx_vec_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/iavf/iavf_rxtx_vec_sse.c 
b/drivers/net/iavf/iavf_rxtx_vec_sse.c
index 96f187f511..75270876c1 100644
--- a/drivers/net/iavf/iavf_rxtx_vec_sse.c
+++ b/drivers/net/iavf/iavf_rxtx_vec_sse.c
@@ -5,13 +5,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "iavf.h"
 #include "iavf_rxtx.h"
 #include "iavf_rxtx_vec_common.h"
 
-#include 
-
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
-- 
2.34.1



[PATCH v4 00/13] Optionally have rte_memcpy delegate to compiler memcpy

2024-06-20 Thread Mattias Rönnblom
This patch set makes DPDK library, driver, and application code use the
compiler/libc memcpy() by default when functions in  are
invoked.

The various custom DPDK rte_memcpy() implementations may be retained
by means of a build-time option.

This patch set only makes a difference on x86, PPC and ARM. Loongarch
and RISCV already used compiler/libc memcpy().

This patch set includes a number of fixes in drivers and libraries
which erroneously relied on  including header files
(i.e., ) required by its implementation.

Mattias Rönnblom (13):
  net/i40e: add missing vector API header include
  net/iavf: add missing vector API header include
  net/ice: add missing vector API header include
  net/ixgbe: add missing vector API header include
  net/ngbe: add missing vector API header include
  net/txgbe: add missing vector API header include
  net/virtio: add missing vector API header include
  net/fm10k: add missing vector API header include
  event/dlb2: include headers for vector and memory copy APIs
  net/octeon_ep: add missing vector API header include
  distributor: add missing vector API header include
  fib: add missing vector API header include
  eal: provide option to use compiler memcpy instead of RTE

 config/meson.build  |  1 +
 doc/guides/rel_notes/release_24_07.rst  | 21 +++
 drivers/event/dlb2/dlb2.c   |  2 +
 drivers/net/fm10k/fm10k_rxtx_vec.c  |  3 +-
 drivers/net/i40e/i40e_rxtx_vec_sse.c|  3 +-
 drivers/net/iavf/iavf_rxtx_vec_sse.c|  3 +-
 drivers/net/ice/ice_rxtx_vec_sse.c  |  2 +-
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c  |  3 +-
 drivers/net/ngbe/ngbe_rxtx_vec_sse.c|  3 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c   |  2 +
 drivers/net/txgbe/txgbe_rxtx_vec_sse.c  |  3 +-
 drivers/net/virtio/virtio_rxtx_simple_sse.c |  3 +-
 lib/distributor/rte_distributor.c   |  1 +
 lib/eal/arm/include/rte_memcpy.h| 10 
 lib/eal/include/generic/rte_memcpy.h| 61 ++---
 lib/eal/loongarch/include/rte_memcpy.h  | 53 ++
 lib/eal/ppc/include/rte_memcpy.h| 10 
 lib/eal/riscv/include/rte_memcpy.h  | 53 ++
 lib/eal/x86/include/meson.build |  1 +
 lib/eal/x86/include/rte_memcpy.h| 11 +++-
 lib/fib/trie.c  |  1 +
 meson_options.txt   |  2 +
 22 files changed, 131 insertions(+), 121 deletions(-)

-- 
2.34.1



[PATCH v4 01/13] net/i40e: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The i40e driver relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/i40e/i40e_rxtx_vec_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c 
b/drivers/net/i40e/i40e_rxtx_vec_sse.c
index 2d4480a765..0a0448544f 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_sse.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "base/i40e_prototype.h"
 #include "base/i40e_type.h"
@@ -12,8 +13,6 @@
 #include "i40e_rxtx.h"
 #include "i40e_rxtx_vec_common.h"
 
-#include 
-
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
-- 
2.34.1



[PATCH v4 04/13] net/ixgbe: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The ixgbe driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
index f60808d576..0f93f58745 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c
@@ -5,13 +5,12 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ixgbe_ethdev.h"
 #include "ixgbe_rxtx.h"
 #include "ixgbe_rxtx_vec_common.h"
 
-#include 
-
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
-- 
2.34.1



[PATCH v4 03/13] net/ice: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The ice driver relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/ice/ice_rxtx_vec_sse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ice/ice_rxtx_vec_sse.c 
b/drivers/net/ice/ice_rxtx_vec_sse.c
index 9a1b7e3e51..c01d8ede29 100644
--- a/drivers/net/ice/ice_rxtx_vec_sse.c
+++ b/drivers/net/ice/ice_rxtx_vec_sse.c
@@ -4,7 +4,7 @@
 
 #include "ice_rxtx_vec_common.h"
 
-#include 
+#include 
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
-- 
2.34.1



[PATCH v4 06/13] net/txgbe: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The txgbe driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/txgbe/txgbe_rxtx_vec_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c 
b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
index 12eb4aeef5..d5f60ec92e 100644
--- a/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
+++ b/drivers/net/txgbe/txgbe_rxtx_vec_sse.c
@@ -5,13 +5,12 @@
 
 #include 
 #include 
+#include 
 
 #include "txgbe_ethdev.h"
 #include "txgbe_rxtx.h"
 #include "txgbe_rxtx_vec_common.h"
 
-#include 
-
 static inline void
 txgbe_rxq_rearm(struct txgbe_rx_queue *rxq)
 {
-- 
2.34.1



[PATCH v4 07/13] net/virtio: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The virtio driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/virtio/virtio_rxtx_simple_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.c 
b/drivers/net/virtio/virtio_rxtx_simple_sse.c
index 6a18741b6d..db84a308e4 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_sse.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_sse.c
@@ -8,8 +8,6 @@
 #include 
 #include 
 
-#include 
-
 #include 
 #include 
 #include 
@@ -22,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "virtio_rxtx_simple.h"
 
-- 
2.34.1



[PATCH v4 08/13] net/fm10k: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The fm10k PMD relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 drivers/net/fm10k/fm10k_rxtx_vec.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c 
b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2b6914b1da..6be8822284 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -6,11 +6,10 @@
 
 #include 
 #include 
+#include 
 #include "fm10k.h"
 #include "base/fm10k_type.h"
 
-#include 
-
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
-- 
2.34.1



[PATCH v4 09/13] event/dlb2: include headers for vector and memory copy APIs

2024-06-20 Thread Mattias Rönnblom
The DLB2 PMD depended on  being included as a side-effect
of  being included.

In addition, DLB2 used rte_memcpy() but did not include ,
but rather depended on other include files to do so.

This patch addresses both of those issues.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 drivers/event/dlb2/dlb2.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 0b91f03956..19f90b8f8d 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -25,11 +25,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "dlb2_priv.h"
 #include "dlb2_iface.h"
-- 
2.34.1



[PATCH v4 10/13] net/octeon_ep: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The octeon_ep driver relied on , but failed to provide a
direct include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Stephen Hemminger 
---
 drivers/net/octeon_ep/otx_ep_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/octeon_ep/otx_ep_ethdev.c 
b/drivers/net/octeon_ep/otx_ep_ethdev.c
index 46211361a0..b069216629 100644
--- a/drivers/net/octeon_ep/otx_ep_ethdev.c
+++ b/drivers/net/octeon_ep/otx_ep_ethdev.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 
+#include 
+
 #include "otx_ep_common.h"
 #include "otx_ep_vf.h"
 #include "otx2_ep_vf.h"
-- 
2.34.1



[PATCH v4 05/13] net/ngbe: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The ngbe driver relied on , but failed to provide a direct
include of this file.

Signed-off-by: Mattias Rönnblom 
---
 drivers/net/ngbe/ngbe_rxtx_vec_sse.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c 
b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
index f703d0ea15..80d0bedcdd 100644
--- a/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
+++ b/drivers/net/ngbe/ngbe_rxtx_vec_sse.c
@@ -5,14 +5,13 @@
 
 #include 
 #include 
+#include 
 
 #include "ngbe_type.h"
 #include "ngbe_ethdev.h"
 #include "ngbe_rxtx.h"
 #include "ngbe_rxtx_vec_common.h"
 
-#include 
-
 static inline void
 ngbe_rxq_rearm(struct ngbe_rx_queue *rxq)
 {
-- 
2.34.1



[PATCH v4 13/13] eal: provide option to use compiler memcpy instead of RTE

2024-06-20 Thread Mattias Rönnblom
Provide build option to have functions in  delegate to
the standard compiler/libc memcpy(), instead of using the various
custom DPDK, handcrafted, per-architecture rte_memcpy()
implementations.

A new meson build option 'use_cc_memcpy' is added. By default,
the compiler/libc memcpy() is used.

The performance benefits of the custom DPDK rte_memcpy()
implementations have been diminishing with every compiler release, and
with current toolchains the use of a custom memcpy() implementation
may even be a liability.

This patch leaves an option to stay on the custom DPDK implementations,
should that prove beneficial for certain applications or architectures.

An additional benefit of this change is that compilers and static
analysis tools have an easier time detecting incorrect usage of
rte_memcpy() (e.g., buffer overruns, or overlapping source and
destination buffers).
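
For illustration only, the delegation described above could look roughly
like the sketch below. This is not the code from the patch: the wrapper
name sketch_memcpy() and the self-test helper are hypothetical, the byte
loop merely stands in for a handcrafted per-architecture routine, and the
macro is defined by hand here on the assumption that the build would
define it when 'use_cc_memcpy' is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the build defines this macro when use_cc_memcpy is enabled. */
#define RTE_USE_CC_MEMCPY 1

/* Hypothetical wrapper for illustration, not the DPDK rte_memcpy() itself. */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
#ifdef RTE_USE_CC_MEMCPY
	/* Delegate to the compiler/libc memcpy(). */
	return memcpy(dst, src, n);
#else
	/* Stand-in for a handcrafted, per-architecture copy routine. */
	uint8_t *d = dst;
	const uint8_t *s = src;

	while (n-- > 0)
		*d++ = *s++;
	return dst;
#endif
}

/* Small self-check: copy a buffer and compare the result. Returns 1 on success. */
static inline int
sketch_memcpy_selftest(void)
{
	const char src[] = "dpdk";
	char dst[sizeof(src)] = {0};

	/* Like memcpy(), the wrapper returns the destination pointer. */
	assert(sketch_memcpy(dst, src, sizeof(src)) == dst);
	return memcmp(dst, src, sizeof(src)) == 0;
}
```

Because the wrapper reduces to a plain memcpy() call in the default
configuration, the compiler and static analysis tools see an ordinary
memcpy() and can apply their usual checks to it.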

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 

---

PATCH:
 o Add entry in release notes.
 o Update meson help text.

RFC v3:
 o Fix missing #endif on loongarch.
 o PPC and RISCV now implemented, meaning all architectures are supported.
 o Unnecessary  include is removed from .

RFC v2:
 * Fix bug where rte_memcpy.h was not installed on x86.
 * Made attempt to make Loongarch compile.
---
 config/meson.build |  1 +
 doc/guides/rel_notes/release_24_07.rst | 21 +
 lib/eal/arm/include/rte_memcpy.h   | 10 +
 lib/eal/include/generic/rte_memcpy.h   | 61 +++---
 lib/eal/loongarch/include/rte_memcpy.h | 53 ++
 lib/eal/ppc/include/rte_memcpy.h   | 10 +
 lib/eal/riscv/include/rte_memcpy.h | 53 ++
 lib/eal/x86/include/meson.build|  1 +
 lib/eal/x86/include/rte_memcpy.h   | 11 -
 meson_options.txt  |  2 +
 10 files changed, 117 insertions(+), 106 deletions(-)

diff --git a/config/meson.build b/config/meson.build
index 8c8b019c25..456056628e 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -353,6 +353,7 @@ endforeach
 # set other values pulled from the build options
 dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports'))
 dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet'))
+dpdk_conf.set('RTE_USE_CC_MEMCPY', get_option('use_cc_memcpy'))
 dpdk_conf.set('RTE_ENABLE_STDATOMIC', get_option('enable_stdatomic'))
 dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp'))
 dpdk_conf.set('RTE_PKTMBUF_HEADROOM', get_option('pkt_mbuf_headroom'))
diff --git a/doc/guides/rel_notes/release_24_07.rst 
b/doc/guides/rel_notes/release_24_07.rst
index 7c88de381b..ebe0085d8b 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -24,6 +24,27 @@ DPDK Release 24.07
 New Features
 
 
+* **Compiler memcpy replaces custom DPDK implementation.**
+
+  The memory copy functions of  now delegate to the
+  standard memcpy() function, implemented by the compiler and the C
+  runtime (e.g., libc).
+
+  In this release of DPDK, the handcrafted, per-architecture memory
+  copy implementations are still available, and may be reactivated by
+  setting the new ``use_cc_memcpy`` build option to false.
+
+  The performance benefits of the custom DPDK rte_memcpy()
+  implementations have been diminishing with every new compiler
+  release, and with current toolchains the use of a custom memcpy()
+  implementation may even result in worse performance than the
+  standard memcpy().
+
+  An additional benefit of this change is that compilers and static
+  analysis tools have an easier time detecting incorrect usage of
+  rte_memcpy() (e.g., buffer overruns, or overlapping source and
+  destination buffers).
+
 .. This section should contain new features added in this release.
Sample format:
 
diff --git a/lib/eal/arm/include/rte_memcpy.h b/lib/eal/arm/include/rte_memcpy.h
index 47dea9a8cc..e8aff722df 100644
--- a/lib/eal/arm/include/rte_memcpy.h
+++ b/lib/eal/arm/include/rte_memcpy.h
@@ -5,10 +5,20 @@
 #ifndef _RTE_MEMCPY_ARM_H_
 #define _RTE_MEMCPY_ARM_H_
 
+#include 
+
+#ifdef RTE_USE_CC_MEMCPY
+
+#include 
+
+#else
+
 #ifdef RTE_ARCH_64
 #include 
 #else
 #include 
 #endif
 
+#endif /* RTE_USE_CC_MEMCPY */
+
 #endif /* _RTE_MEMCPY_ARM_H_ */
diff --git a/lib/eal/include/generic/rte_memcpy.h 
b/lib/eal/include/generic/rte_memcpy.h
index e7f0f8eaa9..cae06117fb 100644
--- a/lib/eal/include/generic/rte_memcpy.h
+++ b/lib/eal/include/generic/rte_memcpy.h
@@ -5,12 +5,19 @@
 #ifndef _RTE_MEMCPY_H_
 #define _RTE_MEMCPY_H_
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * @file
  *
  * Functions for vectorised implementation of memcpy().
  */
 
+#include 
+#include 
+
 /**
  * Copy 16 bytes from one location to another using optimised
  * instructions. The locations should not overlap.
@@ -35,8 +42,6 @@ rte_mov16(uint8_t *dst, const uint8_t *src);
 static inline void
 rte_mov32(uint8_t *dst, const uint8_t *src);
 
-#ifdef __DOXYGEN__
-
 /**
  * Copy 

[PATCH v4 11/13] distributor: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The distributor library relied on , but failed to provide
a direct include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
---
 lib/distributor/rte_distributor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/distributor/rte_distributor.c 
b/lib/distributor/rte_distributor.c
index e58727cdc2..1389efc03f 100644
--- a/lib/distributor/rte_distributor.c
+++ b/lib/distributor/rte_distributor.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_distributor.h"
 #include "rte_distributor_single.h"
-- 
2.34.1



[PATCH v4 12/13] fib: add missing vector API header include

2024-06-20 Thread Mattias Rönnblom
The trie implementation of the fib library relied on , but
failed to provide a direct include of this file.

Signed-off-by: Mattias Rönnblom 
Acked-by: Bruce Richardson 
Acked-by: Stephen Hemminger 
---
 lib/fib/trie.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/fib/trie.c b/lib/fib/trie.c
index 09470e7287..74db8863df 100644
--- a/lib/fib/trie.c
+++ b/lib/fib/trie.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
-- 
2.34.1



RE: [PATCH v1 7/9] test/bbdev: check assumptions on fft window

2024-06-20 Thread Chautru, Nicolas
Hi Maxime, 

> -Original Message-
> From: Maxime Coquelin 
> Sent: Wednesday, June 12, 2024 4:11 AM
> To: Vargas, Hernan ; dev@dpdk.org;
> gak...@marvell.com; t...@redhat.com
> Cc: Chautru, Nicolas ; Zhang, Qi Z
> 
> Subject: Re: [PATCH v1 7/9] test/bbdev: check assumptions on fft window
> 
> 
> 
> On 4/22/24 21:07, Hernan Vargas wrote:
> > Add check for FFT window width.
> >
> > Signed-off-by: Hernan Vargas 
> > ---
> >   app/test-bbdev/test_bbdev_perf.c   | 26 ++
> >   app/test-bbdev/test_bbdev_vector.c | 14 ++
> >   app/test-bbdev/test_bbdev_vector.h |  2 ++
> >   3 files changed, 38 insertions(+), 4 deletions(-)
> >
> > diff --git a/app/test-bbdev/test_bbdev_perf.c
> > b/app/test-bbdev/test_bbdev_perf.c
> > index 28d78e73a9c1..57b21730cab2 100644
> > --- a/app/test-bbdev/test_bbdev_perf.c
> > +++ b/app/test-bbdev/test_bbdev_perf.c
> > @@ -106,6 +106,8 @@ static int ldpc_llr_decimals;
> >   static int ldpc_llr_size;
> >   /* Keep track of the LDPC decoder device capability flag */
> >   static uint32_t ldpc_cap_flags;
> > +/* FFT window width predefined on device and on vector. */ static int
> > +fft_window_width_dev;
> >
> >   /* Represents tested active devices */
> >   static struct active_device {
> > @@ -881,6 +883,13 @@ add_bbdev_dev(uint8_t dev_id, struct
> rte_bbdev_info *info,
> > rte_bbdev_info_get(dev_id, info);
> > if (info->drv.device_status == RTE_BBDEV_DEV_FATAL_ERR)
> > printf("Device Status %s\n",
> > rte_bbdev_device_status_str(info->drv.device_status));
> > +   if (info->drv.fft_window_width != NULL)
> > +   fft_window_width_dev = info->drv.fft_window_width[0];
> > +   else
> > +   fft_window_width_dev = 0;
> > +   if (fft_window_width_dev != 0)
> > +   printf("  FFT Window0 width %d\n", fft_window_width_dev);
> 
> Why not print the value systematically?

It would only be zero if the application was not able to get that information, 
hence that would be irrelevant. 
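
To make the intent concrete, here is a rough sketch of the two checks as
I read them from the quoted patch. The helper names are hypothetical, and
the uint16_t element type of the width array is an assumption; only the
conditions themselves are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: width of window 0 as reported by the device,
 * or 0 when the driver did not provide the information.
 * Assumption: the width array holds uint16_t entries.
 */
static inline int
sketch_dev_window_width(const uint16_t *widths)
{
	return widths != NULL ? widths[0] : 0;
}

/*
 * Hypothetical helper: output validation is skipped only when the test
 * vector declares a width (> 0) and it differs from the device width.
 */
static inline bool
sketch_skip_fft_validation(int vec_width, int dev_width)
{
	/* Widths are never negative in this sketch. */
	assert(vec_width >= 0 && dev_width >= 0);
	return vec_width > 0 && vec_width != dev_width;
}
```

A device width of 0 therefore only means "unknown", which is why the
test does not print it in that case.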

> 
> > +
> > nb_queues = RTE_MIN(rte_lcore_count(), info-
> >drv.max_num_queues);
> > nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
> >
> > @@ -2583,7 +2592,8 @@ validate_ldpc_enc_op(struct rte_bbdev_enc_op
> **ops, const uint16_t n,
> >   }
> >
> >   static inline int
> > -validate_op_fft_chain(struct rte_bbdev_op_data *op, struct
> > op_data_entries *orig_op)
> > +validate_op_fft_chain(struct rte_bbdev_op_data *op, struct
> op_data_entries *orig_op,
> > +   bool skip_validate_output)
> >   {
> > struct rte_mbuf *m = op->data;
> > uint8_t i, nb_dst_segments = orig_op->nb_segments; @@ -2613,7
> > +2623,7 @@ validate_op_fft_chain(struct rte_bbdev_op_data *op, struct
> op_data_entries *orig
> > abs_delt = delt > 0 ? delt : -delt;
> > error_num += (abs_delt > thres_hold ? 1 : 0);
> > }
> > -   if (error_num > 0) {
> > +   if ((error_num > 0) && !skip_validate_output) {
> > rte_memdump(stdout, "Buffer A", ref_out, data_len);
> > rte_memdump(stdout, "Buffer B", op_out, data_len);
> > TEST_ASSERT(error_num == 0,
> > @@ -2686,16 +2696,24 @@ validate_fft_op(struct rte_bbdev_fft_op **ops,
> const uint16_t n,
> > int ret;
> > struct op_data_entries *fft_data_orig =
> &test_vector.entries[DATA_HARD_OUTPUT];
> > struct op_data_entries *fft_pwr_orig =
> > &test_vector.entries[DATA_SOFT_OUTPUT];
> > +   bool skip_validate_output = false;
> > +
> > +   if ((test_vector.fft_window_width_vec > 0) &&
> > +   (test_vector.fft_window_width_vec !=
> fft_window_width_dev)) {
> > +   printf("The vector FFT width doesn't match with device - skip
> %d %d\n",
> > +   test_vector.fft_window_width_vec,
> fft_window_width_dev);
> > +   skip_validate_output = true;
> > +   }
> >
> > for (i = 0; i < n; ++i) {
> > ret = check_fft_status_and_ordering(ops[i], i, ref_op->status);
> > TEST_ASSERT_SUCCESS(ret, "Checking status and ordering for
> FFT failed");
> > TEST_ASSERT_SUCCESS(validate_op_fft_chain(
> > -   &ops[i]->fft.base_output, fft_data_orig),
> > +   &ops[i]->fft.base_output, fft_data_orig,
> skip_validate_output),
> > "FFT Output buffers (op=%u) are not
> matched", i);
> > if (check_bit(ops[i]->fft.op_flags,
> RTE_BBDEV_FFT_POWER_MEAS))
> > TEST_ASSERT_SUCCESS(validate_op_fft_chain(
> > -   &ops[i]->fft.power_meas_output,
> fft_pwr_orig),
> > +   &ops[i]->fft.power_meas_output,
> fft_pwr_orig,
> > +skip_validate_output),
> > "FFT Power Output buffers (op=%u) are not
> matched", i);
> > }
> >
> > diff --git a/app/test-bbdev/test_bbdev_vector.c
> > b/app/test-bbdev/test_bbdev_vector.c
> > index b3e9d4bb7504..
