RE: [RFC] ethdev: fast path async flow API

2024-01-04 Thread Konstantin Ananyev



> > This is a blocker, showstopper for me.
> +1
> 
> > Have you considered having something like
> >rte_flow_create_bulk()
> >
> > or better yet a Linux io_uring style API?
> >
> > A ring style API would allow for better mixed operations across the board 
> > and
> > get rid of the I-cache overhead which is the root cause of the needing 
> > inline.
> Existing async flow API is somewhat close to the io_uring interface.
> The difference being that queue is not directly exposed to the application.
> Application interacts with the queue using rte_flow_async_* APIs (e.g., 
> places operations in the queue, pushes them to the HW).
> Such design has some benefits over a flow API which exposes the queue to the 
> user:
> - Easier to use - Applications do not manage the queue directly, they do it 
> through exposed APIs.
> - Consistent with other DPDK APIs - In other libraries, queues are 
> manipulated through API, not directly by an application.
> - Lower memory usage - only HW primitives are needed (e.g., HW queue on PMD 
> side), no need to allocate separate application
> queues.
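That queue-hidden model can be illustrated with a plain-C toy (names such as flow_async_create/flow_push/flow_pull only mirror the shape of the rte_flow_async_* calls; none of this is the real DPDK API): the application stages operations through functions, while the queue itself stays internal to the library.

```c
#include <stdint.h>

#define QUEUE_DEPTH 64

struct flow_op { uint32_t rule_id; int done; };

/* The queue is internal to the "library"; the application only sees
 * the three functions below, never this storage. */
static struct flow_op staged[QUEUE_DEPTH];
static uint32_t n_staged, n_pushed;

/* Stage one flow-rule operation (shape of rte_flow_async_create()). */
static int flow_async_create(uint32_t rule_id)
{
    if (n_staged == QUEUE_DEPTH)
        return -1;
    staged[n_staged++] = (struct flow_op){ .rule_id = rule_id, .done = 0 };
    return 0;
}

/* Push all staged operations to "hardware" (shape of rte_flow_push()). */
static void flow_push(void)
{
    while (n_pushed < n_staged)
        staged[n_pushed++].done = 1; /* pretend HW completed it */
}

/* Retrieve completions (shape of rte_flow_pull()). */
static uint32_t flow_pull(struct flow_op *res, uint32_t n_res)
{
    uint32_t n = 0;

    while (n < n_res && n < n_pushed && staged[n].done)
        res[n] = staged[n], n++;
    return n;
}
```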
> 
> Bulking of flow operations is a tricky subject.
> Compared to packet processing, where it is desired to keep the manipulation 
> of raw packet data to the minimum (e.g., only packet
> headers are accessed),
> during flow rule creation all items and actions must be processed by the PMD 
> to create a flow rule.
> The amount of memory consumed by items and actions themselves during this 
> process might be non-negligible.
> If flow rule operations were bulked, the size of the working set of memory 
> would increase, which could have negative consequences on
> the cache behavior.
> So, it might be the case that by utilizing bulking the I-cache overhead is 
> removed, but the D-cache overhead is added.

Is rte_flow struct really that big?
We do bulk processing for mbufs, crypto_ops, etc., and usually bulk processing 
improves performance, not degrades it.
Of course bulk size has to be somewhat reasonable.

> On the other hand, creating flow rule operations (or enqueuing flow rule 
> operations) one by one enables applications to reuse the
> same memory for different flow rules.
> 
> In summary, in my opinion extending the async flow API with bulking 
> capabilities or exposing the queue directly to the application is
> not desirable.
> This proposal aims to reduce the I-cache overhead in async flow API by 
> reusing the existing design pattern in DPDK - fast path
> functions are inlined to the application code and they call cached PMD 
> callbacks.
> 
> Best regards,
> Dariusz Sosnowski
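The design pattern mentioned at the end — inline fast-path functions calling PMD callbacks cached in a flat per-port table, as rte_eth_rx_burst() already does — boils down to something like this sketch (all names illustrative, not the proposed API):

```c
#include <stdint.h>

typedef int (*flow_op_cb)(void *dev_private, uint32_t queue_id);

/* Flat per-port table of cached callbacks; in DPDK this would live in
 * the library next to the other fast-path ops. */
struct port_fp_ops {
    flow_op_cb async_create; /* cached PMD callback */
    void *dev_private;       /* cached device private data */
};

static struct port_fp_ops fp_ops[8];

/* A PMD registers a callback like this once at configure time. */
static int pmd_async_create(void *priv, uint32_t queue_id)
{
    (void)priv;
    return (int)queue_id; /* stand-in for real work */
}

/* The public API is a static inline wrapper: it compiles into the
 * application code, so a call costs one table load plus an indirect
 * call, with no extra library-call I-cache miss. */
static inline int flow_async_create(uint16_t port_id, uint32_t queue_id)
{
    struct port_fp_ops *ops = &fp_ops[port_id];

    return ops->async_create(ops->dev_private, queue_id);
}
```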


RE: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation verify

2024-01-04 Thread Suanming Mou
Hi,

> -Original Message-
> From: Anoob Joseph 
> Sent: Thursday, January 4, 2024 1:13 PM
> To: Suanming Mou ; Ciara Power
> 
> Cc: dev@dpdk.org
> Subject: RE: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation 
> verify
> 
> Hi Suanming,
> 
> Please see inline.
> 
> Thanks,
> Anoob
> 
> > -Original Message-
> > From: Suanming Mou 
> > Sent: Wednesday, January 3, 2024 9:26 AM
> > To: Ciara Power 
> > Cc: dev@dpdk.org
> > Subject: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation
> > verify
> >
> > AEAD users RTE_CRYPTO_AEAD_OP_* with aead_op and CIPHER uses
> [Anoob] users -> uses
> 
> > RTE_CRYPTO_CIPHER_OP_* with cipher_op in current code.
> >
> > This commit aligns aead_op and cipher_op operation to fix incorrect
> > AEAD verification.
> >
> > Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test
> > type")
> >
> > Signed-off-by: Suanming Mou 
> > ---
> >  app/test-crypto-perf/cperf_test_verify.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/app/test-crypto-perf/cperf_test_verify.c
> > b/app/test-crypto- perf/cperf_test_verify.c index
> > 8aa714b969..525a2b1373 100644
> > --- a/app/test-crypto-perf/cperf_test_verify.c
> > +++ b/app/test-crypto-perf/cperf_test_verify.c
> > @@ -113,6 +113,7 @@ cperf_verify_op(struct rte_crypto_op *op,
> > uint8_t *data;
> > uint32_t cipher_offset, auth_offset;
> > uint8_t cipher, auth;
> > +   bool is_encrypt = false;
> > int res = 0;
> >
> > if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) @@ -154,12
> > +155,14 @@ cperf_verify_op(struct rte_crypto_op *op,
> > cipher_offset = 0;
> > auth = 0;
> > auth_offset = 0;
> > +   is_encrypt = options->cipher_op ==
> > RTE_CRYPTO_CIPHER_OP_ENCRYPT;
> > break;
> > case CPERF_CIPHER_THEN_AUTH:
> > cipher = 1;
> > cipher_offset = 0;
> > auth = 1;
> > auth_offset = options->test_buffer_size;
> > +   is_encrypt = options->cipher_op ==
> > RTE_CRYPTO_CIPHER_OP_ENCRYPT;
> > break;
> > case CPERF_AUTH_ONLY:
> > cipher = 0;
> > @@ -172,12 +175,14 @@ cperf_verify_op(struct rte_crypto_op *op,
> > cipher_offset = 0;
> > auth = 1;
> > auth_offset = options->test_buffer_size;
> > +   is_encrypt = options->cipher_op ==
> > RTE_CRYPTO_CIPHER_OP_ENCRYPT;
> > break;
> > case CPERF_AEAD:
> > cipher = 1;
> > cipher_offset = 0;
> > -   auth = 1;
> > +   auth = options->aead_op == RTE_CRYPTO_AEAD_OP_ENCRYPT;
> > auth_offset = options->test_buffer_size;
> > +   is_encrypt = !!auth;
> > break;
> > default:
> > res = 1;
> > @@ -185,7 +190,7 @@ cperf_verify_op(struct rte_crypto_op *op,
> > }
> >
> > if (cipher == 1) {
> > -   if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT)
> > +   if (is_encrypt)
> 
> [Anoob] A similar check is there under 'auth == 1' check, right? Won't that 
> also
> need fixing?
> 
>   if (auth == 1) {
>   if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE)
> 
> I think some renaming of the local variables might make code better.
> bool cipher, digest_verify = false, is_encrypt = false;
> 
>   case CPERF_CIPHER_THEN_AUTH:
>   cipher = true;
>   cipher_offset = 0;
>   if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) {
>   is_encrypt = true;
>   digest_verify = true; /* Assumption - options->auth_op
> == RTE_CRYPTO_AUTH_OP_GENERATE is verified elsewhere */
>   auth_offset = options->test_buffer_size;
>   }
>   break;
>   <...>
>   case CPERF_AEAD:
>   cipher = true;
>   cipher_offset = 0;
>  if (options->aead_op == 
> RTE_CRYPTO_AEAD_OP_ENCRYPT) {
>   is_encrypt = true;
>   digest_verify = true;
>   auth_offset = options->test_buffer_size;
>   }
> 
> What do you think?

Yes, so we can totally remove the auth for now. I will do that. Thanks for the 
suggestion.

> 
> > res += !!memcmp(data + cipher_offset,
> > vector->ciphertext.data,
> > options->test_buffer_size);
> > --
> > 2.34.1
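Anoob's proposed restructuring can be condensed into a standalone sketch (enum and field names are stand-ins for the cperf ones): decide cipher/is_encrypt/digest_verify once per mode, so the verify step no longer needs a separate auth check for AEAD.

```c
#include <stdbool.h>

/* Stand-ins for the cperf option fields and crypto-op enums. */
enum op_type { CIPHER_ONLY, CIPHER_THEN_AUTH, AEAD };
enum { CIPHER_OP_ENCRYPT, CIPHER_OP_DECRYPT };
enum { AEAD_OP_ENCRYPT, AEAD_OP_DECRYPT };

struct opts { enum op_type op; int cipher_op; int aead_op; int buf_size; };

struct verify_plan {
    bool cipher;        /* compare ciphertext? */
    bool is_encrypt;    /* operation direction */
    bool digest_verify; /* compare digest appended after the payload? */
    int auth_offset;
};

static struct verify_plan make_plan(const struct opts *o)
{
    struct verify_plan p = { 0 };

    switch (o->op) {
    case CIPHER_ONLY:
        p.cipher = true;
        p.is_encrypt = o->cipher_op == CIPHER_OP_ENCRYPT;
        break;
    case CIPHER_THEN_AUTH:
        p.cipher = true;
        if (o->cipher_op == CIPHER_OP_ENCRYPT) {
            p.is_encrypt = true;
            p.digest_verify = true; /* assumes auth_op == GENERATE */
            p.auth_offset = o->buf_size;
        }
        break;
    case AEAD:
        p.cipher = true;
        if (o->aead_op == AEAD_OP_ENCRYPT) {
            p.is_encrypt = true;
            p.digest_verify = true;
            p.auth_offset = o->buf_size;
        }
        break;
    }
    return p;
}
```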



[PATCH 1/2] config/arm: fix CN10K minimum march requirement

2024-01-04 Thread pbhagavatula
From: Pavan Nikhilesh 

Meson selects march and mcpu based on compiler support and
part number; only the minimum required march should be defined
in the cross-compile configuration file.

Fixes: 1b4c86a721c9 ("config/arm: add Marvell CN10K")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh 
---
 config/arm/arm64_cn10k_linux_gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config/arm/arm64_cn10k_linux_gcc b/config/arm/arm64_cn10k_linux_gcc
index fa904af5d0..801a7ededd 100644
--- a/config/arm/arm64_cn10k_linux_gcc
+++ b/config/arm/arm64_cn10k_linux_gcc
@@ -10,7 +10,7 @@ cmake = 'cmake'
 [host_machine]
 system = 'linux'
 cpu_family = 'aarch64'
-cpu = 'armv8.6-a'
+cpu = 'armv8-a'
 endian = 'little'
 
 [properties]
-- 
2.25.1



[PATCH 2/2] config/arm: add armv9-a march

2024-01-04 Thread pbhagavatula
From: Pavan Nikhilesh 

Now that major versions of GCC recognize the armv9-a march option,
add it to the list of supported march values.
Update the neoverse-n2 part number entry to use armv9-a as its march.

Signed-off-by: Pavan Nikhilesh 
---
 config/arm/meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 36f21d2259..0804877b57 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -92,6 +92,7 @@ part_number_config_arm = {
 'march': 'armv8.4-a',
 },
 '0xd49': {
+'march': 'armv9-a',
 'march_features': ['sve2'],
 'compiler_options': ['-mcpu=neoverse-n2'],
 'flags': [
@@ -701,7 +702,7 @@ if update_flags
 if part_number_config.get('force_march', false)
 candidate_march = part_number_config['march']
 else
-supported_marchs = ['armv8.6-a', 'armv8.5-a', 'armv8.4-a', 
'armv8.3-a',
+supported_marchs = ['armv9-a', 'armv8.6-a', 'armv8.5-a', 
'armv8.4-a', 'armv8.3-a',
 'armv8.2-a', 'armv8.1-a', 'armv8-a']
 check_compiler_support = false
 foreach supported_march: supported_marchs
-- 
2.25.1
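The selection logic this patch extends — start from the part number's preferred march and walk down the supported list until the compiler accepts one — can be approximated in C (cc_supports() is a stand-in for meson's cc.has_argument('-march=...') probe; here it pretends the installed GCC does not yet know armv9-a):

```c
#include <string.h>

/* Supported marches, newest first, as in config/arm/meson.build. */
static const char *const supported[] = {
    "armv9-a", "armv8.6-a", "armv8.5-a", "armv8.4-a",
    "armv8.3-a", "armv8.2-a", "armv8.1-a", "armv8-a",
};
#define N_SUPPORTED (sizeof(supported) / sizeof(supported[0]))

/* Stand-in for meson's cc.has_argument('-march=...') compiler probe. */
static int cc_supports(const char *march)
{
    return strcmp(march, "armv9-a") != 0; /* pretend armv9-a is unknown */
}

/* Walk from the part number's preferred march down the list until the
 * compiler accepts one. */
static const char *pick_march(const char *candidate)
{
    size_t i = 0;

    while (i < N_SUPPORTED && strcmp(supported[i], candidate) != 0)
        i++;
    for (; i < N_SUPPORTED; i++)
        if (cc_supports(supported[i]))
            return supported[i];
    return NULL; /* no usable march found */
}
```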



[PATCH v9 0/2] net/iavf: fix Rx/Tx burst and add diagnostics

2024-01-04 Thread Mingjin Ye
Fixed an Rx/Tx crash in a multi-process environment and
added a Tx diagnostic feature.

Mingjin Ye (2):
  net/iavf: fix Rx/Tx burst in multi-process
  net/iavf: add diagnostic support in TX path

 doc/guides/nics/intel_vf.rst   |   9 ++
 drivers/net/iavf/iavf.h|  55 ++-
 drivers/net/iavf/iavf_ethdev.c |  75 +
 drivers/net/iavf/iavf_rxtx.c   | 283 ++---
 drivers/net/iavf/iavf_rxtx.h   |   2 +
 5 files changed, 365 insertions(+), 59 deletions(-)

-- 
2.25.1



[PATCH v9 1/2] net/iavf: fix Rx/Tx burst in multi-process

2024-01-04 Thread Mingjin Ye
In a multi-process environment, a secondary process operates on shared
memory and changes the function pointer of the primary process, resulting
in a crash when the primary process cannot find the function address
during an Rx/Tx burst.

Fixes: 5b3124a0a6ef ("net/iavf: support no polling when link down")
Cc: sta...@dpdk.org

Signed-off-by: Mingjin Ye 
---
v2: Add fix for Rx burst.
---
v3: fix Rx/Tx routing.
---
v4: Fix the ops array.
---
v5: rebase.
---
 drivers/net/iavf/iavf.h  |  43 +++-
 drivers/net/iavf/iavf_rxtx.c | 185 ---
 2 files changed, 169 insertions(+), 59 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index d273d884f5..ab24cb02c3 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -314,6 +314,45 @@ struct iavf_devargs {
 
 struct iavf_security_ctx;
 
+enum iavf_rx_burst_type {
+   IAVF_RX_DEFAULT,
+   IAVF_RX_FLEX_RXD,
+   IAVF_RX_BULK_ALLOC,
+   IAVF_RX_SCATTERED,
+   IAVF_RX_SCATTERED_FLEX_RXD,
+   IAVF_RX_SSE,
+   IAVF_RX_AVX2,
+   IAVF_RX_AVX2_OFFLOAD,
+   IAVF_RX_SSE_FLEX_RXD,
+   IAVF_RX_AVX2_FLEX_RXD,
+   IAVF_RX_AVX2_FLEX_RXD_OFFLOAD,
+   IAVF_RX_SSE_SCATTERED,
+   IAVF_RX_AVX2_SCATTERED,
+   IAVF_RX_AVX2_SCATTERED_OFFLOAD,
+   IAVF_RX_SSE_SCATTERED_FLEX_RXD,
+   IAVF_RX_AVX2_SCATTERED_FLEX_RXD,
+   IAVF_RX_AVX2_SCATTERED_FLEX_RXD_OFFLOAD,
+   IAVF_RX_AVX512,
+   IAVF_RX_AVX512_OFFLOAD,
+   IAVF_RX_AVX512_FLEX_RXD,
+   IAVF_RX_AVX512_FLEX_RXD_OFFLOAD,
+   IAVF_RX_AVX512_SCATTERED,
+   IAVF_RX_AVX512_SCATTERED_OFFLOAD,
+   IAVF_RX_AVX512_SCATTERED_FLEX_RXD,
+   IAVF_RX_AVX512_SCATTERED_FLEX_RXD_OFFLOAD,
+};
+
+enum iavf_tx_burst_type {
+   IAVF_TX_DEFAULT,
+   IAVF_TX_SSE,
+   IAVF_TX_AVX2,
+   IAVF_TX_AVX2_OFFLOAD,
+   IAVF_TX_AVX512,
+   IAVF_TX_AVX512_OFFLOAD,
+   IAVF_TX_AVX512_CTX,
+   IAVF_TX_AVX512_CTX_OFFLOAD,
+};
+
 /* Structure to store private data for each VF instance. */
 struct iavf_adapter {
struct iavf_hw hw;
@@ -329,8 +368,8 @@ struct iavf_adapter {
bool stopped;
bool closed;
bool no_poll;
-   eth_rx_burst_t rx_pkt_burst;
-   eth_tx_burst_t tx_pkt_burst;
+   enum iavf_rx_burst_type rx_burst_type;
+   enum iavf_tx_burst_type tx_burst_type;
uint16_t fdir_ref_cnt;
struct iavf_devargs devargs;
 };
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index e54fb74b79..f044ad3f26 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -3716,15 +3716,78 @@ iavf_prep_pkts(__rte_unused void *tx_queue, struct 
rte_mbuf **tx_pkts,
return i;
 }
 
+static
+const eth_rx_burst_t iavf_rx_pkt_burst_ops[] = {
+   [IAVF_RX_DEFAULT] = iavf_recv_pkts,
+   [IAVF_RX_FLEX_RXD] = iavf_recv_pkts_flex_rxd,
+   [IAVF_RX_BULK_ALLOC] = iavf_recv_pkts_bulk_alloc,
+   [IAVF_RX_SCATTERED] = iavf_recv_scattered_pkts,
+   [IAVF_RX_SCATTERED_FLEX_RXD] = iavf_recv_scattered_pkts_flex_rxd,
+#ifdef RTE_ARCH_X86
+   [IAVF_RX_SSE] = iavf_recv_pkts_vec,
+   [IAVF_RX_AVX2] = iavf_recv_pkts_vec_avx2,
+   [IAVF_RX_AVX2_OFFLOAD] = iavf_recv_pkts_vec_avx2_offload,
+   [IAVF_RX_SSE_FLEX_RXD] = iavf_recv_pkts_vec_flex_rxd,
+   [IAVF_RX_AVX2_FLEX_RXD] = iavf_recv_pkts_vec_avx2_flex_rxd,
+   [IAVF_RX_AVX2_FLEX_RXD_OFFLOAD] =
+   iavf_recv_pkts_vec_avx2_flex_rxd_offload,
+   [IAVF_RX_SSE_SCATTERED] = iavf_recv_scattered_pkts_vec,
+   [IAVF_RX_AVX2_SCATTERED] = iavf_recv_scattered_pkts_vec_avx2,
+   [IAVF_RX_AVX2_SCATTERED_OFFLOAD] =
+   iavf_recv_scattered_pkts_vec_avx2_offload,
+   [IAVF_RX_SSE_SCATTERED_FLEX_RXD] =
+   iavf_recv_scattered_pkts_vec_flex_rxd,
+   [IAVF_RX_AVX2_SCATTERED_FLEX_RXD] =
+   iavf_recv_scattered_pkts_vec_avx2_flex_rxd,
+   [IAVF_RX_AVX2_SCATTERED_FLEX_RXD_OFFLOAD] =
+   iavf_recv_scattered_pkts_vec_avx2_flex_rxd_offload,
+#ifdef CC_AVX512_SUPPORT
+   [IAVF_RX_AVX512] = iavf_recv_pkts_vec_avx512,
+   [IAVF_RX_AVX512_OFFLOAD] = iavf_recv_pkts_vec_avx512_offload,
+   [IAVF_RX_AVX512_FLEX_RXD] = iavf_recv_pkts_vec_avx512_flex_rxd,
+   [IAVF_RX_AVX512_FLEX_RXD_OFFLOAD] =
+   iavf_recv_pkts_vec_avx512_flex_rxd_offload,
+   [IAVF_RX_AVX512_SCATTERED] = iavf_recv_scattered_pkts_vec_avx512,
+   [IAVF_RX_AVX512_SCATTERED_OFFLOAD] =
+   iavf_recv_scattered_pkts_vec_avx512_offload,
+   [IAVF_RX_AVX512_SCATTERED_FLEX_RXD] =
+   iavf_recv_scattered_pkts_vec_avx512_flex_rxd,
+   [IAVF_RX_AVX512_SCATTERED_FLEX_RXD_OFFLOAD] =
+   iavf_recv_scattered_pkts_vec_avx512_flex_rxd_offload,
+#endif
+#elif defined RTE_ARCH_ARM
+   [IAVF_RX_SSE] = iavf_recv_pkts_vec,
+#endif
+};
+
+static
+const eth_tx_burst_t iavf_tx_pkt_burst_ops[] = {
+   [IAVF_TX_DEFAULT] = iavf_xmit_pkts,
+#
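The crux of this fix, independent of the iavf specifics: the shared adapter stores a process-independent enum instead of a function pointer, and each process resolves it through its own copy of the ops table, whose addresses are valid locally. A minimal sketch:

```c
#include <stddef.h>

typedef int (*rx_burst_t)(void);

enum rx_burst_type { RX_DEFAULT, RX_SSE, RX_NB };

static int rx_default(void) { return 1; } /* stand-in burst functions */
static int rx_sse(void)     { return 2; }

/* Each process has its own copy of this table, so the function
 * addresses stored here are always valid in the calling process. */
static const rx_burst_t rx_ops[RX_NB] = {
    [RX_DEFAULT] = rx_default,
    [RX_SSE]     = rx_sse,
};

/* The adapter lives in memory shared between primary and secondary
 * processes: it stores only the index, never a pointer. */
struct adapter { enum rx_burst_type rx_burst_type; };

static int rx_burst(const struct adapter *ad)
{
    return rx_ops[ad->rx_burst_type]();
}
```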

[PATCH v9 2/2] net/iavf: add diagnostic support in TX path

2024-01-04 Thread Mingjin Ye
The only way to enable diagnostics for TX paths is to modify the
application source code, making it difficult to diagnose faults.

In this patch, the devarg option "mbuf_check" is introduced and the
parameters are configured to enable the corresponding diagnostics.

supported cases: mbuf, size, segment, offload.
 1. mbuf: check for corrupted mbuf.
 2. size: check min/max packet length according to hw spec.
 3. segment: check number of mbuf segments not exceed hw limitation.
 4. offload: check any unsupported offload flag.

parameter format: mbuf_check=[mbuf,,]
eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i

Signed-off-by: Mingjin Ye 
---
v2: Remove call chain.
---
v3: Optimisation implementation.
---
v4: Fix Windows os compilation error.
---
v5: Split Patch.
---
v6: remove strict.
---
v8: Modify the description document.
---
 doc/guides/nics/intel_vf.rst   |  9 
 drivers/net/iavf/iavf.h| 12 +
 drivers/net/iavf/iavf_ethdev.c | 75 ++
 drivers/net/iavf/iavf_rxtx.c   | 98 ++
 drivers/net/iavf/iavf_rxtx.h   |  2 +
 5 files changed, 196 insertions(+)

diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index ce96c2e1f8..bf6936082e 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/doc/guides/nics/intel_vf.rst
@@ -111,6 +111,15 @@ For more detail on SR-IOV, please refer to the following 
documents:
 by setting the ``devargs`` parameter like ``-a 
18:01.0,no-poll-on-link-down=1``
 when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 
Series Ethernet device.
 
+When IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 
series Ethernet device,
+set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For 
example,
+``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``. 
Supported cases:
+
+*   mbuf: Check for corrupted mbuf.
+*   size: Check min/max packet length according to hw spec.
+*   segment: Check number of mbuf segments not exceed hw limitation.
+*   offload: Check any unsupported offload flag.
+
 The PCIE host-interface of Intel Ethernet Switch FM1 Series VF 
infrastructure
 
^
 
diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index ab24cb02c3..23c0496d54 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -114,9 +114,14 @@ struct iavf_ipsec_crypto_stats {
} ierrors;
 };
 
+struct iavf_mbuf_stats {
+   uint64_t tx_pkt_errors;
+};
+
 struct iavf_eth_xstats {
struct virtchnl_eth_stats eth_stats;
struct iavf_ipsec_crypto_stats ips_stats;
+   struct iavf_mbuf_stats mbuf_stats;
 };
 
 /* Structure that defines a VSI, associated with a adapter. */
@@ -310,6 +315,7 @@ struct iavf_devargs {
uint32_t watchdog_period;
int auto_reset;
int no_poll_on_link_down;
+   int mbuf_check;
 };
 
 struct iavf_security_ctx;
@@ -353,6 +359,11 @@ enum iavf_tx_burst_type {
IAVF_TX_AVX512_CTX_OFFLOAD,
 };
 
+#define IAVF_MBUF_CHECK_F_TX_MBUF    (1ULL << 0)
+#define IAVF_MBUF_CHECK_F_TX_SIZE    (1ULL << 1)
+#define IAVF_MBUF_CHECK_F_TX_SEGMENT (1ULL << 2)
+#define IAVF_MBUF_CHECK_F_TX_OFFLOAD (1ULL << 3)
+
 /* Structure to store private data for each VF instance. */
 struct iavf_adapter {
struct iavf_hw hw;
@@ -370,6 +381,7 @@ struct iavf_adapter {
bool no_poll;
enum iavf_rx_burst_type rx_burst_type;
enum iavf_tx_burst_type tx_burst_type;
+   uint64_t mc_flags; /* mbuf check flags. */
uint16_t fdir_ref_cnt;
struct iavf_devargs devargs;
 };
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 1fb876e827..903a43d004 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -39,6 +40,7 @@
 #define IAVF_RESET_WATCHDOG_ARG"watchdog_period"
 #define IAVF_ENABLE_AUTO_RESET_ARG "auto_reset"
 #define IAVF_NO_POLL_ON_LINK_DOWN_ARG "no-poll-on-link-down"
+#define IAVF_MBUF_CHECK_ARG   "mbuf_check"
 uint64_t iavf_timestamp_dynflag;
 int iavf_timestamp_dynfield_offset = -1;
 int rte_pmd_iavf_tx_lldp_dynfield_offset = -1;
@@ -49,6 +51,7 @@ static const char * const iavf_valid_args[] = {
IAVF_RESET_WATCHDOG_ARG,
IAVF_ENABLE_AUTO_RESET_ARG,
IAVF_NO_POLL_ON_LINK_DOWN_ARG,
+   IAVF_MBUF_CHECK_ARG,
NULL
 };
 
@@ -175,6 +178,7 @@ static const struct rte_iavf_xstats_name_off 
rte_iavf_stats_strings[] = {
{"tx_broadcast_packets", _OFF_OF(eth_stats.tx_broadcast)},
{"tx_dropped_packets", _OFF_OF(eth_stats.tx_discards)},
{"tx_error_packets", _OFF_OF(eth_stats.tx_errors)},
+   {"tx_mbuf_error_packets", _OFF_OF(mbuf_stats.tx_pkt_errors)},
 
{"inline_ipsec_crypto_ipackets", _OFF_OF(ips_sta

[PATCH v3] net/i40e: add diagnostic support in TX path

2024-01-04 Thread Mingjin Ye
The only way to enable diagnostics for TX paths is to modify the
application source code, making it difficult to diagnose faults.

In this patch, the devarg option "mbuf_check" is introduced and the
parameters are configured to enable the corresponding diagnostics.

supported cases: mbuf, size, segment, offload.
 1. mbuf: check for corrupted mbuf.
 2. size: check min/max packet length according to hw spec.
 3. segment: check number of mbuf segments not exceed hw limitation.
 4. offload: check any unsupported offload flag.

parameter format: mbuf_check=[mbuf,,]
eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i

Signed-off-by: Mingjin Ye 
---
v2: remove strict.
---
v3: optimised.
---
 doc/guides/nics/i40e.rst   |  11 +++
 drivers/net/i40e/i40e_ethdev.c | 137 -
 drivers/net/i40e/i40e_ethdev.h |  28 ++
 drivers/net/i40e/i40e_rxtx.c   | 153 +++--
 drivers/net/i40e/i40e_rxtx.h   |   2 +
 5 files changed, 323 insertions(+), 8 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 15689ac958..b15b5b61c5 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -275,6 +275,17 @@ Runtime Configuration
 
   -a 84:00.0,vf_msg_cfg=80@120:180
 
+- ``Support TX diagnostics`` (default ``not enabled``)
+
+  Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For 
example,
+  ``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``.
+  Supported cases:
+
+  *   mbuf: Check for corrupted mbuf.
+  *   size: Check min/max packet length according to hw spec.
+  *   segment: Check number of mbuf segments not exceed hw limitation.
+  *   offload: Check any unsupported offload flag.
+
 Vector RX Pre-conditions
 
 For Vector RX it is assumed that the number of descriptor rings will be a power
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 3ca226156b..e554bae1ab 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #define ETH_I40E_SUPPORT_MULTI_DRIVER  "support-multi-driver"
 #define ETH_I40E_QUEUE_NUM_PER_VF_ARG  "queue-num-per-vf"
 #define ETH_I40E_VF_MSG_CFG"vf_msg_cfg"
+#define ETH_I40E_MBUF_CHECK_ARG   "mbuf_check"
 
 #define I40E_CLEAR_PXE_WAIT_MS 200
 #define I40E_VSI_TSR_QINQ_STRIP0x4010
@@ -412,6 +413,7 @@ static const char *const valid_keys[] = {
ETH_I40E_SUPPORT_MULTI_DRIVER,
ETH_I40E_QUEUE_NUM_PER_VF_ARG,
ETH_I40E_VF_MSG_CFG,
+   ETH_I40E_MBUF_CHECK_ARG,
NULL};
 
 static const struct rte_pci_id pci_id_i40e_map[] = {
@@ -545,6 +547,14 @@ static const struct rte_i40e_xstats_name_off 
rte_i40e_stats_strings[] = {
 #define I40E_NB_ETH_XSTATS (sizeof(rte_i40e_stats_strings) / \
sizeof(rte_i40e_stats_strings[0]))
 
+static const struct rte_i40e_xstats_name_off i40e_mbuf_strings[] = {
+   {"tx_mbuf_error_packets", offsetof(struct i40e_mbuf_stats,
+   tx_pkt_errors)},
+};
+
+#define I40E_NB_MBUF_XSTATS (sizeof(i40e_mbuf_strings) / \
+   sizeof(i40e_mbuf_strings[0]))
+
 static const struct rte_i40e_xstats_name_off rte_i40e_hw_port_strings[] = {
{"tx_link_down_dropped", offsetof(struct i40e_hw_port_stats,
tx_dropped_link_down)},
@@ -1373,6 +1383,88 @@ read_vf_msg_config(__rte_unused const char *key,
return 0;
 }
 
+static int
+read_mbuf_check_config(__rte_unused const char *key, const char *value, void 
*args)
+{
+   char *cur;
+   char *tmp;
+   int str_len;
+   int valid_len;
+
+   int ret = 0;
+   uint64_t *mc_flags = args;
+   char *str2 = strdup(value);
+   if (str2 == NULL)
+   return -1;
+
+   str_len = strlen(str2);
+   if (str2[0] == '[' && str2[str_len - 1] == ']') {
+   if (str_len < 3) {
+   ret = -1;
+   goto mdd_end;
+   }
+   valid_len = str_len - 2;
+   memmove(str2, str2 + 1, valid_len);
+   memset(str2 + valid_len, '\0', 2);
+   }
+   cur = strtok_r(str2, ",", &tmp);
+   while (cur != NULL) {
+   if (!strcmp(cur, "mbuf"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_MBUF;
+   else if (!strcmp(cur, "size"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_SIZE;
+   else if (!strcmp(cur, "segment"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_SEGMENT;
+   else if (!strcmp(cur, "offload"))
+   *mc_flags |= I40E_MBUF_CHECK_F_TX_OFFLOAD;
+   else
+   PMD_DRV_LOG(ERR, "Unsupported mdd check type: %s", cur);
+   cur = strtok_r(NULL, ",", &tmp);
+   }
+
+mdd_end:
+   free(str2);
+   return ret;
+}
+
+static int
+i40e_parse_mbuf_check(struct rte_eth_dev *dev)
+{
+   struct i40e_adapter *ad =
+  

Re: [PATCH] dts: improve documentation

2024-01-04 Thread Thomas Monjalon
03/01/2024 13:54, Luca Vizzarro:
> Improve instructions for installing dependencies, configuring and
> launching the project. Finally, document the configuration schema
> by adding more comments to the example and documenting every
> property and definition.

Thank you for taking care of the documentation.

> +Luca Vizzarro 

For consistency, we don't use uppercase characters in email addresses.


> -  poetry install
> +  poetry install --no-root

Please could you explain this change in the commit log?


>  DTS needs to know which nodes to connect to and what hardware to use on 
> those nodes.
> -Once that's configured, DTS needs a DPDK tarball and it's ready to run.
> +Once that's configured, DTS needs a DPDK tarball or a git ref ID and it's 
> ready to run.

That's assuming DTS is compiling DPDK.
We may want to provide an already compiled DPDK to DTS.


> -   usage: main.py [-h] [--config-file CONFIG_FILE] [--output-dir OUTPUT_DIR] 
> [-t TIMEOUT]
> -  [-v VERBOSE] [-s SKIP_SETUP] [--tarball TARBALL]
> -  [--compile-timeout COMPILE_TIMEOUT] [--test-cases 
> TEST_CASES]
> -  [--re-run RE_RUN]
> +   (dts-py3.10) $ ./main.py --help

Why adding this line?
Should we remove the shell prefix referring to a specific Python version?

> +   usage: main.py [-h] [--config-file CONFIG_FILE] [--output-dir OUTPUT_DIR] 
> [-t TIMEOUT] [-v VERBOSE]
> +  [-s SKIP_SETUP] [--tarball TARBALL] [--compile-timeout 
> COMPILE_TIMEOUT]
> +  [--test-cases TEST_CASES] [--re-run RE_RUN]
>  
> -   Run DPDK test suites. All options may be specified with the environment 
> variables provided in
> -   brackets. Command line arguments have higher priority.
> +   Run DPDK test suites. All options may be specified with the environment 
> variables provided in brackets.

In general it is better to avoid long lines, and split after a punctation.
I think we should take the habit to always go to the next line after the end of 
a sentence.


> -   [DTS_OUTPUT_DIR] Output directory where dts logs 
> and results are
> -   saved. (default: output)
> +   [DTS_OUTPUT_DIR] Output directory where dts logs 
> and results are saved.

dts -> DTS


> +Configuration Schema
> +
> +
> +Definitions
> +~~~
> +
> +_`Node name`
> +   *string* – A unique identifier for a node. **Examples**: ``SUT1``, 
> ``TG1``.
> +
> +_`ARCH`
> +   *string* – The CPU architecture. **Supported values**: ``x86_64``, 
> ``arm64``, ``ppc64le``.
> +
> +_`CPU`
> +   *string* – The CPU microarchitecture. Use ``native`` for x86. **Supported 
> values**: ``native``, ``armv8a``, ``dpaa2``, ``thunderx``, ``xgene1``.
> +
> +_`OS`
> +   *string* – The operating system. **Supported values**: ``linux``.
> +
> +_`Compiler`
> +   *string* – The compiler used for building DPDK. **Supported values**: 
> ``gcc``, ``clang``, ``icc``, ``msvc``.
> +
> +_`Build target`
> +   *object* – Build targets supported by DTS for building DPDK, described as:
> +
> +    
> =
> +   ``arch`` See `ARCH`_
> +   ``os``   See `OS`_
> +   ``cpu``  See `CPU`_
> +   ``compiler`` See `Compiler`_
> +   ``compiler_wrapper`` *string* – Value prepended to the CC variable for 
> the DPDK build.

Please don't add compilation configuration for now,
I would like to work on the schema first.
This is mostly imported from the old DTS and needs to be rethought.





Re: [PATCH v2 02/24] net/cnxk: implementing eswitch device

2024-01-04 Thread Jerin Jacob
On Wed, Dec 20, 2023 at 12:53 AM Harman Kalra  wrote:
>
> Eswitch device is a parent or base device behind all the representors,
> acting as a transport layer between representors and representees.
>
> Signed-off-by: Harman Kalra 
> ---
>  drivers/net/cnxk/cnxk_eswitch.c | 465 
>  drivers/net/cnxk/cnxk_eswitch.h | 103 +++
>  drivers/net/cnxk/meson.build|   1 +
>  3 files changed, 569 insertions(+)
>  create mode 100644 drivers/net/cnxk/cnxk_eswitch.c
>  create mode 100644 drivers/net/cnxk/cnxk_eswitch.h
>
> diff --git a/drivers/net/cnxk/cnxk_eswitch.c b/drivers/net/cnxk/cnxk_eswitch.c
> new file mode 100644
> index 00..51110a762d
> --- /dev/null
> +++ b/drivers/net/cnxk/cnxk_eswitch.c
> @@ -0,0 +1,465 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2023 Marvell.

Change to 2024 for new files in this series.


> +static int
> +eswitch_dev_nix_flow_ctrl_set(struct cnxk_eswitch_dev *eswitch_dev)
> +{

> +
> +   rc = roc_nix_fc_mode_set(nix, mode_map[ROC_NIX_FC_FULL]);
> +   if (rc)
> +   return rc;
> +
> +   return rc;


same as return roc_nix_fc_mode_set(nix, mode_map[ROC_NIX_FC_FULL]);


Re: [PATCH v2 03/24] net/cnxk: eswitch HW resource configuration

2024-01-04 Thread Jerin Jacob
On Wed, Dec 20, 2023 at 12:58 AM Harman Kalra  wrote:
>
> Configuring the hardware resources used by the eswitch device.
>
> Signed-off-by: Harman Kalra 
> ---
>  drivers/net/cnxk/cnxk_eswitch.c | 206 
>  1 file changed, 206 insertions(+)
>

> +
>  static int
>  cnxk_eswitch_dev_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device 
> *pci_dev)
>  {
> @@ -433,6 +630,12 @@ cnxk_eswitch_dev_probe(struct rte_pci_driver *pci_drv, 
> struct rte_pci_device *pc
> return rc;
> +free_mem:
> +   if (mz)

Not needed as rte_memzone_free has the check

> +   rte_memzone_free(mz);
>  fail:
> return rc;
>  }
> --
> 2.18.0
>


Re: [PATCH v2 07/24] common/cnxk: interface to update VLAN TPID

2024-01-04 Thread Jerin Jacob
On Wed, Dec 20, 2023 at 12:53 AM Harman Kalra  wrote:
>
> Introducing an eswitch variant of the set VLAN TPID API which can be
> used for PF and VF
>
> Signed-off-by: Harman Kalra 

> +
> +int
> +roc_eswitch_nix_vlan_tpid_set(struct roc_nix *roc_nix, uint32_t type, 
> uint16_t tpid, bool is_vf)
> +{
> +   struct nix *nix = roc_nix_to_nix_priv(roc_nix);
> +   struct dev *dev = &nix->dev;
> +   int rc = 0;

Across the series, Please check the need for initializing to zero for rc.
In this case, it is not needed.

> +
> +   /* Configuring for PF/VF */
> +   rc = nix_vlan_tpid_set(dev->mbox, dev->pf_func | is_vf, type, tpid);
> +   if (rc)
> +   plt_err("Failed to set tpid for PF, rc %d", rc);
> +
> +   return rc;
> +}


RE: [RFC] ethdev: introduce entropy calculation

2024-01-04 Thread Dumitrescu, Cristian


> -Original Message-
> From: Ori Kam 
> Sent: Wednesday, December 27, 2023 3:20 PM
> To: Andrew Rybchenko ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ; Stephen Hemminger
> ; Ferruh Yigit 
> Cc: Dumitrescu, Cristian ; Dariusz Sosnowski
> ; dev@dpdk.org; Raslan Darawsheh
> 
> Subject: RE: [RFC] ethdev: introduce entropy calculation
> 
> Hi Andrew, Stephen, Ferruh and Thomas,
> 
> > -Original Message-
> > From: Andrew Rybchenko 
> > Sent: Saturday, December 16, 2023 11:04 AM
> >
> > On 12/15/23 19:21, Thomas Monjalon wrote:
> > > 15/12/2023 14:44, Ferruh Yigit:
> > >> On 12/14/2023 5:26 PM, Stephen Hemminger wrote:
> > >>> On Thu, 14 Dec 2023 17:18:25 +
> > >>> Ori Kam  wrote:
> > >>>
> > >> Since encap groups number of different 5 tuples together, if HW
> > doesn’t know
> > >> how to RSS
> > >> based on the inner application will not be able to get any 
> > >> distribution of
> > > packets.
> > >>
> > >> This value is used to reflect the inner packet on the outer header, 
> > >> so
> > > distribution
> > >> will be possible.
> > >>
> > >> The main use case is, if application does full offload and implements
> > the encap
> > > on
> > >> the RX.
> > >> For example:
> > >> Ingress/FDB  match on 5 tuple encap send to hairpin / different port 
> > >> in
> > case of
> > >> switch.
> > >>
> > >
> > > Smart idea! So basically the user is able to get an idea on how good 
> > > the
> > RSS
> > > distribution is, correct?
> > >
> > 
> >  Not exactly, this simply allows the distribution.
> >  Maybe entropy is a bad name, this is the name they use in the protocol,
> > but in reality
> >  this is some hash calculated on the packet header before the encap and
> > set in the encap header.
> >  Using this hash results in entropy for the packets. Which can be used 
> >  for
> > load balancing.
> > 
> >  Maybe better name would be:
> >  Rte_flow_calc_entropy_hash?
> > 
> >  or maybe rte_flow_calc_encap_hash (I like it less since it looks like 
> >  we
> > calculate the hash on the encap data and not the inner part)
> > 
> >  what do you think?
> > >>>
> > >>> Entropy has meaning in crypto and random numbers generators that is
> > different from
> > >>> this usage. So entropy is bad name to use. Maybe
> > rte_flow_hash_distribution?
> > >>>
> > >>
> > >> Hi Ori,
> > >>
> > >> Thank you for the description, it is more clear now.
> > >>
> > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > >> for rename.
> > >>
> > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > >> level of entropy", but not exactly names it as entropy.
> > >
> > > Exactly my thought about the naming.
> > > Good to see I am not alone thinking this naming is disturbing :)
> >
> > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> 
> What about rte_flow_calc_encap_hash?
> 
> 
How about simply rte_flow_calc_hash? My understanding is this is a 
general-purpose hash that is not limited to encapsulation work.


RE: [RFC] ethdev: fast path async flow API

2024-01-04 Thread Dariusz Sosnowski
> -Original Message-
> From: Ivan Malov 
> Sent: Wednesday, January 3, 2024 19:29
> Hi Dariusz,
> 
> I appreciate your response. All to the point.
> 
> I have to confess my question was inspired by the 23.11 merge commit in OVS
> mailing list. I first thought that an obvious consumer for the async flow API
> could have been OVS but saw no usage of it in the current code. It was my
> impression that there had been some patches in OVS already, waiting either
> for approval/testing or for this particular optimisation to be accepted first.
> 
> So far I've been mistaken -- there are no such patches, hence my question. Do
> we have real-world examples of the async flow usage? Should it be tested
> somehow...
> 
> (I apologise in case I'm asking for too many clarifications).
> 
> Thank you.
No need to apologize :)

Unfortunately, we are yet to see async flow API adoption in other open-source 
projects.
Until now, only direct NVIDIA customers use async flow API in their products.

Best regards,
Dariusz Sosnowski


Re: [PATCH] dts: improve documentation

2024-01-04 Thread Thomas Monjalon
04/01/2024 13:34, Luca Vizzarro:
> On 04/01/2024 10:52, Thomas Monjalon wrote:
> >>   DTS needs to know which nodes to connect to and what hardware to use on 
> >> those nodes.
> >> -Once that's configured, DTS needs a DPDK tarball and it's ready to run.
> >> +Once that's configured, DTS needs a DPDK tarball or a git ref ID and it's 
> >> ready to run.
> > 
> > That's assuming DTS is compiling DPDK.
> > We may want to provide an already compiled DPDK to DTS.
> 
> Yes, that is correct. At the current state, DPDK is always compiled from 
> source though, so it may be reasonable to leave it as it is until this
> feature may be implemented. Nonetheless, my change just informs the user
> of the (already implemented) feature that uses `git archive` from the 
> local repository to create a tarball. A sensible change would be to add
> this explanation I have just given, but it is a technicality and it 
> won't really make a difference to the user.

Yes
I would like to make it clear in this doc that DTS is compiling DPDK.
Please could you change to something like
"DTS needs a DPDK tarball or a git ref ID to compile" ?

I hope we will change it later to allow external compilation.


> >> +   (dts-py3.10) $ ./main.py --help
> > 
> > Why adding this line?
> 
> Just running `./main.py` will just throw a confusing error to the user. 
> I am in the process of sorting this out as it is misleading and not 
> helpful. Specifying the line in this case just hints to the user on the 
> origin of that help/usage document.

Yes would be good to have a message to help the user instead of a confusing 
error.

> > Should we remove the shell prefix referring to a specific Python version?
> 
> I have purposely left the prefix to indicate that we are in a Poetry 
> shell environment, as that is a pre-requisite to run DTS. So more of an 
> implicit reminder. The Python version specified is in line with the 
> minimum requirement of DTS.

OK

> > In general it is better to avoid long lines, and split after punctuation.
> > I think we should take the habit to always go to the next line after the 
> > end of a sentence.
> 
> I left the output of `--help` under a code block as it is originally 
> printed in the console. Could surely amend it in the docs to be easier 
> to read, but the user could as easily print it themselves in their own 
> terminal in the comfort of their own environment.

I was not referring to the console output.
Maybe I misunderstood it.
For the doc sentences, please try to split sentences on different lines.


> >> -   [DTS_OUTPUT_DIR] Output directory where dts 
> >> logs and results are
> >> -   saved. (default: output)
> >> +   [DTS_OUTPUT_DIR] Output directory where dts 
> >> logs and results are saved.
> > 
> > dts -> DTS
> 
> As above. The output of `--help` only changed as a result of not being 
> updated before in parallel with code changes. Consistently this is what 
> the user would see right now. It may or may not be a good idea to update 
> this whenever changed in the future.

I did not understand it is part of the help message.

> Nonetheless, I am keen to update the code as part of this patch to 
> resolve your comments.

Yes please update the code for this small wording fix.

> > Please don't add compilation configuration for now,
> > I would like to work on the schema first.
> > This is mostly imported from the old DTS and needs to be rethink.
> 
> While I understand the concern on wanting to rework the schema, which is 
> a great point you make, it may be reasonable to provide something useful 
> to close the existing documentation gap. And incrementally updating from 
> there. If there is no realistic timeline set in place for a schema 
> rework, it may just be better to have something rather than nothing. And 
> certainly it would not be very useful to upstream a partial documentation.

I don't know. I have big doubts about the current schema.
I will review it with your doc patches.
Can you please split this patch in 2 so that the schema doc is in a different 
patch?

> Thank you a lot for your review! You have made some good points which 
> open up new potential tasks to add to the pipeline.





RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-04 Thread Dumitrescu, Cristian



> -Original Message-
> From: jer...@marvell.com 
> Sent: Tuesday, December 19, 2023 5:30 PM
> To: dev@dpdk.org; Thomas Monjalon ; Ferruh Yigit
> ; Andrew Rybchenko 
> Cc: ferruh.yi...@xilinx.com; ajit.khapa...@broadcom.com;
> abo...@pensando.io; Xing, Beilei ; Richardson, Bruce
> ; ch...@att.com; chenbo@intel.com; Loftus,
> Ciara ; dsinghra...@marvell.com; Czeck, Ed
> ; evge...@amazon.com; gr...@u256.net;
> g.si...@nxp.com; zhouguoy...@huawei.com; Wang, Haiyue
> ; hka...@marvell.com; heinrich.k...@corigine.com;
> hemant.agra...@nxp.com; hyon...@cisco.com; igo...@amazon.com;
> irussk...@marvell.com; jgraj...@cisco.com; Singh, Jasvinder
> ; jianw...@trustnetic.com;
> jiawe...@trustnetic.com; Wu, Jingjing ;
> johnd...@cisco.com; john.mil...@atomicrules.com; linvi...@tuxdriver.com;
> Wiles, Keith ; kirankum...@marvell.com;
> ouli...@huawei.com; lir...@marvell.com; lon...@microsoft.com;
> m...@semihalf.com; spin...@cesnet.cz; ma...@nvidia.com; Peters, Matt
> ; maxime.coque...@redhat.com;
> m...@semihalf.com; humi...@huawei.com; pna...@marvell.com;
> ndabilpu...@marvell.com; Yang, Qiming ; Zhang, Qi Z
> ; rad...@marvell.com; rahul.lakkire...@chelsio.com;
> rm...@marvell.com; Xu, Rosen ;
> sachin.sax...@oss.nxp.com; skotesh...@marvell.com; shsha...@marvell.com;
> shaib...@amazon.com; Siegel, Shepard ;
> asoma...@amd.com; somnath.ko...@broadcom.com;
> sthem...@microsoft.com; Webster, Steven ;
> sk...@marvell.com; mtetsu...@gmail.com; vbu...@marvell.com;
> viachesl...@nvidia.com; Wang, Xiao W ;
> cloud.wangxiao...@huawei.com; yisen.zhu...@huawei.com; Wang, Yong
> ; xuanziya...@huawei.com; Dumitrescu, Cristian
> ; Jerin Jacob 
> Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> 
> From: Jerin Jacob 
> 
> Introduce a new API to retrieve the number of available free descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob 
> ---
>  doc/guides/nics/features.rst | 10 
>  doc/guides/nics/features/default.ini |  1 +
>  lib/ethdev/ethdev_trace_points.c |  3 ++
>  lib/ethdev/rte_ethdev.h  | 78 
>  lib/ethdev/rte_ethdev_core.h |  7 ++-
>  lib/ethdev/rte_ethdev_trace_fp.h |  8 +++
>  6 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index f7d9980849..9d6655473a 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for
> more details).
> 
>  * **[implements] eth_dev_ops**: ``get_monitor_addr``
> 
> +.. _nic_features_tx_queue_free_desc_query:
> +
> +Tx queue free descriptor query
> +--
> +
> +Supports getting the number of free descriptors in a Tx queue.
> +
> +* **[implements] eth_dev_ops**: ``tx_queue_free_desc_get``.
> +* **[related] API**: ``rte_eth_tx_queue_free_desc_get()``.
> +
>  .. _nic_features_other:
> 
>  Other dev ops not represented by a Feature
> diff --git a/doc/guides/nics/features/default.ini
> b/doc/guides/nics/features/default.ini
> index 806cb033ff..b30002b1c1 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
>  Timesync =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx free descriptor query =
>  Basic stats  =
>  Extended stats   =
>  Stats per queue  =
> diff --git a/lib/ethdev/ethdev_trace_points.c 
> b/lib/ethdev/ethdev_trace_points.c
> index 91f71d868b..346f37f2e4 100644
> --- a/lib/ethdev/ethdev_trace_points.c
> +++ b/lib/ethdev/ethdev_trace_points.c
> @@ -481,6 +481,9 @@
> RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
>   lib.ethdev.map_aggr_tx_affinity)
> 
> +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_free_desc_get,
> + lib.ethdev.tx_queue_free_desc_get)
> +
>  RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
>   lib.ethdev.flow.copy)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 77331ce652..033fcb8c9b 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6802,6 +6802,84 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t
> rx_queue_id,
>  __rte_experimental
>  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes, int num);
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get the number of free descriptors in a Tx queue.
> + *
> + * This function retrieves the number of available free descriptors in a
> + * transmit queue. Applications can use this API

Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-04 Thread Jerin Jacob
On Thu, Jan 4, 2024 at 6:46 PM Dumitrescu, Cristian
 wrote:
>
>
>
> > -Original Message-
> > From: jer...@marvell.com 
> > Sent: Tuesday, December 19, 2023 5:30 PM
> > To: dev@dpdk.org; Thomas Monjalon ; Ferruh Yigit
> > ; Andrew Rybchenko 
> > Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> >
> > From: Jerin Jacob 
> >
> > Introduce a new API to retrieve the number of available free descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> >  doc/guides/nics/features.rst | 10 
> >  doc/guides/nics/features/default.ini |  1 +
> >  lib/ethdev/ethdev_trace_points.c |  3 ++
> >  lib/ethdev/rte_ethdev.h  | 78 
> >  lib/ethdev/rte_ethdev_core.h |  7 ++-
> >  lib/ethdev/rte_ethdev_trace_fp.h |  8 +++
> >  6 files changed, 106 insertions(+), 1 deletion(-)
>
> Hi Jerin,

Hi Cristian,

>
> I think having an API to get the number of free descriptors per queue is a 
> good idea. Why have it only for TX queues and not for RX queues as well?

I see no harm in adding it for Rx as well. I think it is better to have
a separate API for each instead of adding an argument, as it is a fast
path API.
If so, we could add a new API when there is any PMD implementation or
need for this.

>
> Regards,
> Cristian


RE: [EXT] [RFC PATCH] cryptodev: add sm2 key exchange and encryption for HW

2024-01-04 Thread Gowrishankar Muthukrishnan
Hi,

> This commit adds comments proposing the addition of SM2 key exchange
> and encryption/decryption operations.
> 
> Signed-off-by: Arkadiusz Kusztal 
> ---
>  lib/cryptodev/rte_crypto_asym.h | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
> index 39d3da3952..6911a14dbd 100644
> --- a/lib/cryptodev/rte_crypto_asym.h
> +++ b/lib/cryptodev/rte_crypto_asym.h
> @@ -639,6 +639,10 @@ struct rte_crypto_asym_xform {  struct
> rte_crypto_sm2_op_param {
>   enum rte_crypto_asym_op_type op_type;
>   /**< Signature generation or verification. */
> + /*
> +  * For key exchange operation, new struct should be created.
> +  * Doing that, the current struct could be split into signature and
> encryption.
> +  */
> 
>   enum rte_crypto_auth_algorithm hash;
>   /**< Hash algorithm used in EC op. */
> @@ -672,6 +676,18 @@ struct rte_crypto_sm2_op_param {
>* C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
>* be overwritten by the PMD with the encrypted length.
>*/
> + /* SM2 encryption algorithm relies on certain cryptographic functions
> +  * that HW devices do not necessarily need to implement.
> +  * When C1 is an elliptic curve point, C2 and C3 need additional
> +  * operations like KDF and Hash. The question here is: should only
> +  * elliptic curve output parameters (namely C1 and PB) be returned to
> the user,
> +  * or should encryption be, in this case, computed within the PMD using
> +  * software methods, or should both options be available?
> +  */

I second on splitting this struct for PKE (may be _pke and _dsa).

At the same time, handling these structs should be followed by some capability
check, which is what I have been planning to propose as an asym op capability
in this release.
Right now, asymmetric capability is defined only by xform (not also by op),
but we could add an op capability as well, as below.

struct rte_cryptodev_capabilities caps_sm2[] = {
	.op = RTE_CRYPTO_OP_TYPE_ASYMMETRIC,
	{
		.asym = {
			.xform_capa = {
				.xform_type = RTE_CRYPTO_ASYM_XFORM_SM2,
				.op_types = ...
			},
			.op_capa = [
				{
					.op_type = RTE_CRYPTO_ASYM_OP_ENC,
					.capa = (1 << RTE_CRYPTO_ASYM_SM2_PKE_KDF |
						 1 << RTE_CRYPTO_ASYM_SM2_PKE_HASH) /* new enums */
				}
			]
		}
	}
};

Doing this, the hash_algos member in today's asym xform capability can
eventually be removed, and it reads better for an op. Also, this op
capability check could be done once per session.
If you are also aligned, I can send an RFC for capab check.

> + /* Similar applies to the key exchange in the HW. The second phase of
> KE, most likely,
> +  * will go as far as to obtain xU,yU (xV,yV), where SW can easily
> calculate SA.

What does SA mean here? Signature algorithm?

Thanks,
Gowrishankar

> +  * Should then both options be available?
> +  */
> 
>   rte_crypto_uint id;
>   /**< The SM2 id used by signer and verifier. */
> --
> 2.13.6



RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-04 Thread Konstantin Ananyev


> > > Introduce a new API to retrieve the number of available free descriptors
> > > in a Tx queue. Applications can leverage this API in the fast path to
> > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > available free descriptors.
> > >
> > > A notable use case could be implementing Random Early Discard (RED)
> > > in software based on Tx queue occupancy.
> > >
> > > Signed-off-by: Jerin Jacob 
> > > ---
> > >  doc/guides/nics/features.rst | 10 
> > >  doc/guides/nics/features/default.ini |  1 +
> > >  lib/ethdev/ethdev_trace_points.c |  3 ++
> > >  lib/ethdev/rte_ethdev.h  | 78 
> > >  lib/ethdev/rte_ethdev_core.h |  7 ++-
> > >  lib/ethdev/rte_ethdev_trace_fp.h |  8 +++
> > >  6 files changed, 106 insertions(+), 1 deletion(-)
> >
> > Hi Jerin,
> 
> Hi Cristian,
> 
> >
> > I think having an API to get the number of free descriptors per queue is a 
> > good idea. Why have it only for TX queues and not for RX
> queues as well?
> 
> I see no harm in adding it for Rx as well. I think it is better to have
> a separate API for each instead of adding an argument, as it is a fast
> path API.
> If so, we could add a new API when there is any PMD implementation or
> need for this.

I think for RX we already have a similar one:
/** @internal Get number of used descriptors on a receive queue. */
typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);

 



RE: [RFC] ethdev: introduce entropy calculation

2024-01-04 Thread Ori Kam
Hi Cristian,

> -Original Message-
> From: Dumitrescu, Cristian 
> Sent: Thursday, January 4, 2024 2:57 PM
> > > >>
> > > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > > >> for rename.
> > > >>
> > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > > >> level of entropy", but not exactly names it as entropy.
> > > >
> > > > Exactly my thought about the naming.
> > > > Good to see I am not alone thinking this naming is disturbing :)
> > >
> > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> >
> > What about rte_flow_calc_encap_hash?
> >
> >
> How about simply rte_flow_calc_hash? My understanding is this is a general-
> purpose hash that is not limited to encapsulation work.

Unfortunately, this is not a general-purpose hash. HW may implement a 
different hash for each use case.
Also, the hash result length differs depending on the feature and even the 
target field.

We can take your naming idea and change the parameters a bit:
rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)

For the feature we will have at this point:
NVGRE_HASH,
SPORT_HASH 

The attribute parameter will be empty for now, but it may be used later to add 
extra information
for the hash if more information is required, for example, some key.
In addition, we will also be able to merge the current function 
rte_flow_calc_table_hash,
if we pass the missing parameters (table id, template id) in the attribute 
field.

What do you think?



Re: [PATCH v9] gro: fix reordering of packets in GRO layer

2024-01-04 Thread 胡嘉瑜


On 2023/12/9 2:17 AM, Kumara Parameshwaran wrote:

In the current implementation when a packet is received with
special TCP flag(s) set, only that packet is delivered out of order.
There could be already coalesced packets in the GRO table
belonging to the same flow but not delivered.
This fix makes sure that the entire segment is delivered with the
special flag(s) set, which is how Linux GRO is also implemented.

Signed-off-by: Kumara Parameshwaran
Co-authored-by: Kumara Parameshwaran
---
If the received packet is not a pure ACK packet, we check if
there are any previous packets in the flow. If present, we include
the received packet in the coalescing logic as well and propagate
the flags of the last received packet to the entire segment, which
avoids re-ordering.

Let's say P1(PSH), P2(ACK), P3(ACK) are received in burst mode.
P1 contains the PSH flag and, since the flow has no prior packets,
we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are
merged together.
In the existing code, P2 and P3 would be delivered as a single
segment first and the unprocess_packets would be copied later,
which causes reordering.
With the patch, the unprocessed packets are copied first and then
the packets from the GRO table.

Testing done
The csum test-pmd was modified to support the following:
a GET request of 10MB from client to server via test-pmd (static ARP
entries added on client and server). GRO and TSO were enabled in
test-pmd, where the packets received from the client MAC would be
sent to the server MAC and vice versa.
In the above testing, without the patch the client observed
re-ordering of 25 packets, and with the patch no packet re-ordering
was observed.

v2:
Fix warnings in commit and comment.
Do not consider packet as candidate to merge if it contains SYN/RST 
flag.

v3:
Fix warnings.

v4:
Rebase with master.

v5:
Adding co-author email
v6:
Address review comments from the maintainer to restructure the code
and handle only special flags PSH,FIN

v7:
Fix warnings and errors

v8:
Fix warnings and errors

v9:
Fix commit message

  lib/gro/gro_tcp.h  | 11 
  lib/gro/gro_tcp4.c | 67 +-
  2 files changed, 54 insertions(+), 24 deletions(-)

diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h
index d926c4b8cc..137a03bc96 100644
--- a/lib/gro/gro_tcp.h
+++ b/lib/gro/gro_tcp.h
@@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2)
return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key)));
  }
  
+static inline void
+update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt)
+{
+	struct rte_tcp_hdr *merged_tcp_hdr;
+
+	merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len +
+			pkt->l3_len);
+   merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags;
+
+}
+
  #endif
diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 6645de592b..8af5a8d8a9 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
uint32_t item_idx;
uint32_t i, max_flow_num, remaining_flow_num;
uint8_t find;
+   uint32_t item_start_idx;
  
 	/*
 	 * Don't process the packet whose TCP header length is greater
@@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
  
-	/*
-	 * Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
-	 * or CWR set.
-	 */
-   if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
-   return -1;
-
/* trim the tail padding bytes */
ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
@@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
+   item_start_idx = tbl->flows[i].start_index;
break;
}
remaining_flow_num--;
@@ -190,28 +185,52 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
}
  
  	if (find == 0) {


It is more likely to find a matching flow, so it would be better to
move the logic below into the else branch.



-   sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
-   item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num,
-   tbl->max_item_num, start_time,
- 

RE: [RFC] ethdev: fast path async flow API

2024-01-04 Thread Dariusz Sosnowski
Hi Konstantin,

> -Original Message-
> From: Konstantin Ananyev 
> Sent: Thursday, January 4, 2024 09:47
> > > This is a blocker, showstopper for me.
> > +1
> >
> > > Have you considered having something like
> > >rte_flow_create_bulk()
> > >
> > > or better yet a Linux iouring style API?
> > >
> > > A ring style API would allow for better mixed operations across the
> > > board and get rid of the I-cache overhead which is the root cause of the
> needing inline.
> > Existing async flow API is somewhat close to the io_uring interface.
> > The difference being that queue is not directly exposed to the application.
> > Application interacts with the queue using rte_flow_async_* APIs (e.g.,
> places operations in the queue, pushes them to the HW).
> > Such design has some benefits over a flow API which exposes the queue to
> the user:
> > - Easier to use - Applications do not manage the queue directly, they do it
> through exposed APIs.
> > - Consistent with other DPDK APIs - In other libraries, queues are
> manipulated through API, not directly by an application.
> > - Lower memory usage - only HW primitives are needed (e.g., HW queue
> > on PMD side), no need to allocate separate application queues.
> >
> > Bulking of flow operations is a tricky subject.
> > Compared to packet processing, where it is desired to keep the
> > manipulation of raw packet data to the minimum (e.g., only packet
> > headers are accessed), during flow rule creation all items and actions must
> be processed by PMD to create a flow rule.
> > The amount of memory consumed by items and actions themselves during
> this process might be nonnegligible.
> > If flow rule operations were bulked, the size of working set of memory
> > would increase, which could have negative consequences on the cache
> behavior.
> > So, it might be the case that by utilizing bulking the I-cache overhead is
> removed, but the D-cache overhead is added.
> 
> Is rte_flow struct really that big?
> We do bulk processing for mbufs, crypto_ops, etc., and usually bulk
> processing improves performance not degrades it.
> Of course bulk size has to be somewhat reasonable.
It does not really depend on rte_flow struct size itself (it's opaque to the 
user), but on sizes of items and actions which are the parameters for flow 
operations.
To create a flow through async flow API the following is needed:
- array of items and their spec,
- array of actions and their configuration,
- pointer to template table,
- indexes of pattern and actions templates to be used.
If we assume a simple case of ETH/IPV4/TCP/END match and COUNT/RSS/END actions, 
then we have at most:
- 4 items (32B each) + 3 specs (20B each) = 188B
- 3 actions (16B each) + 2 configurations (4B and 40B) = 92B
- 8B for table pointer
- 2B for template indexes
In total = 290B.
Bulk API can be designed in a way that single bulk operates on a single set of 
tables and templates - this would remove a few bytes.
Flow actions can be based on actions templates (so no need for conf), but 
items' specs are still needed.
This would leave us at 236B, so at least 4 cache lines (assuming everything is 
tightly packed) for a single flow and almost twice the size of the mbuf.
Depending on the bulk size it might be a much more significant chunk of the 
cache.

I don't want to dismiss the idea. I think it's worth evaluating.
However, I'm not entirely confident that a bulking API would introduce
performance benefits.

Best regards,
Dariusz Sosnowski


[PATCH v4] [PATCH 2/2] net/tap: fix buffer overflow for ptypes list

2024-01-04 Thread Sivaramakrishnan Venkat
Incorrect ptypes list causes buffer overflow for Address Sanitizer
run. Previously, the last element in the ptypes lists was expected to
be "RTE_PTYPE_UNKNOWN" for rte_eth_dev_get_supported_ptypes(), but this
was not clearly documented and many PMDs did not follow this convention.
Instead, the dev_supported_ptypes_get() function pointer now returns the
number of elements to eliminate the need for "RTE_PTYPE_UNKNOWN"
as the last item.

Fixes: 47909357a069 ("ethdev: make device operations struct private")
Cc: ferruh.yi...@intel.com
Cc: sta...@dpdk.org

V4:
The first patch is for drivers for backporting.
The second patch is for driver API update.

Signed-off-by: Sivaramakrishnan Venkat 
---
 drivers/net/atlantic/atl_ethdev.c  | 13 -
 drivers/net/axgbe/axgbe_ethdev.c   | 13 -
 drivers/net/bnxt/bnxt_ethdev.c |  7 ---
 drivers/net/cnxk/cnxk_ethdev.h |  3 ++-
 drivers/net/cnxk/cnxk_lookup.c |  7 ---
 drivers/net/cpfl/cpfl_ethdev.c |  7 ---
 drivers/net/cxgbe/cxgbe_ethdev.c   | 10 ++
 drivers/net/cxgbe/cxgbe_pfvf.h |  3 ++-
 drivers/net/dpaa/dpaa_ethdev.c | 11 +++
 drivers/net/dpaa2/dpaa2_ethdev.c   | 10 ++
 drivers/net/e1000/igb_ethdev.c | 13 -
 drivers/net/enetc/enetc_ethdev.c   |  7 ---
 drivers/net/enic/enic_ethdev.c | 17 ++---
 drivers/net/failsafe/failsafe_ops.c|  5 +++--
 drivers/net/fm10k/fm10k_ethdev.c   | 15 +--
 drivers/net/hns3/hns3_rxtx.c   | 16 +---
 drivers/net/hns3/hns3_rxtx.h   |  3 ++-
 drivers/net/i40e/i40e_rxtx.c   | 11 +++
 drivers/net/i40e/i40e_rxtx.h   |  3 ++-
 drivers/net/iavf/iavf_ethdev.c | 10 ++
 drivers/net/ice/ice_dcf_ethdev.c   |  7 ---
 drivers/net/ice/ice_rxtx.c | 23 ++-
 drivers/net/ice/ice_rxtx.h |  3 ++-
 drivers/net/idpf/idpf_ethdev.c |  7 ---
 drivers/net/igc/igc_ethdev.c   | 10 ++
 drivers/net/ionic/ionic_rxtx.c |  7 ---
 drivers/net/ionic/ionic_rxtx.h |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 18 --
 drivers/net/mana/mana.c|  7 ---
 drivers/net/mlx4/mlx4.h|  3 ++-
 drivers/net/mlx4/mlx4_ethdev.c | 17 ++---
 drivers/net/mlx5/mlx5.h|  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c | 11 +++
 drivers/net/mvneta/mvneta_ethdev.c |  7 ---
 drivers/net/mvpp2/mrvl_ethdev.c|  7 ---
 drivers/net/netvsc/hn_var.h|  3 ++-
 drivers/net/netvsc/hn_vf.c |  5 +++--
 drivers/net/nfp/nfp_net_common.c   | 15 ++-
 drivers/net/nfp/nfp_net_common.h   |  3 ++-
 drivers/net/ngbe/ngbe_ethdev.c |  9 ++---
 drivers/net/ngbe/ngbe_ethdev.h |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.c |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.h |  2 +-
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 drivers/net/pfe/pfe_ethdev.c   | 11 +++
 drivers/net/qede/qede_ethdev.c | 11 +++
 drivers/net/sfc/sfc_dp_rx.h|  2 +-
 drivers/net/sfc/sfc_ef10.h |  3 ++-
 drivers/net/sfc/sfc_ef100_rx.c |  7 ---
 drivers/net/sfc/sfc_ef10_rx.c  | 11 ++-
 drivers/net/sfc/sfc_ethdev.c   |  5 +++--
 drivers/net/sfc/sfc_rx.c   |  7 ---
 drivers/net/tap/rte_eth_tap.c  |  7 ---
 drivers/net/thunderx/nicvf_ethdev.c| 10 +-
 drivers/net/txgbe/txgbe_ethdev.c   |  9 ++---
 drivers/net/txgbe/txgbe_ethdev.h   |  3 ++-
 drivers/net/txgbe/txgbe_ptypes.c   |  6 +++---
 drivers/net/txgbe/txgbe_ptypes.h   |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 14 +-
 lib/ethdev/ethdev_driver.h |  3 ++-
 lib/ethdev/rte_ethdev.c| 10 ++
 61 files changed, 299 insertions(+), 193 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c 
b/drivers/net/atlantic/atl_ethdev.c
index 3a028f4290..bc087738e4 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev);
 static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
  size_t fw_size);
 
-static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements);
 
 static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 
@@ -1132,7 +1133,8 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 }
 
 static const uint32_t *
-atl_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements)
 {
static const u

[PATCH v4] [PATCH 2/2] net/tap: fix buffer overflow for ptypes list

2024-01-04 Thread Sivaramakrishnan Venkat
An incorrect ptypes list causes a buffer overflow in an Address
Sanitizer run. Previously, the last element in the ptypes lists was
expected to be "RTE_PTYPE_UNKNOWN" for
rte_eth_dev_get_supported_ptypes(), but this was not clearly documented
and many PMDs did not follow this convention.
Instead, the dev_supported_ptypes_get() function pointer now returns the
number of elements, eliminating the need for "RTE_PTYPE_UNKNOWN" as the
last item.

Fixes: 47909357a069 ("ethdev: make device operations struct private")
Cc: ferruh.yi...@intel.com
Cc: sta...@dpdk.org

V4:
The first patch fixes the drivers and is intended for backporting.
The second patch updates the driver API.

Signed-off-by: Sivaramakrishnan Venkat 
---
 drivers/net/atlantic/atl_ethdev.c  | 13 -
 drivers/net/axgbe/axgbe_ethdev.c   | 13 -
 drivers/net/bnxt/bnxt_ethdev.c |  7 ---
 drivers/net/cnxk/cnxk_ethdev.h |  3 ++-
 drivers/net/cnxk/cnxk_lookup.c |  7 ---
 drivers/net/cpfl/cpfl_ethdev.c |  7 ---
 drivers/net/cxgbe/cxgbe_ethdev.c   | 10 ++
 drivers/net/cxgbe/cxgbe_pfvf.h |  3 ++-
 drivers/net/dpaa/dpaa_ethdev.c | 11 +++
 drivers/net/dpaa2/dpaa2_ethdev.c   | 10 ++
 drivers/net/e1000/igb_ethdev.c | 13 -
 drivers/net/enetc/enetc_ethdev.c   |  7 ---
 drivers/net/enic/enic_ethdev.c | 17 ++---
 drivers/net/failsafe/failsafe_ops.c|  5 +++--
 drivers/net/fm10k/fm10k_ethdev.c   | 15 +--
 drivers/net/hns3/hns3_rxtx.c   | 16 +---
 drivers/net/hns3/hns3_rxtx.h   |  3 ++-
 drivers/net/i40e/i40e_rxtx.c   | 11 +++
 drivers/net/i40e/i40e_rxtx.h   |  3 ++-
 drivers/net/iavf/iavf_ethdev.c | 10 ++
 drivers/net/ice/ice_dcf_ethdev.c   |  7 ---
 drivers/net/ice/ice_rxtx.c | 23 ++-
 drivers/net/ice/ice_rxtx.h |  3 ++-
 drivers/net/idpf/idpf_ethdev.c |  7 ---
 drivers/net/igc/igc_ethdev.c   | 10 ++
 drivers/net/ionic/ionic_rxtx.c |  7 ---
 drivers/net/ionic/ionic_rxtx.h |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 18 --
 drivers/net/mana/mana.c|  7 ---
 drivers/net/mlx4/mlx4.h|  3 ++-
 drivers/net/mlx4/mlx4_ethdev.c | 17 ++---
 drivers/net/mlx5/mlx5.h|  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c | 11 +++
 drivers/net/mvneta/mvneta_ethdev.c |  7 ---
 drivers/net/mvpp2/mrvl_ethdev.c|  7 ---
 drivers/net/netvsc/hn_var.h|  3 ++-
 drivers/net/netvsc/hn_vf.c |  5 +++--
 drivers/net/nfp/nfp_net_common.c   | 15 ++-
 drivers/net/nfp/nfp_net_common.h   |  3 ++-
 drivers/net/ngbe/ngbe_ethdev.c |  9 ++---
 drivers/net/ngbe/ngbe_ethdev.h |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.c |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.h |  2 +-
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 drivers/net/pfe/pfe_ethdev.c   | 11 +++
 drivers/net/qede/qede_ethdev.c | 11 +++
 drivers/net/sfc/sfc_dp_rx.h|  2 +-
 drivers/net/sfc/sfc_ef10.h |  3 ++-
 drivers/net/sfc/sfc_ef100_rx.c |  7 ---
 drivers/net/sfc/sfc_ef10_rx.c  | 11 ++-
 drivers/net/sfc/sfc_ethdev.c   |  5 +++--
 drivers/net/sfc/sfc_rx.c   |  7 ---
 drivers/net/tap/rte_eth_tap.c  |  7 ---
 drivers/net/thunderx/nicvf_ethdev.c| 10 +-
 drivers/net/txgbe/txgbe_ethdev.c   |  9 ++---
 drivers/net/txgbe/txgbe_ethdev.h   |  3 ++-
 drivers/net/txgbe/txgbe_ptypes.c   |  6 +++---
 drivers/net/txgbe/txgbe_ptypes.h   |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 14 +-
 lib/ethdev/ethdev_driver.h |  3 ++-
 lib/ethdev/rte_ethdev.c| 10 ++
 61 files changed, 299 insertions(+), 193 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c 
b/drivers/net/atlantic/atl_ethdev.c
index 3a028f4290..bc087738e4 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev);
 static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
  size_t fw_size);
 
-static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements);
 
 static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 
@@ -1132,7 +1133,8 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 }
 
 static const uint32_t *
-atl_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements)
 {
static const u

[dpdk-dev v4 2/2] net/tap: fix buffer overflow for ptypes list

2024-01-04 Thread Sivaramakrishnan Venkat


[PATCH 1] - net/tap: fix buffer overflow for ptypes list by updating
the last element. The first patch fixes the drivers and is intended
for backporting.

[PATCH 2] - net/tap: fix buffer overflow for ptypes list through a
driver API update. The second patch updates the driver API.


Sivaramakrishnan Venkat (2):
  net/tap: fix buffer overflow for ptypes list through updation of last
element.
  net/tap: fix buffer overflow for ptypes list through driver API update

 drivers/net/atlantic/atl_ethdev.c  | 13 -
 drivers/net/axgbe/axgbe_ethdev.c   | 13 -
 drivers/net/bnxt/bnxt_ethdev.c |  7 ---
 drivers/net/cnxk/cnxk_ethdev.h |  3 ++-
 drivers/net/cnxk/cnxk_lookup.c |  7 ---
 drivers/net/cpfl/cpfl_ethdev.c |  7 ---
 drivers/net/cxgbe/cxgbe_ethdev.c   | 10 ++
 drivers/net/cxgbe/cxgbe_pfvf.h |  3 ++-
 drivers/net/dpaa/dpaa_ethdev.c |  8 ++--
 drivers/net/dpaa2/dpaa2_ethdev.c   | 10 ++
 drivers/net/e1000/igb_ethdev.c | 13 -
 drivers/net/enetc/enetc_ethdev.c   |  7 ---
 drivers/net/enic/enic_ethdev.c | 17 ++---
 drivers/net/failsafe/failsafe_ops.c|  5 +++--
 drivers/net/fm10k/fm10k_ethdev.c   | 15 +--
 drivers/net/hns3/hns3_rxtx.c   | 16 +---
 drivers/net/hns3/hns3_rxtx.h   |  3 ++-
 drivers/net/i40e/i40e_rxtx.c   | 11 +++
 drivers/net/i40e/i40e_rxtx.h   |  3 ++-
 drivers/net/iavf/iavf_ethdev.c | 10 ++
 drivers/net/ice/ice_dcf_ethdev.c   |  7 ---
 drivers/net/ice/ice_rxtx.c | 23 ++-
 drivers/net/ice/ice_rxtx.h |  3 ++-
 drivers/net/idpf/idpf_ethdev.c |  7 ---
 drivers/net/igc/igc_ethdev.c   | 10 ++
 drivers/net/ionic/ionic_rxtx.c |  7 ---
 drivers/net/ionic/ionic_rxtx.h |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 18 --
 drivers/net/mana/mana.c|  7 ---
 drivers/net/mlx4/mlx4.h|  3 ++-
 drivers/net/mlx4/mlx4_ethdev.c | 17 ++---
 drivers/net/mlx5/mlx5.h|  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c | 11 +++
 drivers/net/mvneta/mvneta_ethdev.c |  4 +++-
 drivers/net/mvpp2/mrvl_ethdev.c|  4 +++-
 drivers/net/netvsc/hn_var.h|  3 ++-
 drivers/net/netvsc/hn_vf.c |  5 +++--
 drivers/net/nfp/nfp_net_common.c   | 14 ++
 drivers/net/nfp/nfp_net_common.h   |  3 ++-
 drivers/net/ngbe/ngbe_ethdev.c |  9 ++---
 drivers/net/ngbe/ngbe_ethdev.h |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.c |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.h |  2 +-
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 drivers/net/pfe/pfe_ethdev.c   |  8 ++--
 drivers/net/qede/qede_ethdev.c | 11 +++
 drivers/net/sfc/sfc_dp_rx.h|  2 +-
 drivers/net/sfc/sfc_ef10.h |  3 ++-
 drivers/net/sfc/sfc_ef100_rx.c |  7 ---
 drivers/net/sfc/sfc_ef10_rx.c  | 11 ++-
 drivers/net/sfc/sfc_ethdev.c   |  5 +++--
 drivers/net/sfc/sfc_rx.c   |  7 ---
 drivers/net/tap/rte_eth_tap.c  |  6 --
 drivers/net/thunderx/nicvf_ethdev.c|  8 +---
 drivers/net/txgbe/txgbe_ethdev.c   |  9 ++---
 drivers/net/txgbe/txgbe_ethdev.h   |  3 ++-
 drivers/net/txgbe/txgbe_ptypes.c   |  6 +++---
 drivers/net/txgbe/txgbe_ptypes.h   |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 14 +-
 lib/ethdev/ethdev_driver.h |  3 ++-
 lib/ethdev/rte_ethdev.c| 10 ++
 61 files changed, 295 insertions(+), 181 deletions(-)

-- 
2.25.1



[dpdk-dev v4 1/2] net/tap: fix buffer overflow for ptypes list through updation of last element.

2024-01-04 Thread Sivaramakrishnan Venkat
An incorrect ptypes list causes a buffer overflow in an Address
Sanitizer run. The last element in the ptypes lists must be
"RTE_PTYPE_UNKNOWN" for rte_eth_dev_get_supported_ptypes().
In rte_eth_dev_get_supported_ptypes(), the loop iterates until it
finds "RTE_PTYPE_UNKNOWN" to detect the last element of the ptypes
array. Fix the ptypes lists for the affected drivers.

Fixes: 0849ac3b6122 ("net/tap: add packet type management")
Fixes: a7bdc3bd4244 ("net/dpaa: support packet type parsing")
Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton")
Fixes: f3f0d77db6b0 ("net/mrvl: support packet type parsing")
Fixes: 78a38edf66de ("ethdev: query supported packet types")
Fixes: 659b494d3d88 ("net/pfe: add packet types and basic statistics")
Fixes: 398a1be14168 ("net/thunderx: remove generic passX references")
Cc: pascal.ma...@6wind.com
Cc: z...@semihalf.com
Cc: t...@semihalf.com
Cc: jianfeng@intel.com
Cc: g.si...@nxp.com
Cc: jerin.ja...@caviumnetworks.com
Cc: sta...@dpdk.org

Signed-off-by: Sivaramakrishnan Venkat 
---
 drivers/net/dpaa/dpaa_ethdev.c  | 3 ++-
 drivers/net/mvneta/mvneta_ethdev.c  | 3 ++-
 drivers/net/mvpp2/mrvl_ethdev.c | 3 ++-
 drivers/net/nfp/nfp_net_common.c| 1 +
 drivers/net/pfe/pfe_ethdev.c| 3 ++-
 drivers/net/tap/rte_eth_tap.c   | 1 +
 drivers/net/thunderx/nicvf_ethdev.c | 2 ++
 7 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index ef4c06db6a..779bdc5860 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -363,7 +363,8 @@ dpaa_supported_ptypes_get(struct rte_eth_dev *dev)
RTE_PTYPE_L4_TCP,
RTE_PTYPE_L4_UDP,
RTE_PTYPE_L4_SCTP,
-   RTE_PTYPE_TUNNEL_ESP
+   RTE_PTYPE_TUNNEL_ESP,
+   RTE_PTYPE_UNKNOWN
};
 
PMD_INIT_FUNC_TRACE();
diff --git a/drivers/net/mvneta/mvneta_ethdev.c 
b/drivers/net/mvneta/mvneta_ethdev.c
index daa69e533a..212c300c14 100644
--- a/drivers/net/mvneta/mvneta_ethdev.c
+++ b/drivers/net/mvneta/mvneta_ethdev.c
@@ -198,7 +198,8 @@ mvneta_dev_supported_ptypes_get(struct rte_eth_dev *dev 
__rte_unused)
RTE_PTYPE_L3_IPV4,
RTE_PTYPE_L3_IPV6,
RTE_PTYPE_L4_TCP,
-   RTE_PTYPE_L4_UDP
+   RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_UNKNOWN
};
 
return ptypes;
diff --git a/drivers/net/mvpp2/mrvl_ethdev.c b/drivers/net/mvpp2/mrvl_ethdev.c
index c12364941d..4cc64c7cad 100644
--- a/drivers/net/mvpp2/mrvl_ethdev.c
+++ b/drivers/net/mvpp2/mrvl_ethdev.c
@@ -1777,7 +1777,8 @@ mrvl_dev_supported_ptypes_get(struct rte_eth_dev *dev 
__rte_unused)
RTE_PTYPE_L3_IPV6_EXT,
RTE_PTYPE_L2_ETHER_ARP,
RTE_PTYPE_L4_TCP,
-   RTE_PTYPE_L4_UDP
+   RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_UNKNOWN
};
 
return ptypes;
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index e969b840d6..46d0e07850 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -1299,6 +1299,7 @@ nfp_net_supported_ptypes_get(struct rte_eth_dev *dev)
RTE_PTYPE_INNER_L4_NONFRAG,
RTE_PTYPE_INNER_L4_ICMP,
RTE_PTYPE_INNER_L4_SCTP,
+   RTE_PTYPE_UNKNOWN
};
 
if (dev->rx_pkt_burst != nfp_net_recv_pkts)
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 551f3cf193..0073dd7405 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -520,7 +520,8 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev)
RTE_PTYPE_L3_IPV6_EXT,
RTE_PTYPE_L4_TCP,
RTE_PTYPE_L4_UDP,
-   RTE_PTYPE_L4_SCTP
+   RTE_PTYPE_L4_SCTP,
+   RTE_PTYPE_UNKNOWN
};
 
if (dev->rx_pkt_burst == pfe_recv_pkts ||
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b41fa971cb..3fa03cdbee 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1803,6 +1803,7 @@ tap_dev_supported_ptypes_get(struct rte_eth_dev *dev 
__rte_unused)
RTE_PTYPE_L4_UDP,
RTE_PTYPE_L4_TCP,
RTE_PTYPE_L4_SCTP,
+   RTE_PTYPE_UNKNOWN
};
 
return ptypes;
diff --git a/drivers/net/thunderx/nicvf_ethdev.c 
b/drivers/net/thunderx/nicvf_ethdev.c
index a504d41dfe..5a0c3dc4a6 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -392,12 +392,14 @@ nicvf_dev_supported_ptypes_get(struct rte_eth_dev *dev)
RTE_PTYPE_L4_TCP,
RTE_PTYPE_L4_UDP,
RTE_PTYPE_L4_FRAG,
+   RTE_PTYPE_UNKNOWN
};
static const uint32_t ptypes_tunnel[] = {
RTE_PTYPE_TUNNEL_GRE,
RTE_PTYPE_TUNNEL_GENEVE,
RTE_PTYPE_T

[dpdk-dev v4 2/2] net/tap: fix buffer overflow for ptypes list through driver API update

2024-01-04 Thread Sivaramakrishnan Venkat
An incorrect ptypes list causes a buffer overflow in an Address
Sanitizer run. Previously, the last element in the ptypes lists was
expected to be "RTE_PTYPE_UNKNOWN" for
rte_eth_dev_get_supported_ptypes(), but this was not clearly documented
and many PMDs did not follow this convention.
Instead, the dev_supported_ptypes_get() function pointer now returns the
number of elements, eliminating the need for "RTE_PTYPE_UNKNOWN" as the
last item.

Fixes: 47909357a069 ("ethdev: make device operations struct private")
Cc: ferruh.yi...@intel.com
Cc: sta...@dpdk.org

Signed-off-by: Sivaramakrishnan Venkat 
---
 drivers/net/atlantic/atl_ethdev.c  | 13 -
 drivers/net/axgbe/axgbe_ethdev.c   | 13 -
 drivers/net/bnxt/bnxt_ethdev.c |  7 ---
 drivers/net/cnxk/cnxk_ethdev.h |  3 ++-
 drivers/net/cnxk/cnxk_lookup.c |  7 ---
 drivers/net/cpfl/cpfl_ethdev.c |  7 ---
 drivers/net/cxgbe/cxgbe_ethdev.c   | 10 ++
 drivers/net/cxgbe/cxgbe_pfvf.h |  3 ++-
 drivers/net/dpaa/dpaa_ethdev.c | 11 +++
 drivers/net/dpaa2/dpaa2_ethdev.c   | 10 ++
 drivers/net/e1000/igb_ethdev.c | 13 -
 drivers/net/enetc/enetc_ethdev.c   |  7 ---
 drivers/net/enic/enic_ethdev.c | 17 ++---
 drivers/net/failsafe/failsafe_ops.c|  5 +++--
 drivers/net/fm10k/fm10k_ethdev.c   | 15 +--
 drivers/net/hns3/hns3_rxtx.c   | 16 +---
 drivers/net/hns3/hns3_rxtx.h   |  3 ++-
 drivers/net/i40e/i40e_rxtx.c   | 11 +++
 drivers/net/i40e/i40e_rxtx.h   |  3 ++-
 drivers/net/iavf/iavf_ethdev.c | 10 ++
 drivers/net/ice/ice_dcf_ethdev.c   |  7 ---
 drivers/net/ice/ice_rxtx.c | 23 ++-
 drivers/net/ice/ice_rxtx.h |  3 ++-
 drivers/net/idpf/idpf_ethdev.c |  7 ---
 drivers/net/igc/igc_ethdev.c   | 10 ++
 drivers/net/ionic/ionic_rxtx.c |  7 ---
 drivers/net/ionic/ionic_rxtx.h |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 18 --
 drivers/net/mana/mana.c|  7 ---
 drivers/net/mlx4/mlx4.h|  3 ++-
 drivers/net/mlx4/mlx4_ethdev.c | 17 ++---
 drivers/net/mlx5/mlx5.h|  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c | 11 +++
 drivers/net/mvneta/mvneta_ethdev.c |  7 ---
 drivers/net/mvpp2/mrvl_ethdev.c|  7 ---
 drivers/net/netvsc/hn_var.h|  3 ++-
 drivers/net/netvsc/hn_vf.c |  5 +++--
 drivers/net/nfp/nfp_net_common.c   | 15 ++-
 drivers/net/nfp/nfp_net_common.h   |  3 ++-
 drivers/net/ngbe/ngbe_ethdev.c |  9 ++---
 drivers/net/ngbe/ngbe_ethdev.h |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.c |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.h |  2 +-
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 drivers/net/pfe/pfe_ethdev.c   | 11 +++
 drivers/net/qede/qede_ethdev.c | 11 +++
 drivers/net/sfc/sfc_dp_rx.h|  2 +-
 drivers/net/sfc/sfc_ef10.h |  3 ++-
 drivers/net/sfc/sfc_ef100_rx.c |  7 ---
 drivers/net/sfc/sfc_ef10_rx.c  | 11 ++-
 drivers/net/sfc/sfc_ethdev.c   |  5 +++--
 drivers/net/sfc/sfc_rx.c   |  7 ---
 drivers/net/tap/rte_eth_tap.c  |  7 ---
 drivers/net/thunderx/nicvf_ethdev.c| 10 +-
 drivers/net/txgbe/txgbe_ethdev.c   |  9 ++---
 drivers/net/txgbe/txgbe_ethdev.h   |  3 ++-
 drivers/net/txgbe/txgbe_ptypes.c   |  6 +++---
 drivers/net/txgbe/txgbe_ptypes.h   |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 14 +-
 lib/ethdev/ethdev_driver.h |  3 ++-
 lib/ethdev/rte_ethdev.c| 10 ++
 61 files changed, 299 insertions(+), 193 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c 
b/drivers/net/atlantic/atl_ethdev.c
index 3a028f4290..bc087738e4 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev);
 static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
  size_t fw_size);
 
-static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements);
 
 static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 
@@ -1132,7 +1133,8 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
 }
 
 static const uint32_t *
-atl_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+   size_t *no_of_elements)
 {
static const uint32_t ptypes[] = {
RTE_PTYPE_L2_ETHER,
@@ -1143,12 +1145,13 @@ atl_dev_support

Re: [RFC] ethdev: introduce entropy calculation

2024-01-04 Thread Thomas Monjalon
04/01/2024 15:33, Ori Kam:
> Hi Cristian,
> 
> > From: Dumitrescu, Cristian 
> > Sent: Thursday, January 4, 2024 2:57 PM
> > > > >>
> > > > >> And unless this is specifically defined as 'entropy' in spec, I am 
> > > > >> too
> > > > >> for rename.
> > > > >>
> > > > >> At least in VXLAN spec, it is mentioned that this field is to 
> > > > >> "enable a
> > > > >> level of entropy", but not exactly names it as entropy.
> > > > >
> > > > > Exactly my thought about the naming.
> > > > > Good to see I am not alone thinking this naming is disturbing :)
> > > >
> > > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> > >
> > > What about rte_flow_calc_encap_hash?
> > >
> > >
> > How about simply rte_flow_calc_hash? My understanding is this is a general-
> > purpose hash that is not limited to encapsulation work.
> 
> Unfortunately, this is not a general-purpose hash.  HW may implement a 
> different hash for each use case.
> Also, the hash result length differs depending on the feature and even the 
> target field.
> 
> We can take your naming idea and change the parameters a bit:
> rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
> 
> For the feature we will have at this point:
> NVGRE_HASH,
> SPORT_HASH 
> 
> The attribute parameter will be empty for now, but it may be used later to 
> add extra information
> for the hash if more information is required, for example, some key.
> In addition, we will also be able to merge the current function 
> rte_flow_calc_table_hash,
> if we pass the missing parameters (table id, template id) in the attribute 
> field.
> 
> What do you think?

I like the idea of having a single function for HW hashes.
Is there an impact on performance? How sensitive is it?





Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-04 Thread Thomas Monjalon
04/01/2024 15:21, Konstantin Ananyev:
> 
> > > > Introduce a new API to retrieve the number of available free descriptors
> > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > available free descriptors.
> > > >
> > > > A notable use case could be implementing Random Early Discard (RED)
> > > > in software based on Tx queue occupancy.
> > > >
> > > > Signed-off-by: Jerin Jacob 
> > >
> > > I think having an API to get the number of free descriptors per queue is 
> > > a good idea. Why have it only for TX queues and not for RX
> > queues as well?
> > 
> > I see no harm in adding for Rx as well. I think, it is better to have
> > separate API for each instead of adding argument as it is fast path
> > API.
> > If so, we could add a new API when there is any PMD implementation or
> > need for this.
> 
> I think for RX we already have similar one:
> /** @internal Get number of used descriptors on a receive queue. */
> typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);

rte_eth_rx_queue_count() gives the number of Rx used descriptors
rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
rte_eth_tx_descriptor_status() gives the status of one Tx descriptor

This patch is adding a function to get Tx available descriptors,
rte_eth_tx_queue_free_desc_get().
I can see a symmetry with rte_eth_rx_queue_count().
For consistency I would rename it to rte_eth_tx_queue_free_count().

Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?




[PATCH] event/cnxk: use WFE LDP loop for getwork routine

2024-01-04 Thread pbhagavatula
From: Pavan Nikhilesh 

Use WFE LDP loop while polling for GETWORK completion for better
power savings.
Disabled by default and can be enabled by setting
`RTE_ARM_USE_WFE` to `true` in `config/arm/meson.build`

Signed-off-by: Pavan Nikhilesh 
---
 doc/guides/eventdevs/cnxk.rst |  9 ++
 drivers/event/cnxk/cn10k_worker.h | 52 +--
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/doc/guides/eventdevs/cnxk.rst b/doc/guides/eventdevs/cnxk.rst
index cccb8a0304..d62c143c77 100644
--- a/doc/guides/eventdevs/cnxk.rst
+++ b/doc/guides/eventdevs/cnxk.rst
@@ -198,6 +198,15 @@ Runtime Config Options
 
 -a 0002:0e:00.0,tim_eclk_freq=12288-10-0
 
+Power Savings on CN10K
+--
+
+ARM cores can additionally use WFE when polling for transactions on the SSO bus
+to save power, i.e., in the event dequeue call the ARM core can enter WFE and
+exit when either work has been scheduled or the dequeue timeout has been reached.
+This can be enabled by setting ``RTE_ARM_USE_WFE`` to ``true`` in
+``config/arm/meson.build``.
+
 Debugging Options
 -
 
diff --git a/drivers/event/cnxk/cn10k_worker.h 
b/drivers/event/cnxk/cn10k_worker.h
index 8aa916fa12..92d5190842 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -250,23 +250,57 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct 
rte_event *ev,
 
gw.get_work = ws->gw_wdata;
 #if defined(RTE_ARCH_ARM64)
-#if !defined(__clang__)
-   asm volatile(
-   PLT_CPU_FEATURE_PREAMBLE
-   "caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
-   : [wdata] "+r"(gw.get_work)
-   : [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
-   : "memory");
-#else
+#if defined(__clang__)
register uint64_t x0 __asm("x0") = (uint64_t)gw.u64[0];
register uint64_t x1 __asm("x1") = (uint64_t)gw.u64[1];
+#if defined(RTE_ARM_USE_WFE)
+   plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+   asm volatile(PLT_CPU_FEATURE_PREAMBLE
+"  ldp %[x0], %[x1], [%[tag_loc]]  \n"
+"  tbz %[x0], %[pend_gw], done%=   \n"
+"  sevl\n"
+"rty%=:wfe \n"
+"  ldp %[x0], %[x1], [%[tag_loc]]  \n"
+"  tbnz %[x0], %[pend_gw], rty%=   \n"
+"done%=:   \n"
+"  dmb ld  \n"
+: [x0] "+r" (x0), [x1] "+r" (x1)
+: [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+  [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+: "memory");
+#else
asm volatile(".arch armv8-a+lse\n"
 "caspal %[x0], %[x1], %[x0], %[x1], [%[dst]]\n"
-: [x0] "+r"(x0), [x1] "+r"(x1)
+: [x0] "+r" (x0), [x1] "+r" (x1)
 : [dst] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
 : "memory");
+#endif
gw.u64[0] = x0;
gw.u64[1] = x1;
+#else
+#if defined(RTE_ARM_USE_WFE)
+   plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+   asm volatile(PLT_CPU_FEATURE_PREAMBLE
+"  ldp %[wdata], %H[wdata], [%[tag_loc]]   \n"
+"  tbz %[wdata], %[pend_gw], done%=\n"
+"  sevl\n"
+"rty%=:wfe \n"
+"  ldp %[wdata], %H[wdata], [%[tag_loc]]   \n"
+"  tbnz %[wdata], %[pend_gw], rty%=\n"
+"done%=:   \n"
+"  dmb ld  \n"
+: [wdata] "=&r"(gw.get_work)
+: [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+  [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+: "memory");
+#else
+   asm volatile(
+   PLT_CPU_FEATURE_PREAMBLE
+   "caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
+   : [wdata] "+r"(gw.get_work)
+   : [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
+   : "memory");
+#endif
 #endif
 #else
plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
-- 
2.25.1



Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> From: Madhuker Mythri 
> 
> When multiple queues configured, internally RSS will be enabled and thus TAP 
> BPF RSS byte-code will be loaded on to the Kernel using BPF system calls.
> 
> Here, the problem is loading the existing BPF byte-code to the Kernel-5.15 
> and above versions throws errors, i.e: Kernel BPF verifier not accepted this 
> existing BPF byte-code and system calls return error code "-7" as follows:
> 
> rss_add_actions(): Failed to load BPF section l3_l4 (7): Argument list too 
> long
> 
> 
> RCA:  These errors started coming after from the Kernel-5.15 version, in 
> which lots of new BPF verification restrictions were added for safe execution 
> of byte-code on to the Kernel, due to which existing BPF program verification 
> does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use new BPF maps structure.
> 2) Kernel SKB data pointer access not allowed.
> 3) Undefined loops were not allowed(which are bounded by a variable value).
> 4) unreachable instructions(like: undefined array access).
> 
> After addressing all these Kernel BPF verifier restrictions able to load the 
> BPF byte-code onto the Kernel successfully.
> 
> Note: This new BPF changes supports from Kernel:4.10 version.
> 
> Bugzilla Id: 1329
> 
> Signed-off-by: Madhuker Mythri 
> ---
>  drivers/net/tap/bpf/tap_bpf_program.c |  243 +-
>  drivers/net/tap/tap_bpf_api.c |4 +-
>  drivers/net/tap/tap_bpf_insns.h   | 3781 ++---
>  3 files changed, 2151 insertions(+), 1877 deletions(-)

Patch has trailing whitespace, git complains:
$ git am /tmp/bpf.mbox
Applying: net/tap: Modified TAP BPF program as per the new Kernel-version 
upgrade requirements.
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:98: 
trailing whitespace.
// queue match 
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:243: 
trailing whitespace.
/** Is IP fragmented **/ 
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:326: 
trailing whitespace.
/*  bpf_printk("> rss_l3_l4 hash=0x%x queue:1=%u\n", hash, queue); */ 
warning: 3 lines add whitespace errors.




Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "tap_rss.h"
>  

This change in headers breaks the use of make in the bpf directory.


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> +#include 
> +#include 
> +#include 
> +#include 

The original code copied the bpf headers from distro (was bad idea).
This should be fixed in tap driver to make sure that there is no mismatch.


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> -static __u32  __attribute__((always_inline))
> -rte_softrss_be(const __u32 *input_tuple, const uint8_t *rss_key,
> - __u8 input_len)
> +static __u64  __attribute__((always_inline))
> +rte_softrss_be(const __u32 *input_tuple, __u8 input_len)

Why the change to u64?
This is not part of the bug fix and not how RSS is defined.


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> diff --git a/drivers/net/tap/tap_bpf_insns.h b/drivers/net/tap/tap_bpf_insns.h
> index 53fa76c4e6..b3dc11b901 100644
> --- a/drivers/net/tap/tap_bpf_insns.h
> +++ b/drivers/net/tap/tap_bpf_insns.h
> @@ -1,10 +1,10 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
> - * Auto-generated from tap_bpf_program.c
> - * This not the original source file. Do NOT edit it.
> + * Copyright 2017 Mellanox Technologies, Ltd
>   */

Why the Mellanox copyright addition, the python auto-generator does not add it?

Overall, it looks like you did not work with existing TAP BPF code but went back
to some other code you had.


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> - /* Get correct proto for 802.1ad */
> - if (skb->vlan_present && skb->vlan_proto == htons(ETH_P_8021AD)) {
> - if (data + ETH_ALEN * 2 + sizeof(struct vlan_hdr) +
> - sizeof(proto) > data_end)
> - return TC_ACT_OK;
> - proto = *(__u16 *)(data + ETH_ALEN * 2 +
> -sizeof(struct vlan_hdr));
> - off += sizeof(struct vlan_hdr);
> - }

Your version loses VLAN support?


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

>  {
> - void *data_end = (void *)(long)skb->data_end;
> - void *data = (void *)(long)skb->data;
> - __u16 proto = (__u16)skb->protocol;
> +struct neth nh;
> +struct net6h n6h;

Sloppy non-standard indentation.

And original code would work with tunnels, this won't


Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.

2024-01-04 Thread Stephen Hemminger
On Thu,  4 Jan 2024 22:57:56 +0530
madhuker.myt...@oracle.com wrote:

> 
> 
> RCA:  These errors started coming after from the Kernel-5.15 version, in 
> which lots of new BPF verification restrictions were added for safe execution 
> of byte-code on to the Kernel, due to which existing BPF program verification 
> does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use new BPF maps structure.
> 2) Kernel SKB data pointer access not allowed.

I noticed you are now using bpf_skb_load_bytes(), but the bpf helper man page
implies it is not needed.

 long bpf_skb_load_bytes(const void *skb, u32 offset, void *to,
   u32 len)

  Description
 This helper was provided as an easy way to load
 data from a packet. It can be used to load len
 bytes from offset from the packet associated to
 skb, into the buffer pointed by to.

 Since Linux 4.7, usage of this helper has mostly
 been replaced by "direct packet access", enabling
 packet data to be manipulated with skb->data and
 skb->data_end pointing respectively to the first
 byte of packet data and to the byte after the last
 byte of packet data. However, it remains useful if
 one wishes to read large quantities of data at once
 from a packet into the eBPF stack.

  Return 0 on success, or a negative error in case of


Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query

2024-01-04 Thread Thomas Monjalon
19/12/2023 18:29, jer...@marvell.com:
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
> 
>  Timesync =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx free descriptor query =

I think we can drop "query" here.


> +__rte_experimental
> +static inline uint32_t
> +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)

For consistency with rte_eth_rx_queue_count(),
I propose the name rte_eth_tx_queue_free_count().





[PATCH v2 1/2] app/test-crypto-perf: fix invalid memcmp results

2024-01-04 Thread Suanming Mou
The function memcmp() returns an integer less than, equal to,
or greater than zero. In the current code, if the first memcmp()
returns less than zero and the second memcmp() returns greater
than zero, the sum of the results may still be 0, which incorrectly
indicates that verification succeeded.

This commit normalizes each memcmp() return value to zero or one.
That makes sure the sum of the results is correct.

Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type")

Signed-off-by: Suanming Mou 
Acked-by: Anoob Joseph 
---
 app/test-crypto-perf/cperf_test_verify.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/app/test-crypto-perf/cperf_test_verify.c 
b/app/test-crypto-perf/cperf_test_verify.c
index a6c0ffe813..8aa714b969 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -186,18 +186,18 @@ cperf_verify_op(struct rte_crypto_op *op,
 
if (cipher == 1) {
if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT)
-   res += memcmp(data + cipher_offset,
+   res += !!memcmp(data + cipher_offset,
vector->ciphertext.data,
options->test_buffer_size);
else
-   res += memcmp(data + cipher_offset,
+   res += !!memcmp(data + cipher_offset,
vector->plaintext.data,
options->test_buffer_size);
}
 
if (auth == 1) {
if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE)
-   res += memcmp(data + auth_offset,
+   res += !!memcmp(data + auth_offset,
vector->digest.data,
options->digest_sz);
}
-- 
2.34.1



[PATCH v2 2/2] app/test-crypto-perf: fix encrypt operation verify

2024-01-04 Thread Suanming Mou
AEAD uses RTE_CRYPTO_AEAD_OP_* with aead_op and CIPHER uses
RTE_CRYPTO_CIPHER_OP_* with cipher_op in current code.

This commit aligns aead_op and cipher_op operation to fix
incorrect AEAD verification.

Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type")

Signed-off-by: Suanming Mou 
---

v2: align auth/cipher to bool.

---
 app/test-crypto-perf/cperf_test_verify.c | 55 
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index 8aa714b969..2b0d3f142b 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -111,8 +111,10 @@ cperf_verify_op(struct rte_crypto_op *op,
uint32_t len;
uint16_t nb_segs;
uint8_t *data;
-   uint32_t cipher_offset, auth_offset;
-   uint8_t cipher, auth;
+   uint32_t cipher_offset, auth_offset = 0;
+   bool cipher = false;
+   bool digest_verify = false;
+   bool is_encrypt = false;
int res = 0;
 
if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS)
@@ -150,42 +152,43 @@ cperf_verify_op(struct rte_crypto_op *op,
 
switch (options->op_type) {
case CPERF_CIPHER_ONLY:
-   cipher = 1;
+   cipher = true;
cipher_offset = 0;
-   auth = 0;
-   auth_offset = 0;
-   break;
-   case CPERF_CIPHER_THEN_AUTH:
-   cipher = 1;
-   cipher_offset = 0;
-   auth = 1;
-   auth_offset = options->test_buffer_size;
+   is_encrypt = options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT;
break;
case CPERF_AUTH_ONLY:
-   cipher = 0;
cipher_offset = 0;
-   auth = 1;
-   auth_offset = options->test_buffer_size;
+   if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE) {
+   auth_offset = options->test_buffer_size;
+   digest_verify = true;
+   }
break;
+   case CPERF_CIPHER_THEN_AUTH:
case CPERF_AUTH_THEN_CIPHER:
-   cipher = 1;
+   cipher = true;
cipher_offset = 0;
-   auth = 1;
-   auth_offset = options->test_buffer_size;
+   if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) {
+   auth_offset = options->test_buffer_size;
+   digest_verify = true;
+   is_encrypt = true;
+   }
break;
case CPERF_AEAD:
-   cipher = 1;
+   cipher = true;
cipher_offset = 0;
-   auth = 1;
-   auth_offset = options->test_buffer_size;
+   if (options->aead_op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+   auth_offset = options->test_buffer_size;
+   digest_verify = true;
+   is_encrypt = true;
+   }
break;
default:
res = 1;
goto out;
}
 
-   if (cipher == 1) {
-   if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT)
+   if (cipher) {
+   if (is_encrypt)
res += !!memcmp(data + cipher_offset,
vector->ciphertext.data,
options->test_buffer_size);
@@ -195,12 +198,8 @@ cperf_verify_op(struct rte_crypto_op *op,
options->test_buffer_size);
}
 
-   if (auth == 1) {
-   if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE)
-   res += !!memcmp(data + auth_offset,
-   vector->digest.data,
-   options->digest_sz);
-   }
+   if (digest_verify)
+   res += !!memcmp(data + auth_offset, vector->digest.data,
+   options->digest_sz);
 
 out:
rte_free(data);
-- 
2.34.1
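The core of the fix (AEAD consulting aead_op instead of cipher_op when deciding whether to verify the digest and compare against ciphertext) can be sketched as a standalone mapping. The enum and flag names below are simplified, hypothetical stand-ins, not the actual cperf types:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the cperf option enums (hypothetical names). */
enum op_type {
	CIPHER_ONLY, AUTH_ONLY, CIPHER_THEN_AUTH, AUTH_THEN_CIPHER, AEAD
};

struct verify_flags {
	bool cipher;		/* compare cipher output buffer */
	bool digest_verify;	/* compare generated digest */
	bool is_encrypt;	/* compare against ciphertext, not plaintext */
};

/* Mirror the patched switch: AEAD consults the AEAD op flag, not the
 * cipher op flag, which was the source of the incorrect verification. */
static struct verify_flags classify(enum op_type t, bool cipher_encrypt,
				    bool auth_generate, bool aead_encrypt)
{
	struct verify_flags f = { false, false, false };

	switch (t) {
	case CIPHER_ONLY:
		f.cipher = true;
		f.is_encrypt = cipher_encrypt;
		break;
	case AUTH_ONLY:
		f.digest_verify = auth_generate;
		break;
	case CIPHER_THEN_AUTH:
	case AUTH_THEN_CIPHER:
		f.cipher = true;
		if (cipher_encrypt) {
			f.digest_verify = true;
			f.is_encrypt = true;
		}
		break;
	case AEAD:
		f.cipher = true;
		if (aead_encrypt) {	/* the fix: use the AEAD op flag */
			f.digest_verify = true;
			f.is_encrypt = true;
		}
		break;
	}
	return f;
}
```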



RE: [PATCH v8 2/2] net/iavf: add diagnostic support in TX path

2024-01-04 Thread Zhang, Qi Z



> -Original Message-
> From: Mingjin Ye 
> Sent: Wednesday, January 3, 2024 6:11 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Ye, MingjinX
> ; Su, Simei ; Wu, Wenjun1
> ; Zhang, Yuying ; Xing,
> Beilei ; Wu, Jingjing 
> Subject: [PATCH v8 2/2] net/iavf: add diagnostic support in TX path
> 
> The only way to enable diagnostics for TX paths is to modify the application
> source code, making it difficult to diagnose faults.
> 
> In this patch, the devarg option "mbuf_check" is introduced and the
> parameters are configured to enable the corresponding diagnostics.
> 
> supported cases: mbuf, size, segment, offload.
>  1. mbuf: check for corrupted mbuf.
>  2. size: check min/max packet length according to hw spec.
>  3. segment: check number of mbuf segments not exceed hw limitation.
>  4. offload: check any unsupported offload flag.
> 
> parameter format: mbuf_check=[mbuf,,]
> eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i
> 
> Signed-off-by: Mingjin Ye 
> ---
> v2: Remove call chain.
> ---
> v3: Optimisation implementation.
> ---
> v4: Fix Windows os compilation error.
> ---
> v5: Split Patch.
> ---
> v6: remove strict.
> ---
> v7: Modify the description document.
> ---
>  doc/guides/nics/intel_vf.rst   |  9 
>  drivers/net/iavf/iavf.h| 12 +
>  drivers/net/iavf/iavf_ethdev.c | 76 ++
>  drivers/net/iavf/iavf_rxtx.c   | 98 ++
>  drivers/net/iavf/iavf_rxtx.h   |  2 +
>  5 files changed, 197 insertions(+)
> 
> diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst index
> ad08198f0f..bda6648726 100644
> --- a/doc/guides/nics/intel_vf.rst
> +++ b/doc/guides/nics/intel_vf.rst
> @@ -111,6 +111,15 @@ For more detail on SR-IOV, please refer to the
> following documents:
>  by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-
> down=1``
>  when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700
> Series Ethernet device.
> 
> +When IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700
> series Ethernet devices.
> +Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. 
> For
> example,
> +``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``.
> Supported cases:

``-a 18:01.0,mbuf_check=`` or ``-a 
18:01.0,mbuf_check=[,...]``

> +
> +*   mbuf: Check for corrupted mbuf.
> +*   size: Check min/max packet length according to hw spec.
> +*   segment: Check number of mbuf segments not exceed hw limitation.
> +*   offload: Check any unsupported offload flag.

Please also describe how to get the error count via xstats_get; a testpmd
command is suggested.

Btw, PATCH 1/2 has already been merged separately as a fix, so the new version
should target this patch only.


RE: [PATCH v2] net/e1000: support launchtime feature

2024-01-04 Thread Zhang, Qi Z



> -Original Message-
> From: Su, Simei 
> Sent: Thursday, January 4, 2024 11:13 AM
> To: Chuanyu Xue ; Lu, Wenzhuo
> ; Zhang, Qi Z ; Xing, Beilei
> 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v2] net/e1000: support launchtime feature
> 
> 
> > -Original Message-
> > From: Chuanyu Xue 
> > Sent: Sunday, December 31, 2023 12:35 AM
> > To: Su, Simei ; Lu, Wenzhuo
> > ; Zhang, Qi Z ; Xing,
> > Beilei 
> > Cc: dev@dpdk.org; Chuanyu Xue 
> > Subject: [PATCH v2] net/e1000: support launchtime feature
> >
> > Enable the time-based scheduled Tx of packets based on the
> > RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP flag. The launchtime defines
> the
> > packet transmission time based on PTP clock at MAC layer, which should
> > be set to the advanced transmit descriptor.
> >
> > Signed-off-by: Chuanyu Xue 
> > ---
> > change log:
> >
> > v2:
> > - Add delay compensation for i210 NIC by setting tx offset register.
> > - Revise read_clock function.
> >
> >  drivers/net/e1000/base/e1000_regs.h |  1 +
> >  drivers/net/e1000/e1000_ethdev.h| 14 +++
> >  drivers/net/e1000/igb_ethdev.c  | 63
> > -
> >  drivers/net/e1000/igb_rxtx.c| 42 +++
> >  4 files changed, 112 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/e1000/base/e1000_regs.h
> > b/drivers/net/e1000/base/e1000_regs.h
> > index d44de59c29..092d9d71e6 100644
> > --- a/drivers/net/e1000/base/e1000_regs.h
> > +++ b/drivers/net/e1000/base/e1000_regs.h
> > @@ -162,6 +162,7 @@
> >
> >  /* QAV Tx mode control register */
> >  #define E1000_I210_TQAVCTRL0x3570
> > +#define E1000_I210_LAUNCH_OS0 0x3578
> >
> >  /* QAV Tx mode control register bitfields masks */
> >  /* QAV enable */
> > diff --git a/drivers/net/e1000/e1000_ethdev.h
> > b/drivers/net/e1000/e1000_ethdev.h
> > index 718a9746ed..339ae1f4b6 100644
> > --- a/drivers/net/e1000/e1000_ethdev.h
> > +++ b/drivers/net/e1000/e1000_ethdev.h
> > @@ -382,6 +382,20 @@ extern struct igb_rss_filter_list
> > igb_filter_rss_list; TAILQ_HEAD(igb_flow_mem_list, igb_flow_mem);
> > extern struct igb_flow_mem_list igb_flow_list;
> >
> > +/*
> > + * Macros to compensate the constant latency observed in i210 for
> > +launch time
> > + *
> > + * launch time = (offset_speed - offset_base + txtime) * 32
> > + * offset_speed is speed dependent, set in E1000_I210_LAUNCH_OS0  */
> > +#define IGB_I210_TX_OFFSET_BASE0xffe0
> > +#define IGB_I210_TX_OFFSET_SPEED_100xc7a0
> > +#define IGB_I210_TX_OFFSET_SPEED_100   0x86e0
> > +#define IGB_I210_TX_OFFSET_SPEED_1000  0xbe00
> > +
> > +extern uint64_t igb_tx_timestamp_dynflag; extern int
> > +igb_tx_timestamp_dynfield_offset;
> > +
> >  extern const struct rte_flow_ops igb_flow_ops;
> >
> >  /*
> > diff --git a/drivers/net/e1000/igb_ethdev.c
> > b/drivers/net/e1000/igb_ethdev.c index 8858f975f8..2262035710 100644
> > --- a/drivers/net/e1000/igb_ethdev.c
> > +++ b/drivers/net/e1000/igb_ethdev.c
> > @@ -223,6 +223,7 @@ static int igb_timesync_read_time(struct
> > rte_eth_dev *dev,
> >   struct timespec *timestamp);
> >  static int igb_timesync_write_time(struct rte_eth_dev *dev,
> >const struct timespec *timestamp);
> > +static int eth_igb_read_clock(struct rte_eth_dev *dev, uint64_t
> > +*clock);
> >  static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev,
> > uint16_t queue_id);
> >  static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, @@
> > -313,6
> > +314,9 @@ static const struct rte_pci_id pci_id_igbvf_map[] = {
> > { .vendor_id = 0, /* sentinel */ },
> >  };
> >
> > +uint64_t igb_tx_timestamp_dynflag;
> > +int igb_tx_timestamp_dynfield_offset = -1;
> > +
> >  static const struct rte_eth_desc_lim rx_desc_lim = {
> > .nb_max = E1000_MAX_RING_DESC,
> > .nb_min = E1000_MIN_RING_DESC,
> > @@ -389,6 +393,7 @@ static const struct eth_dev_ops eth_igb_ops = {
> > .timesync_adjust_time = igb_timesync_adjust_time,
> > .timesync_read_time   = igb_timesync_read_time,
> > .timesync_write_time  = igb_timesync_write_time,
> > +   .read_clock   = eth_igb_read_clock,
> >  };
> >
> >  /*
> > @@ -1188,6 +1193,40 @@ eth_igb_rxtx_control(struct rte_eth_dev *dev,
> > E1000_WRITE_FLUSH(hw);
> >  }
> >
> > +
> > +static uint32_t igb_tx_offset(struct rte_eth_dev *dev) {
> > +   struct e1000_hw *hw =
> > +   E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > +
> > +   uint16_t duplex, speed;
> > +   hw->mac.ops.get_link_up_info(hw, &speed, &duplex);
> > +
> > +   uint32_t launch_os0 = E1000_READ_REG(hw,
> E1000_I210_LAUNCH_OS0);
> > +   if (hw->mac.type != e1000_i210) {
> > +   /* Set launch offset to base, no compensation */
> > +   launch_os0 |= IGB_I210_TX_OFFSET_BASE;
> > +   } else {
> > +   /* Set launch offset depend on link speeds */
> > +   sw
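The compensation formula in the quoted comment, launch time = (offset_speed - offset_base + txtime) * 32, can be checked with plain arithmetic. The sketch below assumes the offsets combine as 16-bit register values; that wrap-around behavior is an assumption for illustration, not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Offset constants from the patch (i210 launch-time compensation). */
#define IGB_I210_TX_OFFSET_BASE       0xffe0
#define IGB_I210_TX_OFFSET_SPEED_10   0xc7a0
#define IGB_I210_TX_OFFSET_SPEED_100  0x86e0
#define IGB_I210_TX_OFFSET_SPEED_1000 0xbe00

/* launch time = (offset_speed - offset_base + txtime) * 32,
 * with the subtraction performed in 16-bit register arithmetic
 * (an assumption made for this standalone sketch). */
static uint32_t launch_time(uint16_t offset_speed, uint16_t txtime)
{
	uint16_t delta = (uint16_t)(offset_speed - IGB_I210_TX_OFFSET_BASE);

	return ((uint32_t)delta + txtime) * 32;
}
```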

[PATCH] doc: update default value for config parameter

2024-01-04 Thread Simei Su
Update documentation value to match default value in code base.

Signed-off-by: Simei Su 
---
 doc/guides/prog_guide/ip_fragment_reassembly_lib.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst 
b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
index 314d4ad..b14289e 100644
--- a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
+++ b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst
@@ -43,7 +43,7 @@ Note that all update/lookup operations on Fragment Table are 
not thread safe.
 So if different execution contexts (threads/processes) will access the same 
table simultaneously,
 then some external syncing mechanism have to be provided.
 
-Each table entry can hold information about packets consisting of up to RTE_LIBRTE_IP_FRAG_MAX (by default: 4) fragments.
+Each table entry can hold information about packets consisting of up to RTE_LIBRTE_IP_FRAG_MAX (by default: 8) fragments.
 
 Code example, that demonstrates creation of a new Fragment table:
 
-- 
2.9.5



RE: [EXT] [PATCH v2 2/2] app/test-crypto-perf: fix encrypt operation verify

2024-01-04 Thread Anoob Joseph
> AEAD uses RTE_CRYPTO_AEAD_OP_* with aead_op and CIPHER uses
> RTE_CRYPTO_CIPHER_OP_* with cipher_op in current code.
> 
> This commit aligns aead_op and cipher_op operation to fix incorrect AEAD
> verification.
> 
> Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type")
> 
> Signed-off-by: Suanming Mou 

Acked-by: Anoob Joseph 




[PATCH] net/ice: refine queue start stop

2024-01-04 Thread Qi Zhang
It is not necessary to return failure when starting or stopping a queue
if the queue is already in the required state.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_rxtx.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index 73e47ae92d..3286bb08fe 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -673,6 +673,10 @@ ice_rx_queue_start(struct rte_eth_dev *dev, uint16_t 
rx_queue_id)
return -EINVAL;
}
 
+   if (dev->data->rx_queue_state[rx_queue_id] ==
+   RTE_ETH_QUEUE_STATE_STARTED)
+   return 0;
+
if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP)
rxq->ts_enable = true;
err = ice_program_hw_rx_queue(rxq);
@@ -717,6 +721,10 @@ ice_rx_queue_stop(struct rte_eth_dev *dev, uint16_t 
rx_queue_id)
if (rx_queue_id < dev->data->nb_rx_queues) {
rxq = dev->data->rx_queues[rx_queue_id];
 
+   if (dev->data->rx_queue_state[rx_queue_id] ==
+   RTE_ETH_QUEUE_STATE_STOPPED)
+   return 0;
+
err = ice_switch_rx_queue(hw, rxq->reg_idx, false);
if (err) {
PMD_DRV_LOG(ERR, "Failed to switch RX queue %u off",
@@ -758,6 +766,10 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t 
tx_queue_id)
return -EINVAL;
}
 
+   if (dev->data->tx_queue_state[tx_queue_id] ==
+   RTE_ETH_QUEUE_STATE_STARTED)
+   return 0;
+
buf_len = ice_struct_size(txq_elem, txqs, 1);
txq_elem = ice_malloc(hw, buf_len);
if (!txq_elem)
@@ -1066,6 +1078,10 @@ ice_tx_queue_stop(struct rte_eth_dev *dev, uint16_t 
tx_queue_id)
return -EINVAL;
}
 
+   if (dev->data->tx_queue_state[tx_queue_id] ==
+   RTE_ETH_QUEUE_STATE_STOPPED)
+   return 0;
+
q_ids[0] = txq->reg_idx;
q_teids[0] = txq->q_teid;
 
-- 
2.31.1
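The early-return pattern added by this patch (start/stop becomes a successful no-op when the queue is already in the requested state) can be sketched generically; the struct and hw_ops counter below are hypothetical illustrations, not driver code:

```c
#include <assert.h>

enum queue_state { QUEUE_STOPPED, QUEUE_STARTED };

struct queue {
	enum queue_state state;
	int hw_ops;	/* counts (hypothetical) hardware programming calls */
};

/* Idempotent start: succeed without touching HW if already started. */
static int queue_start(struct queue *q)
{
	if (q->state == QUEUE_STARTED)
		return 0;	/* already at required state: not a failure */
	q->hw_ops++;		/* program the HW queue here */
	q->state = QUEUE_STARTED;
	return 0;
}

/* Idempotent stop: mirror image of queue_start(). */
static int queue_stop(struct queue *q)
{
	if (q->state == QUEUE_STOPPED)
		return 0;
	q->hw_ops++;		/* switch the HW queue off here */
	q->state = QUEUE_STOPPED;
	return 0;
}
```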



[PATCH 0/3] net/ice: simplified to 3 layer Tx scheduler.

2024-01-04 Thread Qi Zhang
Remove dummy layers, refactor code, and complete the documentation.

Qi Zhang (3):
  net/ice: hide port and TC layer in Tx sched tree
  net/ice: refactor tm config data structure
  doc: update ice document for qos

 doc/guides/nics/ice.rst  |  19 +++
 drivers/net/ice/ice_ethdev.h |  12 +-
 drivers/net/ice/ice_tm.c | 285 +++
 3 files changed, 112 insertions(+), 204 deletions(-)

-- 
2.31.1



[PATCH 1/3] net/ice: hide port and TC layer in Tx sched tree

2024-01-04 Thread Qi Zhang
In the current 5-layer tree implementation, the port and TC layers
are not configurable, so it is not necessary to expose them to the application.

The patch hides the top 2 layers and represents the root of the tree at
the VSI layer. From the application's point of view, it is a 3-layer scheduler tree:

Port -> Queue Group -> Queue.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |  7 
 drivers/net/ice/ice_tm.c | 79 
 2 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index fa4981ed14..ae22c29ffc 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -470,7 +470,6 @@ struct ice_tm_shaper_profile {
 struct ice_tm_node {
TAILQ_ENTRY(ice_tm_node) node;
uint32_t id;
-   uint32_t tc;
uint32_t priority;
uint32_t weight;
uint32_t reference_count;
@@ -484,8 +483,6 @@ struct ice_tm_node {
 /* node type of Traffic Manager */
 enum ice_tm_node_type {
ICE_TM_NODE_TYPE_PORT,
-   ICE_TM_NODE_TYPE_TC,
-   ICE_TM_NODE_TYPE_VSI,
ICE_TM_NODE_TYPE_QGROUP,
ICE_TM_NODE_TYPE_QUEUE,
ICE_TM_NODE_TYPE_MAX,
@@ -495,12 +492,8 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list tc_list; /* node list for all the TCs */
-   struct ice_tm_node_list vsi_list; /* node list for all the VSIs */
struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_tc_node;
-   uint32_t nb_vsi_node;
uint32_t nb_qgroup_node;
uint32_t nb_queue_node;
bool committed;
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index b570798f07..7ae68c683b 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.tc_list);
-   TAILQ_INIT(&pf->tm_conf.vsi_list);
TAILQ_INIT(&pf->tm_conf.qgroup_list);
TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_tc_node = 0;
-   pf->tm_conf.nb_vsi_node = 0;
pf->tm_conf.nb_qgroup_node = 0;
pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
@@ -72,16 +68,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev)
rte_free(tm_node);
}
pf->tm_conf.nb_qgroup_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_vsi_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_tc_node = 0;
if (pf->tm_conf.root) {
rte_free(pf->tm_conf.root);
pf->tm_conf.root = NULL;
@@ -93,8 +79,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
uint32_t node_id, enum ice_tm_node_type *node_type)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list;
-   struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list;
struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
struct ice_tm_node *tm_node;
@@ -104,20 +88,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
return pf->tm_conf.root;
}
 
-   TAILQ_FOREACH(tm_node, tc_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_TC;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, vsi_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_VSI;
-   return tm_node;
-   }
-   }
-
TAILQ_FOREACH(tm_node, qgroup_list, node) {
if (tm_node->id == node_id) {
*node_type = ICE_TM_NODE_TYPE_QGROUP;
@@ -371,6 +341,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev,
return 0;
 }
 
+#define MAX_QUEUE_PER_GROUP8
+
 static int
 ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
  uint32_t parent_node_id, uint32_t priority,
@@ -384,8 +356,6 @@ ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
struct ice_tm_shaper_profile *shaper_profile = NULL;
struct ice_tm_node *tm_node;
struct ice_tm_node *parent_node;

[PATCH 2/3] net/ice: refactor tm config data structure

2024-01-04 Thread Qi Zhang
Simplify struct ice_tm_conf by removing the per-level node lists.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |   5 +-
 drivers/net/ice/ice_tm.c | 210 +++
 2 files changed, 88 insertions(+), 127 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index ae22c29ffc..008a7a23b9 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -472,6 +472,7 @@ struct ice_tm_node {
uint32_t id;
uint32_t priority;
uint32_t weight;
+   uint32_t level;
uint32_t reference_count;
struct ice_tm_node *parent;
struct ice_tm_node **children;
@@ -492,10 +493,6 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
-   struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_qgroup_node;
-   uint32_t nb_queue_node;
bool committed;
bool clear_on_fail;
 };
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index 7ae68c683b..7c662f8a85 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.qgroup_list);
-   TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_qgroup_node = 0;
-   pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
pf->tm_conf.clear_on_fail = false;
 }
 
-void
-ice_tm_conf_uninit(struct rte_eth_dev *dev)
+static void free_node(struct ice_tm_node *root)
 {
-   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node *tm_node;
+   uint32_t i;
 
-   /* clear node configuration */
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_queue_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_qgroup_node = 0;
-   if (pf->tm_conf.root) {
-   rte_free(pf->tm_conf.root);
-   pf->tm_conf.root = NULL;
-   }
+   if (root == NULL)
+   return;
+
+   for (i = 0; i < root->reference_count; i++)
+   free_node(root->children[i]);
+
+   rte_free(root);
 }
 
-static inline struct ice_tm_node *
-ice_tm_node_search(struct rte_eth_dev *dev,
-   uint32_t node_id, enum ice_tm_node_type *node_type)
+void
+ice_tm_conf_uninit(struct rte_eth_dev *dev)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
-   struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
-   struct ice_tm_node *tm_node;
-
-   if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_PORT;
-   return pf->tm_conf.root;
-   }
 
-   TAILQ_FOREACH(tm_node, qgroup_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QGROUP;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, queue_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QUEUE;
-   return tm_node;
-   }
-   }
-
-   return NULL;
+   free_node(pf->tm_conf.root);
+   pf->tm_conf.root = NULL;
 }
 
 static int
@@ -195,11 +159,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id,
return 0;
 }
 
+static struct ice_tm_node *
+find_node(struct ice_tm_node *root, uint32_t id)
+{
+   uint32_t i;
+
+   if (root == NULL || root->id == id)
+   return root;
+
+   for (i = 0; i < root->reference_count; i++) {
+   struct ice_tm_node *node = find_node(root->children[i], id);
+
+   if (node)
+   return node;
+   }
+
+   return NULL;
+}
+
 static int
 ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id,
   int *is_leaf, struct rte_tm_error *error)
 {
-   enum ice_tm_node_type node_type = ICE_TM_NODE_TYPE_MAX;
+   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct ice_tm_node *tm_node;
 
if (!is_leaf || !error)
@@ -212,14 +194,14 @@ ice_node_type_get(struct rte_eth_dev *dev, uint32_t 
node_id,
}
 
/* check if the node id exists 

[PATCH 3/3] doc: update ice document for qos

2024-01-04 Thread Qi Zhang
Add description for ice PMD's rte_tm capabilities.

Signed-off-by: Qi Zhang 
---
 doc/guides/nics/ice.rst | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index bafb3ba022..1f737a009c 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -352,6 +352,25 @@ queue 3 using a raw pattern::
 
 Currently, raw pattern support is limited to the FDIR and Hash engines.
 
+Traffic Management Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ice PMD provides support for the Traffic Management API (RTE_TM),
+allowing users to offload a 3-layer Tx scheduler on the E810 NIC:
+
+- ``Port Layer``
+
+  This is the root layer. It supports peak bandwidth configuration,
+  with up to 32 children.
+
+- ``Queue Group Layer``
+
+  The middle layer. It supports peak / committed bandwidth, weight and
+  priority configurations, with up to 8 children.
+
+- ``Queue Layer``
+
+  The leaf layer. It supports peak / committed bandwidth, weight and
+  priority configurations.
+
 Additional Options
 ++
 
-- 
2.31.1



[PATCH v2 0/3] net/ice: simplified to 3 layer Tx scheduler

2024-01-04 Thread Qi Zhang
Remove dummy layers, refactor code, and complete the documentation.

Qi Zhang (3):
  net/ice: hide port and TC layer in Tx sched tree
  net/ice: refactor tm config data structure
  doc: update ice document for qos

v2:
- fix typos.

 doc/guides/nics/ice.rst  |  19 +++
 drivers/net/ice/ice_ethdev.h |  12 +-
 drivers/net/ice/ice_tm.c | 285 +++
 3 files changed, 112 insertions(+), 204 deletions(-)

-- 
2.31.1



[PATCH v2 1/3] net/ice: hide port and TC layer in Tx sched tree

2024-01-04 Thread Qi Zhang
In the current 5-layer tree implementation, the port and TC layers
are not configurable, so it is not necessary to expose them to the application.

The patch hides the top 2 layers and represents the root of the tree at
the VSI layer. From the application's point of view, it is a 3-layer scheduler tree:

Port -> Queue Group -> Queue.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |  7 
 drivers/net/ice/ice_tm.c | 79 
 2 files changed, 7 insertions(+), 79 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index fa4981ed14..ae22c29ffc 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -470,7 +470,6 @@ struct ice_tm_shaper_profile {
 struct ice_tm_node {
TAILQ_ENTRY(ice_tm_node) node;
uint32_t id;
-   uint32_t tc;
uint32_t priority;
uint32_t weight;
uint32_t reference_count;
@@ -484,8 +483,6 @@ struct ice_tm_node {
 /* node type of Traffic Manager */
 enum ice_tm_node_type {
ICE_TM_NODE_TYPE_PORT,
-   ICE_TM_NODE_TYPE_TC,
-   ICE_TM_NODE_TYPE_VSI,
ICE_TM_NODE_TYPE_QGROUP,
ICE_TM_NODE_TYPE_QUEUE,
ICE_TM_NODE_TYPE_MAX,
@@ -495,12 +492,8 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list tc_list; /* node list for all the TCs */
-   struct ice_tm_node_list vsi_list; /* node list for all the VSIs */
struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_tc_node;
-   uint32_t nb_vsi_node;
uint32_t nb_qgroup_node;
uint32_t nb_queue_node;
bool committed;
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index b570798f07..7ae68c683b 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.tc_list);
-   TAILQ_INIT(&pf->tm_conf.vsi_list);
TAILQ_INIT(&pf->tm_conf.qgroup_list);
TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_tc_node = 0;
-   pf->tm_conf.nb_vsi_node = 0;
pf->tm_conf.nb_qgroup_node = 0;
pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
@@ -72,16 +68,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev)
rte_free(tm_node);
}
pf->tm_conf.nb_qgroup_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_vsi_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_tc_node = 0;
if (pf->tm_conf.root) {
rte_free(pf->tm_conf.root);
pf->tm_conf.root = NULL;
@@ -93,8 +79,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
uint32_t node_id, enum ice_tm_node_type *node_type)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list;
-   struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list;
struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
struct ice_tm_node *tm_node;
@@ -104,20 +88,6 @@ ice_tm_node_search(struct rte_eth_dev *dev,
return pf->tm_conf.root;
}
 
-   TAILQ_FOREACH(tm_node, tc_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_TC;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, vsi_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_VSI;
-   return tm_node;
-   }
-   }
-
TAILQ_FOREACH(tm_node, qgroup_list, node) {
if (tm_node->id == node_id) {
*node_type = ICE_TM_NODE_TYPE_QGROUP;
@@ -371,6 +341,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev,
return 0;
 }
 
+#define MAX_QUEUE_PER_GROUP8
+
 static int
 ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
  uint32_t parent_node_id, uint32_t priority,
@@ -384,8 +356,6 @@ ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id,
struct ice_tm_shaper_profile *shaper_profile = NULL;
struct ice_tm_node *tm_node;
struct ice_tm_node *parent_node;

[PATCH v2 2/3] net/ice: refactor tm config data structure

2024-01-04 Thread Qi Zhang
Simplify struct ice_tm_conf by removing the per-level node lists.

Signed-off-by: Qi Zhang 
---
 drivers/net/ice/ice_ethdev.h |   5 +-
 drivers/net/ice/ice_tm.c | 210 +++
 2 files changed, 88 insertions(+), 127 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
index ae22c29ffc..008a7a23b9 100644
--- a/drivers/net/ice/ice_ethdev.h
+++ b/drivers/net/ice/ice_ethdev.h
@@ -472,6 +472,7 @@ struct ice_tm_node {
uint32_t id;
uint32_t priority;
uint32_t weight;
+   uint32_t level;
uint32_t reference_count;
struct ice_tm_node *parent;
struct ice_tm_node **children;
@@ -492,10 +493,6 @@ enum ice_tm_node_type {
 struct ice_tm_conf {
struct ice_shaper_profile_list shaper_profile_list;
struct ice_tm_node *root; /* root node - port */
-   struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */
-   struct ice_tm_node_list queue_list; /* node list for all the queues */
-   uint32_t nb_qgroup_node;
-   uint32_t nb_queue_node;
bool committed;
bool clear_on_fail;
 };
diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c
index 7ae68c683b..7c662f8a85 100644
--- a/drivers/net/ice/ice_tm.c
+++ b/drivers/net/ice/ice_tm.c
@@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev)
/* initialize node configuration */
TAILQ_INIT(&pf->tm_conf.shaper_profile_list);
pf->tm_conf.root = NULL;
-   TAILQ_INIT(&pf->tm_conf.qgroup_list);
-   TAILQ_INIT(&pf->tm_conf.queue_list);
-   pf->tm_conf.nb_qgroup_node = 0;
-   pf->tm_conf.nb_queue_node = 0;
pf->tm_conf.committed = false;
pf->tm_conf.clear_on_fail = false;
 }
 
-void
-ice_tm_conf_uninit(struct rte_eth_dev *dev)
+static void free_node(struct ice_tm_node *root)
 {
-   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node *tm_node;
+   uint32_t i;
 
-   /* clear node configuration */
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_queue_node = 0;
-   while ((tm_node = TAILQ_FIRST(&pf->tm_conf.qgroup_list))) {
-   TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node);
-   rte_free(tm_node);
-   }
-   pf->tm_conf.nb_qgroup_node = 0;
-   if (pf->tm_conf.root) {
-   rte_free(pf->tm_conf.root);
-   pf->tm_conf.root = NULL;
-   }
+   if (root == NULL)
+   return;
+
+   for (i = 0; i < root->reference_count; i++)
+   free_node(root->children[i]);
+
+   rte_free(root);
 }
 
-static inline struct ice_tm_node *
-ice_tm_node_search(struct rte_eth_dev *dev,
-   uint32_t node_id, enum ice_tm_node_type *node_type)
+void
+ice_tm_conf_uninit(struct rte_eth_dev *dev)
 {
struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
-   struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list;
-   struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list;
-   struct ice_tm_node *tm_node;
-
-   if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_PORT;
-   return pf->tm_conf.root;
-   }
 
-   TAILQ_FOREACH(tm_node, qgroup_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QGROUP;
-   return tm_node;
-   }
-   }
-
-   TAILQ_FOREACH(tm_node, queue_list, node) {
-   if (tm_node->id == node_id) {
-   *node_type = ICE_TM_NODE_TYPE_QUEUE;
-   return tm_node;
-   }
-   }
-
-   return NULL;
+   free_node(pf->tm_conf.root);
+   pf->tm_conf.root = NULL;
 }
 
 static int
@@ -195,11 +159,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id,
return 0;
 }
 
+static struct ice_tm_node *
+find_node(struct ice_tm_node *root, uint32_t id)
+{
+   uint32_t i;
+
+   if (root == NULL || root->id == id)
+   return root;
+
+   for (i = 0; i < root->reference_count; i++) {
+   struct ice_tm_node *node = find_node(root->children[i], id);
+
+   if (node)
+   return node;
+   }
+
+   return NULL;
+}
+
 static int
 ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id,
   int *is_leaf, struct rte_tm_error *error)
 {
-   enum ice_tm_node_type node_type = ICE_TM_NODE_TYPE_MAX;
+   struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct ice_tm_node *tm_node;
 
if (!is_leaf || !error)
@@ -212,14 +194,14 @@ ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id,
}
 
/* check if the node id exists 

[PATCH v2 3/3] doc: update ice document for qos

2024-01-04 Thread Qi Zhang
Add description for ice PMD's rte_tm capabilities.

Signed-off-by: Qi Zhang 
---
 doc/guides/nics/ice.rst | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index bafb3ba022..3d381a266b 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -352,6 +352,25 @@ queue 3 using a raw pattern::
 
 Currently, raw pattern support is limited to the FDIR and Hash engines.
 
+Traffic Management Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ice PMD provides support for the Traffic Management API (RTE_TM),
+allowing users to offload a 3-layer Tx scheduler on the E810 NIC:
+
+- ``Port Layer``
+
+  This is the root layer. It supports peak bandwidth configuration and
+  up to 32 children.
+
+- ``Queue Group Layer``
+
+  The middle layer. It supports peak / committed bandwidth, weight and
+  priority configuration, and up to 8 children.
+
+- ``Queue Layer``
+
+  The leaf layer. It supports peak / committed bandwidth, weight and
+  priority configuration.
+
 Additional Options
 ++++++++++++++++++
 
-- 
2.31.1
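As an editorial aside, the 3-layer hierarchy described above is built through the generic rte_tm API. A minimal, hedged sketch of the sequence (node ids, port id, and weight/priority values are arbitrary examples; whether leaf node ids map to Tx queue ids is PMD-specific, and error handling is elided), not runnable without a DPDK environment and an E810 port:

```c
#include <rte_tm.h>

static int
setup_hierarchy(uint16_t port_id)
{
	struct rte_tm_error err;
	struct rte_tm_node_params np = {
		.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE,
	};

	/* Level 0: port (root) node, no parent. */
	if (rte_tm_node_add(port_id, 1000, RTE_TM_NODE_ID_NULL,
			    0, 1, 0, &np, &err) != 0)
		return -1;
	/* Level 1: one queue group under the port. */
	if (rte_tm_node_add(port_id, 100, 1000, 0, 1, 1, &np, &err) != 0)
		return -1;
	/* Level 2: a leaf (queue) node under the queue group; for the
	 * ice PMD the leaf node id is assumed to be the Tx queue id. */
	if (rte_tm_node_add(port_id, 0, 100, 0, 1, 2, &np, &err) != 0)
		return -1;

	/* Apply the hierarchy; clear_on_fail=1 rolls back on error. */
	return rte_tm_hierarchy_commit(port_id, 1, &err);
}
```

A shaper profile created with rte_tm_shaper_profile_add() can be attached via np.shaper_profile_id to exercise the peak/committed bandwidth configuration the document describes.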



RE: [PATCH] net/ice: refine queue start stop

2024-01-04 Thread Wu, Wenjun1
> -Original Message-
> From: Zhang, Qi Z 
> Sent: Friday, January 5, 2024 9:37 PM
> To: Yang, Qiming ; Wu, Wenjun1
> 
> Cc: dev@dpdk.org; Zhang, Qi Z 
> Subject: [PATCH] net/ice: refine queue start stop
> 
> It is not necessary to return failure when starting or stopping a queue
> if the queue is already in the required state.
> 
> Signed-off-by: Qi Zhang 
> ---
>  drivers/net/ice/ice_rxtx.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index
> 73e47ae92d..3286bb08fe 100644
> --- a/drivers/net/ice/ice_rxtx.c
> +++ b/drivers/net/ice/ice_rxtx.c
> @@ -673,6 +673,10 @@ ice_rx_queue_start(struct rte_eth_dev *dev,
> uint16_t rx_queue_id)
>   return -EINVAL;
>   }
> 
> + if (dev->data->rx_queue_state[rx_queue_id] ==
> + RTE_ETH_QUEUE_STATE_STARTED)
> + return 0;
> +
>   if (dev->data->dev_conf.rxmode.offloads &
> RTE_ETH_RX_OFFLOAD_TIMESTAMP)
>   rxq->ts_enable = true;
>   err = ice_program_hw_rx_queue(rxq);
> @@ -717,6 +721,10 @@ ice_rx_queue_stop(struct rte_eth_dev *dev,
> uint16_t rx_queue_id)
>   if (rx_queue_id < dev->data->nb_rx_queues) {
>   rxq = dev->data->rx_queues[rx_queue_id];
> 
> + if (dev->data->rx_queue_state[rx_queue_id] ==
> + RTE_ETH_QUEUE_STATE_STOPPED)
> + return 0;
> +
>   err = ice_switch_rx_queue(hw, rxq->reg_idx, false);
>   if (err) {
>   PMD_DRV_LOG(ERR, "Failed to switch RX queue %u
> off", @@ -758,6 +766,10 @@ ice_tx_queue_start(struct rte_eth_dev *dev,
> uint16_t tx_queue_id)
>   return -EINVAL;
>   }
> 
> + if (dev->data->tx_queue_state[tx_queue_id] ==
> + RTE_ETH_QUEUE_STATE_STARTED)
> + return 0;
> +
>   buf_len = ice_struct_size(txq_elem, txqs, 1);
>   txq_elem = ice_malloc(hw, buf_len);
>   if (!txq_elem)
> @@ -1066,6 +1078,10 @@ ice_tx_queue_stop(struct rte_eth_dev *dev,
> uint16_t tx_queue_id)
>   return -EINVAL;
>   }
> 
> + if (dev->data->tx_queue_state[tx_queue_id] ==
> + RTE_ETH_QUEUE_STATE_STOPPED)
> + return 0;
> +
>   q_ids[0] = txq->reg_idx;
>   q_teids[0] = txq->q_teid;
> 
> --
> 2.31.1

Acked-by: Wenjun Wu 


[Bug 1341] ovs+dpdk ixgbe port tx failed. rte_pktmbuf_alloc failed

2024-01-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1341

Bug ID: 1341
   Summary: ovs+dpdk ixgbe port tx failed. rte_pktmbuf_alloc
failed
   Product: DPDK
   Version: 22.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: 125163...@qq.com
  Target Milestone: ---

Using openvswitch+dpdk, with an ixgbe-driver NIC added to the bridge:
when the netperf tool is used to run TCP_CRR traffic between two VM
virtual machine interfaces, rte_eth_tx_burst stops sending packets
after a period of time. Debugging shows that rte_pktmbuf_alloc can no
longer allocate an mbuf.

Openvswitch:

DPDK: 22.11

Network card: 82599ES

Network card queue: 8

Network card receiving descriptor: 4096

-- 
You are receiving this mail because:
You are the assignee for the bug.

[PATCH] app/test-crypto-perf: add missed resubmission fix

2024-01-04 Thread Suanming Mou
Currently, after enqueue_burst there may be ops_unused ops left
over for the next enqueue round, and during the next round's
preparation only ops_needed ops are refilled. If in the final
round the leftover ops are fewer than ops_needed, invalid ops
end up between the newly prepared ops and the previously unused
ops. The previously unused ops should be moved to the front,
right after the needed ops.

Commit [1] added a resubmission fix to the throughput test, but
the fix was missed for the verify test.

This commit adds the missed resubmission fix to the verify test.

[1] 44e2980b70d1 ("app/crypto-perf: fix crypto operation resubmission")

Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application")

Cc: sta...@dpdk.org

Signed-off-by: Suanming Mou 
---
 app/test-crypto-perf/cperf_test_verify.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/app/test-crypto-perf/cperf_test_verify.c 
b/app/test-crypto-perf/cperf_test_verify.c
index 2b0d3f142b..0328bb5724 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -275,7 +275,6 @@ cperf_verify_test_runner(void *test_ctx)
ops_needed, ctx->sess, ctx->options,
ctx->test_vector, iv_offset, &imix_idx, NULL);
 
-
/* Populate the mbuf with the test vector, for verification */
for (i = 0; i < ops_needed; i++)
cperf_mbuf_set(ops[i]->sym->m_src,
@@ -293,6 +292,19 @@ cperf_verify_test_runner(void *test_ctx)
}
 #endif /* CPERF_LINEARIZATION_ENABLE */
 
+   /**
+* When ops_needed is smaller than ops_enqd, the
+* unused ops need to be moved to the front for
+* next round use.
+*/
+   if (unlikely(ops_enqd > ops_needed)) {
+   size_t nb_b_to_mov = ops_unused * sizeof(
+   struct rte_crypto_op *);
+
+   memmove(&ops[ops_needed], &ops[ops_enqd],
+   nb_b_to_mov);
+   }
+
/* Enqueue burst of ops on crypto device */
ops_enqd = rte_cryptodev_enqueue_burst(ctx->dev_id, ctx->qp_id,
ops, burst_size);
-- 
2.34.1