RE: [RFC] ethdev: fast path async flow API
> > This is a blocker, showstopper for me.
> +1
>
> > Have you considered having something like rte_flow_create_bulk()
> >
> > or better yet a Linux io_uring style API?
> >
> > A ring style API would allow for better mixed operations across the board
> > and get rid of the I-cache overhead which is the root cause of the
> > needing inline.
>
> The existing async flow API is somewhat close to the io_uring interface,
> the difference being that the queue is not directly exposed to the application.
> The application interacts with the queue using rte_flow_async_* APIs (e.g.,
> places operations in the queue, pushes them to the HW).
> Such a design has some benefits over a flow API which exposes the queue to the user:
> - Easier to use - applications do not manage the queue directly; they do it
>   through the exposed APIs.
> - Consistent with other DPDK APIs - in other libraries, queues are
>   manipulated through an API, not directly by an application.
> - Lower memory usage - only HW primitives are needed (e.g., a HW queue on the
>   PMD side); there is no need to allocate separate application queues.
>
> Bulking of flow operations is a tricky subject.
> Compared to packet processing, where it is desired to keep the manipulation
> of raw packet data to a minimum (e.g., only packet headers are accessed),
> during flow rule creation all items and actions must be processed by the PMD
> to create a flow rule.
> The amount of memory consumed by the items and actions themselves during
> this process might be non-negligible.
> If flow rule operations were bulked, the size of the working set of memory
> would increase, which could have negative consequences on cache behavior.
> So, it might be the case that by utilizing bulking the I-cache overhead is
> removed, but D-cache overhead is added.

Is the rte_flow struct really that big?
We do bulk processing for mbufs, crypto_ops, etc., and usually bulk processing
improves performance, not degrades it.
Of course, the bulk size has to be somewhat reasonable.
> On the other hand, creating flow rule operations (or enqueuing flow rule
> operations) one by one enables applications to reuse the same memory for
> different flow rules.
>
> In summary, in my opinion extending the async flow API with bulking
> capabilities or exposing the queue directly to the application is not
> desirable.
> This proposal aims to reduce the I-cache overhead in the async flow API by
> reusing the existing design pattern in DPDK - fast path functions are
> inlined into the application code and they call cached PMD callbacks.
>
> Best regards,
> Dariusz Sosnowski
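The inlining pattern described above — fast-path wrappers inlined into the application that dispatch through per-port cached callbacks — can be sketched as follows. This is an illustrative mock-up, not the actual rte_flow fast-path API; all type and function names here are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

struct flow_op; /* opaque flow operation descriptor (illustrative) */
typedef int (*flow_async_create_t)(void *dev, const struct flow_op *op);

/* Flat per-port table of cached callbacks, analogous to what ethdev
 * does for its Rx/Tx burst fast path. */
struct port_fp_ops {
	void *dev_private;          /* PMD private device data */
	flow_async_create_t create; /* callback cached at configure time */
};

static struct port_fp_ops port_fp_ops[8];

/* Example "PMD" callback used to populate the table in this sketch. */
static int
dummy_pmd_create(void *dev, const struct flow_op *op)
{
	(void)dev;
	(void)op;
	return 0; /* success */
}

/* Inlined into the application code: dispatch costs one indirect call,
 * with no intermediate library function on the hot path — this is the
 * I-cache saving the proposal targets. */
static inline int
flow_async_create(uint16_t port_id, const struct flow_op *op)
{
	const struct port_fp_ops *ops = &port_fp_ops[port_id];

	return ops->create(ops->dev_private, op);
}
```

The key point is that `port_fp_ops` is a plain array the compiler can index directly from application code, so no generic dispatch layer sits between the application and the PMD callback.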
RE: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation verify
Hi, > -Original Message- > From: Anoob Joseph > Sent: Thursday, January 4, 2024 1:13 PM > To: Suanming Mou ; Ciara Power > > Cc: dev@dpdk.org > Subject: RE: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation > verify > > Hi Suanming, > > Please see inline. > > Thanks, > Anoob > > > -Original Message- > > From: Suanming Mou > > Sent: Wednesday, January 3, 2024 9:26 AM > > To: Ciara Power > > Cc: dev@dpdk.org > > Subject: [EXT] [PATCH 2/2] app/test-crypto-perf: fix encrypt operation > > verify > > > > External Email > > > > -- > > AEAD users RTE_CRYPTO_AEAD_OP_* with aead_op and CIPHER uses > [Anoob] users -> uses > > > RTE_CRYPTO_CIPHER_OP_* with cipher_op in current code. > > > > This commit aligns aead_op and cipher_op operation to fix incorrect > > AEAD verification. > > > > Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test > > type") > > > > Signed-off-by: Suanming Mou > > --- > > app/test-crypto-perf/cperf_test_verify.c | 9 +++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/app/test-crypto-perf/cperf_test_verify.c > > b/app/test-crypto- perf/cperf_test_verify.c index > > 8aa714b969..525a2b1373 100644 > > --- a/app/test-crypto-perf/cperf_test_verify.c > > +++ b/app/test-crypto-perf/cperf_test_verify.c > > @@ -113,6 +113,7 @@ cperf_verify_op(struct rte_crypto_op *op, > > uint8_t *data; > > uint32_t cipher_offset, auth_offset; > > uint8_t cipher, auth; > > + bool is_encrypt = false; > > int res = 0; > > > > if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS) @@ -154,12 > > +155,14 @@ cperf_verify_op(struct rte_crypto_op *op, > > cipher_offset = 0; > > auth = 0; > > auth_offset = 0; > > + is_encrypt = options->cipher_op == > > RTE_CRYPTO_CIPHER_OP_ENCRYPT; > > break; > > case CPERF_CIPHER_THEN_AUTH: > > cipher = 1; > > cipher_offset = 0; > > auth = 1; > > auth_offset = options->test_buffer_size; > > + is_encrypt = options->cipher_op == > > RTE_CRYPTO_CIPHER_OP_ENCRYPT; > > break; > > case CPERF_AUTH_ONLY: > > 
cipher = 0; > > @@ -172,12 +175,14 @@ cperf_verify_op(struct rte_crypto_op *op, > > cipher_offset = 0; > > auth = 1; > > auth_offset = options->test_buffer_size; > > + is_encrypt = options->cipher_op == > > RTE_CRYPTO_CIPHER_OP_ENCRYPT; > > break; > > case CPERF_AEAD: > > cipher = 1; > > cipher_offset = 0; > > - auth = 1; > > + auth = options->aead_op == RTE_CRYPTO_AEAD_OP_ENCRYPT; > > auth_offset = options->test_buffer_size; > > + is_encrypt = !!auth; > > break; > > default: > > res = 1; > > @@ -185,7 +190,7 @@ cperf_verify_op(struct rte_crypto_op *op, > > } > > > > if (cipher == 1) { > > - if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) > > + if (is_encrypt) > > [Anoob] A similar check is there under 'auth == 1' check, right? Won't that > also > need fixing? > > if (auth == 1) { > if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE) > > I think some renaming of the local variables might make code better. > bool cipher, digest_verify = false, is_encrypt = false; > > case CPERF_CIPHER_THEN_AUTH: > cipher = true; > cipher_offset = 0; > if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) { > is_encrypt = true; > digest_verify = true; /* Assumption - options->auth_op > == RTE_CRYPTO_AUTH_OP_GENERATE is verified elsewhere */ > auth_offset = options->test_buffer_size; > } > break; > <...> > case CPERF_AEAD: > cipher = true; > cipher_offset = 0; > if (options->aead_op == > RTE_CRYPTO_AEAD_OP_ENCRYPT) { > is_encrypt = true; > digest_verify = true; > auth_offset = options->test_buffer_size; > } > > What do you think? Yes, so we can totally remove the auth for now. I will do that. Thanks for the suggestion. > > > res += !!memcmp(data + cipher_offset, > > vector->ciphertext.data, > > options->test_buffer_size); > > -- > > 2.34.1
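Anoob's suggested restructuring (quoted above, with its formatting mangled in transit) can be sketched as a standalone function. The enum and field names below are simplified stand-ins for the cperf ones, not the actual test-app code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the cperf op types discussed above. */
enum op_type { CIPHER_THEN_AUTH, AEAD };

struct verify_ctx {
	bool cipher;
	bool is_encrypt;
	bool digest_verify;
	uint32_t auth_offset;
};

/* Returns 0 on success, 1 on an unknown op type.  The digest is only
 * verified on the encrypt direction, where the device generated it —
 * which is the point of the fix discussed in this thread. */
static int
classify_op(enum op_type type, bool encrypt, uint32_t buf_size,
	    struct verify_ctx *ctx)
{
	ctx->cipher = false;
	ctx->is_encrypt = false;
	ctx->digest_verify = false;
	ctx->auth_offset = 0;

	switch (type) {
	case CIPHER_THEN_AUTH:
	case AEAD:
		ctx->cipher = true;
		if (encrypt) {
			ctx->is_encrypt = true;
			ctx->digest_verify = true;
			ctx->auth_offset = buf_size;
		}
		break;
	default:
		return 1;
	}
	return 0;
}
```

With this shape, the decrypt direction naturally skips the digest check instead of relying on a separate `auth` flag, which is the simplification Suanming agrees to above.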
[PATCH 1/2] config/arm: fix CN10K minimum march requirement
From: Pavan Nikhilesh

Meson selects march and mcpu based on compiler support and part number;
only the minimum required march should be defined in the cross-compile
configuration file.

Fixes: 1b4c86a721c9 ("config/arm: add Marvell CN10K")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh
---
 config/arm/arm64_cn10k_linux_gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config/arm/arm64_cn10k_linux_gcc b/config/arm/arm64_cn10k_linux_gcc
index fa904af5d0..801a7ededd 100644
--- a/config/arm/arm64_cn10k_linux_gcc
+++ b/config/arm/arm64_cn10k_linux_gcc
@@ -10,7 +10,7 @@ cmake = 'cmake'
 [host_machine]
 system = 'linux'
 cpu_family = 'aarch64'
-cpu = 'armv8.6-a'
+cpu = 'armv8-a'
 endian = 'little'

 [properties]
--
2.25.1
[PATCH 2/2] config/arm: add armv9-a march
From: Pavan Nikhilesh

Now that major versions of GCC recognize the armv9-a march option, add it
to the list of supported march values.
Update the neoverse-n2 part number to include march as armv9-a.

Signed-off-by: Pavan Nikhilesh
---
 config/arm/meson.build | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 36f21d2259..0804877b57 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -92,6 +92,7 @@ part_number_config_arm = {
         'march': 'armv8.4-a',
     },
     '0xd49': {
+        'march': 'armv9-a',
         'march_features': ['sve2'],
         'compiler_options': ['-mcpu=neoverse-n2'],
         'flags': [
@@ -701,7 +702,7 @@ if update_flags
     if part_number_config.get('force_march', false)
         candidate_march = part_number_config['march']
     else
-        supported_marchs = ['armv8.6-a', 'armv8.5-a', 'armv8.4-a', 'armv8.3-a',
+        supported_marchs = ['armv9-a', 'armv8.6-a', 'armv8.5-a', 'armv8.4-a', 'armv8.3-a',
                             'armv8.2-a', 'armv8.1-a', 'armv8-a']
     check_compiler_support = false
     foreach supported_march: supported_marchs
--
2.25.1
[PATCH v9 0/2] net/iavf: fix Rx/Tx burst and add diagnostics
Fixed Rx/Tx crash in multi-process environment and added Tx diagnostic feature. Mingjin Ye (2): net/iavf: fix Rx/Tx burst in multi-process net/iavf: add diagnostic support in TX path doc/guides/nics/intel_vf.rst | 9 ++ drivers/net/iavf/iavf.h| 55 ++- drivers/net/iavf/iavf_ethdev.c | 75 + drivers/net/iavf/iavf_rxtx.c | 283 ++--- drivers/net/iavf/iavf_rxtx.h | 2 + 5 files changed, 365 insertions(+), 59 deletions(-) -- 2.25.1
[PATCH v9 1/2] net/iavf: fix Rx/Tx burst in multi-process
In a multi-process environment, a secondary process operates on shared memory and changes the function pointer of the primary process, resulting in a crash when the primary process cannot find the function address during an Rx/Tx burst. Fixes: 5b3124a0a6ef ("net/iavf: support no polling when link down") Cc: sta...@dpdk.org Signed-off-by: Mingjin Ye --- v2: Add fix for Rx burst. --- v3: fix Rx/Tx routing. --- v4: Fix the ops array. --- v5: rebase. --- drivers/net/iavf/iavf.h | 43 +++- drivers/net/iavf/iavf_rxtx.c | 185 --- 2 files changed, 169 insertions(+), 59 deletions(-) diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index d273d884f5..ab24cb02c3 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -314,6 +314,45 @@ struct iavf_devargs { struct iavf_security_ctx; +enum iavf_rx_burst_type { + IAVF_RX_DEFAULT, + IAVF_RX_FLEX_RXD, + IAVF_RX_BULK_ALLOC, + IAVF_RX_SCATTERED, + IAVF_RX_SCATTERED_FLEX_RXD, + IAVF_RX_SSE, + IAVF_RX_AVX2, + IAVF_RX_AVX2_OFFLOAD, + IAVF_RX_SSE_FLEX_RXD, + IAVF_RX_AVX2_FLEX_RXD, + IAVF_RX_AVX2_FLEX_RXD_OFFLOAD, + IAVF_RX_SSE_SCATTERED, + IAVF_RX_AVX2_SCATTERED, + IAVF_RX_AVX2_SCATTERED_OFFLOAD, + IAVF_RX_SSE_SCATTERED_FLEX_RXD, + IAVF_RX_AVX2_SCATTERED_FLEX_RXD, + IAVF_RX_AVX2_SCATTERED_FLEX_RXD_OFFLOAD, + IAVF_RX_AVX512, + IAVF_RX_AVX512_OFFLOAD, + IAVF_RX_AVX512_FLEX_RXD, + IAVF_RX_AVX512_FLEX_RXD_OFFLOAD, + IAVF_RX_AVX512_SCATTERED, + IAVF_RX_AVX512_SCATTERED_OFFLOAD, + IAVF_RX_AVX512_SCATTERED_FLEX_RXD, + IAVF_RX_AVX512_SCATTERED_FLEX_RXD_OFFLOAD, +}; + +enum iavf_tx_burst_type { + IAVF_TX_DEFAULT, + IAVF_TX_SSE, + IAVF_TX_AVX2, + IAVF_TX_AVX2_OFFLOAD, + IAVF_TX_AVX512, + IAVF_TX_AVX512_OFFLOAD, + IAVF_TX_AVX512_CTX, + IAVF_TX_AVX512_CTX_OFFLOAD, +}; + /* Structure to store private data for each VF instance. 
*/ struct iavf_adapter { struct iavf_hw hw; @@ -329,8 +368,8 @@ struct iavf_adapter { bool stopped; bool closed; bool no_poll; - eth_rx_burst_t rx_pkt_burst; - eth_tx_burst_t tx_pkt_burst; + enum iavf_rx_burst_type rx_burst_type; + enum iavf_tx_burst_type tx_burst_type; uint16_t fdir_ref_cnt; struct iavf_devargs devargs; }; diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c index e54fb74b79..f044ad3f26 100644 --- a/drivers/net/iavf/iavf_rxtx.c +++ b/drivers/net/iavf/iavf_rxtx.c @@ -3716,15 +3716,78 @@ iavf_prep_pkts(__rte_unused void *tx_queue, struct rte_mbuf **tx_pkts, return i; } +static +const eth_rx_burst_t iavf_rx_pkt_burst_ops[] = { + [IAVF_RX_DEFAULT] = iavf_recv_pkts, + [IAVF_RX_FLEX_RXD] = iavf_recv_pkts_flex_rxd, + [IAVF_RX_BULK_ALLOC] = iavf_recv_pkts_bulk_alloc, + [IAVF_RX_SCATTERED] = iavf_recv_scattered_pkts, + [IAVF_RX_SCATTERED_FLEX_RXD] = iavf_recv_scattered_pkts_flex_rxd, +#ifdef RTE_ARCH_X86 + [IAVF_RX_SSE] = iavf_recv_pkts_vec, + [IAVF_RX_AVX2] = iavf_recv_pkts_vec_avx2, + [IAVF_RX_AVX2_OFFLOAD] = iavf_recv_pkts_vec_avx2_offload, + [IAVF_RX_SSE_FLEX_RXD] = iavf_recv_pkts_vec_flex_rxd, + [IAVF_RX_AVX2_FLEX_RXD] = iavf_recv_pkts_vec_avx2_flex_rxd, + [IAVF_RX_AVX2_FLEX_RXD_OFFLOAD] = + iavf_recv_pkts_vec_avx2_flex_rxd_offload, + [IAVF_RX_SSE_SCATTERED] = iavf_recv_scattered_pkts_vec, + [IAVF_RX_AVX2_SCATTERED] = iavf_recv_scattered_pkts_vec_avx2, + [IAVF_RX_AVX2_SCATTERED_OFFLOAD] = + iavf_recv_scattered_pkts_vec_avx2_offload, + [IAVF_RX_SSE_SCATTERED_FLEX_RXD] = + iavf_recv_scattered_pkts_vec_flex_rxd, + [IAVF_RX_AVX2_SCATTERED_FLEX_RXD] = + iavf_recv_scattered_pkts_vec_avx2_flex_rxd, + [IAVF_RX_AVX2_SCATTERED_FLEX_RXD_OFFLOAD] = + iavf_recv_scattered_pkts_vec_avx2_flex_rxd_offload, +#ifdef CC_AVX512_SUPPORT + [IAVF_RX_AVX512] = iavf_recv_pkts_vec_avx512, + [IAVF_RX_AVX512_OFFLOAD] = iavf_recv_pkts_vec_avx512_offload, + [IAVF_RX_AVX512_FLEX_RXD] = iavf_recv_pkts_vec_avx512_flex_rxd, + [IAVF_RX_AVX512_FLEX_RXD_OFFLOAD] = + 
iavf_recv_pkts_vec_avx512_flex_rxd_offload, + [IAVF_RX_AVX512_SCATTERED] = iavf_recv_scattered_pkts_vec_avx512, + [IAVF_RX_AVX512_SCATTERED_OFFLOAD] = + iavf_recv_scattered_pkts_vec_avx512_offload, + [IAVF_RX_AVX512_SCATTERED_FLEX_RXD] = + iavf_recv_scattered_pkts_vec_avx512_flex_rxd, + [IAVF_RX_AVX512_SCATTERED_FLEX_RXD_OFFLOAD] = + iavf_recv_scattered_pkts_vec_avx512_flex_rxd_offload, +#endif +#elif defined RTE_ARCH_ARM + [IAVF_RX_SSE] = iavf_recv_pkts_vec, +#endif +}; + +static +const eth_tx_burst_t iavf_tx_pkt_burst_ops[] = { + [IAVF_TX_DEFAULT] = iavf_xmit_pkts, +#
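The fix above replaces cached function pointers in the shared adapter with enum indices into a per-process ops table. A minimal standalone sketch (illustrative names, not the actual iavf code) of why that is multi-process safe: the enum value stored in shared memory is address-independent, and each process resolves it through its own, process-local copy of the table.

```c
#include <stdint.h>
#include <stddef.h>

enum rx_burst_type { RX_DEFAULT, RX_BULK_ALLOC, RX_BURST_TYPE_MAX };

typedef uint16_t (*rx_burst_t)(void *rxq, uint16_t nb_pkts);

/* Stand-in burst implementations for the sketch. */
static uint16_t
rx_default(void *rxq, uint16_t nb_pkts)
{
	(void)rxq;
	return nb_pkts;
}

static uint16_t
rx_bulk_alloc(void *rxq, uint16_t nb_pkts)
{
	(void)rxq;
	return nb_pkts / 2;
}

/* Process-local: every process carries its own copy of this table at
 * its own code/data addresses, so the pointers are always valid in the
 * process that dereferences them. */
static const rx_burst_t rx_burst_ops[RX_BURST_TYPE_MAX] = {
	[RX_DEFAULT] = rx_default,
	[RX_BULK_ALLOC] = rx_bulk_alloc,
};

/* Lives in memory shared between primary and secondary processes.
 * Storing a raw function pointer here (the pre-fix layout) crashes the
 * other process; storing the enum index is safe. */
struct adapter_shared {
	enum rx_burst_type rx_burst_type;
};

static uint16_t
rx_burst(const struct adapter_shared *ad, void *rxq, uint16_t nb_pkts)
{
	return rx_burst_ops[ad->rx_burst_type](rxq, nb_pkts);
}
```

The indirection costs one array lookup per burst call, which is the trade-off the patch accepts for multi-process correctness.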
[PATCH v9 2/2] net/iavf: add diagnostic support in TX path
The only way to enable diagnostics for TX paths is to modify the application source code. Making it difficult to diagnose faults. In this patch, the devarg option "mbuf_check" is introduced and the parameters are configured to enable the corresponding diagnostics. supported cases: mbuf, size, segment, offload. 1. mbuf: check for corrupted mbuf. 2. size: check min/max packet length according to hw spec. 3. segment: check number of mbuf segments not exceed hw limitation. 4. offload: check any unsupported offload flag. parameter format: mbuf_check=[mbuf,,] eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i Signed-off-by: Mingjin Ye --- v2: Remove call chain. --- v3: Optimisation implementation. --- v4: Fix Windows os compilation error. --- v5: Split Patch. --- v6: remove strict. --- v8: Modify the description document. --- doc/guides/nics/intel_vf.rst | 9 drivers/net/iavf/iavf.h| 12 + drivers/net/iavf/iavf_ethdev.c | 75 ++ drivers/net/iavf/iavf_rxtx.c | 98 ++ drivers/net/iavf/iavf_rxtx.h | 2 + 5 files changed, 196 insertions(+) diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst index ce96c2e1f8..bf6936082e 100644 --- a/doc/guides/nics/intel_vf.rst +++ b/doc/guides/nics/intel_vf.rst @@ -111,6 +111,15 @@ For more detail on SR-IOV, please refer to the following documents: by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link-down=1`` when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 Series Ethernet device. +When IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 series Ethernet devices. +Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example, +``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``. Supported cases: + +* mbuf: Check for corrupted mbuf. +* size: Check min/max packet length according to hw spec. +* segment: Check number of mbuf segments not exceed hw limitation. +* offload: Check any unsupported offload flag. 
+ The PCIE host-interface of Intel Ethernet Switch FM1 Series VF infrastructure ^ diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index ab24cb02c3..23c0496d54 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -114,9 +114,14 @@ struct iavf_ipsec_crypto_stats { } ierrors; }; +struct iavf_mbuf_stats { + uint64_t tx_pkt_errors; +}; + struct iavf_eth_xstats { struct virtchnl_eth_stats eth_stats; struct iavf_ipsec_crypto_stats ips_stats; + struct iavf_mbuf_stats mbuf_stats; }; /* Structure that defines a VSI, associated with a adapter. */ @@ -310,6 +315,7 @@ struct iavf_devargs { uint32_t watchdog_period; int auto_reset; int no_poll_on_link_down; + int mbuf_check; }; struct iavf_security_ctx; @@ -353,6 +359,11 @@ enum iavf_tx_burst_type { IAVF_TX_AVX512_CTX_OFFLOAD, }; +#define IAVF_MBUF_CHECK_F_TX_MBUF(1ULL << 0) +#define IAVF_MBUF_CHECK_F_TX_SIZE(1ULL << 1) +#define IAVF_MBUF_CHECK_F_TX_SEGMENT (1ULL << 2) +#define IAVF_MBUF_CHECK_F_TX_OFFLOAD (1ULL << 3) + /* Structure to store private data for each VF instance. */ struct iavf_adapter { struct iavf_hw hw; @@ -370,6 +381,7 @@ struct iavf_adapter { bool no_poll; enum iavf_rx_burst_type rx_burst_type; enum iavf_tx_burst_type tx_burst_type; + uint64_t mc_flags; /* mbuf check flags. 
*/ uint16_t fdir_ref_cnt; struct iavf_devargs devargs; }; diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index 1fb876e827..903a43d004 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -39,6 +40,7 @@ #define IAVF_RESET_WATCHDOG_ARG"watchdog_period" #define IAVF_ENABLE_AUTO_RESET_ARG "auto_reset" #define IAVF_NO_POLL_ON_LINK_DOWN_ARG "no-poll-on-link-down" +#define IAVF_MBUF_CHECK_ARG "mbuf_check" uint64_t iavf_timestamp_dynflag; int iavf_timestamp_dynfield_offset = -1; int rte_pmd_iavf_tx_lldp_dynfield_offset = -1; @@ -49,6 +51,7 @@ static const char * const iavf_valid_args[] = { IAVF_RESET_WATCHDOG_ARG, IAVF_ENABLE_AUTO_RESET_ARG, IAVF_NO_POLL_ON_LINK_DOWN_ARG, + IAVF_MBUF_CHECK_ARG, NULL }; @@ -175,6 +178,7 @@ static const struct rte_iavf_xstats_name_off rte_iavf_stats_strings[] = { {"tx_broadcast_packets", _OFF_OF(eth_stats.tx_broadcast)}, {"tx_dropped_packets", _OFF_OF(eth_stats.tx_discards)}, {"tx_error_packets", _OFF_OF(eth_stats.tx_errors)}, + {"tx_mbuf_error_packets", _OFF_OF(mbuf_stats.tx_pkt_errors)}, {"inline_ipsec_crypto_ipackets", _OFF_OF(ips_sta
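The xstats tables in this patch pair a stat name with an `offsetof()` into the stats struct (via the driver's `_OFF_OF` macro), so new counters like `tx_mbuf_error_packets` are exposed by adding one table row. A minimal standalone illustration of that name/offset lookup pattern — struct and stat names here are made up, not the iavf ones:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stats layout with a nested member, mirroring how
 * mbuf_stats.tx_pkt_errors is nested in the driver's xstats struct. */
struct mbuf_stats {
	uint64_t tx_pkt_errors;
};

struct eth_xstats {
	uint64_t rx_errors;
	struct mbuf_stats mbuf_stats;
};

struct xstats_name_off {
	const char *name;
	size_t offset;
};

static const struct xstats_name_off xstats_strings[] = {
	{"rx_error_packets", offsetof(struct eth_xstats, rx_errors)},
	{"tx_mbuf_error_packets",
	 offsetof(struct eth_xstats, mbuf_stats.tx_pkt_errors)},
};

/* Resolve a stat by name: walk the table and read the counter through
 * the recorded byte offset.  Returns UINT64_MAX if the name is unknown. */
static uint64_t
xstat_value(const struct eth_xstats *stats, const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(xstats_strings) / sizeof(xstats_strings[0]); i++)
		if (strcmp(xstats_strings[i].name, name) == 0)
			return *(const uint64_t *)((const char *)stats +
						   xstats_strings[i].offset);
	return UINT64_MAX;
}
```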
[PATCH v3] net/i40e: add diagnostic support in TX path
The only way to enable diagnostics for TX paths is to modify the application source code. Making it difficult to diagnose faults. In this patch, the devarg option "mbuf_check" is introduced and the parameters are configured to enable the corresponding diagnostics. supported cases: mbuf, size, segment, offload. 1. mbuf: check for corrupted mbuf. 2. size: check min/max packet length according to hw spec. 3. segment: check number of mbuf segments not exceed hw limitation. 4. offload: check any unsupported offload flag. parameter format: mbuf_check=[mbuf,,] eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i Signed-off-by: Mingjin Ye --- v2: remove strict. --- v3: optimised. --- doc/guides/nics/i40e.rst | 11 +++ drivers/net/i40e/i40e_ethdev.c | 137 - drivers/net/i40e/i40e_ethdev.h | 28 ++ drivers/net/i40e/i40e_rxtx.c | 153 +++-- drivers/net/i40e/i40e_rxtx.h | 2 + 5 files changed, 323 insertions(+), 8 deletions(-) diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 15689ac958..b15b5b61c5 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -275,6 +275,17 @@ Runtime Configuration -a 84:00.0,vf_msg_cfg=80@120:180 +- ``Support TX diagnostics`` (default ``not enabled``) + + Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example, + ``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``. + Supported cases: + + * mbuf: Check for corrupted mbuf. + * size: Check min/max packet length according to hw spec. + * segment: Check number of mbuf segments not exceed hw limitation. + * offload: Check any unsupported offload flag. 
+ Vector RX Pre-conditions For Vector RX it is assumed that the number of descriptor rings will be a power diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 3ca226156b..e554bae1ab 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -48,6 +48,7 @@ #define ETH_I40E_SUPPORT_MULTI_DRIVER "support-multi-driver" #define ETH_I40E_QUEUE_NUM_PER_VF_ARG "queue-num-per-vf" #define ETH_I40E_VF_MSG_CFG"vf_msg_cfg" +#define ETH_I40E_MBUF_CHECK_ARG "mbuf_check" #define I40E_CLEAR_PXE_WAIT_MS 200 #define I40E_VSI_TSR_QINQ_STRIP0x4010 @@ -412,6 +413,7 @@ static const char *const valid_keys[] = { ETH_I40E_SUPPORT_MULTI_DRIVER, ETH_I40E_QUEUE_NUM_PER_VF_ARG, ETH_I40E_VF_MSG_CFG, + ETH_I40E_MBUF_CHECK_ARG, NULL}; static const struct rte_pci_id pci_id_i40e_map[] = { @@ -545,6 +547,14 @@ static const struct rte_i40e_xstats_name_off rte_i40e_stats_strings[] = { #define I40E_NB_ETH_XSTATS (sizeof(rte_i40e_stats_strings) / \ sizeof(rte_i40e_stats_strings[0])) +static const struct rte_i40e_xstats_name_off i40e_mbuf_strings[] = { + {"tx_mbuf_error_packets", offsetof(struct i40e_mbuf_stats, + tx_pkt_errors)}, +}; + +#define I40E_NB_MBUF_XSTATS (sizeof(i40e_mbuf_strings) / \ + sizeof(i40e_mbuf_strings[0])) + static const struct rte_i40e_xstats_name_off rte_i40e_hw_port_strings[] = { {"tx_link_down_dropped", offsetof(struct i40e_hw_port_stats, tx_dropped_link_down)}, @@ -1373,6 +1383,88 @@ read_vf_msg_config(__rte_unused const char *key, return 0; } +static int +read_mbuf_check_config(__rte_unused const char *key, const char *value, void *args) +{ + char *cur; + char *tmp; + int str_len; + int valid_len; + + int ret = 0; + uint64_t *mc_flags = args; + char *str2 = strdup(value); + if (str2 == NULL) + return -1; + + str_len = strlen(str2); + if (str2[0] == '[' && str2[str_len - 1] == ']') { + if (str_len < 3) { + ret = -1; + goto mdd_end; + } + valid_len = str_len - 2; + memmove(str2, str2 + 1, valid_len); + memset(str2 + 
valid_len, '\0', 2); + } + cur = strtok_r(str2, ",", &tmp); + while (cur != NULL) { + if (!strcmp(cur, "mbuf")) + *mc_flags |= I40E_MBUF_CHECK_F_TX_MBUF; + else if (!strcmp(cur, "size")) + *mc_flags |= I40E_MBUF_CHECK_F_TX_SIZE; + else if (!strcmp(cur, "segment")) + *mc_flags |= I40E_MBUF_CHECK_F_TX_SEGMENT; + else if (!strcmp(cur, "offload")) + *mc_flags |= I40E_MBUF_CHECK_F_TX_OFFLOAD; + else + PMD_DRV_LOG(ERR, "Unsupported mdd check type: %s", cur); + cur = strtok_r(NULL, ",", &tmp); + } + +mdd_end: + free(str2); + return ret; +} + +static int +i40e_parse_mbuf_check(struct rte_eth_dev *dev) +{ + struct i40e_adapter *ad = +
Re: [PATCH] dts: improve documentation
03/01/2024 13:54, Luca Vizzarro: > Improve instructions for installing dependencies, configuring and > launching the project. Finally, document the configuration schema > by adding more comments to the example and documenting every > property and definition. Thank you for taking care of the documentation. > +Luca Vizzarro For consistency, we don't use uppercase characters in email addresses. > - poetry install > + poetry install --no-root Please could you explain this change in the commit log? > DTS needs to know which nodes to connect to and what hardware to use on > those nodes. > -Once that's configured, DTS needs a DPDK tarball and it's ready to run. > +Once that's configured, DTS needs a DPDK tarball or a git ref ID and it's > ready to run. That's assuming DTS is compiling DPDK. We may want to provide an already compiled DPDK to DTS. > - usage: main.py [-h] [--config-file CONFIG_FILE] [--output-dir OUTPUT_DIR] > [-t TIMEOUT] > - [-v VERBOSE] [-s SKIP_SETUP] [--tarball TARBALL] > - [--compile-timeout COMPILE_TIMEOUT] [--test-cases > TEST_CASES] > - [--re-run RE_RUN] > + (dts-py3.10) $ ./main.py --help Why adding this line? Should we remove the shell prefix referring to a specific Python version? > + usage: main.py [-h] [--config-file CONFIG_FILE] [--output-dir OUTPUT_DIR] > [-t TIMEOUT] [-v VERBOSE] > + [-s SKIP_SETUP] [--tarball TARBALL] [--compile-timeout > COMPILE_TIMEOUT] > + [--test-cases TEST_CASES] [--re-run RE_RUN] > > - Run DPDK test suites. All options may be specified with the environment > variables provided in > - brackets. Command line arguments have higher priority. > + Run DPDK test suites. All options may be specified with the environment > variables provided in brackets. In general it is better to avoid long lines, and split after a punctation. I think we should take the habit to always go to the next line after the end of a sentence. > - [DTS_OUTPUT_DIR] Output directory where dts logs > and results are > - saved. 
(default: output) > + [DTS_OUTPUT_DIR] Output directory where dts logs > and results are saved. dts -> DTS > +Configuration Schema > + > + > +Definitions > +~~~ > + > +_`Node name` > + *string* – A unique identifier for a node. **Examples**: ``SUT1``, > ``TG1``. > + > +_`ARCH` > + *string* – The CPU architecture. **Supported values**: ``x86_64``, > ``arm64``, ``ppc64le``. > + > +_`CPU` > + *string* – The CPU microarchitecture. Use ``native`` for x86. **Supported > values**: ``native``, ``armv8a``, ``dpaa2``, ``thunderx``, ``xgene1``. > + > +_`OS` > + *string* – The operating system. **Supported values**: ``linux``. > + > +_`Compiler` > + *string* – The compiler used for building DPDK. **Supported values**: > ``gcc``, ``clang``, ``icc``, ``mscv``. > + > +_`Build target` > + *object* – Build targets supported by DTS for building DPDK, described as: > + > + > = > + ``arch`` See `ARCH`_ > + ``os`` See `OS`_ > + ``cpu`` See `CPU`_ > + ``compiler`` See `Compiler`_ > + ``compiler_wrapper`` *string* – Value prepended to the CC variable for > the DPDK build. Please don't add compilation configuration for now, I would like to work on the schema first. This is mostly imported from the old DTS and needs to be rethink.
Re: [PATCH v2 02/24] net/cnxk: implementing eswitch device
On Wed, Dec 20, 2023 at 12:53 AM Harman Kalra wrote:
>
> Eswitch device is a parent or base device behind all the representors,
> acting as transport layer between representors and representees
>
> Signed-off-by: Harman Kalra
> ---
>  drivers/net/cnxk/cnxk_eswitch.c | 465 
>  drivers/net/cnxk/cnxk_eswitch.h | 103 +++
>  drivers/net/cnxk/meson.build    |   1 +
>  3 files changed, 569 insertions(+)
>  create mode 100644 drivers/net/cnxk/cnxk_eswitch.c
>  create mode 100644 drivers/net/cnxk/cnxk_eswitch.h
>
> diff --git a/drivers/net/cnxk/cnxk_eswitch.c b/drivers/net/cnxk/cnxk_eswitch.c
> new file mode 100644
> index 00..51110a762d
> --- /dev/null
> +++ b/drivers/net/cnxk/cnxk_eswitch.c
> @@ -0,0 +1,465 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(C) 2023 Marvell.

Change to 2024 for new files in this series.

> +static int
> +eswitch_dev_nix_flow_ctrl_set(struct cnxk_eswitch_dev *eswitch_dev)
> +{
> +
> +	rc = roc_nix_fc_mode_set(nix, mode_map[ROC_NIX_FC_FULL]);
> +	if (rc)
> +		return rc;
> +
> +	return rc;

Same as:

	return roc_nix_fc_mode_set(nix, mode_map[ROC_NIX_FC_FULL]);
Re: [PATCH v2 03/24] net/cnxk: eswitch HW resource configuration
On Wed, Dec 20, 2023 at 12:58 AM Harman Kalra wrote:
>
> Configuring the hardware resources used by the eswitch device.
>
> Signed-off-by: Harman Kalra
> ---
>  drivers/net/cnxk/cnxk_eswitch.c | 206 
>  1 file changed, 206 insertions(+)
>
> +
>  static int
>  cnxk_eswitch_dev_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_dev)
>  {
> @@ -433,6 +630,12 @@ cnxk_eswitch_dev_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pc
>  	return rc;
> +free_mem:
> +	if (mz)

Not needed, as rte_memzone_free() has the check.

> +		rte_memzone_free(mz);
>  fail:
>  	return rc;
>  }
> --
> 2.18.0
>
Re: [PATCH v2 07/24] common/cnxk: interface to update VLAN TPID
On Wed, Dec 20, 2023 at 12:53 AM Harman Kalra wrote:
>
> Introducing an eswitch variant of the set VLAN TPID API which can be
> used for PF and VF.
>
> Signed-off-by: Harman Kalra
> +
> +int
> +roc_eswitch_nix_vlan_tpid_set(struct roc_nix *roc_nix, uint32_t type, uint16_t tpid, bool is_vf)
> +{
> +	struct nix *nix = roc_nix_to_nix_priv(roc_nix);
> +	struct dev *dev = &nix->dev;
> +	int rc = 0;

Across the series, please check the need for initializing rc to zero.
In this case, it is not needed.

> +
> +	/* Configuring for PF/VF */
> +	rc = nix_vlan_tpid_set(dev->mbox, dev->pf_func | is_vf, type, tpid);
> +	if (rc)
> +		plt_err("Failed to set tpid for PF, rc %d", rc);
> +
> +	return rc;
> +}
RE: [RFC] ethdev: introduce entropy calculation
> -Original Message- > From: Ori Kam > Sent: Wednesday, December 27, 2023 3:20 PM > To: Andrew Rybchenko ; NBU-Contact- > Thomas Monjalon (EXTERNAL) ; Stephen Hemminger > ; Ferruh Yigit > Cc: Dumitrescu, Cristian ; Dariusz Sosnowski > ; dev@dpdk.org; Raslan Darawsheh > > Subject: RE: [RFC] ethdev: introduce entropy calculation > > Hi Andrew, Stephen, Ferruh and Thomas, > > > -Original Message- > > From: Andrew Rybchenko > > Sent: Saturday, December 16, 2023 11:04 AM > > > > On 12/15/23 19:21, Thomas Monjalon wrote: > > > 15/12/2023 14:44, Ferruh Yigit: > > >> On 12/14/2023 5:26 PM, Stephen Hemminger wrote: > > >>> On Thu, 14 Dec 2023 17:18:25 + > > >>> Ori Kam wrote: > > >>> > > >> Since encap groups number of different 5 tuples together, if HW > > doesn’t know > > >> how to RSS > > >> based on the inner application will not be able to get any > > >> distribution of > > > packets. > > >> > > >> This value is used to reflect the inner packet on the outer header, > > >> so > > > distribution > > >> will be possible. > > >> > > >> The main use case is, if application does full offload and implements > > the encap > > > on > > >> the RX. > > >> For example: > > >> Ingress/FDB match on 5 tuple encap send to hairpin / different port > > >> in > > case of > > >> switch. > > >> > > > > > > Smart idea! So basically the user is able to get an idea on how good > > > the > > RSS > > > distribution is, correct? > > > > > > > Not exactly, this simply allows the distribution. > > Maybe entropy is a bad name, this is the name they use in the protocol, > > but in reality > > this is some hash calculated on the packet header before the encap and > > set in the encap header. > > Using this hash results in entropy for the packets. Which can be used > > for > > load balancing. > > > > Maybe better name would be: > > Rte_flow_calc_entropy_hash? 
> > > > or maybe rte_flow_calc_encap_hash (I like it less since it looks like > > we > > calculate the hash on the encap data and not the inner part) > > > > what do you think? > > >>> > > >>> Entropy has meaning in crypto and random numbers generators that is > > different from > > >>> this usage. So entropy is bad name to use. Maybe > > rte_flow_hash_distribution? > > >>> > > >> > > >> Hi Ori, > > >> > > >> Thank you for the description, it is more clear now. > > >> > > >> And unless this is specifically defined as 'entropy' in spec, I am too > > >> for rename. > > >> > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a > > >> level of entropy", but not exactly names it as entropy. > > > > > > Exactly my thought about the naming. > > > Good to see I am not alone thinking this naming is disturbing :) > > > > I'd avoid usage of term "entropy" in this patch. It is very confusing. > > What about rte_flow_calc_encap_hash? > > How about simply rte_flow_calc_hash? My understanding is this is a general-purpose hash that is not limited to encapsulation work.
RE: [RFC] ethdev: fast path async flow API
> -----Original Message-----
> From: Ivan Malov
> Sent: Wednesday, January 3, 2024 19:29
>
> Hi Dariusz,
>
> I appreciate your response. All to the point.
>
> I have to confess my question was inspired by the 23.11 merge commit in the
> OVS mailing list. I first thought that an obvious consumer for the async
> flow API could have been OVS but saw no usage of it in the current code.
> It was my impression that there had been some patches in OVS already,
> waiting either for approval/testing or for this particular optimisation to
> be accepted first.
>
> So far I've been mistaken -- there are no such patches, hence my question.
> Do we have real-world examples of the async flow usage? Should it be tested
> somehow...
>
> (I apologise in case I'm asking for too many clarifications).
>
> Thank you.

No need to apologize :)

Unfortunately, we are yet to see async flow API adoption in other open-source
projects. Until now, only direct NVIDIA customers have used the async flow API
in their products.

Best regards,
Dariusz Sosnowski
Re: [PATCH] dts: improve documentation
04/01/2024 13:34, Luca Vizzarro: > On 04/01/2024 10:52, Thomas Monjalon wrote: > >> DTS needs to know which nodes to connect to and what hardware to use on > >> those nodes. > >> -Once that's configured, DTS needs a DPDK tarball and it's ready to run. > >> +Once that's configured, DTS needs a DPDK tarball or a git ref ID and it's > >> ready to run. > > > > That's assuming DTS is compiling DPDK. > > We may want to provide an already compiled DPDK to DTS. > > Yes, that is correct. At the current state, DTS is always compiled from > source though, so it may be reasonable to leave it as it is until this > feature may be implemented. Nonetheless, my change just informs the user > of the (already implemented) feature that uses `git archive` from the > local repository to create a tarball. A sensible change would be to add > this explanation I have just given, but it is a technicality and it > won't really make a difference to the user. Yes I would like to make it clear in this doc that DTS is compiling DPDK. Please could you change to something like "DTS needs a DPDK tarball or a git ref ID to compile" ? I hope we will change it later to allow external compilation. > >> + (dts-py3.10) $ ./main.py --help > > > > Why adding this line? > > Just running `./main.py` will just throw a confusing error to the user. > I am in the process of sorting this out as it is misleading and not > helpful. Specifying the line in this case just hints to the user on the > origin of that help/usage document. Yes would be good to have a message to help the user instead of a confusing error. > > Should we remove the shell prefix referring to a specific Python version? > > I have purposely left the prefix to indicate that we are in a Poetry > shell environment, as that is a pre-requisite to run DTS. So more of an > implicit reminder. The Python version specified is in line with the > minimum requirement of DTS. OK > > In general it is better to avoid long lines, and split after a punctation. 
> > I think we should take the habit to always go to the next line after the > > end of a sentence. > > I left the output of `--help` under a code block as it is originally > printed in the console. Could surely amend it in the docs to be easier > to read, but the user could as easily print it themselves in their own > terminal in the comfort of their own environment. I was not referring to the console output. Maybe I misunderstood it. For the doc sentences, please try to split sentences on different lines. > >> - [DTS_OUTPUT_DIR] Output directory where dts > >> logs and results are > >> - saved. (default: output) > >> + [DTS_OUTPUT_DIR] Output directory where dts > >> logs and results are saved. > > > > dts -> DTS > > As above. The output of `--help` only changed as a result of not being > updated before in parallel with code changes. Consistently this is what > the user would see right now. It may or may not be a good idea to update > this whenever changed in the future. I did not understand it is part of the help message. > Nonetheless, I am keen to update the code as part of this patch to > resolve your comments. Yes please update the code for this small wording fix. > > Please don't add compilation configuration for now, > > I would like to work on the schema first. > > This is mostly imported from the old DTS and needs to be rethink. > > While I understand the concern on wanting to rework the schema, which is > a great point you make, it may be reasonable to provide something useful > to close the existing documentation gap. And incrementally updating from > there. If there is no realistic timeline set in place for a schema > rework, it may just be better to have something rather than nothing. And > certainly it would not be very useful to upstream a partial documentation. I don't know. I have big doubts about the current schema. I will review it with your doc patches. Can you please split this patch in 2 so that the schema doc is in a different patch? 
> Thank you a lot for your review! You have made some good points which > open up new potential tasks to add to the pipeline.
RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> -Original Message- > From: jer...@marvell.com > Sent: Tuesday, December 19, 2023 5:30 PM > To: dev@dpdk.org; Thomas Monjalon ; Ferruh Yigit > ; Andrew Rybchenko > Cc: ferruh.yi...@xilinx.com; ajit.khapa...@broadcom.com; > abo...@pensando.io; Xing, Beilei ; Richardson, Bruce > ; ch...@att.com; chenbo@intel.com; Loftus, > Ciara ; dsinghra...@marvell.com; Czeck, Ed > ; evge...@amazon.com; gr...@u256.net; > g.si...@nxp.com; zhouguoy...@huawei.com; Wang, Haiyue > ; hka...@marvell.com; heinrich.k...@corigine.com; > hemant.agra...@nxp.com; hyon...@cisco.com; igo...@amazon.com; > irussk...@marvell.com; jgraj...@cisco.com; Singh, Jasvinder > ; jianw...@trustnetic.com; > jiawe...@trustnetic.com; Wu, Jingjing ; > johnd...@cisco.com; john.mil...@atomicrules.com; linvi...@tuxdriver.com; > Wiles, Keith ; kirankum...@marvell.com; > ouli...@huawei.com; lir...@marvell.com; lon...@microsoft.com; > m...@semihalf.com; spin...@cesnet.cz; ma...@nvidia.com; Peters, Matt > ; maxime.coque...@redhat.com; > m...@semihalf.com; humi...@huawei.com; pna...@marvell.com; > ndabilpu...@marvell.com; Yang, Qiming ; Zhang, Qi Z > ; rad...@marvell.com; rahul.lakkire...@chelsio.com; > rm...@marvell.com; Xu, Rosen ; > sachin.sax...@oss.nxp.com; skotesh...@marvell.com; shsha...@marvell.com; > shaib...@amazon.com; Siegel, Shepard ; > asoma...@amd.com; somnath.ko...@broadcom.com; > sthem...@microsoft.com; Webster, Steven ; > sk...@marvell.com; mtetsu...@gmail.com; vbu...@marvell.com; > viachesl...@nvidia.com; Wang, Xiao W ; > cloud.wangxiao...@huawei.com; yisen.zhu...@huawei.com; Wang, Yong > ; xuanziya...@huawei.com; Dumitrescu, Cristian > ; Jerin Jacob > Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query > > From: Jerin Jacob > > Introduce a new API to retrieve the number of available free descriptors > in a Tx queue. Applications can leverage this API in the fast path to > inspect the Tx queue occupancy and take appropriate actions based on the > available free descriptors. 
> > A notable use case could be implementing Random Early Discard (RED) > in software based on Tx queue occupancy. > > Signed-off-by: Jerin Jacob > --- > doc/guides/nics/features.rst | 10 > doc/guides/nics/features/default.ini | 1 + > lib/ethdev/ethdev_trace_points.c | 3 ++ > lib/ethdev/rte_ethdev.h | 78 > lib/ethdev/rte_ethdev_core.h | 7 ++- > lib/ethdev/rte_ethdev_trace_fp.h | 8 +++ > 6 files changed, 106 insertions(+), 1 deletion(-) > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst > index f7d9980849..9d6655473a 100644 > --- a/doc/guides/nics/features.rst > +++ b/doc/guides/nics/features.rst > @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for > more details). > > * **[implements] eth_dev_ops**: ``get_monitor_addr`` > > +.. _nic_features_tx_queue_free_desc_query: > + > +Tx queue free descriptor query > +-- > + > +Supports to get the number of free descriptors in a Tx queue. > + > +* **[implements] eth_dev_ops**: ``tx_queue_free_desc_get``. > +* **[related] API**: ``rte_eth_tx_queue_free_desc_get()``. > + > .. 
_nic_features_other: > > Other dev ops not represented by a Feature > diff --git a/doc/guides/nics/features/default.ini > b/doc/guides/nics/features/default.ini > index 806cb033ff..b30002b1c1 100644 > --- a/doc/guides/nics/features/default.ini > +++ b/doc/guides/nics/features/default.ini > @@ -59,6 +59,7 @@ Packet type parsing = > Timesync = > Rx descriptor status = > Tx descriptor status = > +Tx free descriptor query = > Basic stats = > Extended stats = > Stats per queue = > diff --git a/lib/ethdev/ethdev_trace_points.c > b/lib/ethdev/ethdev_trace_points.c > index 91f71d868b..346f37f2e4 100644 > --- a/lib/ethdev/ethdev_trace_points.c > +++ b/lib/ethdev/ethdev_trace_points.c > @@ -481,6 +481,9 @@ > RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports, > RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity, > lib.ethdev.map_aggr_tx_affinity) > > +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_free_desc_get, > + lib.ethdev.tx_queue_free_desc_get) > + > RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy, > lib.ethdev.flow.copy) > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > index 77331ce652..033fcb8c9b 100644 > --- a/lib/ethdev/rte_ethdev.h > +++ b/lib/ethdev/rte_ethdev.h > @@ -6802,6 +6802,84 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t > rx_queue_id, > __rte_experimental > int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t > *ptypes, int num); > > +/** > + * @warning > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice > + * > + * Get the number of free descriptors in a Tx queue. > + * > + * This function retrieves the number of available free descriptors in a > + * transmit queue. Applications can use this AP
Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
On Thu, Jan 4, 2024 at 6:46 PM Dumitrescu, Cristian wrote: > > -Original Message- > > From: jer...@marvell.com > > Sent: Tuesday, December 19, 2023 5:30 PM > > To: dev@dpdk.org; Thomas Monjalon ; Ferruh Yigit > > ; Andrew Rybchenko > > Cc: [...] > > Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query > > > > From: Jerin Jacob > > > > Introduce a new API to retrieve the number of available free descriptors > > in a Tx queue.
Applications can leverage this API in the fast path to > > inspect the Tx queue occupancy and take appropriate actions based on the > > available free descriptors. > > > > A notable use case could be implementing Random Early Discard (RED) > > in software based on Tx queue occupancy. > > > > Signed-off-by: Jerin Jacob > > --- > > doc/guides/nics/features.rst | 10 > > doc/guides/nics/features/default.ini | 1 + > > lib/ethdev/ethdev_trace_points.c | 3 ++ > > lib/ethdev/rte_ethdev.h | 78 > > lib/ethdev/rte_ethdev_core.h | 7 ++- > > lib/ethdev/rte_ethdev_trace_fp.h | 8 +++ > > 6 files changed, 106 insertions(+), 1 deletion(-) > > Hi Jerin, Hi Cristian, > > I think having an API to get the number of free descriptors per queue is a > > good idea. Why have it only for TX queues and not for RX queues as well? I see no harm in adding this for Rx as well. I think it is better to have a separate API for each instead of adding an argument, since this is a fast path API. If so, we could add a new API when there is a PMD implementation or a need for it. > > Regards, > Cristian
RE: [EXT] [RFC PATCH] cryptodev: add sm2 key exchange and encryption for HW
Hi, > This commit adds comments for the proposal of addition of SM2 algorithm key > exchange and encryption/decryption operation. > > Signed-off-by: Arkadiusz Kusztal > --- > lib/cryptodev/rte_crypto_asym.h | 16 > 1 file changed, 16 insertions(+) > > diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h > index 39d3da3952..6911a14dbd 100644 > --- a/lib/cryptodev/rte_crypto_asym.h > +++ b/lib/cryptodev/rte_crypto_asym.h > @@ -639,6 +639,10 @@ struct rte_crypto_asym_xform { struct > rte_crypto_sm2_op_param { > enum rte_crypto_asym_op_type op_type; > /**< Signature generation or verification. */ > + /* > + * For key exchange operation, a new struct should be created. > + * Doing that, the current struct could be split into signature and > encryption. > + */ > > enum rte_crypto_auth_algorithm hash; > /**< Hash algorithm used in EC op. */ > @@ -672,6 +676,18 @@ struct rte_crypto_sm2_op_param { >* C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will >* be overwritten by the PMD with the encrypted length. >*/ > + /* SM2 encryption algorithm relies on certain cryptographic functions > + * that HW devices do not necessarily implement. > + * When C1 is an elliptic curve point, C2 and C3 need additional > + * operations like KDF and hash. The question here is: should only > + * elliptic curve output parameters (namely C1 and PB) be returned to > the user, > + * or should encryption be, in this case, computed within the PMD using > + * software methods, or should both options be available? > + */ I second splitting this struct for PKE (maybe _pke and _dsa). At the same time, handling these structs should be backed by a capability check, which is what I have been thinking of proposing as an asym op capability in this release. Right now, asymmetric capability is defined only by xform (not also by op), but we could add op capability as well, as below.
struct rte_cryptodev_capabilities caps_sm2[] = { .op = RTE_CRYPTO_OP_TYPE_ASYMMETRIC, { .asym = { .xform_capa = { .xform_type = RTE_CRYPTO_ASYM_XFORM_SM2, .op_types = ... }, .op_capa = [ { .op_type = RTE_CRYPTO_ASYM_OP_ENC, .capa = (1 << RTE_CRYPTO_ASYM_SM2_PKE_KDF | 1 << RTE_CRYPTO_ASYM_SM2_PKE_HASH) NEW ENUM } ] } } } Doing this, the hash_algos member in today's asym xform capability can eventually be removed, and it sounds better for an op. Also, this op capability check could be done once for the session. If you are also aligned, I can send an RFC for the capability check. > + /* The same applies to the key exchange in the HW. The second phase of > KE, most likely, > + * will go as far as to obtain xU,yU(xV,xV), where SW can easily > calculate SA. What does SA mean here? Signature algorithm? Thanks, Gowrishankar > + * Should then both options be available? > + */ > > rte_crypto_uint id; > /**< The SM2 id used by signer and verifier. */ > -- > 2.13.6
RE: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> > > Introduce a new API to retrieve the number of available free descriptors > > > in a Tx queue. Applications can leverage this API in the fast path to > > > inspect the Tx queue occupancy and take appropriate actions based on the > > > available free descriptors. > > > > > > A notable use case could be implementing Random Early Discard (RED) > > > in software based on Tx queue occupancy. > > > > > > Signed-off-by: Jerin Jacob > > > --- > > > doc/guides/nics/features.rst | 10 > > > doc/guides/nics/features/default.ini | 1 + > > > lib/ethdev/ethdev_trace_points.c | 3 ++ > > > lib/ethdev/rte_ethdev.h | 78 > > > lib/ethdev/rte_ethdev_core.h | 7 ++- > > > lib/ethdev/rte_ethdev_trace_fp.h | 8 +++ > > > 6 files changed, 106 insertions(+), 1 deletion(-) > > > > Hi Jerin, > > Hi Cristian, > > > > > I think having an API to get the number of free descriptors per queue is a > > good idea. Why have it only for TX queues and not for RX > queues as well? > > I see no harm in adding for Rx as well. I think, it is better to have > separate API for each instead of adding argument as it is fast path > API. > If so, we could add a new API when there is any PMD implementation or > need for this. I think for RX we already have similar one: /** @internal Get number of used descriptors on a receive queue. */ typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
RE: [RFC] ethdev: introduce entropy calculation
Hi Cristian, > -Original Message- > From: Dumitrescu, Cristian > Sent: Thursday, January 4, 2024 2:57 PM > > > >> > > > >> And unless this is specifically defined as 'entropy' in spec, I am too > > > >> for rename. > > > >> > > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a > > > >> level of entropy", but not exactly names it as entropy. > > > > > > > > Exactly my thought about the naming. > > > > Good to see I am not alone thinking this naming is disturbing :) > > > > > > I'd avoid usage of term "entropy" in this patch. It is very confusing. > > > > What about rte_flow_calc_encap_hash? > > > > > How about simply rte_flow_calc_hash? My understanding is this is a general- > purpose hash that is not limited to encapsulation work. Unfortunately, this is not a general-purpose hash. HW may implement a different hash for each use case. Also, the hash result length differs depending on the feature and even the target field. We can take your naming idea and change the parameters a bit: rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash) For the feature we will have at this point: NVGRE_HASH, SPORT_HASH. The attribute parameter will be empty for now, but it may be used later to add extra information for the hash if more information is required, for example, some key. In addition, we will also be able to merge the current function rte_flow_calc_table_hash, if we pass the missing parameters (table id, template id) in the attribute field. What do you think?
Re: [PATCH v9] gro: fix reordering of packets in GRO layer
On 2023/12/9 2:17 AM, Kumara Parameshwaran wrote: In the current implementation, when a packet is received with special TCP flag(s) set, only that packet is delivered out of order. There could be already coalesced packets in the GRO table belonging to the same flow but not delivered. This fix makes sure that the entire segment is delivered with the special flag(s) set, which is how Linux GRO is also implemented. Signed-off-by: Kumara Parameshwaran Co-authored-by: Kumara Parameshwaran --- If the received packet is not a pure ACK packet, we check if there are any previous packets in the flow; if present, we include the received packet in the coalescing logic as well and apply the flags of the last received packet to the entire segment, which avoids re-ordering. Let's say P1(PSH), P2(ACK), P3(ACK) are received in burst mode. P1 contains the PSH flag, and since there are no prior packets in the flow we copy it to unprocess_packets, while P2(ACK) and P3(ACK) are merged together. In the existing code, P2,P3 would be delivered as a single segment first and the unprocess_packets would be copied later, which causes reordering. With the patch, the unprocessed packets are copied first and then the packets from the GRO table. Testing done: the csum test-pmd was modified to support a GET request of 10MB from client to server via test-pmd (static ARP entries added on client and server). GRO and TSO were enabled in test-pmd, where the packets received from the client MAC would be sent to the server MAC and vice versa. In the above testing, without the patch the client observed re-ordering of 25 packets, and with the patch no packet re-ordering was observed. v2: Fix warnings in commit and comment. Do not consider packet as candidate to merge if it contains SYN/RST flag. v3: Fix warnings. v4: Rebase with master.
v5: Adding co-author email v6: Address review comments from the maintainer to restructure the code and handle only special flags PSH,FIN v7: Fix warnings and errors v8: Fix warnings and errors v9: Fix commit message lib/gro/gro_tcp.h | 11 lib/gro/gro_tcp4.c | 67 +- 2 files changed, 54 insertions(+), 24 deletions(-) diff --git a/lib/gro/gro_tcp.h b/lib/gro/gro_tcp.h index d926c4b8cc..137a03bc96 100644 --- a/lib/gro/gro_tcp.h +++ b/lib/gro/gro_tcp.h @@ -187,4 +187,15 @@ is_same_common_tcp_key(struct cmn_tcp_key *k1, struct cmn_tcp_key *k2) return (!memcmp(k1, k2, sizeof(struct cmn_tcp_key))); } +static inline void +update_tcp_hdr_flags(struct rte_tcp_hdr *tcp_hdr, struct rte_mbuf *pkt) +{ + struct rte_tcp_hdr *merged_tcp_hdr; + + merged_tcp_hdr = rte_pktmbuf_mtod_offset(pkt, struct rte_tcp_hdr *, pkt->l2_len + + pkt->l3_len); + merged_tcp_hdr->tcp_flags |= tcp_hdr->tcp_flags; + +} + #endif diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index 6645de592b..8af5a8d8a9 100644 --- a/lib/gro/gro_tcp4.c +++ b/lib/gro/gro_tcp4.c @@ -126,6 +126,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, uint32_t item_idx; uint32_t i, max_flow_num, remaining_flow_num; uint8_t find; + uint32_t item_start_idx; /* * Don't process the packet whose TCP header length is greater @@ -139,13 +140,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, tcp_hdr = (struct rte_tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len); hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len; - /* -* Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE -* or CWR set. 
-*/ - if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG) - return -1; - /* trim the tail padding bytes */ ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length); if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len)) @@ -183,6 +177,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) { if (is_same_tcp4_flow(tbl->flows[i].key, key)) { find = 1; + item_start_idx = tbl->flows[i].start_index; break; } remaining_flow_num--; @@ -190,28 +185,52 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt, } if (find == 0) { It is more likely to find a match flow. So better to put the below logic to the else statement. - sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq); - item_idx = insert_new_tcp_item(pkt, tbl->items, &tbl->item_num, - tbl->max_item_num, start_time, -
RE: [RFC] ethdev: fast path async flow API
Hi Konstantin, > -Original Message- > From: Konstantin Ananyev > Sent: Thursday, January 4, 2024 09:47 > > > This is a blocker, showstopper for me. > > +1 > > > > > Have you considered having something like > > >rte_flow_create_bulk() > > > > > > or better yet a Linux iouring style API? > > > > > > A ring style API would allow for better mixed operations across the > > > board and get rid of the I-cache overhead which is the root cause of the > needing inline. > > Existing async flow API is somewhat close to the io_uring interface. > > The difference being that queue is not directly exposed to the application. > > Application interacts with the queue using rte_flow_async_* APIs (e.g., > places operations in the queue, pushes them to the HW). > > Such design has some benefits over a flow API which exposes the queue to > the user: > > - Easier to use - Applications do not manage the queue directly, they do it > through exposed APIs. > > - Consistent with other DPDK APIs - In other libraries, queues are > manipulated through API, not directly by an application. > > - Lower memory usage - only HW primitives are needed (e.g., HW queue > > on PMD side), no need to allocate separate application queues. > > > > Bulking of flow operations is a tricky subject. > > Compared to packet processing, where it is desired to keep the > > manipulation of raw packet data to the minimum (e.g., only packet > > headers are accessed), during flow rule creation all items and actions must > be processed by PMD to create a flow rule. > > The amount of memory consumed by items and actions themselves during > this process might be nonnegligible. > > If flow rule operations were bulked, the size of working set of memory > > would increase, which could have negative consequences on the cache > behavior. > > So, it might be the case that by utilizing bulking the I-cache overhead is > removed, but the D-cache overhead is added. > > Is rte_flow struct really that big? 
> We do bulk processing for mbufs, crypto_ops, etc., and usually bulk > processing improves performance not degrades it. > Of course bulk size has to be somewhat reasonable. It does not really depend on rte_flow struct size itself (it's opaque to the user), but on the sizes of the items and actions which are the parameters for flow operations. To create a flow through the async flow API the following is needed: - array of items and their spec, - array of actions and their configuration, - pointer to template table, - indexes of pattern and actions templates to be used. If we assume a simple case of ETH/IPV4/TCP/END match and COUNT/RSS/END actions, then we have at most: - 4 items (32B each) + 3 specs (20B each) = 188B - 3 actions (16B each) + 2 configurations (4B and 40B) = 92B - 8B for table pointer - 2B for template indexes In total = 290B. The bulk API can be designed in a way that a single bulk operates on a single set of tables and templates - this would remove a few bytes. Flow actions can be based on actions templates (so no need for conf), but items' specs are still needed. This would leave us at 236B, so at least 4 cache lines (assuming everything is tightly packed) for a single flow and almost twice the size of the mbuf. Depending on the bulk size it might be a much more significant chunk of the cache. I don't want to dismiss the idea. I think it's worth evaluating. However, I'm not entirely confident that a bulking API would introduce performance benefits. Best regards, Dariusz Sosnowski
[PATCH v4] [PATCH 2/2] net/tap: fix buffer overflow for ptypes list
Incorrect ptypes list causes buffer overflow for Address Sanitizer run. Previously, the last element in the ptypes lists was expected to be "RTE_PTYPE_UNKNOWN" for rte_eth_dev_get_supported_ptypes(), but this was not clearly documented and many PMDs did not follow this convention. Instead, the dev_supported_ptypes_get() function pointer now returns the number of elements to eliminate the need for "RTE_PTYPE_UNKNOWN" as the last item. Fixes: 47909357a069 ("ethdev: make device operations struct private") Cc: ferruh.yi...@intel.com Cc: sta...@dpdk.org V4: The first patch is for drivers for backporting. The second patch is for driver API update. Signed-off-by: Sivaramakrishnan Venkat --- drivers/net/atlantic/atl_ethdev.c | 13 - drivers/net/axgbe/axgbe_ethdev.c | 13 - drivers/net/bnxt/bnxt_ethdev.c | 7 --- drivers/net/cnxk/cnxk_ethdev.h | 3 ++- drivers/net/cnxk/cnxk_lookup.c | 7 --- drivers/net/cpfl/cpfl_ethdev.c | 7 --- drivers/net/cxgbe/cxgbe_ethdev.c | 10 ++ drivers/net/cxgbe/cxgbe_pfvf.h | 3 ++- drivers/net/dpaa/dpaa_ethdev.c | 11 +++ drivers/net/dpaa2/dpaa2_ethdev.c | 10 ++ drivers/net/e1000/igb_ethdev.c | 13 - drivers/net/enetc/enetc_ethdev.c | 7 --- drivers/net/enic/enic_ethdev.c | 17 ++--- drivers/net/failsafe/failsafe_ops.c| 5 +++-- drivers/net/fm10k/fm10k_ethdev.c | 15 +-- drivers/net/hns3/hns3_rxtx.c | 16 +--- drivers/net/hns3/hns3_rxtx.h | 3 ++- drivers/net/i40e/i40e_rxtx.c | 11 +++ drivers/net/i40e/i40e_rxtx.h | 3 ++- drivers/net/iavf/iavf_ethdev.c | 10 ++ drivers/net/ice/ice_dcf_ethdev.c | 7 --- drivers/net/ice/ice_rxtx.c | 23 ++- drivers/net/ice/ice_rxtx.h | 3 ++- drivers/net/idpf/idpf_ethdev.c | 7 --- drivers/net/igc/igc_ethdev.c | 10 ++ drivers/net/ionic/ionic_rxtx.c | 7 --- drivers/net/ionic/ionic_rxtx.h | 3 ++- drivers/net/ixgbe/ixgbe_ethdev.c | 18 -- drivers/net/mana/mana.c| 7 --- drivers/net/mlx4/mlx4.h| 3 ++- drivers/net/mlx4/mlx4_ethdev.c | 17 ++--- drivers/net/mlx5/mlx5.h| 3 ++- drivers/net/mlx5/mlx5_ethdev.c | 11 +++ drivers/net/mvneta/mvneta_ethdev.c | 
7 --- drivers/net/mvpp2/mrvl_ethdev.c| 7 --- drivers/net/netvsc/hn_var.h| 3 ++- drivers/net/netvsc/hn_vf.c | 5 +++-- drivers/net/nfp/nfp_net_common.c | 15 ++- drivers/net/nfp/nfp_net_common.h | 3 ++- drivers/net/ngbe/ngbe_ethdev.c | 9 ++--- drivers/net/ngbe/ngbe_ethdev.h | 3 ++- drivers/net/ngbe/ngbe_ptypes.c | 3 ++- drivers/net/ngbe/ngbe_ptypes.h | 2 +- drivers/net/octeontx/octeontx_ethdev.c | 11 +++ drivers/net/pfe/pfe_ethdev.c | 11 +++ drivers/net/qede/qede_ethdev.c | 11 +++ drivers/net/sfc/sfc_dp_rx.h| 2 +- drivers/net/sfc/sfc_ef10.h | 3 ++- drivers/net/sfc/sfc_ef100_rx.c | 7 --- drivers/net/sfc/sfc_ef10_rx.c | 11 ++- drivers/net/sfc/sfc_ethdev.c | 5 +++-- drivers/net/sfc/sfc_rx.c | 7 --- drivers/net/tap/rte_eth_tap.c | 7 --- drivers/net/thunderx/nicvf_ethdev.c| 10 +- drivers/net/txgbe/txgbe_ethdev.c | 9 ++--- drivers/net/txgbe/txgbe_ethdev.h | 3 ++- drivers/net/txgbe/txgbe_ptypes.c | 6 +++--- drivers/net/txgbe/txgbe_ptypes.h | 2 +- drivers/net/vmxnet3/vmxnet3_ethdev.c | 14 +- lib/ethdev/ethdev_driver.h | 3 ++- lib/ethdev/rte_ethdev.c| 10 ++ 61 files changed, 299 insertions(+), 193 deletions(-) diff --git a/drivers/net/atlantic/atl_ethdev.c b/drivers/net/atlantic/atl_ethdev.c index 3a028f4290..bc087738e4 100644 --- a/drivers/net/atlantic/atl_ethdev.c +++ b/drivers/net/atlantic/atl_ethdev.c @@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev); static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size); -static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev); +static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev, + size_t *no_of_elements); static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu); @@ -1132,7 +1133,8 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) } static const uint32_t * -atl_dev_supported_ptypes_get(struct rte_eth_dev *dev) +atl_dev_supported_ptypes_get(struct rte_eth_dev *dev, + size_t *no_of_elements) { 
static const u
[dpdk-dev v4 2/2] net/tap: fix buffer overflow for ptypes list
[PATCH 1] - net/tap: fix buffer overflow for ptypes list through update of the last element. The first patch is for drivers, for backporting. [PATCH 2] - net/tap: fix buffer overflow for ptypes list through driver API update. The second patch is the driver API update. Sivaramakrishnan Venkat (2): net/tap: fix buffer overflow for ptypes list through update of the last element net/tap: fix buffer overflow for ptypes list through driver API update drivers/net/atlantic/atl_ethdev.c | 13 - drivers/net/axgbe/axgbe_ethdev.c | 13 - drivers/net/bnxt/bnxt_ethdev.c | 7 --- drivers/net/cnxk/cnxk_ethdev.h | 3 ++- drivers/net/cnxk/cnxk_lookup.c | 7 --- drivers/net/cpfl/cpfl_ethdev.c | 7 --- drivers/net/cxgbe/cxgbe_ethdev.c | 10 ++ drivers/net/cxgbe/cxgbe_pfvf.h | 3 ++- drivers/net/dpaa/dpaa_ethdev.c | 8 ++-- drivers/net/dpaa2/dpaa2_ethdev.c | 10 ++ drivers/net/e1000/igb_ethdev.c | 13 - drivers/net/enetc/enetc_ethdev.c | 7 --- drivers/net/enic/enic_ethdev.c | 17 ++--- drivers/net/failsafe/failsafe_ops.c| 5 +++-- drivers/net/fm10k/fm10k_ethdev.c | 15 +-- drivers/net/hns3/hns3_rxtx.c | 16 +--- drivers/net/hns3/hns3_rxtx.h | 3 ++- drivers/net/i40e/i40e_rxtx.c | 11 +++ drivers/net/i40e/i40e_rxtx.h | 3 ++- drivers/net/iavf/iavf_ethdev.c | 10 ++ drivers/net/ice/ice_dcf_ethdev.c | 7 --- drivers/net/ice/ice_rxtx.c | 23 ++- drivers/net/ice/ice_rxtx.h | 3 ++- drivers/net/idpf/idpf_ethdev.c | 7 --- drivers/net/igc/igc_ethdev.c | 10 ++ drivers/net/ionic/ionic_rxtx.c | 7 --- drivers/net/ionic/ionic_rxtx.h | 3 ++- drivers/net/ixgbe/ixgbe_ethdev.c | 18 -- drivers/net/mana/mana.c| 7 --- drivers/net/mlx4/mlx4.h| 3 ++- drivers/net/mlx4/mlx4_ethdev.c | 17 ++--- drivers/net/mlx5/mlx5.h| 3 ++- drivers/net/mlx5/mlx5_ethdev.c | 11 +++ drivers/net/mvneta/mvneta_ethdev.c | 4 +++- drivers/net/mvpp2/mrvl_ethdev.c| 4 +++- drivers/net/netvsc/hn_var.h| 3 ++- drivers/net/netvsc/hn_vf.c | 5 +++-- drivers/net/nfp/nfp_net_common.c | 14 ++ drivers/net/nfp/nfp_net_common.h | 3 ++- drivers/net/ngbe/ngbe_ethdev.c 
| 9 ++--- drivers/net/ngbe/ngbe_ethdev.h | 3 ++- drivers/net/ngbe/ngbe_ptypes.c | 3 ++- drivers/net/ngbe/ngbe_ptypes.h | 2 +- drivers/net/octeontx/octeontx_ethdev.c | 11 +++ drivers/net/pfe/pfe_ethdev.c | 8 ++-- drivers/net/qede/qede_ethdev.c | 11 +++ drivers/net/sfc/sfc_dp_rx.h| 2 +- drivers/net/sfc/sfc_ef10.h | 3 ++- drivers/net/sfc/sfc_ef100_rx.c | 7 --- drivers/net/sfc/sfc_ef10_rx.c | 11 ++- drivers/net/sfc/sfc_ethdev.c | 5 +++-- drivers/net/sfc/sfc_rx.c | 7 --- drivers/net/tap/rte_eth_tap.c | 6 -- drivers/net/thunderx/nicvf_ethdev.c| 8 +--- drivers/net/txgbe/txgbe_ethdev.c | 9 ++--- drivers/net/txgbe/txgbe_ethdev.h | 3 ++- drivers/net/txgbe/txgbe_ptypes.c | 6 +++--- drivers/net/txgbe/txgbe_ptypes.h | 2 +- drivers/net/vmxnet3/vmxnet3_ethdev.c | 14 +- lib/ethdev/ethdev_driver.h | 3 ++- lib/ethdev/rte_ethdev.c| 10 ++ 61 files changed, 295 insertions(+), 181 deletions(-) -- 2.25.1
[dpdk-dev v4 1/2] net/tap: fix buffer overflow for ptypes list through update of last element
An incorrect ptypes list causes a buffer overflow in an Address Sanitizer run.
The last element in the ptypes list is expected to be "RTE_PTYPE_UNKNOWN" for
rte_eth_dev_get_supported_ptypes(): the loop there iterates until it finds
"RTE_PTYPE_UNKNOWN" to detect the last element of the ptypes array.
Fix the ptypes lists for the affected drivers.

Fixes: 0849ac3b6122 ("net/tap: add packet type management")
Fixes: a7bdc3bd4244 ("net/dpaa: support packet type parsing")
Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton")
Fixes: f3f0d77db6b0 ("net/mrvl: support packet type parsing")
Fixes: 78a38edf66de ("ethdev: query supported packet types")
Fixes: 659b494d3d88 ("net/pfe: add packet types and basic statistics")
Fixes: 398a1be14168 ("net/thunderx: remove generic passX references")
Cc: pascal.ma...@6wind.com
Cc: z...@semihalf.com
Cc: t...@semihalf.com
Cc: jianfeng@intel.com
Cc: g.si...@nxp.com
Cc: jerin.ja...@caviumnetworks.com
Cc: sta...@dpdk.org

Signed-off-by: Sivaramakrishnan Venkat
---
 drivers/net/dpaa/dpaa_ethdev.c      | 3 ++-
 drivers/net/mvneta/mvneta_ethdev.c  | 3 ++-
 drivers/net/mvpp2/mrvl_ethdev.c     | 3 ++-
 drivers/net/nfp/nfp_net_common.c    | 1 +
 drivers/net/pfe/pfe_ethdev.c        | 3 ++-
 drivers/net/tap/rte_eth_tap.c       | 1 +
 drivers/net/thunderx/nicvf_ethdev.c | 2 ++
 7 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index ef4c06db6a..779bdc5860 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -363,7 +363,8 @@ dpaa_supported_ptypes_get(struct rte_eth_dev *dev)
 		RTE_PTYPE_L4_TCP,
 		RTE_PTYPE_L4_UDP,
 		RTE_PTYPE_L4_SCTP,
-		RTE_PTYPE_TUNNEL_ESP
+		RTE_PTYPE_TUNNEL_ESP,
+		RTE_PTYPE_UNKNOWN
 	};

 	PMD_INIT_FUNC_TRACE();
diff --git a/drivers/net/mvneta/mvneta_ethdev.c b/drivers/net/mvneta/mvneta_ethdev.c
index daa69e533a..212c300c14 100644
--- a/drivers/net/mvneta/mvneta_ethdev.c
+++ b/drivers/net/mvneta/mvneta_ethdev.c
@@ -198,7 +198,8 @@ mvneta_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused)
 		RTE_PTYPE_L3_IPV4,
 		RTE_PTYPE_L3_IPV6,
 		RTE_PTYPE_L4_TCP,
-		RTE_PTYPE_L4_UDP
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
 	};

 	return ptypes;
diff --git a/drivers/net/mvpp2/mrvl_ethdev.c b/drivers/net/mvpp2/mrvl_ethdev.c
index c12364941d..4cc64c7cad 100644
--- a/drivers/net/mvpp2/mrvl_ethdev.c
+++ b/drivers/net/mvpp2/mrvl_ethdev.c
@@ -1777,7 +1777,8 @@ mrvl_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused)
 		RTE_PTYPE_L3_IPV6_EXT,
 		RTE_PTYPE_L2_ETHER_ARP,
 		RTE_PTYPE_L4_TCP,
-		RTE_PTYPE_L4_UDP
+		RTE_PTYPE_L4_UDP,
+		RTE_PTYPE_UNKNOWN
 	};

 	return ptypes;
diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c
index e969b840d6..46d0e07850 100644
--- a/drivers/net/nfp/nfp_net_common.c
+++ b/drivers/net/nfp/nfp_net_common.c
@@ -1299,6 +1299,7 @@ nfp_net_supported_ptypes_get(struct rte_eth_dev *dev)
 		RTE_PTYPE_INNER_L4_NONFRAG,
 		RTE_PTYPE_INNER_L4_ICMP,
 		RTE_PTYPE_INNER_L4_SCTP,
+		RTE_PTYPE_UNKNOWN
 	};

 	if (dev->rx_pkt_burst != nfp_net_recv_pkts)
diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c
index 551f3cf193..0073dd7405 100644
--- a/drivers/net/pfe/pfe_ethdev.c
+++ b/drivers/net/pfe/pfe_ethdev.c
@@ -520,7 +520,8 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev)
 		RTE_PTYPE_L3_IPV6_EXT,
 		RTE_PTYPE_L4_TCP,
 		RTE_PTYPE_L4_UDP,
-		RTE_PTYPE_L4_SCTP
+		RTE_PTYPE_L4_SCTP,
+		RTE_PTYPE_UNKNOWN
 	};

 	if (dev->rx_pkt_burst == pfe_recv_pkts ||
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b41fa971cb..3fa03cdbee 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1803,6 +1803,7 @@ tap_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused)
 		RTE_PTYPE_L4_UDP,
 		RTE_PTYPE_L4_TCP,
 		RTE_PTYPE_L4_SCTP,
+		RTE_PTYPE_UNKNOWN
 	};

 	return ptypes;
diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c
index a504d41dfe..5a0c3dc4a6 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -392,12 +392,14 @@ nicvf_dev_supported_ptypes_get(struct rte_eth_dev *dev)
 		RTE_PTYPE_L4_TCP,
 		RTE_PTYPE_L4_UDP,
 		RTE_PTYPE_L4_FRAG,
+		RTE_PTYPE_UNKNOWN
 	};
 	static const uint32_t ptypes_tunnel[] = {
 		RTE_PTYPE_TUNNEL_GRE,
 		RTE_PTYPE_TUNNEL_GENEVE,
 		RTE_PTYPE_T
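The failure mode fixed above is easy to see in plain C: rte_eth_dev_get_supported_ptypes() scans the array returned by the driver until it hits RTE_PTYPE_UNKNOWN (which is 0 in DPDK), so a list missing that sentinel is read past its end. A minimal sketch — the ptype values below are the real DPDK constants, but the helper name and the iteration are illustrative, not the library code itself:

```c
#include <stddef.h>
#include <stdint.h>

#define RTE_PTYPE_UNKNOWN  0     /* in DPDK, RTE_PTYPE_UNKNOWN is 0 */
#define RTE_PTYPE_L2_ETHER 0x1   /* real DPDK constant values */
#define RTE_PTYPE_L3_IPV4  0x10
#define RTE_PTYPE_L4_UDP   0x200

/* Count entries the way rte_eth_dev_get_supported_ptypes() iterates:
 * stop only when the sentinel is found. Without a trailing
 * RTE_PTYPE_UNKNOWN, the loop reads past the array -- the out-of-bounds
 * access the Address Sanitizer reported. */
static size_t
count_ptypes(const uint32_t *ptypes)
{
	size_t n = 0;

	while (ptypes[n] != RTE_PTYPE_UNKNOWN)
		n++;
	return n;
}

/* Correctly terminated list, as after this fix. */
static const uint32_t good_ptypes[] = {
	RTE_PTYPE_L2_ETHER,
	RTE_PTYPE_L3_IPV4,
	RTE_PTYPE_L4_UDP,
	RTE_PTYPE_UNKNOWN, /* sentinel: loop terminates inside the array */
};
```

With the sentinel present the scan stops at index 3; removing it makes the loop's behavior undefined, which is exactly why every driver list must end with RTE_PTYPE_UNKNOWN under the old contract.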
[dpdk-dev v4 2/2] net/tap: fix buffer overflow for ptypes list through driver API update
An incorrect ptypes list causes a buffer overflow in an Address Sanitizer run.
Previously, the last element in the ptypes list was expected to be
"RTE_PTYPE_UNKNOWN" for rte_eth_dev_get_supported_ptypes(), but this was not
clearly documented and many PMDs did not follow this convention. Instead, the
dev_supported_ptypes_get() function pointer now also returns the number of
elements, eliminating the need for "RTE_PTYPE_UNKNOWN" as the last item.

Fixes: 47909357a069 ("ethdev: make device operations struct private")
Cc: ferruh.yi...@intel.com
Cc: sta...@dpdk.org

Signed-off-by: Sivaramakrishnan Venkat
---
 drivers/net/atlantic/atl_ethdev.c      | 13 -
 drivers/net/axgbe/axgbe_ethdev.c       | 13 -
 drivers/net/bnxt/bnxt_ethdev.c         |  7 ---
 drivers/net/cnxk/cnxk_ethdev.h         |  3 ++-
 drivers/net/cnxk/cnxk_lookup.c         |  7 ---
 drivers/net/cpfl/cpfl_ethdev.c         |  7 ---
 drivers/net/cxgbe/cxgbe_ethdev.c       | 10 ++
 drivers/net/cxgbe/cxgbe_pfvf.h         |  3 ++-
 drivers/net/dpaa/dpaa_ethdev.c         | 11 +++
 drivers/net/dpaa2/dpaa2_ethdev.c       | 10 ++
 drivers/net/e1000/igb_ethdev.c         | 13 -
 drivers/net/enetc/enetc_ethdev.c       |  7 ---
 drivers/net/enic/enic_ethdev.c         | 17 ++---
 drivers/net/failsafe/failsafe_ops.c    |  5 +++--
 drivers/net/fm10k/fm10k_ethdev.c       | 15 +--
 drivers/net/hns3/hns3_rxtx.c           | 16 +---
 drivers/net/hns3/hns3_rxtx.h           |  3 ++-
 drivers/net/i40e/i40e_rxtx.c           | 11 +++
 drivers/net/i40e/i40e_rxtx.h           |  3 ++-
 drivers/net/iavf/iavf_ethdev.c         | 10 ++
 drivers/net/ice/ice_dcf_ethdev.c       |  7 ---
 drivers/net/ice/ice_rxtx.c             | 23 ++-
 drivers/net/ice/ice_rxtx.h             |  3 ++-
 drivers/net/idpf/idpf_ethdev.c         |  7 ---
 drivers/net/igc/igc_ethdev.c           | 10 ++
 drivers/net/ionic/ionic_rxtx.c         |  7 ---
 drivers/net/ionic/ionic_rxtx.h         |  3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c       | 18 --
 drivers/net/mana/mana.c                |  7 ---
 drivers/net/mlx4/mlx4.h                |  3 ++-
 drivers/net/mlx4/mlx4_ethdev.c         | 17 ++---
 drivers/net/mlx5/mlx5.h                |  3 ++-
 drivers/net/mlx5/mlx5_ethdev.c         | 11 +++
 drivers/net/mvneta/mvneta_ethdev.c     |  7 ---
 drivers/net/mvpp2/mrvl_ethdev.c        |  7 ---
 drivers/net/netvsc/hn_var.h            |  3 ++-
 drivers/net/netvsc/hn_vf.c             |  5 +++--
 drivers/net/nfp/nfp_net_common.c       | 15 ++-
 drivers/net/nfp/nfp_net_common.h       |  3 ++-
 drivers/net/ngbe/ngbe_ethdev.c         |  9 ++---
 drivers/net/ngbe/ngbe_ethdev.h         |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.c         |  3 ++-
 drivers/net/ngbe/ngbe_ptypes.h         |  2 +-
 drivers/net/octeontx/octeontx_ethdev.c | 11 +++
 drivers/net/pfe/pfe_ethdev.c           | 11 +++
 drivers/net/qede/qede_ethdev.c         | 11 +++
 drivers/net/sfc/sfc_dp_rx.h            |  2 +-
 drivers/net/sfc/sfc_ef10.h             |  3 ++-
 drivers/net/sfc/sfc_ef100_rx.c         |  7 ---
 drivers/net/sfc/sfc_ef10_rx.c          | 11 ++-
 drivers/net/sfc/sfc_ethdev.c           |  5 +++--
 drivers/net/sfc/sfc_rx.c               |  7 ---
 drivers/net/tap/rte_eth_tap.c          |  7 ---
 drivers/net/thunderx/nicvf_ethdev.c    | 10 +-
 drivers/net/txgbe/txgbe_ethdev.c       |  9 ++---
 drivers/net/txgbe/txgbe_ethdev.h       |  3 ++-
 drivers/net/txgbe/txgbe_ptypes.c       |  6 +++---
 drivers/net/txgbe/txgbe_ptypes.h       |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c   | 14 +-
 lib/ethdev/ethdev_driver.h             |  3 ++-
 lib/ethdev/rte_ethdev.c                | 10 ++
 61 files changed, 299 insertions(+), 193 deletions(-)

diff --git a/drivers/net/atlantic/atl_ethdev.c b/drivers/net/atlantic/atl_ethdev.c
index 3a028f4290..bc087738e4 100644
--- a/drivers/net/atlantic/atl_ethdev.c
+++ b/drivers/net/atlantic/atl_ethdev.c
@@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev);
 static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
 			      size_t fw_size);

-static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev);
+static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+	size_t *no_of_elements);

 static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
@@ -1132,7 +1133,8 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 }

 static const uint32_t *
-atl_dev_supported_ptypes_get(struct rte_eth_dev *dev)
+atl_dev_supported_ptypes_get(struct rte_eth_dev *dev,
+	size_t *no_of_elements)
 {
 	static const uint32_t ptypes[] = {
 		RTE_PTYPE_L2_ETHER,
@@ -1143,12 +1145,13 @@
atl_dev_support
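The shape of the updated callback in this second patch can be sketched as follows. The real dev op also receives the `struct rte_eth_dev *` pointer, which is omitted here so the sketch stays self-contained; the function and variable names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

#define RTE_PTYPE_L2_ETHER 0x1   /* real DPDK constant values */
#define RTE_PTYPE_L3_IPV4  0x10
#define RTE_PTYPE_L4_UDP   0x200

/* Driver-side callback after the API change: the PMD reports the array
 * length explicitly through *no_of_elements, so no RTE_PTYPE_UNKNOWN
 * terminator is needed and a forgotten sentinel can no longer cause an
 * out-of-bounds scan in the ethdev layer. */
static const uint32_t *
example_supported_ptypes_get(size_t *no_of_elements)
{
	static const uint32_t ptypes[] = {
		RTE_PTYPE_L2_ETHER,
		RTE_PTYPE_L3_IPV4,
		RTE_PTYPE_L4_UDP,
	};

	*no_of_elements = sizeof(ptypes) / sizeof(ptypes[0]);
	return ptypes;
}
```

The consumer then iterates exactly `*no_of_elements` times instead of scanning for a sentinel, which is why the second patch can drop the RTE_PTYPE_UNKNOWN entries the first patch added.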
Re: [RFC] ethdev: introduce entropy calculation
04/01/2024 15:33, Ori Kam:
> Hi Cristian,
>
> > From: Dumitrescu, Cristian
> > Sent: Thursday, January 4, 2024 2:57 PM
> >
> > > > >> And unless this is specifically defined as 'entropy' in spec, I am too
> > > > >> for rename.
> > > > >>
> > > > >> At least in VXLAN spec, it is mentioned that this field is to "enable a
> > > > >> level of entropy", but not exactly names it as entropy.
> > > > >
> > > > > Exactly my thought about the naming.
> > > > > Good to see I am not alone thinking this naming is disturbing :)
> > > >
> > > > I'd avoid usage of term "entropy" in this patch. It is very confusing.
> > >
> > > What about rte_flow_calc_encap_hash?
> >
> > How about simply rte_flow_calc_hash? My understanding is this is a
> > general-purpose hash that is not limited to encapsulation work.
>
> Unfortunately, this is not a general-purpose hash. HW may implement a
> different hash for each use case.
> Also, the hash result length differs depending on the feature and even on the
> target field.
>
> We can take your naming idea and change the parameters a bit:
> rte_flow_calc_hash(port, feature, *attribute, pattern, hash_len, *hash)
>
> For the feature we will have at this point:
> NVGRE_HASH,
> SPORT_HASH
>
> The attribute parameter will be empty for now, but it may be used later to
> add extra information for the hash if more information is required, for
> example, some key.
> In addition, we will also be able to merge the current function
> rte_flow_calc_table_hash, if we pass the missing parameters (table id,
> template id) in the attribute field.
>
> What do you think?

I like the idea of having a single function for HW hashes.
Is there an impact on performance? How sensitive is it?
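To make the proposed signature concrete, here is a toy stand-in. Everything in it is hypothetical — the enum values, the FNV-style mixing, and the simplified parameter list (the `port` and `attribute` arguments are dropped) are illustrative only. The point it demonstrates is the one Ori makes above: the `feature` argument selects a feature-specific hash, so this is not a general-purpose hash function:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature selector mirroring NVGRE_HASH / SPORT_HASH
 * from the discussion. */
enum flow_hash_feature {
	FLOW_HASH_NVGRE,
	FLOW_HASH_SPORT,
};

/* Toy stand-in for the device hash: a different seed per feature, to
 * illustrate that the result is feature-specific. A real PMD would
 * compute whatever its hardware computes for that feature. */
static int
flow_calc_hash(enum flow_hash_feature feature, const uint8_t *pattern,
	       size_t len, uint32_t *hash)
{
	uint32_t h = (feature == FLOW_HASH_NVGRE) ? 0x9e3779b9u : 0x85ebca6bu;
	size_t i;

	for (i = 0; i < len; i++)
		h = (h ^ pattern[i]) * 16777619u; /* FNV-style mixing step */
	*hash = h;
	return 0; /* the real API would return negative errno on failure */
}
```

Because the mixing step is invertible, two distinct seeds always yield distinct results for the same input, i.e. the same pattern hashed under two different features never collides here — which mirrors why a single entry point with a feature selector can still back several distinct hardware hashes.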
Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
04/01/2024 15:21, Konstantin Ananyev:
> > > > Introduce a new API to retrieve the number of available free descriptors
> > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > available free descriptors.
> > > >
> > > > A notable use case could be implementing Random Early Discard (RED)
> > > > in software based on Tx queue occupancy.
> > > >
> > > > Signed-off-by: Jerin Jacob
> > >
> > > I think having an API to get the number of free descriptors per queue is
> > > a good idea. Why have it only for TX queues and not for RX queues as well?
> >
> > I see no harm in adding it for Rx as well. I think it is better to have a
> > separate API for each, instead of adding an argument, as it is a fast path
> > API. If so, we could add a new API when there is any PMD implementation or
> > need for this.
>
> I think for RX we already have a similar one:
> /** @internal Get number of used descriptors on a receive queue. */
> typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);

rte_eth_rx_queue_count() gives the number of Rx used descriptors
rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
rte_eth_tx_descriptor_status() gives the status of one Tx descriptor

This patch is adding a function to get Tx available descriptors,
rte_eth_tx_queue_free_desc_get().
I can see a symmetry with rte_eth_rx_queue_count().
For consistency I would rename it to rte_eth_tx_queue_free_count().

Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
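The RED use case named in the commit message can be sketched on top of such a query. This is a minimal illustration, not DPDK code: the thresholds are arbitrary, and the application would feed in the result of the new free-descriptor query (whatever its final name) together with the configured ring size:

```c
#include <stdint.h>

/* Classic RED-style drop probability (in percent) as a function of Tx
 * queue occupancy: no drops below min_th, drop everything above max_th,
 * and ramp linearly in between. Thresholds here are illustrative. */
static uint32_t
red_drop_percent(uint32_t free_desc, uint32_t ring_size)
{
	const uint32_t min_th = ring_size / 2;      /* start dropping at 50% full */
	const uint32_t max_th = ring_size * 9 / 10; /* drop all above 90% full */
	uint32_t used = ring_size - free_desc;

	if (used <= min_th)
		return 0;
	if (used >= max_th)
		return 100;
	return (used - min_th) * 100 / (max_th - min_th);
}
```

In the fast path the application would call the free-descriptor query per burst, convert the result to a drop probability as above, and discard packets before enqueueing them, shedding load early instead of blocking on a full ring.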
[PATCH] event/cnxk: use WFE LDP loop for getwork routine
From: Pavan Nikhilesh

Use WFE LDP loop while polling for GETWORK completion for better power
savings. Disabled by default; it can be enabled by setting `RTE_ARM_USE_WFE`
to `true` in `config/arm/meson.build`.

Signed-off-by: Pavan Nikhilesh
---
 doc/guides/eventdevs/cnxk.rst     |  9 ++
 drivers/event/cnxk/cn10k_worker.h | 52 +--
 2 files changed, 52 insertions(+), 9 deletions(-)

diff --git a/doc/guides/eventdevs/cnxk.rst b/doc/guides/eventdevs/cnxk.rst
index cccb8a0304..d62c143c77 100644
--- a/doc/guides/eventdevs/cnxk.rst
+++ b/doc/guides/eventdevs/cnxk.rst
@@ -198,6 +198,15 @@ Runtime Config Options
    -a 0002:0e:00.0,tim_eclk_freq=12288-10-0

+Power Savings on CN10K
+--
+
+ARM cores can additionally use WFE when polling for transactions on SSO bus
+to save power i.e., in the event dequeue call ARM core can enter WFE and exit
+when either work has been scheduled or dequeue timeout has reached.
+This can be enabled by setting ``RTE_ARM_USE_WFE`` to ``true`` in
+``config/arm/meson.build``.
+
 Debugging Options
 -

diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h
index 8aa916fa12..92d5190842 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -250,23 +250,57 @@ cn10k_sso_hws_get_work(struct cn10k_sso_hws *ws, struct rte_event *ev,
 	gw.get_work = ws->gw_wdata;
 #if defined(RTE_ARCH_ARM64)
-#if !defined(__clang__)
-	asm volatile(
-		PLT_CPU_FEATURE_PREAMBLE
-		"caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
-		: [wdata] "+r"(gw.get_work)
-		: [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
-		: "memory");
-#else
+#if defined(__clang__)
 	register uint64_t x0 __asm("x0") = (uint64_t)gw.u64[0];
 	register uint64_t x1 __asm("x1") = (uint64_t)gw.u64[1];
+#if defined(RTE_ARM_USE_WFE)
+	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "	ldp %[x0], %[x1], [%[tag_loc]]	\n"
+		     "	tbz %[x0], %[pend_gw], done%=	\n"
+		     "	sevl				\n"
+		     "rty%=:	wfe			\n"
+		     "	ldp %[x0], %[x1], [%[tag_loc]]	\n"
+		     "	tbnz %[x0], %[pend_gw], rty%=	\n"
+		     "done%=:				\n"
+		     "	dmb ld				\n"
+		     : [x0] "+r" (x0), [x1] "+r" (x1)
+		     : [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+		       [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+		     : "memory");
+#else
 	asm volatile(".arch armv8-a+lse\n"
 		     "caspal %[x0], %[x1], %[x0], %[x1], [%[dst]]\n"
-		     : [x0] "+r"(x0), [x1] "+r"(x1)
+		     : [x0] "+r" (x0), [x1] "+r" (x1)
 		     : [dst] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
 		     : "memory");
+#endif
 	gw.u64[0] = x0;
 	gw.u64[1] = x1;
+#else
+#if defined(RTE_ARM_USE_WFE)
+	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
+	asm volatile(PLT_CPU_FEATURE_PREAMBLE
+		     "	ldp %[wdata], %H[wdata], [%[tag_loc]]	\n"
+		     "	tbz %[wdata], %[pend_gw], done%=	\n"
+		     "	sevl					\n"
+		     "rty%=:	wfe				\n"
+		     "	ldp %[wdata], %H[wdata], [%[tag_loc]]	\n"
+		     "	tbnz %[wdata], %[pend_gw], rty%=	\n"
+		     "done%=:					\n"
+		     "	dmb ld					\n"
+		     : [wdata] "=&r"(gw.get_work)
+		     : [tag_loc] "r"(ws->base + SSOW_LF_GWS_WQE0),
+		       [pend_gw] "i"(SSOW_LF_GWS_TAG_PEND_GET_WORK_BIT)
+		     : "memory");
+#else
+	asm volatile(
+		PLT_CPU_FEATURE_PREAMBLE
+		"caspal %[wdata], %H[wdata], %[wdata], %H[wdata], [%[gw_loc]]\n"
+		: [wdata] "+r"(gw.get_work)
+		: [gw_loc] "r"(ws->base + SSOW_LF_GWS_OP_GET_WORK0)
+		: "memory");
+#endif
 #endif
 #else
 	plt_write64(gw.u64[0], ws->base + SSOW_LF_GWS_OP_GET_WORK0);
-- 
2.25.1
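Stripped of the ARM specifics, the new dequeue path has this shape: write the GETWORK doorbell, then wait on the tag word until the get-work-pending bit clears. A plain-C analogue of that loop (names are illustrative; on ARM the re-read in the patch is preceded by `sevl`/`wfe` so the core sleeps until the monitored cache line changes instead of spinning):

```c
#include <stdint.h>

/* Plain-C analogue of the polling loop in the patch: spin on a status
 * word until the pending bit clears, then return the completed value.
 * The WFE variant replaces the bare re-read with sevl/wfe, letting the
 * core sleep between checks for better power efficiency. */
static uint64_t
wait_gw_complete(volatile const uint64_t *tag, uint64_t pend_bit)
{
	uint64_t v = *tag;

	while (v & pend_bit)
		v = *tag; /* on ARM: wfe here, woken when the line is written */
	return v;
}
```

The `dmb ld` at the end of the asm version is the load barrier this C sketch does not model: it orders the final tag read against subsequent reads of the returned work.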
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> From: Madhuker Mythri
>
> When multiple queues are configured, RSS is enabled internally and thus the
> TAP BPF RSS byte-code is loaded into the kernel using BPF system calls.
>
> Here, the problem is that loading the existing BPF byte-code into
> kernel 5.15 and above throws errors, i.e. the kernel BPF verifier does not
> accept this existing BPF byte-code and the system calls return error code
> "-7" as follows:
>
> rss_add_actions(): Failed to load BPF section l3_l4 (7): Argument list too long
>
> RCA: These errors started appearing from the kernel 5.15 version, in which
> lots of new BPF verification restrictions were added for safe execution of
> byte-code in the kernel, due to which the existing BPF program verification
> does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use the new BPF maps structure.
> 2) Kernel SKB data pointer access is not allowed.
> 3) Undefined loops (bounded only by a variable value) are not allowed.
> 4) Unreachable instructions (like an undefined array access) are not allowed.
>
> After addressing all these kernel BPF verifier restrictions, the BPF
> byte-code loads into the kernel successfully.
>
> Note: These new BPF changes are supported from kernel 4.10 onwards.
>
> Bugzilla Id: 1329
>
> Signed-off-by: Madhuker Mythri
> ---
>  drivers/net/tap/bpf/tap_bpf_program.c |  243 +-
>  drivers/net/tap/tap_bpf_api.c         |    4 +-
>  drivers/net/tap/tap_bpf_insns.h       | 3781 ++---
>  3 files changed, 2151 insertions(+), 1877 deletions(-)

Patch has trailing whitespace, git complains:

$ git am /tmp/bpf.mbox
Applying: net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:98: trailing whitespace.
// queue match
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:243: trailing whitespace.
/** Is IP fragmented **/
/home/shemminger/DPDK/main/.git/worktrees/libbpf/rebase-apply/patch:326: trailing whitespace.
/* bpf_printk("> rss_l3_l4 hash=0x%x queue:1=%u\n", hash, queue); */
warning: 3 lines add whitespace errors.
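Restriction (3) in the quoted commit message has a standard rewrite that the verifier accepts: bound the loop by a compile-time constant and turn the runtime bound into an early break, so every path has a provable trip count. Sketched in plain C (the constant and names are illustrative, not taken from the patch):

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGS 16 /* compile-time bound the verifier can reason about */

/* Verifier-unfriendly shape: `for (i = 0; i < n; i++)` where n is a
 * runtime value. The accepted rewrite below bounds the loop by a
 * constant and makes the runtime condition an early exit, so the
 * maximum number of iterations is statically known. */
static uint32_t
sum_bounded(const uint32_t *vals, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < MAX_SEGS; i++) { /* constant bound */
		if (i >= n) /* runtime bound becomes an early break */
			break;
		sum += vals[i];
	}
	return sum;
}
```

The same constant-bound-plus-break shape is what BPF programs typically use for header or segment walks whose true length is only known at run time.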
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> +#include > +#include > +#include > +#include > +#include > +#include
> +#include "tap_rss.h"

This change in headers breaks the use of make in the bpf directory.
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> +#include > +#include > +#include > +#include

The original code copied the bpf headers from the distro (which was a bad
idea). This should be fixed in the tap driver to make sure that there is no
mismatch.
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> -static __u32 __attribute__((always_inline))
> -rte_softrss_be(const __u32 *input_tuple, const uint8_t *rss_key,
> -	       __u8 input_len)
> +static __u64 __attribute__((always_inline))
> +rte_softrss_be(const __u32 *input_tuple, __u8 input_len)

Why the change to u64? This is not part of the bug fix and not how RSS is
defined.
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> diff --git a/drivers/net/tap/tap_bpf_insns.h b/drivers/net/tap/tap_bpf_insns.h
> index 53fa76c4e6..b3dc11b901 100644
> --- a/drivers/net/tap/tap_bpf_insns.h
> +++ b/drivers/net/tap/tap_bpf_insns.h
> @@ -1,10 +1,10 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
> - * Auto-generated from tap_bpf_program.c
> - * This not the original source file. Do NOT edit it.
> + * Copyright 2017 Mellanox Technologies, Ltd
>  */

Why the Mellanox copyright addition? The python auto-generator does not add
it. Overall, it looks like you did not work with the existing TAP BPF code
but went back to some other code you had.
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> -	/* Get correct proto for 802.1ad */
> -	if (skb->vlan_present && skb->vlan_proto == htons(ETH_P_8021AD)) {
> -		if (data + ETH_ALEN * 2 + sizeof(struct vlan_hdr) +
> -		    sizeof(proto) > data_end)
> -			return TC_ACT_OK;
> -		proto = *(__u16 *)(data + ETH_ALEN * 2 +
> -				   sizeof(struct vlan_hdr));
> -		off += sizeof(struct vlan_hdr);
> -	}

Your version loses VLAN support?
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> {
> -	void *data_end = (void *)(long)skb->data_end;
> -	void *data = (void *)(long)skb->data;
> -	__u16 proto = (__u16)skb->protocol;
> +struct neth nh;
> +struct net6h n6h;

Sloppy, non-standard indentation. And the original code would work with
tunnels; this won't.
Re: [PATCH] net/tap: Modified TAP BPF program as per the new Kernel-version upgrade requirements.
On Thu, 4 Jan 2024 22:57:56 +0530 madhuker.myt...@oracle.com wrote:

> > RCA: These errors started coming after from the Kernel-5.15 version, in
> > which lots of new BPF verification restrictions were added for safe
> > execution of byte-code on to the Kernel, due to which existing BPF program
> > verification does not pass.
> Here are the major BPF verifier restrictions observed:
> 1) Need to use new BPF maps structure.
> 2) Kernel SKB data pointer access not allowed.

I noticed you are now using bpf_skb_load_bytes(), but the bpf helper man page
implies it is not needed.

   long bpf_skb_load_bytes(const void *skb, u32 offset, void *to, u32 len)

          Description
                 This helper was provided as an easy way to load data from a
                 packet. It can be used to load len bytes from offset from
                 the packet associated to skb, into the buffer pointed by to.

                 Since Linux 4.7, usage of this helper has mostly been
                 replaced by "direct packet access", enabling packet data to
                 be manipulated with skb->data and skb->data_end pointing
                 respectively to the first byte of packet data and to the
                 byte after the last byte of packet data. However, it remains
                 useful if one wishes to read large quantities of data at
                 once from a packet into the eBPF stack.

          Return 0 on success, or a negative error in case of
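The "direct packet access" the man page refers to requires the program to prove each read stays within bounds against `skb->data_end` before dereferencing. The pattern can be shown with ordinary pointers standing in for `skb->data` / `skb->data_end` (helper name and shape are illustrative):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The bounds-check-then-read pattern the BPF verifier expects for
 * direct packet access: prove data + off + size <= data_end before
 * touching the bytes. Sketched with plain pointers instead of
 * skb->data / skb->data_end. */
static int
load_u16(const uint8_t *data, const uint8_t *data_end,
	 size_t off, uint16_t *out)
{
	if (data + off + sizeof(*out) > data_end)
		return -1; /* out of bounds: the verifier rejects such a read */
	memcpy(out, data + off, sizeof(*out)); /* no alignment assumption */
	return 0;
}
```

In a real BPF program the same check is written inline against the skb pointers; the verifier tracks the comparison and only then permits the dereference.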
Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
19/12/2023 18:29, jer...@marvell.com:
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
>  Timesync             =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx free descriptor query =

I think we can drop "query" here.

> +__rte_experimental
> +static inline uint32_t
> +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)

For consistency with rte_eth_rx_queue_count(), I propose the name
rte_eth_tx_queue_free_count().
[PATCH v2 1/2] app/test-crypto-perf: fix invalid memcmp results
The function memcmp() returns an integer less than, equal to, or greater
than zero. In the current code, if the first memcmp() returns less than zero
and the second memcmp() returns greater than zero, the sum of the results
may still be 0, wrongly indicating that verification succeeded.

This commit normalizes each return value to zero or one, which makes sure
the sum of the results is correct.

Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type")

Signed-off-by: Suanming Mou
Acked-by: Anoob Joseph
---
 app/test-crypto-perf/cperf_test_verify.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index a6c0ffe813..8aa714b969 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -186,18 +186,18 @@ cperf_verify_op(struct rte_crypto_op *op,

 	if (cipher == 1) {
 		if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT)
-			res += memcmp(data + cipher_offset,
+			res += !!memcmp(data + cipher_offset,
 					vector->ciphertext.data,
 					options->test_buffer_size);
 		else
-			res += memcmp(data + cipher_offset,
+			res += !!memcmp(data + cipher_offset,
 					vector->plaintext.data,
 					options->test_buffer_size);
 	}

 	if (auth == 1) {
 		if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE)
-			res += memcmp(data + auth_offset,
+			res += !!memcmp(data + auth_offset,
 					vector->digest.data,
 					options->digest_sz);
 	}
-- 
2.34.1
[PATCH v2 2/2] app/test-crypto-perf: fix encrypt operation verify
In the current code, AEAD uses RTE_CRYPTO_AEAD_OP_* with aead_op, while
CIPHER uses RTE_CRYPTO_CIPHER_OP_* with cipher_op. This commit checks the
operation against the matching field for each type, fixing incorrect AEAD
verification.

Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type")

Signed-off-by: Suanming Mou
---
v2: align auth/cipher to bool.
---
 app/test-crypto-perf/cperf_test_verify.c | 55 
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index 8aa714b969..2b0d3f142b 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -111,8 +111,10 @@ cperf_verify_op(struct rte_crypto_op *op,
 	uint32_t len;
 	uint16_t nb_segs;
 	uint8_t *data;
-	uint32_t cipher_offset, auth_offset;
-	uint8_t cipher, auth;
+	uint32_t cipher_offset, auth_offset = 0;
+	bool cipher = false;
+	bool digest_verify = false;
+	bool is_encrypt = false;
 	int res = 0;

 	if (op->status != RTE_CRYPTO_OP_STATUS_SUCCESS)
@@ -150,42 +152,43 @@ cperf_verify_op(struct rte_crypto_op *op,

 	switch (options->op_type) {
 	case CPERF_CIPHER_ONLY:
-		cipher = 1;
+		cipher = true;
 		cipher_offset = 0;
-		auth = 0;
-		auth_offset = 0;
-		break;
-	case CPERF_CIPHER_THEN_AUTH:
-		cipher = 1;
-		cipher_offset = 0;
-		auth = 1;
-		auth_offset = options->test_buffer_size;
+		is_encrypt = options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT;
 		break;
 	case CPERF_AUTH_ONLY:
-		cipher = 0;
 		cipher_offset = 0;
-		auth = 1;
-		auth_offset = options->test_buffer_size;
+		if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE) {
+			auth_offset = options->test_buffer_size;
+			digest_verify = true;
+		}
 		break;
+	case CPERF_CIPHER_THEN_AUTH:
 	case CPERF_AUTH_THEN_CIPHER:
-		cipher = 1;
+		cipher = true;
 		cipher_offset = 0;
-		auth = 1;
-		auth_offset = options->test_buffer_size;
+		if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT) {
+			auth_offset = options->test_buffer_size;
+			digest_verify = true;
+			is_encrypt = true;
+		}
 		break;
 	case CPERF_AEAD:
-		cipher = 1;
+		cipher = true;
 		cipher_offset = 0;
-		auth = 1;
-		auth_offset = options->test_buffer_size;
+		if (options->aead_op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+			auth_offset = options->test_buffer_size;
+			digest_verify = true;
+			is_encrypt = true;
+		}
 		break;
 	default:
 		res = 1;
 		goto out;
 	}

-	if (cipher == 1) {
-		if (options->cipher_op == RTE_CRYPTO_CIPHER_OP_ENCRYPT)
+	if (cipher) {
+		if (is_encrypt)
 			res += !!memcmp(data + cipher_offset,
 					vector->ciphertext.data,
 					options->test_buffer_size);
@@ -195,12 +198,8 @@ cperf_verify_op(struct rte_crypto_op *op,
 					options->test_buffer_size);
 	}

-	if (auth == 1) {
-		if (options->auth_op == RTE_CRYPTO_AUTH_OP_GENERATE)
-			res += !!memcmp(data + auth_offset,
-					vector->digest.data,
-					options->digest_sz);
-	}
+	if (digest_verify)
+		res += !!memcmp(data + auth_offset, vector->digest.data, options->digest_sz);

 out:
 	rte_free(data);
-- 
2.34.1
RE: [PATCH v8 2/2] net/iavf: add diagnostic support in TX path
> -Original Message- > From: Mingjin Ye > Sent: Wednesday, January 3, 2024 6:11 PM > To: dev@dpdk.org > Cc: Yang, Qiming ; Ye, MingjinX > ; Su, Simei ; Wu, Wenjun1 > ; Zhang, Yuying ; Xing, > Beilei ; Wu, Jingjing > Subject: [PATCH v8 2/2] net/iavf: add diagnostic support in TX path > > The only way to enable diagnostics for TX paths is to modify the application > source code. Making it difficult to diagnose faults. > > In this patch, the devarg option "mbuf_check" is introduced and the > parameters are configured to enable the corresponding diagnostics. > > supported cases: mbuf, size, segment, offload. > 1. mbuf: check for corrupted mbuf. > 2. size: check min/max packet length according to hw spec. > 3. segment: check number of mbuf segments not exceed hw limitation. > 4. offload: check any unsupported offload flag. > > parameter format: mbuf_check=[mbuf,,] > eg: dpdk-testpmd -a :81:01.0,mbuf_check=[mbuf,size] -- -i > > Signed-off-by: Mingjin Ye > --- > v2: Remove call chain. > --- > v3: Optimisation implementation. > --- > v4: Fix Windows os compilation error. > --- > v5: Split Patch. > --- > v6: remove strict. > --- > v7: Modify the description document. > --- > doc/guides/nics/intel_vf.rst | 9 > drivers/net/iavf/iavf.h| 12 + > drivers/net/iavf/iavf_ethdev.c | 76 ++ > drivers/net/iavf/iavf_rxtx.c | 98 ++ > drivers/net/iavf/iavf_rxtx.h | 2 + > 5 files changed, 197 insertions(+) > > diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst index > ad08198f0f..bda6648726 100644 > --- a/doc/guides/nics/intel_vf.rst > +++ b/doc/guides/nics/intel_vf.rst > @@ -111,6 +111,15 @@ For more detail on SR-IOV, please refer to the > following documents: > by setting the ``devargs`` parameter like ``-a 18:01.0,no-poll-on-link- > down=1`` > when IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 > Series Ethernet device. > > +When IAVF is backed by an Intel\ |reg| E810 device or an Intel\ |reg| 700 > series Ethernet devices. 
> +Set the ``devargs`` parameter ``mbuf_check`` to enable TX diagnostics. For example,
> +``-a 18:01.0,mbuf_check=mbuf`` or ``-a 18:01.0,mbuf_check=[mbuf,size]``.
> +Supported cases: ``-a 18:01.0,mbuf_check=`` or ``-a 18:01.0,mbuf_check=[,...]``
> +
> +* mbuf: Check for corrupted mbuf.
> +* size: Check min/max packet length according to hw spec.
> +* segment: Check number of mbuf segments not exceed hw limitation.
> +* offload: Check any unsupported offload flag.

Please also describe how to get the error count via xstats_get; a testpmd
command is suggested.

Btw, PATCH 1/2 as a fix has been merged separately; the new version should
target this patch only.
RE: [PATCH v2] net/e1000: support launchtime feature
> -Original Message- > From: Su, Simei > Sent: Thursday, January 4, 2024 11:13 AM > To: Chuanyu Xue ; Lu, Wenzhuo > ; Zhang, Qi Z ; Xing, Beilei > > Cc: dev@dpdk.org > Subject: RE: [PATCH v2] net/e1000: support launchtime feature > > > > -Original Message- > > From: Chuanyu Xue > > Sent: Sunday, December 31, 2023 12:35 AM > > To: Su, Simei ; Lu, Wenzhuo > > ; Zhang, Qi Z ; Xing, > > Beilei > > Cc: dev@dpdk.org; Chuanyu Xue > > Subject: [PATCH v2] net/e1000: support launchtime feature > > > > Enable the time-based scheduled Tx of packets based on the > > RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP flag. The launchtime defines > the > > packet transmission time based on PTP clock at MAC layer, which should > > be set to the advanced transmit descriptor. > > > > Signed-off-by: Chuanyu Xue > > --- > > change log: > > > > v2: > > - Add delay compensation for i210 NIC by setting tx offset register. > > - Revise read_clock function. > > > > drivers/net/e1000/base/e1000_regs.h | 1 + > > drivers/net/e1000/e1000_ethdev.h| 14 +++ > > drivers/net/e1000/igb_ethdev.c | 63 > > - > > drivers/net/e1000/igb_rxtx.c| 42 +++ > > 4 files changed, 112 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/net/e1000/base/e1000_regs.h > > b/drivers/net/e1000/base/e1000_regs.h > > index d44de59c29..092d9d71e6 100644 > > --- a/drivers/net/e1000/base/e1000_regs.h > > +++ b/drivers/net/e1000/base/e1000_regs.h > > @@ -162,6 +162,7 @@ > > > > /* QAV Tx mode control register */ > > #define E1000_I210_TQAVCTRL0x3570 > > +#define E1000_I210_LAUNCH_OS0 0x3578 > > > > /* QAV Tx mode control register bitfields masks */ > > /* QAV enable */ > > diff --git a/drivers/net/e1000/e1000_ethdev.h > > b/drivers/net/e1000/e1000_ethdev.h > > index 718a9746ed..339ae1f4b6 100644 > > --- a/drivers/net/e1000/e1000_ethdev.h > > +++ b/drivers/net/e1000/e1000_ethdev.h > > @@ -382,6 +382,20 @@ extern struct igb_rss_filter_list > > igb_filter_rss_list; TAILQ_HEAD(igb_flow_mem_list, igb_flow_mem); > > extern struct 
igb_flow_mem_list igb_flow_list; > > > > +/* > > + * Macros to compensate the constant latency observed in i210 for > > +launch time > > + * > > + * launch time = (offset_speed - offset_base + txtime) * 32 > > + * offset_speed is speed dependent, set in E1000_I210_LAUNCH_OS0 */ > > +#define IGB_I210_TX_OFFSET_BASE0xffe0 > > +#define IGB_I210_TX_OFFSET_SPEED_100xc7a0 > > +#define IGB_I210_TX_OFFSET_SPEED_100 0x86e0 > > +#define IGB_I210_TX_OFFSET_SPEED_1000 0xbe00 > > + > > +extern uint64_t igb_tx_timestamp_dynflag; extern int > > +igb_tx_timestamp_dynfield_offset; > > + > > extern const struct rte_flow_ops igb_flow_ops; > > > > /* > > diff --git a/drivers/net/e1000/igb_ethdev.c > > b/drivers/net/e1000/igb_ethdev.c index 8858f975f8..2262035710 100644 > > --- a/drivers/net/e1000/igb_ethdev.c > > +++ b/drivers/net/e1000/igb_ethdev.c > > @@ -223,6 +223,7 @@ static int igb_timesync_read_time(struct > > rte_eth_dev *dev, > > struct timespec *timestamp); > > static int igb_timesync_write_time(struct rte_eth_dev *dev, > >const struct timespec *timestamp); > > +static int eth_igb_read_clock(struct rte_eth_dev *dev, uint64_t > > +*clock); > > static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, > > uint16_t queue_id); > > static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, @@ > > -313,6 > > +314,9 @@ static const struct rte_pci_id pci_id_igbvf_map[] = { > > { .vendor_id = 0, /* sentinel */ }, > > }; > > > > +uint64_t igb_tx_timestamp_dynflag; > > +int igb_tx_timestamp_dynfield_offset = -1; > > + > > static const struct rte_eth_desc_lim rx_desc_lim = { > > .nb_max = E1000_MAX_RING_DESC, > > .nb_min = E1000_MIN_RING_DESC, > > @@ -389,6 +393,7 @@ static const struct eth_dev_ops eth_igb_ops = { > > .timesync_adjust_time = igb_timesync_adjust_time, > > .timesync_read_time = igb_timesync_read_time, > > .timesync_write_time = igb_timesync_write_time, > > + .read_clock = eth_igb_read_clock, > > }; > > > > /* > > @@ -1188,6 +1193,40 @@ 
eth_igb_rxtx_control(struct rte_eth_dev *dev, > > E1000_WRITE_FLUSH(hw); > > } > > > > + > > +static uint32_t igb_tx_offset(struct rte_eth_dev *dev) { > > + struct e1000_hw *hw = > > + E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > + > > + uint16_t duplex, speed; > > + hw->mac.ops.get_link_up_info(hw, &speed, &duplex); > > + > > + uint32_t launch_os0 = E1000_READ_REG(hw, > E1000_I210_LAUNCH_OS0); > > + if (hw->mac.type != e1000_i210) { > > + /* Set launch offset to base, no compensation */ > > + launch_os0 |= IGB_I210_TX_OFFSET_BASE; > > + } else { > > + /* Set launch offset depend on link speeds */ > > + sw
[PATCH] doc: update default value for config parameter
Update documentation value to match default value in code base. Signed-off-by: Simei Su --- doc/guides/prog_guide/ip_fragment_reassembly_lib.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst index 314d4ad..b14289e 100644 --- a/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst +++ b/doc/guides/prog_guide/ip_fragment_reassembly_lib.rst @@ -43,7 +43,7 @@ Note that all update/lookup operations on Fragment Table are not thread safe. So if different execution contexts (threads/processes) will access the same table simultaneously, then some external syncing mechanism have to be provided. -Each table entry can hold information about packets consisting of up to RTE_LIBRTE_IP_FRAG_MAX (by default: 4) fragments. +Each table entry can hold information about packets consisting of up to RTE_LIBRTE_IP_FRAG_MAX (by default: 8) fragments. Code example, that demonstrates creation of a new Fragment table: -- 2.9.5
RE: [EXT] [PATCH v2 2/2] app/test-crypto-perf: fix encrypt operation verify
> AEAD uses RTE_CRYPTO_AEAD_OP_* with aead_op and CIPHER uses > RTE_CRYPTO_CIPHER_OP_* with cipher_op in current code. > > This commit aligns aead_op and cipher_op operation to fix incorrect AEAD > verification. > > Fixes: df52cb3b6e13 ("app/crypto-perf: move verify as single test type") > > Signed-off-by: Suanming Mou Acked-by: Anoob Joseph
[PATCH] net/ice: refine queue start stop
It is not necessary to return failure when starting or stopping a queue that is already in the required state. Signed-off-by: Qi Zhang --- drivers/net/ice/ice_rxtx.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index 73e47ae92d..3286bb08fe 100644 --- a/drivers/net/ice/ice_rxtx.c +++ b/drivers/net/ice/ice_rxtx.c @@ -673,6 +673,10 @@ ice_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id) return -EINVAL; } + if (dev->data->rx_queue_state[rx_queue_id] == + RTE_ETH_QUEUE_STATE_STARTED) + return 0; + if (dev->data->dev_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) rxq->ts_enable = true; err = ice_program_hw_rx_queue(rxq); @@ -717,6 +721,10 @@ ice_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id) if (rx_queue_id < dev->data->nb_rx_queues) { rxq = dev->data->rx_queues[rx_queue_id]; + if (dev->data->rx_queue_state[rx_queue_id] == + RTE_ETH_QUEUE_STATE_STOPPED) + return 0; + err = ice_switch_rx_queue(hw, rxq->reg_idx, false); if (err) { PMD_DRV_LOG(ERR, "Failed to switch RX queue %u off", @@ -758,6 +766,10 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id) return -EINVAL; } + if (dev->data->tx_queue_state[tx_queue_id] == + RTE_ETH_QUEUE_STATE_STARTED) + return 0; + buf_len = ice_struct_size(txq_elem, txqs, 1); txq_elem = ice_malloc(hw, buf_len); if (!txq_elem) @@ -1066,6 +1078,10 @@ ice_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id) return -EINVAL; } + if (dev->data->tx_queue_state[tx_queue_id] == + RTE_ETH_QUEUE_STATE_STOPPED) + return 0; + q_ids[0] = txq->reg_idx; q_teids[0] = txq->q_teid; -- 2.31.1
[PATCH 0/3] net/ice: simplified to 3 layer Tx scheduler.
Remove dummy layers, code refactor, complete document. Qi Zhang (3): net/ice: hide port and TC layer in Tx sched tree net/ice: refactor tm config data structure doc: update ice document for qos doc/guides/nics/ice.rst | 19 +++ drivers/net/ice/ice_ethdev.h | 12 +- drivers/net/ice/ice_tm.c | 285 +++ 3 files changed, 112 insertions(+), 204 deletions(-) -- 2.31.1
[PATCH 1/3] net/ice: hide port and TC layer in Tx sched tree
In the current 5-layer tree implementation, the port and TC layers are not configurable, so it is not necessary to expose them to the application. The patch hides the top 2 layers and represents the root of the tree at the VSI layer. From the application's point of view, it is a 3-layer scheduler tree: Port -> Queue Group -> Queue. Signed-off-by: Qi Zhang --- drivers/net/ice/ice_ethdev.h | 7 drivers/net/ice/ice_tm.c | 79 2 files changed, 7 insertions(+), 79 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index fa4981ed14..ae22c29ffc 100644 --- a/drivers/net/ice/ice_ethdev.h +++ b/drivers/net/ice/ice_ethdev.h @@ -470,7 +470,6 @@ struct ice_tm_shaper_profile { struct ice_tm_node { TAILQ_ENTRY(ice_tm_node) node; uint32_t id; - uint32_t tc; uint32_t priority; uint32_t weight; uint32_t reference_count; @@ -484,8 +483,6 @@ struct ice_tm_node { /* node type of Traffic Manager */ enum ice_tm_node_type { ICE_TM_NODE_TYPE_PORT, - ICE_TM_NODE_TYPE_TC, - ICE_TM_NODE_TYPE_VSI, ICE_TM_NODE_TYPE_QGROUP, ICE_TM_NODE_TYPE_QUEUE, ICE_TM_NODE_TYPE_MAX, @@ -495,12 +492,8 @@ enum ice_tm_node_type { struct ice_tm_conf { struct ice_shaper_profile_list shaper_profile_list; struct ice_tm_node *root; /* root node - port */ - struct ice_tm_node_list tc_list; /* node list for all the TCs */ - struct ice_tm_node_list vsi_list; /* node list for all the VSIs */ struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */ struct ice_tm_node_list queue_list; /* node list for all the queues */ - uint32_t nb_tc_node; - uint32_t nb_vsi_node; uint32_t nb_qgroup_node; uint32_t nb_queue_node; bool committed; diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index b570798f07..7ae68c683b 100644 --- a/drivers/net/ice/ice_tm.c +++ b/drivers/net/ice/ice_tm.c @@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev) /* initialize node configuration */ TAILQ_INIT(&pf->tm_conf.shaper_profile_list); pf->tm_conf.root = NULL; - TAILQ_INIT(&pf->tm_conf.tc_list); 
- TAILQ_INIT(&pf->tm_conf.vsi_list); TAILQ_INIT(&pf->tm_conf.qgroup_list); TAILQ_INIT(&pf->tm_conf.queue_list); - pf->tm_conf.nb_tc_node = 0; - pf->tm_conf.nb_vsi_node = 0; pf->tm_conf.nb_qgroup_node = 0; pf->tm_conf.nb_queue_node = 0; pf->tm_conf.committed = false; @@ -72,16 +68,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev) rte_free(tm_node); } pf->tm_conf.nb_qgroup_node = 0; - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) { - TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_vsi_node = 0; - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) { - TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_tc_node = 0; if (pf->tm_conf.root) { rte_free(pf->tm_conf.root); pf->tm_conf.root = NULL; @@ -93,8 +79,6 @@ ice_tm_node_search(struct rte_eth_dev *dev, uint32_t node_id, enum ice_tm_node_type *node_type) { struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list; - struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list; struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list; struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list; struct ice_tm_node *tm_node; @@ -104,20 +88,6 @@ ice_tm_node_search(struct rte_eth_dev *dev, return pf->tm_conf.root; } - TAILQ_FOREACH(tm_node, tc_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_TC; - return tm_node; - } - } - - TAILQ_FOREACH(tm_node, vsi_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_VSI; - return tm_node; - } - } - TAILQ_FOREACH(tm_node, qgroup_list, node) { if (tm_node->id == node_id) { *node_type = ICE_TM_NODE_TYPE_QGROUP; @@ -371,6 +341,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev, return 0; } +#define MAX_QUEUE_PER_GROUP8 + static int ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id, uint32_t parent_node_id, uint32_t priority, @@ -384,8 +356,6 @@ 
ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id, struct ice_tm_shaper_profile *shaper_profile = NULL; struct ice_tm_node *tm_node; struct ice_tm_node *parent_node;
[PATCH 2/3] net/ice: refactor tm config data structure
Simplified struct ice_tm_conf by removing per level node list. Signed-off-by: Qi Zhang --- drivers/net/ice/ice_ethdev.h | 5 +- drivers/net/ice/ice_tm.c | 210 +++ 2 files changed, 88 insertions(+), 127 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index ae22c29ffc..008a7a23b9 100644 --- a/drivers/net/ice/ice_ethdev.h +++ b/drivers/net/ice/ice_ethdev.h @@ -472,6 +472,7 @@ struct ice_tm_node { uint32_t id; uint32_t priority; uint32_t weight; + uint32_t level; uint32_t reference_count; struct ice_tm_node *parent; struct ice_tm_node **children; @@ -492,10 +493,6 @@ enum ice_tm_node_type { struct ice_tm_conf { struct ice_shaper_profile_list shaper_profile_list; struct ice_tm_node *root; /* root node - port */ - struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */ - struct ice_tm_node_list queue_list; /* node list for all the queues */ - uint32_t nb_qgroup_node; - uint32_t nb_queue_node; bool committed; bool clear_on_fail; }; diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index 7ae68c683b..7c662f8a85 100644 --- a/drivers/net/ice/ice_tm.c +++ b/drivers/net/ice/ice_tm.c @@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev) /* initialize node configuration */ TAILQ_INIT(&pf->tm_conf.shaper_profile_list); pf->tm_conf.root = NULL; - TAILQ_INIT(&pf->tm_conf.qgroup_list); - TAILQ_INIT(&pf->tm_conf.queue_list); - pf->tm_conf.nb_qgroup_node = 0; - pf->tm_conf.nb_queue_node = 0; pf->tm_conf.committed = false; pf->tm_conf.clear_on_fail = false; } -void -ice_tm_conf_uninit(struct rte_eth_dev *dev) +static void free_node(struct ice_tm_node *root) { - struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node *tm_node; + uint32_t i; - /* clear node configuration */ - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) { - TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_queue_node = 0; - while ((tm_node = 
TAILQ_FIRST(&pf->tm_conf.qgroup_list))) { - TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_qgroup_node = 0; - if (pf->tm_conf.root) { - rte_free(pf->tm_conf.root); - pf->tm_conf.root = NULL; - } + if (root == NULL) + return; + + for (i = 0; i < root->reference_count; i++) + free_node(root->children[i]); + + rte_free(root); } -static inline struct ice_tm_node * -ice_tm_node_search(struct rte_eth_dev *dev, - uint32_t node_id, enum ice_tm_node_type *node_type) +void +ice_tm_conf_uninit(struct rte_eth_dev *dev) { struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list; - struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list; - struct ice_tm_node *tm_node; - - if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_PORT; - return pf->tm_conf.root; - } - TAILQ_FOREACH(tm_node, qgroup_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_QGROUP; - return tm_node; - } - } - - TAILQ_FOREACH(tm_node, queue_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_QUEUE; - return tm_node; - } - } - - return NULL; + free_node(pf->tm_conf.root); + pf->tm_conf.root = NULL; } static int @@ -195,11 +159,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id, return 0; } +static struct ice_tm_node * +find_node(struct ice_tm_node *root, uint32_t id) +{ + uint32_t i; + + if (root == NULL || root->id == id) + return root; + + for (i = 0; i < root->reference_count; i++) { + struct ice_tm_node *node = find_node(root->children[i], id); + + if (node) + return node; + } + + return NULL; +} + static int ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id, int *is_leaf, struct rte_tm_error *error) { - enum ice_tm_node_type node_type = ICE_TM_NODE_TYPE_MAX; + struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); struct ice_tm_node *tm_node; if (!is_leaf 
|| !error) @@ -212,14 +194,14 @@ ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id, } /* check if the node id exists
[PATCH 3/3] doc: update ice document for qos
Add description for ice PMD's rte_tm capabilities. Signed-off-by: Qi Zhang --- doc/guides/nics/ice.rst | 19 +++ 1 file changed, 19 insertions(+) diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst index bafb3ba022..1f737a009c 100644 --- a/doc/guides/nics/ice.rst +++ b/doc/guides/nics/ice.rst @@ -352,6 +352,25 @@ queue 3 using a raw pattern:: Currently, raw pattern support is limited to the FDIR and Hash engines. +Traffic Management Support +~~ + +The ice PMD provides support for the Traffic Management API (RTE_TM), allowing +users to offload a 3-layer Tx scheduler on the E810 NIC: + +- ``Port Layer`` + + This is the root layer, supporting peak bandwidth configuration, with up to 32 children. + +- ``Queue Group Layer`` + + The middle layer, supporting peak / committed bandwidth, weight and priority configurations, + with up to 8 children. + +- ``Queue Layer`` + + The leaf layer, supporting peak / committed bandwidth, weight and priority configurations. + Additional Options ++ -- 2.31.1
[PATCH v2 0/3] net/ice: simplified to 3 layer Tx scheduler
Remove dummy layers, code refactor, complete document Qi Zhang (3): net/ice: hide port and TC layer in Tx sched tree net/ice: refactor tm config data structure doc: update ice document for qos v2: - fix typos. doc/guides/nics/ice.rst | 19 +++ drivers/net/ice/ice_ethdev.h | 12 +- drivers/net/ice/ice_tm.c | 285 +++ 3 files changed, 112 insertions(+), 204 deletions(-) -- 2.31.1
[PATCH v2 1/3] net/ice: hide port and TC layer in Tx sched tree
In the current 5-layer tree implementation, the port and TC layers are not configurable, so it is not necessary to expose them to the application. The patch hides the top 2 layers and represents the root of the tree at the VSI layer. From the application's point of view, it is a 3-layer scheduler tree: Port -> Queue Group -> Queue. Signed-off-by: Qi Zhang --- drivers/net/ice/ice_ethdev.h | 7 drivers/net/ice/ice_tm.c | 79 2 files changed, 7 insertions(+), 79 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index fa4981ed14..ae22c29ffc 100644 --- a/drivers/net/ice/ice_ethdev.h +++ b/drivers/net/ice/ice_ethdev.h @@ -470,7 +470,6 @@ struct ice_tm_shaper_profile { struct ice_tm_node { TAILQ_ENTRY(ice_tm_node) node; uint32_t id; - uint32_t tc; uint32_t priority; uint32_t weight; uint32_t reference_count; @@ -484,8 +483,6 @@ struct ice_tm_node { /* node type of Traffic Manager */ enum ice_tm_node_type { ICE_TM_NODE_TYPE_PORT, - ICE_TM_NODE_TYPE_TC, - ICE_TM_NODE_TYPE_VSI, ICE_TM_NODE_TYPE_QGROUP, ICE_TM_NODE_TYPE_QUEUE, ICE_TM_NODE_TYPE_MAX, @@ -495,12 +492,8 @@ enum ice_tm_node_type { struct ice_tm_conf { struct ice_shaper_profile_list shaper_profile_list; struct ice_tm_node *root; /* root node - port */ - struct ice_tm_node_list tc_list; /* node list for all the TCs */ - struct ice_tm_node_list vsi_list; /* node list for all the VSIs */ struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */ struct ice_tm_node_list queue_list; /* node list for all the queues */ - uint32_t nb_tc_node; - uint32_t nb_vsi_node; uint32_t nb_qgroup_node; uint32_t nb_queue_node; bool committed; diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index b570798f07..7ae68c683b 100644 --- a/drivers/net/ice/ice_tm.c +++ b/drivers/net/ice/ice_tm.c @@ -43,12 +43,8 @@ ice_tm_conf_init(struct rte_eth_dev *dev) /* initialize node configuration */ TAILQ_INIT(&pf->tm_conf.shaper_profile_list); pf->tm_conf.root = NULL; - TAILQ_INIT(&pf->tm_conf.tc_list); 
- TAILQ_INIT(&pf->tm_conf.vsi_list); TAILQ_INIT(&pf->tm_conf.qgroup_list); TAILQ_INIT(&pf->tm_conf.queue_list); - pf->tm_conf.nb_tc_node = 0; - pf->tm_conf.nb_vsi_node = 0; pf->tm_conf.nb_qgroup_node = 0; pf->tm_conf.nb_queue_node = 0; pf->tm_conf.committed = false; @@ -72,16 +68,6 @@ ice_tm_conf_uninit(struct rte_eth_dev *dev) rte_free(tm_node); } pf->tm_conf.nb_qgroup_node = 0; - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.vsi_list))) { - TAILQ_REMOVE(&pf->tm_conf.vsi_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_vsi_node = 0; - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.tc_list))) { - TAILQ_REMOVE(&pf->tm_conf.tc_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_tc_node = 0; if (pf->tm_conf.root) { rte_free(pf->tm_conf.root); pf->tm_conf.root = NULL; @@ -93,8 +79,6 @@ ice_tm_node_search(struct rte_eth_dev *dev, uint32_t node_id, enum ice_tm_node_type *node_type) { struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node_list *tc_list = &pf->tm_conf.tc_list; - struct ice_tm_node_list *vsi_list = &pf->tm_conf.vsi_list; struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list; struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list; struct ice_tm_node *tm_node; @@ -104,20 +88,6 @@ ice_tm_node_search(struct rte_eth_dev *dev, return pf->tm_conf.root; } - TAILQ_FOREACH(tm_node, tc_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_TC; - return tm_node; - } - } - - TAILQ_FOREACH(tm_node, vsi_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_VSI; - return tm_node; - } - } - TAILQ_FOREACH(tm_node, qgroup_list, node) { if (tm_node->id == node_id) { *node_type = ICE_TM_NODE_TYPE_QGROUP; @@ -371,6 +341,8 @@ ice_shaper_profile_del(struct rte_eth_dev *dev, return 0; } +#define MAX_QUEUE_PER_GROUP8 + static int ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id, uint32_t parent_node_id, uint32_t priority, @@ -384,8 +356,6 @@ 
ice_tm_node_add(struct rte_eth_dev *dev, uint32_t node_id, struct ice_tm_shaper_profile *shaper_profile = NULL; struct ice_tm_node *tm_node; struct ice_tm_node *parent_node;
[PATCH v2 2/3] net/ice: refactor tm config data structure
Simplified struct ice_tm_conf by removing per level node list. Signed-off-by: Qi Zhang --- drivers/net/ice/ice_ethdev.h | 5 +- drivers/net/ice/ice_tm.c | 210 +++ 2 files changed, 88 insertions(+), 127 deletions(-) diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h index ae22c29ffc..008a7a23b9 100644 --- a/drivers/net/ice/ice_ethdev.h +++ b/drivers/net/ice/ice_ethdev.h @@ -472,6 +472,7 @@ struct ice_tm_node { uint32_t id; uint32_t priority; uint32_t weight; + uint32_t level; uint32_t reference_count; struct ice_tm_node *parent; struct ice_tm_node **children; @@ -492,10 +493,6 @@ enum ice_tm_node_type { struct ice_tm_conf { struct ice_shaper_profile_list shaper_profile_list; struct ice_tm_node *root; /* root node - port */ - struct ice_tm_node_list qgroup_list; /* node list for all the queue groups */ - struct ice_tm_node_list queue_list; /* node list for all the queues */ - uint32_t nb_qgroup_node; - uint32_t nb_queue_node; bool committed; bool clear_on_fail; }; diff --git a/drivers/net/ice/ice_tm.c b/drivers/net/ice/ice_tm.c index 7ae68c683b..7c662f8a85 100644 --- a/drivers/net/ice/ice_tm.c +++ b/drivers/net/ice/ice_tm.c @@ -43,66 +43,30 @@ ice_tm_conf_init(struct rte_eth_dev *dev) /* initialize node configuration */ TAILQ_INIT(&pf->tm_conf.shaper_profile_list); pf->tm_conf.root = NULL; - TAILQ_INIT(&pf->tm_conf.qgroup_list); - TAILQ_INIT(&pf->tm_conf.queue_list); - pf->tm_conf.nb_qgroup_node = 0; - pf->tm_conf.nb_queue_node = 0; pf->tm_conf.committed = false; pf->tm_conf.clear_on_fail = false; } -void -ice_tm_conf_uninit(struct rte_eth_dev *dev) +static void free_node(struct ice_tm_node *root) { - struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node *tm_node; + uint32_t i; - /* clear node configuration */ - while ((tm_node = TAILQ_FIRST(&pf->tm_conf.queue_list))) { - TAILQ_REMOVE(&pf->tm_conf.queue_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_queue_node = 0; - while ((tm_node = 
TAILQ_FIRST(&pf->tm_conf.qgroup_list))) { - TAILQ_REMOVE(&pf->tm_conf.qgroup_list, tm_node, node); - rte_free(tm_node); - } - pf->tm_conf.nb_qgroup_node = 0; - if (pf->tm_conf.root) { - rte_free(pf->tm_conf.root); - pf->tm_conf.root = NULL; - } + if (root == NULL) + return; + + for (i = 0; i < root->reference_count; i++) + free_node(root->children[i]); + + rte_free(root); } -static inline struct ice_tm_node * -ice_tm_node_search(struct rte_eth_dev *dev, - uint32_t node_id, enum ice_tm_node_type *node_type) +void +ice_tm_conf_uninit(struct rte_eth_dev *dev) { struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); - struct ice_tm_node_list *qgroup_list = &pf->tm_conf.qgroup_list; - struct ice_tm_node_list *queue_list = &pf->tm_conf.queue_list; - struct ice_tm_node *tm_node; - - if (pf->tm_conf.root && pf->tm_conf.root->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_PORT; - return pf->tm_conf.root; - } - TAILQ_FOREACH(tm_node, qgroup_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_QGROUP; - return tm_node; - } - } - - TAILQ_FOREACH(tm_node, queue_list, node) { - if (tm_node->id == node_id) { - *node_type = ICE_TM_NODE_TYPE_QUEUE; - return tm_node; - } - } - - return NULL; + free_node(pf->tm_conf.root); + pf->tm_conf.root = NULL; } static int @@ -195,11 +159,29 @@ ice_node_param_check(struct ice_pf *pf, uint32_t node_id, return 0; } +static struct ice_tm_node * +find_node(struct ice_tm_node *root, uint32_t id) +{ + uint32_t i; + + if (root == NULL || root->id == id) + return root; + + for (i = 0; i < root->reference_count; i++) { + struct ice_tm_node *node = find_node(root->children[i], id); + + if (node) + return node; + } + + return NULL; +} + static int ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id, int *is_leaf, struct rte_tm_error *error) { - enum ice_tm_node_type node_type = ICE_TM_NODE_TYPE_MAX; + struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); struct ice_tm_node *tm_node; if (!is_leaf 
|| !error) @@ -212,14 +194,14 @@ ice_node_type_get(struct rte_eth_dev *dev, uint32_t node_id, } /* check if the node id exists
[PATCH v2 3/3] doc: update ice document for qos
Add description for ice PMD's rte_tm capabilities. Signed-off-by: Qi Zhang --- doc/guides/nics/ice.rst | 19 +++ 1 file changed, 19 insertions(+) diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst index bafb3ba022..3d381a266b 100644 --- a/doc/guides/nics/ice.rst +++ b/doc/guides/nics/ice.rst @@ -352,6 +352,25 @@ queue 3 using a raw pattern:: Currently, raw pattern support is limited to the FDIR and Hash engines. +Traffic Management Support +~~ + +The ice PMD provides support for the Traffic Management API (RTE_TM), allowing +users to offload a 3-layer Tx scheduler on the E810 NIC: + +- ``Port Layer`` + + This is the root layer, supporting peak bandwidth configuration, with up to 32 children. + +- ``Queue Group Layer`` + + The middle layer, supporting peak / committed bandwidth, weight and priority configurations, + with up to 8 children. + +- ``Queue Layer`` + + The leaf layer, supporting peak / committed bandwidth, weight and priority configurations. + Additional Options ++ -- 2.31.1
RE: [PATCH] net/ice: refine queue start stop
> -Original Message- > From: Zhang, Qi Z > Sent: Friday, January 5, 2024 9:37 PM > To: Yang, Qiming ; Wu, Wenjun1 > > Cc: dev@dpdk.org; Zhang, Qi Z > Subject: [PATCH] net/ice: refine queue start stop > > Not necessary to return fail when starting or stopping a queue if the queue > was already at required state. > > Signed-off-by: Qi Zhang > --- > drivers/net/ice/ice_rxtx.c | 16 > 1 file changed, 16 insertions(+) > > diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c index > 73e47ae92d..3286bb08fe 100644 > --- a/drivers/net/ice/ice_rxtx.c > +++ b/drivers/net/ice/ice_rxtx.c > @@ -673,6 +673,10 @@ ice_rx_queue_start(struct rte_eth_dev *dev, > uint16_t rx_queue_id) > return -EINVAL; > } > > + if (dev->data->rx_queue_state[rx_queue_id] == > + RTE_ETH_QUEUE_STATE_STARTED) > + return 0; > + > if (dev->data->dev_conf.rxmode.offloads & > RTE_ETH_RX_OFFLOAD_TIMESTAMP) > rxq->ts_enable = true; > err = ice_program_hw_rx_queue(rxq); > @@ -717,6 +721,10 @@ ice_rx_queue_stop(struct rte_eth_dev *dev, > uint16_t rx_queue_id) > if (rx_queue_id < dev->data->nb_rx_queues) { > rxq = dev->data->rx_queues[rx_queue_id]; > > + if (dev->data->rx_queue_state[rx_queue_id] == > + RTE_ETH_QUEUE_STATE_STOPPED) > + return 0; > + > err = ice_switch_rx_queue(hw, rxq->reg_idx, false); > if (err) { > PMD_DRV_LOG(ERR, "Failed to switch RX queue %u > off", @@ -758,6 +766,10 @@ ice_tx_queue_start(struct rte_eth_dev *dev, > uint16_t tx_queue_id) > return -EINVAL; > } > > + if (dev->data->tx_queue_state[tx_queue_id] == > + RTE_ETH_QUEUE_STATE_STARTED) > + return 0; > + > buf_len = ice_struct_size(txq_elem, txqs, 1); > txq_elem = ice_malloc(hw, buf_len); > if (!txq_elem) > @@ -1066,6 +1078,10 @@ ice_tx_queue_stop(struct rte_eth_dev *dev, > uint16_t tx_queue_id) > return -EINVAL; > } > > + if (dev->data->tx_queue_state[tx_queue_id] == > + RTE_ETH_QUEUE_STATE_STOPPED) > + return 0; > + > q_ids[0] = txq->reg_idx; > q_teids[0] = txq->q_teid; > > -- > 2.31.1 Acked-by: Wenjun Wu
[Bug 1341] ovs+dpdk ixgbe port tx failed. rte_pktmbuf_alloc failed
https://bugs.dpdk.org/show_bug.cgi?id=1341 Bug ID: 1341 Summary: ovs+dpdk ixgbe port tx failed. rte_pktmbuf_alloc failed Product: DPDK Version: 22.11 Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: Normal Component: ethdev Assignee: dev@dpdk.org Reporter: 125163...@qq.com Target Milestone: --- With OVS+DPDK, an ixgbe-driven network card is added to the bridge and the netperf tool is used to run TCP_CRR traffic between two VM interfaces. After a period of time, rte_eth_tx_burst stops sending packets; debugging shows that rte_pktmbuf_alloc fails to allocate an mbuf. Openvswitch: DPDK: 22.11 Network card: 82599ES Network card queue: 8 Network card receiving descriptor: 4096 -- You are receiving this mail because: You are the assignee for the bug.
[PATCH] app/test-crypto-perf: add missed resubmission fix
Currently, after enqueue_burst, there may be ops_unused ops left over for the next round of enqueue. In the next round's preparation, only ops_needed ops are added. But if fewer ops than ops_needed are left in the final round, there will be invalid ops between the newly prepared ops and the previously unused ops. The previously unused ops should be moved to the front, right after the needed ops. Commit [1] added a resubmission fix to the throughput test, but the fix was missed for the verify test. This commit adds the missed resubmission fix for verify. [1] 44e2980b70d1 ("app/crypto-perf: fix crypto operation resubmission") Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application") Cc: sta...@dpdk.org Signed-off-by: Suanming Mou --- app/test-crypto-perf/cperf_test_verify.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c index 2b0d3f142b..0328bb5724 100644 --- a/app/test-crypto-perf/cperf_test_verify.c +++ b/app/test-crypto-perf/cperf_test_verify.c @@ -275,7 +275,6 @@ cperf_verify_test_runner(void *test_ctx) ops_needed, ctx->sess, ctx->options, ctx->test_vector, iv_offset, &imix_idx, NULL); - /* Populate the mbuf with the test vector, for verification */ for (i = 0; i < ops_needed; i++) cperf_mbuf_set(ops[i]->sym->m_src, @@ -293,6 +292,19 @@ } #endif /* CPERF_LINEARIZATION_ENABLE */ + /** +* When ops_needed is smaller than ops_enqd, the +* unused ops need to be moved to the front for +* next round use. +*/ + if (unlikely(ops_enqd > ops_needed)) { + size_t nb_b_to_mov = ops_unused * sizeof( + struct rte_crypto_op *); + + memmove(&ops[ops_needed], &ops[ops_enqd], + nb_b_to_mov); + } + /* Enqueue burst of ops on crypto device */ ops_enqd = rte_cryptodev_enqueue_burst(ctx->dev_id, ctx->qp_id, ops, burst_size); -- 2.34.1