[PATCH v5] gro: packets not getting flushed in heavy-weight mode API
In heavy-weight mode GRO which is based on timer, the GRO packets will not be flushed in spite of timer expiry if there is no packet in the current poll. If timer mode GRO is enabled the rte_gro_timeout_flush API should be invoked.

Signed-off-by: Kumara Parameshwaran
---
v1: Changes to make sure that the GRO flush API is invoked if there are no packets in current poll and timer expiry.
v2: Fix code organisation issue
v3: Fix warnings
v4: Fix error and warnings
v5: Fix compilation issue when GRO is not defined

 app/test-pmd/csumonly.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index c103e54111..6d9ce99500 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -863,16 +863,23 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 
 	/* receive a burst of packet */
 	nb_rx = common_fwd_stream_receive(fs, pkts_burst, nb_pkt_per_burst);
+#ifndef RTE_LIB_GRO
 	if (unlikely(nb_rx == 0))
 		return false;
-
+#else
+	gro_enable = gro_ports[fs->rx_port].enable;
+	if (unlikely(nb_rx == 0)) {
+		if (gro_enable && (gro_flush_cycles != GRO_DEFAULT_FLUSH_CYCLES))
+			goto init;
+		else
+			return false;
+	}
+init:
+#endif
 	rx_bad_ip_csum = 0;
 	rx_bad_l4_csum = 0;
 	rx_bad_outer_l4_csum = 0;
 	rx_bad_outer_ip_csum = 0;
-#ifdef RTE_LIB_GRO
-	gro_enable = gro_ports[fs->rx_port].enable;
-#endif
 
 	txp = &ports[fs->tx_port];
 	tx_offloads = txp->dev_conf.txmode.offloads;
-- 
2.25.1
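For context, a minimal sketch of the application-side flow this patch is about, assuming a GRO context created with rte_gro_ctx_create() and the rte_gro_reassemble()/rte_gro_timeout_flush() prototypes from rte_gro.h; the function name, flush_cycles and MAX_PKT_BURST below are illustrative, not part of the patch:

#include <rte_ethdev.h>
#include <rte_gro.h>

#define MAX_PKT_BURST 32

/* Hypothetical per-poll body of a forwarding loop using heavy-weight GRO. */
static void
fwd_poll_once(uint16_t port, uint16_t queue, void *gro_ctx, uint64_t flush_cycles)
{
	struct rte_mbuf *pkts[MAX_PKT_BURST];
	uint16_t nb_rx, nb_flushed;

	nb_rx = rte_eth_rx_burst(port, queue, pkts, MAX_PKT_BURST);
	if (nb_rx > 0)
		nb_rx = rte_gro_reassemble(pkts, nb_rx, gro_ctx);

	/* Flush on timer expiry even when the poll returned no packets;
	 * otherwise packets merged in earlier polls stay stuck in the GRO
	 * tables, which is the symptom the patch above addresses. */
	nb_flushed = rte_gro_timeout_flush(gro_ctx, flush_cycles,
			RTE_GRO_TCP_IPV4, &pkts[nb_rx],
			MAX_PKT_BURST - nb_rx);

	if (nb_rx + nb_flushed > 0)
		rte_eth_tx_burst(port, queue, pkts, nb_rx + nb_flushed);
}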
Re: [PATCH v4 4/6] net/i40e: avoid using const variable in assertion
On Wed, Jan 17, 2024 at 10:19:58AM -0800, Stephen Hemminger wrote:
> Clang does not allow const variable in a static_assert
> expression.
>
> Signed-off-by: Stephen Hemminger

Acked-by: Bruce Richardson
Re: [dpdk-dev] [v1] ethdev: support Tx queue used count
On Fri, Jan 12, 2024 at 5:04 PM Ferruh Yigit wrote: > > On 1/11/2024 3:17 PM, jer...@marvell.com wrote: > > From: Jerin Jacob > > > > Introduce a new API to retrieve the number of used descriptors > > in a Tx queue. Applications can leverage this API in the fast path to > > inspect the Tx queue occupancy and take appropriate actions based on the > > available free descriptors. > > > > A notable use case could be implementing Random Early Discard (RED) > > in software based on Tx queue occupancy. > > > > Signed-off-by: Jerin Jacob > > --- > > > > As we are adding a new API and dev_ops, is a driver implementation and > testpmd/example implementation planned for this release? Yes. > > > > rfc..v1: > > - Updated API similar to rte_eth_rx_queue_count() where it returns > > "used" count instead of "free" count > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst > > index f7d9980849..0d5a8733fc 100644 > > --- a/doc/guides/nics/features.rst > > +++ b/doc/guides/nics/features.rst > > @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for > > more details). > > > > * **[implements] eth_dev_ops**: ``get_monitor_addr`` > > > > +.. _nic_features_tx_queue_used_count: > > + > > +Tx queue count > > +-- > > + > > +Supports to get the number of used descriptors of a Tx queue. > > + > > +* **[implements] eth_dev_ops**: ``tx_queue_count``. > > +* **[related] API**: ``rte_eth_tx_queue_count()``. > > + > > > > Can you please keep the order same with 'default.ini' file, > I recognized there is already some mismatch in order but we can fix them > later. Ack > > > .. _nic_features_other: > > > > Other dev ops not represented by a Feature > > diff --git a/doc/guides/nics/features/default.ini > > b/doc/guides/nics/features/default.ini > > index 806cb033ff..3ef6d45c0e 100644 > > --- a/doc/guides/nics/features/default.ini > > +++ b/doc/guides/nics/features/default.ini > > @@ -59,6 +59,7 @@ Packet type parsing = > > Timesync = > > Rx descriptor status = > > Tx descriptor status = > > +Tx queue count = > > > > Existing Rx queue count is not documented, if we are documenting this, > can you please add "Rx queue count" in a separate patch? I will do. 
> > > Basic stats = > > Extended stats = > > Stats per queue = > > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h > > index b482cd12bb..f05f68a67c 100644 > > --- a/lib/ethdev/ethdev_driver.h > > +++ b/lib/ethdev/ethdev_driver.h > > @@ -58,6 +58,8 @@ struct rte_eth_dev { > > eth_rx_queue_count_t rx_queue_count; > > /** Check the status of a Rx descriptor */ > > eth_rx_descriptor_status_t rx_descriptor_status; > > + /** Get the number of used Tx descriptors */ > > + eth_tx_queue_count_t tx_queue_count; > > /** Check the status of a Tx descriptor */ > > eth_tx_descriptor_status_t tx_descriptor_status; > > /** Pointer to PMD transmit mbufs reuse function */ > > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c > > index a656df293c..626524558a 100644 > > --- a/lib/ethdev/ethdev_private.c > > +++ b/lib/ethdev/ethdev_private.c > > @@ -273,6 +273,7 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo, > > fpo->tx_pkt_prepare = dev->tx_pkt_prepare; > > fpo->rx_queue_count = dev->rx_queue_count; > > fpo->rx_descriptor_status = dev->rx_descriptor_status; > > + fpo->tx_queue_count = dev->tx_queue_count; > > fpo->tx_descriptor_status = dev->tx_descriptor_status; > > fpo->recycle_tx_mbufs_reuse = dev->recycle_tx_mbufs_reuse; > > fpo->recycle_rx_descriptors_refill = > > dev->recycle_rx_descriptors_refill; > > diff --git a/lib/ethdev/ethdev_trace_points.c > > b/lib/ethdev/ethdev_trace_points.c > > index 91f71d868b..e618414392 100644 > > --- a/lib/ethdev/ethdev_trace_points.c > > +++ b/lib/ethdev/ethdev_trace_points.c > > @@ -481,6 +481,9 @@ RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports, > > RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity, > > lib.ethdev.map_aggr_tx_affinity) > > > > +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_count, > > + lib.ethdev.tx_queue_count) > > + > > Can you please group this with 'tx_burst' & 'call_tx_callbacks' above? Ack > > > RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy, > > lib.ethdev.flow.copy) > > > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > > index 21e3a21903..af59da9652 100644 > > --- a/lib/ethdev/rte_ethdev.h > > +++ b/lib/ethdev/rte_ethdev.h > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t > > rx_queue_id, > > __rte_experimental > > int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, > > uint32_t *ptypes, int num); > > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > > notice > > + * > > + * Get the number of used descriptors of a Tx queue > > + * > > + * Th
[INTERNAL]RE: [PATCH] net/mlx5/hws: fix ESP matching validation
HWS guys claimed this check was needed to make sure matcher works well on ESP. Are we sure we can remove this ? > -Original Message- > From: Raslan Darawsheh > Sent: Wednesday, January 17, 2024 7:12 PM > To: Michael Baum ; dev@dpdk.org > Cc: Matan Azrad ; Dariusz Sosnowski > ; Slava Ovsiienko ; Ori > Kam ; Suanming Mou ; Hamdan > Igbaria ; sta...@dpdk.org > Subject: RE: [PATCH] net/mlx5/hws: fix ESP matching validation > > Hi, > > -Original Message- > > From: Michael Baum > > Sent: Monday, January 15, 2024 2:10 PM > > To: dev@dpdk.org > > Cc: Matan Azrad ; Dariusz Sosnowski > > ; Raslan Darawsheh ; Slava > > Ovsiienko ; Ori Kam ; > > Suanming Mou ; Hamdan Igbaria > > ; sta...@dpdk.org > > Subject: [PATCH] net/mlx5/hws: fix ESP matching validation > > > > The "mlx5dr_definer_conv_item_esp()" function validates first whether > > "ipsec_offload" PRM flag is on, if the flag is off the function > > returns error. > > > > The "ipsec_offload" PRM flag indicates whether IPsec encrypt/decrypt > > is supported, IPsec matching may be supported even when this flag is off. > > > > This patch removes this validation. > > > > Fixes: 81cf20a25abf ("net/mlx5/hws: support match on ESP item") > > Cc: hamd...@nvidia.com > > Cc: sta...@dpdk.org > > > > Signed-off-by: Michael Baum > > Acked-by: Hamdan Igbaria > > Acked-by: Matan Azrad > Patch applied to next-net-mlx, > > Kindest regards > Raslan Darawsheh
RE: [PATCH v4 5/6] mempool: avoid floating point expression in static assertion
> Clang does not handle casts in static_assert() expressions. > It doesn't like use of floating point to calculate threshold. > Use a different expression with same effect. > > Modify comment in mlx5 so that developers don't go searching > for old value. > > Signed-off-by: Stephen Hemminger > --- > drivers/net/mlx5/mlx5_rxq.c | 2 +- > lib/mempool/rte_mempool.c | 7 --- > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c > index 1bb036afebb3..7d972b6d927c 100644 > --- a/drivers/net/mlx5/mlx5_rxq.c > +++ b/drivers/net/mlx5/mlx5_rxq.c > @@ -1444,7 +1444,7 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) > /* >* rte_mempool_create_empty() has sanity check to refuse large cache >* size compared to the number of elements. > - * CACHE_FLUSHTHRESH_MULTIPLIER is defined in a C file, so using a > + * CALC_CACHE_FLUSHTHRESH() is defined in a C file, so using a >* constant number 2 instead. >*/ > obj_num = RTE_MAX(obj_num, MLX5_MPRQ_MP_CACHE_SZ * 2); > diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c > index b7a19bea7185..12390a2c8155 100644 > --- a/lib/mempool/rte_mempool.c > +++ b/lib/mempool/rte_mempool.c > @@ -50,9 +50,10 @@ static void > mempool_event_callback_invoke(enum rte_mempool_event event, > struct rte_mempool *mp); > > -#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5 > -#define CALC_CACHE_FLUSHTHRESH(c)\ > - ((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER)) > +/* Note: avoid using floating point since that compiler > + * may not think that is constant. > + */ > +#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2) > > #if defined(RTE_ARCH_X86) > /* Acked-by: Konstantin Ananyev > -- > 2.43.0
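For illustration (not part of the patch), a quick compile-time check that the integer form keeps the old 1.5x behaviour for typical cache sizes and can now sit inside a static assertion; the values are arbitrary examples:

#include <assert.h>

/* Same macro as in the patch. Both the old floating-point form (cast back
 * to the integer type) and the new integer form truncate toward zero, so
 * the results agree for these inputs. */
#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2)

static_assert(CALC_CACHE_FLUSHTHRESH(512) == 768, "matches 512 * 1.5");
static_assert(CALC_CACHE_FLUSHTHRESH(511) == 766, "truncates like the old cast");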
RE: [PATCH v4 4/6] net/i40e: avoid using const variable in assertion
> Clang does not allow const variable in a static_assert > expression. > > Signed-off-by: Stephen Hemminger > --- > drivers/net/i40e/i40e_ethdev.h | 1 + > drivers/net/i40e/i40e_rxtx_vec_sse.c | 10 -- > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h > index 1bbe7ad37600..445e1c0b381f 100644 > --- a/drivers/net/i40e/i40e_ethdev.h > +++ b/drivers/net/i40e/i40e_ethdev.h > @@ -278,6 +278,7 @@ enum i40e_flxpld_layer_idx { > #define I40E_DEFAULT_DCB_APP_PRIO 3 > > #define I40E_FDIR_PRG_PKT_CNT 128 > +#define I40E_FDIR_ID_BIT_SHIFT 13 > > /* > * Struct to store flow created. > diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c > b/drivers/net/i40e/i40e_rxtx_vec_sse.c > index 9200a23ff662..2d4480a7651b 100644 > --- a/drivers/net/i40e/i40e_rxtx_vec_sse.c > +++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c > @@ -143,10 +143,9 @@ descs_to_fdir_32b(volatile union i40e_rx_desc *rxdp, > struct rte_mbuf **rx_pkt) > /* convert fdir_id_mask into a single bit, then shift as required for >* correct location in the mbuf->olflags >*/ > - const uint32_t FDIR_ID_BIT_SHIFT = 13; > - RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << FDIR_ID_BIT_SHIFT)); > + RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << > I40E_FDIR_ID_BIT_SHIFT)); > v_fd_id_mask = _mm_srli_epi32(v_fd_id_mask, 31); > - v_fd_id_mask = _mm_slli_epi32(v_fd_id_mask, FDIR_ID_BIT_SHIFT); > + v_fd_id_mask = _mm_slli_epi32(v_fd_id_mask, I40E_FDIR_ID_BIT_SHIFT); > > /* The returned value must be combined into each mbuf. This is already >* being done for RSS and VLAN mbuf olflags, so return bits to OR in. > @@ -205,10 +204,9 @@ descs_to_fdir_16b(__m128i fltstat, __m128i descs[4], > struct rte_mbuf **rx_pkt) > descs[0] = _mm_blendv_epi8(descs[0], _mm_setzero_si128(), v_desc0_mask); > > /* Shift to 1 or 0 bit per u32 lane, then to RTE_MBUF_F_RX_FDIR_ID > offset */ > - const uint32_t FDIR_ID_BIT_SHIFT = 13; > - RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << FDIR_ID_BIT_SHIFT)); > + RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << > I40E_FDIR_ID_BIT_SHIFT)); > __m128i v_mask_one_bit = _mm_srli_epi32(v_fdir_id_mask, 31); > - return _mm_slli_epi32(v_mask_one_bit, FDIR_ID_BIT_SHIFT); > + return _mm_slli_epi32(v_mask_one_bit, I40E_FDIR_ID_BIT_SHIFT); > } > #endif > > -- Acked-by: Konstantin Ananyev > 2.43.0
Re: FreeBSD 14.0 build failure
On Wed, Jan 17, 2024 at 06:28:11PM -0800, Stephen Hemminger wrote: > DPDK will not build on FreeBSD 14.0 > > [2/2] Generating kernel/freebsd/nic_uio with a custom command > FAILED: kernel/freebsd/nic_uio.ko > /usr/bin/make -f ../kernel/freebsd/BSDmakefile.meson > KMOD_OBJDIR=kernel/freebsd KMOD_SRC=../kernel/freebsd/nic_uio/nic_uio.c > KMOD=nic_uio 'KMOD_CFLAGS=-I/home/shemminger/dpdk/build > -I/home/shemminger/dpdk/config -include rte_config.h' CC=clang > clang -O2 -pipe -include rte_config.h -fno-strict-aliasing -Werror > -D_KERNEL -DKLD_MODULE -nostdinc -I/home/shemminger/dpdk/build > -I/home/shemminger/dpdk/config -include > /home/shemminger/dpdk/build/kernel/freebsd/opt_global.h -I. -I/usr/src/sys > -I/usr/src/sys/contrib/ck/include -fno-common -fno-omit-frame-pointer > -mno-omit-leaf-frame-pointer > -fdebug-prefix-map=./machine=/usr/src/sys/amd64/include > -fdebug-prefix-map=./x86=/usr/src/sys/x86/include > -fdebug-prefix-map=./i386=/usr/src/sys/i386/include -MD > -MF.depend.nic_uio.o -MTnic_uio.o -mcmodel=kernel -mno-red-zone -mno-mmx > -mno-sse -msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fwrapv > -fstack-protector -Wall -Wstrict-prototypes -Wmissing-prototypes > -Wpointer-arith -Wcast-qual -Wundef -Wno-pointer-sign > -D__printf__=__freebsd_kprintf__ -Wmissing-include-dirs > -fdiagnostics-show-option -Wno-unknown-pragmas > -Wno-error=tautological-compare -Wno-error=empty-body > -Wno-error=parentheses-equality -Wno-error=unused-function > -Wno-error=pointer-sign -Wno-error=shift-negative-value > -Wno-address-of-packed-member -Wno-format-zero-length -mno-aes -mno-avx > -std=gnu99 -c /home/shemminger/dpdk/kernel/freebsd/nic_uio/nic_uio.c -o > nic_uio.o > /home/shemminger/dpdk/kernel/freebsd/nic_uio/nic_uio.c:84:81: error: too many > arguments provided to function-like macro invocation > DRIVER_MODULE(nic_uio, pci, nic_uio_driver, nic_uio_devclass, > nic_uio_modevent, 0); > > ^ > /usr/src/sys/sys/bus.h:832:9: note: macro 'DRIVER_MODULE' defined here > #define DRIVER_MODULE(name, busname, driver, evh, arg) \ > ^ > /home/shemminger/dpdk/kernel/freebsd/nic_uio/nic_uio.c:84:1: error: type > specifier missing, defaults to 'int'; ISO C99 and later do not support > implicit int [-Werror,-Wimplicit-int] > DRIVER_MODULE(nic_uio, pci, nic_uio_driver, nic_uio_devclass, > nic_uio_modevent, 0); > ^ Yes. I've sent out a patch last month for this: https://patches.dpdk.org/project/dpdk/patch/20231219112959.10440-1-bruce.richard...@intel.com/ /Bruce
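For reference, the quoted sys/bus.h definition shows that DRIVER_MODULE() takes five arguments in FreeBSD 14 (the devclass parameter is gone), so the invocation in nic_uio.c would need to drop nic_uio_devclass along these lines. This is inferred from the error output above, not copied from the linked patch:

/* nic_uio.c, hypothetical FreeBSD 14 form of the failing line: */
DRIVER_MODULE(nic_uio, pci, nic_uio_driver, nic_uio_modevent, 0);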
RE: [PATCH v4 1/6] eal: introduce RTE_MIN_T() and RTE_MAX_T() macros
> These macros work like RTE_MIN and RTE_MAX but take an explicit > type. Necessary when being used in static assertions since > RTE_MIN and RTE_MAX use temporary variables which confuses > compilers constant expression checks. These macros could also > be useful in other scenarios when bounded range is useful. > > Naming is chosen to be similar to Linux kernel conventions. > > Signed-off-by: Stephen Hemminger > --- > lib/eal/include/rte_common.h | 16 > 1 file changed, 16 insertions(+) > > diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h > index c1ba32d00e47..33680e818bfb 100644 > --- a/lib/eal/include/rte_common.h > +++ b/lib/eal/include/rte_common.h > @@ -585,6 +585,14 @@ __extension__ typedef uint64_t RTE_MARKER64[0]; > _a < _b ? _a : _b; \ > }) > > +/** > + * Macro to return the minimum of two numbers > + * does not use temporarys so not safe if a or b is expression > + * but is guaranteed to be constant for use in static_assert() > + */ > +#define RTE_MIN_T(a, b, t) \ > + ((t)(a) < (t)(b) ? (t)(a) : (t)(b)) > + > /** > * Macro to return the maximum of two numbers > */ > @@ -595,6 +603,14 @@ __extension__ typedef uint64_t RTE_MARKER64[0]; > _a > _b ? _a : _b; \ > }) > > +/** > + * Macro to return the maxiimum of two numbers > + * does not use temporarys so not safe if a or b is expression > + * but is guaranteed to be constant for use in static_assert() > + */ > +#define RTE_MAX_T(a, b, t) \ > + ((t)(a) > (t)(b) ? (t)(a) : (t)(b)) > + > /*** Other general functions / macros / > > #ifndef offsetof > -- Acked-by: Konstantin Ananyev > 2.43.0
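A small usage sketch (hypothetical constants, assuming the macro from this patch is available in rte_common.h) showing why the typed variant works where RTE_MIN() does not:

#include <assert.h>
#include <stdint.h>
#include <rte_common.h>

#define EXAMPLE_MTU_MAX      9600u   /* illustrative values only */
#define EXAMPLE_SEG_MAX      16383u
#define EXAMPLE_DESC_LEN_MAX 16384u

/* RTE_MIN() expands to a statement expression with temporaries, which clang
 * rejects as non-constant inside static_assert()/RTE_BUILD_BUG_ON():
 *
 *   static_assert(EXAMPLE_DESC_LEN_MAX >=
 *                 RTE_MIN(EXAMPLE_MTU_MAX, EXAMPLE_SEG_MAX), "...");
 *
 * The typed variant is a plain conditional expression, so it remains an
 * integer constant expression: */
static_assert(EXAMPLE_DESC_LEN_MAX >=
	      RTE_MIN_T(EXAMPLE_MTU_MAX, EXAMPLE_SEG_MAX, uint32_t),
	      "descriptor length must cover the smaller of MTU and segment limits");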
Re: [PATCH v4 3/6] net/sfc: fix non-constant expression in RTE_BUILD_BUG_ON()
On 1/17/24 21:19, Stephen Hemminger wrote: The macro RTE_MIN has some hidden assignments to provide type safety which means the statement can not be fully evaluated in first pass of compiler. Replace RTE_MIN() with equivalent macro. Fixes: 4f93d790 ("net/sfc: support TSO for EF100 native datapath") Cc: ivan.ma...@oktetlabs.ru Signed-off-by: Stephen Hemminger Acked-by: Tyler Retzlaff One nit below, anyway: Reviewed-by: Andrew Rybchenko --- drivers/net/sfc/sfc_ef100_tx.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c index 1b6374775f07..c49d004113d3 100644 --- a/drivers/net/sfc/sfc_ef100_tx.c +++ b/drivers/net/sfc/sfc_ef100_tx.c @@ -26,7 +26,6 @@ #include "sfc_ef100.h" #include "sfc_nic_dma_dp.h" - unrelated change #define sfc_ef100_tx_err(_txq, ...) \ SFC_DP_LOG(SFC_KVARG_DATAPATH_EF100, ERR, &(_txq)->dp.dpq, __VA_ARGS__) @@ -563,8 +562,7 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m) * (split into many Tx descriptors). */ RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < -RTE_MIN((unsigned int)EFX_MAC_PDU_MAX, -SFC_MBUF_SEG_LEN_MAX)); +RTE_MIN_T(EFX_MAC_PDU_MAX, SFC_MBUF_SEG_LEN_MAX, uint32_t)); } if (m->ol_flags & sfc_dp_mport_override) {
Re: [PATCH v4 5/6] mempool: avoid floating point expression in static assertion
On 1/17/24 21:32, Morten Brørup wrote: From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Wednesday, 17 January 2024 19.20 Clang does not handle casts in static_assert() expressions. It doesn't like use of floating point to calculate threshold. Use a different expression with same effect. Modify comment in mlx5 so that developers don't go searching for old value. Signed-off-by: Stephen Hemminger Acked-by: Morten Brørup Reviewed-by: Andrew Rybchenko
Re: [PATCH v4 2/6] event/opdl: fix non-constant compile time assertion
On 1/17/24 21:19, Stephen Hemminger wrote: RTE_BUILD_BUG_ON() was being used with a non-constant value. The inline function rte_is_power_of_2() is not constant since inline expansion happens later in the compile process. Replace it with the macro which will be constant. Fixes: 4236ce9bf5bf ("event/opdl: add OPDL ring infrastructure library") Cc: liang.j...@intel.com Signed-off-by: Stephen Hemminger Acked-by: Bruce Richardson Acked-by: Tyler Retzlaff Acked-by: Andrew Rybchenko
Re: [PATCH v4 1/6] eal: introduce RTE_MIN_T() and RTE_MAX_T() macros
On 1/17/24 21:19, Stephen Hemminger wrote: These macros work like RTE_MIN and RTE_MAX but take an explicit type. Necessary when being used in static assertions since RTE_MIN and RTE_MAX use temporary variables which confuses compilers constant expression checks. These macros could also be useful in other scenarios when bounded range is useful. Naming is chosen to be similar to Linux kernel conventions. Signed-off-by: Stephen Hemminger Acked-by: Andrew Rybchenko
[dpdk-dev] [v2] ethdev: support Tx queue used count
From: Jerin Jacob Introduce a new API to retrieve the number of used descriptors in a Tx queue. Applications can leverage this API in the fast path to inspect the Tx queue occupancy and take appropriate actions based on the available free descriptors. A notable use case could be implementing Random Early Discard (RED) in software based on Tx queue occupancy. Signed-off-by: Jerin Jacob Reviewed-by: Andrew Rybchenko Acked-by: Morten Brørup --- devtools/libabigail.abignore | 3 + doc/guides/nics/features.rst | 10 doc/guides/nics/features/default.ini | 1 + doc/guides/rel_notes/release_24_03.rst | 5 ++ lib/ethdev/ethdev_driver.h | 2 + lib/ethdev/ethdev_private.c| 1 + lib/ethdev/ethdev_trace_points.c | 3 + lib/ethdev/rte_ethdev.h| 80 ++ lib/ethdev/rte_ethdev_core.h | 7 ++- lib/ethdev/rte_ethdev_trace_fp.h | 8 +++ lib/ethdev/version.map | 1 + 11 files changed, 120 insertions(+), 1 deletion(-) v2: - Rename _nic_features_tx_queue_used_count to _nic_features_tx_queue_count - Fix trace emission of case fops->tx_queue_count == NULL - Rename tx_queue_id to queue_id in implementation symbols and prints - Added "goto out" for better error handling - Add release note - Added libabigail suppression rule for the reserved2 field update - Fix all ordering and grouping, empty line comment from Ferruh - Added following notes in doxygen documentation for better clarity on API usage * @note There is no requirement to call this function before rte_eth_tx_burst() invocation. * @note Utilize this function exclusively when the caller needs to determine the used queue count * across all descriptors of a Tx queue. If the use case only involves checking the status of a * specific descriptor slot, opt for rte_eth_tx_descriptor_status() instead. rfc..v1: - Updated API similar to rte_eth_rx_queue_count() where it returns "used" count instead of "free" count diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore index 21b8cd6113..d6e98c6f52 100644 --- a/devtools/libabigail.abignore +++ b/devtools/libabigail.abignore @@ -33,3 +33,6 @@ ; Temporary exceptions till next major ABI version ; +[suppress_type] + name = rte_eth_fp_ops + has_data_member_inserted_between = {offset_of(reserved2), end} diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst index f7d9980849..f38941c719 100644 --- a/doc/guides/nics/features.rst +++ b/doc/guides/nics/features.rst @@ -697,6 +697,16 @@ or "Unavailable." * **[related]API**: ``rte_eth_tx_descriptor_status()``. +.. _nic_features_tx_queue_count: + +Tx queue count +-- + +Supports to get the number of used descriptors of a Tx queue. + +* **[implements] eth_dev_ops**: ``tx_queue_count``. +* **[related] API**: ``rte_eth_tx_queue_count()``. + .. _nic_features_basic_stats: Basic stats diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini index 6d50236292..5115963136 100644 --- a/doc/guides/nics/features/default.ini +++ b/doc/guides/nics/features/default.ini @@ -59,6 +59,7 @@ Packet type parsing = Timesync = Rx descriptor status = Tx descriptor status = +Tx queue count = Basic stats = Extended stats = Stats per queue = diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst index c4fc8ad583..16dd367178 100644 --- a/doc/guides/rel_notes/release_24_03.rst +++ b/doc/guides/rel_notes/release_24_03.rst @@ -65,6 +65,11 @@ New Features * Added ``RTE_FLOW_ITEM_TYPE_RANDOM`` to match random value. * Added ``RTE_FLOW_FIELD_RANDOM`` to represent it in field ID struct. 
+* ** Support for getting the number of used descriptors of a Tx queue. ** + + * Added a fath path function ``rte_eth_tx_queue_count`` to get the number of used +descriptors of a Tx queue. + Removed Items - diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index b482cd12bb..f05f68a67c 100644 --- a/lib/ethdev/ethdev_driver.h +++ b/lib/ethdev/ethdev_driver.h @@ -58,6 +58,8 @@ struct rte_eth_dev { eth_rx_queue_count_t rx_queue_count; /** Check the status of a Rx descriptor */ eth_rx_descriptor_status_t rx_descriptor_status; + /** Get the number of used Tx descriptors */ + eth_tx_queue_count_t tx_queue_count; /** Check the status of a Tx descriptor */ eth_tx_descriptor_status_t tx_descriptor_status; /** Pointer to PMD transmit mbufs reuse function */ diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c index a656df293c..626524558a 100644 --- a/lib/ethdev/ethdev_private.c +++ b/lib/ethdev/ethdev_private.c @@ -273,6 +273,7 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo, fpo-
RE: [dpdk-dev] [v1] ethdev: support Tx queue used count
Hi Jerin, > > > Introduce a new API to retrieve the number of used descriptors > > > in a Tx queue. Applications can leverage this API in the fast path to > > > inspect the Tx queue occupancy and take appropriate actions based on the > > > available free descriptors. > > > > > > A notable use case could be implementing Random Early Discard (RED) > > > in software based on Tx queue occupancy. > > > > > > Signed-off-by: Jerin Jacob > > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, > > > uint16_t rx_queue_id, > > > __rte_experimental > > > int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, > > > uint32_t *ptypes, int num); > > > > > > +/** > > > + * @warning > > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > > > notice > > > + * > > > + * Get the number of used descriptors of a Tx queue > > > + * > > > + * This function retrieves the number of used descriptors of a transmit > > > queue. > > > + * Applications can use this API in the fast path to inspect Tx queue > > > occupancy and take > > > + * appropriate actions based on the available free descriptors. > > > + * An example action could be implementing the Random Early Discard > > > (RED). > > > > Sorry, I probably misunderstood your previous mails, but wouldn't it be > > more convenient > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and > > have rte_eth_tx_queue_count(...) { queue_txd_num - > > rte_eth_tx_queue_free_count(...);} > > as a slow-path function in rte_ethdev.c? > > The general feedback is to align with the Rx queue API, specifically > rte_eth_rx_queue_count, > and it's noted that there is no equivalent rte_eth_rx_queue_free_count. > > Given that the free count can be obtained by subtracting the used > count from queue_txd_num, > it is considered that either approach is acceptable. > > The application configures queue_txd_num with tx_queue_setup(), and > the application can store that value in its structure. > This would enable fast-path usage for both base cases (whether the > application needs information about free or used descriptors) > with just one API(rte_eth_tx_queue_count()) Right now I don't use these functions, but if I think what most people are interested in: - how many packets you can receive immediately (rx_queue_count) - how many packets you can transmit immediately (tx_queue_free_count) Sure, I understand that user can store txd_num somewhere and then do subtraction himself. Though it means more effort for the user, and the only reason for that, as I can see, is to have RX and TX function naming symmetric. Which seems much less improtant to me comparing to user convenience. Anyway, as I stated above, I don't use these functions right now, so if the majority of users are happy with current approach, I would not insist :) Konstantin
[PATCH] vhost: fix deadlock during software live migration of VDPA in a nested virtualization environment
In a nested virtualization environment, running dpdk vdpa in QEMU-L1 for software live migration will result in a deadlock between the dpdk-vdpa and QEMU-L2 processes.

rte_vdpa_relay_vring_used->
 __vhost_iova_to_vva->
  vhost_user_iotlb_rd_unlock(vq)->
   vhost_user_iotlb_miss->
sends the vhost message VHOST_USER_SLAVE_IOTLB_MSG to QEMU's vdpa socket, then calls vhost_user_iotlb_rd_lock(vq) to hold the read lock `iotlb_lock`. But there is no place to release this read lock.

QEMU-L2 gets the VHOST_USER_SLAVE_IOTLB_MSG, then calls vhost_user_send_device_iotlb_msg to send VHOST_USER_IOTLB_MSG messages back to dpdk-vdpa. dpdk-vdpa then calls vhost_user_iotlb_msg-> vhost_user_iotlb_cache_insert, which tries to obtain the write lock `iotlb_lock`, but the read lock `iotlb_lock` has not been released, so it blocks here.

This patch adds the missing lock and unlock calls to fix the deadlock.

Signed-off-by: Hao Chen
---
 lib/vhost/vdpa.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/vhost/vdpa.c b/lib/vhost/vdpa.c
index 9776fc07a9..9132414209 100644
--- a/lib/vhost/vdpa.c
+++ b/lib/vhost/vdpa.c
@@ -19,6 +19,7 @@
 #include "rte_vdpa.h"
 #include "vdpa_driver.h"
 #include "vhost.h"
+#include "iotlb.h"
 
 /** Double linked list of vDPA devices. */
 TAILQ_HEAD(vdpa_device_list, rte_vdpa_device);
@@ -193,10 +194,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void *vring_m)
 	if (unlikely(nr_descs > vq->size))
 		return -1;
 
+	vhost_user_iotlb_rd_lock(vq);
 	desc_ring = (struct vring_desc *)(uintptr_t)
 		vhost_iova_to_vva(dev, vq, vq->desc[desc_id].addr,
 				&dlen, VHOST_ACCESS_RO);
+	vhost_user_iotlb_rd_unlock(vq);
 	if (unlikely(!desc_ring))
 		return -1;
 
@@ -220,9 +223,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void *vring_m)
 		if (unlikely(nr_descs-- == 0))
 			goto fail;
 		desc = desc_ring[desc_id];
-		if (desc.flags & VRING_DESC_F_WRITE)
+		if (desc.flags & VRING_DESC_F_WRITE) {
+			vhost_user_iotlb_rd_lock(vq);
 			vhost_log_write_iova(dev, vq, desc.addr,
 					desc.len);
+			vhost_user_iotlb_rd_unlock(vq);
+		}
 		desc_id = desc.next;
 	} while (desc.flags & VRING_DESC_F_NEXT);
-- 
2.27.0
[PATCH] maintainers: update for intel PMD
Remove my name for next-net-intel, fm10k, ice and af_xdp.

Signed-off-by: Qi Zhang
---
 MAINTAINERS | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0d1c8126e3..7d74486d1a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -37,7 +37,6 @@ M: Ajit Khaparde
 T: git://dpdk.org/next/dpdk-next-net-brcm
 
 Next-net-intel Tree
-M: Qi Zhang
 T: git://dpdk.org/next/dpdk-next-net-intel
 
 Next-net-mrvl Tree
@@ -634,7 +633,6 @@ F: doc/guides/nics/features/afpacket.ini
 
 Linux AF_XDP
 M: Ciara Loftus
-M: Qi Zhang
 F: drivers/net/af_xdp/
 F: doc/guides/nics/af_xdp.rst
 F: doc/guides/nics/features/af_xdp.ini
@@ -767,7 +765,6 @@ F: doc/guides/nics/intel_vf.rst
 F: doc/guides/nics/features/i40e*.ini
 
 Intel fm10k
-M: Qi Zhang
 M: Xiao Wang
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/fm10k/
@@ -784,7 +781,6 @@ F: doc/guides/nics/features/iavf*.ini
 
 Intel ice
 M: Qiming Yang
-M: Qi Zhang
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/ice/
 F: doc/guides/nics/ice.rst
-- 
2.31.1
[Bug 1368] inconsistency in eventdev dev_info and config structs makes some valid configs impossible
https://bugs.dpdk.org/show_bug.cgi?id=1368 Bug ID: 1368 Summary: inconsistency in eventdev dev_info and config structs makes some valid configs impossible Product: DPDK Version: unspecified Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: Normal Component: eventdev Assignee: dev@dpdk.org Reporter: bruce.richard...@intel.com Target Milestone: --- In the rte_event_dev_info struct[1], we have the max_event_queues[2] and max_single_link_event_port_queue_pairs[3] members. The doxygen docs on the latter states: "These ports and queues are not accounted for in max_event_ports or max_event_queues." This implies that a device which has 8 regular queues and an extra 8 single-link only queues, would report max_event_queues == 8, and max_single_link_event_port_queue_pairs == 8 on return from rte_event_dev_info_get() function. Those values returned from info_get are generally to be used to guide the configuration using rte_event_dev_configure() API, which takes the rte_event_dev_config[4] struct. This has two similar fields, in nb_event_queues[5] and nb_single_link_event_port_queues[6]. However, a problem arises in that the documentation states that nb_event_queues cannot be greater than the previously reported max_event_queues (which by itself makes sense), but the documentation also states that nb_single_link_event_port_queues is a subset of the overall event ports and queues, and cannot be greater than the nb_event_queues given in the same config structure. To illustrate the issue by continuing to use the same example as above, suppose an app wants to take that device with 8 regular queues and 8 single link ones, and have an app with 2 shared processing queues, e.g. for load-balancing packets/events among 8 cores, but also wants to use the 8 single link queues to allow sending packets/events directly to each core without load balancing. In this 2 + 8 scenario, there is no valid dev_config struct settings that will work: * making the 8 a subset of the nb_event_queues, means that nb_event_queues is 10, which is greater than max_event_queues and so invalid. * keeping them separate, so that nb_event_queues == 2 and nb_single_link_port_queues == 8 violates the constraint that the single_link value cannot exceed the former nb_event_queues value. We therefore need to adjust the constraints to make things work. Now we can do so, while keeping the single_link value *not included* in the total-count in dev_info, but have it *included* in the config struct, but such a setup is very confusing for the user. Therefore, I think instead we need to correct this by aligning the two structures - either the single_link queues are included in the queue/port counts in both structs, or they aren't included. [1] https://doc.dpdk.org/api/structrte__event__dev__info.html [2] https://doc.dpdk.org/api/structrte__event__dev__info.html#a1cebb1d19943d6b8e3d6e51ffc72982a [3] https://doc.dpdk.org/api/structrte__event__dev__info.html#ae65bf9e4dba80ccb205f3c43f5907d5d [4] https://doc.dpdk.org/api/structrte__event__dev__config.html [5] https://doc.dpdk.org/api/structrte__event__dev__config.html#a703c026d74436b05fc656652324101e4 [6] https://doc.dpdk.org/api/structrte__event__dev__config.html#a39f29448dce5baf491f6685299faa0c9 -- You are receiving this mail because: You are the assignee for the bug.
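To make the contradiction concrete, a minimal configuration sketch for the 2 + 8 example above; the values are illustrative and the remaining mandatory fields of rte_event_dev_config are omitted for brevity:

#include <rte_eventdev.h>

static int
configure_2_plus_8(uint8_t dev_id)
{
	struct rte_event_dev_info info;
	struct rte_event_dev_config cfg = { 0 };

	rte_event_dev_info_get(dev_id, &info);
	/* Assume info.max_event_queues == 8 and
	 * info.max_single_link_event_port_queue_pairs == 8. */

	cfg.nb_single_link_event_port_queues = 8;
	/* Per the documented constraints, no value works here:
	 * 2 + 8 = 10 exceeds max_event_queues (8), while plain 2 is smaller
	 * than nb_single_link_event_port_queues (8). */
	cfg.nb_event_queues = 2 + 8;

	return rte_event_dev_configure(dev_id, &cfg);
}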
Re: [dpdk-dev] [v1] ethdev: support Tx queue used count
On Thu, Jan 18, 2024 at 3:47 PM Konstantin Ananyev wrote: > > > Hi Jerin, Hi Konstantin, > > > > > Introduce a new API to retrieve the number of used descriptors > > > > in a Tx queue. Applications can leverage this API in the fast path to > > > > inspect the Tx queue occupancy and take appropriate actions based on the > > > > available free descriptors. > > > > > > > > A notable use case could be implementing Random Early Discard (RED) > > > > in software based on Tx queue occupancy. > > > > > > > > Signed-off-by: Jerin Jacob > > > > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, > > > > uint16_t rx_queue_id, > > > > __rte_experimental > > > > int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, > > > > uint32_t *ptypes, int num); > > > > > > > > +/** > > > > + * @warning > > > > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > > > > notice > > > > + * > > > > + * Get the number of used descriptors of a Tx queue > > > > + * > > > > + * This function retrieves the number of used descriptors of a > > > > transmit queue. > > > > + * Applications can use this API in the fast path to inspect Tx queue > > > > occupancy and take > > > > + * appropriate actions based on the available free descriptors. > > > > + * An example action could be implementing the Random Early Discard > > > > (RED). > > > > > > Sorry, I probably misunderstood your previous mails, but wouldn't it be > > > more convenient > > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and > > > have rte_eth_tx_queue_count(...) { queue_txd_num - > > > rte_eth_tx_queue_free_count(...);} > > > as a slow-path function in rte_ethdev.c? > > > > The general feedback is to align with the Rx queue API, specifically > > rte_eth_rx_queue_count, > > and it's noted that there is no equivalent rte_eth_rx_queue_free_count. > > > > Given that the free count can be obtained by subtracting the used > > count from queue_txd_num, > > it is considered that either approach is acceptable. > > > > The application configures queue_txd_num with tx_queue_setup(), and > > the application can store that value in its structure. > > This would enable fast-path usage for both base cases (whether the > > application needs information about free or used descriptors) > > with just one API(rte_eth_tx_queue_count()) > > Right now I don't use these functions, but if I think what most people are > interested in: > - how many packets you can receive immediately (rx_queue_count) > - how many packets you can transmit immediately (tx_queue_free_count) Yes. That's why initially I kept the free version. It seems like other prominent use cases for _used_ version is for QoS to enable something like in positive logic, if < 98% % used then action Tail drop else if < 60% used then action RED > Sure, I understand that user can store txd_num somewhere and then do > subtraction himself. > Though it means more effort for the user, and the only reason for that, as I > can see, > is to have RX and TX function naming symmetric. > Which seems much less improtant to me comparing to user convenience. > Anyway, as I stated above, I don't use these functions right now, > so if the majority of users are happy with current approach, I would not > insist :) No strong opinion for me either. Looks like majority users like used version. Let's go with. I am open for changing to free version, if the majority of users like that. > Konstantin >
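As a concrete illustration of the RED-style policy discussed above, a sketch built on the proposed (experimental) rte_eth_tx_queue_count(); the thresholds, the linear drop probability and the helper name are illustrative only, not a reference implementation:

#include <stdbool.h>
#include <stdlib.h>
#include <rte_ethdev.h>

/* nb_txd is the ring size the application passed to rte_eth_tx_queue_setup(). */
static bool
tx_admit(uint16_t port_id, uint16_t queue_id, uint16_t nb_txd)
{
	int used = rte_eth_tx_queue_count(port_id, queue_id);
	int min_th = (nb_txd * 60) / 100;	/* start of early discard */
	int max_th = (nb_txd * 98) / 100;	/* tail-drop point */

	if (used < 0)		/* not supported or bad queue: let tx_burst decide */
		return true;
	if (used >= max_th)	/* nearly full: tail drop */
		return false;
	if (used <= min_th)	/* plenty of room: always admit */
		return true;
	/* In between: drop with probability rising linearly with occupancy. */
	return (rand() % (max_th - min_th)) >= (used - min_th);
}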
Re: [Bug 1368] inconsistency in eventdev dev_info and config structs makes some valid configs impossible
On Thu, Jan 18, 2024 at 11:18:58AM +, bugzi...@dpdk.org wrote: >Bug ID [1]1368 >Summary inconsistency in eventdev dev_info and config structs makes >some valid configs impossible >Product DPDK >Version unspecified >Hardware All >OS All >Status UNCONFIRMED >Severity normal >Priority Normal >Component eventdev >Assignee dev@dpdk.org >Reporter bruce.richard...@intel.com >Target Milestone --- > > In the rte_event_dev_info struct[1], we have the max_event_queues[2] and > max_single_link_event_port_queue_pairs[3] members. The doxygen docs on the > latter states: "These ports and queues are not accounted for in > max_event_ports or max_event_queues." > > This implies that a device which has 8 regular queues and an extra 8 > single-link only queues, would report max_event_queues == 8, and > max_single_link_event_port_queue_pairs == 8 on return from > rte_event_dev_info_get() function. > > Those values returned from info_get are generally to be used to guide the > configuration using rte_event_dev_configure() API, which takes the > rte_event_dev_config[4] struct. This has two similar fields, in > nb_event_queues[5] and nb_single_link_event_port_queues[6]. However, a > problem arises in that the documentation states that nb_event_queues cannot > be greater than the previously reported max_event_queues (which by itself > makes sense), but the documentation also states that > nb_single_link_event_port_queues is a subset of the overall event ports and > queues, and cannot be greater than the nb_event_queues given in the same > config structure. > > To illustrate the issue by continuing to use the same example as above, > suppose an app wants to take that device with 8 regular queues and 8 single > link ones, and have an app with 2 shared processing queues, e.g. for > load-balancing packets/events among 8 cores, but also wants to use the 8 > single link queues to allow sending packets/events directly to each core > without load balancing. In this 2 + 8 scenario, there is no valid > dev_config struct settings that will work: > * making the 8 a subset of the nb_event_queues, means that nb_event_queues > is 10, which is greater than max_event_queues and so invalid. > * keeping them separate, so that nb_event_queues == 2 and > nb_single_link_port_queues == 8 violates the constraint that the > single_link value cannot exceed the former nb_event_queues value. > > We therefore need to adjust the constraints to make things work. Now we can > do so, while keeping the single_link value *not included* in the > total-count in dev_info, but have it *included* in the config struct, but > such a setup is very confusing for the user. Therefore, I think instead we > need to correct this by aligning the two structures - either the > single_link queues are included in the queue/port counts in both structs, > or they aren't included. > Since I'm doing some clean-up work on rte_eventdev.h doxygen comments, I'm happy enough to push a patch to help fix this, if we can agree on the solution. Of the two possibilities (make both have single-link included in count, or make both have single-link not-included), I would suggest we go for having them included, on the basis that that would involve an internal DPDK change to increase the reported counts in dev_info from any drivers supporting single link queues, but should *not* involve any changes to end applications, which would already be specifying the values based on nb_single_link being a subset of nb_event_queues. 
On the other hand, changing semantics of the config struct fields would likely mean changes to end-apps and so be an ABI/API break. /Bruce
[PATCH v3] crypto/ipsec_mb: unified IPsec MB interface
Currently IPsec MB provides both the JOB API and direct API. AESNI_MB PMD is using the JOB API codepath while ZUC, KASUMI, SNOW3G, CHACHA20_POLY1305 and AESNI_GCM are using the direct API. Instead of using the direct API for these PMDs, they should now make use of the JOB API codepath. This would remove all use of the IPsec MB direct API for these PMDs. Signed-off-by: Brian Dooley --- v2: - Fix compilation failure v3: - Remove session configure pointer for each PMD --- drivers/crypto/ipsec_mb/pmd_aesni_gcm.c | 757 +- drivers/crypto/ipsec_mb/pmd_aesni_gcm_priv.h | 21 - drivers/crypto/ipsec_mb/pmd_aesni_mb.c| 8 +- drivers/crypto/ipsec_mb/pmd_aesni_mb_priv.h | 13 + drivers/crypto/ipsec_mb/pmd_chacha_poly.c | 335 +--- .../crypto/ipsec_mb/pmd_chacha_poly_priv.h| 19 - drivers/crypto/ipsec_mb/pmd_kasumi.c | 403 +- drivers/crypto/ipsec_mb/pmd_kasumi_priv.h | 12 - drivers/crypto/ipsec_mb/pmd_snow3g.c | 539 + drivers/crypto/ipsec_mb/pmd_snow3g_priv.h | 13 - drivers/crypto/ipsec_mb/pmd_zuc.c | 341 +--- drivers/crypto/ipsec_mb/pmd_zuc_priv.h| 11 - 12 files changed, 34 insertions(+), 2438 deletions(-) diff --git a/drivers/crypto/ipsec_mb/pmd_aesni_gcm.c b/drivers/crypto/ipsec_mb/pmd_aesni_gcm.c index 8d40bd9169..50b65749a2 100644 --- a/drivers/crypto/ipsec_mb/pmd_aesni_gcm.c +++ b/drivers/crypto/ipsec_mb/pmd_aesni_gcm.c @@ -3,753 +3,7 @@ */ #include "pmd_aesni_gcm_priv.h" - -static void -aesni_gcm_set_ops(struct aesni_gcm_ops *ops, IMB_MGR *mb_mgr) -{ - /* Set 128 bit function pointers. */ - ops[GCM_KEY_128].pre = mb_mgr->gcm128_pre; - ops[GCM_KEY_128].init = mb_mgr->gcm128_init; - - ops[GCM_KEY_128].enc = mb_mgr->gcm128_enc; - ops[GCM_KEY_128].update_enc = mb_mgr->gcm128_enc_update; - ops[GCM_KEY_128].finalize_enc = mb_mgr->gcm128_enc_finalize; - - ops[GCM_KEY_128].dec = mb_mgr->gcm128_dec; - ops[GCM_KEY_128].update_dec = mb_mgr->gcm128_dec_update; - ops[GCM_KEY_128].finalize_dec = mb_mgr->gcm128_dec_finalize; - - ops[GCM_KEY_128].gmac_init = mb_mgr->gmac128_init; - ops[GCM_KEY_128].gmac_update = mb_mgr->gmac128_update; - ops[GCM_KEY_128].gmac_finalize = mb_mgr->gmac128_finalize; - - /* Set 192 bit function pointers. */ - ops[GCM_KEY_192].pre = mb_mgr->gcm192_pre; - ops[GCM_KEY_192].init = mb_mgr->gcm192_init; - - ops[GCM_KEY_192].enc = mb_mgr->gcm192_enc; - ops[GCM_KEY_192].update_enc = mb_mgr->gcm192_enc_update; - ops[GCM_KEY_192].finalize_enc = mb_mgr->gcm192_enc_finalize; - - ops[GCM_KEY_192].dec = mb_mgr->gcm192_dec; - ops[GCM_KEY_192].update_dec = mb_mgr->gcm192_dec_update; - ops[GCM_KEY_192].finalize_dec = mb_mgr->gcm192_dec_finalize; - - ops[GCM_KEY_192].gmac_init = mb_mgr->gmac192_init; - ops[GCM_KEY_192].gmac_update = mb_mgr->gmac192_update; - ops[GCM_KEY_192].gmac_finalize = mb_mgr->gmac192_finalize; - - /* Set 256 bit function pointers. 
*/ - ops[GCM_KEY_256].pre = mb_mgr->gcm256_pre; - ops[GCM_KEY_256].init = mb_mgr->gcm256_init; - - ops[GCM_KEY_256].enc = mb_mgr->gcm256_enc; - ops[GCM_KEY_256].update_enc = mb_mgr->gcm256_enc_update; - ops[GCM_KEY_256].finalize_enc = mb_mgr->gcm256_enc_finalize; - - ops[GCM_KEY_256].dec = mb_mgr->gcm256_dec; - ops[GCM_KEY_256].update_dec = mb_mgr->gcm256_dec_update; - ops[GCM_KEY_256].finalize_dec = mb_mgr->gcm256_dec_finalize; - - ops[GCM_KEY_256].gmac_init = mb_mgr->gmac256_init; - ops[GCM_KEY_256].gmac_update = mb_mgr->gmac256_update; - ops[GCM_KEY_256].gmac_finalize = mb_mgr->gmac256_finalize; -} - -static int -aesni_gcm_session_configure(IMB_MGR *mb_mgr, void *session, - const struct rte_crypto_sym_xform *xform) -{ - struct aesni_gcm_session *sess = session; - const struct rte_crypto_sym_xform *auth_xform; - const struct rte_crypto_sym_xform *cipher_xform; - const struct rte_crypto_sym_xform *aead_xform; - - uint8_t key_length; - const uint8_t *key; - enum ipsec_mb_operation mode; - int ret = 0; - - ret = ipsec_mb_parse_xform(xform, &mode, &auth_xform, - &cipher_xform, &aead_xform); - if (ret) - return ret; - - /**< GCM key type */ - - sess->op = mode; - - switch (sess->op) { - case IPSEC_MB_OP_HASH_GEN_ONLY: - case IPSEC_MB_OP_HASH_VERIFY_ONLY: - /* AES-GMAC -* auth_xform = xform; -*/ - if (auth_xform->auth.algo != RTE_CRYPTO_AUTH_AES_GMAC) { - IPSEC_MB_LOG(ERR, - "Only AES GMAC is supported as an authentication only algorithm"); - ret = -ENOTSUP; - goto error_exit; - } - /* Set IV parameters */ -
[PATCH v5 1/2] drivers/net: fix buffer overflow for ptypes list
Address Sanitizer detects a buffer overflow caused by an incorrect ptypes list. Missing "RTE_PTYPE_UNKNOWN" ptype causes buffer overflow. Fix the ptypes list for drivers. Fixes: 0849ac3b6122 ("net/tap: add packet type management") Fixes: a7bdc3bd4244 ("net/dpaa: support packet type parsing") Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton") Fixes: f3f0d77db6b0 ("net/mrvl: support packet type parsing") Fixes: 71e8bb65046e ("net/nfp: update supported list of packet types") Fixes: 659b494d3d88 ("net/pfe: add packet types and basic statistics") Fixes: 398a1be14168 ("net/thunderx: remove generic passX references") Cc: pascal.ma...@6wind.com Cc: shreyansh.j...@nxp.com Cc: z...@semihalf.com Cc: t...@semihalf.com Cc: qin...@corigine.com Cc: g.si...@nxp.com Cc: jerin.ja...@caviumnetworks.com Cc: sta...@dpdk.org Signed-off-by: Sivaramakrishnan Venkat -- v5: modified commit message. v4: split into two patches, one for backporting and one for upstream rework v3: reworked the function to return number of elements and remove the need for RTE_PTYPE_UNKNOWN in list. v2: extended fix for multiple drivers. --- drivers/net/dpaa/dpaa_ethdev.c | 3 ++- drivers/net/mvneta/mvneta_ethdev.c | 3 ++- drivers/net/mvpp2/mrvl_ethdev.c | 3 ++- drivers/net/nfp/nfp_net_common.c| 1 + drivers/net/pfe/pfe_ethdev.c| 3 ++- drivers/net/tap/rte_eth_tap.c | 1 + drivers/net/thunderx/nicvf_ethdev.c | 2 ++ 7 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c index ef4c06db6a..779bdc5860 100644 --- a/drivers/net/dpaa/dpaa_ethdev.c +++ b/drivers/net/dpaa/dpaa_ethdev.c @@ -363,7 +363,8 @@ dpaa_supported_ptypes_get(struct rte_eth_dev *dev) RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, - RTE_PTYPE_TUNNEL_ESP + RTE_PTYPE_TUNNEL_ESP, + RTE_PTYPE_UNKNOWN }; PMD_INIT_FUNC_TRACE(); diff --git a/drivers/net/mvneta/mvneta_ethdev.c b/drivers/net/mvneta/mvneta_ethdev.c index daa69e533a..212c300c14 100644 --- a/drivers/net/mvneta/mvneta_ethdev.c +++ b/drivers/net/mvneta/mvneta_ethdev.c @@ -198,7 +198,8 @@ mvneta_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused) RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, - RTE_PTYPE_L4_UDP + RTE_PTYPE_L4_UDP, + RTE_PTYPE_UNKNOWN }; return ptypes; diff --git a/drivers/net/mvpp2/mrvl_ethdev.c b/drivers/net/mvpp2/mrvl_ethdev.c index c12364941d..4cc64c7cad 100644 --- a/drivers/net/mvpp2/mrvl_ethdev.c +++ b/drivers/net/mvpp2/mrvl_ethdev.c @@ -1777,7 +1777,8 @@ mrvl_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused) RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L2_ETHER_ARP, RTE_PTYPE_L4_TCP, - RTE_PTYPE_L4_UDP + RTE_PTYPE_L4_UDP, + RTE_PTYPE_UNKNOWN }; return ptypes; diff --git a/drivers/net/nfp/nfp_net_common.c b/drivers/net/nfp/nfp_net_common.c index e969b840d6..46d0e07850 100644 --- a/drivers/net/nfp/nfp_net_common.c +++ b/drivers/net/nfp/nfp_net_common.c @@ -1299,6 +1299,7 @@ nfp_net_supported_ptypes_get(struct rte_eth_dev *dev) RTE_PTYPE_INNER_L4_NONFRAG, RTE_PTYPE_INNER_L4_ICMP, RTE_PTYPE_INNER_L4_SCTP, + RTE_PTYPE_UNKNOWN }; if (dev->rx_pkt_burst != nfp_net_recv_pkts) diff --git a/drivers/net/pfe/pfe_ethdev.c b/drivers/net/pfe/pfe_ethdev.c index 551f3cf193..0073dd7405 100644 --- a/drivers/net/pfe/pfe_ethdev.c +++ b/drivers/net/pfe/pfe_ethdev.c @@ -520,7 +520,8 @@ pfe_supported_ptypes_get(struct rte_eth_dev *dev) RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, - RTE_PTYPE_L4_SCTP + RTE_PTYPE_L4_SCTP, + RTE_PTYPE_UNKNOWN }; if (dev->rx_pkt_burst == pfe_recv_pkts || diff --git 
a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c index b41fa971cb..3fa03cdbee 100644 --- a/drivers/net/tap/rte_eth_tap.c +++ b/drivers/net/tap/rte_eth_tap.c @@ -1803,6 +1803,7 @@ tap_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused) RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_SCTP, + RTE_PTYPE_UNKNOWN }; return ptypes; diff --git a/drivers/net/thunderx/nicvf_ethdev.c b/drivers/net/thunderx/nicvf_ethdev.c index a504d41dfe..5a0c3dc4a6 100644 --- a/drivers/net/thunderx/nicvf_ethdev.c +++ b/drivers/net/thunderx/nicvf_ethdev.c @@ -392,12 +392,14 @@ nicvf_dev_supported_ptypes_get(struct rte_eth_dev *dev) RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_FRAG, + RTE_PTYPE_UNKNOWN }; static const uint32_t ptypes
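For background on why the missing terminator overflows, a rough sketch of how the ethdev layer consumes the driver-provided list (simplified from rte_eth_dev_get_supported_ptypes(); the real function also filters and copies the entries):

#include <stdint.h>
#include <rte_mbuf_ptype.h>

/* The array is walked until RTE_PTYPE_UNKNOWN is found; a driver table
 * missing that sentinel sends the loop past the end of its static array,
 * which is the out-of-bounds read Address Sanitizer reports. */
static unsigned int
count_supported_ptypes(const uint32_t *all_ptypes)
{
	unsigned int i;

	for (i = 0; all_ptypes[i] != RTE_PTYPE_UNKNOWN; i++)
		;
	return i;
}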
[PATCH v5 2/2] drivers/net: return number of types in get supported types
Missing "RTE_PTYPE_UNKNOWN" ptype causes buffer overflow. Enhance code such that the dev_supported_ptypes_get() function pointer now returns the number of elements to eliminate the need for "RTE_PTYPE_UNKNOWN" as the last item. Signed-off-by: Sivaramakrishnan Venkat -- v5: - modified commit message. - tidied formatting of code. - added doxygen comment. v4: - split into two patches, one for backporting and another one for upstream rework. v3: - reworked the function to return number of elements and remove the need for RTE_PTYPE_UNKNOWN in list. v2: - extended fix for multiple drivers. --- drivers/net/atlantic/atl_ethdev.c | 10 ++ drivers/net/axgbe/axgbe_ethdev.c | 9 + drivers/net/bnxt/bnxt_ethdev.c | 4 ++-- drivers/net/cnxk/cnxk_ethdev.h | 2 +- drivers/net/cnxk/cnxk_lookup.c | 6 +++--- drivers/net/cpfl/cpfl_ethdev.c | 5 +++-- drivers/net/cxgbe/cxgbe_ethdev.c | 7 --- drivers/net/cxgbe/cxgbe_pfvf.h | 2 +- drivers/net/dpaa/dpaa_ethdev.c | 6 -- drivers/net/dpaa2/dpaa2_ethdev.c | 7 --- drivers/net/e1000/igb_ethdev.c | 10 ++ drivers/net/enetc/enetc_ethdev.c | 4 ++-- drivers/net/enic/enic_ethdev.c | 12 +++- drivers/net/failsafe/failsafe_ops.c| 4 ++-- drivers/net/fm10k/fm10k_ethdev.c | 8 drivers/net/hns3/hns3_rxtx.c | 11 ++- drivers/net/hns3/hns3_rxtx.h | 2 +- drivers/net/i40e/i40e_rxtx.c | 7 --- drivers/net/i40e/i40e_rxtx.h | 2 +- drivers/net/iavf/iavf_ethdev.c | 8 +--- drivers/net/ice/ice_dcf_ethdev.c | 4 ++-- drivers/net/ice/ice_rxtx.c | 11 ++- drivers/net/ice/ice_rxtx.h | 2 +- drivers/net/idpf/idpf_ethdev.c | 4 ++-- drivers/net/igc/igc_ethdev.c | 7 --- drivers/net/ionic/ionic_rxtx.c | 4 ++-- drivers/net/ionic/ionic_rxtx.h | 2 +- drivers/net/ixgbe/ixgbe_ethdev.c | 14 +- drivers/net/mana/mana.c| 4 ++-- drivers/net/mlx4/mlx4.h| 2 +- drivers/net/mlx4/mlx4_ethdev.c | 11 ++- drivers/net/mlx5/mlx5.h| 2 +- drivers/net/mlx5/mlx5_ethdev.c | 7 --- drivers/net/netvsc/hn_var.h| 2 +- drivers/net/netvsc/hn_vf.c | 4 ++-- drivers/net/nfp/nfp_net_common.c | 3 ++- drivers/net/nfp/nfp_net_common.h | 2 +- drivers/net/ngbe/ngbe_ethdev.c | 4 ++-- drivers/net/ngbe/ngbe_ethdev.h | 2 +- drivers/net/ngbe/ngbe_ptypes.c | 3 ++- drivers/net/ngbe/ngbe_ptypes.h | 2 +- drivers/net/octeontx/octeontx_ethdev.c | 7 --- drivers/net/pfe/pfe_ethdev.c | 6 -- drivers/net/qede/qede_ethdev.c | 7 --- drivers/net/sfc/sfc_dp_rx.h| 2 +- drivers/net/sfc/sfc_ef10.h | 2 +- drivers/net/sfc/sfc_ef100_rx.c | 4 ++-- drivers/net/sfc/sfc_ef10_rx.c | 6 +++--- drivers/net/sfc/sfc_ethdev.c | 4 ++-- drivers/net/sfc/sfc_rx.c | 4 ++-- drivers/net/tap/rte_eth_tap.c | 4 ++-- drivers/net/thunderx/nicvf_ethdev.c| 3 ++- drivers/net/txgbe/txgbe_ethdev.c | 4 ++-- drivers/net/txgbe/txgbe_ethdev.h | 2 +- drivers/net/txgbe/txgbe_ptypes.c | 4 ++-- drivers/net/txgbe/txgbe_ptypes.h | 2 +- drivers/net/vmxnet3/vmxnet3_ethdev.c | 9 + lib/ethdev/ethdev_driver.h | 18 -- lib/ethdev/rte_ethdev.c| 19 ++- 59 files changed, 188 insertions(+), 141 deletions(-) diff --git a/drivers/net/atlantic/atl_ethdev.c b/drivers/net/atlantic/atl_ethdev.c index 3a028f4290..2232f09fd9 100644 --- a/drivers/net/atlantic/atl_ethdev.c +++ b/drivers/net/atlantic/atl_ethdev.c @@ -43,7 +43,8 @@ static int atl_dev_stats_reset(struct rte_eth_dev *dev); static int atl_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size); -static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev); +static const uint32_t *atl_dev_supported_ptypes_get(struct rte_eth_dev *dev, + size_t *no_of_elements); static int atl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu); @@ -1132,7 
+1133,7 @@ atl_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) } static const uint32_t * -atl_dev_supported_ptypes_get(struct rte_eth_dev *dev) +atl_dev_supported_ptypes_get(struct rte_eth_dev *dev, size_t *no_of_elements) { static const uint32_t ptypes[] = { RTE_PTYPE_L2_ETHER, @@ -1144,11 +1145,12 @@ atl_dev_supported_ptypes_get(struct rte_eth_dev *dev) RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP, - RTE_PTYPE_UNKNOWN
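A self-contained sketch of the reworked callback convention introduced by this patch, modeled on the atlantic hunk above; the driver name and ptype list are placeholders:

#include <stdint.h>
#include <ethdev_driver.h>

static const uint32_t *
example_dev_supported_ptypes_get(struct rte_eth_dev *dev __rte_unused,
				 size_t *no_of_elements)
{
	static const uint32_t ptypes[] = {
		RTE_PTYPE_L2_ETHER,
		RTE_PTYPE_L3_IPV4,
		RTE_PTYPE_L3_IPV6,
		RTE_PTYPE_L4_TCP,
		RTE_PTYPE_L4_UDP,
	};

	/* Report the element count instead of terminating with RTE_PTYPE_UNKNOWN. */
	*no_of_elements = RTE_DIM(ptypes);
	return ptypes;
}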
Re: DTS testpmd and SCAPY integration
Hi folks,

Let me summarize yesterday's discussion in a few key points:

- Greg's proposal aims at simplicity and is useful mainly for test cases which can be written in a few minutes. More complex test cases are not suitable for the YAML approach.
- The above implies that the YAML based test cases would be supported alongside the existing approach. This fast way to implement simple test cases would likely be a valuable addition.
- The big picture idea behind the YAML test cases is to take an application with interactive input, send commands, collect output and compare the output with expected string(s).
- Greg may be able to make the code available and may assess how to integrate it with DTS.

Regards,
Juraj

On Mon, Jan 8, 2024 at 6:36 PM Honnappa Nagarahalli < honnappa.nagaraha...@arm.com> wrote: > > > > > > Hello Honnappa, > > [snip] > > > Hi Gregory, > >I do not fully understand your proposal, it will be helpful to join the DTS > meetings to discuss this further. > > > > Agree, let's discuss the proposal details during the DTS meeting. > > > YAML has wide support built around it. By using our own text format, we will > have to build the parsing support etc ourselves. > > > > However, YAML is supposed to be easy to read and understand. Is it > just a > > matter for getting used to it? > > > > > I selected YAML for 2 reasons: >* Plain and intuitive YAML format minimized test meta data. > By the meta data I refer to control tags and markup characters > that are not test commands. >* YAML has Python parser. > I have mis-understood your proposal. I agree with your above comments. > +1 for the proposal. > > > > Regards, > > Gregory > > > >
Re: [Bug 1368] inconsistency in eventdev dev_info and config structs makes some valid configs impossible
On Thu, Jan 18, 2024 at 11:26:45AM +, Bruce Richardson wrote: > On Thu, Jan 18, 2024 at 11:18:58AM +, bugzi...@dpdk.org wrote: > >Bug ID [1]1368 > >Summary inconsistency in eventdev dev_info and config structs makes > >some valid configs impossible > >Product DPDK > >Version unspecified > >Hardware All > >OS All > >Status UNCONFIRMED > >Severity normal > >Priority Normal > >Component eventdev > >Assignee dev@dpdk.org > >Reporter bruce.richard...@intel.com > >Target Milestone --- > > > > In the rte_event_dev_info struct[1], we have the max_event_queues[2] and > > max_single_link_event_port_queue_pairs[3] members. The doxygen docs on the > > latter states: "These ports and queues are not accounted for in > > max_event_ports or max_event_queues." > > > > This implies that a device which has 8 regular queues and an extra 8 > > single-link only queues, would report max_event_queues == 8, and > > max_single_link_event_port_queue_pairs == 8 on return from > > rte_event_dev_info_get() function. > > > > Those values returned from info_get are generally to be used to guide the > > configuration using rte_event_dev_configure() API, which takes the > > rte_event_dev_config[4] struct. This has two similar fields, in > > nb_event_queues[5] and nb_single_link_event_port_queues[6]. However, a > > problem arises in that the documentation states that nb_event_queues cannot > > be greater than the previously reported max_event_queues (which by itself > > makes sense), but the documentation also states that > > nb_single_link_event_port_queues is a subset of the overall event ports and > > queues, and cannot be greater than the nb_event_queues given in the same > > config structure. > > > > To illustrate the issue by continuing to use the same example as above, > > suppose an app wants to take that device with 8 regular queues and 8 single > > link ones, and have an app with 2 shared processing queues, e.g. for > > load-balancing packets/events among 8 cores, but also wants to use the 8 > > single link queues to allow sending packets/events directly to each core > > without load balancing. In this 2 + 8 scenario, there is no valid > > dev_config struct settings that will work: > > * making the 8 a subset of the nb_event_queues, means that nb_event_queues > > is 10, which is greater than max_event_queues and so invalid. > > * keeping them separate, so that nb_event_queues == 2 and > > nb_single_link_port_queues == 8 violates the constraint that the > > single_link value cannot exceed the former nb_event_queues value. > > > > We therefore need to adjust the constraints to make things work. Now we can > > do so, while keeping the single_link value *not included* in the > > total-count in dev_info, but have it *included* in the config struct, but > > such a setup is very confusing for the user. Therefore, I think instead we > > need to correct this by aligning the two structures - either the > > single_link queues are included in the queue/port counts in both structs, > > or they aren't included. > > > > Since I'm doing some clean-up work on rte_eventdev.h doxygen comments, I'm > happy enough to push a patch to help fix this, if we can agree on the > solution. 
> > Of the two possibilities (make both have single-link included in count, or > make both have single-link not-included), I would suggest we go for having > them included, on the basis that that would involve an internal DPDK change > to increase the reported counts in dev_info from any drivers supporting > single link queues, but should *not* involve any changes to end > applications, which would already be specifying the values based on > nb_single_link being a subset of nb_event_queues. On the other hand, > changing semantics of the config struct fields would likely mean changes to > end-apps and so be an ABI/API break. > Checking the implementation in the eventdev.c file, I find that (unsurprisingly) the implementation doesn't correspond to the documentation. For the problematic configuration described above, it is actually possible to implement, since the API checks that nb_event_queues (and nb_event_ports) is < max_event_queues + max_single_link_queues. I will patch the documentation in the header to reflect this, but I still think we should look to change this in future as it's rather inconsistent. Regards, /Bruce
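To make the above concrete, here is a rough sketch (not taken from any patch, and assuming a device that reports max_event_queues = 8, max_event_ports = 8 and max_single_link_event_port_queue_pairs = 8) of the "2 load-balanced queues + 8 single-link pairs" setup under the check as actually implemented, i.e. with the totals bounded by the sum of the two maxima. Error handling and capability validation are elided.

#include <rte_eventdev.h>

static int
configure_2_plus_8(uint8_t dev_id)
{
        struct rte_event_dev_info info;
        struct rte_event_dev_config cfg = { 0 };

        if (rte_event_dev_info_get(dev_id, &info) < 0)
                return -1;

        /*
         * 2 load-balanced queues plus 8 single-link queue-port pairs,
         * and 8 worker ports plus the same 8 single-link ports.
         * With the check as implemented, the totals are compared against
         * max_event_queues + max_single_link_event_port_queue_pairs
         * (and likewise for ports), so 10 and 16 are accepted here.
         */
        cfg.nb_event_queues = 2 + 8;
        cfg.nb_event_ports = 8 + 8;
        cfg.nb_single_link_event_port_queues = 8;

        /* Remaining fields taken straight from the reported capabilities. */
        cfg.nb_events_limit = info.max_num_events;
        cfg.nb_event_queue_flows = info.max_event_queue_flows;
        cfg.nb_event_port_dequeue_depth = info.max_event_port_dequeue_depth;
        cfg.nb_event_port_enqueue_depth = info.max_event_port_enqueue_depth;
        cfg.dequeue_timeout_ns = info.min_dequeue_timeout_ns;

        return rte_event_dev_configure(dev_id, &cfg);
}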
RE: [dpdk-dev] [v1] ethdev: support Tx queue used count
> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com] > Sent: Thursday, 18 January 2024 11.17 > > Hi Jerin, > > > > > Introduce a new API to retrieve the number of used descriptors > > > > in a Tx queue. Applications can leverage this API in the fast > path to > > > > inspect the Tx queue occupancy and take appropriate actions based > on the > > > > available free descriptors. > > > > > > > > A notable use case could be implementing Random Early Discard > (RED) > > > > in software based on Tx queue occupancy. > > > > > > > > Signed-off-by: Jerin Jacob > > > > > > @@ -6803,6 +6803,80 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, > uint16_t rx_queue_id, > > > > __rte_experimental > > > > int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t > port_id, uint32_t *ptypes, int num); > > > > > > > > +/** > > > > + * @warning > > > > + * @b EXPERIMENTAL: this API may change, or be removed, without > prior notice > > > > + * > > > > + * Get the number of used descriptors of a Tx queue > > > > + * > > > > + * This function retrieves the number of used descriptors of a > transmit queue. > > > > + * Applications can use this API in the fast path to inspect Tx > queue occupancy and take > > > > + * appropriate actions based on the available free descriptors. > > > > + * An example action could be implementing the Random Early > Discard (RED). > > > > > > Sorry, I probably misunderstood your previous mails, but wouldn't > it be more convenient > > > for user to have rte_eth_tx_queue_free_count(...) as fast-op, and > > > have rte_eth_tx_queue_count(...) { queue_txd_num - > rte_eth_tx_queue_free_count(...);} > > > as a slow-path function in rte_ethdev.c? > > > > The general feedback is to align with the Rx queue API, specifically > > rte_eth_rx_queue_count, > > and it's noted that there is no equivalent > rte_eth_rx_queue_free_count. > > > > Given that the free count can be obtained by subtracting the used > > count from queue_txd_num, > > it is considered that either approach is acceptable. > > > > The application configures queue_txd_num with tx_queue_setup(), and > > the application can store that value in its structure. > > This would enable fast-path usage for both base cases (whether the > > application needs information about free or used descriptors) > > with just one API(rte_eth_tx_queue_count()) > > Right now I don't use these functions, but if I think what most people > are interested in: > - how many packets you can receive immediately (rx_queue_count) Agreed that "used" (not "free") is the preferred info for RX. > - how many packets you can transmit immediately (tx_queue_free_count) > Sure, I understand that user can store txd_num somewhere and then do > subtraction himself. > Though it means more effort for the user, and the only reason for that, > as I can see, > is to have RX and TX function naming symmetric. > Which seems much less improtant to me comparing to user convenience. I agree 100 % with your prioritization: Usability has higher priority than symmetric naming. So here are some example use cases supporting the TX "Used" API: - RED (and similar queueing algorithms) need to know how many packets the queue holds (not how much room the queue has for more packets). - Load Balancing across multiple links, in use cases where packet reordering is allowed. - Monitoring egress queueing, especially in many-to-one-port traffic patterns, e.g. to catch micro-burst induced spikes (which may cause latency/"bufferbloat"). 
- The (obsolete) ifOutQLen object in the Interfaces MIB for SNMP, which I suppose was intended for monitoring egress queueing. > Anyway, as I stated above, I don't use these functions right now, > so if the majority of users are happy with current approach, I would > not insist :) I'm very happy with the current approach. :-)
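As an illustration of the first use case listed above, here is a minimal sketch of a RED-style early drop built on the proposed API. The signature of rte_eth_tx_queue_count() is assumed to mirror rte_eth_rx_queue_count() (returning the used-descriptor count, or a negative errno); nb_txd is the ring size the application passed to rte_eth_tx_queue_setup(), and the 75% threshold and drop probability are arbitrary.

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_random.h>

static uint16_t
tx_with_early_drop(uint16_t port, uint16_t queue, uint16_t nb_txd,
                struct rte_mbuf **pkts, uint16_t nb_pkts)
{
        int used = rte_eth_tx_queue_count(port, queue);

        /*
         * Drop ~25% of bursts once the queue is more than 75% full.
         * A free-descriptor count, if preferred, is simply nb_txd - used.
         */
        if (used > 0 && (unsigned int)used > (3u * nb_txd) / 4 &&
                        (rte_rand() & 0x3) == 0) {
                rte_pktmbuf_free_bulk(pkts, nb_pkts);
                return 0;
        }

        return rte_eth_tx_burst(port, queue, pkts, nb_pkts);
}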
[PATCH v1 1/7] eventdev: improve doxygen introduction text
Make some textual improvements to the introduction to eventdev and event devices in the eventdev header file. This text appears in the doxygen output for the header file, and introduces the key concepts, for example: events, event devices, queues, ports and scheduling. This patch makes the following improvements: * small textual fixups, e.g. correcting use of singular/plural * rewrites of some sentences to improve clarity * using doxygen markdown to split the whole large block up into sections, thereby making it easier to read. No large-scale changes are made, and blocks are not reordered Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 112 +--- 1 file changed, 66 insertions(+), 46 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index ec9b02455d..a36c89c7a4 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -12,12 +12,13 @@ * @file * * RTE Event Device API + * * * In a polling model, lcores poll ethdev ports and associated rx queues - * directly to look for packet. In an event driven model, by contrast, lcores - * call the scheduler that selects packets for them based on programmer - * specified criteria. Eventdev library adds support for event driven - * programming model, which offer applications automatic multicore scaling, + * directly to look for packets. In an event driven model, in contrast, lcores + * call a scheduler that selects packets for them based on programmer + * specified criteria. The eventdev library adds support for the event driven + * programming model, which offers applications automatic multicore scaling, * dynamic load balancing, pipelining, packet ingress order maintenance and * synchronization services to simplify application packet processing. * @@ -25,12 +26,15 @@ * * - The application-oriented Event API that includes functions to setup * an event device (configure it, setup its queues, ports and start it), to - * establish the link between queues to port and to receive events, and so on. + * establish the links between queues and ports to receive events, and so on. * * - The driver-oriented Event API that exports a function allowing - * an event poll Mode Driver (PMD) to simultaneously register itself as + * an event poll Mode Driver (PMD) to register itself as * an event device driver. * + * Application-oriented Event API + * -- + * * Event device components: * * +-+ @@ -75,27 +79,33 @@ *| | *+---+ * - * Event device: A hardware or software-based event scheduler. + * **Event device**: A hardware or software-based event scheduler. * - * Event: A unit of scheduling that encapsulates a packet or other datatype - * like SW generated event from the CPU, Crypto work completion notification, - * Timer expiry event notification etc as well as metadata. - * The metadata includes flow ID, scheduling type, event priority, event_type, + * **Event**: A unit of scheduling that encapsulates a packet or other datatype, + * such as: SW generated event from the CPU, crypto work completion notification, + * timer expiry event notification etc., as well as metadata about the packet or data. + * The metadata includes a flow ID (if any), scheduling type, event priority, event_type, * sub_event_type etc. * - * Event queue: A queue containing events that are scheduled by the event dev. + * **Event queue**: A queue containing events that are scheduled by the event device. * An event queue contains events of different flows associated with scheduling * types, such as atomic, ordered, or parallel. 
+ * Each event given to an eventdev must have a valid event queue id field in the metadata, + * to specify on which event queue in the device the event must be placed, + * for later scheduling to a core. * - * Event port: An application's interface into the event dev for enqueue and + * **Event port**: An application's interface into the event dev for enqueue and * dequeue operations. Each event port can be linked with one or more * event queues for dequeue operations. - * - * By default, all the functions of the Event Device API exported by a PMD - * are lock-free functions which assume to not be invoked in parallel on - * different logical cores to work on the same target object. For instance, - * the dequeue function of a PMD cannot be invoked in parallel on two logical - * cores to operates on same event port. Of course, this function + * Each port should be associated with a single core (enqueue and dequeue is not thread-safe). + * To schedule events to a core, the event device will schedule them to the event port(s) + * being polled by that core. + * + * *NOTE*: By default, all the func
[PATCH v1 2/7] eventdev: move text on driver internals to proper section
Inside the doxygen introduction text, some internal details of how eventdev works was mixed in with application-relevant details. Move these details on probing etc. to the driver-relevant section. Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index a36c89c7a4..949e957f1b 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -112,22 +112,6 @@ * In all functions of the Event API, the Event device is * designated by an integer >= 0 named the device identifier *dev_id* * - * At the Event driver level, Event devices are represented by a generic - * data structure of type *rte_event_dev*. - * - * Event devices are dynamically registered during the PCI/SoC device probing - * phase performed at EAL initialization time. - * When an Event device is being probed, an *rte_event_dev* structure is allocated - * for it and the event_dev_init() function supplied by the Event driver - * is invoked to properly initialize the device. - * - * The role of the device init function is to reset the device hardware or - * to initialize the software event driver implementation. - * - * If the device init operation is successful, the device is assigned a device - * id (dev_id) for application use. - * Otherwise, the *rte_event_dev* structure is freed. - * * The functions exported by the application Event API to setup a device * must be invoked in the following order: * - rte_event_dev_configure() @@ -163,6 +147,22 @@ * Driver-Oriented Event API * - * + * At the Event driver level, Event devices are represented by a generic + * data structure of type *rte_event_dev*. + * + * Event devices are dynamically registered during the PCI/SoC device probing + * phase performed at EAL initialization time. + * When an Event device is being probed, an *rte_event_dev* structure is allocated + * for it and the event_dev_init() function supplied by the Event driver + * is invoked to properly initialize the device. + * + * The role of the device init function is to reset the device hardware or + * to initialize the software event driver implementation. + * + * If the device init operation is successful, the device is assigned a device + * id (dev_id) for application use. + * Otherwise, the *rte_event_dev* structure is freed. + * * Each function of the application Event API invokes a specific function * of the PMD that controls the target device designated by its device * identifier. -- 2.40.1
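For readers following the reorganised introduction, here is a minimal sketch of the setup steps that come after rte_event_dev_configure(), in the order the text describes (queue setup, port setup, linking, start). It assumes the device was configured with at least one queue and one port, and relies on the documented behaviour that NULL configuration pointers select the defaults; error handling is minimal.

#include <rte_eventdev.h>

static int
eventdev_setup_and_start(uint8_t dev_id)
{
        /* NULL conf pointers request the default queue/port configuration. */
        if (rte_event_queue_setup(dev_id, 0, NULL) < 0)
                return -1;
        if (rte_event_port_setup(dev_id, 0, NULL) < 0)
                return -1;

        /* A NULL queue list links the port to all configured queues. */
        if (rte_event_port_link(dev_id, 0, NULL, NULL, 0) < 0)
                return -1;

        return rte_event_dev_start(dev_id);
}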
[PATCH v1 4/7] eventdev: cleanup doxygen comments on info structure
Some small rewording changes to the doxygen comments on struct rte_event_dev_info. Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.c | 2 +- lib/eventdev/rte_eventdev.h | 46 - 2 files changed, 26 insertions(+), 22 deletions(-) diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c index 94628a66ef..9bf7c7be89 100644 --- a/lib/eventdev/rte_eventdev.c +++ b/lib/eventdev/rte_eventdev.c @@ -83,7 +83,7 @@ rte_event_dev_socket_id(uint8_t dev_id) rte_eventdev_trace_socket_id(dev_id, dev, dev->data->socket_id); - return dev->data->socket_id; + return dev->data->socket_id < 0 ? 0 : dev->data->socket_id; } int diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 57a2791946..872f241df2 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -482,54 +482,58 @@ struct rte_event_dev_info { const char *driver_name;/**< Event driver name */ struct rte_device *dev; /**< Device information */ uint32_t min_dequeue_timeout_ns; - /**< Minimum supported global dequeue timeout(ns) by this device */ + /**< Minimum global dequeue timeout(ns) supported by this device */ uint32_t max_dequeue_timeout_ns; - /**< Maximum supported global dequeue timeout(ns) by this device */ + /**< Maximum global dequeue timeout(ns) supported by this device */ uint32_t dequeue_timeout_ns; /**< Configured global dequeue timeout(ns) for this device */ uint8_t max_event_queues; - /**< Maximum event_queues supported by this device */ + /**< Maximum event queues supported by this device */ uint32_t max_event_queue_flows; - /**< Maximum supported flows in an event queue by this device*/ + /**< Maximum number of flows within an event queue supported by this device*/ uint8_t max_event_queue_priority_levels; /**< Maximum number of event queue priority levels by this device. -* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability +* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability. +* The priority levels are evenly distributed between +* @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref RTE_EVENT_DEV_PRIORITY_LOWEST. */ uint8_t max_event_priority_levels; /**< Maximum number of event priority levels by this device. * Valid when the device has RTE_EVENT_DEV_CAP_EVENT_QOS capability +* The priority levels are evenly distributed between +* @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref RTE_EVENT_DEV_PRIORITY_LOWEST. */ uint8_t max_event_ports; /**< Maximum number of event ports supported by this device */ uint8_t max_event_port_dequeue_depth; - /**< Maximum number of events can be dequeued at a time from an -* event port by this device. -* A device that does not support bulk dequeue will set this as 1. + /**< Maximum number of events that can be dequeued at a time from an event port +* on this device. +* A device that does not support bulk dequeue will set this to 1. */ uint32_t max_event_port_enqueue_depth; - /**< Maximum number of events can be enqueued at a time from an -* event port by this device. -* A device that does not support bulk enqueue will set this as 1. + /**< Maximum number of events that can be enqueued at a time to an event port +* on this device. +* A device that does not support bulk enqueue will set this to 1. */ uint8_t max_event_port_links; - /**< Maximum number of queues that can be linked to a single event -* port by this device. + /**< Maximum number of queues that can be linked to a single event port on this device. 
*/ int32_t max_num_events; /**< A *closed system* event dev has a limit on the number of events it -* can manage at a time. An *open system* event dev does not have a -* limit and will specify this as -1. +* can manage at a time. +* Once the number of events tracked by an eventdev exceeds this number, +* any enqueues of NEW events will fail. +* An *open system* event dev does not have a limit and will specify this as -1. */ uint32_t event_dev_cap; - /**< Event device capabilities(RTE_EVENT_DEV_CAP_)*/ + /**< Event device capabilities flags (RTE_EVENT_DEV_CAP_*) */ uint8_t max_single_link_event_port_queue_pairs; - /**< Maximum number of event ports and queues that are optimized for -* (and only capable of) single-link configurations supported by this -* device. These ports and queues are not accounted for in -* max_event_ports or max_event_queues. + /**< Maximum number of event ports and queues, supported by this device, +
[PATCH v1 5/7] eventdev: improve function documentation for query fns
General improvements to the doxygen docs for eventdev functions for querying basic information: * number of devices * id for a particular device * socket id of device * capability information for a device Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 872f241df2..c57c93a22e 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -440,8 +440,7 @@ struct rte_event; */ /** - * Get the total number of event devices that have been successfully - * initialised. + * Get the total number of event devices available for application use. * * @return * The total number of usable event devices. @@ -456,8 +455,10 @@ rte_event_dev_count(void); * Event device name to select the event device identifier. * * @return - * Returns event device identifier on success. - * - <0: Failure to find named event device. + * Event device identifier (dev_id >= 0) on success. + * Negative error code on failure: + * - -EINVAL - input name parameter is invalid + * - -ENODEV - no event device found with that name */ int rte_event_dev_get_dev_id(const char *name); @@ -470,7 +471,8 @@ rte_event_dev_get_dev_id(const char *name); * @return * The NUMA socket id to which the device is connected or * a default of zero if the socket could not be determined. - * -(-EINVAL) dev_id value is out of range. + * -EINVAL on error, where the given dev_id value does not + * correspond to any event device. */ int rte_event_dev_socket_id(uint8_t dev_id); @@ -539,18 +541,20 @@ struct rte_event_dev_info { }; /** - * Retrieve the contextual information of an event device. + * Retrieve details of an event device's capabilities and configuration limits. * * @param dev_id * The identifier of the device. * * @param[out] dev_info * A pointer to a structure of type *rte_event_dev_info* to be filled with the - * contextual information of the device. + * information about the device's capabilities. * * @return - * - 0: Success, driver updates the contextual information of the event device - * - <0: Error code returned by the driver info get function. + * - 0: Success, information about the event device is present in dev_info. + * - <0: Failure, error code returned by the function. + * - -EINVAL - invalid input parameters, e.g. incorrect device id + * - -ENOTSUP - device does not support returning capabilities information */ int rte_event_dev_info_get(uint8_t dev_id, struct rte_event_dev_info *dev_info); -- 2.40.1
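To show the query functions covered by this patch in context, here is a small hypothetical lookup sequence; the vdev name "event_sw0" is an example only, and error handling is kept minimal.

#include <stdio.h>
#include <rte_eventdev.h>

static int
find_eventdev(void)
{
        struct rte_event_dev_info info;
        int dev_id;

        if (rte_event_dev_count() == 0)
                return -1;

        dev_id = rte_event_dev_get_dev_id("event_sw0");
        if (dev_id < 0)
                return dev_id; /* -EINVAL or -ENODEV per the updated docs */

        if (rte_event_dev_info_get(dev_id, &info) < 0)
                return -1;

        printf("eventdev %d on socket %d: %d queues, %d ports\n",
                        dev_id, rte_event_dev_socket_id(dev_id),
                        info.max_event_queues, info.max_event_ports);
        return dev_id;
}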
[PATCH v1 3/7] eventdev: update documentation on device capability flags
Update the device capability docs, to: * include more cross-references * split longer text into paragraphs, in most cases with each flag having a single-line summary at the start of the doc block * general comment rewording and clarification as appropriate Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 130 ++-- 1 file changed, 93 insertions(+), 37 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 949e957f1b..57a2791946 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -243,143 +243,199 @@ struct rte_event; /* Event device capability bitmap flags */ #define RTE_EVENT_DEV_CAP_QUEUE_QOS (1ULL << 0) /**< Event scheduling prioritization is based on the priority and weight - * associated with each event queue. Events from a queue with highest priority - * is scheduled first. If the queues are of same priority, weight of the queues + * associated with each event queue. + * + * Events from a queue with highest priority + * are scheduled first. If the queues are of same priority, weight of the queues * are considered to select a queue in a weighted round robin fashion. * Subsequent dequeue calls from an event port could see events from the same * event queue, if the queue is configured with an affinity count. Affinity * count is the number of subsequent dequeue calls, in which an event port * should use the same event queue if the queue is non-empty * + * NOTE: A device may use both queue prioritization and event prioritization + * (@ref RTE_EVENT_DEV_CAP_EVENT_QOS capability) when making packet scheduling decisions. + * * @see rte_event_queue_setup(), rte_event_queue_attr_set() */ #define RTE_EVENT_DEV_CAP_EVENT_QOS (1ULL << 1) /**< Event scheduling prioritization is based on the priority associated with - * each event. Priority of each event is supplied in *rte_event* structure + * each event. + * + * Priority of each event is supplied in *rte_event* structure * on each enqueue operation. + * If this capability is not set, the priority field of the event structure + * is ignored for each event. * + * NOTE: A device may use both queue prioritization (@ref RTE_EVENT_DEV_CAP_QUEUE_QOS capability) + * and event prioritization when making packet scheduling decisions. + * @see rte_event_enqueue_burst() */ #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED (1ULL << 2) /**< Event device operates in distributed scheduling mode. + * * In distributed scheduling mode, event scheduling happens in HW or - * rte_event_dequeue_burst() or the combination of these two. + * rte_event_dequeue_burst() / rte_event_enqueue_burst() or the combination of these two. * If the flag is not set then eventdev is centralized and thus needs a * dedicated service core that acts as a scheduling thread . * - * @see rte_event_dequeue_burst() + * @see rte_event_dev_service_id_get */ #define RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES (1ULL << 3) /**< Event device is capable of enqueuing events of any type to any queue. + * * If this capability is not set, the queue only supports events of the - * *RTE_SCHED_TYPE_* type that it was created with. + * *RTE_SCHED_TYPE_* type that it was created with. + * Any events of other types scheduled to the queue will handled in an + * implementation-dependent manner. They may be dropped by the + * event device, or enqueued with the scheduling type adjusted to the + * correct/supported value. 
* - * @see RTE_SCHED_TYPE_* values + * @see rte_event_enqueue_burst + * @see RTE_SCHED_TYPE_ATOMIC RTE_SCHED_TYPE_ORDERED RTE_SCHED_TYPE_PARALLEL */ #define RTE_EVENT_DEV_CAP_BURST_MODE (1ULL << 4) /**< Event device is capable of operating in burst mode for enqueue(forward, - * release) and dequeue operation. If this capability is not set, application - * still uses the rte_event_dequeue_burst() and rte_event_enqueue_burst() but - * PMD accepts only one event at a time. + * release) and dequeue operation. + * + * If this capability is not set, application + * can still use the rte_event_dequeue_burst() and rte_event_enqueue_burst() but + * PMD accepts or returns only one event at a time. * * @see rte_event_dequeue_burst() rte_event_enqueue_burst() */ #define RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE(1ULL << 5) /**< Event device ports support disabling the implicit release feature, in * which the port will release all unreleased events in its dequeue operation. + * * If this capability is set and the port is configured with implicit release * disabled, the application is responsible for explicitly releasing events - * using either the RTE_EVENT_OP_FORWARD or the RTE_EVENT_OP_RELEASE event + * using either the @ref RTE_EVENT_OP_FORWARD or the @ref RTE_EVENT_OP_RELEASE event * enqueue operations. * * @see rte_event_dequeue_burst() rte_event_enqueue_burst() */ #define RTE_EVENT_DEV_CAP_NONSEQ_M
[PATCH v1 6/7] eventdev: improve doxygen comments on configure struct
General rewording and cleanup on the rte_event_dev_config structure. Improved the wording of some sentences and created linked cross-references out of the existing references to the dev_info structure. Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 47 +++-- 1 file changed, 24 insertions(+), 23 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index c57c93a22e..4139ccb982 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -599,9 +599,9 @@ rte_event_dev_attr_get(uint8_t dev_id, uint32_t attr_id, struct rte_event_dev_config { uint32_t dequeue_timeout_ns; /**< rte_event_dequeue_burst() timeout on this device. -* This value should be in the range of *min_dequeue_timeout_ns* and -* *max_dequeue_timeout_ns* which previously provided in -* rte_event_dev_info_get() +* This value should be in the range of @ref rte_event_dev_info.min_dequeue_timeout_ns and +* @ref rte_event_dev_info.max_dequeue_timeout_ns returned by +* @ref rte_event_dev_info_get() * The value 0 is allowed, in which case, default dequeue timeout used. * @see RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT */ @@ -609,40 +609,41 @@ struct rte_event_dev_config { /**< In a *closed system* this field is the limit on maximum number of * events that can be inflight in the eventdev at a given time. The * limit is required to ensure that the finite space in a closed system -* is not overwhelmed. The value cannot exceed the *max_num_events* -* as provided by rte_event_dev_info_get(). +* is not overwhelmed. +* Once the limit has been reached, any enqueues of NEW events to the +* system will fail. +* The value cannot exceed @ref rte_event_dev_info.max_num_events +* returned by rte_event_dev_info_get(). * This value should be set to -1 for *open system*. */ uint8_t nb_event_queues; /**< Number of event queues to configure on this device. -* This value cannot exceed the *max_event_queues* which previously -* provided in rte_event_dev_info_get() +* This value cannot exceed @ref rte_event_dev_info.max_event_queues +* returned by rte_event_dev_info_get() */ uint8_t nb_event_ports; /**< Number of event ports to configure on this device. -* This value cannot exceed the *max_event_ports* which previously -* provided in rte_event_dev_info_get() +* This value cannot exceed @ref rte_event_dev_info.max_event_ports +* returned by rte_event_dev_info_get() */ uint32_t nb_event_queue_flows; - /**< Number of flows for any event queue on this device. -* This value cannot exceed the *max_event_queue_flows* which previously -* provided in rte_event_dev_info_get() + /**< Max number of flows needed for a single event queue on this device. +* This value cannot exceed @ref rte_event_dev_info.max_event_queue_flows +* returned by rte_event_dev_info_get() */ uint32_t nb_event_port_dequeue_depth; - /**< Maximum number of events can be dequeued at a time from an -* event port by this device. -* This value cannot exceed the *max_event_port_dequeue_depth* -* which previously provided in rte_event_dev_info_get(). + /**< Max number of events that can be dequeued at a time from an event port on this device. +* This value cannot exceed @ref rte_event_dev_info.max_event_port_dequeue_depth +* returned by rte_event_dev_info_get(). * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE capable. 
-* @see rte_event_port_setup() +* @see rte_event_port_setup() rte_event_dequeue_burst() */ uint32_t nb_event_port_enqueue_depth; - /**< Maximum number of events can be enqueued at a time from an -* event port by this device. -* This value cannot exceed the *max_event_port_enqueue_depth* -* which previously provided in rte_event_dev_info_get(). + /**< Maximum number of events can be enqueued at a time to an event port on this device. +* This value cannot exceed @ref rte_event_dev_info.max_event_port_enqueue_depth +* returned by rte_event_dev_info_get(). * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE capable. -* @see rte_event_port_setup() +* @see rte_event_port_setup() rte_event_enqueue_burst() */ uint32_t event_dev_cfg; /**< Event device config flags(RTE_EVENT_DEV_CFG_)*/ @@ -652,7 +653,7 @@ struct rte_event_dev_config { * queues; this value cannot exceed *nb_event_ports* or * *nb_event_queues*. If the device has ports and queues that are * optimized for single-link usage, this field is a hint for
[PATCH v1 7/7] eventdev: fix documentation for counting single-link ports
The documentation of how single-link port-queue pairs were counted in the rte_event_dev_config structure did not match the actual implementation and, if following the documentation, certain valid port/queue configurations would have been impossible to configure. Fix this by changing the documentation to match the implementation - however confusing that implementation ends up being. Bugzilla ID: 1368 Fixes: 75d113136f38 ("eventdev: express DLB/DLB2 PMD constraints") Cc: sta...@dpdk.org Signed-off-by: Bruce Richardson --- lib/eventdev/rte_eventdev.h | 28 ++-- 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h index 4139ccb982..3b8f5b8101 100644 --- a/lib/eventdev/rte_eventdev.h +++ b/lib/eventdev/rte_eventdev.h @@ -490,7 +490,10 @@ struct rte_event_dev_info { uint32_t dequeue_timeout_ns; /**< Configured global dequeue timeout(ns) for this device */ uint8_t max_event_queues; - /**< Maximum event queues supported by this device */ + /**< Maximum event queues supported by this device. +* This excludes any queue-port pairs covered by the +* *max_single_link_event_port_queue_pairs* value in this structure. +*/ uint32_t max_event_queue_flows; /**< Maximum number of flows within an event queue supported by this device*/ uint8_t max_event_queue_priority_levels; @@ -506,7 +509,10 @@ struct rte_event_dev_info { * @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref RTE_EVENT_DEV_PRIORITY_LOWEST. */ uint8_t max_event_ports; - /**< Maximum number of event ports supported by this device */ + /**< Maximum number of event ports supported by this device +* This excludes any queue-port pairs covered by the +* *max_single_link_event_port_queue_pairs* value in this structure. +*/ uint8_t max_event_port_dequeue_depth; /**< Maximum number of events that can be dequeued at a time from an event port * on this device. @@ -618,13 +624,23 @@ struct rte_event_dev_config { */ uint8_t nb_event_queues; /**< Number of event queues to configure on this device. -* This value cannot exceed @ref rte_event_dev_info.max_event_queues -* returned by rte_event_dev_info_get() +* This value *includes* any single-link queue-port pairs to be used. +* This value cannot exceed @ref rte_event_dev_info.max_event_queues + +* @ref rte_event_dev_info.max_single_link_event_port_queue_pairs +* returned by rte_event_dev_info_get(). +* The number of non-single-link queues i.e. this value less +* *nb_single_link_event_port_queues* in this struct, cannot exceed +* @ref rte_event_dev_info.max_event_queues */ uint8_t nb_event_ports; /**< Number of event ports to configure on this device. -* This value cannot exceed @ref rte_event_dev_info.max_event_ports -* returned by rte_event_dev_info_get() +* This value *includes* any single-link queue-port pairs to be used. +* This value cannot exceed @ref rte_event_dev_info.max_event_ports + +* @ref rte_event_dev_info.max_single_link_event_port_queue_pairs +* returned by rte_event_dev_info_get(). +* The number of non-single-link ports i.e. this value less +* *nb_single_link_event_port_queues* in this struct, cannot exceed +* @ref rte_event_dev_info.max_event_ports */ uint32_t nb_event_queue_flows; /**< Max number of flows needed for a single event queue on this device. -- 2.40.1
[PATCH v1 0/7] improve eventdev API specification/documentation
This patchset makes small rewording improvements to the eventdev doxygen documentation to try and ensure that it is as clear as possible, describes the implementation as accurately as possible, and is consistent within itself. Most changes are just minor rewordings, along with plenty of changes to change references into doxygen links/cross-references. For now I am approx 1/4 way through reviewing the rte_eventdev.h file, but sending v1 now to get any reviews started. Bruce Richardson (7): eventdev: improve doxygen introduction text eventdev: move text on driver internals to proper section eventdev: update documentation on device capability flags eventdev: cleanup doxygen comments on info structure eventdev: improve function documentation for query fns eventdev: improve doxygen comments on configure struct eventdev: fix documentation for counting single-link ports lib/eventdev/rte_eventdev.c | 2 +- lib/eventdev/rte_eventdev.h | 391 +++- 2 files changed, 247 insertions(+), 146 deletions(-) -- 2.40.1
Re: [PATCH v1 4/7] eventdev: cleanup doxygen comments on info structure
On Thu, Jan 18, 2024 at 01:45:54PM +, Bruce Richardson wrote: > Some small rewording changes to the doxygen comments on struct > rte_event_dev_info. > > Signed-off-by: Bruce Richardson > --- > lib/eventdev/rte_eventdev.c | 2 +- > lib/eventdev/rte_eventdev.h | 46 - > 2 files changed, 26 insertions(+), 22 deletions(-) > > diff --git a/lib/eventdev/rte_eventdev.c b/lib/eventdev/rte_eventdev.c > index 94628a66ef..9bf7c7be89 100644 > --- a/lib/eventdev/rte_eventdev.c > +++ b/lib/eventdev/rte_eventdev.c > @@ -83,7 +83,7 @@ rte_event_dev_socket_id(uint8_t dev_id) > > rte_eventdev_trace_socket_id(dev_id, dev, dev->data->socket_id); > > - return dev->data->socket_id; > + return dev->data->socket_id < 0 ? 0 : dev->data->socket_id; > } Apologies, this is a stray change that I thought I had rolled back, but somehow made it into the commit! Please ignore when reviewing. >
Re: [PATCH] vhost: fix deadlock during software live migration of VDPA in a nested virtualization environment
Hello, On Thu, Jan 18, 2024 at 11:34 AM Hao Chen wrote: > > In a nested virtualization environment, running dpdk vdpa in QEMU-L1 for > software live migration will result in a deadlock between dpdke-vdpa and > QEMU-L2 processes. > rte_vdpa_relay_vring_used-> > __vhost_iova_to_vva-> > vhost_user_iotlb_rd_unlock(vq)-> > vhost_user_iotlb_miss-> send vhost message VHOST_USER_SLAVE_IOTLB_MSG to > QEMU's vdpa socket, > then call vhost_user_iotlb_rd_lock(vq) to hold the read lock `iotlb_lock`. > But there is no place to release this read lock. > > QEMU L2 get the VHOST_USER_SLAVE_IOTLB_MSG, > then call vhost_user_send_device_iotlb_msg to send VHOST_USER_IOTLB_MSG > messages to dpdk-vdpa. > Dpdk vdpa will call vhost_user_iotlb_msg-> > vhost_user_iotlb_cache_insert, here, will obtain the write lock > `iotlb_lock`, but the read lock `iotlb_lock` has not been released and > will block here. > > This patch add lock and unlock function to fix the deadlock. Please identify the commit that first had this issue and add a Fixes: tag. > > Signed-off-by: Hao Chen > --- > lib/vhost/vdpa.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/lib/vhost/vdpa.c b/lib/vhost/vdpa.c > index 9776fc07a9..9132414209 100644 > --- a/lib/vhost/vdpa.c > +++ b/lib/vhost/vdpa.c > @@ -19,6 +19,7 @@ > #include "rte_vdpa.h" > #include "vdpa_driver.h" > #include "vhost.h" > +#include "iotlb.h" > > /** Double linked list of vDPA devices. */ > TAILQ_HEAD(vdpa_device_list, rte_vdpa_device); > @@ -193,10 +194,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void > *vring_m) > if (unlikely(nr_descs > vq->size)) > return -1; > > + vhost_user_iotlb_rd_lock(vq); > desc_ring = (struct vring_desc *)(uintptr_t) > vhost_iova_to_vva(dev, vq, > vq->desc[desc_id].addr, &dlen, > VHOST_ACCESS_RO); > + vhost_user_iotlb_rd_unlock(vq); > if (unlikely(!desc_ring)) > return -1; > > @@ -220,9 +223,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void > *vring_m) > if (unlikely(nr_descs-- == 0)) > goto fail; > desc = desc_ring[desc_id]; > - if (desc.flags & VRING_DESC_F_WRITE) > + if (desc.flags & VRING_DESC_F_WRITE) { > + vhost_user_iotlb_rd_lock(vq); > vhost_log_write_iova(dev, vq, desc.addr, > desc.len); > + vhost_user_iotlb_rd_unlock(vq); > + } > desc_id = desc.next; > } while (desc.flags & VRING_DESC_F_NEXT); > Interesting, I suspected a bug in this area as clang was complaining. Please try to remove the __rte_no_thread_safety_analysis annotation and compile with clang. https://git.dpdk.org/dpdk/tree/lib/vhost/vdpa.c#n150 You will get: ccache clang -Ilib/librte_vhost.a.p -Ilib -I../lib -Ilib/vhost -I../lib/vhost -I. -I.. 
-Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/ethdev -I../lib/ethdev -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Ilib/cryptodev -I../lib/cryptodev -Ilib/rcu -I../lib/rcu -Ilib/hash -I../lib/hash -Ilib/pci -I../lib/pci -Ilib/dmadev -I../lib/dmadev -fcolor-diagnostics -fsanitize=address -fno-omit-frame-pointer -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O0 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -DVHOST_CLANG_UNROLL_PRAGMA -fno-strict-aliasing -DVHOST_HAS_VDUSE -DRTE_LOG_DEFAULT_LOGTYPE=lib.vhost -DRTE_ANNOTATE_LOCKS -Wthread-safety -MD -MQ lib/librte_vhost.a.p/vhost_vdpa.c.o -MF lib/librte_vhost.a.p/vhost_vdpa.c.o.d -o lib/librte_vhost.a.p/vhost_vdpa.c.o -c ../lib/vhost/vdpa.c ../lib/vhost/vdpa.c:196:5: error: calling function 'vhost_iova_to_vva' requires holding mutex 'vq->iotlb_lock' [-Werror,-Wthread-safety-analysis] vhost_iova_to_vva(dev, vq,
Re: Ubuntu Upgrade 20.04.6 to 22.04 Jammy 23.06 HomeGateway example
After debugging this, I found a spurious port, which was renamed, and it all started working. Thanks From: James Tervit Date: Thursday, 18 January 2024 at 13:22 To: dev@dpdk.org Subject: Ubuntu Upgrade 20.04.6 to 22.04 Jammy 23.06 HomeGateway example Dear DPDK Folks, I am following the latest build of home gateway https://s3-docs.fd.io/vpp/23.06/usecases/home_gateway.html for testing and development. Running Linux PXC-SM-002 5.15.0-91-generic #101-Ubuntu vpp/jammy,now 23.10-release amd64 [installed] PRETTY_NAME="Ubuntu 22.04.3 LTS" NAME="Ubuntu" VERSION_ID="22.04" VERSION="22.04.3 LTS (Jammy Jellyfish)" In my vpp startup.conf, which is calling setup.gate and, in turn, runs setup.tmpl I get an error on systemd in bold below. Jan 18 04:23:34 PXC-SM-002 vpp[224896]: set interface l2 bridge: unknown interface `GigabitEthernetb7/0/0 1' Jan 18 04:23:34 PXC-SM-002 vpp[224896]: exec: CLI line error: GigabitEthernetb7/0/0 1 (this is related to setup.tmpl section) Jan 18 04:23:34 PXC-SM-002 vpp[224896]: exec: CLI line error: setup.gate runs define HOSTNAME Gate-001 define TRUNK GigabitEthernet65/0/0 comment { Specific MAC address yields a constant IP address } define TRUNK_MACADDR 3c:ec:ef:da:89:58 define BVI_MACADDR 3c:ce:fe:ad:01:02 comment { inside subnet 192.168..0/24 } define INSIDE_SUBNET 1 comment { Adjust as needed to match PCI addresses of inside network ports } define INSIDE_PORT1 GigabitEthernet65/0/1 define INSIDE_PORT2 GigabitEthernet65/0/2 define INSIDE_PORT3 GigabitEthernet65/0/3 define INSIDE_PORT4 GigabitEthernetb7/0/0 define INSIDE_PORT5 GigabitEthernetb7/0/1 define INSIDE_PORT6 GigabitEthernetb7/0/3 comment { feature selections } define FEATURE_ADL uncomment define FEATURE_NAT44 uncomment define FEATURE_CNAT comment define FEATURE_DNS comment define FEATURE_IP6 comment define FEATURE_IKE_RESPONDER comment define FEATURE_MACTIME uncomment define FEATURE_OVPN uncomment define FEATURE_MODEM_ROUTE uncomment exec /home/pxcghost/VPP/pxc-gate/setup.tmpl The setup.tmpl tries to create inside ports and, for some reason, fails on one port and stops processing. show macro set int mac address $(TRUNK) $(TRUNK_MACADDR) set dhcp client intfc $(TRUNK) hostname $(HOSTNAME) set int state $(TRUNK) up bvi create instance 0 set int mac address bvi0 $(BVI_MACADDR) set int l2 bridge bvi0 1 bvi set int ip address bvi0 192.168.$(INSIDE_SUBNET).1/24 set int state bvi0 up set int l2 bridge $(INSIDE_PORT1) 1 set int state $(INSIDE_PORT1) up set int l2 bridge $(INSIDE_PORT2) 1 set int state $(INSIDE_PORT2) up set int l2 bridge $(INSIDE_PORT3) 1 set int state $(INSIDE_PORT3) up set int l2 bridge $(INSIDE_PORT4) 1 set int state $(INSIDE_PORT4) up set int l2 bridge $(INSIDE_PORT5) 1 set int state $(INSIDE_PORT5) up set int l2 bridge $(INSIDE_PORT6) 1 set int state $(INSIDE_PORT6) up comment { dhcp server and host-stack access } create tap host-if-name lstack host-ip4-addr 192.168.$(INSIDE_SUBNET).2/24 host-ip4-gw 192.168.$(INSIDE_SUBNET).1 set int l2 bridge tap0 1 set int state tap0 up service restart isc-dhcp-server $(FEATURE_ADL) { bin adl_interface_enable_disable $(TRUNK) } $(FEATURE_ADL) { ip table 1 } $(FEATURE_ADL) { ip route add table 1 0.0.0.0/0 via local } $(FEATURE_NAT44) { nat44 forwarding enable } $(FEATURE_NAT44) { nat44 plugin enable sessions 63000 } $(FEATURE_NAT44) { nat44 add interface address $(TRUNK) } $(FEATURE_NAT44) { set interface nat44 in bvi0 out $(TRUNK) } comment { iPhones seem to need lots of RA messages... 
} $(FEATURE_IP6) { ip6 nd bvi0 ra-managed-config-flag ra-other-config-flag ra-interval 30 20 ra-lifetime 180 } comment { ip6 nd bvi0 prefix 0::0/0 ra-lifetime 10 } comment { if using the mactime plugin, configure it } $(FEATURE_MACTIME) { bin mactime_add_del_range name roku mac 3c:ec:ef:da:89:58 allow-static } $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT1) } $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT2) } $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT3) } $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT4) } $(FEATURE_MODEM_ROUTE) { ip route add 192.168.1.1/23 via $(TRUNK) } for some reason, port4 fails as per the system, and I think it fails on the DHCP part of the script; any pointers are appreciated. I can bind ports and unbind back to kernel with no problem, and I can see the interfaces inside vpp and bvi look ok. sudo vppctl show interface, ip addresses not assigned as I think that’s because the script doesn’t process the lstack and dhcp portion due to crashing. Name IdxState MTU (L3/IP4/IP6/MPLS) Counter Count GigabitEthernet65/0/0 1 up 9000/0/0/0 rx packets 585 rx bytes 72405 tx packets 3
[PATCH v5 0/6] use static_assert for build error reports
This series fixes a couple of places where expressions that could not be evaluated as constants early in compiler passes were used, then converts RTE_BUILD_BUG_ON() to use static_assert. static_assert() is stricter about the expression being a constant, which also catches some pre-existing undefined behavior. The series requires a couple of workarounds to deal with quirks in static_assert() in some toolchains. v6 - minor cleanups, handle missing macro in old FreeBSD Stephen Hemminger (6): eal: introduce RTE_MIN_T() and RTE_MAX_T() macros event/opdl: fix non-constant compile time assertion net/sfc: fix non-constant expression in RTE_BUILD_BUG_ON() net/i40e: avoid using const variable in assertion mempool: avoid floating point expression in static assertion eal: replace out of bounds VLA with static_assert drivers/event/opdl/opdl_ring.c | 2 +- drivers/net/i40e/i40e_ethdev.h | 1 + drivers/net/i40e/i40e_rxtx_vec_sse.c | 10 -- drivers/net/mlx5/mlx5_rxq.c | 2 +- drivers/net/sfc/sfc_ef100_tx.c | 3 +-- lib/eal/include/rte_common.h | 27 ++- lib/mempool/rte_mempool.c| 7 --- 7 files changed, 38 insertions(+), 14 deletions(-) -- 2.43.0
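For context on what the conversion changes, here is a stand-alone sketch (not part of any patch in the series) contrasting the old array-size trick with static_assert():

#include <assert.h>

/* Old style: relies on a negative array size to break the build. */
#define BUG_ON_OLD(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* New style: requires a genuine constant expression. */
#define BUG_ON_NEW(cond) do { static_assert(!(cond), #cond); } while (0)

void
example(int runtime_value)
{
        BUG_ON_OLD(sizeof(int) < 4);    /* constant condition: checked at build time */
        BUG_ON_OLD(runtime_value < 4);  /* silently becomes a VLA: no build-time check */
        BUG_ON_NEW(sizeof(int) < 4);    /* constant condition: checked at build time */
        /* BUG_ON_NEW(runtime_value < 4); would be rejected: not a constant expression */
}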
[PATCH v5 1/6] eal: introduce RTE_MIN_T() and RTE_MAX_T() macros
These macros work like RTE_MIN and RTE_MAX but take an explicit type. This is necessary when they are used in static assertions, since RTE_MIN and RTE_MAX use temporary variables, which confuses the compiler's constant-expression checks. These macros could also be useful in other scenarios where a bounded range is needed. Naming is chosen to be similar to Linux kernel conventions. Signed-off-by: Stephen Hemminger Acked-by: Konstantin Ananyev Acked-by: Andrew Rybchenko --- lib/eal/include/rte_common.h | 16 1 file changed, 16 insertions(+) diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h index c1ba32d00e47..33680e818bfb 100644 --- a/lib/eal/include/rte_common.h +++ b/lib/eal/include/rte_common.h @@ -585,6 +585,14 @@ __extension__ typedef uint64_t RTE_MARKER64[0]; _a < _b ? _a : _b; \ }) +/** + * Macro to return the minimum of two numbers + * Does not use temporaries, so it is not safe if a or b is an expression + * with side effects, but it is guaranteed to be constant for use in static_assert() + */ +#define RTE_MIN_T(a, b, t) \ + ((t)(a) < (t)(b) ? (t)(a) : (t)(b)) + /** * Macro to return the maximum of two numbers */ @@ -595,6 +603,14 @@ __extension__ typedef uint64_t RTE_MARKER64[0]; _a > _b ? _a : _b; \ }) +/** + * Macro to return the maximum of two numbers + * Does not use temporaries, so it is not safe if a or b is an expression + * with side effects, but it is guaranteed to be constant for use in static_assert() + */ +#define RTE_MAX_T(a, b, t) \ + ((t)(a) > (t)(b) ? (t)(a) : (t)(b)) + /*** Other general functions / macros / #ifndef offsetof -- 2.43.0
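A short illustration of why the typed macros are needed in static assertions; the two macro definitions below are simplified stand-ins for RTE_MIN and RTE_MIN_T, not copies of the DPDK headers:

#include <assert.h>
#include <stdint.h>

/* Statement-expression style (simplified): type safe, but never constant. */
#define MIN_STMT_EXPR(a, b) __extension__ ({ \
        typeof(a) _a = (a); \
        typeof(b) _b = (b); \
        _a < _b ? _a : _b; \
})

/* Explicit-type style (same shape as RTE_MIN_T): a plain constant expression. */
#define MIN_T(a, b, t) ((t)(a) < (t)(b) ? (t)(a) : (t)(b))

static_assert(MIN_T(64, 100, uint32_t) == 64, "usable in a static assertion");
/*
 * static_assert(MIN_STMT_EXPR(64, 100) == 64, "...") does not compile:
 * the ({ ... }) block with its temporaries is not a constant expression,
 * which is the problem RTE_MIN_T() and RTE_MAX_T() avoid.
 */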
[PATCH v5 2/6] event/opdl: fix non-constant compile time assertion
RTE_BUILD_BUG_ON() was being used with a non-constant value. The inline function rte_is_power_of_2() is not constant since inline expansion happens later in the compile process. Replace it with the macro which will be constant. Fixes: 4236ce9bf5bf ("event/opdl: add OPDL ring infrastructure library") Cc: liang.j...@intel.com Signed-off-by: Stephen Hemminger Acked-by: Bruce Richardson Acked-by: Tyler Retzlaff Acked-by: Andrew Rybchenko --- drivers/event/opdl/opdl_ring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/event/opdl/opdl_ring.c b/drivers/event/opdl/opdl_ring.c index 69392b56bbec..da5ea02d1928 100644 --- a/drivers/event/opdl/opdl_ring.c +++ b/drivers/event/opdl/opdl_ring.c @@ -910,7 +910,7 @@ opdl_ring_create(const char *name, uint32_t num_slots, uint32_t slot_size, RTE_CACHE_LINE_MASK) != 0); RTE_BUILD_BUG_ON((offsetof(struct opdl_ring, slots) & RTE_CACHE_LINE_MASK) != 0); - RTE_BUILD_BUG_ON(!rte_is_power_of_2(OPDL_DISCLAIMS_PER_LCORE)); + RTE_BUILD_BUG_ON(!RTE_IS_POWER_OF_2(OPDL_DISCLAIMS_PER_LCORE)); /* Parameter checking */ if (name == NULL) { -- 2.43.0
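A stand-alone illustration of the distinction the commit message draws; the macro and the inline function below are illustrative stand-ins, not the rte_common.h definitions:

#include <assert.h>

/* A macro expansion remains a constant expression... */
#define IS_POW2_MACRO(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* ...but a call to a function never is in C, even an inline one. */
static inline int
is_pow2_fn(unsigned int n)
{
        return n != 0 && (n & (n - 1)) == 0;
}

static_assert(IS_POW2_MACRO(32), "macro form is accepted");
/*
 * static_assert(is_pow2_fn(32), "...") is rejected: a function call is not
 * a constant expression, which is why the patch switches to the macro.
 */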
[PATCH v5 3/6] net/sfc: fix non-constant expression in RTE_BUILD_BUG_ON()
The macro RTE_MIN has some hidden assignments to provide type safety which means the statement can not be fully evaluated in first pass of compiler. Replace RTE_MIN() with equivalent macro. Fixes: 4f93d790 ("net/sfc: support TSO for EF100 native datapath") Signed-off-by: Stephen Hemminger Acked-by: Tyler Retzlaff Reviewed-by: Andrew Rybchenko --- drivers/net/sfc/sfc_ef100_tx.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c index 1b6374775f07..c88ab964547e 100644 --- a/drivers/net/sfc/sfc_ef100_tx.c +++ b/drivers/net/sfc/sfc_ef100_tx.c @@ -563,8 +563,7 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m) * (split into many Tx descriptors). */ RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < -RTE_MIN((unsigned int)EFX_MAC_PDU_MAX, -SFC_MBUF_SEG_LEN_MAX)); +RTE_MIN_T(EFX_MAC_PDU_MAX, SFC_MBUF_SEG_LEN_MAX, uint32_t)); } if (m->ol_flags & sfc_dp_mport_override) { -- 2.43.0
[PATCH v5 4/6] net/i40e: avoid using const variable in assertion
Clang does not allow const variable in a static_assert expression. Signed-off-by: Stephen Hemminger Acked-by: Bruce Richardson Acked-by: Konstantin Ananyev --- drivers/net/i40e/i40e_ethdev.h | 1 + drivers/net/i40e/i40e_rxtx_vec_sse.c | 10 -- 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h index 1bbe7ad37600..445e1c0b381f 100644 --- a/drivers/net/i40e/i40e_ethdev.h +++ b/drivers/net/i40e/i40e_ethdev.h @@ -278,6 +278,7 @@ enum i40e_flxpld_layer_idx { #define I40E_DEFAULT_DCB_APP_PRIO 3 #define I40E_FDIR_PRG_PKT_CNT 128 +#define I40E_FDIR_ID_BIT_SHIFT 13 /* * Struct to store flow created. diff --git a/drivers/net/i40e/i40e_rxtx_vec_sse.c b/drivers/net/i40e/i40e_rxtx_vec_sse.c index 9200a23ff662..2d4480a7651b 100644 --- a/drivers/net/i40e/i40e_rxtx_vec_sse.c +++ b/drivers/net/i40e/i40e_rxtx_vec_sse.c @@ -143,10 +143,9 @@ descs_to_fdir_32b(volatile union i40e_rx_desc *rxdp, struct rte_mbuf **rx_pkt) /* convert fdir_id_mask into a single bit, then shift as required for * correct location in the mbuf->olflags */ - const uint32_t FDIR_ID_BIT_SHIFT = 13; - RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << FDIR_ID_BIT_SHIFT)); + RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << I40E_FDIR_ID_BIT_SHIFT)); v_fd_id_mask = _mm_srli_epi32(v_fd_id_mask, 31); - v_fd_id_mask = _mm_slli_epi32(v_fd_id_mask, FDIR_ID_BIT_SHIFT); + v_fd_id_mask = _mm_slli_epi32(v_fd_id_mask, I40E_FDIR_ID_BIT_SHIFT); /* The returned value must be combined into each mbuf. This is already * being done for RSS and VLAN mbuf olflags, so return bits to OR in. @@ -205,10 +204,9 @@ descs_to_fdir_16b(__m128i fltstat, __m128i descs[4], struct rte_mbuf **rx_pkt) descs[0] = _mm_blendv_epi8(descs[0], _mm_setzero_si128(), v_desc0_mask); /* Shift to 1 or 0 bit per u32 lane, then to RTE_MBUF_F_RX_FDIR_ID offset */ - const uint32_t FDIR_ID_BIT_SHIFT = 13; - RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << FDIR_ID_BIT_SHIFT)); + RTE_BUILD_BUG_ON(RTE_MBUF_F_RX_FDIR_ID != (1 << I40E_FDIR_ID_BIT_SHIFT)); __m128i v_mask_one_bit = _mm_srli_epi32(v_fdir_id_mask, 31); - return _mm_slli_epi32(v_mask_one_bit, FDIR_ID_BIT_SHIFT); + return _mm_slli_epi32(v_mask_one_bit, I40E_FDIR_ID_BIT_SHIFT); } #endif -- 2.43.0
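A small stand-alone example of the C rule involved (a const-qualified variable is not an integer constant expression); the names below only mirror the patch and are otherwise hypothetical:

#include <assert.h>

#define FDIR_ID_BIT_SHIFT_DEF 13        /* stand-in for the new #define */

void
shift_example(void)
{
        const unsigned int fdir_id_bit_shift = 13;     /* what the old code used */

        static_assert((1 << FDIR_ID_BIT_SHIFT_DEF) == 0x2000, "macro: accepted");
        /*
         * static_assert((1 << fdir_id_bit_shift) == 0x2000, "...") is what
         * clang rejects: in C, a const-qualified variable is still a
         * variable, not an integer constant expression.
         */
        (void)fdir_id_bit_shift;
}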
[PATCH v5 5/6] mempool: avoid floating point expression in static assertion
Clang does not handle casts in static_assert() expressions. It doesn't like use of floating point to calculate threshold. Use a different expression with same effect. Modify comment in mlx5 so that developers don't go searching for old value. Signed-off-by: Stephen Hemminger Acked-by: Konstantin Ananyev Reviewed-by: Andrew Rybchenko --- drivers/net/mlx5/mlx5_rxq.c | 2 +- lib/mempool/rte_mempool.c | 7 --- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index 1bb036afebb3..ca2eeedc9de3 100644 --- a/drivers/net/mlx5/mlx5_rxq.c +++ b/drivers/net/mlx5/mlx5_rxq.c @@ -1444,7 +1444,7 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev) /* * rte_mempool_create_empty() has sanity check to refuse large cache * size compared to the number of elements. -* CACHE_FLUSHTHRESH_MULTIPLIER is defined in a C file, so using a +* CALC_CACHE_FLUSHTHRESH() is defined in a C file, so using a * constant number 2 instead. */ obj_num = RTE_MAX(obj_num, MLX5_MPRQ_MP_CACHE_SZ * 2); diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c index b7a19bea7185..12390a2c8155 100644 --- a/lib/mempool/rte_mempool.c +++ b/lib/mempool/rte_mempool.c @@ -50,9 +50,10 @@ static void mempool_event_callback_invoke(enum rte_mempool_event event, struct rte_mempool *mp); -#define CACHE_FLUSHTHRESH_MULTIPLIER 1.5 -#define CALC_CACHE_FLUSHTHRESH(c) \ - ((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER)) +/* Note: avoid using floating point since that compiler + * may not think that is constant. + */ +#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2) #if defined(RTE_ARCH_X86) /* -- 2.43.0
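A quick illustrative check that the integer expression keeps the old rounding behaviour for the small cache sizes involved (both forms compute 1.5 * c rounded down):

#include <assert.h>

#define CALC_CACHE_FLUSHTHRESH(c) (((c) * 3) / 2)

static_assert(CALC_CACHE_FLUSHTHRESH(512) == 768, "even size: 512 * 1.5 = 768");
static_assert(CALC_CACHE_FLUSHTHRESH(5) == 7, "odd size rounds down, like (typeof(c))(5 * 1.5)");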
[PATCH v5 6/6] eal: replace out of bounds VLA with static_assert
GCC, clang and MSVC all have a better way to do compile-time assertions than using out-of-bounds array access. The old method would fail if -Wvla is enabled because the compiler can't determine the size in that code. Also, the new _Static_assert will catch broken code that passes a non-constant expression to RTE_BUILD_BUG_ON(). Add workaround for clang static_assert in switch, and missing static_assert in older FreeBSD. Signed-off-by: Stephen Hemminger Acked-by: Morten Brørup Acked-by: Tyler Retzlaff --- lib/eal/include/rte_common.h | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h index 33680e818bfb..aa066125b6cd 100644 --- a/lib/eal/include/rte_common.h +++ b/lib/eal/include/rte_common.h @@ -16,6 +16,7 @@ extern "C" { #endif +#include #include #include @@ -492,10 +493,18 @@ rte_is_aligned(const void * const __rte_restrict ptr, const unsigned int align) /*** Macros for compile type checks / +/* Workaround for toolchain issues with missing C11 macro in FreeBSD */ +#if !defined(static_assert) && !defined(__cplusplus) +#define static_assert _Static_assert +#endif + /** * Triggers an error at compilation time if the condition is true. + * + * The do { } while(0) exists to workaround a bug in clang (#55821) + * where it would not handle _Static_assert in a switch case. */ -#define RTE_BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)])) +#define RTE_BUILD_BUG_ON(condition) do { static_assert(!(condition), #condition); } while (0) /*** Cache line related macros / -- 2.43.0
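A stand-alone sketch of why the do/while wrapper is needed when the assertion sits directly under a case label; the macro below copies the shape of the new definition for illustration only:

#include <assert.h>

#define BUILD_BUG_ON(cond) do { static_assert(!(cond), #cond); } while (0)

int
classify(int x)
{
        switch (x) {
        case 0:
                /*
                 * The do/while wrapper makes this a statement. A bare
                 * _Static_assert here would be a declaration directly after
                 * a label, which clang rejects (the bug the comment cites).
                 */
                BUILD_BUG_ON(sizeof(long) < sizeof(int));
                return 0;
        default:
                return 1;
        }
}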
RE: [PATCH v3 00/24] Fixes and improvements in crypto cnxk
> Subject: [PATCH v3 00/24] Fixes and improvements in crypto cnxk > > Add following features > - TLS record processing offload (TLS 1.2-1.3, DTLS 1.2) > - Rx inject to allow lookaside packets to be injected to ethdev Rx > - Use PDCP_CHAIN opcode instead of PDCP opcode for cipher-only and auth > only cases > - PMD API to submit instructions directly to hardware > > Changes in v3 > - Addressed Akhil's commments on Rx inject patch > - Updated license year to 2024 > > Changes in v2 > - Addressed checkpatch issue > - Addressed build error with stdatomic > > Aakash Sasidharan (1): > crypto/cnxk: enable digest gen for zero len input > > Akhil Goyal (1): > common/cnxk: fix memory leak > > Anoob Joseph (6): > crypto/cnxk: use common macro > crypto/cnxk: return microcode completion code > common/cnxk: update opad-ipad gen to handle TLS > common/cnxk: add TLS record contexts > crypto/cnxk: separate IPsec from security common code > crypto/cnxk: add PMD APIs for raw submission to CPT > > Gowrishankar Muthukrishnan (1): > crypto/cnxk: fix ECDH pubkey verify in cn9k > > Rahul Bhansali (2): > common/cnxk: add Rx inject configs > crypto/cnxk: Rx inject config update > > Tejasree Kondoj (3): > crypto/cnxk: fallback to SG if headroom is not available > crypto/cnxk: replace PDCP with PDCP chain opcode > crypto/cnxk: add CPT SG mode debug > > Vidya Sagar Velumuri (10): > crypto/cnxk: enable Rx inject in security lookaside > crypto/cnxk: enable Rx inject for 103 > crypto/cnxk: rename security caps as IPsec security caps > crypto/cnxk: add TLS record session ops > crypto/cnxk: add TLS record datapath handling > crypto/cnxk: add TLS capability > crypto/cnxk: validate the combinations supported in TLS > crypto/cnxk: use a single function for opad ipad > crypto/cnxk: add support for TLS 1.3 > crypto/cnxk: add TLS 1.3 capability > > doc/api/doxy-api-index.md | 1 + > doc/api/doxy-api.conf.in | 1 + > doc/guides/cryptodevs/cnxk.rst| 12 + > doc/guides/cryptodevs/features/cn10k.ini | 1 + > doc/guides/rel_notes/release_24_03.rst| 7 + > drivers/common/cnxk/cnxk_security.c | 65 +- > drivers/common/cnxk/cnxk_security.h | 15 +- > drivers/common/cnxk/hw/cpt.h | 12 +- > drivers/common/cnxk/roc_cpt.c | 14 +- > drivers/common/cnxk/roc_cpt.h | 7 +- > drivers/common/cnxk/roc_cpt_priv.h| 2 +- > drivers/common/cnxk/roc_idev.c| 44 + > drivers/common/cnxk/roc_idev.h| 5 + > drivers/common/cnxk/roc_idev_priv.h | 6 + > drivers/common/cnxk/roc_ie_ot.c | 14 +- > drivers/common/cnxk/roc_ie_ot_tls.h | 225 + > drivers/common/cnxk/roc_mbox.h| 2 + > drivers/common/cnxk/roc_nix.c | 2 + > drivers/common/cnxk/roc_nix_inl.c | 2 +- > drivers/common/cnxk/roc_nix_inl_dev.c | 2 +- > drivers/common/cnxk/roc_se.c | 379 +++- > drivers/common/cnxk/roc_se.h | 38 +- > drivers/common/cnxk/version.map | 5 + > drivers/crypto/cnxk/cn10k_cryptodev.c | 2 +- > drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 401 - > drivers/crypto/cnxk/cn10k_cryptodev_ops.h | 11 + > drivers/crypto/cnxk/cn10k_cryptodev_sec.c | 134 +++ > drivers/crypto/cnxk/cn10k_cryptodev_sec.h | 68 ++ > drivers/crypto/cnxk/cn10k_ipsec.c | 134 +-- > drivers/crypto/cnxk/cn10k_ipsec.h | 38 +- > drivers/crypto/cnxk/cn10k_ipsec_la_ops.h | 19 +- > drivers/crypto/cnxk/cn10k_tls.c | 830 ++ > drivers/crypto/cnxk/cn10k_tls.h | 35 + > drivers/crypto/cnxk/cn10k_tls_ops.h | 322 +++ > drivers/crypto/cnxk/cn9k_cryptodev_ops.c | 68 +- > drivers/crypto/cnxk/cn9k_cryptodev_ops.h | 62 ++ > drivers/crypto/cnxk/cn9k_ipsec_la_ops.h | 16 +- > drivers/crypto/cnxk/cnxk_cryptodev.c | 3 + > drivers/crypto/cnxk/cnxk_cryptodev.h | 
24 +- > .../crypto/cnxk/cnxk_cryptodev_capabilities.c | 375 +++- > drivers/crypto/cnxk/cnxk_cryptodev_devargs.c | 31 + > drivers/crypto/cnxk/cnxk_cryptodev_ops.c | 128 ++- > drivers/crypto/cnxk/cnxk_cryptodev_ops.h | 7 + > drivers/crypto/cnxk/cnxk_se.h | 98 +-- > drivers/crypto/cnxk/cnxk_sg.h | 4 +- > drivers/crypto/cnxk/meson.build | 4 +- > drivers/crypto/cnxk/rte_pmd_cnxk_crypto.h | 46 + > drivers/crypto/cnxk/version.map | 3 + > 48 files changed, 3018 insertions(+), 706 deletions(-) > create mode 100644 drivers/common/cnxk/roc_ie_ot_tls.h > create mode 100644 drivers/crypto/cnxk/cn10k_cryptodev_sec.c > create mode 100644 drivers/crypto/cnxk/cn10k_cryptodev_sec.h > create mode 100644 drivers/crypto/cnxk/cn
[PATCH] net/mana: prevent overflow of values returned from RDMA layer
From: Long Li The device capabilities reported from RDMA layer are in int. Those values can overflow with the data types defined in dev_info_get(). Fix this by doing a upper bound before returning those values. Fixes: 517ed6e2d590 ("net/mana: add basic driver with build environment") Cc: sta...@dpdk.org Signed-off-by: Long Li --- drivers/net/mana/mana.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/net/mana/mana.c b/drivers/net/mana/mana.c index 781ed76139..471beed19e 100644 --- a/drivers/net/mana/mana.c +++ b/drivers/net/mana/mana.c @@ -296,8 +296,8 @@ mana_dev_info_get(struct rte_eth_dev *dev, dev_info->min_rx_bufsize = MIN_RX_BUF_SIZE; dev_info->max_rx_pktlen = MANA_MAX_MTU + RTE_ETHER_HDR_LEN; - dev_info->max_rx_queues = priv->max_rx_queues; - dev_info->max_tx_queues = priv->max_tx_queues; + dev_info->max_rx_queues = RTE_MIN(priv->max_rx_queues, UINT16_MAX); + dev_info->max_tx_queues = RTE_MIN(priv->max_tx_queues, UINT16_MAX); dev_info->max_mac_addrs = MANA_MAX_MAC_ADDR; dev_info->max_hash_mac_addrs = 0; @@ -338,16 +338,20 @@ mana_dev_info_get(struct rte_eth_dev *dev, /* Buffer limits */ dev_info->rx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE; - dev_info->rx_desc_lim.nb_max = priv->max_rx_desc; + dev_info->rx_desc_lim.nb_max = RTE_MIN(priv->max_rx_desc, UINT16_MAX); dev_info->rx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE; - dev_info->rx_desc_lim.nb_seg_max = priv->max_recv_sge; - dev_info->rx_desc_lim.nb_mtu_seg_max = priv->max_recv_sge; + dev_info->rx_desc_lim.nb_seg_max = + RTE_MIN(priv->max_recv_sge, UINT16_MAX); + dev_info->rx_desc_lim.nb_mtu_seg_max = + RTE_MIN(priv->max_recv_sge, UINT16_MAX); dev_info->tx_desc_lim.nb_min = MIN_BUFFERS_PER_QUEUE; - dev_info->tx_desc_lim.nb_max = priv->max_tx_desc; + dev_info->tx_desc_lim.nb_max = RTE_MIN(priv->max_tx_desc, UINT16_MAX); dev_info->tx_desc_lim.nb_align = MIN_BUFFERS_PER_QUEUE; - dev_info->tx_desc_lim.nb_seg_max = priv->max_send_sge; - dev_info->rx_desc_lim.nb_mtu_seg_max = priv->max_recv_sge; + dev_info->tx_desc_lim.nb_seg_max = + RTE_MIN(priv->max_send_sge, UINT16_MAX); + dev_info->tx_desc_lim.nb_mtu_seg_max = + RTE_MIN(priv->max_send_sge, UINT16_MAX); /* Speed */ dev_info->speed_capa = RTE_ETH_LINK_SPEED_100G; @@ -1385,9 +1389,9 @@ mana_probe_port(struct ibv_device *ibdev, struct ibv_device_attr_ex *dev_attr, priv->max_mr = dev_attr->orig_attr.max_mr; priv->max_mr_size = dev_attr->orig_attr.max_mr_size; - DRV_LOG(INFO, "dev %s max queues %d desc %d sge %d", + DRV_LOG(INFO, "dev %s max queues %d desc %d sge %d mr %lu", name, priv->max_rx_queues, priv->max_rx_desc, - priv->max_send_sge); + priv->max_send_sge, priv->max_mr_size); rte_eth_copy_pci_info(eth_dev, pci_dev); -- 2.25.1
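The clamping pattern above generalizes beyond mana: the ethdev limit fields are only 16 bits wide while the RDMA layer reports capabilities as int, so every copy into dev_info should be bounded first. A minimal, self-contained sketch of the idea (the struct and field names here are illustrative, not taken from the mana driver):

#include <stdint.h>
#include <rte_common.h>	/* RTE_MIN */

/* Hypothetical capability block as reported by the verbs/RDMA layer. */
struct example_caps {
	int max_rx_queues;
};

/* Hypothetical ethdev-style limits, stored in 16-bit fields. */
struct example_limits {
	uint16_t max_rx_queues;
};

static void
example_fill_limits(const struct example_caps *caps, struct example_limits *lim)
{
	/*
	 * A plain assignment would silently truncate: 70000 stored into a
	 * uint16_t wraps to 4464. Clamping to UINT16_MAX keeps the reported
	 * limit sane (never larger than what the field can express).
	 */
	lim->max_rx_queues = RTE_MIN(caps->max_rx_queues, UINT16_MAX);
}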
RE: [PATCH v5 6/6] eal: replace out of bounds VLA with static_assert
> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Thursday, 18 January 2024 17.51 > > Both Gcc, clang and MSVC have better way to do compile time > assertions rather than using out of bounds array access. > The old method would fail if -Wvla is enabled because compiler > can't determine size in that code. Also, the use of new > _Static_assert will catch broken code that is passing non-constant > expression to RTE_BUILD_BUG_ON(). > > Add workaround for clang static_assert in switch, > and missing static_assert in older FreeBSD. > > Signed-off-by: Stephen Hemminger > Acked-by: Morten Brørup > Acked-by: Tyler Retzlaff > --- Reviewed-by: Morten Brørup
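For readers less familiar with the two idioms being swapped in this series: the legacy trick encodes the check in an array type whose size goes negative on failure, which newer compilers with -Wvla treat as an array they cannot size, whereas C11 static_assert is evaluated directly and also refuses non-constant conditions. A simplified sketch of the before/after shapes (these are not the exact RTE_BUILD_BUG_ON definitions, just the pattern):

#include <assert.h>	/* static_assert convenience macro (C11) */
#include <stdint.h>

/* Old style: out-of-bounds array type as a compile-time check. */
#define BUILD_BUG_ON_OLD(cond)	((void)sizeof(char[1 - 2 * !!(cond)]))

/* New style: the compiler evaluates the condition and reports it verbatim. */
#define BUILD_BUG_ON_NEW(cond)	static_assert(!(cond), #cond)

struct hdr {
	uint32_t a;
	uint32_t b;
};

/* Usable at file scope, fails the build with a readable message. */
BUILD_BUG_ON_NEW(sizeof(struct hdr) != 8);

/* The old form only works inside a function body, and if "cond" is not a
 * compile-time constant it silently turns into a VLA - exactly what -Wvla
 * now rejects. */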
RE: [PATCH v5 5/6] mempool: avoid floating point expression in static assertion
> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Thursday, 18 January 2024 17.51 > > Clang does not handle casts in static_assert() expressions. > It doesn't like use of floating point to calculate threshold. > Use a different expression with same effect. > > Modify comment in mlx5 so that developers don't go searching > for old value. > > Signed-off-by: Stephen Hemminger > Acked-by: Konstantin Ananyev > Reviewed-by: Andrew Rybchenko > --- Reviewed-by: Morten Brørup
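The constraint clang enforces here is easy to reproduce in isolation: a floating-point multiply is not an integer constant expression (even behind a cast), so it cannot appear in a static assertion, while the same bound written with integer arithmetic can. An illustrative sketch with made-up constants (the 1.5 factor is only an example multiplier, not necessarily the exact mempool value):

#include <assert.h>

#define EXAMPLE_CACHE_MAX	512
#define EXAMPLE_LIMIT		1024

/*
 * Rejected by clang inside static_assert, because the floating constant is
 * not the immediate operand of the cast:
 *
 *	static_assert((unsigned int)(EXAMPLE_CACHE_MAX * 1.5) <= EXAMPLE_LIMIT,
 *		      "flush threshold too big");
 *
 * The same threshold expressed with integers only is a valid constant
 * expression everywhere:
 */
static_assert(EXAMPLE_CACHE_MAX + EXAMPLE_CACHE_MAX / 2 <= EXAMPLE_LIMIT,
	      "flush threshold too big");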
Community CI Meeting Minutes - January 18, 2024
January 18, 2024 # Attendees 1. Lincoln Lavoie 2. Thomas Monjalon 3. Aaron Conole 4. Jeremy Spewock 5. Paul Szczepanek 6. Ali Alnubani 7. Juraj Linkeš # Minutes = General Announcements * First 2024 DTS WG meeting was held yesterday, and the minutes are here: https://docs.google.com/document/d/1pG_NGuwYgPuovwIfhvcs9u8PNYIJuInsFr0GeTUIU4k/edit?usp=sharing * Patrick Robb will publish the meeting minutes = CI Status - UNH-IOL Community Lab * Octeon CN106XX: Patrick worked with Hiral this week about the process for SDK rebuild. This will have to happen regularly, but we now have a VM setup which will serve as the place to rebuild SDK and act as TFTP server for the Octeon board. * After that we need to iron out switching rootfs to ubuntu, setting up DTS on the tester, validating this works fine in a CI context (Phanendra from Marvell has approved of the concept). * The Intel server arrived in the mail yesterday. We will mount the system this week and begin setup. As a reminder this now unblocks: * E810 testing for Intel * Traffic gen for the CX7 testing (this server will act as TG) * Traffic gen for the Octeon CN106xx board * The new create dpdk artifact for ci testing script is in production at UNH, and Adam is submitting the V3 patchseries for this to dpdk-ci today * There is an update from the patchwork maintainer about supporting Depends-on via patchwork. I encourage everyone to read his full thoughts on the issue below, but shorthand conclusions are: * He prefers to only support Depends-on on a series basis, not a series or patch basis. * Patrick will update the CI testing thread on this topic, but this may require the DPDK community agreeing to a new approach. From looking through the dev mailing list, it seems that series dependency is the typical use, but there are also some examples of developers using patch dependency. * Keep the same syntax which allows for depends-on: patch, but in reality translate this to depends on that patch’s series * He will need some help for the effort. We can ping the DPDK community to look for volunteers. Or, it can possibly become a community ask for development at the DPDK Community Lab. Looking at it at a high level I don’t think the scope will be too bad. * Full thread: https://github.com/getpatchwork/git-pw/issues/71 * Arm-Ampere still needs a kernel rebuild for the QAT card * Standing up ts-factory testing framework * Adam did the first cx-5 run on an ARM system, and will do an Intel XL710 run on an ARM system today. Both will be published on Oktet Lab’s Bublik for review. * Unlike our ARM testbeds, our x86 servers are single server TG/DUT testbeds, which breaks an assumption in ts-factory. My view on how to proceed is bring testing online with what works now (arm), then review the value with the community, and choose a strategy for x86 based on what we learn from running this at the lab. * Andrew from Oktet labs has added some missing components to GitHub for the data visualization tool (Bublik), and has finished categorizing the tests for the XL710 and the expected results for that NIC * Old DTS patches: * We noticed that as DPDK has grown, the compile time also increases and the old compile timeout for the DUT is no longer valid on some slower systems. Jeremy will submit a patch extending the timeout. * Once the cx7 is online, patrick will submit the patch adding the cx6 and cx7 to the NIC registry in DTS * Lincoln saw the email thread about DPDK failing to build on FreeBSD 14. 
We are only testing FreeBSD 13 in the lab right now, so we will add coverage for 14. - Intel Lab * John says there is still no new person who can act as a contact. There have been further staffing changes at Intel, so it may be hard to get a contact soon, but Patrick will keep asking. - Github Actions * Cirrus CI: Adding support for this to the 0 day robot. Aaron submitted a patch for polling for Cirrus ci status in ovs. * They are getting a server to migrate the VM over to, so there should be minimal downtime associated with the hardware moving from Westford to another state. - Loongarch Lab * None = DTS Improvements & Test Development * scapy/templated yaml for tests: * One motivation is for making writing simple tests in minimal time (like 3 minutes). Should we aim for a target of writing a testsuite in minutes, not hours?
Fwd: DTS WG Meeting Minutes - January 17, 2024
I forgot to CC the dev mailing list -- Forwarded message - From: Patrick Robb Date: Thu, Jan 18, 2024 at 2:52 PM Subject: DTS WG Meeting Minutes - January 17, 2024 To: Cc: , NBU-Contact-Thomas Monjalon (EXTERNAL) < tho...@monjalon.net>, Juraj Linkeš January 17, 2024 # Attendees * Juraj Linkeš * Gregory Etelson * Honnappa Nagarahalli * Jeremy Spewock * Luca Vizzarro * Paul Szczepanek # Agenda * Additions to the agenda * Patch discussions * DTS Developer documentation * 24.03 roadmap # Minutes = Additions to the agenda * Nothing = YAML test suites * Greg wanted to automate his testing; started with test writtens in Python, but was not scalable; easily understandable by newcomers. * The idea is to take an application, send commands (interactive input), collect output and compare with expected strings. * The code was available as of two months ago, but no longer is (private on GitHub). Greg may be able to share it once taking care of it in his company. * Gregory submitted an idea for writing test suites in yaml, which just passes values into a templated testpmd testsuite. * Do we want to support a secondary way of writing test suites? * Will this be usable for both functional and performance testing? * Will this coexist well with the current method? * The current method also aims to be minimalistic and intuitive * Coexistence makes sense as the yaml approach may not be able to cover more complicated cases * What are any limitations which this places on the testing framework? If there aren’t major downsides, then the benefits in terms of quickly adding new testpmd testsuites seems significant. * The traffic generator can't be configured here, we need an abstraction that works for all traffic generators; we can mark the test cases as functional/performance though, which could be enough * We can only specify test-specific testpmd cmdline options; shouldn't be a problem, but we have to keep in mind that configuration such as cores and pci addresses are configured elsewhere (the testbed configuration) * Using specific strings in testpmd is harder to maintain (if the same config is used in multiple places) * Are the phases for both setup/teardown and test cases? This could complicate results recording * Can we easily specify multiple test cases? I.e. we have a test method and we want to test different input combinations (the inputs could just be the number of cores/packet size for performance tests)
DPDK Release Status Meeting 2024-01-18
Release status meeting minutes 2024-01-18
==========================================

Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

Participants:
* AMD
* Intel
* Marvell
* Nvidia
* Red Hat

Release Dates
-------------
The following are the current working dates for 24.03:
* V1: 29 December 2023
* RC1: 5 February 2024
* RC2: 23 February 2024
* RC3: 4 March 2024
* Release: 14 March 2024

https://core.dpdk.org/roadmap/#dates

Subtrees
--------
* next-net
  * Starting to review patches.
* next-net-intel
  * Will merge initial patches this week
* next-net-mlx
  * Some patches merged.
* next-net-mvl
  * Port representor for cnxk work in progress.
* next-eventdev
  * New feature EMLdev Event Adaptor library.
* next-baseband
  * Reviews started and changes requested.
* next-virtio
  * Patch from Marvell under review.
* next-crypto
  * Started merging.
  * ~80 patches in backlog.
  * TLS support in cnxk driver.
  * New Nitrox PMD for compressdev.
* main
  * Cleanups in progress.
  * New bus driver from Huawei.
  * Working on iavf fix for OVS.

Proposed Schedule for 2024
--------------------------
See http://core.dpdk.org/roadmap/#dates

LTS
---
* 22.11.4 - In progress
* 21.11.6 - Released.
* 20.11.10 - Released.
* 19.11.15 - Will only be updated with CVE and critical fixes.
* Distros
  * Debian 12 contains DPDK v22.11
  * Ubuntu 22.04-LTS contains DPDK v21.11
  * Ubuntu 23.04 contains DPDK v22.11

Defects
-------
* Bugzilla links, 'Bugs', added for hosted projects
* https://www.dpdk.org/hosted-projects/

DPDK Release Status Meetings
----------------------------
The DPDK Release Status Meeting is intended for DPDK Committers to discuss the status of the master tree and sub-trees, and for project managers to track progress or milestone dates.

The meeting occurs on every Thursday at 9:30 UTC over Jitsi on https://meet.jit.si/DPDK

You don't need an invite to join the meeting but if you want a calendar reminder just send an email to "John McNamara john.mcnam...@intel.com" for the invite.
[PATCH] common/sfc: replace out of bounds condition with static_assert
The sfc base code had its own definition of static assertions using the out of bound array access hack. Replace it with a static_assert like rte_common.h. Fixes: f67e4719147d ("net/sfc/base: fix coding style") Signed-off-by: Stephen Hemminger --- drivers/common/sfc_efx/base/efx.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/common/sfc_efx/base/efx.h b/drivers/common/sfc_efx/base/efx.h index 3312c2fa8f81..9ce266c43610 100644 --- a/drivers/common/sfc_efx/base/efx.h +++ b/drivers/common/sfc_efx/base/efx.h @@ -17,8 +17,8 @@ extern "C" { #endif -#defineEFX_STATIC_ASSERT(_cond)\ - ((void)sizeof (char[(_cond) ? 1 : -1])) +#defineEFX_STATIC_ASSERT(_cond) \ + do { static_assert((_cond), "assert failed" #_cond); } while (0) #defineEFX_ARRAY_SIZE(_array) \ (sizeof (_array) / sizeof ((_array)[0])) -- 2.43.0
RE: [PATCH] common/sfc: replace out of bounds condition with static_assert
> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Thursday, 18 January 2024 21.18 > > The sfc base code had its own definition of static assertions > using the out of bound array access hack. Replace it with a > static_assert like rte_common.h. > > Fixes: f67e4719147d ("net/sfc/base: fix coding style") > Signed-off-by: Stephen Hemminger > --- > drivers/common/sfc_efx/base/efx.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/common/sfc_efx/base/efx.h > b/drivers/common/sfc_efx/base/efx.h > index 3312c2fa8f81..9ce266c43610 100644 > --- a/drivers/common/sfc_efx/base/efx.h > +++ b/drivers/common/sfc_efx/base/efx.h > @@ -17,8 +17,8 @@ > extern "C" { > #endif > > -#define EFX_STATIC_ASSERT(_cond)\ > - ((void)sizeof (char[(_cond) ? 1 : -1])) > +#define EFX_STATIC_ASSERT(_cond) \ > + do { static_assert((_cond), "assert failed" #_cond); } while (0) This probably works for the DPDK project. For other projects using the same file, it might also need "#include " (containing the static_assert convenience macro for C), and possibly your workaround for toolchain issues with missing C11 macro in FreeBSD. Maybe not in this file, but somewhere. Acked-by: Morten Brørup
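A sketch of what that "somewhere" could look like for a consumer of the header outside DPDK - purely illustrative, not a claim about where sfc_efx actually places it: pull in assert.h for the C11 convenience macro and fall back to the _Static_assert keyword on toolchains whose assert.h predates it.

#include <assert.h>

/* Some older assert.h versions (e.g. on FreeBSD) ship _Static_assert but
 * not the static_assert convenience macro; map one onto the other. */
#if !defined(__cplusplus) && !defined(static_assert)
#define static_assert _Static_assert
#endif

#define EFX_STATIC_ASSERT(_cond) \
	do { static_assert((_cond), "assert failed: " #_cond); } while (0)

static inline void
efx_static_assert_example(void)
{
	/* The do/while wrapper keeps this usable only at statement scope. */
	EFX_STATIC_ASSERT(sizeof(int) >= 4);
}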
Re: [PATCH v5 1/6] eal: introduce RTE_MIN_T() and RTE_MAX_T() macros
On 2024/1/19 0:50, Stephen Hemminger wrote: > These macros work like RTE_MIN and RTE_MAX but take an explicit > type. Necessary when being used in static assertions since > RTE_MIN and RTE_MAX use temporary variables which confuses > compilers constant expression checks. These macros could also > be useful in other scenarios when bounded range is useful. > > Naming is chosen to be similar to Linux kernel conventions. > > Signed-off-by: Stephen Hemminger > Acked-by: Konstantin Ananyev > Acked-by: Andrew Rybchenko > --- Acked-by: Chengwen Feng
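To make the motivation concrete: RTE_MIN/RTE_MAX expand to a GCC statement expression with typeof temporaries, and a statement expression is never a constant expression, so it cannot sit inside static_assert. A typed plain ternary has no temporaries and folds at compile time. The sketch below uses its own macro names and parameter order, which may differ from the exact RTE_MIN_T()/RTE_MAX_T() definitions in the patch:

#include <assert.h>
#include <stdint.h>

/* Classic min with temporaries: safe against double evaluation, but the
 * ({ ... }) block is a statement expression, not a constant expression. */
#define MIN_WITH_TEMPS(a, b) __extension__ ({ \
	typeof(a) _a = (a);                   \
	typeof(b) _b = (b);                   \
	_a < _b ? _a : _b;                    \
})

/* Typed variant: a plain ternary over casts, foldable at compile time. */
#define MIN_T(t, a, b)	((t)(a) < (t)(b) ? (t)(a) : (t)(b))

#define RING_SIZE	4096
#define BURST_MAX	512

/* static_assert(MIN_WITH_TEMPS(RING_SIZE, BURST_MAX) == BURST_MAX, "...");
 * fails: "expression is not an integer constant expression".              */
static_assert(MIN_T(uint32_t, RING_SIZE, BURST_MAX) == BURST_MAX,
	      "bounded burst size mismatch");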
[PATCH 1/5] common/cnxk: reserve CPT LF for Rx inject
An additional CPT LF will be reserved and attached with inline device to enable RXC and use for Rx inject purpose. Signed-off-by: Rahul Bhansali --- Depends-on: series-30819 ("Fixes and improvements in crypto cnxk") drivers/common/cnxk/roc_features.h | 7 +++ drivers/common/cnxk/roc_nix.h | 1 + drivers/common/cnxk/roc_nix_inl.c | 71 -- drivers/common/cnxk/roc_nix_inl.h | 5 +- drivers/common/cnxk/roc_nix_inl_dev.c | 61 +- drivers/common/cnxk/roc_nix_inl_priv.h | 7 ++- drivers/common/cnxk/version.map| 2 + 7 files changed, 123 insertions(+), 31 deletions(-) diff --git a/drivers/common/cnxk/roc_features.h b/drivers/common/cnxk/roc_features.h index f4807ee271..3b512be132 100644 --- a/drivers/common/cnxk/roc_features.h +++ b/drivers/common/cnxk/roc_features.h @@ -83,4 +83,11 @@ roc_feature_nix_has_inl_ipsec(void) { return !roc_model_is_cnf10kb(); } + +static inline bool +roc_feature_nix_has_rx_inject(void) +{ + return (roc_model_is_cn10ka_b0() || roc_model_is_cn10kb()); +} + #endif diff --git a/drivers/common/cnxk/roc_nix.h b/drivers/common/cnxk/roc_nix.h index 84e6fc3df5..eebdd4ecc3 100644 --- a/drivers/common/cnxk/roc_nix.h +++ b/drivers/common/cnxk/roc_nix.h @@ -474,6 +474,7 @@ struct roc_nix { uint32_t meta_buf_sz; bool force_rx_aura_bp; bool custom_meta_aura_ena; + bool rx_inj_ena; /* End of input parameters */ /* LMT line base for "Per Core Tx LMT line" mode*/ uintptr_t lmt_base; diff --git a/drivers/common/cnxk/roc_nix_inl.c b/drivers/common/cnxk/roc_nix_inl.c index 07a90133ca..de8fd2a605 100644 --- a/drivers/common/cnxk/roc_nix_inl.c +++ b/drivers/common/cnxk/roc_nix_inl.c @@ -474,6 +474,34 @@ roc_nix_inl_outb_lf_base_get(struct roc_nix *roc_nix) return (struct roc_cpt_lf *)nix->cpt_lf_base; } +struct roc_cpt_lf * +roc_nix_inl_inb_inj_lf_get(struct roc_nix *roc_nix) +{ + struct nix *nix; + struct idev_cfg *idev = idev_get_cfg(); + struct nix_inl_dev *inl_dev = NULL; + struct roc_cpt_lf *lf = NULL; + + if (!idev) + return NULL; + + inl_dev = idev->nix_inl_dev; + + if (!inl_dev && roc_nix == NULL) + return NULL; + + nix = roc_nix_to_nix_priv(roc_nix); + + if (nix->inb_inl_dev && inl_dev && inl_dev->attach_cptlf && + inl_dev->rx_inj_ena) + return &inl_dev->cpt_lf[inl_dev->nb_cptlf - 1]; + + lf = roc_nix_inl_outb_lf_base_get(roc_nix); + if (lf) + lf += roc_nix->outb_nb_crypto_qs; + return lf; +} + uintptr_t roc_nix_inl_outb_sa_base_get(struct roc_nix *roc_nix) { @@ -512,6 +540,35 @@ roc_nix_inl_inb_sa_base_get(struct roc_nix *roc_nix, bool inb_inl_dev) return (uintptr_t)nix->inb_sa_base; } +bool +roc_nix_inl_inb_rx_inject_enable(struct roc_nix *roc_nix, bool inb_inl_dev) +{ + struct idev_cfg *idev = idev_get_cfg(); + struct nix_inl_dev *inl_dev; + struct nix *nix = NULL; + + if (idev == NULL) + return 0; + + if (!inb_inl_dev && roc_nix == NULL) + return 0; + + if (roc_nix) { + nix = roc_nix_to_nix_priv(roc_nix); + if (!nix->inl_inb_ena) + return 0; + } + + if (inb_inl_dev) { + inl_dev = idev->nix_inl_dev; + if (inl_dev && inl_dev->attach_cptlf && inl_dev->rx_inj_ena && + roc_nix->rx_inj_ena) + return true; + } + + return roc_nix->rx_inj_ena; +} + uint32_t roc_nix_inl_inb_spi_range(struct roc_nix *roc_nix, bool inb_inl_dev, uint32_t *min_spi, uint32_t *max_spi) @@ -941,6 +998,7 @@ roc_nix_inl_outb_init(struct roc_nix *roc_nix) bool ctx_ilen_valid = false; size_t sa_sz, ring_sz; uint8_t ctx_ilen = 0; + bool rx_inj = false; uint16_t sso_pffunc; uint8_t eng_grpmask; uint64_t blkaddr, i; @@ -958,6 +1016,12 @@ roc_nix_inl_outb_init(struct roc_nix *roc_nix) /* Retrieve inline device if present */ 
inl_dev = idev->nix_inl_dev; + if (roc_nix->rx_inj_ena && !(nix->inb_inl_dev && inl_dev && inl_dev->attach_cptlf && +inl_dev->rx_inj_ena)) { + nb_lf++; + rx_inj = true; + } + sso_pffunc = inl_dev ? inl_dev->dev.pf_func : idev_sso_pffunc_get(); /* Use sso_pffunc if explicitly requested */ if (roc_nix->ipsec_out_sso_pffunc) @@ -986,7 +1050,8 @@ roc_nix_inl_outb_init(struct roc_nix *roc_nix) 1ULL << ROC_CPT_DFLT_ENG_GRP_SE_IE | 1ULL << ROC_CPT_DFLT_ENG_GRP_AE); rc = cpt_lfs_alloc(dev, eng_grpmask, blkaddr, - !roc_nix->ipsec_out_sso_pffunc, ctx_ilen_valid, ctx_ilen, false, 0); + !roc_nix->ipsec_out_sso_pf
[PATCH 2/5] net/cnxk: support of Rx inject
Add Rx inject security callback APIs to configure, inject packet to CPT and receive back as in receive path. Devargs "rx_inj_ena=1" will be required to enable the inline IPsec Rx inject feature. If inline device is used then this devarg will be required for both inline device and eth device. Signed-off-by: Rahul Bhansali --- doc/guides/nics/cnxk.rst | 27 +++ drivers/net/cnxk/cn10k_ethdev.c| 4 + drivers/net/cnxk/cn10k_ethdev_sec.c| 48 + drivers/net/cnxk/cn10k_rx.h| 241 - drivers/net/cnxk/cn10k_rxtx.h | 57 ++ drivers/net/cnxk/cn10k_tx.h| 57 -- drivers/net/cnxk/cnxk_ethdev.h | 3 + drivers/net/cnxk/cnxk_ethdev_devargs.c | 8 +- drivers/net/cnxk/cnxk_ethdev_dp.h | 8 + drivers/net/cnxk/cnxk_ethdev_sec.c | 21 ++- 10 files changed, 405 insertions(+), 69 deletions(-) diff --git a/doc/guides/nics/cnxk.rst b/doc/guides/nics/cnxk.rst index 9ec52e380f..39660dba82 100644 --- a/doc/guides/nics/cnxk.rst +++ b/doc/guides/nics/cnxk.rst @@ -416,6 +416,19 @@ Runtime Config Options With the above configuration, PMD would allocate meta buffers of size 512 for inline inbound IPsec processing second pass. +- ``Rx Inject Enable inbound inline IPsec for second pass`` (default ``0``) + + Rx packet inject feature for inbound inline IPsec processing can be enabled + by ``rx_inj_ena`` ``devargs`` parameter. + This option is for OCTEON CN106-B0/CN103XX SoC family. + + For example:: + + -a 0002:02:00.0,rx_inj_ena=1 + + With the above configuration, driver would enable packet inject from ARM cores + to crypto to process and send back in Rx path. + .. note:: Above devarg parameters are configurable per device, user needs to pass the @@ -613,6 +626,20 @@ Runtime Config Options for inline device With the above configuration, driver would poll for aging flows every 50 seconds. +- ``Rx Inject Enable inbound inline IPsec for second pass`` (default ``0``) + + Rx packet inject feature for inbound inline IPsec processing can be enabled + by ``rx_inj_ena`` ``devargs`` parameter with both inline device and ethdev + device. + This option is for OCTEON CN106-B0/CN103XX SoC family. + + For example:: + + -a 0002:1d:00.0,rx_inj_ena=1 + + With the above configuration, driver would enable packet inject from ARM cores + to crypto to process and send back in Rx path. 
+ Debugging Options - diff --git a/drivers/net/cnxk/cn10k_ethdev.c b/drivers/net/cnxk/cn10k_ethdev.c index a2e943a3d0..78d1dca3c1 100644 --- a/drivers/net/cnxk/cn10k_ethdev.c +++ b/drivers/net/cnxk/cn10k_ethdev.c @@ -593,6 +593,10 @@ cn10k_nix_dev_start(struct rte_eth_dev *eth_dev) if (dev->rx_offload_flags & NIX_RX_OFFLOAD_SECURITY_F) cn10k_nix_rx_queue_meta_aura_update(eth_dev); + /* Set flags for Rx Inject feature */ + if (roc_idev_nix_rx_inject_get(nix->port_id)) + dev->rx_offload_flags |= NIX_RX_SEC_REASSEMBLY_F; + cn10k_eth_set_tx_function(eth_dev); cn10k_eth_set_rx_function(eth_dev); return 0; diff --git a/drivers/net/cnxk/cn10k_ethdev_sec.c b/drivers/net/cnxk/cn10k_ethdev_sec.c index 575d0fabd5..42e4867d3c 100644 --- a/drivers/net/cnxk/cn10k_ethdev_sec.c +++ b/drivers/net/cnxk/cn10k_ethdev_sec.c @@ -1253,6 +1253,52 @@ eth_sec_caps_add(struct rte_security_capability eth_sec_caps[], uint32_t *idx, *idx += nb_caps; } +static uint16_t __rte_hot +cn10k_eth_sec_inb_rx_inject(void *device, struct rte_mbuf **pkts, + struct rte_security_session **sess, uint16_t nb_pkts) +{ + struct rte_eth_dev *eth_dev = (struct rte_eth_dev *)device; + struct cnxk_eth_dev *dev = cnxk_eth_pmd_priv(eth_dev); + + return cn10k_nix_inj_pkts(sess, &dev->inj_cfg, pkts, nb_pkts); +} + +static int +cn10k_eth_sec_rx_inject_config(void *device, uint16_t port_id, bool enable) +{ + struct rte_eth_dev *eth_dev = (struct rte_eth_dev *)device; + struct cnxk_eth_dev *dev = cnxk_eth_pmd_priv(eth_dev); + uint64_t channel, pf_func, inj_match_id = 0xUL; + struct cnxk_ethdev_inj_cfg *inj_cfg; + struct roc_nix *nix = &dev->nix; + struct roc_cpt_lf *inl_lf; + uint64_t sa_base; + + if (!rte_eth_dev_is_valid_port(port_id)) + return -EINVAL; + + if (eth_dev->data->dev_started || !eth_dev->data->dev_configured) + return -EBUSY; + + if (!roc_nix_inl_inb_rx_inject_enable(nix, dev->inb.inl_dev)) + return -ENOTSUP; + + roc_idev_nix_rx_inject_set(port_id, enable); + + inl_lf = roc_nix_inl_inb_inj_lf_get(nix); + sa_base = roc_nix_inl_inb_sa_base_get(nix, dev->inb.inl_dev); + + inj_cfg = &dev->inj_cfg; + inj_cfg->sa_base = sa_base | eth_dev->data->port_id; + inj_cfg->io_addr = inl_lf->io_addr; + inj_cfg->lmt_base = nix->lmt_base; + channel = roc
[PATCH 3/5] common/cnxk: fix for inline dev pointer check
Add missing check of Inline device pointer before accessing is_multi_channel variable. Fixes: 7ea187184a51 ("common/cnxk: support 1-N pool-aura per NIX LF") Cc: sta...@dpdk.org Signed-off-by: Rahul Bhansali --- drivers/common/cnxk/roc_nix_inl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/common/cnxk/roc_nix_inl.c b/drivers/common/cnxk/roc_nix_inl.c index de8fd2a605..a205c658e9 100644 --- a/drivers/common/cnxk/roc_nix_inl.c +++ b/drivers/common/cnxk/roc_nix_inl.c @@ -933,7 +933,8 @@ roc_nix_inl_inb_init(struct roc_nix *roc_nix) inl_dev = idev->nix_inl_dev; roc_nix->custom_meta_aura_ena = (roc_nix->local_meta_aura_ena && -(inl_dev->is_multi_channel || roc_nix->custom_sa_action)); +((inl_dev && inl_dev->is_multi_channel) || + roc_nix->custom_sa_action)); if (!roc_model_is_cn9k() && !roc_errata_nix_no_meta_aura()) { nix->need_meta_aura = true; if (!roc_nix->local_meta_aura_ena || roc_nix->custom_meta_aura_ena) -- 2.25.1
[PATCH 4/5] net/cnxk: fix to add reassembly fast path flag
For IPsec decrypted packets, full packet format condition check is enabled for both reassembly and non-reassembly path as part of OOP handling. Instead, it should be only in reassembly path. To fix this, NIX_RX_REAS_F flag condition is added to avoid packet format check in non-reassembly fast path. Fixes: 5e9e008d0127 ("net/cnxk: support inline ingress out-of-place session") Cc: sta...@dpdk.org Signed-off-by: Rahul Bhansali --- drivers/net/cnxk/cn10k_rx.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/cnxk/cn10k_rx.h b/drivers/net/cnxk/cn10k_rx.h index c4ad1b64fe..89621af3fb 100644 --- a/drivers/net/cnxk/cn10k_rx.h +++ b/drivers/net/cnxk/cn10k_rx.h @@ -734,7 +734,7 @@ nix_cqe_xtract_mseg(const union nix_rx_parse_u *rx, struct rte_mbuf *mbuf, else wqe = (const uint64_t *)(mbuf + 1); - if (hdr->w0.pkt_fmt != ROC_IE_OT_SA_PKT_FMT_FULL) + if (!(flags & NIX_RX_REAS_F) || hdr->w0.pkt_fmt != ROC_IE_OT_SA_PKT_FMT_FULL) rx = (const union nix_rx_parse_u *)(wqe + 1); } -- 2.25.1
[PATCH 5/5] net/cnxk: select optimized LLC transaction type
LLC transaction optimization by using LDWB LDTYPE option in SG preparation for Tx. With this, if data is present and dirty in LLC then the LLC would mark the data clean. Signed-off-by: Rahul Bhansali --- drivers/net/cnxk/cn10k_tx.h | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h index 664e47e1fc..fcd19be77e 100644 --- a/drivers/net/cnxk/cn10k_tx.h +++ b/drivers/net/cnxk/cn10k_tx.h @@ -331,9 +331,15 @@ cn10k_nix_tx_skeleton(struct cn10k_eth_txq *txq, uint64_t *cmd, else cmd[2] = NIX_SUBDC_EXT << 60; cmd[3] = 0; - cmd[4] = (NIX_SUBDC_SG << 60) | BIT_ULL(48); + if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)) + cmd[4] = (NIX_SUBDC_SG << 60) | (NIX_SENDLDTYPE_LDWB << 58) | BIT_ULL(48); + else + cmd[4] = (NIX_SUBDC_SG << 60) | BIT_ULL(48); } else { - cmd[2] = (NIX_SUBDC_SG << 60) | BIT_ULL(48); + if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)) + cmd[2] = (NIX_SUBDC_SG << 60) | (NIX_SENDLDTYPE_LDWB << 58) | BIT_ULL(48); + else + cmd[2] = (NIX_SUBDC_SG << 60) | BIT_ULL(48); } } @@ -1989,7 +1995,11 @@ cn10k_nix_xmit_pkts_vector(void *tx_queue, uint64_t *ws, senddesc01_w1 = vdupq_n_u64(0); senddesc23_w1 = senddesc01_w1; - sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | BIT_ULL(48)); + if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F)) + sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | (NIX_SENDLDTYPE_LDWB << 58) | + BIT_ULL(48)); + else + sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | BIT_ULL(48)); sgdesc23_w0 = sgdesc01_w0; if (flags & NIX_TX_NEED_EXT_HDR) { -- 2.25.1
[PATCH] test/security: add inline IPsec Rx inject test
Add test for inline IPsec Rx inject verification. This test case will inject the known vector to crypto HW from ethdev and verifies it back with decrypted packet from ethdev Rx. Signed-off-by: Rahul Bhansali --- app/test/test_security_inline_proto.c | 325 ++ app/test/test_security_inline_proto_vectors.h | 27 ++ 2 files changed, 352 insertions(+) diff --git a/app/test/test_security_inline_proto.c b/app/test/test_security_inline_proto.c index 78a2064b65..385b8d86c8 100644 --- a/app/test/test_security_inline_proto.c +++ b/app/test/test_security_inline_proto.c @@ -829,6 +829,232 @@ verify_inbound_oop(struct ipsec_test_data *td, return ret; } +static int +test_ipsec_with_rx_inject(struct ip_pkt_vector *vector, const struct ipsec_test_flags *flags) +{ + struct rte_security_session_conf sess_conf_out = {0}; + struct rte_security_session_conf sess_conf_in = {0}; + uint32_t nb_tx, burst_sz, nb_sent = 0, nb_inj = 0; + void *out_ses[ENCAP_DECAP_BURST_SZ] = {0}; + void *in_ses[ENCAP_DECAP_BURST_SZ] = {0}; + struct rte_crypto_sym_xform cipher_out = {0}; + struct rte_crypto_sym_xform cipher_in = {0}; + struct rte_crypto_sym_xform auth_out = {0}; + struct rte_crypto_sym_xform aead_out = {0}; + struct rte_crypto_sym_xform auth_in = {0}; + struct rte_crypto_sym_xform aead_in = {0}; + uint32_t i, j, nb_rx = 0, nb_inj_rx = 0; + struct rte_mbuf **inj_pkts_burst; + struct ipsec_test_data sa_data; + uint32_t ol_flags; + bool outer_ipv4; + int ret = 0; + void *ctx; + + inj_pkts_burst = (struct rte_mbuf **)rte_calloc("inj_buff", + MAX_TRAFFIC_BURST, + sizeof(void *), + RTE_CACHE_LINE_SIZE); + if (!inj_pkts_burst) + return TEST_FAILED; + + burst_sz = vector->burst ? ENCAP_DECAP_BURST_SZ : 1; + nb_tx = burst_sz; + + memset(tx_pkts_burst, 0, sizeof(tx_pkts_burst[0]) * nb_tx); + memset(rx_pkts_burst, 0, sizeof(rx_pkts_burst[0]) * nb_tx); + memset(inj_pkts_burst, 0, sizeof(inj_pkts_burst[0]) * nb_tx); + + memcpy(&sa_data, vector->sa_data, sizeof(struct ipsec_test_data)); + sa_data.ipsec_xform.direction = RTE_SECURITY_IPSEC_SA_DIR_EGRESS; + outer_ipv4 = is_outer_ipv4(&sa_data); + + for (i = 0; i < nb_tx; i++) { + tx_pkts_burst[i] = init_packet(mbufpool, + vector->full_pkt->data, + vector->full_pkt->len, outer_ipv4); + if (tx_pkts_burst[i] == NULL) { + ret = -1; + printf("\n packed init failed\n"); + goto out; + } + } + + for (i = 0; i < burst_sz; i++) { + memcpy(&sa_data, vector->sa_data, sizeof(struct ipsec_test_data)); + /* Update SPI for every new SA */ + sa_data.ipsec_xform.spi += i; + sa_data.ipsec_xform.direction = RTE_SECURITY_IPSEC_SA_DIR_EGRESS; + if (sa_data.aead) { + sess_conf_out.crypto_xform = &aead_out; + } else { + sess_conf_out.crypto_xform = &cipher_out; + sess_conf_out.crypto_xform->next = &auth_out; + } + + /* Create Inline IPsec outbound session. 
*/ + ret = create_inline_ipsec_session(&sa_data, port_id, + &out_ses[i], &ctx, &ol_flags, flags, + &sess_conf_out); + if (ret) { + printf("\nInline outbound session create failed\n"); + goto out; + } + } + + for (i = 0; i < nb_tx; i++) { + if (ol_flags & RTE_SECURITY_TX_OLOAD_NEED_MDATA) + rte_security_set_pkt_metadata(ctx, + out_ses[i], tx_pkts_burst[i], NULL); + tx_pkts_burst[i]->ol_flags |= RTE_MBUF_F_TX_SEC_OFFLOAD; + } + + for (i = 0; i < burst_sz; i++) { + memcpy(&sa_data, vector->sa_data, sizeof(struct ipsec_test_data)); + /* Update SPI for every new SA */ + sa_data.ipsec_xform.spi += i; + sa_data.ipsec_xform.direction = RTE_SECURITY_IPSEC_SA_DIR_INGRESS; + + if (sa_data.aead) { + sess_conf_in.crypto_xform = &aead_in; + } else { + sess_conf_in.crypto_xform = &auth_in; + sess_conf_in.crypto_xform->next = &cipher_in; + } + /* Create Inline IPsec inbound session. */ + ret = create_inline_ipsec_session(&sa_data, port_id, &in_ses[i], + &ctx, &ol_flags, flags, &sess_conf_in); + if (ret) { + printf("\nInline inbound session crea
Re: [PATCH] vhost: fix deadlock during software live migration of VDPA in a nested virtualization environment
在 2024/1/18 22:46, David Marchand 写道: Hello, On Thu, Jan 18, 2024 at 11:34 AM Hao Chen wrote: In a nested virtualization environment, running dpdk vdpa in QEMU-L1 for software live migration will result in a deadlock between dpdke-vdpa and QEMU-L2 processes. rte_vdpa_relay_vring_used-> __vhost_iova_to_vva-> vhost_user_iotlb_rd_unlock(vq)-> vhost_user_iotlb_miss-> send vhost message VHOST_USER_SLAVE_IOTLB_MSG to QEMU's vdpa socket, then call vhost_user_iotlb_rd_lock(vq) to hold the read lock `iotlb_lock`. But there is no place to release this read lock. QEMU L2 get the VHOST_USER_SLAVE_IOTLB_MSG, then call vhost_user_send_device_iotlb_msg to send VHOST_USER_IOTLB_MSG messages to dpdk-vdpa. Dpdk vdpa will call vhost_user_iotlb_msg-> vhost_user_iotlb_cache_insert, here, will obtain the write lock `iotlb_lock`, but the read lock `iotlb_lock` has not been released and will block here. This patch add lock and unlock function to fix the deadlock. Please identify the commit that first had this issue and add a Fixes: tag. Ok. Signed-off-by: Hao Chen --- lib/vhost/vdpa.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/lib/vhost/vdpa.c b/lib/vhost/vdpa.c index 9776fc07a9..9132414209 100644 --- a/lib/vhost/vdpa.c +++ b/lib/vhost/vdpa.c @@ -19,6 +19,7 @@ #include "rte_vdpa.h" #include "vdpa_driver.h" #include "vhost.h" +#include "iotlb.h" /** Double linked list of vDPA devices. */ TAILQ_HEAD(vdpa_device_list, rte_vdpa_device); @@ -193,10 +194,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void *vring_m) if (unlikely(nr_descs > vq->size)) return -1; + vhost_user_iotlb_rd_lock(vq); desc_ring = (struct vring_desc *)(uintptr_t) vhost_iova_to_vva(dev, vq, vq->desc[desc_id].addr, &dlen, VHOST_ACCESS_RO); + vhost_user_iotlb_rd_unlock(vq); if (unlikely(!desc_ring)) return -1; @@ -220,9 +223,12 @@ rte_vdpa_relay_vring_used(int vid, uint16_t qid, void *vring_m) if (unlikely(nr_descs-- == 0)) goto fail; desc = desc_ring[desc_id]; - if (desc.flags & VRING_DESC_F_WRITE) + if (desc.flags & VRING_DESC_F_WRITE) { + vhost_user_iotlb_rd_lock(vq); vhost_log_write_iova(dev, vq, desc.addr, desc.len); + vhost_user_iotlb_rd_unlock(vq); + } desc_id = desc.next; } while (desc.flags & VRING_DESC_F_NEXT); Interesting, I suspected a bug in this area as clang was complaining. Please try to remove the __rte_no_thread_safety_analysis annotation and compile with clang. https://git.dpdk.org/dpdk/tree/lib/vhost/vdpa.c#n150 You will get: ccache clang -Ilib/librte_vhost.a.p -Ilib -I../lib -Ilib/vhost -I../lib/vhost -I. -I.. 
-Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/ethdev -I../lib/ethdev -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Ilib/cryptodev -I../lib/cryptodev -Ilib/rcu -I../lib/rcu -Ilib/hash -I../lib/hash -Ilib/pci -I../lib/pci -Ilib/dmadev -I../lib/dmadev -fcolor-diagnostics -fsanitize=address -fno-omit-frame-pointer -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O0 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -DVHOST_CLANG_UNROLL_PRAGMA -fno-strict-aliasing -DVHOST_HAS_VDUSE -DRTE_LOG_DEFAULT_LOGTYPE=lib.vhost -DRTE_ANNOTATE_LOCKS -Wthread-safety -MD -MQ lib/librte_vhost.a.p/vhost_vdpa.c.o -MF lib/librte_vhost.a.p/vhost_vdpa.c.o.d -o lib/librte_vhost.a.p/vhost_vdpa.c.o -c ../lib/vhost/vdpa.c ../lib/vhost/vdpa.c:196:5: error: calling function 'vhost_iova_to_vva' requires holding mutex 'vq->iotlb_lock' [-Werror,-Wthread-safety-analysis] vhost_iova_to_vva(dev, vq, ^ ../lib/vhost/vdpa.c:203:13: error: calling fun
Comments/Doc error for CRYPTODEV of DPDK
Hi maintainer of DPDK, I've noticed an error on comment of DPDK version 23.11 rte_cryptodev.h: 928-930 /** * Create a symmetric session mempool. * * @param name * The unique mempool name. * @param nb_elts * The number of elements in the mempool. * @param elt_size * The size of the element. This value will be ignored if it is smaller than * the minimum session header size required for the system. For the user who * want to use the same mempool for sym session and session private data it * can be the maximum value of all existing devices' private data and session * header sizes. * @param cache_size * The number of per-lcore cache elements * @param priv_size * The private data size of each session. * @param socket_id * The *socket_id* argument is the socket identifier in the case of * NUMA. The value can be *SOCKET_ID_ANY* if there is no NUMA * constraint for the reserved zone. * * @return * - On success return size of the session * - On failure returns 0 */ __rte_experimental struct rte_mempool * rte_cryptodev_sym_session_pool_create(const char *name, uint32_t nb_elts, uint32_t elt_size, uint32_t cache_size, uint16_t priv_size, int socket_id); But the return value of this function seems to be a pointer to the mempool created or NULL pointer, instead of the mempool size. Could you please check it? Thank you and BR, Songyi
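Agreed that the documented return value is misleading: the function hands back a mempool pointer (or NULL on failure), so callers should test the pointer, not a size. A small usage sketch based only on the prototype quoted above (the pool name and sizing values are arbitrary examples):

#include <stdio.h>
#include <rte_cryptodev.h>
#include <rte_mempool.h>

static struct rte_mempool *
example_create_session_pool(int socket_id)
{
	struct rte_mempool *mp;

	mp = rte_cryptodev_sym_session_pool_create("example_sess_pool",
						   2048,	/* nb_elts */
						   0,	/* elt_size below minimum: library sizes it */
						   64,	/* per-lcore cache */
						   0,	/* priv_size */
						   socket_id);
	if (mp == NULL) {
		/* NULL, not a zero size, is the failure indication. */
		printf("cannot create symmetric session pool\n");
		return NULL;
	}
	return mp;
}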
[PATCH] telemetry: correct json empty dictionaries
Fix to allow telemetry to handle empty dictionaries correctly. This patch resolves an issue where empty dictionaries are reported by telemetry as '[]' rather than '{}'. Initializing the output buffer based on the container type resolves the issue. Signed-off-by: Jonathan Erb --- .mailmap | 2 +- lib/telemetry/telemetry.c | 6 +- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/.mailmap b/.mailmap index ab0742a382..a6b66ab3ad 100644 --- a/.mailmap +++ b/.mailmap @@ -675,7 +675,7 @@ John Ousterhout John Romein John W. Linville Jonas Pfefferle -Jonathan Erb +Jonathan Erb Jonathan Tsai Jon DeVree Jon Loeliger diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c index 92982842a8..eef4ac7bb7 100644 --- a/lib/telemetry/telemetry.c +++ b/lib/telemetry/telemetry.c @@ -169,7 +169,11 @@ container_to_json(const struct rte_tel_data *d, char *out_buf, size_t buf_len) d->type != TEL_ARRAY_INT && d->type != TEL_ARRAY_STRING) return snprintf(out_buf, buf_len, "null"); - used = rte_tel_json_empty_array(out_buf, buf_len, 0); + if (d->type == TEL_DICT) + used = rte_tel_json_empty_obj(out_buf, buf_len, 0); + else + used = rte_tel_json_empty_array(out_buf, buf_len, 0); + if (d->type == TEL_ARRAY_UINT) for (i = 0; i < d->data_len; i++) used = rte_tel_json_add_array_uint(out_buf, -- 2.34.1
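For context on how such an empty dictionary reaches container_to_json, here is a hedged sketch of a telemetry callback that starts a dict but, under some conditions, adds no entries; before this fix the client would see [] for the endpoint, afterwards the correct {} (the command name and handler are made up for illustration):

#include <rte_common.h>
#include <rte_telemetry.h>

/* Hypothetical handler for a made-up /example/empty_dict command. */
static int
example_handle_empty_dict(const char *cmd __rte_unused,
			  const char *params __rte_unused,
			  struct rte_tel_data *d)
{
	rte_tel_data_start_dict(d);
	/* Nothing matched the request, so no entries are added; the JSON
	 * reply should still be the empty object {}, not the empty array []. */
	return 0;
}

RTE_INIT(example_telemetry_init)
{
	rte_telemetry_register_cmd("/example/empty_dict",
				   example_handle_empty_dict,
				   "Returns an empty dictionary");
}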
Re: Comments/Doc error for CRYPTODEV of DPDK
Hello, On Fri, Jan 19, 2024 at 8:48 AM Wang, Songyi wrote: > > Hi maintainer of DPDK, Redirecting to the cryptodev maintainers. Akhil, Fan, can you have a look? Thanks. > > > > I’ve noticed an error on comment of DPDK version 23.11 rte_cryptodev.h: > 928-930 -- David Marchand