[dpdk-dev] [PATCH v2 4/4] net/iavf: fix vector mapping with queue
Fix the vector mapping with queues by wrapping the vector id back to
IAVF_RX_VEC_START once it exceeds IAVF_RX_VEC_START + nb_msix, instead
of wrapping at nb_msix alone.

Signed-off-by: Jingjing Wu

---
 drivers/net/iavf/iavf_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 06395f852b..6182fc6f4f 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -578,7 +578,7 @@ static int iavf_config_rx_queues_irqs(struct rte_eth_dev *dev,
 		qv_map[i].queue_id = i;
 		qv_map[i].vector_id = vec;
 		intr_handle->intr_vec[i] = vec++;
-		if (vec >= vf->nb_msix)
+		if (vec >= vf->nb_msix + IAVF_RX_VEC_START)
 			vec = IAVF_RX_VEC_START;
 	}
 	vf->qv_map = qv_map;
-- 
2.21.1
Re: [dpdk-dev] [PATCH 1/1] mbuf: add extern "C" to rte_mbuf_dyn.h
On Thu, Jan 7, 2021 at 2:42 AM Ashish Sadanandan wrote: > > Hi Olivier, > > On Wed, Jan 6, 2021 at 6:21 AM Olivier Matz wrote: > > > > Hi Ashish, > > > > Yes, it should reference the patch that introduced the issue. In this case, > > it should be: > > > > Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags") > > > > Can you please also change the title to start with "fix", for > > instance like this: > > > > mbuf: fix inclusion from c++ > > > > You can also add "Cc: sta...@dpdk.org" in the commit log. > > > > Some documentation can be found in doc/guides/contributing/patches.rst > > Thanks for the pointers. I've made the changes above. However I don't > know if I'm using the message ID for --in-reply-to correctly, since I > seem to be creating a new thread each time. You can get the message ID in various places. I usually take the info from patchwork and update the previous patch state to Superseded at the same time. So in your case, the v1 patch was: http://patchwork.dpdk.org/patch/85878/ Message ID: 20201229194144.17824-1-ashish.sadanan...@gmail.com So for the v2: git send-email --to dev@dpdk.org --cc-cmd ./devtools/get-maintainer.sh --cc XXX --in-reply-to '20201229194144.17824-1-ashish.sadanan...@gmail.com' v2-*.patch -- David Marchand
[dpdk-dev] [PATCH v3 1/3] doc: fix testpmd command for i40e RSS flow
From: Alvin Zhang

The command here does not create a queue region, but only sets the
lookup table, so the description in the doc is not accurate.

Signed-off-by: Alvin Zhang
Fixes: feaae285b342 ("net/i40e: support hash configuration in RSS flow")
Cc: sta...@dpdk.org

---
V2: Split the patch into a three-patch series and delete two unused
functions.
---
 doc/guides/nics/i40e.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 4e5c467..64f20e7 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -562,9 +562,9 @@ Generic flow API
 - ``RSS Flow``
 
   RSS Flow supports to set hash input set, hash function, enable hash
-  and configure queue region.
+  and configure queues.
   For example:
-  Configure queue region as queue 0, 1, 2, 3.
+  Configure queues as queue 0, 1, 2, 3.
 
 .. code-block:: console
-- 
1.8.3.1
[dpdk-dev] [PATCH v3 2/3] net/i40e: fix return value
From: Alvin Zhang

The API should return the system error status, but it returned the
hardware error status, which is confusing for the caller. This patch
adds a check on the hardware execution status and returns -EIO in
case of hardware execution failure.

Signed-off-by: Alvin Zhang
Fixes: 1d4b2b4966bb ("net/i40e: fix VF overwrite PF RSS LUT for X722")
Fixes: d0a349409bd7 ("i40e: support AQ based RSS config")
Cc: sta...@dpdk.org

---
 drivers/net/i40e/i40e_ethdev.c | 33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index f54769c..2034008 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -4426,7 +4426,6 @@ static int i40e_dev_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 {
 	struct i40e_pf *pf;
 	struct i40e_hw *hw;
-	int ret;
 
 	if (!vsi || !lut)
 		return -EINVAL;
@@ -4435,12 +4434,16 @@ static int i40e_dev_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 	hw = I40E_VSI_TO_HW(vsi);
 
 	if (pf->flags & I40E_FLAG_RSS_AQ_CAPABLE) {
-		ret = i40e_aq_set_rss_lut(hw, vsi->vsi_id,
-					  vsi->type != I40E_VSI_SRIOV,
-					  lut, lut_size);
-		if (ret) {
-			PMD_DRV_LOG(ERR, "Failed to set RSS lookup table");
-			return ret;
+		enum i40e_status_code status;
+
+		status = i40e_aq_set_rss_lut(hw, vsi->vsi_id,
+					     vsi->type != I40E_VSI_SRIOV,
+					     lut, lut_size);
+		if (status) {
+			PMD_DRV_LOG(ERR,
+				    "Failed to update RSS lookup table, error status: %d",
+				    status);
+			return -EIO;
 		}
 	} else {
 		uint32_t *lut_dw = (uint32_t *)lut;
@@ -7591,7 +7594,6 @@ struct i40e_vsi *
 	uint16_t key_idx = (vsi->type == I40E_VSI_SRIOV) ?
 			I40E_VFQF_HKEY_MAX_INDEX :
 			I40E_PFQF_HKEY_MAX_INDEX;
-	int ret = 0;
 
 	if (!key || key_len == 0) {
 		PMD_DRV_LOG(DEBUG, "No key to be configured");
@@ -7604,11 +7606,16 @@ struct i40e_vsi *
 	if (pf->flags & I40E_FLAG_RSS_AQ_CAPABLE) {
 		struct i40e_aqc_get_set_rss_key_data *key_dw =
-			(struct i40e_aqc_get_set_rss_key_data *)key;
+				(struct i40e_aqc_get_set_rss_key_data *)key;
+		enum i40e_status_code status =
+			i40e_aq_set_rss_key(hw, vsi->vsi_id, key_dw);
 
-		ret = i40e_aq_set_rss_key(hw, vsi->vsi_id, key_dw);
-		if (ret)
-			PMD_INIT_LOG(ERR, "Failed to configure RSS key via AQ");
+		if (status) {
+			PMD_DRV_LOG(ERR,
+				    "Failed to configure RSS key via AQ, error status: %d",
+				    status);
+			return -EIO;
+		}
 	} else {
 		uint32_t *hash_key = (uint32_t *)key;
 		uint16_t i;
@@ -7628,7 +7635,7 @@ struct i40e_vsi *
 		I40E_WRITE_FLUSH(hw);
 	}
 
-	return ret;
+	return 0;
 }
 
 static int
-- 
1.8.3.1
[dpdk-dev] [PATCH v3 3/3] net/i40e: refactor RSS flow
From: Alvin Zhang 1. Delete original code. 2. Add 2 tables(One maps flow pattern and RSS type to PCTYPE, another maps RSS type to input set). 3. Parse RSS pattern and RSS type to get PCTYPE. 4. Parse RSS action to get queues, RSS function and hash field. 5. Create and destroy RSS filters. 6. Create new files for hash flows. Signed-off-by: Alvin Zhang --- V3: Refine RSS key --- drivers/net/i40e/i40e_ethdev.c | 905 +- drivers/net/i40e/i40e_ethdev.h | 53 +- drivers/net/i40e/i40e_flow.c | 617 +- drivers/net/i40e/i40e_hash.c | 1383 drivers/net/i40e/i40e_hash.h | 34 + drivers/net/i40e/meson.build |1 + 6 files changed, 1625 insertions(+), 1368 deletions(-) create mode 100644 drivers/net/i40e/i40e_hash.c create mode 100644 drivers/net/i40e/i40e_hash.h diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 2034008..b8c2cf3 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -39,6 +39,7 @@ #include "i40e_pf.h" #include "i40e_regs.h" #include "rte_pmd_i40e.h" +#include "i40e_hash.h" #define ETH_I40E_FLOATING_VEB_ARG "enable_floating_veb" #define ETH_I40E_FLOATING_VEB_LIST_ARG "floating_veb_list" @@ -396,7 +397,6 @@ static int i40e_sw_tunnel_filter_insert(struct i40e_pf *pf, static void i40e_tunnel_filter_restore(struct i40e_pf *pf); static void i40e_filter_restore(struct i40e_pf *pf); static void i40e_notify_all_vfs_link_status(struct rte_eth_dev *dev); -static int i40e_pf_config_rss(struct i40e_pf *pf); static const char *const valid_keys[] = { ETH_I40E_FLOATING_VEB_ARG, @@ -1764,10 +1764,6 @@ static inline void i40e_config_automask(struct i40e_pf *pf) /* initialize queue region configuration */ i40e_init_queue_region_conf(dev); - /* initialize RSS configuration from rte_flow */ - memset(&pf->rss_info, 0, - sizeof(struct i40e_rte_flow_rss_conf)); - /* reset all stats of the device, including pf and main vsi */ i40e_dev_stats_reset(dev); @@ -7576,7 +7572,7 @@ struct i40e_vsi * } /* Disable RSS */ -static void +void 
i40e_pf_disable_rss(struct i40e_pf *pf) { struct i40e_hw *hw = I40E_PF_TO_HW(pf); @@ -8789,7 +8785,7 @@ i40e_status_code i40e_replace_gtp_cloud_filter(struct i40e_pf *pf) } /* Calculate the maximum number of contiguous PF queues that are configured */ -static int +int i40e_pf_calc_configured_queues_num(struct i40e_pf *pf) { struct rte_eth_dev_data *data = pf->dev_data; @@ -8808,19 +8804,72 @@ i40e_status_code i40e_replace_gtp_cloud_filter(struct i40e_pf *pf) return num; } -/* Configure RSS */ -static int -i40e_pf_config_rss(struct i40e_pf *pf) +/* Reset the global configure of hash function and input sets */ +static void +i40e_pf_global_rss_reset(struct i40e_pf *pf) { - enum rte_eth_rx_mq_mode mq_mode = pf->dev_data->dev_conf.rxmode.mq_mode; struct i40e_hw *hw = I40E_PF_TO_HW(pf); - struct rte_eth_rss_conf rss_conf; - uint32_t i, lut = 0; - uint16_t j, num; + uint32_t reg, reg_val; + int i; - /* -* If both VMDQ and RSS enabled, not all of PF queues are configured. -* It's necessary to calculate the actual PF queues that are configured. 
+ /* Reset global RSS function sets */ + reg_val = i40e_read_rx_ctl(hw, I40E_GLQF_CTL); + if (!(reg_val & I40E_GLQF_CTL_HTOEP_MASK)) { + reg_val |= I40E_GLQF_CTL_HTOEP_MASK; + i40e_write_global_rx_ctl(hw, I40E_GLQF_CTL, reg_val); + } + + for (i = 0; i <= I40E_FILTER_PCTYPE_L2_PAYLOAD; i++) { + uint64_t inset; + int j, pctype; + + if (hw->mac.type == I40E_MAC_X722) + pctype = i40e_read_rx_ctl(hw, I40E_GLQF_FD_PCTYPES(i)); + else + pctype = i; + + /* Reset pctype insets */ + inset = i40e_get_default_input_set(i); + if (inset) { + pf->hash_input_set[pctype] = inset; + inset = i40e_translate_input_set_reg(hw->mac.type, +inset); + + reg = I40E_GLQF_HASH_INSET(0, pctype); + i40e_check_write_global_reg(hw, reg, (uint32_t)inset); + reg = I40E_GLQF_HASH_INSET(1, pctype); + i40e_check_write_global_reg(hw, reg, + (uint32_t)(inset >> 32)); + + /* Clear unused mask registers of the pctype */ + for (j = 0; j < I40E_INSET_MASK_NUM_REG; j++) { + reg = I40E_GLQF_HASH_MSK(j, pctype); + i40e_check_write_global_reg(hw, reg, 0); + } + } + + /
[dpdk-dev] [PATCH v7] net/iavf: fix invalid RSS combinations rule can be created
Currently, when using the 'flow' command to create a rule that combines
several RSS types, the rule is created successfully even if the RSS
type combination is invalid.

Here are some invalid RSS combinations:
- ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP
- ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP

For 'ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP' (same for IPV6), this
patch adds these combinations to the 'invalid_rss_comb' array for the
validity check; if the check fails, rule creation fails.

Fixes: 91f27b2e39ab ("net/iavf: refactor RSS")

Signed-off-by: Murphy Yang

---
v7:
- Remove unsupported RSS combinations array.
- Restore 'ETH_RSS_GTPU' in input set mask.
v6:
- Add unsupported RSS combinations array.
v5:
- Remove 'ETH_RSS_GTPU' from input set mask.
v4:
- Use 'ETH_RSS_XXX' to replace 'IAVF_RSS_TYPE_INNER_XXX'.
v3:
- Update the comments.
v2:
- Add invalid RSS combinations.

 drivers/net/iavf/iavf_hash.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c
index 7620876b58..ebaac58254 100644
--- a/drivers/net/iavf/iavf_hash.c
+++ b/drivers/net/iavf/iavf_hash.c
@@ -863,7 +863,9 @@ static void iavf_refine_proto_hdrs(struct virtchnl_proto_hdrs *proto_hdrs,
 static uint64_t invalid_rss_comb[] = {
 	ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_UDP,
+	ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP,
 	ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_UDP,
+	ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP,
 	RTE_ETH_RSS_L3_PRE32 | RTE_ETH_RSS_L3_PRE40 |
 	RTE_ETH_RSS_L3_PRE48 | RTE_ETH_RSS_L3_PRE56 | RTE_ETH_RSS_L3_PRE96
-- 
2.17.1
[dpdk-dev] [PATCH v2 0/3] AVX512 vPMD on i40e
This patchset aims to support AVX512 vPMD on i40e. The changes only
target the AVX512 vector path.

---
v2:
- Add return value check on rte_mempool_default_cache().

Leyi Rong (3):
  net/i40e: remove devarg use-latest-supported-vec
  net/i40e: add AVX512 vector path
  net/i40e: optimize Tx by using AVX512

 doc/guides/nics/i40e.rst                |    9 -
 drivers/net/i40e/i40e_ethdev.c          |   63 +-
 drivers/net/i40e/i40e_ethdev.h          |    3 -
 drivers/net/i40e/i40e_rxtx.c            |  193 ++--
 drivers/net/i40e/i40e_rxtx.h            |   13 +
 drivers/net/i40e/i40e_rxtx_vec_avx512.c | 1136 +++
 drivers/net/i40e/meson.build            |   24 +
 7 files changed, 1293 insertions(+), 148 deletions(-)
 create mode 100644 drivers/net/i40e/i40e_rxtx_vec_avx512.c

-- 
2.17.1
[dpdk-dev] [PATCH v2 2/3] net/i40e: add AVX512 vector path
Add AVX512 support for i40e PMD. This patch adds i40e_rxtx_vec_avx512.c to support i40e AVX512 vPMD. This patch aims to enable AVX512 on i40e vPMD. Main changes are focus on Rx path compared with AVX2 vPMD. Signed-off-by: Leyi Rong Signed-off-by: Bruce Richardson --- drivers/net/i40e/i40e_rxtx.c| 117 ++- drivers/net/i40e/i40e_rxtx.h|9 + drivers/net/i40e/i40e_rxtx_vec_avx512.c | 1024 +++ drivers/net/i40e/meson.build| 24 + 4 files changed, 1148 insertions(+), 26 deletions(-) create mode 100644 drivers/net/i40e/i40e_rxtx_vec_avx512.c diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index 2910619fa5..8357fb3ef8 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -1742,6 +1742,10 @@ i40e_dev_supported_ptypes_get(struct rte_eth_dev *dev) dev->rx_pkt_burst == i40e_recv_scattered_pkts || dev->rx_pkt_burst == i40e_recv_scattered_pkts_vec || dev->rx_pkt_burst == i40e_recv_pkts_vec || +#ifdef CC_AVX512_SUPPORT + dev->rx_pkt_burst == i40e_recv_scattered_pkts_vec_avx512 || + dev->rx_pkt_burst == i40e_recv_pkts_vec_avx512 || +#endif dev->rx_pkt_burst == i40e_recv_scattered_pkts_vec_avx2 || dev->rx_pkt_burst == i40e_recv_pkts_vec_avx2) return ptypes; @@ -3102,6 +3106,7 @@ i40e_set_rx_function(struct rte_eth_dev *dev) I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); uint16_t rx_using_sse, i; bool use_avx2 = false; + bool use_avx512 = false; /* In order to allow Vector Rx there are a few configuration * conditions to be met and Rx Bulk Allocation should be allowed. 
*/ @@ -3125,9 +3130,19 @@ i40e_set_rx_function(struct rte_eth_dev *dev) } } - if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || -rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && - rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) + if (rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1 && + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) == 1) +#ifdef CC_AVX512_SUPPORT + use_avx512 = true; +#else + PMD_DRV_LOG(NOTICE, + "AVX512 is not supported in build env"); +#endif + if (!use_avx512 && + (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2) == 1 || + rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) == 1) && + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_256) use_avx2 = true; } } @@ -3135,21 +3150,41 @@ i40e_set_rx_function(struct rte_eth_dev *dev) if (ad->rx_vec_allowed && rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_128) { if (dev->data->scattered_rx) { - PMD_INIT_LOG(DEBUG, - "Using %sVector Scattered Rx (port %d).", - use_avx2 ? "avx2 " : "", - dev->data->port_id); - dev->rx_pkt_burst = use_avx2 ? - i40e_recv_scattered_pkts_vec_avx2 : - i40e_recv_scattered_pkts_vec; + if (use_avx512) { +#ifdef CC_AVX512_SUPPORT + PMD_DRV_LOG(NOTICE, + "Using AVX512 Vector Scattered Rx (port %d).", + dev->data->port_id); + dev->rx_pkt_burst = + i40e_recv_scattered_pkts_vec_avx512; +#endif + } else { + PMD_INIT_LOG(DEBUG, + "Using %sVector Scattered Rx (port %d).", + use_avx2 ? "avx2 " : "", + dev->data->port_id); + dev->rx_pkt_burst = use_avx2 ? + i40e_recv_scattered_pkts_vec_avx2 : + i40e_recv_scattered_pkts_vec; + } } else { - PMD_INIT_LOG(DEBUG, - "Using %sVector Rx (port %d).", - use_avx2 ? "avx2 " : "", - dev->data->port_id); - dev->rx_pkt_burst = use_avx2 ? - i40e_recv_pkts_vec_avx2 : - i40e_recv_pkts_vec; + if (use_
[dpdk-dev] [PATCH v2 3/3] net/i40e: optimize Tx by using AVX512
Optimize Tx path by using AVX512 instructions and vectorize the tx free bufs process. Signed-off-by: Leyi Rong Signed-off-by: Bruce Richardson --- drivers/net/i40e/i40e_rxtx.c| 19 +++ drivers/net/i40e/i40e_rxtx.h| 4 + drivers/net/i40e/i40e_rxtx_vec_avx512.c | 152 3 files changed, 155 insertions(+), 20 deletions(-) diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index 8357fb3ef8..96071c55fd 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -2508,6 +2508,25 @@ i40e_tx_queue_release_mbufs(struct i40e_tx_queue *txq) * vPMD tx will not set sw_ring's mbuf to NULL after free, * so need to free remains more carefully. */ +#ifdef CC_AVX512_SUPPORT + if (dev->tx_pkt_burst == i40e_xmit_pkts_vec_avx512) { + struct i40e_vec_tx_entry *swr = (void *)txq->sw_ring; + + i = txq->tx_next_dd - txq->tx_rs_thresh + 1; + if (txq->tx_tail < i) { + for (; i < txq->nb_tx_desc; i++) { + rte_pktmbuf_free_seg(swr[i].mbuf); + swr[i].mbuf = NULL; + } + i = 0; + } + for (; i < txq->tx_tail; i++) { + rte_pktmbuf_free_seg(swr[i].mbuf); + swr[i].mbuf = NULL; + } + return; + } +#endif if (dev->tx_pkt_burst == i40e_xmit_pkts_vec_avx2 || dev->tx_pkt_burst == i40e_xmit_pkts_vec) { i = txq->tx_next_dd - txq->tx_rs_thresh + 1; diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h index 2e3e50eb79..2f55073c97 100644 --- a/drivers/net/i40e/i40e_rxtx.h +++ b/drivers/net/i40e/i40e_rxtx.h @@ -129,6 +129,10 @@ struct i40e_tx_entry { uint16_t last_id; }; +struct i40e_vec_tx_entry { + struct rte_mbuf *mbuf; +}; + /* * Structure associated with each TX queue. 
*/ diff --git a/drivers/net/i40e/i40e_rxtx_vec_avx512.c b/drivers/net/i40e/i40e_rxtx_vec_avx512.c index ccddc3e2d4..43e939c605 100644 --- a/drivers/net/i40e/i40e_rxtx_vec_avx512.c +++ b/drivers/net/i40e/i40e_rxtx_vec_avx512.c @@ -873,6 +873,115 @@ i40e_recv_scattered_pkts_vec_avx512(void *rx_queue, rx_pkts + retval, nb_pkts); } +static __rte_always_inline int +i40e_tx_free_bufs_avx512(struct i40e_tx_queue *txq) +{ + struct i40e_vec_tx_entry *txep; + uint32_t n; + uint32_t i; + int nb_free = 0; + struct rte_mbuf *m, *free[RTE_I40E_TX_MAX_FREE_BUF_SZ]; + + /* check DD bits on threshold descriptor */ + if ((txq->tx_ring[txq->tx_next_dd].cmd_type_offset_bsz & + rte_cpu_to_le_64(I40E_TXD_QW1_DTYPE_MASK)) != + rte_cpu_to_le_64(I40E_TX_DESC_DTYPE_DESC_DONE)) + return 0; + + n = txq->tx_rs_thresh; + +/* first buffer to free from S/W ring is at index + * tx_next_dd - (tx_rs_thresh-1) + */ + txep = (void *)txq->sw_ring; + txep += txq->tx_next_dd - (n - 1); + + if (txq->offloads & DEV_TX_OFFLOAD_MBUF_FAST_FREE && (n & 31) == 0) { + struct rte_mempool *mp = txep[0].mbuf->pool; + void **cache_objs; + struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, + rte_lcore_id()); + + if (!cache || cache->len == 0) + goto normal; + + cache_objs = &cache->objs[cache->len]; + + if (n > RTE_MEMPOOL_CACHE_MAX_SIZE) { + rte_mempool_ops_enqueue_bulk(mp, (void *)txep, n); + goto done; + } + + /* The cache follows the following algorithm +* 1. Add the objects to the cache +* 2. Anything greater than the cache min value (if it +* crosses the cache flush threshold) is flushed to the ring. 
+*/ + /* Add elements back into the cache */ + uint32_t copied = 0; + /* n is multiple of 32 */ + while (copied < n) { + const __m512i a = _mm512_load_si512(&txep[copied]); + const __m512i b = _mm512_load_si512(&txep[copied + 8]); + const __m512i c = _mm512_load_si512(&txep[copied + 16]); + const __m512i d = _mm512_load_si512(&txep[copied + 24]); + + _mm512_storeu_si512(&cache_objs[copied], a); + _mm512_storeu_si512(&cache_objs[copied + 8], b); + _mm512_storeu_si512(&cache_objs[copied + 16], c); + _mm512_storeu_si512(&cache_objs[copied + 24], d); + copied += 32; + } + cache->len += n; + + if
[dpdk-dev] [PATCH v2 1/3] net/i40e: remove devarg use-latest-supported-vec
As eal parameter --force-max-simd-bitwidth is already introduced, to make it more clear when setting rx/tx function, remove devarg use-latest-supported-vec support. Signed-off-by: Leyi Rong --- doc/guides/nics/i40e.rst | 9 --- drivers/net/i40e/i40e_ethdev.c | 63 +-- drivers/net/i40e/i40e_ethdev.h | 3 - drivers/net/i40e/i40e_rxtx.c | 107 +++-- 4 files changed, 35 insertions(+), 147 deletions(-) diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 4e5c4679b8..90fb8a4d6f 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -209,15 +209,6 @@ Runtime Config Options Currently hot-plugging of representor ports is not supported so all required representors must be specified on the creation of the PF. -- ``Use latest supported vector`` (default ``disable``) - - Latest supported vector path may not always get the best perf so vector path was - recommended to use only on later platform. But users may want the latest vector path - since it can get better perf in some real work loading cases. So ``devargs`` param - ``use-latest-supported-vec`` is introduced, for example:: - - -a 84:00.0,use-latest-supported-vec=1 - - ``Enable validation for VF message`` (default ``not enabled``) The PF counts messages from each VF. 
If in any period of seconds the message diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index f54769c29d..223eb9950b 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -44,7 +44,6 @@ #define ETH_I40E_FLOATING_VEB_LIST_ARG "floating_veb_list" #define ETH_I40E_SUPPORT_MULTI_DRIVER "support-multi-driver" #define ETH_I40E_QUEUE_NUM_PER_VF_ARG "queue-num-per-vf" -#define ETH_I40E_USE_LATEST_VEC"use-latest-supported-vec" #define ETH_I40E_VF_MSG_CFG"vf_msg_cfg" #define I40E_CLEAR_PXE_WAIT_MS 200 @@ -403,7 +402,6 @@ static const char *const valid_keys[] = { ETH_I40E_FLOATING_VEB_LIST_ARG, ETH_I40E_SUPPORT_MULTI_DRIVER, ETH_I40E_QUEUE_NUM_PER_VF_ARG, - ETH_I40E_USE_LATEST_VEC, ETH_I40E_VF_MSG_CFG, NULL}; @@ -1301,62 +1299,6 @@ i40e_aq_debug_write_global_register(struct i40e_hw *hw, return i40e_aq_debug_write_register(hw, reg_addr, reg_val, cmd_details); } -static int -i40e_parse_latest_vec_handler(__rte_unused const char *key, - const char *value, - void *opaque) -{ - struct i40e_adapter *ad = opaque; - int use_latest_vec; - - use_latest_vec = atoi(value); - - if (use_latest_vec != 0 && use_latest_vec != 1) - PMD_DRV_LOG(WARNING, "Value should be 0 or 1, set it as 1!"); - - ad->use_latest_vec = (uint8_t)use_latest_vec; - - return 0; -} - -static int -i40e_use_latest_vec(struct rte_eth_dev *dev) -{ - struct i40e_adapter *ad = - I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); - struct rte_kvargs *kvlist; - int kvargs_count; - - ad->use_latest_vec = false; - - if (!dev->device->devargs) - return 0; - - kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys); - if (!kvlist) - return -EINVAL; - - kvargs_count = rte_kvargs_count(kvlist, ETH_I40E_USE_LATEST_VEC); - if (!kvargs_count) { - rte_kvargs_free(kvlist); - return 0; - } - - if (kvargs_count > 1) - PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only " - "the first invalid or last valid one is used !", - ETH_I40E_USE_LATEST_VEC); - - if 
(rte_kvargs_process(kvlist, ETH_I40E_USE_LATEST_VEC, - i40e_parse_latest_vec_handler, ad) < 0) { - rte_kvargs_free(kvlist); - return -EINVAL; - } - - rte_kvargs_free(kvlist); - return 0; -} - static int read_vf_msg_config(__rte_unused const char *key, const char *value, @@ -1507,8 +1449,6 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused) i40e_parse_vf_msg_config(dev, &pf->vf_msg_cfg); /* Check if need to support multi-driver */ i40e_support_multi_driver(dev); - /* Check if users want the latest supported vec path */ - i40e_use_latest_vec(dev); /* Make sure all is clean before doing PF reset */ i40e_clear_hw(hw); @@ -13010,5 +12950,4 @@ RTE_PMD_REGISTER_PARAM_STRING(net_i40e, ETH_I40E_FLOATING_VEB_ARG "=1" ETH_I40E_FLOATING_VEB_LIST_ARG "=" ETH_I40E_QUEUE_NUM_PER_VF_ARG "=1|2|4|8|16" - ETH_I40E_SUPPORT_MULTI_DRIVER "=1" - ETH_I40E_USE_LATEST_VEC "=0|1"); + ETH_I40E_SUPP
Re: [dpdk-dev] [v2 PATCH] usertools: show an error message if unable to reserve requested hugepages
On Thu, Dec 17, 2020 at 11:19 PM Stephen Hemminger wrote: > > On Thu, 17 Dec 2020 16:16:16 +0500 > Sarosh Arif wrote: > > > +if get_hugepages(path) != pages: > > +print("Unable to reserve required pages. The pages reserved are:") > > +global SHOW_HUGEPAGES > > +SHOW_HUGEPAGES = True > > > Please don't add global's to this script. > > The script is close to being clean according to pylint, and globals > are considered bad style and shouldn't be used. > > I would just exit if huge pages could not be setup. How about if we just print a warning message such as "Unable to reserve required pages" before exiting, in case the pages are not reserved due to lack of space in RAM? Then leave it upon the user to query how many pages are actually reserved. > > The script should leave it up to the user to do another query about > status if they care about what the result is.
[dpdk-dev] [PATCH v3 0/8] ethdev: introduce GENEVE header TLV option item
The Geneve tunneling protocol is designed to allow the user to specify
some data context on the packet. The GENEVE TLV (Type-Length-Variable)
Option is the means intended to present the user data.

In order to support the GENEVE TLV Option, the new rte_flow item
"rte_flow_item_geneve_opt" is introduced. The new item contains the
values and masks for the following fields:
- option class
- option type
- length
- data

The usage example:

"flow create 0 ingress pattern eth / ipv4 / udp / geneve vni is 100 /
geneve-opt class is 5 length is 1 type is 0 data is 0x66998800 / end
actions count / drop / end"

A new item will be added to testpmd to support the raw encap/decap
action.

Shiri Kuzin (7):
  lib/librte_ethdev: introduce GENEVE header TLV option item
  common/mlx5: check GENEVE TLV support in HCA attributes
  common/mlx5: create GENEVE TLV option object with DevX
  net/mlx5: create GENEVE TLV option management
  net/mlx5: add GENEVE TLV option flow validation
  net/mlx5: add GENEVE TLV option flow translation
  doc: update GENEVE TLV option support

Viacheslav Ovsiienko (1):
  app/testpmd: add GENEVE option item support

 app/test-pmd/cmdline_flow.c                 | 102 ++-
 doc/guides/nics/mlx5.rst                    |  23 ++-
 doc/guides/rel_notes/release_21_02.rst      |   8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   8 +
 drivers/common/mlx5/mlx5_devx_cmds.c        |  63 ++-
 drivers/common/mlx5/mlx5_devx_cmds.h        |   9 +
 drivers/common/mlx5/mlx5_prm.h              |  28 ++-
 drivers/common/mlx5/version.map             |   1 +
 drivers/net/mlx5/mlx5.c                     |   2 +
 drivers/net/mlx5/mlx5.h                     |  13 ++
 drivers/net/mlx5/mlx5_flow.c                | 120 +
 drivers/net/mlx5/mlx5_flow.h                |  11 ++
 drivers/net/mlx5/mlx5_flow_dv.c             | 188 +++-
 lib/librte_ethdev/rte_flow.c                |   1 +
 lib/librte_ethdev/rte_flow.h                |  27 +++
 15 files changed, 592 insertions(+), 12 deletions(-)

-- 
2.21.0
[dpdk-dev] [PATCH v3 1/8] lib/librte_ethdev: introduce GENEVE header TLV option item
The Geneve tunneling protocol is designed to allow the user to specify
some data context on the packet. The GENEVE TLV (Type-Length-Variable)
Option is the means intended to present the user data.

In order to support the GENEVE TLV Option, the new rte_flow item
"rte_flow_item_geneve_opt" is added. The new item contains the values
and masks for the following fields:
- option class
- option type
- length
- data

A new item will be added to testpmd to support match and raw
encap/decap actions.

Signed-off-by: Shiri Kuzin

---
 lib/librte_ethdev/rte_flow.c |  1 +
 lib/librte_ethdev/rte_flow.h | 27 +++
 2 files changed, 28 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index a06f64c271..2af7d965e1 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -97,6 +97,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = {
 	MK_FLOW_ITEM(L2TPV3OIP, sizeof(struct rte_flow_item_l2tpv3oip)),
 	MK_FLOW_ITEM(PFCP, sizeof(struct rte_flow_item_pfcp)),
 	MK_FLOW_ITEM(ECPRI, sizeof(struct rte_flow_item_ecpri)),
+	MK_FLOW_ITEM(GENEVE_OPT, sizeof(struct rte_flow_item_geneve_opt)),
 };
 
 /** Generate flow_action[] entry. */
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 0977a78270..11a6494b8e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -543,6 +543,14 @@ enum rte_flow_item_type {
 	 * See struct rte_flow_item_ipv6_frag_ext.
 	 */
 	RTE_FLOW_ITEM_TYPE_IPV6_FRAG_EXT,
+
+	/**
+	 * Matches Geneve Variable Length Option
+	 *
+	 * See struct rte_flow_item_geneve_opt
+	 */
+	RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
+
 };
 
 /**
@@ -1627,6 +1635,25 @@ static const struct rte_flow_item_ecpri rte_flow_item_ecpri_mask = {
 };
 #endif
 
+/**
+ * RTE_FLOW_ITEM_TYPE_GENEVE_OPT
+ *
+ * Matches a GENEVE Variable Length Option
+ */
+struct rte_flow_item_geneve_opt {
+	rte_be16_t option_class;
+	uint8_t option_type;
+	uint8_t option_len;
+	uint32_t *data;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_GENEVE_OPT. */
+#ifndef __cplusplus
+static const struct rte_flow_item_geneve_opt
+rte_flow_item_geneve_opt_mask = {
+	.option_type = 0xff,
+};
+#endif
 
 /**
  * Matching pattern item definition.
  *
-- 
2.21.0
[dpdk-dev] [PATCH v3 2/8] app/testpmd: add GENEVE option item support
From: Viacheslav Ovsiienko The patch adds the GENEVE option rte flow item support to command line interpreter. The flow command with GENEVE option items looks like: flow create 0 ingress pattern eth / ipv4 / udp / geneve vni is 100 / geneve-opt class is 99 length is 1 type is 0 data is 0x669988 / end actions drop / end The option length should be specified in 32-bit words, this value specifies the length of the data pattern/mask arrays (should be multiplied by sizeof(uint32_t) to be expressed in bytes. If match on the length itself is not needed the mask should be set to zero, in this case length is used to specify the pattern/mask array lengths only. Signed-off-by: Viacheslav Ovsiienko --- app/test-pmd/cmdline_flow.c | 102 ++-- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 8 ++ 2 files changed, 104 insertions(+), 6 deletions(-) diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index 585cab98b4..8bb2cb1b2d 100644 --- a/app/test-pmd/cmdline_flow.c +++ b/app/test-pmd/cmdline_flow.c @@ -283,6 +283,11 @@ enum index { ITEM_ECPRI_MSG_IQ_DATA_PCID, ITEM_ECPRI_MSG_RTC_CTRL_RTCID, ITEM_ECPRI_MSG_DLY_MSR_MSRID, + ITEM_GENEVE_OPT, + ITEM_GENEVE_OPT_CLASS, + ITEM_GENEVE_OPT_TYPE, + ITEM_GENEVE_OPT_LENGTH, + ITEM_GENEVE_OPT_DATA, /* Validate/create actions. */ ACTIONS, @@ -413,6 +418,9 @@ enum index { /** Maximum size for pattern in struct rte_flow_item_raw. */ #define ITEM_RAW_PATTERN_SIZE 40 +/** Maximum size for GENEVE option data pattern in bytes. */ +#define ITEM_GENEVE_OPT_DATA_SIZE 124 + /** Storage size for struct rte_flow_item_raw including pattern. */ #define ITEM_RAW_SIZE \ (sizeof(struct rte_flow_item_raw) + ITEM_RAW_PATTERN_SIZE) @@ -428,7 +436,7 @@ struct action_rss_data { }; /** Maximum data size in struct rte_flow_action_raw_encap. */ -#define ACTION_RAW_ENCAP_MAX_DATA 128 +#define ACTION_RAW_ENCAP_MAX_DATA 512 #define RAW_ENCAP_CONFS_MAX_NUM 8 /** Storage for struct rte_flow_action_raw_encap. 
*/ @@ -658,6 +666,16 @@ struct token { .mask = (const void *)&(const s){ .f = (1 << (b)) - 1 }, \ }) +/** Static initializer for ARGS() to target a field with limits. */ +#define ARGS_ENTRY_BOUNDED(s, f, i, a) \ + (&(const struct arg){ \ + .bounded = 1, \ + .min = (i), \ + .max = (a), \ + .offset = offsetof(s, f), \ + .size = sizeof(((s *)0)->f), \ + }) + /** Static initializer for ARGS() to target an arbitrary bit-mask. */ #define ARGS_ENTRY_MASK(s, f, m) \ (&(const struct arg){ \ @@ -903,6 +921,7 @@ static const enum index next_item[] = { ITEM_AH, ITEM_PFCP, ITEM_ECPRI, + ITEM_GENEVE_OPT, END_SET, ZERO, }; @@ -1244,6 +1263,15 @@ static const enum index item_ecpri_common_type[] = { ZERO, }; +static const enum index item_geneve_opt[] = { + ITEM_GENEVE_OPT_CLASS, + ITEM_GENEVE_OPT_TYPE, + ITEM_GENEVE_OPT_LENGTH, + ITEM_GENEVE_OPT_DATA, + ITEM_NEXT, + ZERO, +}; + static const enum index next_action[] = { ACTION_END, ACTION_VOID, @@ -3230,6 +3258,47 @@ static const struct token token_list[] = { .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_ecpri, hdr.type5.msr_id)), }, + [ITEM_GENEVE_OPT] = { + .name = "geneve-opt", + .help = "GENEVE header option", + .priv = PRIV_ITEM(GENEVE_OPT, + sizeof(struct rte_flow_item_geneve_opt) + + ITEM_GENEVE_OPT_DATA_SIZE), + .next = NEXT(item_geneve_opt), + .call = parse_vc, + }, + [ITEM_GENEVE_OPT_CLASS] = { + .name = "class", + .help = "GENEVE option class", + .next = NEXT(item_geneve_opt, NEXT_ENTRY(UNSIGNED), item_param), + .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_geneve_opt, +option_class)), + }, + [ITEM_GENEVE_OPT_TYPE] = { + .name = "type", + .help = "GENEVE option type", + .next = NEXT(item_geneve_opt, NEXT_ENTRY(UNSIGNED), item_param), + .args = ARGS(ARGS_ENTRY(struct rte_flow_item_geneve_opt, + option_type)), + }, + [ITEM_GENEVE_OPT_LENGTH] = { + .name = "length", + .help = "GENEVE option data length (in 32b words)", + .next = NEXT(item_geneve_opt, NEXT_ENTRY(UNSIGNED), item_param), + .args = 
ARGS(ARGS_ENTRY_BOUNDED( + struct rte_flow_item_geneve_opt, option_len, + 0, 31)), + }, + [ITEM_GENEVE_OPT_DATA] = { + .
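The bounded length argument above (`ARGS_ENTRY_BOUNDED(..., option_len, 0, 31)`) and the "length in 32-bit words" convention from the commit log can be sketched as follows. This is a minimal illustration; `geneve_opt_len_ok` and `geneve_opt_data_bytes` are hypothetical helper names, not symbols from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the parser's ARGS_ENTRY_BOUNDED(..., 0, 31)
 * limit on the 5-bit option_len field and the convention that the length is
 * counted in 32-bit words. */
#define GENEVE_OPT_LEN_MAX	31u
#define GENEVE_OPT_DATA_SIZE	124u	/* ITEM_GENEVE_OPT_DATA_SIZE */

/* A length value is acceptable when it fits the 5-bit field. */
static inline bool
geneve_opt_len_ok(uint32_t len_words)
{
	return len_words <= GENEVE_OPT_LEN_MAX;
}

/* The data pattern/mask array size in bytes for a given length. */
static inline uint32_t
geneve_opt_data_bytes(uint32_t len_words)
{
	return len_words * (uint32_t)sizeof(uint32_t);
}
```

Note that 31 * 4 = 124, which is exactly `ITEM_GENEVE_OPT_DATA_SIZE`, the extra storage reserved after `struct rte_flow_item_geneve_opt` in the patch.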
[dpdk-dev] [PATCH v3 3/8] common/mlx5: check GENEVE TLV support in HCA attributes
This is a preparation step to support matching on the GENEVE TLV option. In this patch we add the HCA attributes that will allow supporting GENEVE TLV option matching. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- drivers/common/mlx5/mlx5_devx_cmds.c | 7 +++ drivers/common/mlx5/mlx5_devx_cmds.h | 4 drivers/common/mlx5/mlx5_prm.h | 28 +--- 3 files changed, 36 insertions(+), 3 deletions(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index 12f51a940c..fc406af062 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -693,6 +693,10 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->eth_virt = MLX5_GET(cmd_hca_cap, hcattr, eth_virt); attr->flex_parser_protocols = MLX5_GET(cmd_hca_cap, hcattr, flex_parser_protocols); + attr->max_geneve_tlv_options = MLX5_GET(cmd_hca_cap, hcattr, + max_geneve_tlv_options); + attr->max_geneve_tlv_option_data_len = MLX5_GET(cmd_hca_cap, hcattr, + max_geneve_tlv_option_data_len); attr->qos.sup = MLX5_GET(cmd_hca_cap, hcattr, qos); attr->vdpa.valid = !!(MLX5_GET64(cmd_hca_cap, hcattr, general_obj_types) & @@ -720,6 +724,9 @@ mlx5_devx_cmd_query_hca_attr(void *ctx, attr->flow_hit_aso = !!(MLX5_GET64(cmd_hca_cap, hcattr, general_obj_types) & MLX5_GENERAL_OBJ_TYPES_CAP_FLOW_HIT_ASO); + attr->geneve_tlv_opt = !!(MLX5_GET64(cmd_hca_cap, hcattr, + general_obj_types) & + MLX5_GENERAL_OBJ_TYPES_CAP_GENEVE_TLV_OPT); attr->log_max_cq = MLX5_GET(cmd_hca_cap, hcattr, log_max_cq); attr->log_max_qp = MLX5_GET(cmd_hca_cap, hcattr, log_max_qp); attr->log_max_cq_sz = MLX5_GET(cmd_hca_cap, hcattr, log_max_cq_sz); diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index b335b7c82c..6b2f77837b 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -97,6 +97,8 @@ struct mlx5_hca_attr { uint32_t lro_timer_supported_periods[MLX5_LRO_NUM_SUPP_PERIODS]; uint16_t lro_min_mss_size; uint32_t
flex_parser_protocols; + uint32_t max_geneve_tlv_options; + uint32_t max_geneve_tlv_option_data_len; uint32_t hairpin:1; uint32_t log_max_hairpin_queues:5; uint32_t log_max_hairpin_wq_data_sz:5; @@ -116,6 +118,7 @@ struct mlx5_hca_attr { uint32_t regex:1; uint32_t regexp_num_of_engines; uint32_t log_max_ft_sampler_num:8; + uint32_t geneve_tlv_opt; struct mlx5_hca_qos_attr qos; struct mlx5_hca_vdpa_attr vdpa; int log_max_qp_sz; @@ -479,6 +482,7 @@ struct mlx5_devx_obj *mlx5_devx_cmd_create_flex_parser(void *ctx, __rte_internal int mlx5_devx_cmd_register_read(void *ctx, uint16_t reg_id, uint32_t arg, uint32_t *data, uint32_t dw_cnt); + /** * Create virtio queue counters object DevX API. * diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 8c9b53ce10..458550279e 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -789,7 +789,7 @@ struct mlx5_ifc_fte_match_set_misc3_bits { u8 icmp_code[0x8]; u8 icmpv6_type[0x8]; u8 icmpv6_code[0x8]; - u8 reserved_at_120[0x20]; + u8 geneve_tlv_option_0_data[0x20]; u8 gtpu_teid[0x20]; u8 gtpu_msg_type[0x08]; u8 gtpu_msg_flags[0x08]; @@ -1082,6 +1082,8 @@ enum { (1ULL << MLX5_GENERAL_OBJ_TYPE_FLEX_PARSE_GRAPH) #define MLX5_GENERAL_OBJ_TYPES_CAP_FLOW_HIT_ASO \ (1ULL << MLX5_GENERAL_OBJ_TYPE_FLOW_HIT_ASO) +#define MLX5_GENERAL_OBJ_TYPES_CAP_GENEVE_TLV_OPT \ + (1ULL << MLX5_OBJ_TYPE_GENEVE_TLV_OPT) enum { MLX5_HCA_CAP_OPMOD_GET_MAX = 0, @@ -1380,8 +1382,10 @@ struct mlx5_ifc_cmd_hca_cap_bits { u8 reserved_at_500[0x20]; u8 num_of_uars_per_page[0x20]; u8 flex_parser_protocols[0x20]; - u8 reserved_at_560[0x20]; - u8 reserved_at_580[0x3c]; + u8 max_geneve_tlv_options[0x8]; + u8 reserved_at_568[0x3]; + u8 max_geneve_tlv_option_data_len[0x5]; + u8 reserved_at_570[0x4c]; u8 mini_cqe_resp_stride_index[0x1]; u8 cqe_128_always[0x1]; u8 cqe_compression_128[0x1]; @@ -2292,6 +2296,7 @@ struct mlx5_ifc_create_cq_in_bits { }; enum { + MLX5_OBJ_TYPE_GENEVE_TLV_OPT = 0x000b, 
MLX5_GENERAL_OBJ_TYPE_VIRTQ = 0x000d, MLX5_GENERAL_OBJ_TYPE_VIRTIO_Q_COUNTERS = 0x001c, MLX5_GENERAL_OBJ_TYPE_FLEX_PARSE_GRAPH = 0x0022, @@ -2326,6 +2331,17 @@ struct mlx5_ifc_virtio_q_counters_bits {
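The capability added above, `max_geneve_tlv_option_data_len`, is a 5-bit field holding a count of 32-bit words. A simplified sketch of what the `MLX5_GET()` bit-field extraction does, and how the field translates into a byte limit, is shown below; `cap_get` is a stand-in and the real PRM accessor additionally handles the big-endian structure layout:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in (not the real PRM accessor) for what
 * MLX5_GET(cmd_hca_cap, hcattr, max_geneve_tlv_option_data_len) does:
 * pull a narrow bit-field out of a capability word. */
static inline uint32_t
cap_get(uint32_t word, unsigned int shift, unsigned int width)
{
	return (word >> shift) & ((1u << width) - 1u);
}

/* max_geneve_tlv_option_data_len counts 32-bit words, so the largest
 * option data size the device can match, in bytes, is: */
static inline uint32_t
geneve_opt_max_data_bytes(uint32_t data_len_words)
{
	return data_len_words * 4u;
}
```

The later validation patch in this series multiplies the reported word count by 4 in exactly this way before comparing it against the option length requested by the flow rule.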
[dpdk-dev] [PATCH v3 4/8] common/mlx5: create GENEVE TLV option object with DevX
The TLV object is a special firmware-maintained entity used to support matching on the GENEVE header extension option. The TLV object is created with the DevX API and accepts the option class, type and length fields. The class, type and length fields are set using MLX5_SET and the DevX object is created using an mlx5 glue function. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- drivers/common/mlx5/mlx5_devx_cmds.c | 56 +++- drivers/common/mlx5/mlx5_devx_cmds.h | 5 +++ drivers/common/mlx5/version.map | 1 + 3 files changed, 61 insertions(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c b/drivers/common/mlx5/mlx5_devx_cmds.c index fc406af062..08bd357ad1 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.c +++ b/drivers/common/mlx5/mlx5_devx_cmds.c @@ -2068,7 +2068,6 @@ mlx5_devx_cmd_create_flow_hit_aso_obj(void *ctx, uint32_t pd) * * @param[in] ctx * Context returned from mlx5 open_device() glue function. - * * @return * The DevX object created, NULL otherwise and rte_errno is set. */ @@ -2097,3 +2096,58 @@ mlx5_devx_cmd_alloc_pd(void *ctx) ppd->id = MLX5_GET(alloc_pd_out, out, pd); return ppd; } + +/** + * Create general object of type GENEVE TLV option using DevX API. + * + * @param[in] ctx + * Context returned from mlx5 open_device() glue function. + * @param [in] class + * TLV option variable value of class + * @param [in] type + * TLV option variable value of type + * @param [in] len + * TLV option variable value of len + ** + * @return + * The DevX object created, NULL otherwise and rte_errno is set.
+ */ +struct mlx5_devx_obj * +mlx5_devx_cmd_create_geneve_tlv_option(void *ctx, + uint16_t class, uint8_t type, uint8_t len) +{ + uint32_t in[MLX5_ST_SZ_DW(create_geneve_tlv_option_in)] = {0}; + uint32_t out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {0}; + struct mlx5_devx_obj *geneve_tlv_opt_obj = mlx5_malloc(MLX5_MEM_ZERO, + sizeof(*geneve_tlv_opt_obj), + 0, SOCKET_ID_ANY); + + if (!geneve_tlv_opt_obj) { + DRV_LOG(ERR, "Failed to allocate geneve tlv option object."); + rte_errno = ENOMEM; + return NULL; + } + void *hdr = MLX5_ADDR_OF(create_geneve_tlv_option_in, in, hdr); + void *opt = MLX5_ADDR_OF(create_geneve_tlv_option_in, in, + geneve_tlv_opt); + MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode, + MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_type, + MLX5_OBJ_TYPE_GENEVE_TLV_OPT); + MLX5_SET(geneve_tlv_option, opt, option_class, + rte_be_to_cpu_16(class)); + MLX5_SET(geneve_tlv_option, opt, option_type, type); + MLX5_SET(geneve_tlv_option, opt, option_data_length, len); + geneve_tlv_opt_obj->obj = mlx5_glue->devx_obj_create(ctx, in, + sizeof(in), out, sizeof(out)); + if (!geneve_tlv_opt_obj->obj) { + rte_errno = errno; + DRV_LOG(ERR, "Failed to create Geneve tlv option " + "Obj using DevX."); + mlx5_free(geneve_tlv_opt_obj); + return NULL; + } + geneve_tlv_opt_obj->id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); + return geneve_tlv_opt_obj; +} + diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h b/drivers/common/mlx5/mlx5_devx_cmds.h index 6b2f77837b..6774b49d2d 100644 --- a/drivers/common/mlx5/mlx5_devx_cmds.h +++ b/drivers/common/mlx5/mlx5_devx_cmds.h @@ -483,6 +483,11 @@ __rte_internal int mlx5_devx_cmd_register_read(void *ctx, uint16_t reg_id, uint32_t arg, uint32_t *data, uint32_t dw_cnt); +__rte_internal +struct mlx5_devx_obj * +mlx5_devx_cmd_create_geneve_tlv_option(void *ctx, + uint16_t class, uint8_t type, uint8_t len); + /** * Create virtio queue counters object DevX API. 
* diff --git a/drivers/common/mlx5/version.map b/drivers/common/mlx5/version.map index fb3952bbc6..7de8006b89 100644 --- a/drivers/common/mlx5/version.map +++ b/drivers/common/mlx5/version.map @@ -23,6 +23,7 @@ INTERNAL { mlx5_devx_cmd_create_virtio_q_counters; mlx5_devx_cmd_create_virtq; mlx5_devx_cmd_create_flow_hit_aso_obj; + mlx5_devx_cmd_create_geneve_tlv_option; mlx5_devx_cmd_destroy; mlx5_devx_cmd_flow_counter_alloc; mlx5_devx_cmd_flow_counter_query; -- 2.21.0
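The allocate/create/clean-up shape of `mlx5_devx_cmd_create_geneve_tlv_option()` above is a common DevX pattern: allocate the wrapper, ask the firmware for the object, and make sure nothing leaks on the failure path. A reduced, self-contained sketch of that pattern follows; the `demo_*` names and the injectable callback (standing in for `mlx5_glue->devx_obj_create`) are illustrative assumptions, not the driver's API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Wrapper analogous to struct mlx5_devx_obj. */
struct demo_devx_obj {
	void *obj;   /* opaque handle returned by the create callback */
	uint32_t id; /* object id reported by "firmware" */
};

typedef void *(*demo_create_fn)(uint16_t opt_class, uint8_t type,
				uint8_t len, uint32_t *id);

static struct demo_devx_obj *
demo_create_geneve_tlv_option(demo_create_fn create,
			      uint16_t opt_class, uint8_t type, uint8_t len)
{
	struct demo_devx_obj *o = calloc(1, sizeof(*o));

	if (o == NULL) {
		errno = ENOMEM;
		return NULL;
	}
	o->obj = create(opt_class, type, len, &o->id);
	if (o->obj == NULL) {
		free(o); /* nothing leaks on the failure path */
		return NULL;
	}
	return o;
}

/* Stub callbacks for illustration: one succeeds, one fails. */
static void *
demo_create_ok(uint16_t c, uint8_t t, uint8_t l, uint32_t *id)
{
	(void)c; (void)t; (void)l;
	*id = 0x2au;
	return (void *)1;
}

static void *
demo_create_fail(uint16_t c, uint8_t t, uint8_t l, uint32_t *id)
{
	(void)c; (void)t; (void)l; (void)id;
	return NULL;
}
```

In the real function the create step is `mlx5_glue->devx_obj_create()` and the failure path additionally copies `errno` into `rte_errno` before freeing the wrapper.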
[dpdk-dev] [PATCH v3 5/8] net/mlx5: create GENEVE TLV option management
Currently firmware supports only one TLV object per device to match on the GENEVE header option. This patch adds simple TLV object management to the mlx5 PMD. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- drivers/net/mlx5/mlx5.c | 2 + drivers/net/mlx5/mlx5.h | 13 drivers/net/mlx5/mlx5_flow.h| 4 ++ drivers/net/mlx5/mlx5_flow_dv.c | 108 4 files changed, 127 insertions(+) diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 023ef50a77..68d6352d48 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1014,6 +1014,7 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, rte_rwlock_write_unlock(&mlx5_shared_data->mem_event_rwlock); /* Add context to the global device list. */ LIST_INSERT_HEAD(&mlx5_dev_ctx_list, sh, next); + rte_spinlock_init(&sh->geneve_tlv_opt_sl); exit: pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex); return sh; @@ -1109,6 +1110,7 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) mlx5_glue->devx_free_uar(sh->devx_rx_uar); if (sh->ctx) claim_zero(mlx5_glue->close_device(sh->ctx)); + MLX5_ASSERT(sh->geneve_tlv_option_resource == NULL); pthread_mutex_destroy(&sh->txpp.mutex); mlx5_free(sh); return; diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index 41034f5d19..23272950a4 100644 --- a/drivers/net/mlx5/mlx5.h +++ b/drivers/net/mlx5/mlx5.h @@ -540,6 +540,16 @@ struct mlx5_aso_age_mng { struct mlx5_aso_sq aso_sq; /* ASO queue objects. */ }; +/* Management structure for geneve tlv option */ +struct mlx5_geneve_tlv_option_resource { + struct mlx5_devx_obj *obj; /* Pointer to the geneve tlv opt object. */ + rte_be16_t option_class; /* geneve tlv opt class.*/ + uint8_t option_type; /* geneve tlv opt type.*/ + uint8_t length; /* geneve tlv opt length.
*/ + uint32_t refcnt; /* geneve tlv object reference counter */ +}; + + #define MLX5_AGE_EVENT_NEW 1 #define MLX5_AGE_TRIGGER 2 #define MLX5_AGE_SET(age_info, BIT) \ @@ -752,6 +762,9 @@ struct mlx5_dev_ctx_shared { void *devx_rx_uar; /* DevX UAR for Rx. */ struct mlx5_aso_age_mng *aso_age_mng; /* Management data for aging mechanism using ASO Flow Hit. */ + struct mlx5_geneve_tlv_option_resource *geneve_tlv_option_resource; + /* Management structure for geneve tlv option */ + rte_spinlock_t geneve_tlv_opt_sl; /* Lock for geneve tlv resource */ struct mlx5_dev_shared_port port[]; /* per device port data array. */ }; diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h index ee85c9d8a5..0c8861964c 100644 --- a/drivers/net/mlx5/mlx5_flow.h +++ b/drivers/net/mlx5/mlx5_flow.h @@ -1046,6 +1046,7 @@ struct rte_flow { uint32_t counter; /**< Holds flow counter. */ uint32_t tunnel_id; /**< Tunnel id */ uint32_t age; /**< Holds ASO age bit index. */ + uint32_t geneve_tlv_option; /**< Holds Geneve TLV option id. > */ } __rte_packed; /* @@ -1503,4 +1504,7 @@ void flow_dv_dest_array_remove_cb(struct mlx5_cache_list *list, struct mlx5_cache_entry *entry); struct mlx5_aso_age_action *flow_aso_age_get_by_idx(struct rte_eth_dev *dev, uint32_t age_idx); +int flow_dev_geneve_tlv_option_resource_register(struct rte_eth_dev *dev, +const struct rte_flow_item *item, +struct rte_flow_error *error); #endif /* RTE_PMD_MLX5_FLOW_H_ */ diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index e4736ee9b5..0a657cea41 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -7244,6 +7244,90 @@ flow_dv_translate_item_geneve(void *matcher, void *key, MLX5_GENEVE_OPTLEN_VAL(gbhdr_m)); } +/** + * Create Geneve TLV option resource. + * + * @param dev[in, out] + * Pointer to rte_eth_dev structure. + * @param[in, out] tag_be24 + * Tag value in big endian then R-shift 8. + * @parm[in, out] dev_flow + * Pointer to the dev_flow. 
+ * @param[out] error + * pointer to error structure. + * + * @return + * 0 on success otherwise -errno and errno is set. + */ + +int +flow_dev_geneve_tlv_option_resource_register(struct rte_eth_dev *dev, +const struct rte_flow_item *item, +struct rte_flow_error *error) +{ + struct mlx5_priv *priv = dev->data->dev_private; + struct mlx5_dev_ctx_shared *sh = priv->sh; + struct mlx5_geneve_tlv_option_resource *geneve_opt_resource = + sh->geneve_tlv_option_resource; + struct mlx5_devx_obj *obj; + const struct rte_flow_item_gene
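Because the firmware exposes only a single GENEVE TLV option object per shared device, the resource above is a one-slot cache guarded by a spinlock and a reference counter: a second flow may reuse the object only when its class/type/length match exactly. A single-threaded sketch of that logic (locking omitted, `demo_*` names hypothetical) is:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* One-slot manager mirroring sh->geneve_tlv_option_resource. */
struct demo_tlv_res {
	uint16_t opt_class;
	uint8_t opt_type;
	uint8_t length;
	uint32_t refcnt;
};

static struct demo_tlv_res *demo_tlv_slot;

static int
demo_tlv_register(uint16_t opt_class, uint8_t opt_type, uint8_t length)
{
	if (demo_tlv_slot != NULL) {
		/* Only an exact match may share the single object. */
		if (demo_tlv_slot->opt_class != opt_class ||
		    demo_tlv_slot->opt_type != opt_type ||
		    demo_tlv_slot->length != length)
			return -EEXIST;
		demo_tlv_slot->refcnt++;
		return 0;
	}
	demo_tlv_slot = calloc(1, sizeof(*demo_tlv_slot));
	if (demo_tlv_slot == NULL)
		return -ENOMEM;
	demo_tlv_slot->opt_class = opt_class;
	demo_tlv_slot->opt_type = opt_type;
	demo_tlv_slot->length = length;
	demo_tlv_slot->refcnt = 1;
	return 0;
}

static void
demo_tlv_release(void)
{
	if (demo_tlv_slot != NULL && --demo_tlv_slot->refcnt == 0) {
		free(demo_tlv_slot);
		demo_tlv_slot = NULL;
	}
}
```

In the PMD the register/release paths also create and destroy the DevX object itself, and both run under `sh->geneve_tlv_opt_sl`.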
[dpdk-dev] [PATCH v3 6/8] net/mlx5: add GENEVE TLV option flow validation
This patch adds a validation routine for the GENEVE header TLV option. The GENEVE TLV option match must include all fields with full masks, since the NIC does not support masking on option class, type and length. The option data length must be non-zero, and the provided data pattern must not be all zeros, due to hardware limitations. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- drivers/net/mlx5/mlx5_flow.c| 120 drivers/net/mlx5/mlx5_flow.h| 7 ++ drivers/net/mlx5/mlx5_flow_dv.c | 10 ++- 3 files changed, 136 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index 2a4073c126..892ed2f2d9 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -2627,6 +2627,126 @@ mlx5_flow_validate_item_geneve(const struct rte_flow_item *item, return 0; } +/** + * Validate Geneve TLV option item. + * + * @param[in] item + * Item specification. + * @param[in] last_item + * Previous validated item in the pattern items. + * @param[in] dev + * Pointer to the rte_eth_dev structure. + * @param[out] error + * Pointer to error structure. + * + * @return + * 0 on success, a negative errno value otherwise and rte_errno is set.
+ */ +int +mlx5_flow_validate_item_geneve_opt(const struct rte_flow_item *item, + uint64_t last_item, + struct rte_eth_dev *dev, + struct rte_flow_error *error) +{ + struct mlx5_priv *priv = dev->data->dev_private; + struct mlx5_dev_ctx_shared *sh = priv->sh; + struct mlx5_geneve_tlv_option_resource *geneve_opt_resource; + struct mlx5_hca_attr *hca_attr = &priv->config.hca_attr; + uint8_t data_max_supported = + hca_attr->max_geneve_tlv_option_data_len * 4; + struct mlx5_dev_config *config = &priv->config; + const struct rte_flow_item_geneve_opt *spec = item->spec; + const struct rte_flow_item_geneve_opt *mask = item->mask; + unsigned int i; + unsigned int data_len; + const struct rte_flow_item_geneve_opt full_mask = { + .option_class = RTE_BE16(0x), + .option_type = 0xff, + .option_len = 0x1f, + }; + + if (!mask) + mask = &rte_flow_item_geneve_opt_mask; + if (!spec) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve TLV opt class/type/length must be specified"); + if ((uint32_t)(spec->option_len) > MLX5_GENEVE_OPTLEN_MASK) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve TLV opt length exceeeds the limit (31)"); + /* Check if class type and length masks are full. */ + if (full_mask.option_class != mask->option_class || + full_mask.option_type != mask->option_type || + full_mask.option_len != (mask->option_len & full_mask.option_len)) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve TLV opt class/type/length masks must be full"); + /* Check if length is supported */ + if ((uint32_t)(spec->option_len) > + config->hca_attr.max_geneve_tlv_option_data_len) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve TLV opt length not supported"); + if (config->hca_attr.max_geneve_tlv_options > 1) + DRV_LOG(DEBUG, + "max_geneve_tlv_options supports more than 1 option"); + /* Check GENEVE item preceding. 
*/ + if (!(last_item & MLX5_FLOW_LAYER_GENEVE)) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve opt item must be preceded with Geneve item"); + /* Check if length is 0 or data is 0. */ + if (spec->data == NULL || spec->option_len == 0) + return rte_flow_error_set + (error, ENOTSUP, RTE_FLOW_ERROR_TYPE_ITEM, item, + "Geneve TLV opt with zero data/length not supported"); + /* Check not all data & mask are 0. */ + data_len = spec->option_len * 4; + if (mask->data == NULL) { + for (i = 0; i < data_len; i++) + if (spec->data[i]) + break; + if (i == data_len) + return rte_flow_error_set(error, ENOTSUP, + RTE_FLOW_ERROR_TYPE_ITEM, item, + "Can't match on Geneve option data 0"); + } else { + for (i = 0; i < data_len; i++) + if (spec->d
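The two central checks in the validation routine above — the class/type/length masks must be "full", and the data under the mask must not be all zeros — can be distilled into the following self-contained sketch. The `demo_*` structures and names are illustrative stand-ins for `rte_flow_item_geneve_opt` and the checks in `mlx5_flow_validate_item_geneve_opt()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

#define DEMO_OPTLEN_MASK 0x1fu /* option_len is a 5-bit field */

struct demo_geneve_opt {
	uint16_t option_class;
	uint8_t option_type;
	uint8_t option_len; /* in 32-bit words */
	const uint8_t *data;
};

/* The NIC cannot partially mask class/type/length: masks must be full. */
static bool
demo_masks_full(const struct demo_geneve_opt *mask)
{
	return mask->option_class == 0xffff &&
	       mask->option_type == 0xff &&
	       (mask->option_len & DEMO_OPTLEN_MASK) == DEMO_OPTLEN_MASK;
}

/* Reject a spec whose data (under the mask) would match only zeros. */
static bool
demo_data_matchable(const struct demo_geneve_opt *spec,
		    const struct demo_geneve_opt *mask)
{
	unsigned int i, n = spec->option_len * 4u;

	if (spec->data == NULL || spec->option_len == 0)
		return false;
	for (i = 0; i < n; i++)
		if (spec->data[i] & (mask->data ? mask->data[i] : 0xff))
			return true;
	return false;
}

/* Sample spec/mask pair matching the testpmd example in this series:
 * class 99, type 0, one data word 0x669988xx. */
static const uint8_t demo_bytes[4] = { 0x66, 0x99, 0x88, 0x00 };
static const struct demo_geneve_opt demo_spec = { 0x0063, 0x00, 1, demo_bytes };
static const struct demo_geneve_opt demo_full_mask = { 0xffff, 0xff, 0x1f, NULL };
```

The real routine additionally verifies that the option item is preceded by a GENEVE item and that `option_len` does not exceed the device's reported `max_geneve_tlv_option_data_len`.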
[dpdk-dev] [PATCH v3 7/8] net/mlx5: add GENEVE TLV option flow translation
The GENEVE TLV option matching flows must be created using a translation function. This function checks whether we already created a Devx object for the matching and either creates the objects or updates the reference counter. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- drivers/net/mlx5/mlx5_flow_dv.c | 70 + 1 file changed, 70 insertions(+) diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index c24123fbf3..006460cd0a 100644 --- a/drivers/net/mlx5/mlx5_flow_dv.c +++ b/drivers/net/mlx5/mlx5_flow_dv.c @@ -7336,6 +7336,65 @@ flow_dev_geneve_tlv_option_resource_register(struct rte_eth_dev *dev, return ret; } +/** + * Add Geneve TLV option item to matcher. + * + * @param[in, out] dev + * Pointer to rte_eth_dev structure. + * @param[in, out] matcher + * Flow matcher. + * @param[in, out] key + * Flow matcher value. + * @param[in] item + * Flow pattern to translate. + * @param[out] error + * Pointer to error structure. + */ +static int +flow_dv_translate_item_geneve_opt(struct rte_eth_dev *dev, void *matcher, + void *key, const struct rte_flow_item *item, + struct rte_flow_error *error) +{ + const struct rte_flow_item_geneve_opt *geneve_opt_m = item->mask; + const struct rte_flow_item_geneve_opt *geneve_opt_v = item->spec; + void *misc3_m = MLX5_ADDR_OF(fte_match_param, matcher, + misc_parameters_3); + void *misc3_v = MLX5_ADDR_OF(fte_match_param, key, misc_parameters_3); + rte_be32_t opt_data_key = 0, opt_data_mask = 0; + int ret = 0; + + if (!geneve_opt_v) + return -1; + if (!geneve_opt_m) + geneve_opt_m = &rte_flow_item_geneve_opt_mask; + ret = flow_dev_geneve_tlv_option_resource_register(dev, item, + error); + if (ret) { + DRV_LOG(ERR, "Failed to create geneve_tlv_obj"); + return ret; + } + /* Set the data. 
*/ + if (geneve_opt_v->data) { + memcpy(&opt_data_key, geneve_opt_v->data, + RTE_MIN((uint32_t)(geneve_opt_v->option_len * 4), + sizeof(opt_data_key))); + MLX5_ASSERT((uint32_t)(geneve_opt_v->option_len * 4) <= + sizeof(opt_data_key)); + memcpy(&opt_data_mask, geneve_opt_m->data, + RTE_MIN((uint32_t)(geneve_opt_v->option_len * 4), + sizeof(opt_data_mask))); + MLX5_ASSERT((uint32_t)(geneve_opt_v->option_len * 4) <= + sizeof(opt_data_mask)); + MLX5_SET(fte_match_set_misc3, misc3_m, + geneve_tlv_option_0_data, + rte_be_to_cpu_32(opt_data_mask)); + MLX5_SET(fte_match_set_misc3, misc3_v, + geneve_tlv_option_0_data, + rte_be_to_cpu_32(opt_data_key & opt_data_mask)); + } + return ret; +} + /** * Add MPLS item to matcher and to the value. * @@ -10583,6 +10642,17 @@ flow_dv_translate(struct rte_eth_dev *dev, matcher.priority = MLX5_TUNNEL_PRIO_GET(rss_desc); last_item = MLX5_FLOW_LAYER_GENEVE; break; + case RTE_FLOW_ITEM_TYPE_GENEVE_OPT: + ret = flow_dv_translate_item_geneve_opt(dev, match_mask, + match_value, + items, error); + if (ret) + return rte_flow_error_set(error, -ret, + RTE_FLOW_ERROR_TYPE_ITEM, NULL, + "cannot create GENEVE TLV option"); + flow->geneve_tlv_option = 1; + last_item = MLX5_FLOW_LAYER_GENEVE_OPT; + break; case RTE_FLOW_ITEM_TYPE_MPLS: flow_dv_translate_item_mpls(match_mask, match_value, items, last_item, tunnel); -- 2.21.0
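The translation above folds the option data into the single 32-bit `geneve_tlv_option_0_data` match field: it copies at most four bytes of pattern and mask, then programs the mask and the pattern ANDed with the mask, so the matcher key never carries bits outside the mask. A reduced sketch of that folding (endianness handling omitted; `demo_build_opt_match` is a hypothetical name) is:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fold option data into one 32-bit key/mask pair, as
 * flow_dv_translate_item_geneve_opt() does for the
 * geneve_tlv_option_0_data field. */
static void
demo_build_opt_match(const uint8_t *data, const uint8_t *mask,
		     unsigned int len_words,
		     uint32_t *out_key, uint32_t *out_mask)
{
	uint32_t k = 0, m = 0;
	unsigned int n = len_words * 4u;

	if (n > sizeof(k)) /* the field holds at most 4 bytes */
		n = sizeof(k);
	memcpy(&k, data, n);
	memcpy(&m, mask, n);
	*out_mask = m;
	*out_key = k & m; /* the matcher value must be pre-masked */
}
```

The invariant worth remembering is `(key & ~mask) == 0`; the real code also byte-swaps with `rte_be_to_cpu_32()` before calling `MLX5_SET()`.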
[dpdk-dev] [PATCH v3 8/8] doc: update GENEVE TLV option support
GENEVE TLV option support added to mlx5 PMD. The limitations and support were updated in documentation. Signed-off-by: Shiri Kuzin Acked-by: Viacheslav Ovsiienko --- doc/guides/nics/mlx5.rst | 23 ++- doc/guides/rel_notes/release_21_02.rst | 8 2 files changed, 30 insertions(+), 1 deletion(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 3bda0f8417..5f2b42b979 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -98,6 +98,7 @@ Features - Hardware LRO. - Hairpin. - Multiple-thread flow insertion. +- Matching on Geneve TLV option header with raw encap/decap action. Limitations --- @@ -175,7 +176,18 @@ Limitations - OAM - protocol type - options length - Currently, the only supported options length value is 0. + +- Match on Geneve TLv option is supported on the following fields: + - Class + - Type + - Length + - Data + + Only one Class/Type/Length Geneve TLV option is supported per shared device. + Class/Type/Length fields must be specified as well as masks. + Class/Type/Length specified masks must be full. + Matching Geneve TLV option without specifying data is not supported. + Matching Geneve TLV option with data & mask == 0 is not supported. - VF: flow rules created on VF devices can only match traffic targeted at the configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``). @@ -1022,6 +1034,10 @@ Below are some firmware configurations listed. 
or FLEX_PARSER_PROFILE_ENABLE=1 +- enable Geneve TLV option flow matching:: + + FLEX_PARSER_PROFILE_ENABLE=0 + - enable GTP flow matching:: FLEX_PARSER_PROFILE_ENABLE=3 @@ -1501,6 +1517,11 @@ Supported hardware offloads | | | rdma-core 32 | | rdma-core 32 | | | | ConnectX-6 Dx| | ConnectX-6 Dx | +---+-+-+ + | Encapsulation | | DPDK 21.02| | DPDK 21.02| + | GENEVE TLV option | | OFED 5.2 | | OFED 5.2 | + | | | rdma-core 34 | | rdma-core 34 | + | | | ConnectX-6 Dx | | ConnectX-6 Dx | + +---+-+-+ Notes for metadata -- diff --git a/doc/guides/rel_notes/release_21_02.rst b/doc/guides/rel_notes/release_21_02.rst index 638f98168b..2fb5bf9c38 100644 --- a/doc/guides/rel_notes/release_21_02.rst +++ b/doc/guides/rel_notes/release_21_02.rst @@ -55,6 +55,14 @@ New Features Also, make sure to start the actual text at the margin. === +* **Updated Mellanox mlx5 driver.** + + Updated the Mellanox mlx5 driver with new features and improvements, including: + + * **Added GENEVE TLV option in rte_flow.** + + Added support for matching GENEVE TLV option and raw encap/decap of GENEVE + TLV option. Removed Items - -- 2.21.0
[dpdk-dev] [PATCH v3 0/5] introduce new iavf driver on vfio-user client
This series introduces a new net virtual device called iavf_client which is based on vfio-user client interface. Through vfio-user client interface, PCI liked device could be used similar as Intel® Ethernet Adaptive Virtual Function specification. The code to enable iavf_client mainly contains two parts: - Emulated pci interfaces for vfio-user client which is located in common/iavf/vfio_user/. - A new net driver base on vdev abstraction, i.e. iavf_client_ethdev.c -- | -- | | | iavf driver | |> (iavf_client_ethdev.c) | -- | | -- | | | device emulate | |-->(common/iavf/vfio_user/) | -- | -- | | -- | vfio-user | | client | -- This driver can be used to test the device emulation framework mentioned in: [RFC 0/2] Add device emulation support in DPDK http://patchwork.dpdk.org/cover/75549/ +--+ | +---+ +---+ | | | iavf_emudev | | iavfbe_ethdev | | | |driver | | driver| | | +---+ +---+ | | | | | | --- VDEV BUS | | | | | | +---+ +--+ | +--+| | vdev: | | vdev:| | | +--+ || | /path/to/vfio | |iavf_emudev_# | | | | iavf | || +---+ +--+ | | | client | | | | | +--+ || | | | +--+ || +--+| | | vfio-user| || | vfio-user|| | | client | |<---|->| server || | +--+ || +--+| | QEMU/DPDK|| DPDK | +--++--+ It can be launched together with patch series for the testing: [0/9] Introduce vfio-user library: http://patchwork.dpdk.org/cover/85389/ [0/8] Introduce emudev library and iavf emudev driver http://patchwork.dpdk.org/cover/85488/ [0/6] Introduce iavf backend driver http://patchwork.dpdk.org/cover/86084/ This series depends on patch serieses: [0/9] Introduce vfio-user library: http://patchwork.dpdk.org/cover/85389/ v3: - Reword commit log - Add missed patch in v2 v2: - Enable interrupt for control queue - Enable interrupt for rx queue - Rename some Macros - Fix resource release when close - Fix ptype_tbl assignment - Fix typo Jingjing Wu (5): common/iavf: emulated pci interfaces on vfio-user client net/iavf_client: introduce iavf driver on vfio-user client net/iavf_client: enable 
interrupt on control queue net/iavf_client: enable interrupt of Rx queue net/iavf: fix vector mapping with queue config/rte_config.h | 3 + drivers/common/iavf/iavf_common.c | 7 + drivers/common/iavf/iavf_impl.c | 45 +- drivers/common/iavf/iavf_osdep.h | 14 + drivers/common/iavf/iavf_prototype.h | 1 + drivers/common/iavf/iavf_type.h | 3 + drivers/common/iavf/meson.build | 11 +- drivers/common/iavf/version.map | 6 + drivers/common/iavf/vfio_user/vfio_user_pci.c | 525 ++ drivers/common/iavf/vfio_user/vfio_user_pci.h | 68 +++ drivers/net/iavf/iavf.h | 37 +- drivers/net/iavf/iavf_client_ethdev.c | 404 ++ drivers/net/iavf/iavf_ethdev.c| 57 +- drivers/net/iavf/iavf_rxtx.c | 23 +- drivers/net/iavf/meson.build | 1 + 15 files changed, 1157 insertions(+), 48 deletions(-) create mode 100644 drivers/common/iavf/vfio_user/vfio_user_pci.c create mode 100644 drivers/common/iavf/vfio_user/vfio_user_pci.h create mode 100644 drivers/net/iavf/iavf_client_ethdev.c -- 2.21.1
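Patch 5/5 in the series above ("net/iavf: fix vector mapping with queue") changes the recirculation condition from `vec >= vf->nb_msix` to `vec >= vf->nb_msix + IAVF_RX_VEC_START`. Since Rx vectors start at `IAVF_RX_VEC_START` (vector 0 serves the admin queue), the old condition wrapped one vector too early and the last MSI-X vector was never assigned. A standalone sketch of the fixed loop, with `demo_*` names standing in for the driver's, is:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RX_VEC_START 1u /* vector 0 is reserved for the admin queue */

/* Assign MSI-X vectors to Rx queues round-robin, wrapping at
 * nb_msix + DEMO_RX_VEC_START (the fix), not at nb_msix. */
static void
demo_map_queue_vectors(uint16_t nb_queues, uint16_t nb_msix,
		       uint16_t *vec_of_queue)
{
	uint16_t i, vec = DEMO_RX_VEC_START;

	for (i = 0; i < nb_queues; i++) {
		vec_of_queue[i] = vec++;
		if (vec >= nb_msix + DEMO_RX_VEC_START) /* the fix */
			vec = DEMO_RX_VEC_START;
	}
}
```

With 5 queues and 2 Rx vectors this yields the mapping 1, 2, 1, 2, 1; with the pre-fix condition `vec >= nb_msix`, vector 2 would never have been used.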
[dpdk-dev] [PATCH v3 1/5] common/iavf: emulated pci interfaces on vfio-user client
This patch implements Emulated pci interfaces on vfio-user client which is located in common/iavf/vfio_user/. Four main functions provided to upper layer (driver) to set up or talk to device: - client_vfio_user_setup - client_vfio_user_release - client_vfio_user_get_bar_addr - client_vfio_user_pci_bar_access -- | -- | | | iavf driver | | | -- | | -- | | | device emulate | |-->(common/iavf/vfio_user/) | -- | -- | | -- | vfio-user | | client | -- Signed-off-by: Jingjing Wu Signed-off-by: Chenbo Xia --- config/rte_config.h | 3 + drivers/common/iavf/iavf_common.c | 7 + drivers/common/iavf/iavf_impl.c | 45 +- drivers/common/iavf/iavf_osdep.h | 14 + drivers/common/iavf/iavf_type.h | 3 + drivers/common/iavf/meson.build | 11 +- drivers/common/iavf/version.map | 5 + drivers/common/iavf/vfio_user/vfio_user_pci.c | 525 ++ drivers/common/iavf/vfio_user/vfio_user_pci.h | 68 +++ 9 files changed, 679 insertions(+), 2 deletions(-) create mode 100644 drivers/common/iavf/vfio_user/vfio_user_pci.c create mode 100644 drivers/common/iavf/vfio_user/vfio_user_pci.h diff --git a/config/rte_config.h b/config/rte_config.h index a0b5160ff2..76ed1ae83f 100644 --- a/config/rte_config.h +++ b/config/rte_config.h @@ -151,4 +151,7 @@ #define RTE_LIBRTE_PMD_DLB2_SW_CREDIT_QUANTA 32 #define RTE_PMD_DLB2_DEFAULT_DEPTH_THRESH 256 +/* IAVF bsed on vfio_user client */ +#define RTE_LIBRTE_IAVF_CLIENT 1 + #endif /* _RTE_CONFIG_H_ */ diff --git a/drivers/common/iavf/iavf_common.c b/drivers/common/iavf/iavf_common.c index c951b7d787..2e68f5d08b 100644 --- a/drivers/common/iavf/iavf_common.c +++ b/drivers/common/iavf/iavf_common.c @@ -20,6 +20,13 @@ enum iavf_status iavf_set_mac_type(struct iavf_hw *hw) DEBUGFUNC("iavf_set_mac_type\n"); +#ifdef RTE_LIBRTE_IAVF_CLIENT + if (hw->bus.type == iavf_bus_type_vfio_user) { + hw->mac.type = IAVF_MAC_VF; + return status; + } +#endif + if (hw->vendor_id == IAVF_INTEL_VENDOR_ID) { switch (hw->device_id) { case IAVF_DEV_ID_X722_VF: diff --git 
a/drivers/common/iavf/iavf_impl.c b/drivers/common/iavf/iavf_impl.c index fc0da31753..79262bf50d 100644 --- a/drivers/common/iavf/iavf_impl.c +++ b/drivers/common/iavf/iavf_impl.c @@ -10,9 +10,14 @@ #include #include +#ifdef RTE_LIBRTE_IAVF_CLIENT +#include "vfio_user/vfio_user_pci.h" +#endif #include "iavf_type.h" #include "iavf_prototype.h" +int iavf_common_logger; + enum iavf_status iavf_allocate_dma_mem_d(__rte_unused struct iavf_hw *hw, struct iavf_dma_mem *mem, @@ -85,4 +90,42 @@ iavf_free_virt_mem_d(__rte_unused struct iavf_hw *hw, return IAVF_SUCCESS; } -RTE_LOG_REGISTER(iavf_common_logger, pmd.common.iavf, NOTICE); +#ifdef RTE_LIBRTE_IAVF_CLIENT +uint32_t +iavf_read_addr(struct iavf_hw *hw, uint32_t offset) +{ + uint32_t value = 0; + + if (!hw || !hw->hw_addr) + return 0; + + if (hw->bus.type == iavf_bus_type_vfio_user) + client_vfio_user_pci_bar_access( + (struct vfio_device *)hw->hw_addr, + 0, offset, 4, &value, false); + else + value = readl(hw->hw_addr + offset); + + return value; +} + +void +iavf_write_addr(struct iavf_hw *hw, uint32_t offset, uint32_t value) +{ + if (!hw || !hw->hw_addr) + return; + if (hw->bus.type == iavf_bus_type_vfio_user) + client_vfio_user_pci_bar_access( + (struct vfio_device *)hw->hw_addr, + 0, offset, 4, &value, true); + else + writel(value, hw->hw_addr + offset); +} +#endif + +RTE_INIT(iavf_common_init_log) +{ + iavf_common_logger = rte_log_register("pmd.common.iavf"); + if (iavf_common_logger >= 0) + rte_log_set_level(iavf_common_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/iavf/iavf_osdep.h b/drivers/common/iavf/iavf_osdep.h index 7cba13ff74..025d48a702 100644 --- a/drivers/common/iavf/iavf_osdep.h +++ b/drivers/common/iavf/iavf_osdep.h @@ -107,8 +107,22 @@ writeq(uint64_t value, volatile void *addr) rte_write64(rte_cpu_to_le_64(value), addr); } +#ifdef RTE_LIBRTE_IAVF_CLIENT +struct iavf_hw; + +__rte_internal +uint32_t iavf_read_addr(struct iavf_hw *hw, uint32_t offset); +__rte_internal +void 
iavf_write_addr(struct iavf_hw *hw, uint32_t offset, uint32_t value); + +#define wr32(a, reg, value) iavf_write_addr((a), (reg), (value)) +#define rd32(a, reg) iavf_read_
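The `rd32()`/`wr32()` redefinition above routes every register access through `iavf_read_addr()`/`iavf_write_addr()`, which dispatch on the bus type: on a vfio-user bus the access becomes a BAR-access message to the device emulation instead of an MMIO load/store. The dispatch shape can be sketched as follows, with a small array standing in for both the BAR and the vfio-user server (all `demo_*` names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

enum demo_bus_type { DEMO_BUS_PCI, DEMO_BUS_VFIO_USER };

struct demo_hw {
	enum demo_bus_type bus;
	uint32_t bar[16];       /* pretend MMIO space */
	uint32_t vfio_regs[16]; /* pretend server-side registers */
};

static uint32_t
demo_rd32(struct demo_hw *hw, unsigned int reg)
{
	if (hw->bus == DEMO_BUS_VFIO_USER)
		return hw->vfio_regs[reg]; /* really a BAR-access message */
	return hw->bar[reg];               /* really readl() */
}

static void
demo_wr32(struct demo_hw *hw, unsigned int reg, uint32_t val)
{
	if (hw->bus == DEMO_BUS_VFIO_USER)
		hw->vfio_regs[reg] = val; /* client_vfio_user_pci_bar_access() */
	else
		hw->bar[reg] = val;       /* writel() */
}
```

The benefit of dispatching at this level is that the rest of the iavf driver keeps using `rd32()`/`wr32()` unchanged regardless of whether the device is real hardware or a vfio-user emulation.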
[dpdk-dev] [PATCH v3 2/5] net/iavf_client: introduce iavf driver on vfio-user client
This patch adds a new net driver based on the vdev abstraction, i.e. iavf_client_ethdev.c. It uses common iavf functions to talk with the emulated PCI interfaces based on vfio-user. -- | -- | | | iavf driver | |> (iavf_client_ethdev.c) | -- | | -- | | | device emulate | | | | vfio-user adapt| | | -- | -- | | -- | vfio-user | | client | -- Signed-off-by: Jingjing Wu --- drivers/common/iavf/iavf_prototype.h | 1 + drivers/common/iavf/version.map | 1 + drivers/net/iavf/iavf.h | 18 +- drivers/net/iavf/iavf_client_ethdev.c | 290 ++ drivers/net/iavf/iavf_ethdev.c| 26 +-- drivers/net/iavf/iavf_rxtx.c | 23 +- drivers/net/iavf/meson.build | 1 + 7 files changed, 339 insertions(+), 21 deletions(-) create mode 100644 drivers/net/iavf/iavf_client_ethdev.c diff --git a/drivers/common/iavf/iavf_prototype.h b/drivers/common/iavf/iavf_prototype.h index f34e77db0f..3998d26dc0 100644 --- a/drivers/common/iavf/iavf_prototype.h +++ b/drivers/common/iavf/iavf_prototype.h @@ -83,6 +83,7 @@ void iavf_destroy_spinlock(struct iavf_spinlock *sp); __rte_internal void iavf_vf_parse_hw_config(struct iavf_hw *hw, struct virtchnl_vf_resource *msg); +__rte_internal enum iavf_status iavf_vf_reset(struct iavf_hw *hw); __rte_internal enum iavf_status iavf_aq_send_msg_to_pf(struct iavf_hw *hw, diff --git a/drivers/common/iavf/version.map b/drivers/common/iavf/version.map index 883d919487..23c3577b3e 100644 --- a/drivers/common/iavf/version.map +++ b/drivers/common/iavf/version.map @@ -7,6 +7,7 @@ INTERNAL { iavf_set_mac_type; iavf_shutdown_adminq; iavf_vf_parse_hw_config; + iavf_vf_reset; client_vfio_user_setup; client_vfio_user_release; client_vfio_user_get_bar_addr; diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index 6d5912d8c1..c34f971721 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -195,7 +195,10 @@ struct iavf_adapter { struct iavf_hw hw; struct rte_eth_dev *eth_dev; struct iavf_info vf; - +#ifdef RTE_LIBRTE_IAVF_CLIENT + /* used for avf_client driver */ +
struct vfio_device *user_dev; +#endif bool rx_bulk_alloc_allowed; /* For vector PMD */ bool rx_vec_allowed; @@ -231,6 +234,16 @@ iavf_init_adminq_parameter(struct iavf_hw *hw) hw->aq.asq_buf_size = IAVF_AQ_BUF_SZ; } +static inline void +iavf_disable_irq0(struct iavf_hw *hw) +{ + /* Disable all interrupt types */ + IAVF_WRITE_REG(hw, IAVF_VFINT_ICR0_ENA1, 0); + IAVF_WRITE_REG(hw, IAVF_VFINT_DYN_CTL01, + IAVF_VFINT_DYN_CTL01_ITR_INDX_MASK); + IAVF_WRITE_FLUSH(hw); +} + static inline uint16_t iavf_calc_itr_interval(int16_t interval) { @@ -284,6 +297,9 @@ _atomic_set_cmd(struct iavf_info *vf, enum virtchnl_ops ops) return !ret; } +extern const struct eth_dev_ops iavf_eth_dev_ops; + +int iavf_init_vf(struct rte_eth_dev *dev); int iavf_check_api_version(struct iavf_adapter *adapter); int iavf_get_vf_resource(struct iavf_adapter *adapter); void iavf_handle_virtchnl_msg(struct rte_eth_dev *dev); diff --git a/drivers/net/iavf/iavf_client_ethdev.c b/drivers/net/iavf/iavf_client_ethdev.c new file mode 100644 index 00..989f9d6062 --- /dev/null +++ b/drivers/net/iavf/iavf_client_ethdev.c @@ -0,0 +1,290 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2020 Intel Corporation + */ + +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iavf.h" +#include "iavf_rxtx.h" + +static int iavf_client_dev_close(struct rte_eth_dev *dev); +static int iavf_client_dev_reset(struct rte_eth_dev *dev); + +/* set iavf_client_dev_ops to iavf's by default */ +static struct eth_dev_ops iavf_client_eth_dev_ops; + +static const char *valid_args[] = { +#define AVF_CLIENT_ARG_PATH "path" + AVF_CLIENT_ARG_PATH, + NULL +}; + +/* set up vfio_device for iavf_client*/ +static int +iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) +{ + struct iavf_adapter *adapter = + IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); + struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(adapter); + struct vfio_device *vfio_dev; 
+ + vfio_dev = client_vfio_user_setup(path, dev->device->numa_node); + if (vfio_dev == NULL) { + PMD_INIT_LOG(ERR, "Error to create vfio_device for iavf_client\n"); + return -1; + } + hw->bus.type = iavf_bus_type_vfio_user; + + /* Use hw_addr to record dev ptr */ + hw-
[dpdk-dev] [PATCH v3 3/5] net/iavf_client: enable interrupt on control queue
New devarg "intr": if intr=1, use interrupt mode on control queue Signed-off-by: Jingjing Wu --- drivers/net/iavf/iavf.h | 19 drivers/net/iavf/iavf_client_ethdev.c | 131 ++ drivers/net/iavf/iavf_ethdev.c| 18 +--- 3 files changed, 134 insertions(+), 34 deletions(-) diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index c34f971721..5516ecf021 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -198,6 +198,7 @@ struct iavf_adapter { #ifdef RTE_LIBRTE_IAVF_CLIENT /* used for avf_client driver */ struct vfio_device *user_dev; + int intr_mode; /* interrupt mode if true */ #endif bool rx_bulk_alloc_allowed; /* For vector PMD */ @@ -234,6 +235,22 @@ iavf_init_adminq_parameter(struct iavf_hw *hw) hw->aq.asq_buf_size = IAVF_AQ_BUF_SZ; } +/* Enable default admin queue interrupt setting */ +static inline void +iavf_enable_irq0(struct iavf_hw *hw) +{ + /* Enable admin queue interrupt trigger */ + IAVF_WRITE_REG(hw, IAVF_VFINT_ICR0_ENA1, + IAVF_VFINT_ICR0_ENA1_ADMINQ_MASK); + + IAVF_WRITE_REG(hw, IAVF_VFINT_DYN_CTL01, + IAVF_VFINT_DYN_CTL01_INTENA_MASK | + IAVF_VFINT_DYN_CTL01_CLEARPBA_MASK | + IAVF_VFINT_DYN_CTL01_ITR_INDX_MASK); + + IAVF_WRITE_FLUSH(hw); +} + static inline void iavf_disable_irq0(struct iavf_hw *hw) { @@ -342,4 +359,6 @@ int iavf_add_del_mc_addr_list(struct iavf_adapter *adapter, uint32_t mc_addrs_num, bool add); int iavf_request_queues(struct iavf_adapter *adapter, uint16_t num); int iavf_get_max_rss_queue_region(struct iavf_adapter *adapter); +void iavf_dev_interrupt_handler(void *param); + #endif /* _IAVF_ETHDEV_H_ */ diff --git a/drivers/net/iavf/iavf_client_ethdev.c b/drivers/net/iavf/iavf_client_ethdev.c index 989f9d6062..770a7a8200 100644 --- a/drivers/net/iavf/iavf_client_ethdev.c +++ b/drivers/net/iavf/iavf_client_ethdev.c @@ -6,6 +6,8 @@ #include #include +#include + #include #include #include @@ -18,6 +20,9 @@ #include "iavf.h" #include "iavf_rxtx.h" +#define AVF_CLIENT_ARG_PATH "path" +#define AVF_CLIENT_ARG_INTR 
"intr" + static int iavf_client_dev_close(struct rte_eth_dev *dev); static int iavf_client_dev_reset(struct rte_eth_dev *dev); @@ -25,11 +30,27 @@ static int iavf_client_dev_reset(struct rte_eth_dev *dev); static struct eth_dev_ops iavf_client_eth_dev_ops; static const char *valid_args[] = { -#define AVF_CLIENT_ARG_PATH "path" AVF_CLIENT_ARG_PATH, + AVF_CLIENT_ARG_INTR, NULL }; +static void +iavf_client_event_handler(void *param) +{ + struct rte_eth_dev *dev = (struct rte_eth_dev *)param; + struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data->dev_private); + eventfd_t buf; + + eventfd_read(dev->intr_handle->fd, &buf); + + iavf_disable_irq0(hw); + + iavf_handle_virtchnl_msg(dev); + + iavf_enable_irq0(hw); +} + /* set up vfio_device for iavf_client*/ static int iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) @@ -51,6 +72,11 @@ iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) hw->back = IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); + if (!vfio_dev->nb_irqs && adapter->intr_mode) { + PMD_INIT_LOG(ERR, "No irq support on device"); + return -1; + } + if (!dev->intr_handle) { dev->intr_handle = rte_zmalloc_socket("iavf_client_intr", sizeof(*dev->intr_handle), @@ -62,7 +88,7 @@ iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) } - dev->intr_handle->fd = -1; + dev->intr_handle->fd = vfio_dev->irqfds[0]; dev->intr_handle->type = RTE_INTR_HANDLE_VDEV; dev->intr_handle->max_intr = 1; @@ -145,26 +171,52 @@ iavf_client_eth_init(struct rte_eth_dev *eth_dev) rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr, ð_dev->data->mac_addrs[0]); - rte_eal_alarm_set(IAVF_CLIENT_ALARM_INTERVAL, - iavf_client_dev_alarm_handler, eth_dev); + if (adapter->intr_mode) { + /* register callback func to eal lib */ + rte_intr_callback_register(eth_dev->intr_handle, + iavf_client_event_handler, + (void *)eth_dev); + iavf_enable_irq0(hw); + } else { + rte_eal_alarm_set(IAVF_CLIENT_ALARM_INTERVAL, + 
iavf_client_dev_alarm_handler, eth_dev); + } return 0; } static int iavf_client_dev_reset(struct rte_eth_dev *dev) { + struct iavf_adapter *adapter = + IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); str
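The interrupt path added above hinges on eventfd semantics: the backend signals by incrementing the eventfd counter, and iavf_client_event_handler drains it with eventfd_read() before processing the virtchnl message. A minimal standalone sketch of that handshake (the drain_event helper is illustrative, not part of the driver):

```c
#include <assert.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Drain a signalled eventfd the way the client event handler does:
 * in non-semaphore mode a successful read returns the accumulated
 * counter and resets it to zero. */
static uint64_t
drain_event(int efd)
{
	eventfd_t val = 0;

	if (eventfd_read(efd, &val) < 0)
		return 0;	/* nothing pending (EAGAIN in non-blocking mode) */
	return val;
}
```

With EFD_NONBLOCK set, a second read on an undisturbed eventfd fails with EAGAIN instead of blocking, which is why the handler can safely be re-armed right after draining.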
[dpdk-dev] [PATCH v3 4/5] net/iavf_client: enable interrupt of Rx queue
Signed-off-by: Jingjing Wu --- drivers/net/iavf/iavf_client_ethdev.c | 29 +-- drivers/net/iavf/iavf_ethdev.c| 9 +++-- 2 files changed, 26 insertions(+), 12 deletions(-) diff --git a/drivers/net/iavf/iavf_client_ethdev.c b/drivers/net/iavf/iavf_client_ethdev.c index 770a7a8200..27daaed0e8 100644 --- a/drivers/net/iavf/iavf_client_ethdev.c +++ b/drivers/net/iavf/iavf_client_ethdev.c @@ -59,6 +59,7 @@ iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(adapter); struct vfio_device *vfio_dev; + uint32_t max_fds, i; vfio_dev = client_vfio_user_setup(path, dev->device->numa_node); if (vfio_dev == NULL) { @@ -87,11 +88,21 @@ iavf_client_vfio_user_setup(struct rte_eth_dev *dev, const char *path) } } - dev->intr_handle->fd = vfio_dev->irqfds[0]; dev->intr_handle->type = RTE_INTR_HANDLE_VDEV; dev->intr_handle->max_intr = 1; + /* Assign rxq fds */ + if (vfio_dev->nb_irqs > 1) { + max_fds = RTE_MIN((uint32_t)RTE_MAX_RXTX_INTR_VEC_ID, + vfio_dev->nb_irqs - 1); + dev->intr_handle->nb_efd = max_fds; + for (i = 0; i < max_fds; ++i) + dev->intr_handle->efds[i] = + vfio_dev->irqfds[i + RTE_INTR_VEC_RXTX_OFFSET]; + dev->intr_handle->efd_counter_size = 0; + dev->intr_handle->max_intr += max_fds; + } return 0; } @@ -108,8 +119,6 @@ avf_client_init_eth_ops(void) iavf_client_eth_dev_ops.reta_query = NULL; iavf_client_eth_dev_ops.rss_hash_update= NULL; iavf_client_eth_dev_ops.rss_hash_conf_get = NULL; - iavf_client_eth_dev_ops.rx_queue_intr_enable = NULL; - iavf_client_eth_dev_ops.rx_queue_intr_disable = NULL; } #define IAVF_CLIENT_ALARM_INTERVAL 5 /* us */ @@ -227,11 +236,12 @@ iavf_client_dev_close(struct rte_eth_dev *dev) struct iavf_adapter *adapter = IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct rte_intr_handle *intr_handle = dev->intr_handle; if (adapter->intr_mode) { 
iavf_disable_irq0(hw); /* unregister callback func from eal lib */ - rte_intr_callback_unregister(dev->intr_handle, + rte_intr_callback_unregister(intr_handle, iavf_client_event_handler, dev); } else { rte_eal_alarm_cancel(iavf_client_dev_alarm_handler, dev); @@ -240,8 +250,15 @@ iavf_client_dev_close(struct rte_eth_dev *dev) if (!adapter->stopped) { iavf_stop_queues(dev); - if (dev->intr_handle) { - rte_free(dev->intr_handle); + if (intr_handle) { + /* Disable the interrupt for Rx */ + rte_intr_efd_disable(intr_handle); + /* Rx interrupt vector mapping free */ + if (intr_handle->intr_vec) { + rte_free(intr_handle->intr_vec); + intr_handle->intr_vec = NULL; + } + rte_free(intr_handle); dev->intr_handle = NULL; } diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index bc07ecbed7..06395f852b 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -1368,11 +1368,10 @@ iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id) { struct iavf_adapter *adapter = IAVF_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); - struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev); struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(adapter); uint16_t msix_intr; - msix_intr = pci_dev->intr_handle.intr_vec[queue_id]; + msix_intr = dev->intr_handle->intr_vec[queue_id]; if (msix_intr == IAVF_MISC_VEC_ID) { PMD_DRV_LOG(INFO, "MISC is also enabled for control"); IAVF_WRITE_REG(hw, IAVF_VFINT_DYN_CTL01, @@ -1387,10 +1386,9 @@ iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id) IAVF_VFINT_DYN_CTL01_CLEARPBA_MASK | IAVF_VFINT_DYN_CTLN1_ITR_INDX_MASK); } - IAVF_WRITE_FLUSH(hw); - rte_intr_ack(&pci_dev->intr_handle); + rte_intr_ack(dev->intr_handle); return 0; } @@ -1398,11 +1396,10 @@ iavf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id) static int iavf_dev_rx_queue_intr_disable(stru
[dpdk-dev] [PATCH v3 5/5] net/iavf: fix vector mapping with queue
Fix the vector mapping with queues: wrap the vector id back to IAVF_RX_VEC_START once it reaches IAVF_RX_VEC_START + nb_msix.

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")

Signed-off-by: Jingjing Wu
---
 drivers/net/iavf/iavf_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index 06395f852b..b169b19975 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -570,7 +570,7 @@ static int iavf_config_rx_queues_irqs(struct rte_eth_dev *dev,
 			/* If Rx interrupt is reuquired, and we can use
 			 * multi interrupts, then the vec is from 1
 			 */
-			vf->nb_msix = RTE_MIN(vf->vf_res->max_vectors,
+			vf->nb_msix = RTE_MIN(vf->vf_res->max_vectors - 1,
 					      intr_handle->nb_efd);
 			vf->msix_base = IAVF_RX_VEC_START;
 			vec = IAVF_RX_VEC_START;
@@ -578,7 +578,7 @@ static int iavf_config_rx_queues_irqs(struct rte_eth_dev *dev,
 				qv_map[i].queue_id = i;
 				qv_map[i].vector_id = vec;
 				intr_handle->intr_vec[i] = vec++;
-				if (vec >= vf->nb_msix)
+				if (vec >= vf->nb_msix + IAVF_RX_VEC_START)
 					vec = IAVF_RX_VEC_START;
 			}
 			vf->qv_map = qv_map;
--
2.21.1
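The effect of the wrap fix can be shown in isolation. With the misc/control interrupt on vector 0 and Rx vectors starting at IAVF_RX_VEC_START = 1, the valid Rx vector ids are [IAVF_RX_VEC_START, IAVF_RX_VEC_START + nb_msix); comparing against nb_msix alone wraps one step too early and strands the last vector. A hedged sketch (the constant and helper are illustrative, not driver code):

```c
#include <assert.h>
#include <stdint.h>

#define IAVF_RX_VEC_START 1	/* assumption: vector 0 is reserved for misc/control */

/* Fixed queue-to-vector assignment loop: the wrap test must include
 * the starting offset so all nb_msix Rx vectors get used. */
static void
map_queues(uint16_t nb_queues, uint16_t nb_msix, uint16_t *vec_ids)
{
	uint16_t vec = IAVF_RX_VEC_START;
	uint16_t i;

	for (i = 0; i < nb_queues; i++) {
		vec_ids[i] = vec++;
		if (vec >= nb_msix + IAVF_RX_VEC_START)	/* was: vec >= nb_msix */
			vec = IAVF_RX_VEC_START;
	}
}
```

With the old comparison and nb_msix = 2, vec would reset as soon as it reached 2, so every queue landed on vector 1 and vector 2 was never used; the fixed loop alternates queues across both vectors.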
Re: [dpdk-dev] [PATCH 4/8] emu/iavf: add vfio-user device register and unregister
Hi Beilei, > -Original Message- > From: Xing, Beilei > Sent: Thursday, January 7, 2021 3:19 PM > To: Xia, Chenbo ; dev@dpdk.org; tho...@monjalon.net; > david.march...@redhat.com > Cc: step...@networkplumber.org; Liang, Cunming ; Lu, > Xiuchun ; Li, Miao ; Wu, Jingjing > > Subject: RE: [dpdk-dev] [PATCH 4/8] emu/iavf: add vfio-user device register > and unregister > > > > > -Original Message- > > From: dev On Behalf Of Chenbo Xia > > Sent: Friday, December 18, 2020 3:48 PM > > To: dev@dpdk.org; tho...@monjalon.net; david.march...@redhat.com > > Cc: step...@networkplumber.org; Liang, Cunming > > ; Lu, Xiuchun ; Li, Miao > > ; Wu, Jingjing > > Subject: [dpdk-dev] [PATCH 4/8] emu/iavf: add vfio-user device register and > > unregister > > > > This patch adds vfio-user APIs call in driver probe and remove. > > rte_vfio_user_register() and rte_vfio_user_unregister() are called to > > create/destroy a vfio-user device. Notify callbacks that libvfio_user > defines are > > also implemented. > > > > Signed-off-by: Chenbo Xia > > Signed-off-by: Miao Li > > --- > > > > +static struct iavf_emudev *find_iavf_with_dev_id(int vfio_dev_id) { > > It's better to change the function name to follow other function names' style. > iavf_emu_xxx OK, thanks! > > > + struct iavf_emu_sock_list *list; > > + char sock_addr[PATH_MAX]; > > + int ret; > > + > > + ret = rte_vfio_get_sock_addr(vfio_dev_id, sock_addr, > > + sizeof(sock_addr)); > > + if (ret) { > > + EMU_IAVF_LOG(ERR, "Can not find vfio device %d " > > + "sock_addr.\n", vfio_dev_id); > > + return NULL; > > + } > > + > > + list = iavf_emu_find_sock_list(sock_addr); > > + if (!list) { > > + EMU_IAVF_LOG(ERR, "Can not find sock list.\n"); > > + return NULL; > > + } > > + > > + return (struct iavf_emudev *)list->emu_dev->priv_data; } > > It's better to check if list->emu_dev is NULL first. 'list->emu_dev->priv_data' is already used in iavf_emu_find_sock_list. And based on the way we add to the list, emu_dev will not be NULL. 
But thanks to your comment, I noticed I should just return struct iavf_emudev in iavf_emu_find_sock_list. And the linked list may just store 'struct iavf_emudev' to make it easier. > > > + > > +static int iavf_emu_new_device(int vfio_dev_id) { > > + struct iavf_emudev *dev; > > + int ret; > > + > > + dev = find_iavf_with_dev_id(vfio_dev_id); > > + if (!dev) > > + return -1; > > + > > + dev->vfio->dev_id = vfio_dev_id; > > + > > + ret = iavf_emu_setup_mem_table(dev); > > + if (ret) { > > + EMU_IAVF_LOG(ERR, "Failed to set up memtable for " > > + "device %d", dev->vfio->dev_id); > > + return ret; > > + } > > + > > + ret = iavf_emu_setup_irq(dev); > > + if (ret) { > > + EMU_IAVF_LOG(ERR, "Failed to set up irq for " > > + "device %d", dev->vfio->dev_id); > > + return ret; > > + } > > + > > + ret = iavf_emu_setup_queues(dev); > > + if (ret) { > > + EMU_IAVF_LOG(ERR, "Failed to set up queues for " > > + "device %d", dev->vfio->dev_id); > > + return ret; > > + } > > + > > + ret = dev->ops->device_ready(dev->edev); > > Same as above, and please also check other functions, such as > device_destroy... Yes! Will add check then. > > > + if (ret) > > + return ret; > > + > > + dev->ready = 1; > > + return 0; > > +} > > + > > +static void iavf_emu_destroy_device(int vfio_dev_id) { > > + struct iavf_emudev *dev; > > + > > + dev = find_iavf_with_dev_id(vfio_dev_id); > > + if (!dev) > > + return; > > + > > + iavf_emu_reset_all_resources(dev); > > Should we add 'dev->ready = 0' here? Yes. Will fix it. 
> > > + > > + dev->ops->device_destroy(dev->edev); > > +} > > + > > > > > +static int iavf_emu_lock_datapath(int vfio_dev_id, int lock) { > > + struct iavf_emudev *dev; > > + > > + dev = find_iavf_with_dev_id(vfio_dev_id); > > + if (!dev) > > + return -1; > > + > > + return dev->ops->lock_dp(dev->edev, lock); } > > + > > +static int iavf_emu_reset_device(int vfio_dev_id) { > > + struct iavf_emudev *dev; > > + > > + dev = find_iavf_with_dev_id(vfio_dev_id); > > + if (!dev) > > + return -1; > > + > > + iavf_emu_reset_all_resources(dev); > > Should we add 'dev->ready = 0' here? Yes. Will fix it. Thanks, Chenbo > > > + > > + return dev->ops->reset_device(dev->edev); > > +} > > +
[dpdk-dev] [PATCH v8] net/iavf: fix invalid RSS combinations rule can be created
Currently, when the 'flow' command is used to create a rule with one of the following invalid RSS type combinations, the rule is created successfully.

Invalid RSS combinations list:
- ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP
- ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP

This patch adds these combinations to the 'invalid_rss_comb' array so the validity check catches them; if the combination check fails, rule creation fails.

Fixes: 91f27b2e39ab ("net/iavf: refactor RSS")

Signed-off-by: Murphy Yang
---
v8:
- Update the comments.
v7:
- Remove unsupported RSS combinations array.
- Restored 'ETH_RSS_GTPU' in input set mask.
v6:
- Add unsupported RSS combinations array.
v5:
- Remove 'ETH_RSS_GTPU' from input set mask.
v4:
- Use 'ETH_RSS_XXX' replace 'IAVF_RSS_TYPE_INNER_XXX'
v3:
- Update the comments.
v2:
- Add invalid RSS combinations.

 drivers/net/iavf/iavf_hash.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c
index 7620876b58..ebaac58254 100644
--- a/drivers/net/iavf/iavf_hash.c
+++ b/drivers/net/iavf/iavf_hash.c
@@ -863,7 +863,9 @@ static void iavf_refine_proto_hdrs(struct virtchnl_proto_hdrs *proto_hdrs,

 static uint64_t invalid_rss_comb[] = {
 	ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_UDP,
+	ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP,
 	ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_UDP,
+	ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP,
 	RTE_ETH_RSS_L3_PRE32 | RTE_ETH_RSS_L3_PRE40 |
 	RTE_ETH_RSS_L3_PRE48 | RTE_ETH_RSS_L3_PRE56 |
 	RTE_ETH_RSS_L3_PRE96
--
2.17.1
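The check the patch extends can be sketched as a subset test over a blocklist: a requested rss_type is rejected when it fully contains any entry of invalid_rss_comb. The flag values below are stand-ins for the real ETH_RSS_* bits, and the helper name is illustrative:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag values; the real ETH_RSS_* bits live in rte_ethdev.h. */
#define RSS_IPV4             (1ULL << 0)
#define RSS_NONFRAG_IPV4_TCP (1ULL << 1)
#define RSS_NONFRAG_IPV4_UDP (1ULL << 2)

static const uint64_t invalid_rss_comb[] = {
	RSS_IPV4 | RSS_NONFRAG_IPV4_UDP,
	RSS_IPV4 | RSS_NONFRAG_IPV4_TCP,	/* the combination the patch adds */
};

/* A request is invalid when it fully contains any blocked combination. */
static int
rss_comb_is_invalid(uint64_t rss_type)
{
	size_t i;

	for (i = 0; i < sizeof(invalid_rss_comb) / sizeof(invalid_rss_comb[0]); i++)
		if ((rss_type & invalid_rss_comb[i]) == invalid_rss_comb[i])
			return 1;
	return 0;
}
```

Note the mask test requires the whole combination to be present: either flag alone still passes, which is why each bad pairing must be listed explicitly.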
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
> -Original Message-
> From: Thomas Monjalon
> Sent: Thursday, January 7, 2021 6:12 AM
> To: Guo, Jia
> Cc: Zhang, Qi Z ; Wu, Jingjing ; Yang, Qiming ; Wang, Haiyue ; dev@dpdk.org; Yigit, Ferruh ; andrew.rybche...@oktetlabs.ru
> Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
>
> 24/12/2020 07:59, Jeff Guo:
> > Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel type.
> >
> > Signed-off-by: Jeff Guo
> > Reviewed-by: Qi Zhang
> [...]
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type {
> > 	RTE_TUNNEL_TYPE_IP_IN_GRE,
> > 	RTE_L2_TUNNEL_TYPE_E_TAG,
> > 	RTE_TUNNEL_TYPE_VXLAN_GPE,
> > +	RTE_TUNNEL_TYPE_ECPRI,
> > 	RTE_TUNNEL_TYPE_MAX,
> > };
>
> We tried to remove all these legacy API in DPDK 20.11.
> Andrew decided to not remove this one because it is not yet completely replaced by rte_flow in all drivers.
> However, I am against continuing to update this API.
> The opposite work should be done: migrate to rte_flow.

Agreed, but it seems the legacy API and the drivers' legacy implementations are still kept in this release, and there is no general way to replace the legacy API with rte_flow right now.

> Sorry, it is a nack.
> BTW, it is probably breaking the ABI because of RTE_TUNNEL_TYPE_MAX.

Oh, the ABI break should be a problem.

> PS: please Cc ethdev maintainers for such patch, thanks.
> tip: use --cc-cmd devtools/get-maintainer.sh

Thanks for your helpful tip.
Re: [dpdk-dev] [PATCH v8] net/iavf: fix invalid RSS combinations rule can be created
Acked-by: Jeff Guo > -Original Message- > From: Murphy Yang > Sent: Thursday, January 7, 2021 5:17 PM > To: dev@dpdk.org > Cc: Yang, Qiming ; Guo, Jia ; > Zhang, Qi Z ; Yang, SteveX ; > Wu, Jingjing ; Xing, Beilei ; > Yang, MurphyX > Subject: [PATCH v8] net/iavf: fix invalid RSS combinations rule can be created > > Currently, when use 'flow' command to create a rule with following invalid > RSS type combination, it can be created successfully. > > Invalid RSS combinations list: > - ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP > - ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP > > This patch adds these combinations in 'invalid_rss_comb' array to do valid > check, if the combination check failed, the rule will be created unsuccessful. > > Fixes: 91f27b2e39ab ("net/iavf: refactor RSS") > > Signed-off-by: Murphy Yang > --- > v8: > - Update the comments. > v7: > - Remove unsupported RSS combinations array. > - Restored 'ETH_RSS_GTPU' in input set mask. > v6: > - Add unsupported RSS combinations array. > v5: > - Remove 'ETH_RSS_GTPU' from input set mask. > v4: > - Use 'ETH_RSS_XXX' replace 'IAVF_RSS_TYPE_INNER_XXX' > v3: > - Update the comments. > v2: > - Add invalid RSS combinations. > drivers/net/iavf/iavf_hash.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/iavf/iavf_hash.c b/drivers/net/iavf/iavf_hash.c index > 7620876b58..ebaac58254 100644 > --- a/drivers/net/iavf/iavf_hash.c > +++ b/drivers/net/iavf/iavf_hash.c > @@ -863,7 +863,9 @@ static void iavf_refine_proto_hdrs(struct > virtchnl_proto_hdrs *proto_hdrs, > > static uint64_t invalid_rss_comb[] = { > ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_UDP, > + ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP, > ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_UDP, > + ETH_RSS_IPV6 | ETH_RSS_NONFRAG_IPV6_TCP, > RTE_ETH_RSS_L3_PRE32 | RTE_ETH_RSS_L3_PRE40 | > RTE_ETH_RSS_L3_PRE48 | RTE_ETH_RSS_L3_PRE56 | > RTE_ETH_RSS_L3_PRE96 > -- > 2.17.1
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
On 1/7/21 12:32 PM, Guo, Jia wrote:
>> -Original Message-
>> From: Thomas Monjalon
>> Sent: Thursday, January 7, 2021 6:12 AM
>> To: Guo, Jia
>> Cc: Zhang, Qi Z ; Wu, Jingjing ; Yang, Qiming ; Wang, Haiyue ; dev@dpdk.org; Yigit, Ferruh ; andrew.rybche...@oktetlabs.ru
>> Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
>>
>> 24/12/2020 07:59, Jeff Guo:
>>> Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel type.
>>>
>>> Signed-off-by: Jeff Guo
>>> Reviewed-by: Qi Zhang
>> [...]
>>> --- a/lib/librte_ethdev/rte_ethdev.h
>>> +++ b/lib/librte_ethdev/rte_ethdev.h
>>> @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type {
>>> 	RTE_TUNNEL_TYPE_IP_IN_GRE,
>>> 	RTE_L2_TUNNEL_TYPE_E_TAG,
>>> 	RTE_TUNNEL_TYPE_VXLAN_GPE,
>>> +	RTE_TUNNEL_TYPE_ECPRI,
>>> 	RTE_TUNNEL_TYPE_MAX,
>>> };
>>
>> We tried to remove all these legacy API in DPDK 20.11.
>> Andrew decided to not remove this one because it is not yet completely replaced by rte_flow in all drivers.
>> However, I am against continuing to update this API.
>> The opposite work should be done: migrate to rte_flow.
>
> Agree but seems that the legacy api and driver legacy implementation still keep in this release, and there is no a general way to replace the legacy by rte_flow right now.

Which legacy API is kept? (I've tried to cleanup drivers, but it is not always easy to do and I relied on driver maintainers)

If you are talking about an API to set UDP port for UDP-based tunnels, yes it is kept right now since there is no replacement yet.

It looks like eCPRI could be encapsulated in UDP, but I don't know if UDP port is fixed for the tunnel type or requires configuring.
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
07/01/2021 10:32, Guo, Jia:
> From: Thomas Monjalon
> > 24/12/2020 07:59, Jeff Guo:
> > > Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel type.
> > >
> > > Signed-off-by: Jeff Guo
> > > Reviewed-by: Qi Zhang
> > [...]
> > > --- a/lib/librte_ethdev/rte_ethdev.h
> > > +++ b/lib/librte_ethdev/rte_ethdev.h
> > > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type {
> > > 	RTE_TUNNEL_TYPE_IP_IN_GRE,
> > > 	RTE_L2_TUNNEL_TYPE_E_TAG,
> > > 	RTE_TUNNEL_TYPE_VXLAN_GPE,
> > > +	RTE_TUNNEL_TYPE_ECPRI,
> > > 	RTE_TUNNEL_TYPE_MAX,
> > > };
> >
> > We tried to remove all these legacy API in DPDK 20.11.
> > Andrew decided to not remove this one because it is not yet completely replaced by rte_flow in all drivers.
> > However, I am against continuing to update this API.
> > The opposite work should be done: migrate to rte_flow.
>
> Agree but seems that the legacy api and driver legacy implementation still keep in this release, and there is no a general way to replace the legacy by rte_flow right now.

I think rte_flow is a complete replacement with more features.
You can match, encap, decap.
There is even a new API to get tunnel infos after decap.
What is missing?

> > Sorry, it is a nack.
> > BTW, it is probably breaking the ABI because of RTE_TUNNEL_TYPE_MAX.
>
> Oh, the ABI break should be a problem.
>
> > PS: please Cc ethdev maintainers for such patch, thanks.
> > tip: use --cc-cmd devtools/get-maintainer.sh
>
> Thanks for your helpful tip.
Re: [dpdk-dev] [PATCH v2 1/9] ethdev: refactor representor infrastructure
On Wed, Jan 6, 2021 at 9:48 PM Xueming Li wrote: > > To support extended representor syntax, this patch refactor represntor Typo in 'representor' > infrastructure: > 1. introduces representor type enum > 2. devargs representor port range extraction from partial value > > Signed-off-by: Xueming Li > --- > drivers/net/bnxt/bnxt_ethdev.c| 12 > drivers/net/enic/enic_ethdev.c| 7 ++ > drivers/net/i40e/i40e_ethdev.c| 8 +++ > drivers/net/ixgbe/ixgbe_ethdev.c | 8 +++ > drivers/net/mlx5/linux/mlx5_os.c | 11 > lib/librte_ethdev/ethdev_private.c| 93 --- > lib/librte_ethdev/ethdev_private.h| 3 - > lib/librte_ethdev/rte_class_eth.c | 4 +- > lib/librte_ethdev/rte_ethdev.c| 5 +- > lib/librte_ethdev/rte_ethdev_driver.h | 7 ++ > 10 files changed, 98 insertions(+), 60 deletions(-) > > diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c > index 81c8f8d79d..844a6c3c66 100644 > --- a/drivers/net/bnxt/bnxt_ethdev.c > +++ b/drivers/net/bnxt/bnxt_ethdev.c > @@ -5520,6 +5520,18 @@ static int bnxt_rep_port_probe(struct rte_pci_device > *pci_dev, > int i, ret = 0; > struct rte_kvargs *kvlist = NULL; > > + if (eth_da->type == RTE_ETH_REPRESENTOR_NONE) > + return 0; > + if (eth_da->type != RTE_ETH_REPRESENTOR_VF) { > + PMD_DRV_LOG(ERR, "unsupported representor type %d\n", > + eth_da->type); > + return -ENOTSUP; > + } > + if (eth_da->type != RTE_ETH_REPRESENTOR_VF) { Seems like an extra 'if ' condition by mistake? Otherwise there is no diff b/n this 'if' condition and the one few lines above? 
> + PMD_DRV_LOG(ERR, "unsupported representor type %d\n", > + eth_da->type); > + return -EINVAL; > + } > num_rep = eth_da->nb_representor_ports; > if (num_rep > BNXT_MAX_VF_REPS) { > PMD_DRV_LOG(ERR, "nb_representor_ports = %d > %d MAX VF > REPS\n", > diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c > index d041a6bee9..dd085caa93 100644 > --- a/drivers/net/enic/enic_ethdev.c > +++ b/drivers/net/enic/enic_ethdev.c > @@ -1303,6 +1303,13 @@ static int eth_enic_pci_probe(struct rte_pci_driver > *pci_drv __rte_unused, > if (retval) > return retval; > } > + if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) > + return 0; > + if (eth_da.type != RTE_ETH_REPRESENTOR_VF) { > + ENICPMD_LOG(ERR, "unsupported representor type: %s\n", > + pci_dev->device.devargs->args); > + return -ENOTSUP; > + } > retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, > sizeof(struct enic), > eth_dev_pci_specific_init, pci_dev, > diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c > index f54769c29d..05ed2e1079 100644 > --- a/drivers/net/i40e/i40e_ethdev.c > +++ b/drivers/net/i40e/i40e_ethdev.c > @@ -640,6 +640,14 @@ eth_i40e_pci_probe(struct rte_pci_driver *pci_drv > __rte_unused, > return retval; > } > > + if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) > + return 0; > + if (eth_da.type != RTE_ETH_REPRESENTOR_VF) { > + PMD_DRV_LOG(ERR, "unsupported representor type: %s\n", > + pci_dev->device.devargs->args); > + return -ENOTSUP; > + } > + > retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, > sizeof(struct i40e_adapter), > eth_dev_pci_specific_init, pci_dev, > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > b/drivers/net/ixgbe/ixgbe_ethdev.c > index 9a47a8b262..9ea0139197 100644 > --- a/drivers/net/ixgbe/ixgbe_ethdev.c > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > @@ -1717,6 +1717,14 @@ eth_ixgbe_pci_probe(struct rte_pci_driver *pci_drv > __rte_unused, > } else > memset(ð_da, 0, sizeof(eth_da)); > > + if (eth_da.type == 
RTE_ETH_REPRESENTOR_NONE) > + return 0; > + if (eth_da.type != RTE_ETH_REPRESENTOR_VF) { > + PMD_DRV_LOG(ERR, "unsupported representor type: %s\n", > + pci_dev->device.devargs->args); > + return -ENOTSUP; > + } > + > retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, > sizeof(struct ixgbe_adapter), > eth_dev_pci_specific_init, pci_dev, > diff --git a/drivers/net/mlx5/linux/mlx5_os.c > b/drivers/net/mlx5/linux/mlx5_os.c > index 6812a1f215..6981ba1f41 100644 > --- a/drivers/net/mlx5/linux/mlx5_os.c > +++ b/drivers/net/mlx5/linux/mlx5_os.c > @@ -706,6 +706,17 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev, > strerror(rte_errno)); >
Re: [dpdk-dev] [PATCH 4/4] vdpa/mlx5: set default event mode to polling
On 12/3/20 12:36 AM, Xueming Li wrote: > For better performance and latency, this patch sets default event > handling mode to polling mode which uses dedicate thread per device to > poll and process event. > > Signed-off-by: Xueming Li > Acked-by: Matan Azrad > --- > doc/guides/vdpadevs/mlx5.rst | 2 +- > drivers/vdpa/mlx5/mlx5_vdpa.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst > index 20254257c9..730e171ba3 100644 > --- a/doc/guides/vdpadevs/mlx5.rst > +++ b/doc/guides/vdpadevs/mlx5.rst > @@ -116,7 +116,7 @@ Driver options >- 2, Completion queue scheduling will be managed by interrupts. Each CQ > burst > arms the CQ in order to get an interrupt event in the next traffic burst. > > - - Default mode is 0. > + - Default mode is 1. > > - ``event_us`` parameter [int] > > diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c > index 1f92c529c9..5d954d48ce 100644 > --- a/drivers/vdpa/mlx5/mlx5_vdpa.c > +++ b/drivers/vdpa/mlx5/mlx5_vdpa.c > @@ -647,7 +647,7 @@ mlx5_vdpa_config_get(struct rte_devargs *devargs, struct > mlx5_vdpa_priv *priv) > { > struct rte_kvargs *kvlist; > > - priv->event_mode = MLX5_VDPA_EVENT_MODE_DYNAMIC_TIMER; > + priv->event_mode = MLX5_VDPA_EVENT_MODE_FIXED_TIMER; > priv->event_us = 0; > priv->event_core = -1; > priv->no_traffic_time_s = MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S; > Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [PATCH v2] net/virtio-user: fix run close(0) and close callfd
Cc'ing sta...@dpdk.org.

On 12/11/20 6:11 PM, Jiawei Zhu wrote:
> From: Jiawei Zhu
>
> When i < VIRTIO_MAX_VIRTQUEUES and j == i,
> dev->callfds[i] and dev->kickfds[i] are default 0.
> So it will close(0), close the standard input (stdin).
> And when the code fails in kickfd creation,
> it will leave one callfd not closed.
>
> Fixes: e6e7ad8b3024 ("net/virtio-user: move eventfd open/close into init/uninit")
> Cc: sta...@dpdk.org:
>
> Signed-off-by: Jiawei Zhu
> ---
> v2:
> * Add close callfd when fail in kickfd creation before break.
> ---
> ---
> drivers/net/virtio/virtio_user/virtio_user_dev.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> index 053f026..e1cbad0 100644
> --- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> +++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> @@ -276,6 +276,7 @@ int virtio_user_stop_device(struct virtio_user_dev *dev)
> 	}
> 	kickfd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
> 	if (kickfd < 0) {
> +		close(callfd);
> 		PMD_DRV_LOG(ERR, "kickfd error, %s", strerror(errno));
> 		break;
> 	}
> @@ -284,7 +285,7 @@ int virtio_user_stop_device(struct virtio_user_dev *dev)
> 	}
>
> 	if (i < VIRTIO_MAX_VIRTQUEUES) {
> -		for (j = 0; j <= i; ++j) {
> +		for (j = 0; j < i; ++j) {
> 			close(dev->callfds[j]);
> 			close(dev->kickfds[j]);
> 		}
>

Reviewed-by: Maxime Coquelin

Thanks!
Maxime
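The off-by-one in the unwind loop can be shown without touching real file descriptors. Like the driver's arrays, the simulated callfds below default to 0, so an unwind that uses j <= i reaches the slot that was never opened and would call close(0), i.e. close stdin. This is a sketch with made-up helper names, not the driver code:

```c
#include <assert.h>

#define NQ 4	/* stand-in for VIRTIO_MAX_VIRTQUEUES */

/* Simulated per-queue fds; like the driver's arrays they default to 0. */
static int callfds[NQ];

/* Returns 1 if the unwind after a failure at index `fail` would touch a
 * slot that was never opened (default value 0, which is stdin). */
static int
unwind_touches_stdin(int fail, int off_by_one)
{
	int j;
	int limit = off_by_one ? fail : fail - 1;	/* j <= i vs. j < i */

	for (j = 0; j < NQ; j++)
		callfds[j] = 0;
	for (j = 0; j < fail; j++)
		callfds[j] = 100 + j;	/* pretend eventfd() returned these */

	for (j = 0; j <= limit; j++)
		if (callfds[j] == 0)
			return 1;	/* would call close(0) */
	return 0;
}
```

The v2 fix pairs with this: slot i's callfd is now closed at the failure site itself, so the unwind loop must stop strictly before i.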
Re: [dpdk-dev] [PATCH v1 2/4] net/virtio: add vectorized packed ring Rx NEON path
On 1/5/21 3:27 PM, Maxime Coquelin wrote: > > > On 1/5/21 3:16 PM, Maxime Coquelin wrote: >> >> >> On 11/17/20 11:06 AM, Joyce Kong wrote: >>> Optimize packed ring Rx batch path with NEON instructions. >>> >>> Signed-off-by: Joyce Kong >>> Reviewed-by: Ruifeng Wang >>> --- >>> drivers/net/virtio/virtio_rxtx_packed.h | 15 ++ >>> drivers/net/virtio/virtio_rxtx_packed_neon.h | 150 +++ >>> 2 files changed, 165 insertions(+) >>> create mode 100644 drivers/net/virtio/virtio_rxtx_packed_neon.h >>> >>> diff --git a/drivers/net/virtio/virtio_rxtx_packed.h >>> b/drivers/net/virtio/virtio_rxtx_packed.h >>> index b0b1d63ec..8f5198ad7 100644 >>> --- a/drivers/net/virtio/virtio_rxtx_packed.h >>> +++ b/drivers/net/virtio/virtio_rxtx_packed.h >>> @@ -19,9 +19,16 @@ >>> #include "virtqueue.h" >>> >>> #define BYTE_SIZE 8 >>> + >>> +#ifdef CC_AVX512_SUPPORT >>> /* flag bits offset in packed ring desc higher 64bits */ >>> #define FLAGS_BITS_OFFSET ((offsetof(struct vring_packed_desc, flags) - \ >>> offsetof(struct vring_packed_desc, len)) * BYTE_SIZE) >>> +#elif defined(RTE_ARCH_ARM) >>> +/* flag bits offset in packed ring desc from ID */ >>> +#define FLAGS_BITS_OFFSET ((offsetof(struct vring_packed_desc, flags) - \ >>> + offsetof(struct vring_packed_desc, id)) * BYTE_SIZE) >>> +#endif >>> >>> #define PACKED_FLAGS_MASK ((0ULL | VRING_PACKED_DESC_F_AVAIL_USED) << \ >>> FLAGS_BITS_OFFSET) >>> @@ -44,8 +51,16 @@ >>> /* net hdr short size mask */ >>> #define NET_HDR_MASK 0x3F >>> >>> +#ifdef RTE_ARCH_ARM >>> +/* The cache line size on different Arm platforms are different, so >>> + * put a four batch size here to match with the minimum cache line >>> + * size and accommodate NEON register size. 
>>> + */ >>> +#define PACKED_BATCH_SIZE 4 >>> +#else >>> #define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \ >>> sizeof(struct vring_packed_desc)) >>> +#endif >>> #define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1) >>> >>> #ifdef VIRTIO_GCC_UNROLL_PRAGMA >>> diff --git a/drivers/net/virtio/virtio_rxtx_packed_neon.h >>> b/drivers/net/virtio/virtio_rxtx_packed_neon.h >>> new file mode 100644 >>> index 0..fb1e49909 >>> --- /dev/null >>> +++ b/drivers/net/virtio/virtio_rxtx_packed_neon.h >>> @@ -0,0 +1,150 @@ >>> +/* SPDX-License-Identifier: BSD-3-Clause >>> + * Copyright(c) 2020 Arm Corporation >>> + */ >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> + >>> +#include >>> +#include >>> + >>> +#include "virtio_ethdev.h" >>> +#include "virtio_pci.h" >>> +#include "virtio_rxtx_packed.h" >>> +#include "virtqueue.h" >>> + >>> +static inline uint16_t >>> +virtqueue_dequeue_batch_packed_vec(struct virtnet_rx *rxvq, >>> + struct rte_mbuf **rx_pkts) >>> +{ >>> + struct virtqueue *vq = rxvq->vq; >>> + struct virtio_hw *hw = vq->hw; >>> + uint16_t head_size = hw->vtnet_hdr_size; >>> + uint16_t id = vq->vq_used_cons_idx; >>> + struct vring_packed_desc *p_desc; >>> + uint16_t i; >>> + >>> + if (id & PACKED_BATCH_MASK) >>> + return -1; >>> + >>> + if (unlikely((id + PACKED_BATCH_SIZE) > vq->vq_nentries)) >>> + return -1; >> >> This function returns an unsigned short, I think you should return 0 >> here since it failed to dequeue packets. >> >>> + /* Map packed descriptor to mbuf fields. 
*/ >>> + uint8x16_t shuf_msk1 = { >>> + 0xFF, 0xFF, 0xFF, 0xFF, /* pkt_type set as unknown */ >>> + 0, 1, /* octet 1~0, low 16 bits pkt_len */ >>> + 0xFF, 0xFF, /* skip high 16 bits of pkt_len, zero >>> out */ >>> + 0, 1, /* octet 1~0, 16 bits data_len */ >>> + 0xFF, 0xFF, /* vlan tci set as unknown */ >>> + 0xFF, 0xFF, 0xFF, 0xFF >>> + }; >>> + >>> + uint8x16_t shuf_msk2 = { >>> + 0xFF, 0xFF, 0xFF, 0xFF, /* pkt_type set as unknown */ >>> + 8, 9, /* octet 9~8, low 16 bits pkt_len */ >>> + 0xFF, 0xFF, /* skip high 16 bits of pkt_len, zero >>> out */ >>> + 8, 9, /* octet 9~8, 16 bits data_len */ >>> + 0xFF, 0xFF, /* vlan tci set as unknown */ >>> + 0xFF, 0xFF, 0xFF, 0xFF >>> + }; >>> + >>> + /* Subtract the header length. */ >>> + uint16x8_t len_adjust = { >>> + 0, 0, /* ignore pkt_type field */ >>> + head_size, /* sub head_size on pkt_len */ >>> + 0, /* ignore high 16 bits of pkt_len */ >>> + head_size, /* sub head_size on data_len */ >>> + 0, 0, 0 /* ignore non-length fields */ >>> + }; >>> + >>> + uint64x2_t desc[PACKED_BATCH_SIZE / 2]; >>> + uint64x2x2_t mbp[PACKED_BATCH_SIZE / 2]; >>> + uint64x2_t pkt_mb[PACKED_BATCH_SIZE]; >>> + >>> + p_desc = &vq->vq_packed.ring.desc[id]; >>> + /* Load high 64 bits of packed descriptor 0,1. */ >>> + desc[0] = v
Re: [dpdk-dev] [PATCH v2 1/9] ethdev: refactor representor infrastructure
On Thu, Jan 7, 2021 at 12:08 PM Xueming(Steven) Li wrote: > > > > >-Original Message- > >From: Somnath Kotur > >Sent: Thursday, January 7, 2021 2:32 PM > >To: Xueming(Steven) Li > >Cc: NBU-Contact-Thomas Monjalon ; Ferruh Yigit > >; Andrew Rybchenko > >; Olivier Matz ; > >Slava Ovsiienko ; dev ; Asaf Penso > > > >Subject: Re: [dpdk-dev] [PATCH v2 1/9] ethdev: refactor representor > >infrastructure > > > >On Wed, Jan 6, 2021 at 9:48 PM Xueming Li wrote: > >> > >> To support extended representor syntax, this patch refactor represntor Please fix this Typo in 'representor' as well ...Thanks > >> infrastructure: > >> 1. introduces representor type enum > >> 2. devargs representor port range extraction from partial value > >> > >> Signed-off-by: Xueming Li > >> --- > >> drivers/net/bnxt/bnxt_ethdev.c| 12 > >> drivers/net/enic/enic_ethdev.c| 7 ++ > >> drivers/net/i40e/i40e_ethdev.c| 8 +++ > >> drivers/net/ixgbe/ixgbe_ethdev.c | 8 +++ > >> drivers/net/mlx5/linux/mlx5_os.c | 11 > >> lib/librte_ethdev/ethdev_private.c| 93 --- > >> lib/librte_ethdev/ethdev_private.h| 3 - > >> lib/librte_ethdev/rte_class_eth.c | 4 +- > >> lib/librte_ethdev/rte_ethdev.c| 5 +- > >> lib/librte_ethdev/rte_ethdev_driver.h | 7 ++ > >> 10 files changed, 98 insertions(+), 60 deletions(-) > >> > >> diff --git a/drivers/net/bnxt/bnxt_ethdev.c > >b/drivers/net/bnxt/bnxt_ethdev.c > >> index 81c8f8d79d..844a6c3c66 100644 > >> --- a/drivers/net/bnxt/bnxt_ethdev.c > >> +++ b/drivers/net/bnxt/bnxt_ethdev.c > >> @@ -5520,6 +5520,18 @@ static int bnxt_rep_port_probe(struct > >rte_pci_device *pci_dev, > >> int i, ret = 0; > >> struct rte_kvargs *kvlist = NULL; > >> > >> + if (eth_da->type == RTE_ETH_REPRESENTOR_NONE) > >> + return 0; > >> + if (eth_da->type != RTE_ETH_REPRESENTOR_VF) { > >> + PMD_DRV_LOG(ERR, "unsupported representor type %d\n", > >> + eth_da->type); > >> + return -ENOTSUP; > >> + } > >> + if (eth_da->type != RTE_ETH_REPRESENTOR_VF) { > >Seems like an extra 'if ' condition by mistake? 
Otherwise there is no > >diff b/n this 'if' condition and the one few lines above? > > Thanks, good catch! > > >> + PMD_DRV_LOG(ERR, "unsupported representor type %d\n", > >> + eth_da->type); > >> + return -EINVAL; > >> + } > >> num_rep = eth_da->nb_representor_ports; > >> if (num_rep > BNXT_MAX_VF_REPS) { > >> PMD_DRV_LOG(ERR, "nb_representor_ports = %d > %d MAX VF > >REPS\n", > >> diff --git a/drivers/net/enic/enic_ethdev.c > >> b/drivers/net/enic/enic_ethdev.c > >> index d041a6bee9..dd085caa93 100644 > >> --- a/drivers/net/enic/enic_ethdev.c > >> +++ b/drivers/net/enic/enic_ethdev.c > >> @@ -1303,6 +1303,13 @@ static int eth_enic_pci_probe(struct > >rte_pci_driver *pci_drv __rte_unused, > >> if (retval) > >> return retval; > >> } > >> + if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) > >> + return 0; > >> + if (eth_da.type != RTE_ETH_REPRESENTOR_VF) { > >> + ENICPMD_LOG(ERR, "unsupported representor type: %s\n", > >> + pci_dev->device.devargs->args); > >> + return -ENOTSUP; > >> + } > >> retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, > >> sizeof(struct enic), > >> eth_dev_pci_specific_init, pci_dev, > >> diff --git a/drivers/net/i40e/i40e_ethdev.c > >> b/drivers/net/i40e/i40e_ethdev.c > >> index f54769c29d..05ed2e1079 100644 > >> --- a/drivers/net/i40e/i40e_ethdev.c > >> +++ b/drivers/net/i40e/i40e_ethdev.c > >> @@ -640,6 +640,14 @@ eth_i40e_pci_probe(struct rte_pci_driver *pci_drv > >__rte_unused, > >> return retval; > >> } > >> > >> + if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) > >> + return 0; > >> + if (eth_da.type != RTE_ETH_REPRESENTOR_VF) { > >> + PMD_DRV_LOG(ERR, "unsupported representor type: %s\n", > >> + pci_dev->device.devargs->args); > >> + return -ENOTSUP; > >> + } > >> + > >> retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, > >> sizeof(struct i40e_adapter), > >> eth_dev_pci_specific_init, pci_dev, > >> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c > >b/drivers/net/ixgbe/ixgbe_ethdev.c > >> index 
9a47a8b262..9ea0139197 100644 > >> --- a/drivers/net/ixgbe/ixgbe_ethdev.c > >> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c > >> @@ -1717,6 +1717,14 @@ eth_ixgbe_pci_probe(struct rte_pci_driver > >*pci_drv __rte_unused, > >> } else > >> memset(ð_da, 0, sizeof(eth_da)); > >> > >> + if (eth_da.type == RTE_ETH_REPRESENTOR_NONE) > >> +
Re: [dpdk-dev] [PATCH v3] net/axgbe: add support for reading FW version
-Original Message- From: Sebastian, Selwin Sent: Wednesday, January 6, 2021 1:31 PM To: dev@dpdk.org Cc: Somalapuram, Amaranath ; ferruh.yi...@intel.com Subject: [PATCH v3] net/axgbe: add support for reading FW version From: Selwin Sebastian Added support for fw_version_get API Signed-off-by: Selwin Sebastian --- doc/guides/nics/features/axgbe.ini | 1 + drivers/net/axgbe/axgbe_ethdev.c | 1 + drivers/net/axgbe/axgbe_rxtx.c | 28 drivers/net/axgbe/axgbe_rxtx.h | 3 +++ 4 files changed, 33 insertions(+) diff --git a/doc/guides/nics/features/axgbe.ini b/doc/guides/nics/features/axgbe.ini index 34df0d1ee..3adc5639f 100644 --- a/doc/guides/nics/features/axgbe.ini +++ b/doc/guides/nics/features/axgbe.ini @@ -17,6 +17,7 @@ CRC offload = Y L3 checksum offload = Y L4 checksum offload = Y Basic stats = Y +FW version = Y Linux UIO= Y x86-32 = Y x86-64 = Y diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c index cfe6aba73..1982c6a8e 100644 --- a/drivers/net/axgbe/axgbe_ethdev.c +++ b/drivers/net/axgbe/axgbe_ethdev.c @@ -257,6 +257,7 @@ static const struct eth_dev_ops axgbe_eth_dev_ops = { .timesync_adjust_time = axgbe_timesync_adjust_time, .timesync_read_time = axgbe_timesync_read_time, .timesync_write_time = axgbe_timesync_write_time, + .fw_version_get = axgbe_dev_fw_version_get, }; static int axgbe_phy_reset(struct axgbe_port *pdata) diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c index 032e3cebc..c10127a02 100644 --- a/drivers/net/axgbe/axgbe_rxtx.c +++ b/drivers/net/axgbe/axgbe_rxtx.c @@ -571,6 +571,34 @@ int axgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx, return 0; } +int axgbe_dev_fw_version_get(struct rte_eth_dev *eth_dev, + char *fw_version, size_t fw_size) +{ + struct axgbe_port *pdata; + struct axgbe_hw_features *hw_feat; + int ret; + + pdata = (struct axgbe_port *)eth_dev->data->dev_private; + hw_feat = &pdata->hw_feat; + + if (fw_version == NULL) + return -EINVAL; + + ret = 
snprintf(fw_version, fw_size, "%d.%d.%d", + AXGMAC_GET_BITS(hw_feat->version, MAC_VR, USERVER), + AXGMAC_GET_BITS(hw_feat->version, MAC_VR, DEVID), + AXGMAC_GET_BITS(hw_feat->version, MAC_VR, SNPSVER)); + if (ret < 0) + return -EINVAL; + + ret += 1; /* add the size of '\0' */ + + if (fw_size < (size_t)ret) + return ret; + else + return 0; +} + static void axgbe_txq_prepare_tx_stop(struct axgbe_port *pdata, unsigned int queue) { diff --git a/drivers/net/axgbe/axgbe_rxtx.h b/drivers/net/axgbe/axgbe_rxtx.h index f2fbe9299..c2b11bb0e 100644 --- a/drivers/net/axgbe/axgbe_rxtx.h +++ b/drivers/net/axgbe/axgbe_rxtx.h @@ -162,6 +162,9 @@ void axgbe_dev_disable_tx(struct rte_eth_dev *dev); int axgbe_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id); int axgbe_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id); +int axgbe_dev_fw_version_get(struct rte_eth_dev *eth_dev, + char *fw_version, size_t fw_size); + uint16_t axgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts); uint16_t axgbe_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts, -- 2.17.1 Acked-by: Somalapuram Amaranath
Re: [dpdk-dev] [PATCH v3 0/2] Enhance Async Enqueue for Small Packets
On 12/25/20 9:28 AM, Jiayu Hu wrote: > Async enqueue offloads large copies to DMA devices, and small copies > are still performed by the CPU. However, it requires users to get > enqueue completed packets by rte_vhost_poll_enqueue_completed(), even > if they are completed by the CPU when rte_vhost_submit_enqueue_burst() > returns. This design incurs extra overheads of tracking completed > pktmbufs and function calls, thus degrading performance on small packets. > > The first patch cleans up async enqueue code, and the second patch > enables rte_vhost_submit_enqueue_burst() to return completed packets. > > Change log > == > v3: > - fix incorrect ret value when DMA ring is full > - enhance description of API declaration and programmer guide > v2: > - fix typo > - rename API variables > - update programmer guide > > Jiayu Hu (2): > vhost: cleanup async enqueue > vhost: enhance async enqueue for small packets > > doc/guides/prog_guide/vhost_lib.rst | 8 +- > lib/librte_vhost/rte_vhost_async.h | 32 +++-- > lib/librte_vhost/vhost.c| 14 +- > lib/librte_vhost/vhost.h| 7 +- > lib/librte_vhost/vhost_user.c | 7 +- > lib/librte_vhost/virtio_net.c | 258 > > 6 files changed, 185 insertions(+), 141 deletions(-) > CI reports build failure with your series, because API changes are not done in the examples: FAILED: examples/dpdk-vhost.p/vhost_main.c.o cc -Iexamples/dpdk-vhost.p -Iexamples -I../examples -Iexamples/vhost -I../examples/vhost -I. -I.. 
-Iconfig -I../config -Ilib/librte_eal/include -I../lib/librte_eal/include -Ilib/librte_eal/linux/include -I../lib/librte_eal/linux/include -Ilib/librte_eal/x86/include -I../lib/librte_eal/x86/include -Ilib/librte_eal/common -I../lib/librte_eal/common -Ilib/librte_eal -I../lib/librte_eal -Ilib/librte_kvargs -I../lib/librte_kvargs -Ilib/librte_metrics -I../lib/librte_metrics -Ilib/librte_telemetry -I../lib/librte_telemetry -Ilib/librte_mempool -I../lib/librte_mempool -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf -I../lib/librte_mbuf -Ilib/librte_ethdev -I../lib/librte_ethdev -Ilib/librte_meter -I../lib/librte_meter -Ilib/librte_cmdline -I../lib/librte_cmdline -Ilib/librte_vhost -I../lib/librte_vhost -Ilib/librte_cryptodev -I../lib/librte_cryptodev -Ilib/librte_hash -I../lib/librte_hash -Ilib/librte_rcu -I../lib/librte_rcu -Ilib/librte_pci -I../lib/librte_pci -Idrivers/raw/ioat -I../drivers/raw/ioat -Ilib/librte_rawdev -I../lib/librte_rawdev -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Idrivers/bus/vdev -I../drivers/bus/vdev -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -O3 -include rte_config.h -Wextra -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-missing-field-initializers -D_GNU_SOURCE -march=native -Wno-format-truncation -DALLOW_EXPERIMENTAL_API -MD -MQ examples/dpdk-vhost.p/vhost_main.c.o -MF examples/dpdk-vhost.p/vhost_main.c.o.d -o examples/dpdk-vhost.p/vhost_main.c.o -c ../examples/vhost/main.c ../examples/vhost/main.c: In function 'virtio_xmit': ../examples/vhost/main.c:817:9: error: too few arguments to function 'rte_vhost_submit_enqueue_burst' ret = rte_vhost_submit_enqueue_burst(dst_vdev->vid, VIRTIO_RXQ, ^~ In file included from ../examples/vhost/ioat.h:10:0, 
from ../examples/vhost/main.c:28: ../lib/librte_vhost/rte_vhost_async.h:171:10: note: declared here uint16_t rte_vhost_submit_enqueue_burst(int vid, uint16_t queue_id, ^~ ../examples/vhost/main.c: In function 'drain_eth_rx': ../examples/vhost/main.c:1126:19: error: too few arguments to function 'rte_vhost_submit_enqueue_burst' enqueue_count = rte_vhost_submit_enqueue_burst(vdev->vid, ^~ In file included from ../examples/vhost/ioat.h:10:0, from ../examples/vhost/main.c:28: ../lib/librte_vhost/rte_vhost_async.h:171:10: note: declared here uint16_t rte_vhost_submit_enqueue_burst(int vid, uint16_t queue_id, ^~
[dpdk-dev] [PATCH 3/5] common/dpaax/caamflib: update zuc-zuc descriptor sharing
From: Akhil Goyal The descriptor sharing needed to be changed for ZUC+ZUC, as we were getting an invalid CHA combination error due to sharing being done on DECOs simultaneously. Signed-off-by: Akhil Goyal --- drivers/common/dpaax/caamflib/desc/pdcp.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/common/dpaax/caamflib/desc/pdcp.h b/drivers/common/dpaax/caamflib/desc/pdcp.h index f084cf1de0..659e289a45 100644 --- a/drivers/common/dpaax/caamflib/desc/pdcp.h +++ b/drivers/common/dpaax/caamflib/desc/pdcp.h @@ -3282,7 +3282,7 @@ cnstr_shdsc_pdcp_u_plane_encap(uint32_t *descbuf, SHR_ALWAYS, /* NULL */ SHR_WAIT, /* SNOW f9 */ SHR_WAIT, /* AES CMAC */ - SHR_ALWAYS /* ZUC-I */ + SHR_WAIT /* ZUC-I */ }, }; LABEL(pdb_end); @@ -3485,7 +3485,7 @@ cnstr_shdsc_pdcp_u_plane_decap(uint32_t *descbuf, SHR_ALWAYS, /* NULL */ SHR_WAIT, /* SNOW f9 */ SHR_WAIT, /* AES CMAC */ - SHR_ALWAYS /* ZUC-I */ + SHR_WAIT /* ZUC-I */ }, }; -- 2.17.1
[dpdk-dev] [PATCH 2/5] test/crypto: add AES-XCBC hash only test case
This patch adds test case for AES-XCBC hash only for Digest and Digest-verify Signed-off-by: Hemant Agrawal --- app/test/test_cryptodev_hash_test_vectors.h | 41 + 1 file changed, 41 insertions(+) diff --git a/app/test/test_cryptodev_hash_test_vectors.h b/app/test/test_cryptodev_hash_test_vectors.h index e261dfe36c..8b23f011e8 100644 --- a/app/test/test_cryptodev_hash_test_vectors.h +++ b/app/test/test_cryptodev_hash_test_vectors.h @@ -352,6 +352,37 @@ cmac_test_vector = { } }; +static const uint8_t aes_xcbc_mac_plain_text[32] = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, + 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f +}; + +static const struct blockcipher_test_data +aes_xcbc_mac_test_vector = { + .auth_algo = RTE_CRYPTO_AUTH_AES_XCBC_MAC, + .ciphertext = { + .data = (uint8_t *)&aes_xcbc_mac_plain_text, + .len = 32 + }, + .auth_key = { + .data = { + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f + }, + .len = 16 + }, + .digest = { + .data = { + 0xf5, 0x4f, 0x0e, 0xc8, 0xd2, 0xb9, 0xf3, 0xd3, + 0x68, 0x07, 0x73, 0x4b, 0xd5, 0x28, 0x3f, 0xd4 + }, + .len = 16, + .truncated_len = 16 + } +}; + static const struct blockcipher_test_data null_auth_test_vector = { .auth_algo = RTE_CRYPTO_AUTH_NULL, @@ -576,6 +607,16 @@ static const struct blockcipher_test_case hash_test_cases[] = { .op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY, .feature_mask = BLOCKCIPHER_TEST_FEATURE_OOP, }, + { + .test_descr = "AES-XCBC-MAC Digest 16B", + .test_data = &aes_xcbc_mac_test_vector, + .op_mask = BLOCKCIPHER_TEST_OP_AUTH_GEN, + }, + { + .test_descr = "AES-XCBC-MAC Digest Verify 16B", + .test_data = &aes_xcbc_mac_test_vector, + .op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY, + }, }; -- 2.17.1
[dpdk-dev] [PATCH 1/5] crypto/dpaa2_sec: support AES-XCBC-MAC
From: Akhil Goyal This patch add support for AES-XCBC-MAC for following cases - AES-XCBC-MAC auth only - AES-CBC/CTR + AES-XCBC-MAC (non-proto) - AES-CBC/CTR + AES-XCBC-MAC (protocol offload) - DES-CBC + AES-XCBC-MAC (non-proto) - 3DES-CBC + AES-XCBC-MAC (non-proto) Signed-off-by: Barry Cao Signed-off-by: Hemant Agrawal Signed-off-by: Akhil Goyal --- doc/guides/cryptodevs/dpaa2_sec.rst | 1 + doc/guides/cryptodevs/features/dpaa2_sec.ini | 1 + drivers/common/dpaax/caamflib/desc/algo.h | 63 +++ drivers/common/dpaax/caamflib/desc/ipsec.h| 18 -- .../common/dpaax/caamflib/rta/operation_cmd.h | 6 +- drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 20 +- drivers/crypto/dpaa2_sec/dpaa2_sec_priv.h | 21 +++ 7 files changed, 123 insertions(+), 7 deletions(-) diff --git a/doc/guides/cryptodevs/dpaa2_sec.rst b/doc/guides/cryptodevs/dpaa2_sec.rst index 83565d7175..275ccf28de 100644 --- a/doc/guides/cryptodevs/dpaa2_sec.rst +++ b/doc/guides/cryptodevs/dpaa2_sec.rst @@ -121,6 +121,7 @@ Hash algorithms: * ``RTE_CRYPTO_AUTH_SHA384_HMAC`` * ``RTE_CRYPTO_AUTH_SHA512_HMAC`` * ``RTE_CRYPTO_AUTH_MD5_HMAC`` +* ``RTE_CRYPTO_AUTH_AES_XCBC_MAC`` AEAD algorithms: diff --git a/doc/guides/cryptodevs/features/dpaa2_sec.ini b/doc/guides/cryptodevs/features/dpaa2_sec.ini index 02c1bf4185..9828d1528e 100644 --- a/doc/guides/cryptodevs/features/dpaa2_sec.ini +++ b/doc/guides/cryptodevs/features/dpaa2_sec.ini @@ -46,6 +46,7 @@ SHA384 HMAC = Y SHA512 = Y SHA512 HMAC = Y SNOW3G UIA2 = Y +AES XCBC MAC = Y ZUC EIA3 = Y ; diff --git a/drivers/common/dpaax/caamflib/desc/algo.h b/drivers/common/dpaax/caamflib/desc/algo.h index 41cac5abd0..cf43d9c14c 100644 --- a/drivers/common/dpaax/caamflib/desc/algo.h +++ b/drivers/common/dpaax/caamflib/desc/algo.h @@ -873,4 +873,67 @@ cnstr_shdsc_gcm_decap(uint32_t *descbuf, bool ps, bool swap, return PROGRAM_FINALIZE(p); } +/** + * cnstr_shdsc_aes_xcbc_mac - AES_XCBC_MAC + * @descbuf: pointer to descriptor-under-construction buffer + * @ps: if 36/40bit addressing is 
desired, this parameter must be true + * @swap: must be true when core endianness doesn't match SEC endianness + * @share: sharing type of shared descriptor + * @authdata: pointer to authentication transform definitions; + *message digest algorithm: OP_ALG_ALGSEL_AES. + * @do_icv: 0 if ICV checking is not desired, any other value if ICV checking + * is needed for all the packets processed by this shared descriptor + * @trunc_len: Length of the truncated ICV to be written in the output buffer, + * 0 if no truncation is needed + * + * Note: There's no support for keys longer than the block size of the + * underlying hash function, according to the selected algorithm. + * + * Return: size of descriptor written in words or negative number on error + */ +static inline int +cnstr_shdsc_aes_xcbc_mac(uint32_t *descbuf, bool ps, bool swap, + enum rta_share_type share, + struct alginfo *authdata, uint8_t do_icv, + uint8_t trunc_len) +{ + struct program prg; + struct program *p = &prg; + uint8_t opicv, dir; + + opicv = do_icv ? ICV_CHECK_ENABLE : ICV_CHECK_DISABLE; + dir = do_icv ? 
DIR_DEC : DIR_ENC; + + PROGRAM_CNTXT_INIT(p, descbuf, 0); + if (swap) + PROGRAM_SET_BSWAP(p); + if (ps) + PROGRAM_SET_36BIT_ADDR(p); + SHR_HDR(p, share, 1, SC); + + KEY(p, KEY2, authdata->key_enc_flags, authdata->key, authdata->keylen, + INLINE_KEY(authdata)); + + /* compute sequences */ + if (opicv == ICV_CHECK_ENABLE) + MATHB(p, SEQINSZ, SUB, trunc_len, VSEQINSZ, 4, IMMED2); + else + MATHB(p, SEQINSZ, SUB, MATH2, VSEQINSZ, 4, 0); + + /* Do operation */ + ALG_OPERATION(p, authdata->algtype, authdata->algmode, + OP_ALG_AS_INITFINAL, opicv, dir); + + /* Do load (variable length) */ + SEQFIFOLOAD(p, MSG2, 0, VLF | LAST2); + + if (opicv == ICV_CHECK_ENABLE) { + LOAD(p, trunc_len, ICV2SZ, 0, 4, IMMED); + SEQFIFOLOAD(p, ICV2, trunc_len, LAST2); + } else + SEQSTORE(p, CONTEXT2, 0, trunc_len, 0); + + return PROGRAM_FINALIZE(p); +} + #endif /* __DESC_ALGO_H__ */ diff --git a/drivers/common/dpaax/caamflib/desc/ipsec.h b/drivers/common/dpaax/caamflib/desc/ipsec.h index 83dd93f587..668d21649d 100644 --- a/drivers/common/dpaax/caamflib/desc/ipsec.h +++ b/drivers/common/dpaax/caamflib/desc/ipsec.h @@ -865,6 +865,7 @@ cnstr_shdsc_ipsec_decap(uint32_t *descbuf, bool ps, bool swap, * cnstr_shdsc_ipsec_encap_des_aes_xcbc - IPSec DES-CBC/3DES-CBC and * AES-XCBC-MAC-96 ESP encapsulation shared descriptor. * @descbuf: pointer to buffer used for descriptor construction + * @share: sharing type of shared descriptor * @pdb: pointer to the PDB to be used with this
[dpdk-dev] [PATCH 4/5] crypto/dpaa2_sec: add support for AES CMAC integrity check
This patch adds support for AES_CMAC integrity in non-security mode. This patch modifies the caam flib to handle AES CMAC without conflicting with the proto ALG operations, i.e. by creating another ALG operation routine. Signed-off-by: Hemant Agrawal --- doc/guides/cryptodevs/dpaa2_sec.rst | 1 + doc/guides/cryptodevs/features/dpaa2_sec.ini | 1 + drivers/common/dpaax/caamflib/desc/algo.h | 16 ++- drivers/common/dpaax/caamflib/rta.h | 3 + .../common/dpaax/caamflib/rta/operation_cmd.h | 103 +- drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 39 ++- drivers/crypto/dpaa2_sec/dpaa2_sec_priv.h | 21 7 files changed, 171 insertions(+), 13 deletions(-) diff --git a/doc/guides/cryptodevs/dpaa2_sec.rst b/doc/guides/cryptodevs/dpaa2_sec.rst index 275ccf28de..a7fc9cef99 100644 --- a/doc/guides/cryptodevs/dpaa2_sec.rst +++ b/doc/guides/cryptodevs/dpaa2_sec.rst @@ -122,6 +122,7 @@ Hash algorithms: * ``RTE_CRYPTO_AUTH_SHA512_HMAC`` * ``RTE_CRYPTO_AUTH_MD5_HMAC`` * ``RTE_CRYPTO_AUTH_AES_XCBC_MAC`` +* ``RTE_CRYPTO_AUTH_AES_CMAC`` AEAD algorithms: diff --git a/doc/guides/cryptodevs/features/dpaa2_sec.ini b/doc/guides/cryptodevs/features/dpaa2_sec.ini index 9828d1528e..a1c91821de 100644 --- a/doc/guides/cryptodevs/features/dpaa2_sec.ini +++ b/doc/guides/cryptodevs/features/dpaa2_sec.ini @@ -48,6 +48,7 @@ SHA512 HMAC = Y SNOW3G UIA2 = Y AES XCBC MAC = Y ZUC EIA3 = Y +AES CMAC (128) = Y ; ; Supported AEAD algorithms of the 'dpaa2_sec' crypto driver. diff --git a/drivers/common/dpaax/caamflib/desc/algo.h b/drivers/common/dpaax/caamflib/desc/algo.h index cf43d9c14c..7f66ee5fd9 100644 --- a/drivers/common/dpaax/caamflib/desc/algo.h +++ b/drivers/common/dpaax/caamflib/desc/algo.h @@ -1,7 +1,7 @@ /* SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0) * * Copyright 2008-2016 Freescale Semiconductor Inc. 
- * Copyright 2016,2019-2020 NXP + * Copyright 2016,2019-2021 NXP * */ @@ -435,13 +435,17 @@ cnstr_shdsc_hmac(uint32_t *descbuf, bool ps, bool swap, INLINE_KEY(authdata)); /* Do operation */ - ALG_OPERATION(p, authdata->algtype, OP_ALG_AAI_HMAC, + ALG_OPERATION(p, authdata->algtype, authdata->algmode, OP_ALG_AS_INITFINAL, opicv, dir); pjmpprecomp = JUMP(p, jmpprecomp, LOCAL_JUMP, ALL_TRUE, 0); SET_LABEL(p, keyjmp); - ALG_OPERATION(p, authdata->algtype, OP_ALG_AAI_HMAC_PRECOMP, + if (authdata->algmode == OP_ALG_AAI_HMAC) + ALG_OPERATION(p, authdata->algtype, OP_ALG_AAI_HMAC_PRECOMP, + OP_ALG_AS_INITFINAL, opicv, dir); + else + ALG_OPERATION(p, authdata->algtype, authdata->algmode, OP_ALG_AS_INITFINAL, opicv, dir); SET_LABEL(p, jmpprecomp); @@ -874,7 +878,7 @@ cnstr_shdsc_gcm_decap(uint32_t *descbuf, bool ps, bool swap, } /** - * cnstr_shdsc_aes_xcbc_mac - AES_XCBC_MAC + * cnstr_shdsc_aes_xx_mac - AES_XCBC_MAC, CMAC cases * @descbuf: pointer to descriptor-under-construction buffer * @ps: if 36/40bit addressing is desired, this parameter must be true * @swap: must be true when core endianness doesn't match SEC endianness @@ -892,7 +896,7 @@ cnstr_shdsc_gcm_decap(uint32_t *descbuf, bool ps, bool swap, * Return: size of descriptor written in words or negative number on error */ static inline int -cnstr_shdsc_aes_xcbc_mac(uint32_t *descbuf, bool ps, bool swap, +cnstr_shdsc_aes_xx_mac(uint32_t *descbuf, bool ps, bool swap, enum rta_share_type share, struct alginfo *authdata, uint8_t do_icv, uint8_t trunc_len) @@ -921,7 +925,7 @@ cnstr_shdsc_aes_xcbc_mac(uint32_t *descbuf, bool ps, bool swap, MATHB(p, SEQINSZ, SUB, MATH2, VSEQINSZ, 4, 0); /* Do operation */ - ALG_OPERATION(p, authdata->algtype, authdata->algmode, + ALG_OPERATION_NP(p, authdata->algtype, authdata->algmode, OP_ALG_AS_INITFINAL, opicv, dir); /* Do load (variable length) */ diff --git a/drivers/common/dpaax/caamflib/rta.h b/drivers/common/dpaax/caamflib/rta.h index c4bbad0b41..e5a736346e 100644 --- 
a/drivers/common/dpaax/caamflib/rta.h +++ b/drivers/common/dpaax/caamflib/rta.h @@ -485,6 +485,9 @@ rta_get_sec_era(void) #define ALG_OPERATION(program, cipher_alg, aai, algo_state, icv_check, enc) \ rta_operation(program, cipher_alg, aai, algo_state, icv_check, enc) +#define ALG_OPERATION_NP(program, cipher_alg, aai, algo_state, icv_check, enc) \ + rta_operation2(program, cipher_alg, aai, algo_state, icv_check, enc) + /** * PROTOCOL - Configures PROTOCOL OPERATION command * @program: pointer to struct program diff --git a/drivers/common/dpaax/caamflib/rta/operation_cmd.h b/drivers/common/dpaax/caamflib/rta/operation_cmd.h index 04732aa3d2..f341fdcc54 100644 --- a/drivers/common/dpaax/caamflib/rta/operation_cmd.h +++ b/drivers/common/dpaax/caamflib/rta/operatio
[dpdk-dev] [PATCH 5/5] crypto/dpaa_sec: reduce the log on queue closure
If for some reason the queue is not closed properly, especially in test cases, the queue retire prints flood the screen. They are not really required at the WARNING level. Signed-off-by: Hemant Agrawal --- drivers/crypto/dpaa_sec/dpaa_sec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 44c742738f..a4c4b094bb 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -2283,7 +2283,7 @@ dpaa_sec_detach_rxq(struct dpaa_sec_dev_private *qi, struct qman_fq *fq) for (i = 0; i < RTE_DPAA_MAX_RX_QUEUE; i++) { if (&qi->inq[i] == fq) { if (qman_retire_fq(fq, NULL) != 0) - DPAA_SEC_WARN("Queue is not retired\n"); + DPAA_SEC_DEBUG("Queue is not retired\n"); qman_oos_fq(fq); qi->inq_attach[i] = 0; return 0; -- 2.17.1
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On 1/7/2021 5:39 AM, George Prekas wrote: On 1/6/2021 12:02 PM, Ferruh Yigit wrote: On 12/5/2020 5:42 AM, George Prekas wrote: Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c and the calculated IP checksum is wrong on GCC 9 and GCC 10. Signed-off-by: George Prekas --- v2: * Instead of a compiler barrier, use a compiler flag. ---  app/test-pmd/meson.build | 1 +  1 file changed, 1 insertion(+) diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build index 7e9c7bdd6..5d24e807f 100644 --- a/app/test-pmd/meson.build +++ b/app/test-pmd/meson.build @@ -4,6 +4,7 @@  # override default name to drop the hyphen  name = 'testpmd'  cflags += '-Wno-deprecated-declarations' +cflags += '-fno-strict-aliasing'  sources = files('5tswap.c',  'cmdline.c',  'cmdline_flow.c', Hi George, I am trying to understand this, the relevant code is as below: ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, sizeof(*ip_hdr)); You are suspicious of strict aliasing rule violation, with more details: The concern is the "struct rte_ipv4_hdr *ip_hdr;" aliased to "const unaligned_uint16_t *hdr", and compiler can optimize out the calculations using data pointed by 'hdr' pointer, since the 'hdr' pointer is not used to alter the data and compiler may think data is not changed at all. 1) But the pointer "hdr" is assigned in the loop, from another pointer whose content is changing, why this is not helping to figure out that the data 'hdr' pointing is changed. 2) I tried to debug this, but I am not able to reproduce the issue, 'ip_sum()' called each time and checksum calculated correctly. Using gcc 10.2.1-9. Can you able to confirm the case with debug, or from the assembly/object file? 
And if the issue is strict aliasing rule violation as you said, compiler flag is an option but not sure how much it reduces the compiler optimization benefit, I guess other options also not so good, memcpy brings too much work on runtime and union requires bigger change and makes code complex. I wonder if making 'ip_sum()' a non inline function can help, can you please give a try since you can reproduce it? Hi Ferruh, Thanks for looking into it. I am copy-pasting at the end of this email a minimal reproduction. It calculates a checksum and prints it. The correct value is f8d9. If you compile it with -O0 or -O3 -fno-strict-aliasing, you will get the correct value. If you compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and -O3, you will get f8e8. You can also try it on https://godbolt.org/ and see how different versions behave. My understanding is that the code violates the C standard (https://stackoverflow.com/a/99010). Thanks for the sample code below, I copied to the godbolt: https://godbolt.org/z/6fMK19 In gcc 10, the checksum calculation is done during compilation (when optimization is enabled) and the value is returned directly: mov$0xffed,%esi Since a calculation is happening I assume the compiler knows about the aliasing and OK with it. But that optimized calculation seems wrong, when it is disabled [1] the checksum is correct again. [1] all following seems helping to disable compile time calculation - disabling optimization - putting a compiler barrier - putting a 'printf' inside 'ip_sum()' - fno-strict-aliasing gcc 8 & 9 is not doing this compile time calculation, hence they are not affected. This feels like an optimization issue in gcc10, but not sure exactly on the root cause, and how to disable it properly in our case. 
--- cut here ---
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct rte_ipv4_hdr {
	uint8_t version_ihl;
	uint8_t type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t time_to_live;
	uint8_t next_proto_id;
	uint16_t hdr_checksum;
	uint32_t src_addr;
	uint32_t dst_addr;
};

static inline uint16_t
ip_sum(const uint16_t *hdr, int hdr_len)
{
	uint32_t sum = 0;

	while (hdr_len > 1) {
		sum += *hdr++;
		if (sum & 0x80000000)
			sum = (sum & 0xFFFF) + (sum >> 16);
		hdr_len -= 2;
	}

	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return ~sum;
}

static void
pkt_burst_flow_gen(void)
{
	struct rte_ipv4_hdr *ip_hdr = (struct rte_ipv4_hdr *)malloc(4096);

	memset(ip_hdr, 0, sizeof(*ip_hdr));
	ip_hdr->version_ihl = 1;
	ip_hdr->type_of_service = 2;
	ip_hdr->fragment_offset = 3;
	ip_hdr->time_to_live = 4;
	ip_hdr->next_proto_id = 5;
	ip_hdr->packet_id = 6;
	ip_hdr->src_addr = 7;
	ip_hdr->dst_addr = 8;
	ip_hdr->total_length = 9;
	ip_hdr->hdr_checksum = ip_sum((uint16_t *)ip_hdr, sizeof(*ip_hdr));
	printf("%x\n", ip_hdr->hdr_checksum);
}

int main(void)
{
	pkt_burst_flow_gen();
	return 0;
}
Re: [dpdk-dev] [PATCH] mlx5: fix __mlx5_bit_off macro warning for Windows
> Subject: Re: [dpdk-dev] [PATCH] mlx5: fix __mlx5_bit_off macro warning for > Windows > > External email: Use caution opening links or attachments > > > On Wed, 6 Jan 2021 15:42:21 +0200, Tal Shnaiderman wrote: > > While compiling with clang 11 the callers of the __mlx5_bit_off macro > > warns on the cast of pointers to unsigned long which is a smaller int > > type in Windows. > > > > warning: cast to smaller integer type 'unsigned long' > > from 'u8 (*)[16]' [-Wpointer-to-int-cast] > > > > To resolve it the type is changed to size_t to be compatible for both > > Linux and Windows. > > uintptr_t is the type for integers storing pointer values, not size_t. Correct, thanks.
[dpdk-dev] [PATCH v2] mlx5: fix __mlx5_bit_off macro warning for Windows
While compiling with clang 11 the callers of the __mlx5_bit_off macro warns on the cast of pointers to unsigned long which is a smaller int type in Windows. warning: cast to smaller integer type 'unsigned long' from 'u8 (*)[16]' [-Wpointer-to-int-cast] To resolve it the type is changed to uintptr_t to be compatible for both Linux and Windows. Fixes: 865a0c15672c ("net/mlx5: add Direct Verbs prepare function") Cc: sta...@dpdk.org Signed-off-by: Tal Shnaiderman --- v2: change type to uintptr_t [DmitryK] --- drivers/common/mlx5/mlx5_prm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h index 8c9b53ce10..b6a3a3cfcb 100644 --- a/drivers/common/mlx5/mlx5_prm.h +++ b/drivers/common/mlx5/mlx5_prm.h @@ -601,7 +601,7 @@ typedef uint8_t u8; #define __mlx5_nullp(typ) ((struct mlx5_ifc_##typ##_bits *)0) #define __mlx5_bit_sz(typ, fld) sizeof(__mlx5_nullp(typ)->fld) -#define __mlx5_bit_off(typ, fld) ((unsigned int)(unsigned long) \ +#define __mlx5_bit_off(typ, fld) ((unsigned int)(uintptr_t) \ (&(__mlx5_nullp(typ)->fld))) #define __mlx5_dw_bit_off(typ, fld) (32 - __mlx5_bit_sz(typ, fld) - \ (__mlx5_bit_off(typ, fld) & 0x1f)) -- 2.16.1.windows.4
[dpdk-dev] [PATCH v7] raw/ioat: add secondary process support
Add support for secondary processes in ioat devices. The update allocates a memzone for a primary process or returns it in a secondary process. Signed-off-by: Kumar Amber --- v5 * add error check for memzone lookup v6 * fix compilation v7 * include dev ops for secondary --- drivers/raw/ioat/ioat_common.c | 22 -- drivers/raw/ioat/ioat_rawdev.c | 21 +++-- 2 files changed, 39 insertions(+), 4 deletions(-) diff --git a/drivers/raw/ioat/ioat_common.c b/drivers/raw/ioat/ioat_common.c index 142e171bc9..2428c8a5b6 100644 --- a/drivers/raw/ioat/ioat_common.c +++ b/drivers/raw/ioat/ioat_common.c @@ -215,14 +215,32 @@ idxd_rawdev_create(const char *name, struct rte_device *dev, goto cleanup; } + /* Allocate memory for the primary process or else return the memory +* of primary memzone for the secondary process. +*/ snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id); - mz = rte_memzone_reserve(mz_name, sizeof(struct idxd_rawdev), - dev->numa_node, RTE_MEMZONE_IOVA_CONTIG); + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + mz = rte_memzone_lookup(mz_name); + if (mz == NULL) { + IOAT_PMD_ERR("Unable lookup memzone for private data\n"); + ret = -ENOMEM; + goto cleanup; + } + rawdev->dev_private = mz->addr; + rawdev->dev_ops = ops; + rawdev->device = dev; + return 0; + } + mz = rte_memzone_reserve(mz_name, + sizeof(struct rte_ioat_rawdev), + dev->numa_node, + RTE_MEMZONE_IOVA_CONTIG); if (mz == NULL) { IOAT_PMD_ERR("Unable to reserve memzone for private data\n"); ret = -ENOMEM; goto cleanup; } + rawdev->dev_private = mz->addr; rawdev->dev_ops = ops; rawdev->device = dev; diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c index 2c88b4369f..e152004219 100644 --- a/drivers/raw/ioat/ioat_rawdev.c +++ b/drivers/raw/ioat/ioat_rawdev.c @@ -165,9 +165,26 @@ ioat_rawdev_create(const char *name, struct rte_pci_device *dev) goto cleanup; } + /* Allocate memory for the primary process or else return the memory +* of primary memzone for the 
secondary process. +*/ snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id); - mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ioat_rawdev), - dev->device.numa_node, RTE_MEMZONE_IOVA_CONTIG); + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { + mz = rte_memzone_lookup(mz_name); + if (mz == NULL) { + IOAT_PMD_ERR("Unable lookup memzone for private data\n"); + ret = -ENOMEM; + goto cleanup; + } + rawdev->dev_private = mz->addr; + rawdev->dev_ops = &ioat_rawdev_ops; + rawdev->device = &dev->device; + return 0; + } + mz = rte_memzone_reserve(mz_name, + sizeof(struct rte_ioat_rawdev), + dev->device.numa_node, + RTE_MEMZONE_IOVA_CONTIG); if (mz == NULL) { IOAT_PMD_ERR("Unable to reserve memzone for private data\n"); ret = -ENOMEM; -- 2.25.1
Re: [dpdk-dev] [PATCH v7] raw/ioat: add secondary process support
On Thu, Jan 07, 2021 at 05:53:12PM +0530, Kumar Amber wrote: > Add support for secondary processes in ioat devices. The update > allocates a memzone for a primary process or returns it in a > secondary process. > > Signed-off-by: Kumar Amber > > --- > v5 > * add error check for memzone lookup > v6 > * fix compilation > v7 > * include dev ops for secondary > --- Two comments below. With those fixed, you can add my ack to v8. Acked-by: Bruce Richardson > drivers/raw/ioat/ioat_common.c | 22 -- > drivers/raw/ioat/ioat_rawdev.c | 21 +++-- > 2 files changed, 39 insertions(+), 4 deletions(-) > > diff --git a/drivers/raw/ioat/ioat_common.c b/drivers/raw/ioat/ioat_common.c > index 142e171bc9..2428c8a5b6 100644 > --- a/drivers/raw/ioat/ioat_common.c > +++ b/drivers/raw/ioat/ioat_common.c > @@ -215,14 +215,32 @@ idxd_rawdev_create(const char *name, struct rte_device > *dev, > goto cleanup; > } > > + /* Allocate memory for the primary process or else return the memory > + * of primary memzone for the secondary process. > + */ > snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id); > - mz = rte_memzone_reserve(mz_name, sizeof(struct idxd_rawdev), > - dev->numa_node, RTE_MEMZONE_IOVA_CONTIG); This line actually doesn't need to be changed, and in doing so, it's actually introduced a bug. The sizeof() call here, has "idxd_rawdev" while the replacement below is a copy-paste error using "ioat_rawdev". 
> + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { > + mz = rte_memzone_lookup(mz_name); > + if (mz == NULL) { > + IOAT_PMD_ERR("Unable lookup memzone for private > data\n"); > + ret = -ENOMEM; > + goto cleanup; > + } > + rawdev->dev_private = mz->addr; > + rawdev->dev_ops = ops; > + rawdev->device = dev; > + return 0; > + } > + mz = rte_memzone_reserve(mz_name, > + sizeof(struct rte_ioat_rawdev), > + dev->numa_node, > + RTE_MEMZONE_IOVA_CONTIG); > if (mz == NULL) { > IOAT_PMD_ERR("Unable to reserve memzone for private data\n"); > ret = -ENOMEM; > goto cleanup; > } > + > rawdev->dev_private = mz->addr; > rawdev->dev_ops = ops; > rawdev->device = dev; > diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c > index 2c88b4369f..e152004219 100644 > --- a/drivers/raw/ioat/ioat_rawdev.c > +++ b/drivers/raw/ioat/ioat_rawdev.c > @@ -165,9 +165,26 @@ ioat_rawdev_create(const char *name, struct > rte_pci_device *dev) > goto cleanup; > } > > + /* Allocate memory for the primary process or else return the memory > + * of primary memzone for the secondary process. > + */ > snprintf(mz_name, sizeof(mz_name), "rawdev%u_private", rawdev->dev_id); > - mz = rte_memzone_reserve(mz_name, sizeof(struct rte_ioat_rawdev), > - dev->device.numa_node, RTE_MEMZONE_IOVA_CONTIG); No bug introduced here, but again I think changing this line is unnecessary. > + if (rte_eal_process_type() == RTE_PROC_SECONDARY) { > + mz = rte_memzone_lookup(mz_name); > + if (mz == NULL) { > + IOAT_PMD_ERR("Unable lookup memzone for private > data\n"); > + ret = -ENOMEM; > + goto cleanup; > + } > + rawdev->dev_private = mz->addr; > + rawdev->dev_ops = &ioat_rawdev_ops; > + rawdev->device = &dev->device; > + return 0; > + } > + mz = rte_memzone_reserve(mz_name, > + sizeof(struct rte_ioat_rawdev), > + dev->device.numa_node, > + RTE_MEMZONE_IOVA_CONTIG); > if (mz == NULL) { > IOAT_PMD_ERR("Unable to reserve memzone for private data\n"); > ret = -ENOMEM; > -- > 2.25.1 >
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
> -Original Message- > From: Thomas Monjalon > Sent: Thursday, January 7, 2021 6:12 PM > To: Guo, Jia > Cc: Zhang, Qi Z ; Wu, Jingjing ; > Yang, Qiming ; Wang, Haiyue > ; dev@dpdk.org; Yigit, Ferruh > ; andrew.rybche...@oktetlabs.ru; or...@nvidia.com; > getel...@nvidia.com > Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for > ecpri > > 07/01/2021 10:32, Guo, Jia: > > From: Thomas Monjalon > > > 24/12/2020 07:59, Jeff Guo: > > > > Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel > > > type. > > > > > > > > Signed-off-by: Jeff Guo > > > > Reviewed-by: Qi Zhang > > > [...] > > > > --- a/lib/librte_ethdev/rte_ethdev.h > > > > +++ b/lib/librte_ethdev/rte_ethdev.h > > > > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type { > > > > RTE_TUNNEL_TYPE_IP_IN_GRE, > > > > RTE_L2_TUNNEL_TYPE_E_TAG, > > > > RTE_TUNNEL_TYPE_VXLAN_GPE, > > > > + RTE_TUNNEL_TYPE_ECPRI, > > > > RTE_TUNNEL_TYPE_MAX, > > > > }; > > > > > > We tried to remove all these legacy API in DPDK 20.11. > > > Andrew decided to not remove this one because it is not yet > > > completely replaced by rte_flow in all drivers. > > > However, I am against continuing to update this API. > > > The opposite work should be done: migrate to rte_flow. > > > > Agree but seems that the legacy api and driver legacy implementation > > still keep in this release, and there is no a general way to replace > > the legacy by rte_flow right now. > > I think rte_flow is a complete replacement with more features. Thomas, I may not agree with this. Actually the "enum rte_eth_tunnel_type" is used by rte_eth_dev_udp_tunnel_port_add A packet with specific dst udp port will be recognized as a specific tunnel packet type (e.g. vxlan, vxlan-gpe, ecpri...) In Intel NIC, the API actually changes the configuration of the packet parser in HW but not add a filter rule and I guess all other devices may enable it in a similar way. 
so naturally it should be a device (port) level configuration but not a rte_flow rule for match, encap, decap... So I think it's not a good idea to replace the rte_eth_dev_udp_tunnel_port_add with rte_flow config and also there is no existing rte_flow_action can cover this requirement unless we introduce some new one. > You can match, encap, decap. > There is even a new API to get tunnel infos after decap. > What is missing? > > > > > Sorry, it is a nack. > > > BTW, it is probably breaking the ABI because of RTE_TUNNEL_TYPE_MAX. Yes that may break the ABI but fortunately the checking-abi-compatibility tool shows negative :) , thanks Ferruh' s guide. https://github.com/ferruhy/dpdk/actions/runs/468859673 Thanks Qi > > > > Oh, the ABI break should be a problem. > > > > > PS: please Cc ethdev maintainers for such patch, thanks. > > > tip: use --cc-cmd devtools/get-maintainer.sh > > > > Thanks for your helpful tip. > >
Re: [dpdk-dev] [PATCH v2] mlx5: fix __mlx5_bit_off macro warning for Windows
From: Tal Shnaiderman > While compiling with clang 11 the callers of the __mlx5_bit_off macro warns > on the cast of pointers to unsigned long which is a smaller int type in > Windows. > > warning: cast to smaller integer type 'unsigned long' > from 'u8 (*)[16]' [-Wpointer-to-int-cast] > > To resolve it the type is changed to uintptr_t to be compatible for both Linux > and Windows. > > Fixes: 865a0c15672c ("net/mlx5: add Direct Verbs prepare function") > Cc: sta...@dpdk.org > > Signed-off-by: Tal Shnaiderman Acked-by: Matan Azrad
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On 1/7/2021 11:32 AM, Ferruh Yigit wrote: On 1/7/2021 5:39 AM, George Prekas wrote: On 1/6/2021 12:02 PM, Ferruh Yigit wrote: On 12/5/2020 5:42 AM, George Prekas wrote: Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c and the calculated IP checksum is wrong on GCC 9 and GCC 10. Signed-off-by: George Prekas --- v2: * Instead of a compiler barrier, use a compiler flag. ---   app/test-pmd/meson.build | 1 +   1 file changed, 1 insertion(+) diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build index 7e9c7bdd6..5d24e807f 100644 --- a/app/test-pmd/meson.build +++ b/app/test-pmd/meson.build @@ -4,6 +4,7 @@   # override default name to drop the hyphen   name = 'testpmd'   cflags += '-Wno-deprecated-declarations' +cflags += '-fno-strict-aliasing'   sources = files('5tswap.c',   'cmdline.c',   'cmdline_flow.c', Hi George, I am trying to understand this, the relevant code is as below: ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, sizeof(*ip_hdr)); You are suspicious of a strict-aliasing rule violation, with more details: The concern is that "struct rte_ipv4_hdr *ip_hdr;" is aliased to "const unaligned_uint16_t *hdr", and the compiler can optimize out the calculations using the data pointed to by the 'hdr' pointer, since the 'hdr' pointer is not used to alter the data and the compiler may think the data is not changed at all. 1) But the pointer "hdr" is assigned in the loop, from another pointer whose content is changing; why does this not help the compiler figure out that the data 'hdr' points to has changed? 2) I tried to debug this, but I am not able to reproduce the issue; 'ip_sum()' is called each time and the checksum is calculated correctly. Using gcc 10.2.1-9. Are you able to confirm the case with debug, or from the assembly/object file?
And if the issue is a strict-aliasing rule violation as you said, a compiler flag is an option, but I am not sure how much it reduces the compiler optimization benefit. I guess the other options are also not so good: memcpy brings too much work at runtime, and a union requires a bigger change and makes the code complex. I wonder if making 'ip_sum()' a non-inline function can help; can you please give it a try since you can reproduce it? Hi Ferruh, Thanks for looking into it. I am copy-pasting at the end of this email a minimal reproduction. It calculates a checksum and prints it. The correct value is f8d9. If you compile it with -O0 or -O3 -fno-strict-aliasing, you will get the correct value. If you compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and -O3, you will get f8e8. You can also try it on https://godbolt.org/ and see how different versions behave. My understanding is that the code violates the C standard (https://stackoverflow.com/a/99010). Thanks for the sample code below, I copied it to godbolt: https://godbolt.org/z/6fMK19 In gcc 10, the checksum calculation is done during compilation (when optimization is enabled) and the value is returned directly: mov   $0xffed,%esi Since a calculation is happening, I assume the compiler knows about the aliasing and is OK with it. But that optimized calculation seems wrong; when it is disabled [1] the checksum is correct again. [1] All of the following seem to help disable the compile-time calculation: - disabling optimization - putting a compiler barrier - putting a 'printf' inside 'ip_sum()' - -fno-strict-aliasing gcc 8 & 9 do not do this compile-time calculation, hence they are not affected. This feels like an optimization issue in gcc 10, but I am not sure exactly about the root cause, and how to disable it properly in our case. As checked with Harry, the latest finding is that gcc 10 leaves out any _non_ uint16_t type variable in the struct during its compile-time calculation. Not sure if it is because of broken aliasing or a gcc defect; I will report the issue.
Meanwhile, as a short-term solution, can you please try forcing 'ip_sum()' to be non-inline and try again? --- cut here ---

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct rte_ipv4_hdr {
	uint8_t version_ihl;
	uint8_t type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t time_to_live;
	uint8_t next_proto_id;
	uint16_t hdr_checksum;
	uint32_t src_addr;
	uint32_t dst_addr;
};

static inline uint16_t
ip_sum(const uint16_t *hdr, int hdr_len)
{
	uint32_t sum = 0;

	while (hdr_len > 1) {
		sum += *hdr++;
		if (sum & 0x80000000)
			sum = (sum & 0xFFFF) + (sum >> 16);
		hdr_len -= 2;
	}

	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);

	return ~sum;
}

static void
pkt_burst_flow_gen(void)
{
	struct rte_ipv4_hdr *ip_hdr = (struct rte_ipv4_hdr *) malloc(4096);
	memset(ip_hdr, 0, sizeof(*ip_hdr));
	ip_hdr->version_ihl = 1;
	ip_hdr->type_of_service = 2;
	ip_hdr->fragment_offset = 3;
	ip_hdr->time_to_live = 4;
	ip_hdr->next_proto_id = 5;
	ip_hdr->packet_id = 6;
	ip_hdr->src_addr = 7;
[dpdk-dev] [PATCH v3] mlx5: split multi-threaded flows per OS
multi-threaded flows feature uses pthread function pthread_key_create but for Windows the destruction option in the function is unimplemented. to resolve it Windows will implement destruction mechanism to cleanup mlx5_flow_workspace object for each terminated thread. Linux flow will keep the current behavior. Signed-off-by: Tal Shnaiderman Acked-by: Matan Azrad --- Depends-on: series-14562 ("support generic threading functions") v2: fix style issues v3: adjust to new API, remove nl from EOFs. --- --- drivers/net/mlx5/linux/meson.build | 1 + drivers/net/mlx5/linux/mlx5_flow_os.c | 38 +++ drivers/net/mlx5/mlx5.c | 8 ++ drivers/net/mlx5/mlx5_flow.c| 29 +- drivers/net/mlx5/mlx5_flow.h| 10 ++ drivers/net/mlx5/windows/mlx5_flow_os.c | 178 6 files changed, 240 insertions(+), 24 deletions(-) create mode 100644 drivers/net/mlx5/linux/mlx5_flow_os.c diff --git a/drivers/net/mlx5/linux/meson.build b/drivers/net/mlx5/linux/meson.build index 6c4402169e..8412edce78 100644 --- a/drivers/net/mlx5/linux/meson.build +++ b/drivers/net/mlx5/linux/meson.build @@ -9,5 +9,6 @@ sources += files( 'mlx5_verbs.c', 'mlx5_mp_os.c', 'mlx5_vlan_os.c', + 'mlx5_flow_os.c', ) diff --git a/drivers/net/mlx5/linux/mlx5_flow_os.c b/drivers/net/mlx5/linux/mlx5_flow_os.c new file mode 100644 index 00..732b1b2dd8 --- /dev/null +++ b/drivers/net/mlx5/linux/mlx5_flow_os.c @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2020 Mellanox Technologies, Ltd + */ + +#include "mlx5_flow_os.h" + +#include + +/* Key of thread specific flow workspace data. 
*/ +static rte_tls_key key_workspace; + +int +mlx5_flow_os_init_workspace_once(void) +{ + if (rte_thread_tls_key_create(&key_workspace, flow_release_workspace)) { + DRV_LOG(ERR, "Can't create flow workspace data thread key."); + return -ENOMEM; + } + return 0; +} + +void * +mlx5_flow_os_get_specific_workspace(void) +{ + return rte_thread_tls_value_get(key_workspace); +} + +int +mlx5_flow_os_set_specific_workspace(struct mlx5_flow_workspace *data) +{ + return rte_thread_tls_value_set(key_workspace, data); +} + +void +mlx5_flow_os_release_workspace(void) +{ + rte_thread_tls_key_delete(key_workspace); +} diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index 023ef50a77..e47ad6bd42 100644 --- a/drivers/net/mlx5/mlx5.c +++ b/drivers/net/mlx5/mlx5.c @@ -1004,6 +1004,11 @@ mlx5_alloc_shared_dev_ctx(const struct mlx5_dev_spawn_data *spawn, err = rte_errno; goto error; } + if (LIST_EMPTY(&mlx5_dev_ctx_list)) { + err = mlx5_flow_os_init_workspace_once(); + if (err) + goto error; + } mlx5_flow_aging_init(sh); mlx5_flow_counters_mng_init(sh); mlx5_flow_ipool_create(sh, config); @@ -1079,6 +1084,9 @@ mlx5_free_shared_dev_ctx(struct mlx5_dev_ctx_shared *sh) mlx5_mr_release_cache(&sh->share_cache); /* Remove context from the global device list. */ LIST_REMOVE(sh, next); + /* Release flow workspaces objects on the last device. */ + if (LIST_EMPTY(&mlx5_dev_ctx_list)) + mlx5_flow_os_release_workspace(); pthread_mutex_unlock(&mlx5_dev_ctx_list_mutex); /* * Ensure there is no async event handler installed. diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c index f110c6b714..a2a294eac2 100644 --- a/drivers/net/mlx5/mlx5_flow.c +++ b/drivers/net/mlx5/mlx5_flow.c @@ -696,11 +696,6 @@ static struct mlx5_flow_tunnel_info tunnels_info[] = { }, }; -/* Key of thread specific flow workspace data. */ -static pthread_key_t key_workspace; - -/* Thread specific flow workspace data once initialization data. 
*/ -static pthread_once_t key_workspace_init; /** @@ -5698,7 +5693,7 @@ mlx5_flow_start_default(struct rte_eth_dev *dev) /** * Release key of thread specific flow workspace data. */ -static void +void flow_release_workspace(void *data) { struct mlx5_flow_workspace *wks = data; @@ -5712,16 +5707,6 @@ flow_release_workspace(void *data) } } -/** - * Initialize key of thread specific flow workspace data. - */ -static void -flow_alloc_workspace(void) -{ - if (pthread_key_create(&key_workspace, flow_release_workspace)) - DRV_LOG(ERR, "Can't create flow workspace data thread key."); -} - /** * Get thread specific current flow workspace. * @@ -5732,7 +5717,7 @@ mlx5_flow_get_thread_workspace(void) { struct mlx5_flow_workspace *data; - data = pthread_getspecific(key_workspace); + data = mlx5_flow_os_get_specific_workspace(); MLX5_ASSERT(data && data->inuse); if (!data || !data->inuse) DRV_LOG(ERR, "flow workspace not initialized."); @@ -5780,11 +5765,7 @@
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
07/01/2021 13:47, Zhang, Qi Z: > > > -Original Message- > > From: Thomas Monjalon > > Sent: Thursday, January 7, 2021 6:12 PM > > To: Guo, Jia > > Cc: Zhang, Qi Z ; Wu, Jingjing > > ; > > Yang, Qiming ; Wang, Haiyue > > ; dev@dpdk.org; Yigit, Ferruh > > ; andrew.rybche...@oktetlabs.ru; or...@nvidia.com; > > getel...@nvidia.com > > Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for > > ecpri > > > > 07/01/2021 10:32, Guo, Jia: > > > From: Thomas Monjalon > > > > 24/12/2020 07:59, Jeff Guo: > > > > > Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel > > > > type. > > > > > > > > > > Signed-off-by: Jeff Guo > > > > > Reviewed-by: Qi Zhang > > > > [...] > > > > > --- a/lib/librte_ethdev/rte_ethdev.h > > > > > +++ b/lib/librte_ethdev/rte_ethdev.h > > > > > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type { > > > > > RTE_TUNNEL_TYPE_IP_IN_GRE, > > > > > RTE_L2_TUNNEL_TYPE_E_TAG, > > > > > RTE_TUNNEL_TYPE_VXLAN_GPE, > > > > > + RTE_TUNNEL_TYPE_ECPRI, > > > > > RTE_TUNNEL_TYPE_MAX, > > > > > }; > > > > > > > > We tried to remove all these legacy API in DPDK 20.11. > > > > Andrew decided to not remove this one because it is not yet > > > > completely replaced by rte_flow in all drivers. > > > > However, I am against continuing to update this API. > > > > The opposite work should be done: migrate to rte_flow. > > > > > > Agree but seems that the legacy api and driver legacy implementation > > > still keep in this release, and there is no a general way to replace > > > the legacy by rte_flow right now. > > > > I think rte_flow is a complete replacement with more features. > > Thomas, I may not agree with this. > > Actually the "enum rte_eth_tunnel_type" is used by > rte_eth_dev_udp_tunnel_port_add > A packet with specific dst udp port will be recognized as a specific tunnel > packet type (e.g. vxlan, vxlan-gpe, ecpri...) 
> In Intel NIC, the API actually changes the configuration of the packet parser > in HW but not add a filter rule and I guess all other devices may enable it > in a similar way. > so naturally it should be a device (port) level configuration but not a > rte_flow rule for match, encap, decap... I don't understand how it helps to identify an UDP port if there is no rule for this tunnel. What is the usage? > So I think it's not a good idea to replace > the rte_eth_dev_udp_tunnel_port_add with rte_flow config > and also there is no existing rte_flow_action > can cover this requirement unless we introduce some new one. > > > You can match, encap, decap. > > There is even a new API to get tunnel infos after decap. > > What is missing? I still don't see which use case is missing. > > > > Sorry, it is a nack. > > > > BTW, it is probably breaking the ABI because of RTE_TUNNEL_TYPE_MAX. > > Yes that may break the ABI but fortunately the checking-abi-compatibility > tool shows negative :) , thanks Ferruh' s guide. > https://github.com/ferruhy/dpdk/actions/runs/468859673 That's very strange. An enum value is changed. Why it is not flagged by libabigail? > > > Oh, the ABI break should be a problem. > > > > > > > PS: please Cc ethdev maintainers for such patch, thanks. > > > > tip: use --cc-cmd devtools/get-maintainer.sh > > > > > > Thanks for your helpful tip.
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
On Thu, Jan 7, 2021 at 2:33 PM Thomas Monjalon wrote: > > Yes that may break the ABI but fortunately the checking-abi-compatibility > > tool shows negative :) , thanks Ferruh' s guide. > > https://github.com/ferruhy/dpdk/actions/runs/468859673 > > That's very strange. An enum value is changed. > Why it is not flagged by libabigail? I suspect this is because the enum is not referenced in any object... all I see is an integer with a comment that it should be filled with values from the enum. -- David Marchand
[dpdk-dev] [PATCH] net/enic: use 64B completion queue entries if available
Latest VIC adapters support 64B CQ (completion queue) entries as well as 16B entries available on all VIC models. 64B entries can greatly reduce cache contention (CPU stall cycles) between DMA writes (Rx packet descriptors) and polling CPU. The effect is very noticeable on Intel platforms with DDIO. As most UCS servers are based on Intel platforms, enable and use 64B CQ entries by default, if available. Also, add devarg 'cq64' so the user can explicitly disable 64B CQ. Signed-off-by: Hyong Youb Kim Reviewed-by: John Daley --- doc/guides/nics/enic.rst | 25 doc/guides/rel_notes/release_21_02.rst | 3 ++ drivers/net/enic/base/cq_enet_desc.h | 13 drivers/net/enic/base/vnic_dev.c | 24 +++ drivers/net/enic/base/vnic_dev.h | 4 +++ drivers/net/enic/base/vnic_devcmd.h| 31 +++ drivers/net/enic/enic.h| 5 drivers/net/enic/enic_ethdev.c | 12 +++- drivers/net/enic/enic_main.c | 19 drivers/net/enic/enic_res.c| 13 drivers/net/enic/enic_rxtx.c | 41 ++ 11 files changed, 183 insertions(+), 7 deletions(-) diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst index 5d1cc9f7fa..4e7629c5cd 100644 --- a/doc/guides/nics/enic.rst +++ b/doc/guides/nics/enic.rst @@ -388,6 +388,31 @@ vectorized handler is selected, enable debug logging enic_use_vector_rx_handler use the non-scatter avx2 Rx handler +64B Completion Queue Entry +-- + +Recent VIC adapters support 64B completion queue entries, as well as +16B entries that are available on all adapter models. ENIC PMD enables +and uses 64B entries by default, if available. 64B entries generally +lower CPU cycles per Rx packet, as they avoid partial DMA writes and +reduce cache contention between DMA and polling CPU. The effect is +most pronounced when multiple Rx queues are used on Intel platforms +with Data Direct I/O Technology (DDIO). + +If 64B entries are not available, PMD uses 16B entries. The user may +explicitly disable 64B entries and use 16B entries by setting +``devarg`` parameter ``cq64=0``. 
For example:: + +-a 12:00.0,cq64=0 + +To verify the selected entry size, enable debug logging +(``--log-level=enic,debug``) and check the following messages. + +.. code-block:: console + +PMD: rte_enic_pmd: Supported CQ entry sizes: 16 32 +PMD: rte_enic_pmd: Using 16B CQ entry size + .. _enic_limitations: Limitations diff --git a/doc/guides/rel_notes/release_21_02.rst b/doc/guides/rel_notes/release_21_02.rst index 638f98168b..802a4c8ac0 100644 --- a/doc/guides/rel_notes/release_21_02.rst +++ b/doc/guides/rel_notes/release_21_02.rst @@ -55,6 +55,9 @@ New Features Also, make sure to start the actual text at the margin. === +* **Updated Cisco enic driver.** + + * Added support for 64B completion queue entries Removed Items - diff --git a/drivers/net/enic/base/cq_enet_desc.h b/drivers/net/enic/base/cq_enet_desc.h index 602ac22b65..a34a4f5400 100644 --- a/drivers/net/enic/base/cq_enet_desc.h +++ b/drivers/net/enic/base/cq_enet_desc.h @@ -58,6 +58,19 @@ struct cq_enet_rq_clsf_desc { uint8_t type_color; }; +/* Completion queue descriptor: Ethernet receive queue, 64B */ +struct cq_enet_rq_desc_64 { + uint16_t completed_index_flags; + uint16_t q_number_rss_type_flags; + uint32_t rss_hash; + uint16_t bytes_written_flags; + uint16_t vlan; + uint16_t checksum_fcoe; + uint8_t flags; + uint8_t unused[48]; + uint8_t type_color; +}; + #define CQ_ENET_RQ_DESC_FLAGS_INGRESS_PORT (0x1 << 12) #define CQ_ENET_RQ_DESC_FLAGS_FCOE (0x1 << 13) #define CQ_ENET_RQ_DESC_FLAGS_EOP (0x1 << 14) diff --git a/drivers/net/enic/base/vnic_dev.c b/drivers/net/enic/base/vnic_dev.c index aaca07ca67..526273cef5 100644 --- a/drivers/net/enic/base/vnic_dev.c +++ b/drivers/net/enic/base/vnic_dev.c @@ -1320,3 +1320,27 @@ int vnic_dev_capable_geneve(struct vnic_dev *vdev) ret = vnic_dev_cmd(vdev, CMD_GET_SUPP_FEATURE_VER, &a0, &a1, wait); return ret == 0 && (a1 & FEATURE_GENEVE_OPTIONS); } + +uint64_t vnic_dev_capable_cq_entry_size(struct vnic_dev *vdev) +{ + uint64_t a0 = CMD_CQ_ENTRY_SIZE_SET; + uint64_t a1 
= 0; + int wait = 1000; + int ret; + + ret = vnic_dev_cmd(vdev, CMD_CAPABILITY, &a0, &a1, wait); + /* All models support 16B CQ entry by default */ + if (!(ret == 0 && a0 == 0)) + a1 = VNIC_RQ_CQ_ENTRY_SIZE_16_CAPABLE; + return a1; +} + +int vnic_dev_set_cq_entry_size(struct vnic_dev *vdev, uint32_t rq_idx, + uint32_t size_flag) +{ + uint64_t a0 = rq_idx; + uint64_t a1 = size_flag; + int wait = 1000; + + return vnic_dev_cmd(vdev, CMD_CQ_E
Re: [dpdk-dev] [PATCH v1 1/4] net/virtio: remove unnecessary rmb barrier
On 12/21/20 3:23 PM, Joyce Kong wrote: > As desc_is_used has a load-acquire or rte_io_rmb inside > and wait for used desc in virtqueue, it is ok to remove > virtio_rmb behind it. > > Signed-off-by: Joyce Kong > Reviewed-by: Ruifeng Wang > --- > drivers/net/virtio/virtio_ethdev.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/virtio/virtio_ethdev.c > b/drivers/net/virtio/virtio_ethdev.c > index 6c233b75b..0d91f7a50 100644 > --- a/drivers/net/virtio/virtio_ethdev.c > +++ b/drivers/net/virtio/virtio_ethdev.c > @@ -209,12 +209,12 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, > virtio_wmb(vq->hw->weak_barriers); > virtqueue_notify(vq); > > - /* wait for used descriptors in virtqueue */ > + /* wait for used desc in virtqueue > + * desc_is_used has a load-acquire or rte_io_rmb inside > + */ > while (!desc_is_used(&desc[head], vq)) > usleep(100); > > - virtio_rmb(vq->hw->weak_barriers); > - > /* now get used descriptors */ > vq->vq_free_cnt += nb_descs; > vq->vq_used_cons_idx += nb_descs; > Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
> Tuesday, January 5, 2021 17:17, Thomas Monjalon > > RTE Flows API lacks the ability to save an arbitrary header field in > > order to use it later for advanced packet manipulations. Examples > > include the usage of VxLAN ID after the packet is decapsulated or > > storing this ID inside the packet payload itself or swapping an > > arbitrary inner and outer packet fields. > > > > The idea is to allow a copy of a specified number of bits form any > > packet header field into another header field: > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined below. > > > > struct rte_flow_action_copy_field { > > struct rte_flow_action_copy_data dest; > > struct rte_flow_action_copy_data src; > > uint16_t width; > > }; > > > > Arbitrary header field (as well as mark, metadata or tag values) can be > > used as both source and destination fields. This way we can save an > > arbitrary header field by copying its value to a tag/mark/metadata or > > copy it into another header field directly. tag/mark/metadata can also > > be used as a value to be stored in an arbitrary packet header field. > > > > struct rte_flow_action_copy_data { > > enum rte_flow_field_id field; > > uint16_t index; > > uint16_t offset; > > }; > > > > The rte_flow_field_id specifies the particular packet field (or > > tag/mark/metadata) to be used as a copy source or destination. > > The index gives access to inner packet headers or elements in the tags > > array. The offset allows to copy a packet field value into the payload. > > So index is in reality the layer? How is it numbered exactly? It is a layer for packet fields; inner headers get a higher index number. But it is also an index into the TAG array, which is where the name comes from. > What is the field id if an offset is given? Field ID stays the same; you can specify a small offset to copy just a few bits from the entire packet field, or a big offset to move to a completely different area. > Can we say that a field id can always be replaced by an offset?
Not really. You can use offset to jump around packet fields for sure, but it is going to be hard and cumbersome to calculate all the offsets for that. Field ID is much more convenient. > > It is proposed to implement the "set copy_field" command to store all > > You are talking about testpmd here? > It looks unrelated to this patch and not sure it helps understanding. I'm working on testpmd implementation and will send it shortly. > > the required parameters and then to use this template by specifying the > > index of the needed copy action. For example, to modify the GTP tunnel > > ID after the packet is encapsulated following testpmd rules are used: > > > > set copy_field width 32 src field tag index 1 offset 0 > > dst field teid index 0 offset 0 > > flow create 0 ingress pattern ... / end > > raw_decap index 1 / raw_encap index 2 / > > copy_field index 1 / end > > > > A more generic mechanism to overwrite an arbitrary header field may be > > introduced to the RTE flows implementation later: > > RTE_FLOW_ACTION_TYPE_SET_FIELD with the structure defined below. > > > > struct rte_flow_action_copy_field { > > struct rte_flow_action_copy_data dest; > > uint8_t *data; > > uint16_t width; > > }; > > > > This way we can have the generic way to specify an immediate value and > > use it as data for any packet header field instead of having separate > > RTE Flow action for each of the packet fields. Deprecation notice may > > be issued to RTE_FLOW_ACTION_TYPE_SET_XXX actions after the unified > > method of setting a value to any packet field is implemented. > > Yes having a single action for setting any field looks to be a good idea. Thanks. > > > Signed-off-by: Alexander Kozyrev > > --- > > lib/librte_ethdev/rte_flow.c | 1 + > > We are very close to the -rc1 date and implemention is missing. > Please update. > > > lib/librte_ethdev/rte_flow.h | 59 > > 2 files changed, 60 insertions(+) > > [...] 
> > +enum rte_flow_field_id { > > + RTE_FLOW_FIELD_NONE = 0, > > + RTE_FLOW_FIELD_MAC_DST, > > + RTE_FLOW_FIELD_MAC_SRC, > > + RTE_FLOW_FIELD_VLAN_TYPE, > > + RTE_FLOW_FIELD_VLAN_ID, > > + RTE_FLOW_FIELD_MAC_TYPE, > > + RTE_FLOW_FIELD_IPV4_DSCP, > > + RTE_FLOW_FIELD_IPV4_TTL, > > + RTE_FLOW_FIELD_IPV4_SRC, > > + RTE_FLOW_FIELD_IPV4_DST, > > + RTE_FLOW_FIELD_IPV6_HOPLIMIT, > > + RTE_FLOW_FIELD_IPV6_SRC, > > + RTE_FLOW_FIELD_IPV6_DST, > > + RTE_FLOW_FIELD_TCP_PORT_SRC, > > + RTE_FLOW_FIELD_TCP_PORT_DST, > > + RTE_FLOW_FIELD_TCP_SEQ_NUM, > > + RTE_FLOW_FIELD_TCP_ACK_NUM, > > + RTE_FLOW_FIELD_TCP_FLAGS, > > + RTE_FLOW_FIELD_UDP_PORT_SRC, > > + RTE_FLOW_FIELD_UDP_PORT_DST, > > + RTE_FLOW_FIELD_VXLAN_VNI, > > + RTE_FLOW_FIELD_GENEVE_VNI, > > + RTE_FLOW_FIELD_GTP_TEID, > > + RTE_FLOW_FIELD_TAG, > > + RTE_FLOW_FIELD_MARK, > > + RTE_FLOW_FIELD_META, > > +}; > > I don't really like having to l
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
David Marchand writes: > On Thu, Jan 7, 2021 at 2:33 PM Thomas Monjalon wrote: >> > Yes that may break the ABI but fortunately the checking-abi-compatibility >> > tool shows negative :) , thanks Ferruh' s guide. >> > https://github.com/ferruhy/dpdk/actions/runs/468859673 >> >> That's very strange. An enum value is changed. >> Why it is not flagged by libabigail? > > I suspect this is because the enum is not referenced in any object... > all I see is an integer with a comment that it should be filled with > values from the enum. I am not sure about the full context but David is right in theory. If the enum is not reachable from a publicly exported interface (function or global variable) then it won't be considered as being part of the ABI and changes to that enum won't be reported. I am not sure if that is what is happening in this particular case, though. Cheers, -- Dodji
Re: [dpdk-dev] [PATCH v1 2/4] net/virtio: replace smp barrier with IO barrier
On 12/21/20 3:23 PM, Joyce Kong wrote: > Replace rte_smp_wmb/rmb with rte_io_wmb/rmb as they are the same on x86 > and ppc platforms. Then, for function virtqueue_fetch_flags_packed/ > virtqueue_store_flags_packed, the if and else branch are still identical > for the platforms except Arm. > > Signed-off-by: Joyce Kong > Reviewed-by: Ruifeng Wang > --- > drivers/net/virtio/virtqueue.h | 27 +-- > 1 file changed, 13 insertions(+), 14 deletions(-) > Reviewed-by: Maxime Coquelin
Re: [dpdk-dev] [PATCH v9 2/2] eal: add generic thread-local-storage functions
On Wed, 6 Jan 2021 22:35:53 +0200, Tal Shnaiderman wrote: [...] > +int > +rte_thread_tls_key_create(rte_tls_key *key, void (*destructor)(void *)) > +{ > + int err; > + > + *key = malloc(sizeof(**key)); > + if ((*key) == NULL) { > + RTE_LOG(DEBUG, EAL, "Cannot allocate TLS key."); Missing "\n", same for Windows part. Aside from this nit, for series, Acked-by: Dmitry Kozlyuk
Re: [dpdk-dev] [PATCH v1 3/4] net/virtio: replace full barrier with relaxed barrier for Arm platform
On 12/21/20 3:23 PM, Joyce Kong wrote: > Relax the full write barriers to one-way barriers for virtio > control path for Arm platform > > Signed-off-by: Joyce Kong > Reviewed-by: Ruifeng Wang > --- > drivers/net/virtio/virtio_ethdev.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/virtio/virtio_ethdev.c > b/drivers/net/virtio/virtio_ethdev.c > index 0d91f7a50..b3e5cba70 100644 > --- a/drivers/net/virtio/virtio_ethdev.c > +++ b/drivers/net/virtio/virtio_ethdev.c > @@ -203,8 +203,8 @@ virtio_send_command_packed(struct virtnet_ctl *cvq, > vq->vq_packed.cached_flags ^= VRING_PACKED_DESC_F_AVAIL_USED; > } > > - virtio_wmb(vq->hw->weak_barriers); > - desc[head].flags = VRING_DESC_F_NEXT | flags; > + virtqueue_store_flags_packed(&desc[head], VRING_DESC_F_NEXT | flags, > + vq->hw->weak_barriers); > > virtio_wmb(vq->hw->weak_barriers); > virtqueue_notify(vq); > Performance does not matter in the case of ctrl queue, but it is cleaner to reuse existing helpers anyway: Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [PATCH 3/4] app/flow-perf: change clock measurement functions
26/11/2020 12:15, Wisam Jaddo: > The clock() function is not good practice to use for multiple > cores/threads, since it measures the CPU time used by the process > and not the wall clock time, while when running through multiple > cores/threads simultaneously, we can burn through CPU time much > faster. > > As a result this commit will change the way of measurement to use > rd_tsc, and the results will be divided by the processor frequency. > > Signed-off-by: Wisam Jaddo > Reviewed-by: Alexander Kozyrev > Reviewed-by: Suanming Mou > --- > - start_batch = clock(); > + start_batch = rte_rdtsc(); Please could you try the generic wrapper rte_get_timer_cycles? It should be the same (inline wrapper) when HPET is disabled. rdtsc refer to an x86 instruction so I prefer a more generic API. Can be a separate patch. While at it, I believe more apps could be converted.
Re: [dpdk-dev] [PATCH 0/4] app/flow-perf: add multi threaded support
26/11/2020 12:15, Wisam Jaddo: > After this series the application will start supporting testing > multiple threaded insertion and deletion rates. > > Also it will provide the latency & throughput rates of all threads. > > > Wisam Jaddo (4): > app/flow-perf: refactor flows handler > app/flow-perf: add multiple cores insertion and deletion > app/flow-perf: change clock measurement functions > app/flow-perf: remove redundant items memset and vars Pending for 6 weeks without any comment. Applied, thanks.
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
07/01/2021 15:17, Alexander Kozyrev: > > Tuesday, January 5, 2021 17:17, Thomas Monjalon > > > RTE Flows API lacks the ability to save an arbitrary header field in > > > order to use it later for advanced packet manipulations. Examples > > > include the usage of VxLAN ID after the packet is decapsulated or > > > storing this ID inside the packet payload itself or swapping an > > > arbitrary inner and outer packet fields. > > > > > > The idea is to allow a copy of a specified number of bits form any > > > packet header field into another header field: > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined below. > > > > > > struct rte_flow_action_copy_field { > > > struct rte_flow_action_copy_data dest; > > > struct rte_flow_action_copy_data src; > > > uint16_t width; > > > }; > > > > > > Arbitrary header field (as well as mark, metadata or tag values) can be > > > used as both source and destination fields. This way we can save an > > > arbitrary header field by copying its value to a tag/mark/metadata or > > > copy it into another header field directly. tag/mark/metadata can also > > > be used as a value to be stored in an arbitrary packet header field. > > > > > > struct rte_flow_action_copy_data { > > > enum rte_flow_field_id field; > > > uint16_t index; > > > uint16_t offset; > > > }; > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > tag/mark/metadata) to be used as a copy source or destination. > > > The index gives access to inner packet headers or elements in the tags > > > array. The offset allows to copy a packet field value into the payload. > > > > So index is in reality the layer? How is it numbered exactly? > > It is a layer for packet fields, inner headers get higher number index. > But is it also an index in the TAG array, so the name comes from it. Sorry it is not obvious. Please describe the exact numbering in tunnel and VLAN cases. > > What is the field id if an offset is given? 
> > Field ID stays the same, you can specify a small offset to copy just a few > bits > from the entire packet field or a big offset to move to completely different > area. I don't understand what is an offset then. Isn't it the byte or bit where the copy start? Do you handle sizes smaller than a byte? > > Can we say that a field id can always be replaced by an offset? > > Not really. You can use offset to jump around packet fields for sure, but it > is going to be > hard and cumbersome to calculate all the offsets for that. Field ID is much > more convenient. I think it depends for who. For some use cases, it may be easier to pass an offset. For some drivers, it may be more efficient to directly manage offsets.
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
> > > Thursday, January 7, 2021 10:07, Thomas Monjalon > > > > RTE Flows API lacks the ability to save an arbitrary header field in > > > > order to use it later for advanced packet manipulations. Examples > > > > include the usage of VxLAN ID after the packet is decapsulated or > > > > storing this ID inside the packet payload itself or swapping an > > > > arbitrary inner and outer packet fields. > > > > > > > > The idea is to allow a copy of a specified number of bits form any > > > > packet header field into another header field: > > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined below. > > > > > > > > struct rte_flow_action_copy_field { > > > > struct rte_flow_action_copy_data dest; > > > > struct rte_flow_action_copy_data src; > > > > uint16_t width; > > > > }; > > > > > > > > Arbitrary header field (as well as mark, metadata or tag values) can be > > > > used as both source and destination fields. This way we can save an > > > > arbitrary header field by copying its value to a tag/mark/metadata or > > > > copy it into another header field directly. tag/mark/metadata can also > > > > be used as a value to be stored in an arbitrary packet header field. > > > > > > > > struct rte_flow_action_copy_data { > > > > enum rte_flow_field_id field; > > > > uint16_t index; > > > > uint16_t offset; > > > > }; > > > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > > tag/mark/metadata) to be used as a copy source or destination. > > > > The index gives access to inner packet headers or elements in the tags > > > > array. The offset allows to copy a packet field value into the payload. > > > > > > So index is in reality the layer? How is it numbered exactly? > > > > It is a layer for packet fields, inner headers get higher number index. > > But is it also an index in the TAG array, so the name comes from it. > > Sorry it is not obvious. > Please describe the exact numbering in tunnel and VLAN cases. 
> > > > What is the field id if an offset is given? > > > > Field ID stays the same, you can specify a small offset to copy just a few > > bits > > from the entire packet field or a big offset to move to completely different > area. > > I don't understand what is an offset then. > Isn't it the byte or bit where the copy start? > Do you handle sizes smaller than a byte? It is the bit offset, you can copy 20 bits out of 32 bits of IPv4 address for example. > > > Can we say that a field id can always be replaced by an offset? > > > > Not really. You can use offset to jump around packet fields for sure, but > > it is > going to be > > hard and cumbersome to calculate all the offsets for that. Field ID is much > more convenient. > > I think it depends for who. > For some use cases, it may be easier to pass an offset. > For some drivers, it may be more efficient to directly manage offsets. It is possible with this RFC, driver can choose what to use: id and/or offset.
Re: [dpdk-dev] [PATCH 1/2] regexdev: add resource limit reached rsp flag
Hi I would very much like a review for this patch. Thank you, Ori > -Original Message- > From: Ori Kam > Sent: Thursday, December 17, 2020 12:38 PM > Subject: [PATCH 1/2] regexdev: add resource limit reached rsp flag > > When scanning a buffer it is possible that the scan will abort > due to some internal resource limit. > > This commit adds such response flag, so application can handle such cases. > > Signed-off-by: Francis Kelly > Signed-off-by: Ori Kam > --- > lib/librte_regexdev/rte_regexdev.h | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/lib/librte_regexdev/rte_regexdev.h > b/lib/librte_regexdev/rte_regexdev.h > index 0001658925..86f0b231b0 100644 > --- a/lib/librte_regexdev/rte_regexdev.h > +++ b/lib/librte_regexdev/rte_regexdev.h > @@ -1333,6 +1333,11 @@ struct rte_regexdev_match { > * @see RTE_REGEXDEV_ATTR_MAX_PREFIX > */ > > +#define RTE_REGEX_OPS_RSP_RESOURCE_LIMIT_REACHED_F (1 << 4) > +/**< Indicates that the RegEx device has reached the max allowed resource > + * allowed while scanning the given buffer. > + */ > + > /** > * The generic *rte_regex_ops* structure to hold the RegEx attributes > * for enqueue and dequeue operation. > -- > 2.25.1
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
07/01/2021 16:10, Alexander Kozyrev: > > > > Thursday, January 7, 2021 10:07, Thomas Monjalon > > > > > RTE Flows API lacks the ability to save an arbitrary header field in > > > > > order to use it later for advanced packet manipulations. Examples > > > > > include the usage of VxLAN ID after the packet is decapsulated or > > > > > storing this ID inside the packet payload itself or swapping an > > > > > arbitrary inner and outer packet fields. > > > > > > > > > > The idea is to allow a copy of a specified number of bits form any > > > > > packet header field into another header field: > > > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined below. > > > > > > > > > > struct rte_flow_action_copy_field { > > > > > struct rte_flow_action_copy_data dest; > > > > > struct rte_flow_action_copy_data src; > > > > > uint16_t width; > > > > > }; > > > > > > > > > > Arbitrary header field (as well as mark, metadata or tag values) can > > > > > be > > > > > used as both source and destination fields. This way we can save an > > > > > arbitrary header field by copying its value to a tag/mark/metadata or > > > > > copy it into another header field directly. tag/mark/metadata can also > > > > > be used as a value to be stored in an arbitrary packet header field. > > > > > > > > > > struct rte_flow_action_copy_data { > > > > > enum rte_flow_field_id field; > > > > > uint16_t index; > > > > > uint16_t offset; > > > > > }; > > > > > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > > > tag/mark/metadata) to be used as a copy source or destination. > > > > > The index gives access to inner packet headers or elements in the tags > > > > > array. The offset allows to copy a packet field value into the > > > > > payload. > > > > > > > > So index is in reality the layer? How is it numbered exactly? > > > > > > It is a layer for packet fields, inner headers get higher number index. 
> > > But is it also an index in the TAG array, so the name comes from it. > > > > Sorry it is not obvious. > > Please describe the exact numbering in tunnel and VLAN cases. > > > > > > What is the field id if an offset is given? > > > > > > Field ID stays the same, you can specify a small offset to copy just a > > > few bits > > > from the entire packet field or a big offset to move to completely > > > different > > area. > > > > I don't understand what is an offset then. > > Isn't it the byte or bit where the copy start? > > Do you handle sizes smaller than a byte? > > It is the bit offset, you can copy 20 bits out of 32 bits of IPv4 address for > example. Now I'm confused. You mean rte_flow_action_copy_data.offset is a bit offset? > > > > Can we say that a field id can always be replaced by an offset? > > > > > > Not really. You can use offset to jump around packet fields for sure, but > > > it is > > going to be > > > hard and cumbersome to calculate all the offsets for that. Field ID is > > > much > > more convenient. > > > > I think it depends for who. > > For some use cases, it may be easier to pass an offset. > > For some drivers, it may be more efficient to directly manage offsets. > > It is possible with this RFC, driver can choose what to use: id and/or offset. We can set field and index to 0, and use only offset? Then it is a byte offset from the beginning mbuf.data?
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
> 07/01/2021 16:10, Alexander Kozyrev: > > > > > Thursday, January 7, 2021 10:18, Thomas Monjalon > > > > > > > RTE Flows API lacks the ability to save an arbitrary header field in > > > > > > order to use it later for advanced packet manipulations. Examples > > > > > > include the usage of VxLAN ID after the packet is decapsulated or > > > > > > storing this ID inside the packet payload itself or swapping an > > > > > > arbitrary inner and outer packet fields. > > > > > > > > > > > > The idea is to allow a copy of a specified number of bits form any > > > > > > packet header field into another header field: > > > > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined > below. > > > > > > > > > > > > struct rte_flow_action_copy_field { > > > > > > struct rte_flow_action_copy_data dest; > > > > > > struct rte_flow_action_copy_data src; > > > > > > uint16_t width; > > > > > > }; > > > > > > > > > > > > Arbitrary header field (as well as mark, metadata or tag values) > > > > > > can be > > > > > > used as both source and destination fields. This way we can save an > > > > > > arbitrary header field by copying its value to a tag/mark/metadata > > > > > > or > > > > > > copy it into another header field directly. tag/mark/metadata can > > > > > > also > > > > > > be used as a value to be stored in an arbitrary packet header field. > > > > > > > > > > > > struct rte_flow_action_copy_data { > > > > > > enum rte_flow_field_id field; > > > > > > uint16_t index; > > > > > > uint16_t offset; > > > > > > }; > > > > > > > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > > > > tag/mark/metadata) to be used as a copy source or destination. > > > > > > The index gives access to inner packet headers or elements in the > > > > > > tags > > > > > > array. The offset allows to copy a packet field value into the > > > > > > payload. > > > > > > > > > > So index is in reality the layer? How is it numbered exactly? 
> > > > > > > > It is a layer for packet fields, inner headers get higher number index. > > > > But is it also an index in the TAG array, so the name comes from it. > > > > > > Sorry it is not obvious. > > > Please describe the exact numbering in tunnel and VLAN cases. > > > > > > > > What is the field id if an offset is given? > > > > > > > > Field ID stays the same, you can specify a small offset to copy just a > > > > few bits > > > > from the entire packet field or a big offset to move to completely > > > > different > > > area. > > > > > > I don't understand what is an offset then. > > > Isn't it the byte or bit where the copy start? > > > Do you handle sizes smaller than a byte? > > > > It is the bit offset, you can copy 20 bits out of 32 bits of IPv4 address > > for > example. > > Now I'm confused. > You mean rte_flow_action_copy_data.offset is a bit offset? rte_flow_action_copy_data.offset and rte_flow_action_copy_field.width are measured in bits, right. > > > > > Can we say that a field id can always be replaced by an offset? > > > > > > > > Not really. You can use offset to jump around packet fields for sure, > > > > but it > is > > > going to be > > > > hard and cumbersome to calculate all the offsets for that. Field ID is > > > > much > > > more convenient. > > > > > > I think it depends for who. > > > For some use cases, it may be easier to pass an offset. > > > For some drivers, it may be more efficient to directly manage offsets. > > > > It is possible with this RFC, driver can choose what to use: id and/or > > offset. > > We can set field and index to 0, and use only offset? Yes, I'm not intending to put any restrictions against that. > Then it is a byte offset from the beginning mbuf.data? Yes, but it is still bit offset, not byte offset.
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On 1/7/2021 2:20 PM, George Prekas wrote: On 1/7/2021 5:32 AM, Ferruh Yigit wrote: On 1/7/2021 5:39 AM, George Prekas wrote: On 1/6/2021 12:02 PM, Ferruh Yigit wrote: On 12/5/2020 5:42 AM, George Prekas wrote: Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c and the calculated IP checksum is wrong on GCC 9 and GCC 10. Signed-off-by: George Prekas --- v2: * Instead of a compiler barrier, use a compiler flag. ---   app/test-pmd/meson.build | 1 +   1 file changed, 1 insertion(+) diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build index 7e9c7bdd6..5d24e807f 100644 --- a/app/test-pmd/meson.build +++ b/app/test-pmd/meson.build @@ -4,6 +4,7 @@   # override default name to drop the hyphen   name = 'testpmd'   cflags += '-Wno-deprecated-declarations' +cflags += '-fno-strict-aliasing'   sources = files('5tswap.c',   'cmdline.c',   'cmdline_flow.c', Hi George, I am trying to understand this, the relevant code is as below: ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, sizeof(*ip_hdr)); You are suspicious of strict aliasing rule violation, with more details: The concern is the "struct rte_ipv4_hdr *ip_hdr;" aliased to "const unaligned_uint16_t *hdr", and compiler can optimize out the calculations using data pointed by 'hdr' pointer, since the 'hdr' pointer is not used to alter the data and compiler may think data is not changed at all. 1) But the pointer "hdr" is assigned in the loop, from another pointer whose content is changing, why this is not helping to figure out that the data 'hdr' pointing is changed. 2) I tried to debug this, but I am not able to reproduce the issue, 'ip_sum()' called each time and checksum calculated correctly. Using gcc 10.2.1-9. Can you able to confirm the case with debug, or from the assembly/object file? 
And if the issue is strict aliasing rule violation as you said, compiler flag is an option but not sure how much it reduces the compiler optimization benefit, I guess other options also not so good, memcpy brings too much work on runtime and union requires bigger change and makes code complex. I wonder if making 'ip_sum()' a non inline function can help, can you please give a try since you can reproduce it? Hi Ferruh, Thanks for looking into it. I am copy-pasting at the end of this email a minimal reproduction. It calculates a checksum and prints it. The correct value is f8d9. If you compile it with -O0 or -O3 -fno-strict-aliasing, you will get the correct value. If you compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and -O3, you will get f8e8. You can also try it on https://godbolt.org/ and see how different versions behave. My understanding is that the code violates the C standard (https://stackoverflow.com/a/99010). Thanks for the sample code below, I copied to the godbolt: https://godbolt.org/z/6fMK19 In gcc 10, the checksum calculation is done during compilation (when optimization is enabled) and the value is returned directly: mov   $0xffed,%esi Since a calculation is happening I assume the compiler knows about the aliasing and OK with it. According to https://gcc.gnu.org/bugs/: "if compiling with -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations makes a difference ... then your code is probably not correct" Yep, I saw it while submitting the gcc ticket, and it seems it was right: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98582 But that optimized calculation seems wrong, when it is disabled [1] the checksum is correct again. [1] all following seems helping to disable compile time calculation - disabling optimization - putting a compiler barrier - putting a 'printf' inside 'ip_sum()' - fno-strict-aliasing gcc 8 & 9 is not doing this compile time calculation, hence they are not affected. 
I just checked gcc 8.3 and gcc 9.3 on godbolt and I got f8e8 (which is wrong; the correct is f8d9). True, I missed that they generate wrong value. This feels like an optimization issue in gcc10, but not sure exactly on the root cause, and how to disable it properly in our case. I've tried with __attribute__ ((noinline)) and it fixes the problem. But keep in mind that we are dealing with broken C code. This attribute just prevents the optimization that reveals the problem. It does not guarantee that the problem will not reappear in a future compiler version. I've also tried to use a union as suggested by Stephen Hemminger and it works correctly but it requires significant code changes: you have to copy paste the IP header structure inside a union and access it only through the union. As a side note, here is a piece of opinion from Linus Torvalds regarding strict aliasing: https://lkml.org/lkml/2018/6/5/769 DPDK already uses -fno-strict-aliasing for librte_node and librte_vhost. In the above ticket, 'may_alias' attribute is also suggested, which is working for the sample, can you please try with it too? It may be better to allow non compa
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
> -Original Message- > From: Thomas Monjalon > Sent: Thursday, January 7, 2021 9:34 PM > To: Guo, Jia ; Zhang, Qi Z > Cc: Wu, Jingjing ; Yang, Qiming > ; Wang, Haiyue ; > dev@dpdk.org; Yigit, Ferruh ; > andrew.rybche...@oktetlabs.ru; or...@nvidia.com; getel...@nvidia.com; > Dodji Seketeli > Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for > ecpri > > 07/01/2021 13:47, Zhang, Qi Z: > > > > > -Original Message- > > > From: Thomas Monjalon > > > Sent: Thursday, January 7, 2021 6:12 PM > > > To: Guo, Jia > > > Cc: Zhang, Qi Z ; Wu, Jingjing > > > ; Yang, Qiming ; Wang, > > > Haiyue ; dev@dpdk.org; Yigit, Ferruh > > > ; andrew.rybche...@oktetlabs.ru; > > > or...@nvidia.com; getel...@nvidia.com > > > Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel > > > type for ecpri > > > > > > 07/01/2021 10:32, Guo, Jia: > > > > From: Thomas Monjalon > > > > > 24/12/2020 07:59, Jeff Guo: > > > > > > Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev > > > > > > tunnel > > > > > type. > > > > > > > > > > > > Signed-off-by: Jeff Guo > > > > > > Reviewed-by: Qi Zhang > > > > > [...] > > > > > > --- a/lib/librte_ethdev/rte_ethdev.h > > > > > > +++ b/lib/librte_ethdev/rte_ethdev.h > > > > > > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type { > > > > > > RTE_TUNNEL_TYPE_IP_IN_GRE, > > > > > > RTE_L2_TUNNEL_TYPE_E_TAG, > > > > > > RTE_TUNNEL_TYPE_VXLAN_GPE, > > > > > > + RTE_TUNNEL_TYPE_ECPRI, > > > > > > RTE_TUNNEL_TYPE_MAX, > > > > > > }; > > > > > > > > > > We tried to remove all these legacy API in DPDK 20.11. > > > > > Andrew decided to not remove this one because it is not yet > > > > > completely replaced by rte_flow in all drivers. > > > > > However, I am against continuing to update this API. > > > > > The opposite work should be done: migrate to rte_flow. 
> > > > > > > > Agree but seems that the legacy api and driver legacy > > > > implementation still keep in this release, and there is no a > > > > general way to replace the legacy by rte_flow right now. > > > > > > I think rte_flow is a complete replacement with more features. > > > > Thomas, I may not agree with this. > > > > Actually the "enum rte_eth_tunnel_type" is used by > > rte_eth_dev_udp_tunnel_port_add A packet with specific dst udp port > > will be recognized as a specific tunnel packet type (e.g. vxlan, vxlan-gpe, > ecpri...) In Intel NIC, the API actually changes the configuration of the > packet > parser in HW but not add a filter rule and I guess all other devices may > enable it > in a similar way. > > so naturally it should be a device (port) level configuration but not a > > rte_flow > rule for match, encap, decap... > > I don't understand how it helps to identify an UDP port if there is no rule > for > this tunnel. > What is the usage? Yes, in general It is a rule, it matches a udp packet's dst port and the action is "now the packet is identified as vxlan packet" then all other rte_flow rules that match for a vlxan as pattern will take effect. but somehow, I think they are not rules in the same domain, just like we have dedicate API for mac/vlan filter, we'd better have a dedicate API for this also. ( RFC for Vxlan explains why we need this. https://tools.ietf.org/html/rfc7348). "Destination Port: IANA has assigned the value 4789 for the VXLAN UDP port, and this value SHOULD be used by default as the destination UDP port. Some early implementations of VXLAN have used other values for the destination port. To enable interoperability with these implementations, the destination port SHOULD be configurable." Thanks Qi > > > So I think it's not a good idea to replace the > > rte_eth_dev_udp_tunnel_port_add with rte_flow config and also there is > > no existing rte_flow_action can cover this requirement unless we > > introduce some new one. 
> > > > > You can match, encap, decap. > > > There is even a new API to get tunnel infos after decap. > > > What is missing? > > I still don't see which use case is missing. > > > > > > > Sorry, it is a nack. > > > > > BTW, it is probably breaking the ABI because of > RTE_TUNNEL_TYPE_MAX. > > > > Yes that may break the ABI but fortunately the checking-abi-compatibility > > tool > shows negative :) , thanks Ferruh' s guide. > > https://github.com/ferruhy/dpdk/actions/runs/468859673 > > That's very strange. An enum value is changed. > Why it is not flagged by libabigail? > > > > > > Oh, the ABI break should be a problem. > > > > > > > > > PS: please Cc ethdev maintainers for such patch, thanks. > > > > > tip: use --cc-cmd devtools/get-maintainer.sh > > > > > > > > Thanks for your helpful tip. > >
Re: [dpdk-dev] [PATCH] net/bnxt: limit per-poll Rx representor pkts
On 1/6/2021 8:54 PM, Lance Richardson wrote: On Wed, Jan 6, 2021 at 4:27 AM Ferruh Yigit wrote: On 12/14/2020 6:53 PM, Lance Richardson wrote: Limit number of representor packets transferred per poll to requested burst size. Hi Lance, Can you please describe the impact of the change? Since it has a fixes line, it seems it is fixing something but it is not clear what is fixed. Hi Ferruh, How does this look: Without some limit on the number of packets transferred from the hw ring to the representor ring per burst receive call, an entire ring's worth of packets can be transferred. This can break assumptions about ring indices (index on return could be identical to the index on entry, which is assumed to mean that no packets were processed), and can result in representor packets being dropped unnecessarily due to representor ring overflow. Fix by limiting the number of representor packets transferred per poll to requested burst size. Thank you, updated the commit log in next-net.
Re: [dpdk-dev] [v2 PATCH] usertools: show an error message if unable to reserve requested hugepages
On Thu, 7 Jan 2021 13:06:35 +0500 Sarosh Arif wrote: > On Thu, Dec 17, 2020 at 11:19 PM Stephen Hemminger > wrote: > > > > On Thu, 17 Dec 2020 16:16:16 +0500 > > Sarosh Arif wrote: > > > > > +if get_hugepages(path) != pages: > > > +print("Unable to reserve required pages. The pages reserved > > > are:") > > > +global SHOW_HUGEPAGES > > > +SHOW_HUGEPAGES = True > > > > > > Please don't add global's to this script. > > > > The script is close to being clean according to pylint, and globals > > are considered bad style and shouldn't be used. > > > > I would just exit if huge pages could not be setup. > > How about if we just print a warning message such as "Unable to > reserve required pages" before exiting, in case the pages are not > reserved due to lack of space in RAM? Then leave it upon the user to > query how many pages are actually reserved. > > > > The script should leave it up to the user to do another query about > > status if they care about what the result is. Just call sys.exit with a message; that is all that is needed. Or maybe trap other write errors to sysfs here. Probably the kernel has already tried to report the error, but the try/except code is not seeing it.
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On Wed, 6 Jan 2021 23:39:39 -0600 George Prekas wrote: > On 1/6/2021 12:02 PM, Ferruh Yigit wrote: > > On 12/5/2020 5:42 AM, George Prekas wrote: > >> Strict-aliasing rules are violated by cast to uint16_t* in flowgen.c > >> and the calculated IP checksum is wrong on GCC 9 and GCC 10. > >> > >> Signed-off-by: George Prekas > >> --- > >> v2: > >> * Instead of a compiler barrier, use a compiler flag. > >> --- > >> app/test-pmd/meson.build | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build > >> index 7e9c7bdd6..5d24e807f 100644 > >> --- a/app/test-pmd/meson.build > >> +++ b/app/test-pmd/meson.build > >> @@ -4,6 +4,7 @@ > >> # override default name to drop the hyphen > >> name = 'testpmd' > >> cflags += '-Wno-deprecated-declarations' > >> +cflags += '-fno-strict-aliasing' > >> sources = files('5tswap.c', > >> 'cmdline.c', > >> 'cmdline_flow.c', > >> > > > > Hi George, > > > > I am trying to understand this, the relevant code is as below: > > ip_hdr->hdr_checksum = ip_sum((unaligned_uint16_t *)ip_hdr, > > sizeof(*ip_hdr)); > > > > You are suspicious of strict aliasing rule violation, with more details: > > The concern is the "struct rte_ipv4_hdr *ip_hdr;" aliased to "const > > unaligned_uint16_t *hdr", and compiler can optimize out the calculations > > using > > data pointed by 'hdr' pointer, since the 'hdr' pointer is not used to alter > > the > > data and compiler may think data is not changed at all. > > > > 1) But the pointer "hdr" is assigned in the loop, from another pointer whose > > content is changing, why this is not helping to figure out that the data > > 'hdr' > > pointing is changed. > > > > 2) I tried to debug this, but I am not able to reproduce the issue, > > 'ip_sum()' > > called each time and checksum calculated correctly. Using gcc 10.2.1-9. Can > > you > > able to confirm the case with debug, or from the assembly/object file?
> > > > > > And if the issue is strict aliasing rule violation as you said, compiler > > flag is > > an option but not sure how much it reduces the compiler optimization > > benefit, I > > guess other options also not so good, memcpy brings too much work on > > runtime and > > union requires bigger change and makes code complex. > > I wonder if making 'ip_sum()' a non inline function can help, can you please > > give a try since you can reproduce it? > > Hi Ferruh, > > Thanks for looking into it. > > I am copy-pasting at the end of this email a minimal reproduction. It > calculates a checksum and prints it. The correct value is f8d9. If you > compile it with -O0 or -O3 -fno-strict-aliasing, you will get the correct > value. If you compile it with gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 and > -O3, you will get f8e8. You can also try it on https://godbolt.org/ and see > how different versions behave. > > My understanding is that the code violates the C standard > (https://stackoverflow.com/a/99010). 
>
> --- cut here ---
>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> struct rte_ipv4_hdr {
> 	uint8_t  version_ihl;
> 	uint8_t  type_of_service;
> 	uint16_t total_length;
> 	uint16_t packet_id;
> 	uint16_t fragment_offset;
> 	uint8_t  time_to_live;
> 	uint8_t  next_proto_id;
> 	uint16_t hdr_checksum;
> 	uint32_t src_addr;
> 	uint32_t dst_addr;
> };
>
> static inline uint16_t ip_sum(const uint16_t *hdr, int hdr_len)
> {
> 	uint32_t sum = 0;
>
> 	while (hdr_len > 1)
> 	{
> 		sum += *hdr++;
> 		if (sum & 0x80000000)
> 			sum = (sum & 0xFFFF) + (sum >> 16);
> 		hdr_len -= 2;
> 	}
>
> 	while (sum >> 16)
> 		sum = (sum & 0xFFFF) + (sum >> 16);
>
> 	return ~sum;
> }
>
> static void pkt_burst_flow_gen(void)
> {
> 	struct rte_ipv4_hdr *ip_hdr = (struct rte_ipv4_hdr *) malloc(4096);
> 	memset(ip_hdr, 0, sizeof(*ip_hdr));
> 	ip_hdr->version_ihl = 1;
> 	ip_hdr->type_of_service = 2;
> 	ip_hdr->fragment_offset = 3;
> 	ip_hdr->time_to_live = 4;
> 	ip_hdr->next_proto_id = 5;
> 	ip_hdr->packet_id = 6;
> 	ip_hdr->src_addr = 7;
> 	ip_hdr->dst_addr = 8;
> 	ip_hdr->total_length = 9;
> 	ip_hdr->hdr_checksum = ip_sum((uint16_t *)ip_hdr, sizeof(*ip_hdr));
> 	printf("%x\n", ip_hdr->hdr_checksum);
> }
>
> int main(void)
> {
> 	pkt_burst_flow_gen();
> 	return 0;
> }

If I change your code like this to use union, Gcc 10 is still broken.
It may be because the optimizer is not smart enough to know that memset
has cleared the header.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct rte_ipv4_hdr {
	uint8_t  version_ihl;
	uint8_t  type_of_service;
	uint16_t total_length;
	uint16_t packet_id;
	uint16_t fragment_offset;
	uint8_t  time_to_live;
	uint8_t
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On 1/7/2021 3:50 PM, Stephen Hemminger wrote:
> On Wed, 6 Jan 2021 23:39:39 -0600 George Prekas wrote:
>> [earlier discussion and minimal reproduction snipped; see the message above]
>
> If I change your code like this to use union, Gcc 10 is still broken.

This worked fine for me: https://godbolt.org/z/vdsxh9

> It is a compiler bug. It may be because the optimizer is not smart enough
> to know that memset has cleared the header.
>
> [union version of the reproduction snipped]
Re: [dpdk-dev] [PATCH v1 4/4] net/virtio: replace full barrier with thread fence
On 12/21/20 3:23 PM, Joyce Kong wrote:
> Replace the smp barriers with atomic thread fence for synchronization
> between different threads, if there are no load/store operations.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  drivers/net/virtio/virtqueue.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH v2] app/testpmd: fix IP checksum calculation
On Thu, 7 Jan 2021 15:59:59 +0000, Ferruh Yigit wrote:
>>> int main(void)
>>> {
>>> 	pkt_burst_flow_gen();
>>> 	return 0;
>>> }
>>
>> If I change your code like this to use union, Gcc 10 is still broken.
>
> This worked fine for me: https://godbolt.org/z/vdsxh9

I was looking at the wrong result. The union version gives the same answer
with and without optimization.
Re: [dpdk-dev] [PATCH v1 1/8] examples/vhost: relax memory ordering when enqueue/dequeue
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Use C11 atomic APIs with one-way barriers to replace two-way
> barriers when operating enqueue/dequeue. Used->idx and avail->idx
> are the synchronization points for split vring.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  examples/vhost/virtio_net.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)

Nice!

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH v1 2/8] examples/vhost_blk: replace smp with thread fence
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Simply replace the rte_smp_mb barriers with SEQ_CST atomic thread fence,
> if there is no load/store operations.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  examples/vhost_blk/vhost_blk.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/examples/vhost_blk/vhost_blk.c b/examples/vhost_blk/vhost_blk.c
> index bb293d492..7ea60863d 100644
> --- a/examples/vhost_blk/vhost_blk.c
> +++ b/examples/vhost_blk/vhost_blk.c
> @@ -86,9 +86,9 @@ enqueue_task(struct vhost_blk_task *task)
>  	 */
>  	used->ring[used->idx & (vq->vring.size - 1)].id = task->req_idx;
>  	used->ring[used->idx & (vq->vring.size - 1)].len = task->data_len;

From here

> -	rte_smp_mb();
> +	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
>  	used->idx++;
> -	rte_smp_mb();
> +	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);

to here, couldn't it be replaced with:

__atomic_add_fetch(&used->idx, 1, __ATOMIC_RELEASE);

?

>  	rte_vhost_clr_inflight_desc_split(task->ctrlr->vid,
>  			vq->id, used->idx, task->req_idx);
> @@ -112,12 +112,12 @@ enqueue_task_packed(struct vhost_blk_task *task)
>  	desc->id = task->buffer_id;
>  	desc->addr = 0;
>
> -	rte_smp_mb();
> +	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
>  	if (vq->used_wrap_counter)
>  		desc->flags |= VIRTQ_DESC_F_AVAIL | VIRTQ_DESC_F_USED;
>  	else
>  		desc->flags &= ~(VIRTQ_DESC_F_AVAIL | VIRTQ_DESC_F_USED);
> -	rte_smp_mb();
> +	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
>
>  	rte_vhost_clr_inflight_desc_packed(task->ctrlr->vid, vq->id,
> 			task->inflight_idx);
Re: [dpdk-dev] [PATCH v1 3/8] vhost: remove unnecessary smp barrier for desc flags
On 12/21/20 4:50 PM, Joyce Kong wrote:
> As function desc_is_avail performs a load-acquire barrier to
> enforce the ordering between desc flags and desc content, it is
> unnecessary to add a rte_smp_rmb barrier around the trace which
> follows desc_is_avail.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/virtio_net.c | 3 ---
>  1 file changed, 3 deletions(-)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH v1 4/8] vhost: remove unnecessary smp barrier for avail idx
On 12/21/20 4:50 PM, Joyce Kong wrote:
> The ordering between avail index and desc reads has been enforced
> by load-acquire for split vring, so smp_rmb barrier is not needed
> behind it.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/virtio_net.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH v1 5/8] vhost: relax full barriers for desc flags
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Relax the full read barrier to one-way barrier for desc flags in
> packed vring.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/virtio_net.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH v1 6/8] vhost: relax full barriers for used idx
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Used idx can be synchronized by one-way barrier instead of full
> write barrier for split vring.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/vdpa.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
07/01/2021 16:22, Alexander Kozyrev: > > 07/01/2021 16:10, Alexander Kozyrev: > > > > > > Thursday, January 7, 2021 10:18, Thomas Monjalon > > > > > > > > > RTE Flows API lacks the ability to save an arbitrary header field > > > > > > > in > > > > > > > order to use it later for advanced packet manipulations. Examples > > > > > > > include the usage of VxLAN ID after the packet is decapsulated or > > > > > > > storing this ID inside the packet payload itself or swapping an > > > > > > > arbitrary inner and outer packet fields. > > > > > > > > > > > > > > The idea is to allow a copy of a specified number of bits form any > > > > > > > packet header field into another header field: > > > > > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined > > below. > > > > > > > > > > > > > > struct rte_flow_action_copy_field { > > > > > > > struct rte_flow_action_copy_data dest; > > > > > > > struct rte_flow_action_copy_data src; > > > > > > > uint16_t width; > > > > > > > }; > > > > > > > > > > > > > > Arbitrary header field (as well as mark, metadata or tag values) > > > > > > > can be > > > > > > > used as both source and destination fields. This way we can save > > > > > > > an > > > > > > > arbitrary header field by copying its value to a > > > > > > > tag/mark/metadata or > > > > > > > copy it into another header field directly. tag/mark/metadata can > > > > > > > also > > > > > > > be used as a value to be stored in an arbitrary packet header > > > > > > > field. > > > > > > > > > > > > > > struct rte_flow_action_copy_data { > > > > > > > enum rte_flow_field_id field; > > > > > > > uint16_t index; > > > > > > > uint16_t offset; > > > > > > > }; > > > > > > > > > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > > > > > tag/mark/metadata) to be used as a copy source or destination. > > > > > > > The index gives access to inner packet headers or elements in the > > > > > > > tags > > > > > > > array. 
The offset allows to copy a packet field value into the > > > > > > > payload. > > > > > > > > > > > > So index is in reality the layer? How is it numbered exactly? > > > > > > > > > > It is a layer for packet fields, inner headers get higher number > > > > > index. > > > > > But is it also an index in the TAG array, so the name comes from it. > > > > > > > > Sorry it is not obvious. > > > > Please describe the exact numbering in tunnel and VLAN cases. > > > > > > > > > > What is the field id if an offset is given? > > > > > > > > > > Field ID stays the same, you can specify a small offset to copy just > > > > > a few > > bits > > > > > from the entire packet field or a big offset to move to completely > > > > > different > > > > area. > > > > > > > > I don't understand what is an offset then. > > > > Isn't it the byte or bit where the copy start? > > > > Do you handle sizes smaller than a byte? > > > > > > It is the bit offset, you can copy 20 bits out of 32 bits of IPv4 address > > > for > > example. > > > > Now I'm confused. > > You mean rte_flow_action_copy_data.offset is a bit offset? > > rte_flow_action_copy_data.offset and rte_flow_action_copy_field.width > are measured in bits, right. So the offset is limited to 16 bits? How can it be useful? Is it an offset starting from the specified field? > > > > > > Can we say that a field id can always be replaced by an offset? > > > > > > > > > > Not really. You can use offset to jump around packet fields for sure, > > > > > but it > > is > > > > going to be > > > > > hard and cumbersome to calculate all the offsets for that. Field ID > > > > > is much > > > > more convenient. > > > > > > > > I think it depends for who. > > > > For some use cases, it may be easier to pass an offset. > > > > For some drivers, it may be more efficient to directly manage offsets. > > > > > > It is possible with this RFC, driver can choose what to use: id and/or > > > offset. 
>> We can set field and index to 0, and use only offset?
>
> Yes, I'm not intending to put any restrictions against that.
>
>> Then it is a byte offset from the beginning of mbuf.data?
>
> Yes, but it is still a bit offset, not a byte offset.
[dpdk-dev] [PATCH v2 0/5] regex/mlx5: pmd improvements
This series adds a few fixes and improvements to the Nvidia RegEx PMD.

V2:
 * Fix small issue in combined rule patch.
 * Add new patch for number of queues.

Ori Kam (5):
  regex/mlx5: fix memory rule alignment
  regex/mlx5: add support for combined rule file
  regex/mlx5: fix support for group id
  regex/mlx5: add support for priority match
  regex/mlx5: fix num of supported queues

 drivers/regex/mlx5/mlx5_regex.c          |   9 +
 drivers/regex/mlx5/mlx5_regex.h          |   1 +
 drivers/regex/mlx5/mlx5_regex_fastpath.c |  23 +-
 drivers/regex/mlx5/mlx5_rxp.c            | 846 ++-
 drivers/regex/mlx5/mlx5_rxp_csrs.h       |   3 +
 5 files changed, 385 insertions(+), 497 deletions(-)

-- 
2.25.1
[dpdk-dev] [PATCH v2 2/5] regex/mlx5: add support for combined rule file
The rof file holds programming instructions for a given HW version. In order to support future generation of HW it was decided that the rof file will hold number of rule configurations, and the driver will use the one that matches the HW version. In current code we force sync after each write block. This has impact on performance. The solution is to move the sync to the end of the entire programming sequence. Signed-off-by: Ori Kam --- V2: * fix issue with external rules. --- drivers/regex/mlx5/mlx5_regex.c| 9 + drivers/regex/mlx5/mlx5_regex.h| 1 + drivers/regex/mlx5/mlx5_rxp.c | 841 - drivers/regex/mlx5/mlx5_rxp_csrs.h | 3 + 4 files changed, 364 insertions(+), 490 deletions(-) diff --git a/drivers/regex/mlx5/mlx5_regex.c b/drivers/regex/mlx5/mlx5_regex.c index c91c444dda..25eaec5802 100644 --- a/drivers/regex/mlx5/mlx5_regex.c +++ b/drivers/regex/mlx5/mlx5_regex.c @@ -119,6 +119,7 @@ mlx5_regex_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct mlx5_hca_attr attr; char name[RTE_REGEXDEV_NAME_MAX_LEN]; int ret; + uint32_t val; ibv = mlx5_regex_get_ib_device_match(&pci_dev->addr); if (!ibv) { @@ -161,6 +162,14 @@ mlx5_regex_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, } priv->ctx = ctx; priv->nb_engines = 2; /* attr.regexp_num_of_engines */ + ret = mlx5_devx_regex_register_read(priv->ctx, 0, + MLX5_RXP_CSR_IDENTIFIER, &val); + if (ret) { + DRV_LOG(ERR, "CSR read failed!"); + return -1; + } + if (val == MLX5_RXP_BF2_IDENTIFIER) + priv->is_bf2 = 1; /* Default RXP programming mode to Shared. */ priv->prog_mode = MLX5_RXP_SHARED_PROG_MODE; mlx5_regex_get_name(name, pci_dev); diff --git a/drivers/regex/mlx5/mlx5_regex.h b/drivers/regex/mlx5/mlx5_regex.h index 2c4877c37d..60f29a84d2 100644 --- a/drivers/regex/mlx5/mlx5_regex.h +++ b/drivers/regex/mlx5/mlx5_regex.h @@ -80,6 +80,7 @@ struct mlx5_regex_priv { struct ibv_pd *pd; struct mlx5_dbr_page_list dbrpgs; /* Door-bell pages. */ struct mlx5_mr_share_cache mr_scache; /* Global shared MR cache. 
*/ + uint8_t is_bf2; /* The device is BF2 device. */ }; /* mlx5_regex.c */ diff --git a/drivers/regex/mlx5/mlx5_rxp.c b/drivers/regex/mlx5/mlx5_rxp.c index bd721f0b11..66cbc1f481 100644 --- a/drivers/regex/mlx5/mlx5_rxp.c +++ b/drivers/regex/mlx5/mlx5_rxp.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -24,6 +25,8 @@ #define MLX5_REGEX_MAX_RULES_PER_GROUP UINT32_MAX #define MLX5_REGEX_MAX_GROUPS MLX5_RXP_MAX_SUBSETS +#define MLX5_REGEX_RXP_ROF2_LINE_LEN 34 + /* Private Declarations */ static int rxp_poll_csr_for_value(struct ibv_context *ctx, uint32_t *value, @@ -36,30 +39,15 @@ mlnx_resume_database(struct mlx5_regex_priv *priv, uint8_t id); static int mlnx_update_database(struct mlx5_regex_priv *priv, uint8_t id); static int -program_rxp_rules(struct mlx5_regex_priv *priv, - struct mlx5_rxp_ctl_rules_pgm *rules, uint8_t id); +program_rxp_rules(struct mlx5_regex_priv *priv, const char *buf, uint32_t len, + uint8_t id); static int rxp_init_eng(struct mlx5_regex_priv *priv, uint8_t id); static int -write_private_rules(struct mlx5_regex_priv *priv, - struct mlx5_rxp_ctl_rules_pgm *rules, - uint8_t id); -static int -write_shared_rules(struct mlx5_regex_priv *priv, - struct mlx5_rxp_ctl_rules_pgm *rules, uint32_t count, - uint8_t db_to_program); -static int rxp_db_setup(struct mlx5_regex_priv *priv); static void rxp_dump_csrs(struct ibv_context *ctx, uint8_t id); static int -rxp_write_rules_via_cp(struct ibv_context *ctx, - struct mlx5_rxp_rof_entry *rules, - int count, uint8_t id); -static int -rxp_flush_rules(struct ibv_context *ctx, struct mlx5_rxp_rof_entry *rules, - int count, uint8_t id); -static int rxp_start_engine(struct ibv_context *ctx, uint8_t id); static int rxp_stop_engine(struct ibv_context *ctx, uint8_t id); @@ -123,110 +111,6 @@ mlx5_regex_info_get(struct rte_regexdev *dev __rte_unused, return 0; } -/** - * Actual writing of RXP instructions to RXP via CSRs. 
- */ -static int -rxp_write_rules_via_cp(struct ibv_context *ctx, - struct mlx5_rxp_rof_entry *rules, - int count, uint8_t id) -{ - int i, ret = 0; - uint32_t tmp; - - for (i = 0; i < count; i++) { - tmp = (uint32_t)rules[i].value; - ret |= mlx5_devx_regex_register_write(ctx, id, - MLX5_RXP_RTRU_CSR_DATA_0, - tmp); - tmp = (uint32_t)(rules[i].value >> 32); -
[dpdk-dev] [PATCH v2 3/5] regex/mlx5: fix support for group id
In order to know which groups in the RegEx engine should be used
there is a need to check the req_flags.
This commit adds the missing check.

Cc: sta...@dpdk.org
Fixes: 4d4e245ad637 ("regex/mlx5: support enqueue")

Signed-off-by: Ori Kam
---
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c
index 5857617282..8d134ac98e 100644
--- a/drivers/regex/mlx5/mlx5_regex_fastpath.c
+++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c
@@ -105,7 +105,21 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 {
 	size_t wqe_offset = (sq->pi & (sq_size_get(sq) - 1)) * MLX5_SEND_WQE_BB;
 	uint32_t lkey;
-
+	uint16_t group0 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F ?
+			op->group_id0 : 0;
+	uint16_t group1 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F ?
+			op->group_id1 : 0;
+	uint16_t group2 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F ?
+			op->group_id2 : 0;
+	uint16_t group3 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F ?
+			op->group_id3 : 0;
+
+	/* For backward compatibility. */
+	if (!(op->req_flags & (RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F |
+			       RTE_REGEX_OPS_REQ_GROUP_ID1_VALID_F |
+			       RTE_REGEX_OPS_REQ_GROUP_ID2_VALID_F |
+			       RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F)))
+		group0 = op->group_id0;
 	lkey = mlx5_mr_addr2mr_bh(priv->pd, 0, &priv->mr_scache, &qp->mr_ctrl,
 				  rte_pktmbuf_mtod(op->mbuf, uintptr_t),
@@ -116,9 +130,8 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 	set_wqe_ctrl_seg((struct mlx5_wqe_ctrl_seg *)wqe, sq->pi,
 			 MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, sq->obj->id,
 			 0, ds, 0, 0);
-	set_regex_ctrl_seg(wqe + 12, 0, op->group_id0, op->group_id1,
-			   op->group_id2,
-			   op->group_id3, 0);
+	set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3,
+			   0);
 	struct mlx5_wqe_data_seg *input_seg =
 		(struct mlx5_wqe_data_seg *)(wqe +
 					     MLX5_REGEX_WQE_GATHER_OFFSET);
-- 
2.25.1
[dpdk-dev] [PATCH v2 1/5] regex/mlx5: fix memory rule alignment
Due to a kernel requirement, the allocated memory must be aligned to 2M.

Fixes: b34d816363b5 ("regex/mlx5: support rules import")
Cc: sta...@dpdk.org

Signed-off-by: Ori Kam
---
 drivers/regex/mlx5/mlx5_rxp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regex/mlx5/mlx5_rxp.c b/drivers/regex/mlx5/mlx5_rxp.c
index fcbc766441..bd721f0b11 100644
--- a/drivers/regex/mlx5/mlx5_rxp.c
+++ b/drivers/regex/mlx5/mlx5_rxp.c
@@ -892,7 +892,7 @@ rxp_db_setup(struct mlx5_regex_priv *priv)
 
 	/* Setup database memories for both RXP engines + reprogram memory. */
 	for (i = 0; i < (priv->nb_engines + MLX5_RXP_EM_COUNT); i++) {
-		priv->db[i].ptr = rte_malloc("", MLX5_MAX_DB_SIZE, 0);
+		priv->db[i].ptr = rte_malloc("", MLX5_MAX_DB_SIZE, 1 << 21);
 		if (!priv->db[i].ptr) {
 			DRV_LOG(ERR, "Failed to alloc db memory!");
 			ret = ENODEV;
-- 
2.25.1
[dpdk-dev] [PATCH v2 4/5] regex/mlx5: add support for priority match
The high priority match request flag means that the RegEx engine should
stop on the first match. This commit adds this flag check to the RegEx
engine.

Signed-off-by: Ori Kam
---
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c
index 8d134ac98e..ee72f89c99 100644
--- a/drivers/regex/mlx5/mlx5_regex_fastpath.c
+++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c
@@ -113,6 +113,8 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 			op->group_id2 : 0;
 	uint16_t group3 = op->req_flags & RTE_REGEX_OPS_REQ_GROUP_ID3_VALID_F ?
 			op->group_id3 : 0;
+	uint8_t control = op->req_flags &
+			RTE_REGEX_OPS_REQ_MATCH_HIGH_PRIORITY_F ? 1 : 0;
 
 	/* For backward compatibility. */
 	if (!(op->req_flags & (RTE_REGEX_OPS_REQ_GROUP_ID0_VALID_F |
@@ -131,7 +133,7 @@ prep_one(struct mlx5_regex_priv *priv, struct mlx5_regex_qp *qp,
 			 MLX5_OPCODE_MMO, MLX5_OPC_MOD_MMO_REGEX, sq->obj->id,
 			 0, ds, 0, 0);
 	set_regex_ctrl_seg(wqe + 12, 0, group0, group1, group2, group3,
-			   0);
+			   control);
 	struct mlx5_wqe_data_seg *input_seg =
 		(struct mlx5_wqe_data_seg *)(wqe +
 					     MLX5_REGEX_WQE_GATHER_OFFSET);
-- 
2.25.1
[dpdk-dev] [PATCH v2 5/5] regex/mlx5: fix num of supported queues
The RegEx engine has no limitation on the number of queues. This commit
modifies the max supported queues reported to the application.

Fixes: fbc8c7003b93 ("regex/mlx5: add completion queue creation")
Cc: sta...@dpdk.org

Signed-off-by: Ori Kam
---
 drivers/regex/mlx5/mlx5_rxp.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/regex/mlx5/mlx5_rxp.c b/drivers/regex/mlx5/mlx5_rxp.c
index 66cbc1f481..e5d9b9245a 100644
--- a/drivers/regex/mlx5/mlx5_rxp.c
+++ b/drivers/regex/mlx5/mlx5_rxp.c
@@ -103,11 +103,10 @@ mlx5_regex_info_get(struct rte_regexdev *dev __rte_unused,
 	info->max_payload_size = MLX5_REGEX_MAX_PAYLOAD_SIZE;
 	info->max_rules_per_group = MLX5_REGEX_MAX_RULES_PER_GROUP;
 	info->max_groups = MLX5_REGEX_MAX_GROUPS;
-	info->max_queue_pairs = 1;
 	info->regexdev_capa = RTE_REGEXDEV_SUPP_PCRE_GREEDY_F |
 			      RTE_REGEXDEV_CAPA_QUEUE_PAIR_OOS_F;
 	info->rule_flags = 0;
-	info->max_queue_pairs = 10;
+	info->max_queue_pairs = UINT16_MAX;
 	return 0;
 }
-- 
2.25.1
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
> 07/01/2021 16:22, Alexander Kozyrev: > > > 07/01/2021 16:10, Alexander Kozyrev: > > > > > > > Thursday, January 7, 2021 10:18, Thomas Monjalon > > > > > > > > > > > RTE Flows API lacks the ability to save an arbitrary header > > > > > > > > field in > > > > > > > > order to use it later for advanced packet manipulations. > > > > > > > > Examples > > > > > > > > include the usage of VxLAN ID after the packet is decapsulated > > > > > > > > or > > > > > > > > storing this ID inside the packet payload itself or swapping an > > > > > > > > arbitrary inner and outer packet fields. > > > > > > > > > > > > > > > > The idea is to allow a copy of a specified number of bits form > > > > > > > > any > > > > > > > > packet header field into another header field: > > > > > > > > RTE_FLOW_ACTION_TYPE_COPY_FIELD with the structure defined > > > below. > > > > > > > > > > > > > > > > struct rte_flow_action_copy_field { > > > > > > > > struct rte_flow_action_copy_data dest; > > > > > > > > struct rte_flow_action_copy_data src; > > > > > > > > uint16_t width; > > > > > > > > }; > > > > > > > > > > > > > > > > Arbitrary header field (as well as mark, metadata or tag > > > > > > > > values) can > be > > > > > > > > used as both source and destination fields. This way we can > > > > > > > > save an > > > > > > > > arbitrary header field by copying its value to a > > > > > > > > tag/mark/metadata > or > > > > > > > > copy it into another header field directly. tag/mark/metadata > > > > > > > > can > also > > > > > > > > be used as a value to be stored in an arbitrary packet header > > > > > > > > field. > > > > > > > > > > > > > > > > struct rte_flow_action_copy_data { > > > > > > > > enum rte_flow_field_id field; > > > > > > > > uint16_t index; > > > > > > > > uint16_t offset; > > > > > > > > }; > > > > > > > > > > > > > > > > The rte_flow_field_id specifies the particular packet field (or > > > > > > > > tag/mark/metadata) to be used as a copy source or destination. 
> > > > > > > > The index gives access to inner packet headers or elements in > > > > > > > > the > tags > > > > > > > > array. The offset allows to copy a packet field value into the > payload. > > > > > > > > > > > > > > So index is in reality the layer? How is it numbered exactly? > > > > > > > > > > > > It is a layer for packet fields, inner headers get higher number > > > > > > index. > > > > > > But is it also an index in the TAG array, so the name comes from it. > > > > > > > > > > Sorry it is not obvious. > > > > > Please describe the exact numbering in tunnel and VLAN cases. > > > > > > > > > > > > What is the field id if an offset is given? > > > > > > > > > > > > Field ID stays the same, you can specify a small offset to copy > > > > > > just a > few > > > bits > > > > > > from the entire packet field or a big offset to move to completely > different > > > > > area. > > > > > > > > > > I don't understand what is an offset then. > > > > > Isn't it the byte or bit where the copy start? > > > > > Do you handle sizes smaller than a byte? > > > > > > > > It is the bit offset, you can copy 20 bits out of 32 bits of IPv4 > > > > address for > > > example. > > > > > > Now I'm confused. > > > You mean rte_flow_action_copy_data.offset is a bit offset? > > > > rte_flow_action_copy_data.offset and rte_flow_action_copy_field.width > > are measured in bits, right. > > So the offset is limited to 16 bits? > How can it be useful? Is it an offset starting from the specified field? Why 16? It can be up to 2^16=65536 bits. Do you think that is not enough? And it starts from the specific packet field pointed by the Field ID, correct. > > > > > > > > Can we say that a field id can always be replaced by an offset? > > > > > > > > > > > > Not really. You can use offset to jump around packet fields for > > > > > > sure, but > it > > > is > > > > > going to be > > > > > > hard and cumbersome to calculate all the offsets for that. 
Field ID > > > > > > is > much > > > > > more convenient. > > > > > > > > > > I think it depends for who. > > > > > For some use cases, it may be easier to pass an offset. > > > > > For some drivers, it may be more efficient to directly manage offsets. > > > > > > > > It is possible with this RFC, driver can choose what to use: id and/or > > > > offset. > > > > > > > > We can set field and index to 0, and use only offset? > > Yes, I'm not inending to put any restrictions against that. > > > Then it is a byte offset from the beginning mbuf.data? > > Yes, but it is still bit offset, not byte offset.
Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri
07/01/2021 16:24, Zhang, Qi Z: > From: Thomas Monjalon > > 07/01/2021 13:47, Zhang, Qi Z: > > > From: Thomas Monjalon > > > > 07/01/2021 10:32, Guo, Jia: > > > > > From: Thomas Monjalon > > > > > > 24/12/2020 07:59, Jeff Guo: > > > > > > > --- a/lib/librte_ethdev/rte_ethdev.h > > > > > > > +++ b/lib/librte_ethdev/rte_ethdev.h > > > > > > > @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type { > > > > > > > RTE_TUNNEL_TYPE_IP_IN_GRE, > > > > > > > RTE_L2_TUNNEL_TYPE_E_TAG, > > > > > > > RTE_TUNNEL_TYPE_VXLAN_GPE, > > > > > > > + RTE_TUNNEL_TYPE_ECPRI, > > > > > > > RTE_TUNNEL_TYPE_MAX, > > > > > > > }; > > > > > > > > > > > > We tried to remove all these legacy API in DPDK 20.11. > > > > > > Andrew decided to not remove this one because it is not yet > > > > > > completely replaced by rte_flow in all drivers. > > > > > > However, I am against continuing to update this API. > > > > > > The opposite work should be done: migrate to rte_flow. > > > > > > > > > > Agree but seems that the legacy api and driver legacy > > > > > implementation still keep in this release, and there is no a > > > > > general way to replace the legacy by rte_flow right now. > > > > > > > > I think rte_flow is a complete replacement with more features. > > > > > > Thomas, I may not agree with this. > > > > > > Actually the "enum rte_eth_tunnel_type" is used by > > > rte_eth_dev_udp_tunnel_port_add A packet with specific dst udp port > > > will be recognized as a specific tunnel packet type (e.g. vxlan, > > > vxlan-gpe, > > ecpri...) In Intel NIC, the API actually changes the configuration of the > > packet > > parser in HW but not add a filter rule and I guess all other devices may > > enable it > > in a similar way. > > > so naturally it should be a device (port) level configuration but not a > > > rte_flow > > rule for match, encap, decap... > > > > I don't understand how it helps to identify an UDP port if there is no rule > > for > > this tunnel. > > What is the usage? 
> Yes, in general it is a rule: it matches a UDP packet's dst port, and
> the action is "now the packet is identified as a vxlan packet"; then
> all other rte_flow rules that match on vxlan as a pattern will take
> effect. But somehow, I think they are not rules in the same domain.
> Just like we have a dedicated API for mac/vlan filters, we'd better
> have a dedicated API for this also. (The RFC for VXLAN explains why we
> need this: https://tools.ietf.org/html/rfc7348.)
>
> > "Destination Port: IANA has assigned the value 4789 for the
> > VXLAN UDP port, and this value SHOULD be used by default as the
> > destination UDP port. Some early implementations of VXLAN have
> > used other values for the destination port. To enable
> > interoperability with these implementations, the destination
> > port SHOULD be configurable."

Yes, the port number is free. But isn't it more natural to specify this
port number as part of the rte_flow rule?
Re: [dpdk-dev] [RFC] ethdev: introduce copy_field rte flow action
07/01/2021 17:57, Alexander Kozyrev:
> > 07/01/2021 16:22, Alexander Kozyrev:
> > > > 07/01/2021 16:10, Alexander Kozyrev:
> > > > > > > > Thursday, January 7, 2021 10:18, Thomas Monjalon
> > > > > > > > > RTE Flows API lacks the ability to save an arbitrary
> > > > > > > > > header field in order to use it later for advanced
> > > > > > > > > packet manipulations. Examples include the usage of the
> > > > > > > > > VxLAN ID after the packet is decapsulated, storing this
> > > > > > > > > ID inside the packet payload itself, or swapping
> > > > > > > > > arbitrary inner and outer packet fields.
> > > > > > > > >
> > > > > > > > > The idea is to allow a copy of a specified number of
> > > > > > > > > bits from any packet header field into another header
> > > > > > > > > field: RTE_FLOW_ACTION_TYPE_COPY_FIELD with the
> > > > > > > > > structure defined below.
> > > > > > > > >
> > > > > > > > > struct rte_flow_action_copy_field {
> > > > > > > > > 	struct rte_flow_action_copy_data dest;
> > > > > > > > > 	struct rte_flow_action_copy_data src;
> > > > > > > > > 	uint16_t width;
> > > > > > > > > };
> > > > > > > > >
> > > > > > > > > An arbitrary header field (as well as mark, metadata or
> > > > > > > > > tag values) can be used as both source and destination
> > > > > > > > > fields. This way we can save an arbitrary header field
> > > > > > > > > by copying its value to a tag/mark/metadata, or copy it
> > > > > > > > > into another header field directly. tag/mark/metadata
> > > > > > > > > can also be used as a value to be stored in an
> > > > > > > > > arbitrary packet header field.
> > > > > > > > >
> > > > > > > > > struct rte_flow_action_copy_data {
> > > > > > > > > 	enum rte_flow_field_id field;
> > > > > > > > > 	uint16_t index;
> > > > > > > > > 	uint16_t offset;
> > > > > > > > > };
> > > > > > > > >
> > > > > > > > > The rte_flow_field_id specifies the particular packet
> > > > > > > > > field (or tag/mark/metadata) to be used as a copy
> > > > > > > > > source or destination. The index gives access to inner
> > > > > > > > > packet headers or elements in the tags array. The
> > > > > > > > > offset allows copying a packet field value into the
> > > > > > > > > payload.
> > > > > > > >
> > > > > > > > So index is in reality the layer? How is it numbered
> > > > > > > > exactly?
> > > > > > >
> > > > > > > It is a layer for packet fields; inner headers get a higher
> > > > > > > index number. But it is also an index in the TAG array, so
> > > > > > > the name comes from that.
> > > > > >
> > > > > > Sorry, it is not obvious.
> > > > > > Please describe the exact numbering in tunnel and VLAN cases.
> > > > > >
> > > > > > > > What is the field id if an offset is given?
> > > > > > >
> > > > > > > The Field ID stays the same; you can specify a small offset
> > > > > > > to copy just a few bits from the entire packet field, or a
> > > > > > > big offset to move to a completely different area.
> > > > > >
> > > > > > I don't understand what an offset is then.
> > > > > > Isn't it the byte or bit where the copy starts?
> > > > > > Do you handle sizes smaller than a byte?
> > > > >
> > > > > It is the bit offset; you can copy 20 bits out of the 32 bits
> > > > > of an IPv4 address, for example.
> > > >
> > > > Now I'm confused.
> > > > You mean rte_flow_action_copy_data.offset is a bit offset?
> > >
> > > rte_flow_action_copy_data.offset and rte_flow_action_copy_field.width
> > > are measured in bits, right.
> >
> > So the offset is limited to 16 bits?
> > How can it be useful?
> > Is it an offset starting from the specified field?
>
> Why 16? It can be up to 2^16 = 65536 bits. Do you think that is not
> enough?

Yes, 8KB may be too small for huge packets. I recommend 32 bits.

> And it starts from the specific packet field pointed to by the Field
> ID, correct.

I think it would be more useful as a global offset starting from the
first bit of the packet.

> > > > > > Can we say that a field id can always be replaced by an
> > > > > > offset?
> > > > >
> > > > > Not really. You can use an offset to jump around packet fields
> > > > > for sure, but it is going to be hard and cumbersome to
> > > > > calculate all the offsets for that. The Field ID is much more
> > > > > convenient.
> > > >
> > > > I think it depends for whom.
> > > > For some use cases, it may be easier to pass an offset.
> > > > For some drivers, it may be more efficient to directly manage
> > > > offsets.
> > >
> > > It is possible with this RFC; the driver can choose what to use:
> > > id and/or offset.
> >
> > We can set field and index to 0, and use only offset?
>
> Yes, I'm not intending to put any restrictions against that.

Then it is a byte offset
Re: [dpdk-dev] [PATCH v1 7/8] vhost: replace smp with thread fence for packed vring
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Simply replace smp barriers with atomic thread fence for
> virtio packed vring.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/virtio_net.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
[dpdk-dev] [PATCH v2 1/1] mbuf: fix rte_mbuf_dyn.h inclusion from C++
The header was missing the extern "C" directive which causes name
mangling of functions by C++ compilers, leading to linker errors
complaining of undefined references to these functions.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: olivier.m...@6wind.com
Cc: sta...@dpdk.org

Signed-off-by: Ashish Sadanandan
---
v2:
* No functional change
* Add fixes tag with reference to commit that introduced the header
* CC sta...@dpdk.org

 lib/librte_mbuf/rte_mbuf_dyn.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index d88e7ba..fc4eee7 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -67,6 +67,11 @@
  */
 
 #include 
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * Maximum length of the dynamic field or flag string.
  */
@@ -326,4 +331,8 @@ int rte_mbuf_dyn_rx_timestamp_register(int *field_offset, uint64_t *rx_flag);
 __rte_experimental
 int rte_mbuf_dyn_tx_timestamp_register(int *field_offset, uint64_t *tx_flag);
 
+#ifdef __cplusplus
+}
+#endif
+
 #endif /* _RTE_MBUF_DYN_H_ */
-- 
2.9.3
Re: [dpdk-dev] [PATCH v1 8/8] vhost: replace smp with thread fence for control path
On 12/21/20 4:50 PM, Joyce Kong wrote:
> Simply replace the smp barriers with atomic thread fence for vhost
> control path, if there are no synchronization points.
>
> Signed-off-by: Joyce Kong
> Reviewed-by: Ruifeng Wang
> ---
>  lib/librte_vhost/vhost.c      | 18 +++++++++---------
>  lib/librte_vhost/vhost.h      |  6 +++---
>  lib/librte_vhost/vhost_user.c |  2 +-
>  lib/librte_vhost/virtio_net.c |  2 +-
>  4 files changed, 14 insertions(+), 14 deletions(-)
>

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH 1/2] common/mlx5: support vDPA completion queue moderation
On 1/6/21 4:06 AM, Xueming Li wrote:
> This patch introduces new parameters for VirtQ CQ moderation, used for
> performance tuning.
>
> Signed-off-by: Xueming Li
> ---
>  drivers/common/mlx5/mlx5_devx_cmds.c | 3 +++
>  drivers/common/mlx5/mlx5_devx_cmds.h | 3 +++
>  drivers/common/mlx5/mlx5_prm.h       | 6 +++++-
>  3 files changed, 11 insertions(+), 1 deletion(-)
>

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
Re: [dpdk-dev] [PATCH 1/1] mbuf: add extern "C" to rte_mbuf_dyn.h
Hi David,

On Thu, Jan 7, 2021 at 1:02 AM David Marchand wrote:
> On Thu, Jan 7, 2021 at 2:42 AM Ashish Sadanandan wrote:
> >
> > Hi Olivier,
> >
> > On Wed, Jan 6, 2021 at 6:21 AM Olivier Matz wrote:
> > >
> > > Hi Ashish,
> > >
> > > Yes, it should reference the patch that introduced the issue. In
> > > this case, it should be:
> > >
> > > Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
> > >
> > > Can you please also change the title to start with "fix", for
> > > instance like this:
> > >
> > > mbuf: fix inclusion from c++
> > >
> > > You can also add "Cc: sta...@dpdk.org" in the commit log.
> > >
> > > Some documentation can be found in doc/guides/contributing/patches.rst
> >
> > Thanks for the pointers. I've made the changes above. However I don't
> > know if I'm using the message ID for --in-reply-to correctly, since I
> > seem to be creating a new thread each time.
>
> You can get the message ID in various places.
> I usually take the info from patchwork and update the previous patch
> state to Superseded at the same time.
>
> So in your case, the v1 patch was:
> http://patchwork.dpdk.org/patch/85878/
>
> Message ID: 20201229194144.17824-1-ashish.sadanan...@gmail.com
>
> So for the v2:
> git send-email --to dev@dpdk.org --cc-cmd ./devtools/get-maintainer.sh
> --cc XXX --in-reply-to
> '20201229194144.17824-1-ashish.sadanan...@gmail.com' v2-*.patch

This is almost the exact command I ran (except for the --cc-cmd arg
since I don't have the Linux source tree cloned). Just tried it once
more, but I think it created a new thread again! I don't get what I'm
doing wrong and am terribly sorry for the spam I've been generating.

- Ashish

> -- 
> David Marchand
Re: [dpdk-dev] [PATCH 2/2] vdpa/mlx5: hardware queue moderation
On 1/6/21 4:06 AM, Xueming Li wrote:
> The next parameters control the HW queue moderation feature.
> This feature helps to control the traffic performance and latency
> tradeoff.
>
> Each packet completion report from HW to SW requires CQ processing by
> SW and triggers an interrupt for the guest driver. Interrupt report
> and handling cost CPU cycles and time, and the amount of this directly
> affects packet performance and latency.
>
> hw_latency_mode parameter [int]
>   0, HW default.
>   1, Latency is counted from the first packet completion report.
>   2, Latency is counted from the last packet completion.
> hw_max_latency_us parameter [int]
>   0 - 4095, The maximum time in microseconds that a packet completion
>   report can be delayed.
> hw_max_pending_comp parameter [int]
>   0 - 65535, The maximum number of pending packet completions in an
>   HW queue.
>
> Signed-off-by: Xueming Li
> ---
>  doc/guides/vdpadevs/mlx5.rst        | 24 ++++++++++++++++++++++++
>  drivers/vdpa/mlx5/mlx5_vdpa.c       |  6 ++++++
>  drivers/vdpa/mlx5/mlx5_vdpa.h       |  3 +++
>  drivers/vdpa/mlx5/mlx5_vdpa_virtq.c |  3 +++
>  4 files changed, 36 insertions(+)

Reviewed-by: Maxime Coquelin

Thanks,
Maxime
[dpdk-dev] DPDK Release Status Meeting 7/01/2021
Meeting minutes of 7 January 2021
---------------------------------

Agenda:
* Release Dates
* Subtrees
* LTS
* Opens

Participants:
* Arm
* Broadcom
* Debian/Microsoft
* Intel
* Marvell
* Nvidia
* NXP
* Red Hat

Release Dates
-------------

* Happy new year!
* v21.02 dates
  * Proposal/V1 passed, it was on Sunday, 20 December 2020
  * -rc1: Friday, 15 January 2021
  * -rc2: Friday, 29 January 2021
  * Release: Friday, 5 February 2021
* Please send roadmaps, preferably before the beginning of the release
  * Thanks to NTT, Huawei hns3, Intel, Red Hat, Marvell, Broadcom, Arm,
    Nvidia for sending roadmaps

Subtrees
--------

* main
  * Working on patches from the previous release
    * Intel power management series
    * Arm build series
  * Suggested to discuss device emulation in the techboard
    * There is a dependency on Qemu
    * Adding a lot of code, need to clarify the usecase
  * Not enough reviews, hoping to have more review/activity after
    everyone returns from holidays
  * rte_flow action change RFC is out, it is a design change, please
    review
    * http://patchwork.dpdk.org/patch/85384/
* next-net
  * Some patches merged already, pulling from sub-trees
  * ~160 patches in the backlog
  * Planning to complete ethdev changes before -rc1
  * There will be an early pull to the main tree this Friday
* next-crypto
  * Will start merging next week
  * Two sets in the queue
    * enqueue & dequeue callbacks
    * Will be reviewed next week
  * There are some Marvell PMD patches planned for -rc1
* next-eventdev
  * Very few patches, will be reviewed this weekend
* next-virtio
  * Quite a few patches in the backlog
    * vectorized data path, smp barriers, virtio PMD rework, ...
  * Reviews are going on
  * Working on virtio PMD rework
  * Planning to have a pull request tomorrow
* next-net-mlx
  * Thomas is acting as backup maintainer
  * There are big Windows enablement sets
  * Will pull to next-net today
  * Some features are waiting for ethdev patches
* next-net-brcm
  * Some in the backlog, no more big set/feature for this release
* next-net-intel
  * Pulled, more patches expected including base code update
  * i40e Windows enablement patches will go directly to the main tree
* next-net-mrvl
  * mvpp2 set is under review, will do a second round of review on the
    weekend
  * There is an octeontx endpoint PMD
    * Will be merged directly to next-net
    * First internal review done, Jerin will try to review

LTS
---

* v19.11.6 is released
  * http://inbox.dpdk.org/dev/20201218104305.3815770-1-luca.bocca...@gmail.com/
* v18.11.11-rc1 is out, please test
  * http://inbox.dpdk.org/dev/20201217120304.351927-1-ktray...@redhat.com/
  * Waiting for test reports, some already received
    * Red Hat test results look good
    * Intel reported two issues
      * One is related to the kernel version, the other is not a new
        regression
      * Ping (from Intel) confirmed the issues can be ignored
  * OvS team started to test it
  * Planned release date is 19th January

Opens
-----

* Coverity checks are automated
  * They will run three times a week

DPDK Release Status Meetings
----------------------------

The DPDK Release Status Meeting is intended for DPDK Committers to
discuss the status of the master tree and sub-trees, and for project
managers to track progress or milestone dates.

The meeting occurs every Thursday at 8:30 UTC on
https://meet.jit.si/DPDK

If you wish to attend just send an email to
"John McNamara " for the invite.