[PATCH 1/1] doc: add author name on cc to git fixline alias
We should add both the author name and the author email address
to the Cc: line of a blamed patch.

Fixes: f9ef083c6cfe ("doc: add author on cc to git fixline alias")
Cc: Harry van Haaren
Signed-off-by: Heinrich Schuchardt
---
 doc/guides/contributing/patches.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index bebcaf3925..aad2429e58 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -260,7 +260,7 @@ Here are some guidelines for the body of a commit message:
 You can generate the required lines using the following git alias, which prints
 the commit SHA and the author of the original code::

-     git config alias.fixline "log -1 --abbrev=12 --format='Fixes: %h (\"%s\")%nCc: %ae'"
+     git config alias.fixline "log -1 --abbrev=12 --format='Fixes: %h (\"%s\")%nCc: %an <%ae>'"

 The output of ``git fixline `` must then be added to the commit message::

--
2.36.1
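As a quick sanity check of the updated alias, the following throwaway-repository sketch shows the two lines it emits. The repository, author identity, and commit subject here are invented purely for illustration; only the alias itself comes from the patch above.

```shell
# Demonstrate the updated fixline alias in a scratch repository.
# The author identity and commit subject are made up for this example.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.name="Jane Doe" -c user.email="jane@example.com" \
    commit -q --allow-empty -m "net: fix example bug"
git config alias.fixline \
    "log -1 --abbrev=12 --format='Fixes: %h (\"%s\")%nCc: %an <%ae>'"
git fixline HEAD
# prints something like:
#   Fixes: 0123456789ab ("net: fix example bug")
#   Cc: Jane Doe <jane@example.com>
```

The two output lines can be pasted verbatim into the commit message of the fix.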
[PATCH] vdpa/mlx5: return correct error code after rte_intr_instance_alloc failed
From: newsky647 <835703...@qq.com>

After function rte_intr_instance_alloc failed, we should return
ENOMEM for error code.

Fixes: 5fe068bf7a2 ("vdpa/mlx5: reuse resources in reconfiguration")

Signed-off-by: newsky647 <835703...@qq.com>
---
 drivers/vdpa/mlx5/mlx5_vdpa_event.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
index 7167a98db0..6223afaae8 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_event.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
@@ -395,6 +395,7 @@ mlx5_vdpa_err_event_setup(struct mlx5_vdpa_priv *priv)
 		rte_intr_instance_alloc(RTE_INTR_INSTANCE_F_SHARED);
 	if (priv->err_intr_handle == NULL) {
 		DRV_LOG(ERR, "Fail to allocate intr_handle");
+		rte_errno = ENOMEM;
 		goto error;
 	}
 	if (rte_intr_fd_set(priv->err_intr_handle, priv->err_chnl->fd))
--
2.35.1.windows.2
[PATCH] devtools: ensure proper tag sequence
From: Jakub Palider

This change to log checking procedure ensures that all
tags are in proper order.
The order of tags is as follows:
* Coverity issue
* Bugzilla ID
* Fixes
* Cc
*
* Suggested-by
* Reported-by
+ Signed-off-by
* Acked-by
* Reviewed-by
* Tested-by
where:
* => 0 or more than one instance possible
+ => more than once instance possible

In order to satisfy the above requirements an extra check
is performed for obligatory tags.

Signed-off-by: Jakub Palider
Change-Id: I58d94fecddd5f978e0567b04f72bd2dc1c0b8a08
---
 devtools/check-git-log.sh           | 51 +
 doc/guides/contributing/patches.rst | 25 ++
 2 files changed, 76 insertions(+)

diff --git a/devtools/check-git-log.sh b/devtools/check-git-log.sh
index 23c6a7d9bb..0d8fff94ee 100755
--- a/devtools/check-git-log.sh
+++ b/devtools/check-git-log.sh
@@ -54,6 +54,7 @@ fixes=$(git log --format='%h %s' --reverse $range | grep -i ': *fix' | cut -d' '
 stablefixes=$($selfdir/git-log-fixes.sh $range | sed '/(N\/A)$/d' | cut -d' ' -f2)
 tags=$(git log --format='%b' --reverse $range | grep -i -e 'by *:' -e 'fix.*:')
 bytag='\(Reported\|Suggested\|Signed-off\|Acked\|Reviewed\|Tested\)-by:'
+exttag='Coverity issue:\|Bugzilla ID:\|Fixes:\|Cc:'

 failure=false

@@ -203,6 +204,56 @@ done)
 [ -z "$bad" ] || { printf "Is it candidate for Cc: sta...@dpdk.org backport?\n$bad\n"\
 	&& failure=true;}

+# check tag sequence
+bad=$(for commit in $commits; do
+	body=$(git log --format='%b' -1 $commit)
+	echo "$body" |\
+	grep -o -e "$exttag\|^[[:blank:]]*$\|$bytag" | \
+	# retrieve tags only
+	cut -f1 -d":" |\
+	# it is okay to have several tags of the same type but for processing
+	# we need to squash them
+	uniq |\
+	# make sure the tags are in the proper order as presented in SEQ
+	awk -v cmt="$commit" 'BEGIN{
+		SEQ[0] = "Coverity issue";
+		SEQ[1] = "Bugzilla ID";
+		SEQ[2] = "Fixes";
+		SEQ[3] = "Cc";
+		SEQ[4] = "^$";
+		SEQ[5] = "Suggested-by";
+		SEQ[6] = "Reported-by";
+		SEQ[7] = "Signed-off-by";
+		SEQ[8] = "Acked-by";
+		SEQ[9] = "Reviewed-by";
+		SEQ[10] = "Tested-by";
+		latest = 0;
+	}
+	{
+		for (seq = 0; seq < length(SEQ); seq++) {
+			if (match($0, SEQ[seq])) {
+				if (seq < latest) {
+					print "\tCommit " cmt " (" $0 ":)";
+					break;
+				} else {
+					latest = seq;
+				}
+			}
+		}
+	}'
+done)
+[ -z "$bad" ] || { printf "Wrong tag order: \n$bad\n"\
+	&& failure=true;}
+
+# check required tag
+bad=$(for commit in $commits; do
+	body=$(git log --format='%b' -1 $commit)
+	echo $body | grep -q "Signed-off-by:" \
+		|| echo "\tCommit" $commit "(Signed-off-by:)"
+	done)
+[ -z "$bad" ] || { printf "Missing obligatory tag: \n$bad\n"\
+	&& failure=true;}
+
 total=$(echo "$commits" | wc -l)
 if $failure ; then
 	printf "\nInvalid patch(es) found - checked $total patch"
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index bebcaf3925..868a9a7bf2 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -360,6 +360,31 @@ Where ``N`` is patchwork ID for patch or series::
 ---
 Depends-on: series-1 ("Title of the series")

+Tag order
+~
+
+There is a pattern indicating how tags should relate to each other.
+
+Example of proper tag sequence::
+
+    Coverity issue:
+    Bugzilla ID:
+    Fixes:
+    Cc:
+
+    Suggested-by:
+    Reported-by:
+    Signed-off-by:
+    Acked-by:
+    Reviewed-by:
+    Tested-by:
+
+Between first and second tag section there is and empty line.
+
+While ``Signed-off-by:`` is an obligatory tag and must exists in each commit,
+all other tags are optional. Any tag, as long as it is in proper location
+to other adjacent tags (if present), may occur multiple times.
+
 Creating Patches

--
2.25.1
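To illustrate the mechanism the awk script in this patch relies on, here is a stripped-down sketch with a three-entry SEQ table and a deliberately out-of-order trailer block. The tag names are real DPDK tags, but the input, hash, and people are invented; this is a simplification of the patch's logic, not the patch itself.

```shell
# Minimal sketch of the ordering check: each tag maps to a rank in SEQ,
# and a tag whose rank is lower than the highest rank seen so far is
# reported as out of order. The input trailer block is made up.
printf 'Fixes: deadbeefcafe ("example")\nSigned-off-by: Jane Doe\nReported-by: John Roe\n' |
awk 'BEGIN {
	SEQ[0] = "Fixes";
	SEQ[1] = "Reported-by";
	SEQ[2] = "Signed-off-by";
	latest = 0;
}
{
	for (seq = 0; seq < 3; seq++) {
		if (index($0, SEQ[seq]) == 1) {
			if (seq < latest)
				print "out of order: " $0;
			else
				latest = seq;
		}
	}
}'
# prints: out of order: Reported-by: John Roe
```

Here Signed-off-by (rank 2) is seen before Reported-by (rank 1), so the later, lower-ranked tag is flagged — the same comparison the full script performs per commit.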
[PATCH] event/cnxk: add eth port specific PTP enable
From: Pavan Nikhilesh

Add support to enable PTP per ethernet device when that specific
ethernet device is connected to event device via Rx adapter.

Signed-off-by: Pavan Nikhilesh
Signed-off-by: Nithin Dabilpuram
---
 drivers/common/cnxk/roc_io.h             |  5 ++-
 drivers/event/cnxk/cn10k_eventdev.c      |  9 ++---
 drivers/event/cnxk/cn10k_worker.h        | 48 +++-
 drivers/event/cnxk/cn9k_eventdev.c       | 13 +++
 drivers/event/cnxk/cn9k_worker.h         | 32 +++-
 drivers/event/cnxk/cnxk_eventdev.h       | 14 ---
 drivers/event/cnxk/cnxk_eventdev_adptr.c |  9 +
 drivers/net/cnxk/cn10k_rx.h              |  3 +-
 8 files changed, 82 insertions(+), 51 deletions(-)

diff --git a/drivers/common/cnxk/roc_io.h b/drivers/common/cnxk/roc_io.h
index 62e98d9d00..68db13e748 100644
--- a/drivers/common/cnxk/roc_io.h
+++ b/drivers/common/cnxk/roc_io.h
@@ -159,14 +159,15 @@ roc_lmt_mov(void *out, const void *in, const uint32_t lmtext)
 {
 	volatile const __uint128_t *src128 = (const __uint128_t *)in;
 	volatile __uint128_t *dst128 = (__uint128_t *)out;
+	uint32_t i;

 	dst128[0] = src128[0];
 	dst128[1] = src128[1];
 	/* lmtext receives following value:
 	 * 1: NIX_SUBDC_EXT needed i.e. tx vlan case
 	 */
-	if (lmtext)
-		dst128[2] = src128[2];
+	for (i = 0; i < lmtext; i++)
+		dst128[2 + i] = src128[2 + i];
 }

 static __plt_always_inline void
diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c
index 0da809db29..a0c3d22284 100644
--- a/drivers/event/cnxk/cn10k_eventdev.c
+++ b/drivers/event/cnxk/cn10k_eventdev.c
@@ -693,8 +693,7 @@ cn10k_sso_rx_adapter_caps_get(const struct rte_eventdev *event_dev,
 }

 static void
-cn10k_sso_set_priv_mem(const struct rte_eventdev *event_dev, void *lookup_mem,
-		       void *tstmp_info)
+cn10k_sso_set_priv_mem(const struct rte_eventdev *event_dev, void *lookup_mem)
 {
 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
 	int i;
@@ -702,7 +701,7 @@ cn10k_sso_set_priv_mem(const struct rte_eventdev *event_dev, void *lookup_mem,
 	for (i = 0; i < dev->nb_event_ports; i++) {
 		struct cn10k_sso_hws *ws = event_dev->data->ports[i];
 		ws->lookup_mem = lookup_mem;
-		ws->tstamp = tstmp_info;
+		ws->tstamp = dev->tstamp;
 	}
 }

@@ -714,7 +713,6 @@ cn10k_sso_rx_adapter_queue_add(
 {
 	struct cn10k_eth_rxq *rxq;
 	void *lookup_mem;
-	void *tstmp_info;
 	int rc;

 	rc = strncmp(eth_dev->device->driver->name, "net_cn10k", 8);
@@ -727,8 +725,7 @@ cn10k_sso_rx_adapter_queue_add(
 		return -EINVAL;
 	rxq = eth_dev->data->rx_queues[0];
 	lookup_mem = rxq->lookup_mem;
-	tstmp_info = rxq->tstamp;
-	cn10k_sso_set_priv_mem(event_dev, lookup_mem, tstmp_info);
+	cn10k_sso_set_priv_mem(event_dev, lookup_mem);
 	cn10k_sso_fp_fns_set((struct rte_eventdev *)(uintptr_t)event_dev);

 	return 0;
diff --git a/drivers/event/cnxk/cn10k_worker.h b/drivers/event/cnxk/cn10k_worker.h
index 034f508dd8..61c40eff7f 100644
--- a/drivers/event/cnxk/cn10k_worker.h
+++ b/drivers/event/cnxk/cn10k_worker.h
@@ -108,12 +108,29 @@ cn10k_wqe_to_mbuf(uint64_t wqe, const uint64_t __mbuf, uint8_t port_id,
 			  mbuf_init | ((uint64_t)port_id) << 48, flags);
 }

+static void
+cn10k_sso_process_tstamp(uint64_t u64, uint64_t mbuf,
+			 struct cnxk_timesync_info *tstamp)
+{
+	uint64_t tstamp_ptr;
+	uint8_t laptr;
+
+	laptr = (uint8_t) *
+		(uint64_t *)(u64 + (CNXK_SSO_WQE_LAYR_PTR * sizeof(uint64_t)));
+	if (laptr == sizeof(uint64_t)) {
+		/* Extracting tstamp, if PTP enabled*/
+		tstamp_ptr = *(uint64_t *)(((struct nix_wqe_hdr_s *)u64) +
+					   CNXK_SSO_WQE_SG_PTR);
+		cn10k_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, tstamp, true,
+					 (uint64_t *)tstamp_ptr);
+	}
+}
+
 static __rte_always_inline void
 cn10k_process_vwqe(uintptr_t vwqe, uint16_t port_id, const uint32_t flags,
 		   void *lookup_mem, void *tstamp, uintptr_t lbase)
 {
-	uint64_t mbuf_init = 0x10001ULL | RTE_PKTMBUF_HEADROOM |
-			     (flags & NIX_RX_OFFLOAD_TSTAMP_F ? 8 : 0);
+	uint64_t mbuf_init = 0x10001ULL | RTE_PKTMBUF_HEADROOM;
 	struct rte_event_vector *vec;
 	uint64_t aura_handle, laddr;
 	uint16_t nb_mbufs, non_vec;
@@ -133,6 +150,9 @@ cn10k_process_vwqe(uintptr_t vwqe, uint16_t port_id, const uint32_t flags,
 	for (i = OBJS_PER_CLINE; i < vec->nb_elem; i += OBJS_PER_CLINE)
 		rte_prefetch0(&vec->ptrs[i]);

+	if (flags & NIX_RX_OFFLOAD_TSTAMP_F && tstamp)
+		mbuf_init |= 8;
+
 	nb_mbufs =
Re: [PATCH] devtools: ensure proper tag sequence
12/06/2022 16:23, jpali...@marvell.com:
> From: Jakub Palider
>
> This change to log checking procedure ensures that all
> tags are in proper order.
> The order of tags is as follows:
> * Coverity issue
> * Bugzilla ID
> * Fixes
> * Cc
> *
> * Suggested-by
> * Reported-by
> + Signed-off-by
> * Acked-by
> * Reviewed-by
> * Tested-by

For the *-by sequence, the order is more chronological.
Ack/Review/Test can be in any order I think.
RE: [PATCH v7] net/i40e: add outer VLAN processing
Hi, Ben

This patch can only take effect after firmware v8.6. It cannot be used on
firmware v8.4 and v8.5. The reason is that the firmware team gave a clear
reply that there are many related bugs in firmware v8.4 and v8.5, and they
were fixed in firmware v8.6. The firmware team recommends using v8.6.

Regards
Kevin

From: Ben Magistro
Sent: June 10, 2022 22:27
To: Liu, KevinX
Cc: dev@dpdk.org; Zhang, Yuying ; Xing, Beilei ; Yang, SteveX ; Zhang, RobinX
Subject: Re: [PATCH v7] net/i40e: add outer VLAN processing

I'm trying to understand if this change is at all related to an issue we
are experiencing with newer firmwares
(https://mails.dpdk.org/archives/dev/2022-April/238621.html) that happened
to start with 8.4 and affected qinq offload processing. I can say loading
this patch and running testpmd does not seem to have any effect on the
issue we are experiencing.

Side note, there is also a minor typo in the comment block
"i40e_vlan_tpie_set" vs "i40e_vlan_tpid_set".

Thanks

On Fri, Jun 10, 2022 at 4:30 AM Kevin Liu <mailto:kevinx@intel.com> wrote:

From: Robin Zhang <mailto:robinx.zh...@intel.com>

Outer VLAN processing is supported after firmware v8.4, kernel driver
also change the default behavior to support this feature. To align with
kernel driver, add support for outer VLAN processing in DPDK.

But it is forbidden for firmware to change the Inner/Outer VLAN
configuration while there are MAC/VLAN filters in the switch table.
Therefore, we need to clear the MAC table before setting config, and
then restore the MAC table after setting.

This will not impact on an old firmware.
Signed-off-by: Robin Zhang <mailto:robinx.zh...@intel.com>
Signed-off-by: Kevin Liu <mailto:kevinx@intel.com>
---
 drivers/net/i40e/i40e_ethdev.c | 94 --
 drivers/net/i40e/i40e_ethdev.h |  3 ++
 2 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 755786dc10..4cae163cb9 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -2575,6 +2575,7 @@ i40e_dev_close(struct rte_eth_dev *dev)
 	struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
 	struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
 	struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
+	struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
 	struct i40e_filter_control_settings settings;
 	struct rte_flow *p_flow;
 	uint32_t reg;
@@ -2587,6 +2588,18 @@ i40e_dev_close(struct rte_eth_dev *dev)
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY)
 		return 0;

+	/*
+	 * It is a workaround, if the double VLAN is disabled when
+	 * the program exits, an abnormal error will occur on the
+	 * NIC. Need to enable double VLAN when dev is closed.
+	 */
+	if (pf->fw8_3gt) {
+		if (!(rxmode->offloads & RTE_ETH_RX_OFFLOAD_VLAN_EXTEND)) {
+			rxmode->offloads |= RTE_ETH_RX_OFFLOAD_VLAN_EXTEND;
+			i40e_vlan_offload_set(dev, RTE_ETH_VLAN_EXTEND_MASK);
+		}
+	}
+
 	ret = rte_eth_switch_domain_free(pf->switch_domain_id);
 	if (ret)
 		PMD_INIT_LOG(WARNING, "failed to free switch domain: %d", ret);
@@ -3909,6 +3922,7 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
 	struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
 	int qinq = dev->data->dev_conf.rxmode.offloads &
 		   RTE_ETH_RX_OFFLOAD_VLAN_EXTEND;
+	u16 sw_flags = 0, valid_flags = 0;
 	int ret = 0;

 	if ((vlan_type != RTE_ETH_VLAN_TYPE_INNER &&
@@ -3927,15 +3941,32 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
 	/* 802.1ad frames ability is added in NVM API 1.7*/
 	if (hw->flags & I40E_HW_FLAG_802_1AD_CAPABLE) {
 		if (qinq) {
+			if (pf->fw8_3gt) {
+				sw_flags = I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
+				valid_flags = I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
+			}
 			if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
 				hw->first_tag = rte_cpu_to_le_16(tpid);
 			else if (vlan_type == RTE_ETH_VLAN_TYPE_INNER)
 				hw->second_tag = rte_cpu_to_le_16(tpid);
 		} else {
-			if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
-				hw->second_tag = rte_cpu_to_le_16(tpid);
+			/*
+			 * If tpid is equal to 0x88A8, indicates that the
+			 * disable double VLAN operation is in progress.
+			 * Need set switch configuration back to default.
+			 */
+			if (pf->fw8_3gt && tpid == RTE_ETHER_TYPE_QINQ) {
+				sw_f
[PATCH v6] app/testpmd: add Host Shaper command
this patch is taken out from series of "introduce per-queue available
descriptor threshold and host shaper" to simplify the review, and it's
the last one for non-PMD change.
However it depends on a PMD commit for host shaper config API, should be
merged after PMD patches.

--
v6:
- add 'static' keyword for cmdline structs
- in AVAIL_THRESH event callback, add check to ensure it's mlx5 port

Spike Du (1):
  app/testpmd: add Host Shaper command

 app/test-pmd/testpmd.c          |  11 +++
 doc/guides/nics/mlx5.rst        |  46 +
 drivers/net/mlx5/meson.build    |   4 +
 drivers/net/mlx5/mlx5_testpmd.c | 206 
 drivers/net/mlx5/mlx5_testpmd.h |  26 +
 5 files changed, 293 insertions(+)
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.c
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.h

--
1.8.3.1
[PATCH v6] app/testpmd: add Host Shaper command
Add command line options to support host shaper configure.
- Command syntax:
  mlx5 set port host_shaper avail_thresh_triggered <0|1> rate

- Example commands:
  To enable avail_thresh_triggered on port 1 and disable current host shaper:
  testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 1 rate 0

  To disable avail_thresh_triggered and current host shaper on port 1:
  testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 0

  The rate unit is 100Mbps.

  To disable avail_thresh_triggered and configure a shaper of 5Gbps on port 1:
  testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 50

Add sample code to handle rxq available descriptor threshold event, it
delays a while so that rxq empties, then disables host shaper and rearms
available descriptor threshold event.

Signed-off-by: Spike Du
---
 app/test-pmd/testpmd.c          |  11 +++
 doc/guides/nics/mlx5.rst        |  46 +
 drivers/net/mlx5/meson.build    |   4 +
 drivers/net/mlx5/mlx5_testpmd.c | 206 
 drivers/net/mlx5/mlx5_testpmd.h |  26 +
 5 files changed, 293 insertions(+)
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.c
 create mode 100644 drivers/net/mlx5/mlx5_testpmd.h

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 33d9b85..e1ac75a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -69,6 +69,9 @@
 #ifdef RTE_NET_BOND
 #include
 #endif
+#ifdef RTE_NET_MLX5
+#include "mlx5_testpmd.h"
+#endif

 #include "testpmd.h"

@@ -3659,6 +3662,14 @@ struct pmd_test_command {
 			break;
 		printf("Received avail_thresh event, port:%d rxq_id:%d\n",
		       port_id, rxq_id);
+
+		struct rte_eth_dev_info dev_info;
+		if (rte_eth_dev_info_get(port_id, &dev_info) != 0 ||
+		    (strncmp(dev_info.driver_name, "mlx5", 4) != 0))
+			printf("%s\n", dev_info.driver_name);
+#ifdef RTE_NET_MLX5
+		mlx5_test_avail_thresh_event_handler(port_id, rxq_id);
+#endif
 		}
 		break;
 	default:
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index a1e13e7..b5a3ee3 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1727,3 +1727,49 @@ which can be installed from OFED mstflint package.
 Meson detects ``libmtcr_ul`` existence at configure stage.
 If the library is detected, the application must link with ``-lmtcr_ul``,
 as done by the pkg-config file libdpdk.pc.
+
+How to use available descriptor threshold and Host Shaper
+--
+
+There are sample command lines to configure available descriptor threshold in testpmd.
+Testpmd also contains sample logic to handle available descriptor threshold event.
+The typical workflow is: testpmd configure available descriptor threshold for Rx queues, enable
+avail_thresh_triggered in host shaper and register a callback, when traffic from host is
+too high and Rx queue emptiness is below available descriptor threshold, PMD receives an event and
+firmware configures a 100Mbps shaper on host port automatically, then PMD call
+the callback registered previously, which will delay a while to let Rx queue
+empty, then disable host shaper.
+
+Let's assume we have a simple Blue Field 2 setup: port 0 is uplink, port 1
+is VF representor. Each port has 2 Rx queues.
+In order to control traffic from host to ARM, we can enable available descriptor threshold in testpmd by:
+
+.. code-block:: console
+
+   testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 1 rate 0
+   testpmd> set port 1 rxq 0 avail_thresh 70
+   testpmd> set port 1 rxq 1 avail_thresh 70
+
+The first command disables current host shaper, and enables available descriptor threshold triggered mode.
+The left commands configure available descriptor threshold to 70% of Rx queue size for both Rx queues,
+When traffic from host is too high, you can see testpmd console prints log
+about available descriptor threshold event receiving, then host shaper is disabled.
+The traffic rate from host is controlled and less drop happens in Rx queues.
+
+When disable available descriptor threshold and avail_thresh_triggered, we can invoke below commands in testpmd:
+
+.. code-block:: console
+
+   testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 0
+   testpmd> set port 1 rxq 0 avail_thresh 0
+   testpmd> set port 1 rxq 1 avail_thresh 0
+
+It's recommended an application disables available descriptor threshold and avail_thresh_triggered before exit,
+if it enables them before.
+
+We can also configure the shaper with a value, the rate unit is 100Mbps, below
+command sets current shaper to 5Gbps and disables avail_thresh_triggered.
+
+.. code-block:: console
+
+   testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 50
diff --git a/drivers/net/mlx5/meson.build b/driver
RE: [PATCH] config/arm: add PHYTIUM ft2000plus
> -Original Message-
> From: luzhipeng
> Sent: Friday, June 10, 2022 1:47 PM
> To: dev@dpdk.org
> Cc: Jan Viktorin ; Ruifeng Wang
> ; Bruce Richardson
> ; luzhipeng
> Subject: [PATCH] config/arm: add PHYTIUM ft2000plus
>
> Here adds configs for PHYTIUM server.
>
> Signed-off-by: luzhipeng

I cannot find this patch in Patchwork.

> ---
>  config/arm/arm64_ft2000plus_linux_gcc | 16 
>  config/arm/meson.build                | 26 +-
>  2 files changed, 41 insertions(+), 1 deletion(-) create mode 100644
> config/arm/arm64_ft2000plus_linux_gcc
>
> diff --git a/config/arm/arm64_ft2000plus_linux_gcc
> b/config/arm/arm64_ft2000plus_linux_gcc
> new file mode 100644
> index 00..554517054e
> --- /dev/null
> +++ b/config/arm/arm64_ft2000plus_linux_gcc
> @@ -0,0 +1,16 @@
> +[binaries]
> +c = 'aarch64-linux-gnu-gcc'

You may want to add ccache support when below one is merged.
http://patches.dpdk.org/project/dpdk/patch/20220608171304.945454-1-jer...@marvell.com/

> +cpp = 'aarch64-linux-gnu-cpp'
> +ar = 'aarch64-linux-gnu-gcc-ar'
> +strip = 'aarch64-linux-gnu-strip'
> +pkgconfig = 'aarch64-linux-gnu-pkg-config'
> +pcap-config = ''
> +
> +[host_machine]
> +system = 'linux'
> +cpu_family = 'aarch64'
> +cpu = 'armv8-a'
> +endian = 'little'
> +
> +[properties]
> +platform = 'ft2000plus'
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index aa12eb76f4..48e5f6af5b 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -218,6 +218,20 @@ implementer_qualcomm = {
>      }
>  }
>
> +implementer_phytium = {
> +    'description': 'PHYTIUM',
> +    'flags': [
> +        ['RTE_MACHINE', '"armv8a"'],
> +        ['RTE_USE_C11_MEM_MODEL', true],
> +        ['RTE_CACHE_LINE_SIZE', 64],
> +        ['RTE_MAX_LCORE', 64],
> +        ['RTE_MAX_NUMA_NODES', 8]
> +    ],
> +    'part_number_config': {
> +        '0x662': {'machine_args': ['-march=armv8-a+crc']}
> +    }
> +}
> +
>  ## Arm implementers (ID from MIDR in Arm Architecture Reference Manual)
>  implementers = {
>      'generic': implementer_generic,
> @@ -225,7 +239,8 @@ implementers = {
>      '0x43': implementer_cavium,
>      '0x48': implementer_hisilicon,
>      '0x50': implementer_ampere,
> -    '0x51': implementer_qualcomm
> +    '0x51': implementer_qualcomm,
> +    '0x70': implementer_phytium
>  }
>
>  # SoC specific armv8 flags have the highest priority
> @@ -378,6 +393,13 @@ soc_thunderxt83 = {
>      'part_number': '0xa3'
>  }
>
> +soc_ft2000plus = {
> +    'description': 'PHYTIUM ft2000plus',
> +    'implementer': '0x70',
> +    'part_number': '0x662',
> +    'numa': true
> +}
> +
>  '''
>  Start of SoCs list
>  generic: Generic un-optimized build for armv8 aarch64 execution mode.
> @@ -398,6 +420,7 @@ stingray: Broadcom Stingray
>  thunderx2: Marvell ThunderX2 T99
>  thunderxt88: Marvell ThunderX T88
>  thunderxt83: Marvell ThunderX T83
> +ft2000plus: PHYTIUM ft2000plus
>  End of SoCs list
>  '''
>  # The string above is included in the documentation, keep it in sync with the
> @@ -421,6 +444,7 @@ socs = {
>      'thunderx2': soc_thunderx2,
>      'thunderxt88': soc_thunderxt88,
>      'thunderxt83': soc_thunderxt83,
> +    'ft2000plus': soc_ft2000plus,
>  }
>
>  dpdk_conf.set('RTE_ARCH_ARM', 1)
> --
> 2.27.0
>
>
RE: [EXT] [PATCH] test/crypto: fix warnings for optimization=1 build
Hi Rahul,

Please see inline.

Thanks,
Anoob

> -Original Message-
> From: Rahul Lakkireddy
> Sent: Saturday, June 11, 2022 3:40 AM
> To: dev@dpdk.org
> Cc: Akhil Goyal ; roy.fan.zh...@intel.com;
> sta...@dpdk.org
> Subject: [EXT] [PATCH] test/crypto: fix warnings for optimization=1 build
>
> External Email
>
> --
> Skip IPSec ESN and antireplay cases, if there are no packets. Fixes following
> warning when using optimization=1 build flag with GCC 11.
>
> ../app/test/test_cryptodev.c: In function ‘test_ipsec_pkt_replay’:
> ../app/test/test_cryptodev.c:10074:15: warning: ‘td_outb’ may be used
> uninitialized [-Wmaybe-uninitialized]
>     ret = test_ipsec_proto_process(td_outb, td_inb, nb_pkts, true,
>     ^~~~
>     &flags);
>     ~~~
> ../app/test/test_cryptodev.c:9150:1: note: by argument 1 of type ‘const
> struct ipsec_test_data[]’ to ‘test_ipsec_proto_process’ declared here
> test_ipsec_proto_process(const struct ipsec_test_data td[],
> ^~~~
> ../app/test/test_cryptodev.c:10056:32: note: ‘td_outb’ declared here
>     struct ipsec_test_data td_outb[IPSEC_TEST_PACKETS_MAX];
>     ^~~
>
> Fixes: d02c6bfcb99a ("test/crypto: add ESN and antireplay cases")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Rahul Lakkireddy
> ---
>  app/test/test_cryptodev.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
> index 524019ce0e..47ce3d8420 100644
> --- a/app/test/test_cryptodev.c
> +++ b/app/test/test_cryptodev.c
> @@ -10058,6 +10058,9 @@ test_ipsec_pkt_replay(const void *test_data,
> const uint64_t esn[],
>  	struct ipsec_test_flags flags;
>  	uint32_t i = 0, ret = 0;
>
> +	if (nb_pkts == 0)
> +		return TEST_SKIPPED;
> +

[Anoob] nb_pkts == 0 would indicate a badly written test case. Might be
better to return TEST_FAILED in that case.

>  	memset(&flags, 0, sizeof(flags));
>  	flags.antireplay = true;
>
> --
> 2.27.0
Re: [PATCH 05/12] net/enetfec: fix build with GCC 12
Hello David,

I understood and agree with your suggestion. We are using GCC 11.3 where
we were not seeing this warning. We will fix this on priority and submit
the patch asap.

regards,
Sachin Saxena

On 6/10/2022 6:38 PM, David Marchand wrote:
> On Wed, May 18, 2022 at 12:17 PM David Marchand wrote:
>> GCC 12 raises the following warning:
>>
>> ../drivers/net/enetfec/enet_ethdev.c: In function ‘enetfec_rx_queue_setup’:
>> ../drivers/net/enetfec/enet_ethdev.c:473:9: error: array subscript 1
>> is above array bounds of ‘uint32_t[1]’ {aka ‘unsigned int[1]’}
>> [-Werror=array-bounds]
>>   473 |  rte_write32(rte_cpu_to_le_32(fep->bd_addr_p_r[queue_idx]),
>>       |  ^~
>>   474 |  (uint8_t *)fep->hw_baseaddr_v + ENETFEC_RD_START(queue_idx));
>>       |
>> In file included from ../drivers/net/enetfec/enet_ethdev.c:9:
>> ../drivers/net/enetfec/enet_ethdev.h:113:33: note: while referencing ‘bd_addr_p_r’
>>   113 |  uint32_t bd_addr_p_r[ENETFEC_MAX_Q];
>>       |  ^~~
>>
>> This driver properly announces that it only supports 1 rxq.
>> Silence this warning by adding an explicit check on the queue id.
>>
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: David Marchand
>
> Any comment from driver maintainers?
> Thanks.
Re: [PATCH] event/cnxk: add free for Tx adapter
On Mon, May 30, 2022 at 6:49 PM Volodymyr Fialko wrote:
>
> Tx adapter allocate data during eth_tx_adapter_queue_add() call and
> it's only cleaned but not freed during eth_tx_adapter_queue_del().
> Implemented eth_tx_adapter_free() callback to free adapter data.
>
> Signed-off-by: Volodymyr Fialko

Applied to dpdk-next-net-eventdev/for-main. Thanks

> ---
> drivers/event/cnxk/cn10k_eventdev.c | 3 +++
> drivers/event/cnxk/cn9k_eventdev.c | 3 +++
> drivers/event/cnxk/cnxk_eventdev.h | 4 +++
> drivers/event/cnxk/cnxk_eventdev_adptr.c | 34 
> 4 files changed, 44 insertions(+)
>
> diff --git a/drivers/event/cnxk/cn10k_eventdev.c
> b/drivers/event/cnxk/cn10k_eventdev.c
> index 214eca4239..110c7af439 100644
> --- a/drivers/event/cnxk/cn10k_eventdev.c
> +++ b/drivers/event/cnxk/cn10k_eventdev.c
> @@ -878,6 +878,9 @@ static struct eventdev_ops cn10k_sso_dev_ops = {
>         .eth_tx_adapter_caps_get = cn10k_sso_tx_adapter_caps_get,
>         .eth_tx_adapter_queue_add = cn10k_sso_tx_adapter_queue_add,
>         .eth_tx_adapter_queue_del = cn10k_sso_tx_adapter_queue_del,
> +       .eth_tx_adapter_start = cnxk_sso_tx_adapter_start,
> +       .eth_tx_adapter_stop = cnxk_sso_tx_adapter_stop,
> +       .eth_tx_adapter_free = cnxk_sso_tx_adapter_free,
>
>         .timer_adapter_caps_get = cnxk_tim_caps_get,
>
> diff --git a/drivers/event/cnxk/cn9k_eventdev.c
> b/drivers/event/cnxk/cn9k_eventdev.c
> index 076847d9a4..bf7bbc2c43 100644
> --- a/drivers/event/cnxk/cn9k_eventdev.c
> +++ b/drivers/event/cnxk/cn9k_eventdev.c
> @@ -1110,6 +1110,9 @@ static struct eventdev_ops cn9k_sso_dev_ops = {
>         .eth_tx_adapter_caps_get = cn9k_sso_tx_adapter_caps_get,
>         .eth_tx_adapter_queue_add = cn9k_sso_tx_adapter_queue_add,
>         .eth_tx_adapter_queue_del = cn9k_sso_tx_adapter_queue_del,
> +       .eth_tx_adapter_start = cnxk_sso_tx_adapter_start,
> +       .eth_tx_adapter_stop = cnxk_sso_tx_adapter_stop,
> +       .eth_tx_adapter_free = cnxk_sso_tx_adapter_free,
>
>         .timer_adapter_caps_get = cnxk_tim_caps_get,
>
> diff --git a/drivers/event/cnxk/cnxk_eventdev.h
> b/drivers/event/cnxk/cnxk_eventdev.h
> index e7cd90095d..baee5d11f0 100644
> --- a/drivers/event/cnxk/cnxk_eventdev.h
> +++ b/drivers/event/cnxk/cnxk_eventdev.h
> @@ -113,6 +113,7 @@ struct cnxk_sso_evdev {
>         uint16_t max_port_id;
>         uint16_t max_queue_id[RTE_MAX_ETHPORTS];
>         uint8_t tx_adptr_configured;
> +       uint32_t tx_adptr_active_mask;
>         uint16_t tim_adptr_ring_cnt;
>         uint16_t *timer_adptr_rings;
>         uint64_t *timer_adptr_sz;
> @@ -310,5 +311,8 @@ int cnxk_sso_tx_adapter_queue_add(const struct rte_eventdev *event_dev,
>  int cnxk_sso_tx_adapter_queue_del(const struct rte_eventdev *event_dev,
>                                   const struct rte_eth_dev *eth_dev,
>                                   int32_t tx_queue_id);
> +int cnxk_sso_tx_adapter_start(uint8_t id, const struct rte_eventdev *event_dev);
> +int cnxk_sso_tx_adapter_stop(uint8_t id, const struct rte_eventdev *event_dev);
> +int cnxk_sso_tx_adapter_free(uint8_t id, const struct rte_eventdev *event_dev);
>
>  #endif /* __CNXK_EVENTDEV_H__ */
> diff --git a/drivers/event/cnxk/cnxk_eventdev_adptr.c
> b/drivers/event/cnxk/cnxk_eventdev_adptr.c
> index fa96090bfa..586a7751e2 100644
> --- a/drivers/event/cnxk/cnxk_eventdev_adptr.c
> +++ b/drivers/event/cnxk/cnxk_eventdev_adptr.c
> @@ -589,3 +589,37 @@ cnxk_sso_tx_adapter_queue_del(const struct rte_eventdev *event_dev,
>
>         return 0;
>  }
> +
> +int
> +cnxk_sso_tx_adapter_start(uint8_t id, const struct rte_eventdev *event_dev)
> +{
> +       struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> +
> +       dev->tx_adptr_active_mask |= (1 << id);
> +
> +       return 0;
> +}
> +
> +int
> +cnxk_sso_tx_adapter_stop(uint8_t id, const struct rte_eventdev *event_dev)
> +{
> +       struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> +
> +       dev->tx_adptr_active_mask &= ~(1 << id);
> +
> +       return 0;
> +}
> +
> +int
> +cnxk_sso_tx_adapter_free(uint8_t id __rte_unused,
> +                        const struct rte_eventdev *event_dev)
> +{
> +       struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> +
> +       if (dev->tx_adptr_data_sz && dev->tx_adptr_active_mask == 0) {
> +               dev->tx_adptr_data_sz = 0;
> +               free(dev->tx_adptr_data);
> +       }
> +
> +       return 0;
> +}
> --
> 2.25.1
>
[RFC PATCH v1] net/i40e: put mempool cache out of API
Refer to "i40e_tx_free_bufs_avx512", this patch puts mempool cache
out of API to free buffers directly.

There are two changes different with previous version:
1. change txep from "i40e_entry" to "i40e_vec_entry"
2. put cache out of "mempool_bulk" API to copy buffers into it directly

Performance Test with l3fwd neon path:
               with this patch
n1sdp:         no perforamnce change
amper-altra:   +4.0%

Suggested-by: Konstantin Ananyev
Suggested-by: Honnappa Nagarahalli
Signed-off-by: Feifei Wang
---
 drivers/net/i40e/i40e_rxtx_vec_common.h | 36 -
 drivers/net/i40e/i40e_rxtx_vec_neon.c   | 10 ---
 2 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h b/drivers/net/i40e/i40e_rxtx_vec_common.h
index 959832ed6a..e418225b4e 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -81,7 +81,7 @@ reassemble_packets(struct i40e_rx_queue *rxq, struct rte_mbuf **rx_bufs,
 static __rte_always_inline int
 i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 {
-	struct i40e_tx_entry *txep;
+	struct i40e_vec_tx_entry *txep;
 	uint32_t n;
 	uint32_t i;
 	int nb_free = 0;
@@ -98,17 +98,39 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 	/* first buffer to free from S/W ring is at index
 	 * tx_next_dd - (tx_rs_thresh-1)
 	 */
-	txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];
+	txep = (void *)txq->sw_ring;
+	txep += txq->tx_next_dd - (n - 1);

 	if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
-		for (i = 0; i < n; i++) {
-			free[i] = txep[i].mbuf;
-			/* no need to reset txep[i].mbuf in vector path */
+		struct rte_mempool *mp = txep[0].mbuf->pool;
+		void **cache_objs;
+		struct rte_mempool_cache *cache = rte_mempool_default_cache(mp,
+				rte_lcore_id());
+
+		if (!cache || cache->len == 0)
+			goto normal;
+
+		cache_objs = &cache->objs[cache->len];
+
+		if (n > RTE_MEMPOOL_CACHE_MAX_SIZE) {
+			rte_mempool_ops_enqueue_bulk(mp, (void *)txep, n);
+			goto done;
+		}
+
+		rte_memcpy(cache_objs, txep, sizeof(void *) * n);
+		/* no need to reset txep[i].mbuf in vector path */
+		cache->len += n;
+
+		if (cache->len >= cache->flushthresh) {
+			rte_mempool_ops_enqueue_bulk
+				(mp, &cache->objs[cache->size],
+				cache->len - cache->size);
+			cache->len = cache->size;
 		}
-		rte_mempool_put_bulk(free[0]->pool, (void **)free, n);
 		goto done;
 	}

+normal:
 	m = rte_pktmbuf_prefree_seg(txep[0].mbuf);
 	if (likely(m != NULL)) {
 		free[0] = m;
@@ -147,7 +169,7 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)
 }

 static __rte_always_inline void
-tx_backlog_entry(struct i40e_tx_entry *txep,
+tx_backlog_entry(struct i40e_vec_tx_entry *txep,
 		 struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
 	int i;
diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 12e6f1cbcb..d2d61e8ef4 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -680,12 +680,15 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 {
 	struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-	struct i40e_tx_entry *txep;
+	struct i40e_vec_tx_entry *txep;
 	uint16_t n, nb_commit, tx_id;
 	uint64_t flags = I40E_TD_CMD;
 	uint64_t rs = I40E_TX_DESC_CMD_RS | I40E_TD_CMD;
 	int i;

+	/* cross rx_thresh boundary is not allowed */
+	nb_pkts = RTE_MIN(nb_pkts, txq->tx_rs_thresh);
+
 	if (txq->nb_tx_free < txq->tx_free_thresh)
 		i40e_tx_free_bufs(txq);

@@ -695,7 +698,8 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 	tx_id = txq->tx_tail;
 	txdp = &txq->tx_ring[tx_id];
-	txep = &txq->sw_ring[tx_id];
+	txep = (void *)txq->sw_ring;
+	txep += tx_id;

 	txq->nb_tx_free = (uint16_t)(txq->nb_tx_free - nb_pkts);

@@ -715,7 +719,7 @@ i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
 		/* avoid reach the end of ring */
 		txdp = &txq->tx_ring[tx_id];
-		txep = &txq->sw_ring[tx_id];
+		txep = (void *)txq->sw_ring;
 	}

 	tx_backlog_entry(txep, tx_pkts, nb_commit);
--
2.25.1
RE: [PATCH] config/arm: add PHYTIUM ft2000plus
> -Original Message- > From: luzhipeng > Sent: Monday, June 13, 2022 11:56 AM > To: Ruifeng Wang ; dev@dpdk.org > Cc: Jan Viktorin ; Bruce Richardson > ; nd > Subject: Re: [PATCH] config/arm: add PHYTIUM ft2000plus > > Thanks > > On 2022/6/13 11:39, Ruifeng Wang wrote: > >> -Original Message- > >> From: luzhipeng > >> Sent: Friday, June 10, 2022 1:47 PM > >> To: dev@dpdk.org > >> Cc: Jan Viktorin ; Ruifeng Wang > >> ; Bruce Richardson > >> ; luzhipeng > >> Subject: [PATCH] config/arm: add PHYTIUM ft2000plus > >> > >> Here adds configs for PHYTIUM server. > >> > >> Signed-off-by: luzhipeng > > > > I cannot find this patch in Patchwork. > I sent the patch and received the message: > > Your mail to 'dev' with the subject > > [PATCH] config/arm: add PHYTIUM ft2000plus > > Is being held until the list moderator can review it for approval. > > The reason it is being held: > > Post by non-member to a members-only list > OK. Then you need to register to the dev@dpdk.org mailing list. https://core.dpdk.org/contribute/#send > > > > > >> --- > >> config/arm/arm64_ft2000plus_linux_gcc | 16 > >> config/arm/meson.build| 26 +- > >> 2 files changed, 41 insertions(+), 1 deletion(-) create mode > >> 100644 config/arm/arm64_ft2000plus_linux_gcc > >> > >> diff --git a/config/arm/arm64_ft2000plus_linux_gcc > >> b/config/arm/arm64_ft2000plus_linux_gcc > >> new file mode 100644 > >> index 00..554517054e > >> --- /dev/null > >> +++ b/config/arm/arm64_ft2000plus_linux_gcc > >> @@ -0,0 +1,16 @@ > >> +[binaries] > >> +c = 'aarch64-linux-gnu-gcc' > > > > You may want to add ccache support when below one is merged. 
> > http://patches.dpdk.org/project/dpdk/patch/20220608171304.945454-1- > jer > > i...@marvell.com/ > > > OK, > >> +cpp = 'aarch64-linux-gnu-cpp' > >> +ar = 'aarch64-linux-gnu-gcc-ar' > >> +strip = 'aarch64-linux-gnu-strip' > >> +pkgconfig = 'aarch64-linux-gnu-pkg-config' > >> +pcap-config = '' > >> + > >> +[host_machine] > >> +system = 'linux' > >> +cpu_family = 'aarch64' > >> +cpu = 'armv8-a' > >> +endian = 'little' > >> + > >> +[properties] > >> +platform = 'ft2000plus' > >> diff --git a/config/arm/meson.build b/config/arm/meson.build index > >> aa12eb76f4..48e5f6af5b 100644 > >> --- a/config/arm/meson.build > >> +++ b/config/arm/meson.build > >> @@ -218,6 +218,20 @@ implementer_qualcomm = { > >> } > >> } > >> > >> +implementer_phytium = { > >> +'description': 'PHYTIUM', > >> +'flags': [ > >> +['RTE_MACHINE', '"armv8a"'], > >> +['RTE_USE_C11_MEM_MODEL', true], > >> +['RTE_CACHE_LINE_SIZE', 64], > >> +['RTE_MAX_LCORE', 64], > >> +['RTE_MAX_NUMA_NODES', 8] > >> +], > >> +'part_number_config': { > >> +'0x662': {'machine_args': ['-march=armv8-a+crc']} > >> +} > >> +} > >> + > >> ## Arm implementers (ID from MIDR in Arm Architecture Reference > >> Manual) implementers = { > >> 'generic': implementer_generic, @@ -225,7 +239,8 @@ > >> implementers = { > >> '0x43': implementer_cavium, > >> '0x48': implementer_hisilicon, > >> '0x50': implementer_ampere, > >> -'0x51': implementer_qualcomm > >> +'0x51': implementer_qualcomm, > >> +'0x70': implementer_phytium > >> } > >> > >> # SoC specific armv8 flags have the highest priority @@ -378,6 > >> +393,13 @@ > >> soc_thunderxt83 = { > >> 'part_number': '0xa3' > >> } > >> > >> +soc_ft2000plus = { > >> +'description': 'PHYTIUM ft2000plus', > >> +'implementer': '0x70', > >> +'part_number': '0x662', > >> +'numa': true > >> +} > >> + > >> ''' > >> Start of SoCs list > >> generic: Generic un-optimized build for armv8 aarch64 execution > mode. 
> >> @@ -398,6 +420,7 @@ stingray:Broadcom Stingray > >> thunderx2: Marvell ThunderX2 T99 > >> thunderxt88: Marvell ThunderX T88 > >> thunderxt83: Marvell ThunderX T83 > >> +ft2000plus: PHYTIUM ft2000plus > >> End of SoCs list > >> ''' > >> # The string above is included in the documentation, keep it in > >> sync with the @@ -421,6 +444,7 @@ socs = { > >> 'thunderx2': soc_thunderx2, > >> 'thunderxt88': soc_thunderxt88, > >> 'thunderxt83': soc_thunderxt83, > >> +'ft2000plus': soc_ft2000plus, > >> } > >> > >> dpdk_conf.set('RTE_ARCH_ARM', 1) > >> -- > >> 2.27.0 > >> > >> > > > > > > >
Re: [PATCH] app/eventdev: add Tx first option to pipeline mode
On Wed, May 25, 2022 at 2:31 PM wrote: > > From: Pavan Nikhilesh > > Add Tx first support to pipeline mode tests, the transmition is done Fixed transmission typo. Acked-by: Jerin Jacob Updated the git commit as follows and applied to dpdk-next-net-eventdev/for-main. Thanks app/eventdev: add Tx first option to pipeline mode Add Tx first support to pipeline mode tests, the transmission is done on all the ethernet ports. This helps in testing eventdev performance with standalone loopback interfaces. Example: ./dpdk-test-eventdev ... -- ... --tx_first 512 512 defines the number of packets to transmit. Add an option Tx packet size, the default packet size is 64. Example: ./dpdk-test-eventdev ... -- ... --tx_first 512 --tx_pkt_sz 320 Signed-off-by: Pavan Nikhilesh Acked-by: Jerin Jacob > on all the ethernet ports. This helps in testing eventdev performance > with standalone loopback interfaces. > > Example: > ./dpdk-test-eventdev ... -- ... --tx_first 512 > > 512 defines the number of packets to transmit. > > Add an option Tx packet size, the default packet size is 64. > > Example: > ./dpdk-test-eventdev ... -- ... 
--tx_first 512 --tx_pkt_sz 320 > > Signed-off-by: Pavan Nikhilesh > --- > app/test-eventdev/evt_common.h | 1 + > app/test-eventdev/evt_options.c | 28 > app/test-eventdev/evt_options.h | 2 + > app/test-eventdev/test_pipeline_common.c | 165 ++- > app/test-eventdev/test_pipeline_common.h | 2 + > doc/guides/tools/testeventdev.rst| 21 ++- > 6 files changed, 214 insertions(+), 5 deletions(-) > > diff --git a/app/test-eventdev/evt_common.h b/app/test-eventdev/evt_common.h > index 2f301a7e79..e304998cd7 100644 > --- a/app/test-eventdev/evt_common.h > +++ b/app/test-eventdev/evt_common.h > @@ -65,6 +65,7 @@ struct evt_options { > uint16_t eth_queues; > uint32_t nb_flows; > uint32_t tx_first; > + uint16_t tx_pkt_sz; > uint32_t max_pkt_sz; > uint32_t prod_enq_burst_sz; > uint32_t deq_tmo_nsec; > diff --git a/app/test-eventdev/evt_options.c b/app/test-eventdev/evt_options.c > index d3c704d2b3..fb17f5a159 100644 > --- a/app/test-eventdev/evt_options.c > +++ b/app/test-eventdev/evt_options.c > @@ -106,6 +106,26 @@ evt_parse_eth_prod_type(struct evt_options *opt, const > char *arg __rte_unused) > return 0; > } > > +static int > +evt_parse_tx_first(struct evt_options *opt, const char *arg __rte_unused) > +{ > + int ret; > + > + ret = parser_read_uint32(&(opt->tx_first), arg); > + > + return ret; > +} > + > +static int > +evt_parse_tx_pkt_sz(struct evt_options *opt, const char *arg __rte_unused) > +{ > + int ret; > + > + ret = parser_read_uint16(&(opt->tx_pkt_sz), arg); > + > + return ret; > +} > + > static int > evt_parse_timer_prod_type(struct evt_options *opt, const char *arg > __rte_unused) > { > @@ -376,6 +396,10 @@ usage(char *program) > "\t--vector_size : Max vector size.\n" > "\t--vector_tmo_ns: Max vector timeout in nanoseconds\n" > "\t--per_port_pool: Configure unique pool per ethdev > port\n" > + "\t--tx_first : Transmit given number of packets\n" > + " across all the ethernet devices > before\n" > + " event workers start.\n" > + "\t--tx_pkt_sz: Packet size to use with 
Tx first." > ); > printf("available tests:\n"); > evt_test_dump_names(); > @@ -456,6 +480,8 @@ static struct option lgopts[] = { > { EVT_VECTOR_TMO, 1, 0, 0 }, > { EVT_PER_PORT_POOL, 0, 0, 0 }, > { EVT_HELP,0, 0, 0 }, > + { EVT_TX_FIRST,1, 0, 0 }, > + { EVT_TX_PKT_SZ, 1, 0, 0 }, > { NULL,0, 0, 0 } > }; > > @@ -497,6 +523,8 @@ evt_opts_parse_long(int opt_idx, struct evt_options *opt) > { EVT_VECTOR_SZ, evt_parse_vector_size}, > { EVT_VECTOR_TMO, evt_parse_vector_tmo_ns}, > { EVT_PER_PORT_POOL, evt_parse_per_port_pool}, > + { EVT_TX_FIRST, evt_parse_tx_first}, > + { EVT_TX_PKT_SZ, evt_parse_tx_pkt_sz}, > }; > > for (i = 0; i < RTE_DIM(parsermap); i++) { > diff --git a/app/test-eventdev/evt_options.h b/app/test-eventdev/evt_options.h > index 2231c58801..4aa06976b7 100644 > --- a/app/test-eventdev/evt_options.h > +++ b/app/test-eventdev/evt_options.h > @@ -51,6 +51,8 @@ > #define EVT_VECTOR_SZ("vector_size") > #define EVT_VECTOR_TMO ("vector_tmo_ns") > #define EVT_PER_PORT_POOL ("per_port_pool") > +#define EVT_TX_FIRST("tx_first") > +#define EVT_TX_PKT_SZ ("tx_pkt_sz") > #define EVT_HELP ("help") > > void evt_options_default(struct evt_options *opt)
Re: [PATCH v1 0/5] Direct re-arming of buffers on receive side
> -Original Message- > From: Konstantin Ananyev > Sent: Tuesday, May 24, 2022 9:26 AM > To: Feifei Wang > Cc: nd ; dev@dpdk.org; Ruifeng Wang > ; Honnappa Nagarahalli > > Subject: Re: [PATCH v1 0/5] Direct re-arming of buffers on receive side > > 16/05/2022 07:10, Feifei Wang wrote: > > > >>> Currently, the transmit side frees the buffers into the lcore cache > >>> and the receive side allocates buffers from the lcore cache. The > >>> transmit side typically frees 32 buffers resulting in 32*8=256B of > >>> stores to lcore cache. The receive side allocates 32 buffers and > >>> stores them in the receive side software ring, resulting in > >>> 32*8=256B of stores and 256B of load from the lcore cache. > >>> > >>> This patch proposes a mechanism to avoid freeing to/allocating from > >>> the lcore cache. i.e. the receive side will free the buffers from > >>> the transmit side directly into its software ring. This will avoid the > >>> 256B of loads and stores introduced by the lcore cache. It also > >>> frees up the cache lines used by the lcore cache. > >>> > >>> However, this solution poses several constraints: > >>> > >>> 1)The receive queue needs to know which transmit queue it should > >>> take the buffers from. The application logic decides which transmit > >>> port to use to send out the packets. In many use cases the NIC might > >>> have a single port ([1], [2], [3]), in which case a given transmit > >>> queue is always mapped to a single receive queue (1:1 Rx queue: Tx > >>> queue). This is easy to configure. > >>> > >>> If the NIC has 2 ports (there are several references), then we will > >>> have > >>> 1:2 (RX queue: TX queue) mapping which is still easy to configure. > >>> However, if this is generalized to 'N' ports, the configuration can > >>> be long. 
Moreover, the PMD would have to scan a list of transmit > >>> queues to pull the buffers from. > > > >> Just to re-iterate some generic concerns about this proposal: > >> - We effectively link RX and TX queues - when this feature is enabled, > >> user can't stop TX queue without stopping linked RX queue first. > >> Right now user is free to start/stop any queues at his will. > >> If that feature will allow to link queues from different ports, > >> then even ports will become dependent and user will have to pay extra > >> care when managing such ports. > > > > [Feifei] When direct rearm is enabled, there are two paths for the thread to > > choose. If there are enough Tx freed buffers, Rx can put buffers from > > Tx. > > Otherwise, Rx will put buffers from mempool as usual. Thus, users do > > not need to pay much attention to managing ports. > > What I am talking about: right now different port or different queues of the > same port can be treated as independent entities: > in general user is free to start/stop (and even reconfigure in some > cases) one entity without need to stop other entity. > I.E user can stop and re-configure TX queue while keep receiving packets > from RX queue. > With direct re-arm enabled, I think it wouldn't be possible any more: > before stopping/reconfiguring TX queue user would have to make sure that > corresponding RX queue wouldn't be used by datapath. > > > > >> - very limited usage scenario - it will have a positive effect only > >>when we have a fixed forwarding mapping: all (or nearly all) packets > >>from the RX queue are forwarded into the same TX queue. > > > > [Feifei] Although the usage scenario is limited, this usage scenario > > has a wide range of applications, such as NIC with one port. > > yes, there are NICs with one port, but no guarantee there wouldn't be several > such NICs within the system. > > > Furthermore, I think this is a tradeoff between performance and > > flexibility. 
> > Our goal is to achieve best performance, this means we need to give up > > some flexibility decisively. For example, 'FAST_FREE Mode' > > deletes most of the buffer checks (refcnt > 1, external buffer, chain > > buffer), chooses a shortest path, and then achieves a significant performance > improvement. > >> Wonder did you have a chance to consider mempool-cache ZC API, similar > >> to one we have for the ring? > >> It would allow us on TX free path to avoid copying mbufs to temporary > >> array on the stack. > >> Instead we can put them straight from TX SW ring to the mempool cache. > >> That should save extra store/load for mbuf and might help to achieve > >> some performance gain without by-passing mempool. > >> It probably wouldn't be as fast as what you proposing, but might be > >> fast enough to consider as alternative. > >> Again, it would be a generic one, so we can avoid all these > >> implications and limitations. > > > > [Feifei] I think this is a good try. However, the most important thing > > i
Re: [PATCH 1/2] app/eventdev: use mempool cache for vector pool
On Fri, Jun 10, 2022 at 7:14 PM Jerin Jacob wrote: > > On Mon, May 23, 2022 at 3:30 PM wrote: > > > > From: Pavan Nikhilesh > > > > Use mempool cache for vector mempool as vectors are freed by the Tx > > routine, also increase the minimum pool size to 512 to avoid resource > > contention on Rx. > > > > Signed-off-by: Pavan Nikhilesh > > Acked-by: Jerin Jacob Series applied to dpdk-next-net-eventdev/for-main. Thanks > > > > --- > > app/test-eventdev/test_pipeline_common.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/app/test-eventdev/test_pipeline_common.c > > b/app/test-eventdev/test_pipeline_common.c > > index c66656cd39..856a2f1a52 100644 > > --- a/app/test-eventdev/test_pipeline_common.c > > +++ b/app/test-eventdev/test_pipeline_common.c > > @@ -338,9 +338,10 @@ pipeline_event_rx_adapter_setup(struct evt_options > > *opt, uint8_t stride, > > if (opt->ena_vector) { > > unsigned int nb_elem = (opt->pool_sz / opt->vector_size) << > > 1; > > > > - nb_elem = nb_elem ? nb_elem : 1; > > + nb_elem = RTE_MAX(512U, nb_elem); > > + nb_elem += evt_nr_active_lcores(opt->wlcores) * 32; > > vector_pool = rte_event_vector_pool_create( > > - "vector_pool", nb_elem, 0, opt->vector_size, > > + "vector_pool", nb_elem, 32, opt->vector_size, > > opt->socket_id); > > if (vector_pool == NULL) { > > evt_err("failed to create event vector pool"); > > -- > > 2.25.1 > >
Re: [PATCH] event/cnxk: fix incorrect return value
On Wed, May 18, 2022 at 7:34 PM wrote: > > From: Pavan Nikhilesh > > The `rte_event_eth_tx_adapter_enqueue()` function expects the driver layer > to return the total number of events successfully transmitted. > Fix the cn10k driver returning the number of packets transmitted in an > event vector instead of the number of events. > > Fixes: 761a321acf91 ("event/cnxk: support vectorized Tx event fast path") > Cc: sta...@dpdk.org > > Signed-off-by: Pavan Nikhilesh Applied to dpdk-next-net-eventdev/for-main. Thanks > --- > drivers/event/cnxk/cn10k_worker.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/event/cnxk/cn10k_worker.h > b/drivers/event/cnxk/cn10k_worker.h > index 034f508dd8..0915f404e0 100644 > --- a/drivers/event/cnxk/cn10k_worker.h > +++ b/drivers/event/cnxk/cn10k_worker.h > @@ -651,7 +651,7 @@ cn10k_sso_hws_event_tx(struct cn10k_sso_hws *ws, struct > rte_event *ev, > } > rte_mempool_put(rte_mempool_from_obj(ev->vec), ev->vec); > rte_prefetch0(ws); > - return (meta & 0xffff); > + return 1; > } > > m = ev->mbuf; > -- > 2.25.1 >
Re: [PATCH] drivers: fix cnxk event qos devarg processing
On Fri, May 20, 2022 at 12:41 PM Shijith Thotton wrote: > > Fixed qos parameters getting overwritten and the IAQ/TAQ threshold > calculation. > > Fixes: 910da32c53a9 ("event/cnxk: add device start") > Cc: sta...@dpdk.org > > Signed-off-by: Shijith Thotton Applied to dpdk-next-net-eventdev/for-main. Thanks > --- > drivers/common/cnxk/roc_sso.c | 4 ++-- > drivers/event/cnxk/cnxk_eventdev.c | 8 > 2 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/common/cnxk/roc_sso.c b/drivers/common/cnxk/roc_sso.c > index 358d37a9f2..8251c967fb 100644 > --- a/drivers/common/cnxk/roc_sso.c > +++ b/drivers/common/cnxk/roc_sso.c > @@ -406,10 +406,10 @@ roc_sso_hwgrp_qos_config(struct roc_sso *roc_sso, > struct roc_sso_hwgrp_qos *qos, > } > req->grp = qos[i].hwgrp; > req->xaq_limit = (nb_xaq * (xaq_prcnt ? xaq_prcnt : 100)) / > 100; > - req->taq_thr = (SSO_HWGRP_IAQ_MAX_THR_MASK * > + req->iaq_thr = (SSO_HWGRP_IAQ_MAX_THR_MASK * > (iaq_prcnt ? iaq_prcnt : 100)) / >100; > - req->iaq_thr = (SSO_HWGRP_TAQ_MAX_THR_MASK * > + req->taq_thr = (SSO_HWGRP_TAQ_MAX_THR_MASK * > (taq_prcnt ? 
taq_prcnt : 100)) / >100; > } > diff --git a/drivers/event/cnxk/cnxk_eventdev.c > b/drivers/event/cnxk/cnxk_eventdev.c > index b66f241ef8..2455b7be2d 100644 > --- a/drivers/event/cnxk/cnxk_eventdev.c > +++ b/drivers/event/cnxk/cnxk_eventdev.c > @@ -513,10 +513,10 @@ cnxk_sso_start(struct rte_eventdev *event_dev, > cnxk_sso_hws_reset_t reset_fn, > > plt_sso_dbg(); > for (i = 0; i < dev->qos_queue_cnt; i++) { > - qos->hwgrp = dev->qos_parse_data[i].queue; > - qos->iaq_prcnt = dev->qos_parse_data[i].iaq_prcnt; > - qos->taq_prcnt = dev->qos_parse_data[i].taq_prcnt; > - qos->xaq_prcnt = dev->qos_parse_data[i].xaq_prcnt; > + qos[i].hwgrp = dev->qos_parse_data[i].queue; > + qos[i].iaq_prcnt = dev->qos_parse_data[i].iaq_prcnt; > + qos[i].taq_prcnt = dev->qos_parse_data[i].taq_prcnt; > + qos[i].xaq_prcnt = dev->qos_parse_data[i].xaq_prcnt; > } > rc = roc_sso_hwgrp_qos_config(&dev->sso, qos, dev->qos_queue_cnt, > dev->xae_cnt); > -- > 2.25.1 >
Re: [PATCH v2] event/dlb2: fix advertized capabilities
On Sat, Jun 11, 2022 at 12:06 AM Timothy McDaniel wrote: > > This commit corrects the advertized capabilities reported by the DLB2 PMD. > > Previously DLB2 reported supporting RTE_EVENT_DEV_CAP_QUEUE_QOS, but the > DLB2 hardware does not support such a capability. This commit removes that > feature from the reported capabilities feature set. > > Additionally, two capabilities that DLB2 does support were not being > reported in the capabilities feature set. This commit adds those. > > RTE_EVENT_DEV_CAP_MULTIPLE_QUEUE_PORT = Event device is capable of > setting up the link between multiple queues and a single port. If the > flag is not set, the eventdev can only map a single queue to each > port or map a single queue to many ports > > RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK = Event device is capable of > configuring the queue/port link at runtime. If the flag is not set, > the eventdev queue/port link can only be configured during > initialization > > Finally, the file doc/guides/eventdevs/features/dlb2.ini has been updated > to match the capabilities actually reported by the PMD. 
> > Signed-off-by: Timothy McDaniel Please fix [for-main]dell[dpdk-next-eventdev] $ ./devtools/check-git-log.sh -n 1 Missing 'Fixes' tag: event/dlb2: fix advertized capabilities Invalid patch(es) found - checked 1 patch > === > > Changes since V1; > 1) reorder capabilities flags to match the order that they appear in > in the default.ini file > 2) update the dlb2.ini file with the new set of features supported by DLB2 > 3) expanded the commit message to provide additional details > --- > doc/guides/eventdevs/features/dlb2.ini | 3 ++- > drivers/event/dlb2/dlb2.c | 9 + > 2 files changed, 7 insertions(+), 5 deletions(-) > > diff --git a/doc/guides/eventdevs/features/dlb2.ini > b/doc/guides/eventdevs/features/dlb2.ini > index 29747b1c26..48a2a18aff 100644 > --- a/doc/guides/eventdevs/features/dlb2.ini > +++ b/doc/guides/eventdevs/features/dlb2.ini > @@ -4,12 +4,13 @@ > ; Refer to default.ini for the full list of available PMD features. > ; > [Scheduling Features] > -queue_qos = Y > event_qos = Y > distributed_sched = Y > queue_all_types= Y > burst_mode = Y > implicit_release_disable = Y > +runtime_port_link = Y > +multiple_queue_port= Y > maintenance_free = Y > > [Eth Rx adapter Features] > diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c > index 3641ed2942..bc4e705e0b 100644 > --- a/drivers/event/dlb2/dlb2.c > +++ b/drivers/event/dlb2/dlb2.c > @@ -61,12 +61,13 @@ static struct rte_event_dev_info evdev_dlb2_default_info > = { > .max_num_events = DLB2_MAX_NUM_LDB_CREDITS, > .max_single_link_event_port_queue_pairs = > DLB2_MAX_NUM_DIR_PORTS(DLB2_HW_V2), > - .event_dev_cap = (RTE_EVENT_DEV_CAP_QUEUE_QOS | > - RTE_EVENT_DEV_CAP_EVENT_QOS | > - RTE_EVENT_DEV_CAP_BURST_MODE | > + .event_dev_cap = (RTE_EVENT_DEV_CAP_EVENT_QOS | > RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED | > - RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE | > RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES | > + RTE_EVENT_DEV_CAP_BURST_MODE | > + RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE | > + 
RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK | > + RTE_EVENT_DEV_CAP_MULTIPLE_QUEUE_PORT | > RTE_EVENT_DEV_CAP_MAINTENANCE_FREE), > }; > > -- > 2.25.1 >
[PATCH] net/bnxt: reduce barriers in NEON vector Rx
To read descriptors in the expected order, barriers are inserted after each descriptor read. This excessive use of barriers is unnecessary and could cause a performance drop. Removed the barriers between descriptor reads, and changed the counting of valid packets to handle discontinuous valid packets, since an out-of-order read can leave the fetched valid descriptors discontinuous. In a VPP L3 routing test, a 6% performance gain was observed. The test was done on a platform with a ThunderX2 CPU and a Broadcom PS225 NIC. Signed-off-by: Ruifeng Wang --- drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 47 ++- 1 file changed, 25 insertions(+), 22 deletions(-) diff --git a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c index 32f8e59b3a..6a4ece681b 100644 --- a/drivers/net/bnxt/bnxt_rxtx_vec_neon.c +++ b/drivers/net/bnxt/bnxt_rxtx_vec_neon.c @@ -235,34 +235,32 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) * IO barriers are used to ensure consistent state. */ rxcmp1[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 7]); - rte_io_rmb(); + rxcmp1[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 5]); + rxcmp1[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 3]); + rxcmp1[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 1]); + + /* Use acquire fence to order loads of descriptor words. */ + rte_atomic_thread_fence(__ATOMIC_ACQUIRE); /* Reload lower 64b of descriptors to make it ordered after info3_v. 
*/ rxcmp1[3] = vreinterpretq_u32_u64(vld1q_lane_u64 ((void *)&cpr->cp_desc_ring[cons + 7], vreinterpretq_u64_u32(rxcmp1[3]), 0)); - rxcmp[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 6]); - - rxcmp1[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 5]); - rte_io_rmb(); rxcmp1[2] = vreinterpretq_u32_u64(vld1q_lane_u64 ((void *)&cpr->cp_desc_ring[cons + 5], vreinterpretq_u64_u32(rxcmp1[2]), 0)); - rxcmp[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 4]); - - t1 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[2], rxcmp1[3])); - - rxcmp1[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 3]); - rte_io_rmb(); rxcmp1[1] = vreinterpretq_u32_u64(vld1q_lane_u64 ((void *)&cpr->cp_desc_ring[cons + 3], vreinterpretq_u64_u32(rxcmp1[1]), 0)); - rxcmp[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 2]); - - rxcmp1[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 1]); - rte_io_rmb(); rxcmp1[0] = vreinterpretq_u32_u64(vld1q_lane_u64 ((void *)&cpr->cp_desc_ring[cons + 1], vreinterpretq_u64_u32(rxcmp1[0]), 0)); + + rxcmp[3] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 6]); + rxcmp[2] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 4]); + + t1 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[2], rxcmp1[3])); + + rxcmp[1] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 2]); rxcmp[0] = vld1q_u32((void *)&cpr->cp_desc_ring[cons + 0]); t0 = vreinterpretq_u64_u32(vzip2q_u32(rxcmp1[0], rxcmp1[1])); @@ -278,16 +276,21 @@ recv_burst_vec_neon(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) * bits and count the number of set bits in order to determine * the number of valid descriptors. */ - valid = vget_lane_u64(vreinterpret_u64_u16(vqmovn_u32(info3_v)), - 0); + valid = vget_lane_u64(vreinterpret_u64_s16(vshr_n_s16 + (vreinterpret_s16_u16(vshl_n_u16 + (vqmovn_u32(info3_v), 15)), 15)), 0); + /* * At this point, 'valid' is a 64-bit value containing four -* 16-bit fields, each of which is either 0x0001 or 0x0000. -* Compute number of valid descriptors from the index of -* the highest non-zero field. 
+* 16-bit fields, each of which is either 0xffff or 0x0000. +* Count the number of consecutive 1s from LSB in order to +* determine the number of valid descriptors. */ - num_valid = (sizeof(uint64_t) / sizeof(uint16_t)) - - (__builtin_clzl(valid & desc_valid_mask) / 16); + valid = ~(valid & desc_valid_mask); + if (valid == 0) + num_valid = 4; + else +
Re: [PATCH v8] event/dlb2: add support for single 512B write of 4 QEs
On Fri, Jun 10, 2022 at 9:58 PM Timothy McDaniel wrote: > > On Xeon, 512b accesses are available, so the movdir64 instruction is able to > perform 512b read and write to DLB producer port. In order for movdir64 > to be able to pull its data from store buffers (store-buffer-forwarding) > (before actual write), data should be in a single 512b write format. > This commit adds a change, when the code is built for Xeon with 512b AVX support, > to make a single 512b write of all 4 QEs instead of 4x64b writes. > > Signed-off-by: Timothy McDaniel > Acked-by: Kent Wires @McDaniel, Timothy Some cosmetic comments are below. Good to merge the next version. @Richardson, Bruce Hope you are OK with this version of the patch. > +#ifdef CC_AVX512_SUPPORT I think we can avoid putting it under CC_AVX512_SUPPORT to avoid clutter. > + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512VL) && > + rte_vect_get_max_simd_bitwidth() >= RTE_VECT_SIMD_512) > + ev_port->qm_port.use_avx512 = true; > + else > + ev_port->qm_port.use_avx512 = false; > +#endif > return 0; > } > > @@ -2457,21 +2464,6 @@ dlb2_eventdev_start(struct rte_eventdev *dev) > return 0; > } > > diff --git a/drivers/event/dlb2/dlb2_avx512.c > b/drivers/event/dlb2/dlb2_avx512.c > new file mode 100644 > index 00..ce2d006006 > --- /dev/null > +++ b/drivers/event/dlb2/dlb2_avx512.c > @@ -0,0 +1,267 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2016-2020 Intel Corporation Fix copyright year. > + */ > + > diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h > index 4a06d649ab..e8d2d0c656 100644 > --- a/drivers/event/dlb2/dlb2_priv.h > +++ b/drivers/event/dlb2/dlb2_priv.h > @@ -377,6 +377,9 @@ struct dlb2_port { > struct dlb2_eventdev_port *ev_port; /* back ptr */ > bool use_scalar; /* force usage of scalar code */ > uint16_t hw_credit_quanta; > +#ifdef CC_AVX512_SUPPORT This does not really need to be under a compile-time flag, to avoid the compile-time clutter. 
> + bool use_avx512; > +#endif > }; > > diff --git a/drivers/event/dlb2/dlb2_sse.c b/drivers/event/dlb2/dlb2_sse.c > new file mode 100644 > index 00..82f6588e2a > --- /dev/null > +++ b/drivers/event/dlb2/dlb2_sse.c > @@ -0,0 +1,219 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2016-2020 Intel Corporation Fix copyright year. > diff --git a/drivers/event/dlb2/dlb2_sve.c b/drivers/event/dlb2/dlb2_sve.c > new file mode 100644 > index 00..82f6588e2a > --- /dev/null > +++ b/drivers/event/dlb2/dlb2_sve.c > @@ -0,0 +1,219 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2016-2020 Intel Corporation Fix copyright year.
RE: [PATCH] app/test-eventdev: wait for workers before cryptodev destroy
> -Original Message- > From: Shijith Thotton > Sent: Thursday, June 2, 2022 5:15 PM > To: Jerin Jacob Kollanukkaran > Cc: Shijith Thotton ; dev@dpdk.org; Pavan > Nikhilesh Bhagavatula > Subject: [PATCH] app/test-eventdev: wait for workers before cryptodev > destroy > > Destroying cryptodev resources before the workers exit is not safe. > Moved the cryptodev destroy after the worker thread exit in the main > thread. > > Fixes: de2bc16e1bd1 ("app/eventdev: add crypto producer mode") > > Signed-off-by: Shijith Thotton Acked-by: Pavan Nikhilesh > --- > app/test-eventdev/evt_main.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/app/test-eventdev/evt_main.c b/app/test-eventdev/evt_main.c > index c5d63061bf..b785e603ee 100644 > --- a/app/test-eventdev/evt_main.c > +++ b/app/test-eventdev/evt_main.c > @@ -159,9 +159,6 @@ main(int argc, char **argv) > if (test->ops.ethdev_rx_stop) > test->ops.ethdev_rx_stop(test, &opt); > > - if (test->ops.cryptodev_destroy) > - test->ops.cryptodev_destroy(test, &opt); > - > rte_eal_mp_wait_lcore(); > > if (test->ops.test_result) > @@ -173,6 +170,9 @@ main(int argc, char **argv) > if (test->ops.eventdev_destroy) > test->ops.eventdev_destroy(test, &opt); > > + if (test->ops.cryptodev_destroy) > + test->ops.cryptodev_destroy(test, &opt); > + > if (test->ops.mempool_destroy) > test->ops.mempool_destroy(test, &opt); > > -- > 2.25.1
RE: [PATCH v1] raw/ifpga: free file handle before function return
> -Original Message- > From: Huang, Wei > Sent: Thursday, June 9, 2022 4:50 PM > To: dev@dpdk.org; tho...@monjalon.net; nipun.gu...@nxp.com; > hemant.agra...@nxp.com > Cc: sta...@dpdk.org; Xu, Rosen ; Zhang, Tianfei > ; Zhang, Qi Z ; Huang, Wei > > Subject: [PATCH v1] raw/ifpga: free file handle before function return > > Coverity issue: 379064 > Fixes: 673c897f4d73 ("raw/ifpga: support OFS card probing") > > Signed-off-by: Wei Huang > --- > drivers/raw/ifpga/base/ifpga_enumerate.c | 28 +++- > 1 file changed, 19 insertions(+), 9 deletions(-) > > diff --git a/drivers/raw/ifpga/base/ifpga_enumerate.c > b/drivers/raw/ifpga/base/ifpga_enumerate.c > index 7a5d264..61eb660 100644 > --- a/drivers/raw/ifpga/base/ifpga_enumerate.c > +++ b/drivers/raw/ifpga/base/ifpga_enumerate.c > @@ -837,8 +837,10 @@ static int find_dfls_by_vsec(struct dfl_fpga_enum_info > *info) > vndr_hdr = 0; > ret = pread(fd, &vndr_hdr, sizeof(vndr_hdr), > voff + PCI_VNDR_HEADER); > - if (ret < 0) > - return -EIO; > + if (ret < 0) { > + ret = -EIO; > + goto free_handle; > + } > if (PCI_VNDR_HEADER_ID(vndr_hdr) == > PCI_VSEC_ID_INTEL_DFLS && > pci_data->vendor_id == PCI_VENDOR_ID_INTEL) > break; > @@ -846,19 +848,23 @@ static int find_dfls_by_vsec(struct > dfl_fpga_enum_info *info) > > if (!voff) { > dev_debug(hw, "%s no DFL VSEC found\n", __func__); > - return -ENODEV; > + ret = -ENODEV; > + goto free_handle; > } > > dfl_cnt = 0; > ret = pread(fd, &dfl_cnt, sizeof(dfl_cnt), voff + PCI_VNDR_DFLS_CNT); > - if (ret < 0) > - return -EIO; > + if (ret < 0) { > + ret = -EIO; > + goto free_handle; > + } > > dfl_res_off = voff + PCI_VNDR_DFLS_RES; > if (dfl_res_off + (dfl_cnt * sizeof(u32)) > PCI_CFG_SPACE_EXP_SIZE) { > dev_err(hw, "%s DFL VSEC too big for PCIe config space\n", > __func__); > - return -EINVAL; > + ret = -EINVAL; > + goto free_handle; > } > > for (i = 0; i < dfl_cnt; i++, dfl_res_off += sizeof(u32)) { @@ -868,7 > +874,8 > @@ static int find_dfls_by_vsec(struct dfl_fpga_enum_info *info) > if (bir 
>= PCI_MAX_RESOURCE) { > dev_err(hw, "%s bad bir number %d\n", > __func__, bir); > - return -EINVAL; > + ret = -EINVAL; > + goto free_handle; > } > > len = pci_data->region[bir].len; > @@ -876,7 +883,8 @@ static int find_dfls_by_vsec(struct dfl_fpga_enum_info > *info) > if (offset >= len) { > dev_err(hw, "%s bad offset %u >= %"PRIu64"\n", > __func__, offset, len); > - return -EINVAL; > + ret = -EINVAL; > + goto free_handle; > } > > dev_debug(hw, "%s BAR %d offset 0x%x\n", __func__, bir, > offset); @@ -886,7 +894,9 @@ static int find_dfls_by_vsec(struct > dfl_fpga_enum_info *info) > dfl_fpga_enum_info_add_dfl(info, start, len, addr); > } > > - return 0; > +free_handle: > + close(fd); > + return ret; > } > > /* default method of finding dfls starting at offset 0 of bar 0 */ It looks good to me. Acked-by: Tianfei Zhang