Re: [PATCH v2 1/4] power: refactor core power management library
Hi Sivaprasad, Some comments inline. /Huisong

On 2024/8/26 21:06, Sivaprasad Tummala wrote:

This patch introduces a comprehensive refactor to the core power management library. The primary focus is on improving modularity and organization by relocating specific driver implementations from the 'lib/power' directory to dedicated directories within 'drivers/power/core/*'. The adjustment of meson.build files enables the selective activation of individual drivers. These changes contribute to a significant enhancement in code organization, providing a clearer structure for driver implementations. The refactor aims to improve overall code clarity and boost maintainability. Additionally, it establishes a foundation for future development, allowing for more focused work on individual drivers and seamless integration of forthcoming enhancements.

v2:
- added NULL check for global_core_ops in rte_power_get_core_ops

Signed-off-by: Sivaprasad Tummala
---
drivers/meson.build | 1 +
.../power/acpi/acpi_cpufreq.c | 22 +-
.../power/acpi/acpi_cpufreq.h | 6 +-
drivers/power/acpi/meson.build| 10 +
.../power/amd_pstate/amd_pstate_cpufreq.c | 24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h | 8 +-
drivers/power/amd_pstate/meson.build | 10 +
.../power/cppc/cppc_cpufreq.c | 22 +-
.../power/cppc/cppc_cpufreq.h | 8 +-
drivers/power/cppc/meson.build| 10 +
.../power/kvm_vm}/guest_channel.c | 0
.../power/kvm_vm}/guest_channel.h | 0
.../power/kvm_vm/kvm_vm.c | 22 +-
.../power/kvm_vm/kvm_vm.h | 6 +-
drivers/power/kvm_vm/meson.build | 16 +
drivers/power/meson.build | 12 +
drivers/power/pstate/meson.build | 10 +
.../power/pstate/pstate_cpufreq.c | 22 +-
.../power/pstate/pstate_cpufreq.h | 6 +-
lib/power/meson.build | 7 +-
lib/power/power_common.c | 2 +-
lib/power/power_common.h | 16 +-
lib/power/rte_power.c | 291 ++
lib/power/rte_power.h | 139 ++---
lib/power/rte_power_core_ops.h| 208 +
lib/power/version.map | 14 +
26 files changed, 621 insertions(+), 271 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/power/meson.build
create mode 100644 drivers/power/pstate/meson.build
rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
create mode 100644 lib/power/rte_power_core_ops.h

How about using the following directory structure?

*For power libs*
lib/power/power_common.*
lib/power/rte_power_pmd_mgmt.*
lib/power/rte_power_cpufreq_api.* (replacing the rte_power.c file may be simple for us, but I'm not sure if we can put the init of core, uncore and pmd mgmt in rte_power.c into rte_power_init.c.)
lib/power/rte_power_uncore_freq_api.*

*And have the following directories under drivers/power:*
1> For core dvfs driver:
drivers/power/cpufreq/acpi_cpufreq.c
drivers/power/cpufreq/cppc_cpufreq.c
drivers/power/cpufreq/amd_pstate_cpufreq.c
drivers/power/cpufreq/intel_pstate_cpufreq.c
drivers/power/cpufreq/kvm_cpufreq.c
The code of each cpufreq driver is not too large and probably won't grow, so there is no need for a separate directory per driver.
2> For uncore dvfs driver:
drivers/power/uncorefreq/intel_uncore.*

diff --git a/drivers/meson.build b/drivers/meson.build index 66931d4241..9d77e0deab 100644 --- a/drivers/meson.build +++ b/drivers/meson.build @@ -29,6 +29,7 @@ subdirs = [ 'event', # depends on common, bus, mempool and net. 'baseband', # depends on common and bus
RE: Bihash Support in DPDK
Hi Rajesh,

Please clarify what you mean by "bihash"? Bidirectional? Bounded index?

As for concurrent lookups/updates, yes, the DPDK hash table supports multi-process/multi-thread use; please see the documentation: https://doc.dpdk.org/guides/prog_guide/hash_lib.html#multi-process-support

From: rajesh goel Sent: Tuesday, August 27, 2024 7:04 AM To: Ferruh Yigit Cc: Wang, Yipeng1 ; Gobriel, Sameh ; Richardson, Bruce ; Medvedkin, Vladimir ; dev@dpdk.org Subject: Re: Bihash Support in DPDK

Hi All, Can we get some reply. Thanks Rajesh

On Thu, Aug 22, 2024 at 9:32 PM Ferruh Yigit <ferruh.yi...@amd.com> wrote: On 8/22/2024 8:51 AM, rajesh goel wrote: > Hi All, > Need info if DPDK hash library supports bihash table where for multi- > thread and multi-process we can update/del/lookup entries per bucket level. > > + hash library maintainers.
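For reference, concurrency is opted into through the extra_flag field at table creation time. A minimal sketch, assuming the standard rte_hash API (the table name and sizes below are made up):

```
#include <stdint.h>

#include <rte_hash.h>
#include <rte_hash_crc.h>
#include <rte_lcore.h>

static struct rte_hash *
create_shared_table(void)
{
	/* RW concurrency + multi-writer add let several lcores do
	 * add/delete/lookup on the same table concurrently. */
	struct rte_hash_parameters params = {
		.name = "example_flow_table",   /* hypothetical name */
		.entries = 1024,                /* hypothetical size */
		.key_len = sizeof(uint32_t),
		.hash_func = rte_hash_crc,
		.hash_func_init_val = 0,
		.socket_id = (int)rte_socket_id(),
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY |
			      RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD,
	};

	return rte_hash_create(&params);
}
```

For multi-process use, the primary process creates the table as above and secondary processes attach to it by name with rte_hash_find_existing().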
[DPDK/DTS Bug 1528] Port MTU update test suite to DTS
https://bugs.dpdk.org/show_bug.cgi?id=1528 Bug ID: 1528 Summary: Port MTU update test suite to DTS Product: DPDK Version: unspecified Hardware: All OS: All Status: UNCONFIRMED Severity: normal Priority: Normal Component: DTS Assignee: dev@dpdk.org Reporter: alex.chap...@arm.com CC: juraj.lin...@pantheon.tech, luca.vizza...@arm.com, pr...@iol.unh.edu Target Milestone: --- https://git.dpdk.org/tools/dts/tree/test_plans/mtu_update_test_plan.rst -- You are receiving this mail because: You are the assignee for the bug.
Re: [PATCH v3 01/12] net/ice: use correct format specifiers for unsigned ints
On Fri, Aug 23, 2024 at 09:56:39AM +, Soumyadeep Hore wrote: > From: Yogesh Bhosale > > Firmware was giving a number for the MSIX vectors that was > way too big and obviously not right. Because of the wrong > format specifier, this big number ended up looking like a > tiny negative number in the logs. This was fixed by using > the right format specifier everywhere it's needed. > > Signed-off-by: Yogesh Bhosale > Signed-off-by: Soumyadeep Hore > --- > drivers/net/ice/base/ice_common.c | 54 +++ > 1 file changed, 27 insertions(+), 27 deletions(-) > Since this is a base code update, the commit log prefix should be "net/ice/base" rather than just "net/ice". Base code patches tend to get reviewed slightly differently since it's understood that they come from a separate shared source, so we are checking for things like patch split as much as for code correctness. Therefore we try to keep such patches explicitly marked as affecting base code. Also, please run checkpatches.sh and check-git-log.sh from the devtools directory. It allows picking up more patch issues, such as in this case a missing mailmap entry for the original patch author. In cases like this, the mailmap entry is added in the first patch from that author. Thanks, /Bruce
Re: [PATCH] net/mana: support building the driver on arm64
On 8/26/2024 11:38 PM, lon...@linuxonhyperv.com wrote: > From: Long Li > > The driver has been verified on Linux arm64. Enable this build option and > add a missing header file for arm64. > > Signed-off-by: Long Li > Hi Long, I don't see this patch in the mail list and patchwork, it can be because of the email address it has been sent, fyi.
[PATCH] net/mana: support building the driver on arm64
From: Long Li The driver has been verified on Linux arm64. Enable this build option and add a missing header file for arm64. Signed-off-by: Long Li --- drivers/net/mana/meson.build | 4 ++-- drivers/net/mana/mp.c| 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/mana/meson.build b/drivers/net/mana/meson.build index 2d72eca5a8..330d30b2ff 100644 --- a/drivers/net/mana/meson.build +++ b/drivers/net/mana/meson.build @@ -1,9 +1,9 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2022 Microsoft Corporation -if not is_linux or not dpdk_conf.has('RTE_ARCH_X86') +if not is_linux or not (dpdk_conf.has('RTE_ARCH_X86') or dpdk_conf.has('RTE_ARCH_ARM64')) build = false -reason = 'only supported on x86 Linux' +reason = 'only supported on x86 or arm64 Linux' subdir_done() endif diff --git a/drivers/net/mana/mp.c b/drivers/net/mana/mp.c index 738487f65a..34b45ed832 100644 --- a/drivers/net/mana/mp.c +++ b/drivers/net/mana/mp.c @@ -5,6 +5,7 @@ #include #include #include +#include #include -- 2.43.0
Re: Bihash Support in DPDK
Hi All, Can we get some reply. Thanks Rajesh On Thu, Aug 22, 2024 at 9:32 PM Ferruh Yigit wrote: > On 8/22/2024 8:51 AM, rajesh goel wrote: > > Hi All, > > Need info if DPDK hash library supports bihash table where for multi- > > thread and multi-process we can update/del/lookup entries per bucket > level. > > > > > > + hash library maintainers. >
Re: [PATCH] net/mana: support building the driver on arm64
Hello, On Tue, Aug 27, 2024 at 12:01 PM Ferruh Yigit wrote: > > On 8/26/2024 11:38 PM, lon...@linuxonhyperv.com wrote: > > From: Long Li > > > > The driver has been verified on Linux arm64. Enable this build option and > > add a missing header file for arm64. > > > > Signed-off-by: Long Li > > > > Hi Long, > > I don't see this patch in the mail list and patchwork, it can be because > of the email address it has been sent, fyi. The mail was waiting in the moderation queue. Long, please fix your send-email setup, or send with a mail address registered to the dev@ ml. Thanks. -- David Marchand
Re: [PATCH v2 2/4] power: refactor uncore power management library
Hi Sivaprasad,

I suggest splitting this patch into two patches to make it easier to review:
patch-1: abstract a file for the uncore dvfs core level, namely, the rte_power_uncore_ops.c you did.
patch-2: move and rename, lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c

Patch [1/4] is also too big and not easy to review. In addition, I have some questions and am not sure if we can adjust the uncore init process.

/Huisong

On 2024/8/26 21:06, Sivaprasad Tummala wrote:

This patch refactors the power management library, addressing uncore power management. The primary changes involve the creation of dedicated directories for each driver within 'drivers/power/uncore/*'. The adjustment of meson.build files enables the selective activation of individual drivers. This refactor significantly improves code organization, enhances clarity and boosts maintainability. It lays the foundation for more focused development on individual drivers and facilitates seamless integration of future enhancements, particularly the AMD uncore driver.

Signed-off-by: Sivaprasad Tummala
---
.../power/intel_uncore/intel_uncore.c | 18 +-
.../power/intel_uncore/intel_uncore.h | 8 +-
drivers/power/intel_uncore/meson.build| 6 +
drivers/power/meson.build | 3 +-
lib/power/meson.build | 2 +-
lib/power/rte_power_uncore.c | 205 ++-
lib/power/rte_power_uncore.h | 87 ---
lib/power/rte_power_uncore_ops.h | 239 ++
lib/power/version.map | 1 +
9 files changed, 405 insertions(+), 164 deletions(-)
rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
create mode 100644 drivers/power/intel_uncore/meson.build
create mode 100644 lib/power/rte_power_uncore_ops.h

diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c similarity index 95% rename from lib/power/power_intel_uncore.c rename to drivers/power/intel_uncore/intel_uncore.c index 4eb9c5900a..804ad5d755 100644 --- a/lib/power/power_intel_uncore.c +++ b/drivers/power/intel_uncore/intel_uncore.c @@ -8,7 +8,7 @@ #include -#include "power_intel_uncore.h" +#include "intel_uncore.h" #include "power_common.h" #define MAX_NUMA_DIE 8 @@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg) return count; } <...> -#endif /* POWER_INTEL_UNCORE_H */ +#endif /* INTEL_UNCORE_H */ diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build new file mode 100644 index 00..876df8ad14 --- /dev/null +++ b/drivers/power/intel_uncore/meson.build @@ -0,0 +1,6 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2017 Intel Corporation +# Copyright(c) 2024 Advanced Micro Devices, Inc. + +sources = files('intel_uncore.c') +deps += ['power'] diff --git a/drivers/power/meson.build b/drivers/power/meson.build index 8c7215c639..c83047af94 100644 --- a/drivers/power/meson.build +++ b/drivers/power/meson.build @@ -6,7 +6,8 @@ drivers = [ 'amd_pstate', 'cppc', 'kvm_vm', -'pstate' +'pstate', +'intel_uncore'

The cppc, amd_pstate and so on belong to the cpufreq scope, while intel_uncore belongs to the uncore dvfs scope. They are not at the same level. So I propose that we create one directory named something like cpufreq or core. The 'intel_uncore' name doesn't seem appropriate. What do you think of the following directory structure:
drivers/power/uncore/intel_uncore.c
drivers/power/uncore/amd_uncore.c (according to patch [4/4]).
] std_deps = ['power'] diff --git a/lib/power/meson.build b/lib/power/meson.build index f3e3451cdc..9b13d98810 100644 --- a/lib/power/meson.build +++ b/lib/power/meson.build @@ -13,7 +13,6 @@ if not is_linux endif sources = files( 'power_common.c', -'power_intel_uncore.c', 'rte_power.c', 'rte_power_uncore.c', 'rte_power_pmd_mgmt.c', @@ -24,6 +23,7 @@ headers = files( 'rte_power_guest_channel.h', 'rte_power_pmd_mgmt.h', 'rte_power_uncore.h', +'rte_power_uncore_ops.h', ) if cc.has_argument('-Wno-cast-qual') cflags += '-Wno-cast-qual' diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c index 48c75a5da0..9f8771224f 100644 --- a/lib/power/rte_power_uncore.c +++ b/lib/power/rte_power_uncore.c @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright(c) 2010-2014 Intel Corporation * Copyright(c) 2023 AMD Corporation + * Copyright(c) 2024 Advanced Micro Devices, Inc. */ #include @@ -12,98 +13,50 @@ #include "rte_power_uncore.h" #include "power_intel_uncore.h" -enum rte_uncore_power_mgmt_env default_uncore_env = RTE_UNCORE_PM_ENV_NOT_SET; +static
Re: [PATCH v3] net/gve: add support for TSO in DQO RDA
On 8/27/2024 4:11 AM, Rushil Gupta wrote: > On Fri, Aug 9, 2024 at 11:49 AM Tathagat Priyadarshi > wrote: >> >> The patch intends on adding support for TSO in DQO RDA format. >> >> Signed-off-by: Tathagat Priyadarshi >> Signed-off-by: Varun Lakkur Ambaji Rao > > Acked-by: Rushil Gupta > Applied to dpdk-next-net/main, thanks.
Re: [PATCH] net/mana: support building the driver on arm64
On 8/26/2024 11:38 PM, lon...@linuxonhyperv.com wrote: > From: Long Li > > The driver has been verified on Linux arm64. Enable this build option and > add a missing header file for arm64. > > Signed-off-by: Long Li > Applied to dpdk-next-net/main, thanks.
[RFC 0/2] introduce LLC aware functions
As core density continues to increase, chiplet-based core packing has become a key trend. In AMD EPYC SoC architectures, core complexes within the same chiplet share a Last-Level Cache (LLC). By packing logical cores within the same LLC, we can enhance pipeline processing stages due to reduced latency and improved data locality. To leverage these benefits, DPDK libraries and examples can utilize localized lcores. This approach ensures more consistent latencies by minimizing the dispersion of lcores across different chiplet complexes, and enhances packet processing by ensuring that data for subsequent pipeline stages is likely to reside within the LLC.

< Function: Purpose >
---------------------
- rte_get_llc_first_lcores: Retrieves all the first lcores in the shared LLC.
- rte_get_llc_lcore: Retrieves all lcores that share the LLC.
- rte_get_llc_n_lcore: Retrieves the first n or skips the first n lcores in the shared LLC.

< MACRO: Purpose >
------------------
RTE_LCORE_FOREACH_LLC_FIRST: iterates through the first lcore of each LLC.
RTE_LCORE_FOREACH_LLC_FIRST_WORKER: iterates through the first worker lcore of each LLC.
RTE_LCORE_FOREACH_LLC_WORKER: iterates lcores from an LLC based on a hint (lcore id).
RTE_LCORE_FOREACH_LLC_SKIP_FIRST_WORKER: iterates lcores from an LLC while skipping the first worker.
RTE_LCORE_FOREACH_LLC_FIRST_N_WORKER: iterates through `n` lcores from each LLC.
RTE_LCORE_FOREACH_LLC_SKIP_N_WORKER: skips the first `n` lcores, then iterates through the remaining lcores in each LLC.

Vipin Varghese (2):
eal: add llc aware functions
eal/lcore: add llc aware for each macro

lib/eal/common/eal_common_lcore.c | 279 --
lib/eal/include/rte_lcore.h | 89 ++
2 files changed, 356 insertions(+), 12 deletions(-)

-- 2.34.1
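For illustration only (not part of these patches), a minimal sketch of how the proposed iterators could be used to place one worker per LLC; the macro comes from this RFC and per_llc_worker is a hypothetical job:

```
#include <rte_lcore.h>
#include <rte_launch.h>

/* Hypothetical pipeline stage that benefits from staying within one LLC. */
static int
per_llc_worker(void *arg)
{
	(void)arg;
	/* ... process packets ... */
	return 0;
}

static void
launch_one_worker_per_llc(void)
{
	unsigned int lcore;

	/* Launch on the first worker lcore of each LLC, so that no two
	 * workers share a cache domain. */
	RTE_LCORE_FOREACH_LLC_FIRST_WORKER(lcore)
		rte_eal_remote_launch(per_llc_worker, NULL, lcore);

	rte_eal_mp_wait_lcore();
}
```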
[RFC 1/2] eal: add llc aware functions
Introduce lcore functions which operates on Last Level Cache for core complexes or chiplet cores. On non chiplet core complexes, the function will iterate over all the available dpdk lcores. Functions added: - rte_get_llc_first_lcores - rte_get_llc_lcore - rte_get_llc_n_lcore Signed-off-by: Vipin Varghese --- lib/eal/common/eal_common_lcore.c | 279 -- 1 file changed, 267 insertions(+), 12 deletions(-) diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c index 2ff9252c52..4ff8b9e116 100644 --- a/lib/eal/common/eal_common_lcore.c +++ b/lib/eal/common/eal_common_lcore.c @@ -14,6 +14,7 @@ #ifndef RTE_EXEC_ENV_WINDOWS #include #endif +#include #include "eal_private.h" #include "eal_thread.h" @@ -93,25 +94,279 @@ int rte_lcore_is_enabled(unsigned int lcore_id) return cfg->lcore_role[lcore_id] == ROLE_RTE; } -unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap) +#define LCORE_GET_LLC \ + "ls -d /sys/bus/cpu/devices/cpu%u/cache/index[0-9] | sort -r | grep -m1 index[0-9] | awk -F '[x]' '{print $2}' " +#define LCORE_GET_SHAREDLLC \ + "grep [0-9] /sys/bus/cpu/devices/cpu%u/cache/index%u/shared_cpu_list" + +unsigned int rte_get_llc_first_lcores (rte_cpuset_t *llc_cpu) { - i++; - if (wrap) - i %= RTE_MAX_LCORE; + CPU_ZERO((rte_cpuset_t *)llc_cpu); - while (i < RTE_MAX_LCORE) { - if (!rte_lcore_is_enabled(i) || - (skip_main && (i == rte_get_main_lcore( { - i++; - if (wrap) - i %= RTE_MAX_LCORE; + char cmdline[2048] = {'\0'}; + char output_llc[8] = {'\0'}; + char output_threads[16] = {'\0'}; + + for (unsigned int lcore =0; lcore < RTE_MAX_LCORE; lcore++) + { + if (!rte_lcore_is_enabled (lcore)) continue; + + /* get sysfs llc index */ + snprintf(cmdline, 2047, LCORE_GET_LLC, lcore); + FILE *fp = popen (cmdline, "r"); + if (fp == NULL) { + return -1; } - break; + if (fgets(output_llc, sizeof(output_llc) - 1, fp) == NULL) { + pclose(fp); + return -1; + } + pclose(fp); + int llc_index = atoi (output_llc); + + /* get sysfs core group of the same core index*/ + snprintf(cmdline, 2047, LCORE_GET_SHAREDLLC, lcore, llc_index); + fp = popen (cmdline, "r"); + if (fp == NULL) { + return -1; + } + if (fgets(output_threads, sizeof(output_threads) - 1, fp) == NULL) { + pclose(fp); + return -1; + } + pclose(fp); + + output_threads [strlen(output_threads) - 1] = '\0'; + char *smt_thrds[2]; + int smt_threads = rte_strsplit(output_threads, sizeof(output_threads), smt_thrds, 2, ','); + + for (int index = 0; index < smt_threads; index++) { + char *llc[2] = {'\0'}; + int smt_cpu = rte_strsplit(smt_thrds[index], sizeof(smt_thrds[index]), llc, 2, '-'); + RTE_SET_USED(smt_cpu); + + unsigned int first_cpu = atoi (llc[0]); + unsigned int last_cpu = (NULL == llc[1]) ? 
atoi (llc[0]) : atoi (llc[1]); + + + for (unsigned int temp_cpu = first_cpu; temp_cpu <= last_cpu; temp_cpu++) { + if (rte_lcore_is_enabled(temp_cpu)) { + CPU_SET (temp_cpu, (rte_cpuset_t *) llc_cpu); + lcore = last_cpu; + break; + } + } + } + } + + return CPU_COUNT((rte_cpuset_t *)llc_cpu); +} + +unsigned int +rte_get_llc_lcore (unsigned int lcore, rte_cpuset_t *llc_cpu, + unsigned int *first_cpu, unsigned int * last_cpu) +{ + CPU_ZERO((rte_cpuset_t *)llc_cpu); + + char cmdline[2048] = {'\0'}; + char output_llc[8] = {'\0'}; + char output_threads[16] = {'\0'}; + + *first_cpu = *last_cpu = RTE_MAX_LCORE; + + /* get sysfs llc index */ + snprintf(cmdline, 2047, LCORE_GET_LLC, lcore); + FILE *fp = popen (cmdline, "r"); + if (fp == NULL) { + return -1; + } + if (fgets(output_llc, sizeof(output_llc) - 1, fp) == NULL) { + pclose(fp); + return -1; + } + pclose(fp); + int llc_index = atoi (output_llc); + + /* get sysfs core group of the same core index*/ + snprintf(cmdline
[RFC 2/2] eal/lcore: add llc aware for each macro
add RTE_LCORE_FOREACH for dpdk lcore sharing the Last Level Cache. For core complexes with shared LLC, the macro iterates for same llc lcores. For cores within single LLC, the macro iterates over all availble lcores. MACRO added: - RTE_LCORE_FOREACH_LLC_FIRST - RTE_LCORE_FOREACH_LLC_FIRST_WORKER - RTE_LCORE_FOREACH_LLC_WORKER - RTE_LCORE_FOREACH_LLC_SKIP_FIRST_WORKER - RTE_LCORE_FOREACH_LLC_FIRST_N_WORKER - RTE_LCORE_FOREACH_LLC_SKIP_N_WORKER Signed-off-by: Vipin Varghese --- lib/eal/include/rte_lcore.h | 89 + 1 file changed, 89 insertions(+) diff --git a/lib/eal/include/rte_lcore.h b/lib/eal/include/rte_lcore.h index 7deae47af3..7c1a240bde 100644 --- a/lib/eal/include/rte_lcore.h +++ b/lib/eal/include/rte_lcore.h @@ -18,6 +18,7 @@ #include #include #include +#include #ifdef __cplusplus extern "C" { @@ -196,6 +197,21 @@ rte_cpuset_t rte_lcore_cpuset(unsigned int lcore_id); */ int rte_lcore_is_enabled(unsigned int lcore_id); +/** + * Get the next enabled lcore ID within same llc. + * + * @param i + * The current lcore (reference). + * @param skip_main + * If true, do not return the ID of the main lcore. + * @param wrap + * If true, go back to 0 when RTE_MAX_LCORE is reached; otherwise, + * return RTE_MAX_LCORE. + * @return + * The next lcore_id or RTE_MAX_LCORE if not found. + */ +unsigned int rte_get_next_llc_lcore(unsigned int i, int skip_main, int wrap); + /** * Get the next enabled lcore ID. * @@ -211,6 +227,11 @@ int rte_lcore_is_enabled(unsigned int lcore_id); */ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap); +unsigned int rte_get_llc_lcore (unsigned int i, rte_cpuset_t *llc_cpu, unsigned int *start, unsigned int *end); +unsigned int rte_get_llc_first_lcores (rte_cpuset_t *llc_cpu); +unsigned int rte_get_llc_n_lcore (unsigned int i, rte_cpuset_t *llc_cpu, unsigned int *start, unsigned int *end, unsigned int n, bool skip); + + /** * Macro to browse all running lcores. */ @@ -219,6 +240,7 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap); i < RTE_MAX_LCORE; \ i = rte_get_next_lcore(i, 0, 0)) + /** * Macro to browse all running lcores except the main lcore. */ @@ -227,6 +249,73 @@ unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap); i < RTE_MAX_LCORE; \ i = rte_get_next_lcore(i, 1, 0)) +/** Browse all the the cores in the provided llc domain **/ + +#define RTE_LCORE_FOREACH_LLC_FIRST(i) \ + rte_cpuset_t llc_foreach_first_lcores; \ + CPU_ZERO(&llc_foreach_first_lcores); i = 0; \ + unsigned int llc_foreach_num_iter = rte_get_llc_first_lcores(&llc_foreach_first_lcores);\ + i = (0 == llc_foreach_num_iter) ? RTE_MAX_LCORE : i; \ + for (; i < RTE_MAX_LCORE; i++) \ + if (CPU_ISSET(i, &llc_foreach_first_lcores)) + +#define RTE_LCORE_FOREACH_LLC_FIRST_WORKER(i) \ + rte_cpuset_t llc_foreach_first_lcores; \ + CPU_ZERO(&llc_foreach_first_lcores); i = 0; \ + unsigned int llc_foreach_num_iter = rte_get_llc_first_lcores(&llc_foreach_first_lcores);\ + CPU_CLR(rte_get_main_lcore(), &llc_foreach_first_lcores); \ + i = (0 == llc_foreach_num_iter) ? RTE_MAX_LCORE : i; \ + for (; i < RTE_MAX_LCORE; i++) \ + if (CPU_ISSET(i, &llc_foreach_first_lcores)) + +#define RTE_LCORE_FOREACH_LLC_WORKER(i)\ + rte_cpuset_t llc_foreach_first_lcores; \ + rte_cpuset_t llc_foreach_lcore; \ +unsigned int start,end; \ + CPU_ZERO(&llc_foreach_first_lcores); i = 0; \ + unsigned int llc_foreach_num_iter = rte_get_llc_first_lcores(&llc_foreach_first_lcores);\ + i = (0 == llc_foreach_num_iter) ? 
RTE_MAX_LCORE : i; \ + for (unsigned int llc_i = i; llc_i < RTE_MAX_LCORE; llc_i++) \ + if (CPU_ISSET(llc_i, &llc_foreach_first_lcores) && rte_get_llc_lcore (llc_i, &llc_foreach_lcore, &start, &end)) \ + for (i = start; (i <= end); i
Re: [PATCH v2 1/3] app/testpmd: add register keyword
On 8/21/2024 8:25 PM, Stephen Hemminger wrote:

On Wed, 21 Aug 2024 20:08:55 +0530 Vipin Varghese wrote:

diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h index 223f87a539..29088843b7 100644 --- a/app/test-pmd/macswap_sse.h +++ b/app/test-pmd/macswap_sse.h @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb, uint64_t ol_flags; int i; int r; - __m128i addr0, addr1, addr2, addr3; + register __m128i addr0, addr1, addr2, addr3;

Some compilers treat register as a no-op. Are you sure? Did you check with godbolt.

Thank you Stephen, I have tested the code changes on Linux using the GCC and Clang compilers. In both cases, in a Linux environment, we have seen the values loaded onto `xmm` registers.

```
register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2, 1, 0, 11, 10, 9, 8, 7, 6);
vmovdqa xmm0, xmmword ptr [rip + .LCPI0_0]
```

In both cases we see a performance improvement. Can you please help us understand if we have missed something?
[PATCH v3 0/9] riscv: implement accelerated crc using zbc
The RISC-V Zbc extension adds instructions for carry-less multiplication we can use to implement CRC in hardware. This patch set contains two new implementations:
- one in lib/hash/rte_crc_riscv64.h that uses a Barrett reduction to implement the four rte_hash_crc_* functions
- one in lib/net/net_crc_zbc.c that uses repeated single-folds to reduce the buffer until it is small enough for a Barrett reduction to implement rte_crc16_ccitt_zbc_handler and rte_crc32_eth_zbc_handler

My approach is largely based on Intel's "Fast CRC Computation Using PCLMULQDQ Instruction" white paper https://www.researchgate.net/publication/263424619_Fast_CRC_computation and a post about "Optimizing CRC32 for small payload sizes on x86" https://mary.rs/lab/crc32/

Whether these new implementations are enabled is controlled by new build-time and run-time detection of the RISC-V extensions present in the compiler and on the target system.

I have carried out some performance comparisons between the generic table implementations and the new hardware implementations. Listed below is the number of cycles it takes to compute the CRC hash for buffers of various sizes (as reported by rte_get_timer_cycles()). These results were collected on a Kendryte K230 and averaged over 20 samples:

| Buffer    | CRC32-ETH (lib/net)  | CRC32C (lib/hash)    |
| Size (MB) | Table    | Hardware  | Table    | Hardware  |
|-----------|----------|-----------|----------|-----------|
| 1         |   155168 |     11610 |    73026 |     18385 |
| 2         |   311203 |     22998 |   145586 |     35886 |
| 3         |   466744 |     34370 |   218536 |     53939 |
| 4         |   621843 |     45536 |   291574 |     71944 |
| 5         |   777908 |     56989 |   364152 |     89706 |
| 6         |   932736 |     68023 |   437016 |    107726 |
| 7         |  1088756 |     79236 |   510197 |    125426 |
| 8         |  1243794 |     90467 |   583231 |    143614 |

These results suggest a speed-up of lib/net by thirteen times, and of lib/hash by four times.
I have also run the hash_functions_autotest benchmark in dpdk_test, which measures the performance of the lib/hash implementation on small buffers, getting the following times:

| Key Length | Time (ticks/op)      |
| (bytes)    | Table    | Hardware  |
|------------|----------|-----------|
| 1          | 0.47     | 0.85      |
| 2          | 0.57     | 0.87      |
| 4          | 0.99     | 0.88      |
| 8          | 1.35     | 0.88      |
| 9          | 1.20     | 1.09      |
| 13         | 1.76     | 1.35      |
| 16         | 1.87     | 1.02      |
| 32         | 2.96     | 0.98      |
| 37         | 3.35     | 1.45      |
| 40         | 3.49     | 1.12      |
| 48         | 4.02     | 1.25      |
| 64         | 5.08     | 1.54      |

v3:
- rebase on 24.07
- replace crc with CRC in commits (check-git-log.sh)

v2:
- replace compile flag with build-time (riscv extension macros) and run-time detection (linux hwprobe syscall) (Stephen Hemminger)
- add qemu target that supports zbc (Stanislaw Kardach)
- fix spelling error in commit message
- fix a bug in the net/ implementation that would cause segfaults on small unaligned buffers
- refactor net/ implementation to move variable declarations to top of functions
- enable the optimisation in a couple other places where optimised crc is preferred to jhash
  - l3fwd-power
  - cuckoo-hash

Daniel Gregory (9):
config/riscv: detect presence of Zbc extension
hash: implement CRC using riscv carryless multiply
net: implement CRC using riscv carryless multiply
config/riscv: add qemu crossbuild target
examples/l3fwd: use accelerated CRC on riscv
ipfrag: use accelerated CRC on riscv
examples/l3fwd-power: use accelerated CRC on riscv
hash/cuckoo: use accelerated CRC on riscv
member: use accelerated CRC on riscv

MAINTAINERS | 2 +
app/test/test_crc.c | 9 +
app/test/test_hash.c | 7 +
config/riscv/meson.build | 44 +++-
config/riscv/riscv64_qemu_linux_gcc | 17 ++
.../linux_gsg/cross_build_dpdk_for_riscv.rst | 5 +
examples/l3fwd-power/main.c | 2 +-
examples/l3fwd/l3fwd_em.c | 2 +-
lib/eal/riscv/include/rte_cpuflags.h | 2 +
lib/eal/riscv/rte_cpuflags.c | 112 +++---
lib/hash/meson.build | 1 +
lib/hash/rte_crc_riscv64.h| 89
lib/hash/rte_cuckoo_hash.c| 3 +
lib/hash/rte_hash_crc.c | 13 +-
lib/hash/rte_hash_crc.h | 6 +-
lib/ip_frag/ip_frag_internal.c| 6 +-
lib/member/rte_member.h | 2 +-
lib/net/meson.build | 4 +
lib/net/net_crc.h | 11 +
lib/net/net_crc_zbc.c | 191 ++
lib/net/rte_net_crc.c | 40
lib/net
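For context on how the lib/net part is consumed, the series plugs into the existing rte_net_crc API, so an application selects the new backend the same way as the existing ones. A minimal sketch, assuming the RTE_NET_CRC_ZBC algorithm id added by this series:

```
#include <stdint.h>

#include <rte_net_crc.h>

static uint32_t
checksum_frame(const void *data, uint32_t len)
{
	/* Request the Zbc-based backend; the library is expected to fall back
	 * to a software implementation if the extension is unavailable. */
	rte_net_crc_set_alg(RTE_NET_CRC_ZBC);

	/* Compute an Ethernet CRC-32 over the buffer. */
	return rte_net_crc_calc(data, len, RTE_NET_CRC32_ETH);
}
```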
[PATCH v3 1/9] config/riscv: detect presence of Zbc extension
The RISC-V Zbc extension adds carry-less multiply instructions we can use to implement more efficient CRC hashing algorithms. The RISC-V C api defines architecture extension test macros https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/riscv-c-api.md#architecture-extension-test-macros These let us detect whether the Zbc extension is supported on the compiler and -march we're building with. The C api also defines Zbc intrinsics we can use rather than inline assembly on newer versions of GCC (14.1.0+) and Clang (18.1.0+). The Linux kernel exposes a RISC-V hardware probing syscall for getting information about the system at run-time including which extensions are available. We detect whether this interface is present by looking for the header, as it's only present in newer kernels (v6.4+). Furthermore, support for detecting certain extensions, including Zbc, wasn't present until versions after this, so we need to check the constants this header exports. The kernel exposes bitmasks for each extension supported by the probing interface, rather than the bit index that is set if that extensions is present, so modify the existing cpu flag HWCAP table entries to line up with this. The values returned by the interface are 64-bits long, so grow the hwcap registers array to be able to hold them. If the Zbc extension and intrinsics are both present and we can detect the Zbc extension at runtime, we define a flag, RTE_RISCV_FEATURE_ZBC. Signed-off-by: Daniel Gregory --- config/riscv/meson.build | 41 ++ lib/eal/riscv/include/rte_cpuflags.h | 2 + lib/eal/riscv/rte_cpuflags.c | 112 +++ 3 files changed, 123 insertions(+), 32 deletions(-) diff --git a/config/riscv/meson.build b/config/riscv/meson.build index 07d7d9da23..5d8411b254 100644 --- a/config/riscv/meson.build +++ b/config/riscv/meson.build @@ -119,6 +119,47 @@ foreach flag: arch_config['machine_args'] endif endforeach +# check if we can do buildtime detection of extensions supported by the target +riscv_extension_macros = false +if (cc.get_define('__riscv_arch_test', args: machine_args) == '1') + message('Detected architecture extension test macros') + riscv_extension_macros = true +else + warning('RISC-V architecture extension test macros not available. Build-time detection of extensions not possible') +endif + +# check if we can use hwprobe interface for runtime extension detection +riscv_hwprobe = false +if (cc.check_header('asm/hwprobe.h', args: machine_args)) + message('Detected hwprobe interface, enabling runtime detection of supported extensions') + machine_args += ['-DRTE_RISCV_FEATURE_HWPROBE'] + riscv_hwprobe = true +else + warning('Hwprobe interface not available (present in Linux v6.4+), instruction extensions won\'t be enabled') +endif + +# detect extensions +# RISC-V Carry-less multiplication extension (Zbc) for hardware implementations +# of CRC-32C (lib/hash/rte_crc_riscv64.h) and CRC-32/16 (lib/net/net_crc_zbc.c). 
+# Requires intrinsics available in GCC 14.1.0+ and Clang 18.1.0+ +if (riscv_extension_macros and riscv_hwprobe and +(cc.get_define('__riscv_zbc', args: machine_args) != '')) + if ((cc.get_id() == 'gcc' and cc.version().version_compare('>=14.1.0')) + or (cc.get_id() == 'clang' and cc.version().version_compare('>=18.1.0'))) +# determine whether we can detect Zbc extension (this wasn't possible until +# Linux kernel v6.8) +if (cc.compiles('''#include + int a = RISCV_HWPROBE_EXT_ZBC;''', args: machine_args)) + message('Compiling with the Zbc extension') + machine_args += ['-DRTE_RISCV_FEATURE_ZBC'] +else + warning('Detected Zbc extension but cannot use because runtime detection doesn\'t support it (support present in Linux kernel v6.8+)') +endif + else +warning('Detected Zbc extension but cannot use because intrinsics are not available (present in GCC 14.1.0+ and Clang 18.1.0+)') + endif +endif + # apply flags foreach flag: dpdk_flags if flag.length() > 0 diff --git a/lib/eal/riscv/include/rte_cpuflags.h b/lib/eal/riscv/include/rte_cpuflags.h index d742efc40f..4e26b584b3 100644 --- a/lib/eal/riscv/include/rte_cpuflags.h +++ b/lib/eal/riscv/include/rte_cpuflags.h @@ -42,6 +42,8 @@ enum rte_cpu_flag_t { RTE_CPUFLAG_RISCV_ISA_X, /* Non-standard extension present */ RTE_CPUFLAG_RISCV_ISA_Y, /* Reserved */ RTE_CPUFLAG_RISCV_ISA_Z, /* Reserved */ + + RTE_CPUFLAG_RISCV_EXT_ZBC, /* Carry-less multiplication */ }; #include "generic/rte_cpuflags.h" diff --git a/lib/eal/riscv/rte_cpuflags.c b/lib/eal/riscv/rte_cpuflags.c index eb4105c18b..dedf0395ab 100644 --- a/lib/eal/riscv/rte_cpuflags.c +++ b/lib/eal/riscv/rte_cpuflags.c @@ -11,6 +11,15 @@ #include #include #include +#include + +/* + * when hardware probing is not possible, we assume all extensions are missing + * at runtime + */ +#ifdef RTE_RISCV_FEATURE_HWPROBE +#include +#endif
[PATCH v3 2/9] hash: implement CRC using riscv carryless multiply
Using carryless multiply instructions from RISC-V's Zbc extension, implement a Barrett reduction that calculates CRC-32C checksums. Based on the approach described by Intel's whitepaper on "Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction", which is also described here (https://web.archive.org/web/20240111232520/https://mary.rs/lab/crc32/) Add a case to the autotest_hash unit test. Signed-off-by: Daniel Gregory --- MAINTAINERS| 1 + app/test/test_hash.c | 7 +++ lib/hash/meson.build | 1 + lib/hash/rte_crc_riscv64.h | 89 ++ lib/hash/rte_hash_crc.c| 13 +- lib/hash/rte_hash_crc.h| 6 ++- 6 files changed, 115 insertions(+), 2 deletions(-) create mode 100644 lib/hash/rte_crc_riscv64.h diff --git a/MAINTAINERS b/MAINTAINERS index c5a703b5c0..fa081552c7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -322,6 +322,7 @@ M: Stanislaw Kardach F: config/riscv/ F: doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst F: lib/eal/riscv/ +F: lib/hash/rte_crc_riscv64.h Intel x86 M: Bruce Richardson diff --git a/app/test/test_hash.c b/app/test/test_hash.c index 65b9cad93c..dd491ea4d9 100644 --- a/app/test/test_hash.c +++ b/app/test/test_hash.c @@ -231,6 +231,13 @@ test_crc32_hash_alg_equiv(void) printf("Failed checking CRC32_SW against CRC32_ARM64\n"); break; } + + /* Check against 8-byte-operand RISCV64 CRC32 if available */ + rte_hash_crc_set_alg(CRC32_RISCV64); + if (hash_val != rte_hash_crc(data64, data_len, init_val)) { + printf("Failed checking CRC32_SW against CRC32_RISC64\n"); + break; + } } /* Resetting to best available algorithm */ diff --git a/lib/hash/meson.build b/lib/hash/meson.build index 277eb9fa93..8355869a80 100644 --- a/lib/hash/meson.build +++ b/lib/hash/meson.build @@ -12,6 +12,7 @@ headers = files( indirect_headers += files( 'rte_crc_arm64.h', 'rte_crc_generic.h', +'rte_crc_riscv64.h', 'rte_crc_sw.h', 'rte_crc_x86.h', 'rte_thash_x86_gfni.h', diff --git a/lib/hash/rte_crc_riscv64.h b/lib/hash/rte_crc_riscv64.h new file mode 100644 index 00..94f6857c69 --- /dev/null +++ b/lib/hash/rte_crc_riscv64.h @@ -0,0 +1,89 @@ +/* SPDX-License_Identifier: BSD-3-Clause + * Copyright(c) ByteDance 2024 + */ + +#include +#include + +#include + +#ifndef _RTE_CRC_RISCV64_H_ +#define _RTE_CRC_RISCV64_H_ + +/* + * CRC-32C takes a reflected input (bit 7 is the lsb) and produces a reflected + * output. As reflecting the value we're checksumming is expensive, we instead + * reflect the polynomial P (0x11EDC6F41) and mu and our CRC32 algorithm. + * + * The mu constant is used for a Barrett reduction. It's 2^96 / P (0x11F91CAF6) + * reflected. Picking 2^96 rather than 2^64 means we can calculate a 64-bit crc + * using only two multiplications (https://mary.rs/lab/crc32/) + */ +static const uint64_t p = 0x105EC76F1; +static const uint64_t mu = 0x4869EC38DEA713F1UL; + +/* Calculate the CRC32C checksum using a Barrett reduction */ +static inline uint32_t +crc32c_riscv64(uint64_t data, uint32_t init_val, uint32_t bits) +{ + assert((bits == 64) || (bits == 32) || (bits == 16) || (bits == 8)); + + /* Combine data with the initial value */ + uint64_t crc = (uint64_t)(data ^ init_val) << (64 - bits); + + /* +* Multiply by mu, which is 2^96 / P. 
Division by 2^96 occurs by taking +* the lower 64 bits of the result (remember we're inverted) +*/ + crc = __riscv_clmul_64(crc, mu); + /* Multiply by P */ + crc = __riscv_clmulh_64(crc, p); + + /* Subtract from original (only needed for smaller sizes) */ + if (bits == 16 || bits == 8) + crc ^= init_val >> bits; + + return crc; +} + +/* + * Use carryless multiply to perform hash on a value, falling back on the + * software in case the Zbc extension is not supported + */ +static inline uint32_t +rte_hash_crc_1byte(uint8_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 8); + + return crc32c_1byte(data, init_val); +} + +static inline uint32_t +rte_hash_crc_2byte(uint16_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 16); + + return crc32c_2bytes(data, init_val); +} + +static inline uint32_t +rte_hash_crc_4byte(uint32_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) + return crc32c_riscv64(data, init_val, 32); + + return crc32c_1word(data, init_val); +} + +static inline uint32_t +rte_hash_crc_8byte(uint64_t data, uint32_t init_val) +{ + if (likely(rte_hash_crc32_alg & CRC32_RISCV64)) +
[PATCH v3 3/9] net: implement CRC using riscv carryless multiply
Using carryless multiply instructions (clmul) from RISC-V's Zbc extension, implement CRC-32 and CRC-16 calculations on buffers. Based on the approach described in Intel's whitepaper on "Fast CRC Computation for Generic Polynomails Using PCLMULQDQ Instructions", we perform repeated folds-by-1 whilst the buffer is still big enough, then perform Barrett's reductions on the rest. Add a case to the crc_autotest suite that tests this implementation. Signed-off-by: Daniel Gregory --- MAINTAINERS | 1 + app/test/test_crc.c | 9 ++ lib/net/meson.build | 4 + lib/net/net_crc.h | 11 +++ lib/net/net_crc_zbc.c | 191 ++ lib/net/rte_net_crc.c | 40 + lib/net/rte_net_crc.h | 2 + 7 files changed, 258 insertions(+) create mode 100644 lib/net/net_crc_zbc.c diff --git a/MAINTAINERS b/MAINTAINERS index fa081552c7..eeaa2c645e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -323,6 +323,7 @@ F: config/riscv/ F: doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst F: lib/eal/riscv/ F: lib/hash/rte_crc_riscv64.h +F: lib/net/net_crc_zbc.c Intel x86 M: Bruce Richardson diff --git a/app/test/test_crc.c b/app/test/test_crc.c index b85fca35fe..fa91557cf5 100644 --- a/app/test/test_crc.c +++ b/app/test/test_crc.c @@ -168,6 +168,15 @@ test_crc(void) return ret; } + /* set CRC riscv mode */ + rte_net_crc_set_alg(RTE_NET_CRC_ZBC); + + ret = test_crc_calc(); + if (ret < 0) { + printf("test crc (riscv64 zbc clmul): failed (%d)\n", ret); + return ret; + } + return 0; } diff --git a/lib/net/meson.build b/lib/net/meson.build index 0b69138949..404d8dd3ae 100644 --- a/lib/net/meson.build +++ b/lib/net/meson.build @@ -125,4 +125,8 @@ elif (dpdk_conf.has('RTE_ARCH_ARM64') and cc.get_define('__ARM_FEATURE_CRYPTO', args: machine_args) != '') sources += files('net_crc_neon.c') cflags += ['-DCC_ARM64_NEON_PMULL_SUPPORT'] +elif (dpdk_conf.has('RTE_ARCH_RISCV') and +cc.get_define('RTE_RISCV_FEATURE_ZBC', args: machine_args) != '') +sources += files('net_crc_zbc.c') +cflags += ['-DCC_RISCV64_ZBC_CLMUL_SUPPORT'] endif diff --git a/lib/net/net_crc.h b/lib/net/net_crc.h index 7a74d5406c..06ae113b47 100644 --- a/lib/net/net_crc.h +++ b/lib/net/net_crc.h @@ -42,4 +42,15 @@ rte_crc16_ccitt_neon_handler(const uint8_t *data, uint32_t data_len); uint32_t rte_crc32_eth_neon_handler(const uint8_t *data, uint32_t data_len); +/* RISCV64 Zbc */ +void +rte_net_crc_zbc_init(void); + +uint32_t +rte_crc16_ccitt_zbc_handler(const uint8_t *data, uint32_t data_len); + +uint32_t +rte_crc32_eth_zbc_handler(const uint8_t *data, uint32_t data_len); + + #endif /* _NET_CRC_H_ */ diff --git a/lib/net/net_crc_zbc.c b/lib/net/net_crc_zbc.c new file mode 100644 index 00..be416ba52f --- /dev/null +++ b/lib/net/net_crc_zbc.c @@ -0,0 +1,191 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) ByteDance 2024 + */ + +#include +#include + +#include +#include + +#include "net_crc.h" + +/* CLMUL CRC computation context structure */ +struct crc_clmul_ctx { + uint64_t Pr; + uint64_t mu; + uint64_t k3; + uint64_t k4; + uint64_t k5; +}; + +struct crc_clmul_ctx crc32_eth_clmul; +struct crc_clmul_ctx crc16_ccitt_clmul; + +/* Perform Barrett's reduction on 8, 16, 32 or 64-bit value */ +static inline uint32_t +crc32_barrett_zbc( + const uint64_t data, + uint32_t crc, + uint32_t bits, + const struct crc_clmul_ctx *params) +{ + assert((bits == 64) || (bits == 32) || (bits == 16) || (bits == 8)); + + /* Combine data with the initial value */ + uint64_t temp = (uint64_t)(data ^ crc) << (64 - bits); + + /* +* Multiply by mu, which is 2^96 / P. 
Division by 2^96 occurs by taking +* the lower 64 bits of the result (remember we're inverted) +*/ + temp = __riscv_clmul_64(temp, params->mu); + /* Multiply by P */ + temp = __riscv_clmulh_64(temp, params->Pr); + + /* Subtract from original (only needed for smaller sizes) */ + if (bits == 16 || bits == 8) + temp ^= crc >> bits; + + return temp; +} + +/* Repeat Barrett's reduction for short buffer sizes */ +static inline uint32_t +crc32_repeated_barrett_zbc( + const uint8_t *data, + uint32_t data_len, + uint32_t crc, + const struct crc_clmul_ctx *params) +{ + while (data_len >= 8) { + crc = crc32_barrett_zbc(*(const uint64_t *)data, crc, 64, params); + data += 8; + data_len -= 8; + } + if (data_len >= 4) { + crc = crc32_barrett_zbc(*(const uint32_t *)data, crc, 32, params); + data += 4; + data_len -= 4; + } + if (data_len >= 2) { + crc = crc32_barrett_zbc(*(const uint16_t *)data, crc, 16, params); + data += 2
[PATCH v3 4/9] config/riscv: add qemu crossbuild target
A new cross-compilation target that has extensions that DPDK uses and QEMU supports. Initially, this is just the Zbc extension for hardware CRC support. Signed-off-by: Daniel Gregory --- config/riscv/meson.build| 3 ++- config/riscv/riscv64_qemu_linux_gcc | 17 + .../linux_gsg/cross_build_dpdk_for_riscv.rst| 5 + 3 files changed, 24 insertions(+), 1 deletion(-) create mode 100644 config/riscv/riscv64_qemu_linux_gcc diff --git a/config/riscv/meson.build b/config/riscv/meson.build index 5d8411b254..337b26bbac 100644 --- a/config/riscv/meson.build +++ b/config/riscv/meson.build @@ -43,7 +43,8 @@ vendor_generic = { ['RTE_MAX_NUMA_NODES', 2] ], 'arch_config': { -'generic': {'machine_args': ['-march=rv64gc']} +'generic': {'machine_args': ['-march=rv64gc']}, +'qemu': {'machine_args': ['-march=rv64gc_zbc']}, } } diff --git a/config/riscv/riscv64_qemu_linux_gcc b/config/riscv/riscv64_qemu_linux_gcc new file mode 100644 index 00..007cc98885 --- /dev/null +++ b/config/riscv/riscv64_qemu_linux_gcc @@ -0,0 +1,17 @@ +[binaries] +c = ['ccache', 'riscv64-linux-gnu-gcc'] +cpp = ['ccache', 'riscv64-linux-gnu-g++'] +ar = 'riscv64-linux-gnu-ar' +strip = 'riscv64-linux-gnu-strip' +pcap-config = '' + +[host_machine] +system = 'linux' +cpu_family = 'riscv64' +cpu = 'rv64gc_zbc' +endian = 'little' + +[properties] +vendor_id = 'generic' +arch_id = 'qemu' +pkg_config_libdir = '/usr/lib/riscv64-linux-gnu/pkgconfig' diff --git a/doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst b/doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst index 7d7f7ac72b..c3b67671a0 100644 --- a/doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst +++ b/doc/guides/linux_gsg/cross_build_dpdk_for_riscv.rst @@ -110,6 +110,11 @@ Currently the following targets are supported: * SiFive U740 SoC: ``config/riscv/riscv64_sifive_u740_linux_gcc`` +* QEMU: ``config/riscv/riscv64_qemu_linux_gcc`` + + * A target with all the extensions that QEMU supports that DPDK has a use for +(currently ``rv64gc_zbc``). Requires QEMU version 7.0.0 or newer. + To add a new target support, ``config/riscv/meson.build`` has to be modified by adding a new vendor/architecture id and a corresponding cross-file has to be added to ``config/riscv`` directory. -- 2.39.2
[PATCH v3 5/9] examples/l3fwd: use accelerated CRC on riscv
When the RISC-V Zbc (carryless multiplication) extension is present, an implementation of CRC hashing using hardware instructions is available. Use it rather than jhash. Signed-off-by: Daniel Gregory --- examples/l3fwd/l3fwd_em.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c index 31a7e05e39..36520401e5 100644 --- a/examples/l3fwd/l3fwd_em.c +++ b/examples/l3fwd/l3fwd_em.c @@ -29,7 +29,7 @@ #include "l3fwd_event.h" #include "em_route_parse.c" -#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32) +#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32) || defined(RTE_RISCV_FEATURE_ZBC) #define EM_HASH_CRC 1 #endif -- 2.39.2
[PATCH v3 6/9] ipfrag: use accelerated CRC on riscv
When the RISC-V Zbc (carryless multiplication) extension is present, an implementation of CRC hashing using hardware instructions is available. Use it rather than jhash. Signed-off-by: Daniel Gregory --- lib/ip_frag/ip_frag_internal.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/ip_frag/ip_frag_internal.c b/lib/ip_frag/ip_frag_internal.c index 7cbef647df..19a28c447b 100644 --- a/lib/ip_frag/ip_frag_internal.c +++ b/lib/ip_frag/ip_frag_internal.c @@ -45,14 +45,14 @@ ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2) p = (const uint32_t *)&key->src_dst; -#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64) +#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64) || defined(RTE_RISCV_FEATURE_ZBC) v = rte_hash_crc_4byte(p[0], PRIME_VALUE); v = rte_hash_crc_4byte(p[1], v); v = rte_hash_crc_4byte(key->id, v); #else v = rte_jhash_3words(p[0], p[1], key->id, PRIME_VALUE); -#endif /* RTE_ARCH_X86 */ +#endif /* RTE_ARCH_X86 || RTE_ARCH_ARM64 || RTE_RISCV_FEATURE_ZBC */ *v1 = v; *v2 = (v << 7) + (v >> 14); @@ -66,7 +66,7 @@ ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2) p = (const uint32_t *) &key->src_dst; -#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64) +#if defined(RTE_ARCH_X86) || defined(RTE_ARCH_ARM64) || defined(RTE_RISCV_FEATURE_ZBC) v = rte_hash_crc_4byte(p[0], PRIME_VALUE); v = rte_hash_crc_4byte(p[1], v); v = rte_hash_crc_4byte(p[2], v); -- 2.39.2
[PATCH v3 8/9] hash/cuckoo: use accelerated CRC on riscv
When the RISC-V Zbc (carryless multiplication) extension is present, an implementation of CRC hashing using hardware instructions is available. Use it rather than jhash. Signed-off-by: Daniel Gregory --- lib/hash/rte_cuckoo_hash.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c index 577b5839d3..872f88fdce 100644 --- a/lib/hash/rte_cuckoo_hash.c +++ b/lib/hash/rte_cuckoo_hash.c @@ -427,6 +427,9 @@ rte_hash_create(const struct rte_hash_parameters *params) #elif defined(RTE_ARCH_ARM64) if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_CRC32)) default_hash_func = (rte_hash_function)rte_hash_crc; +#elif defined(RTE_ARCH_RISCV) + if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_RISCV_EXT_ZBC)) + default_hash_func = (rte_hash_function)rte_hash_crc; #endif /* Setup hash context */ strlcpy(h->name, params->name, sizeof(h->name)); -- 2.39.2
[PATCH v3 7/9] examples/l3fwd-power: use accelerated CRC on riscv
When the RISC-V Zbc (carryless multiplication) extension is present, an implementation of CRC hashing using hardware instructions is available. Use it rather than jhash. Signed-off-by: Daniel Gregory --- examples/l3fwd-power/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c index 2bb6b092c3..c631c14193 100644 --- a/examples/l3fwd-power/main.c +++ b/examples/l3fwd-power/main.c @@ -270,7 +270,7 @@ static struct rte_mempool * pktmbuf_pool[NB_SOCKETS]; #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH) -#ifdef RTE_ARCH_X86 +#if defined(RTE_ARCH_X86) || defined(RTE_RISCV_FEATURE_ZBC) #include #define DEFAULT_HASH_FUNC rte_hash_crc #else -- 2.39.2
[PATCH v3 9/9] member: use accelerated CRC on riscv
When the RISC-V Zbc (carryless multiplication) extension is present, an implementation of CRC hashing using hardware instructions is available. Use it rather than jhash. Signed-off-by: Daniel Gregory --- lib/member/rte_member.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h index aec192eba5..152659628a 100644 --- a/lib/member/rte_member.h +++ b/lib/member/rte_member.h @@ -92,7 +92,7 @@ typedef uint16_t member_set_t; #define RTE_MEMBER_SKETCH_COUNT_BYTE 0x02 /** @internal Hash function used by membership library. */ -#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32) +#if defined(RTE_ARCH_X86) || defined(__ARM_FEATURE_CRC32) || defined(RTE_RISCV_FEATURE_ZBC) #include #define MEMBER_HASH_FUNC rte_hash_crc #else -- 2.39.2
Re: [PATCH v3 01/12] dts: fix default device error handling mode
On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš wrote: > The device_error_handling_mode of testpmd port may not be present, e.g. > in VM ports. > > Fixes: 61d5bc9bf974 ("dts: add port info command to testpmd shell") > > Signed-off-by: Juraj Linkeš > Reviewed-by: Dean Marx
Re: [PATCH v3 02/12] dts: add the aenum dependency
On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš wrote: > Regular Python enumerations create only one instance for members with > the same value, such as: > class MyEnum(Enum): > foo = 1 > bar = 1 > > MyEnum.foo and MyEnum.bar are aliases that return the same instance. > > DTS needs to return different instances in the above scenario so that we > can map capabilities with different names to the same function that > retrieves the capabilities. > > Signed-off-by: Juraj Linkeš > Reviewed-by: Dean Marx
Re: [PATCH v3 08/12] dts: add NIC capability support
On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš wrote: > diff --git a/dts/framework/testbed_model/capability.py > b/dts/framework/testbed_model/capability.py > index 8899f07f76..9a79e6ebb3 100644 > --- a/dts/framework/testbed_model/capability.py > +++ b/dts/framework/testbed_model/capability.py > @@ -5,14 +5,40 @@ > +@classmethod > +def get_supported_capabilities( > +cls, sut_node: SutNode, topology: "Topology" > +) -> set["DecoratedNicCapability"]: > +"""Overrides :meth:`~Capability.get_supported_capabilities`. > + > +The capabilities are first sorted by decorators, then reduced into a > single function which > +is then passed to the decorator. This way we only execute each > decorator only once. > +""" > +supported_conditional_capabilities: set["DecoratedNicCapability"] = > set() > +logger = get_dts_logger(f"{sut_node.name}.{cls.__name__}") > +if topology.type is Topology.type.no_link: As a follow-up, I didn't notice this during my initial review, but in testing this line was throwing attribute errors for me due to Topology not having an attribute named `type`. I think this was because of `Topology.type.no_link` since this attribute isn't initialized on the class itself. I fixed this by just replacing it with `TopologyType.no_link` locally. > +logger.debug( > +"No links available in the current topology, not getting NIC > capabilities." > +) > +return supported_conditional_capabilities > +logger.debug( > +f"Checking which NIC capabilities from > {cls.capabilities_to_check} are supported." > +) > +if cls.capabilities_to_check: > +capabilities_to_check_map = cls._get_decorated_capabilities_map() > +with TestPmdShell(sut_node, privileged=True) as testpmd_shell: > +for conditional_capability_fn, capabilities in > capabilities_to_check_map.items(): > +supported_capabilities: set[NicCapability] = set() >
[PATCH v8 0/1] dts: add second scatter test case
From: Jeremy Spewock v8: * update test suite to use newly submitted capabilities series * split the MTU update patch into its own series. * now that the --max-pkt-len bug is fixed on mlx in 24.07, no longer need to set MTU directly so this is also removed. Jeremy Spewock (1): dts: add test case that utilizes offload to pmd_buffer_scatter dts/tests/TestSuite_pmd_buffer_scatter.py | 47 +++ 1 file changed, 31 insertions(+), 16 deletions(-) -- 2.46.0
[PATCH v8 1/1] dts: add test case that utilizes offload to pmd_buffer_scatter
From: Jeremy Spewock Some NICs tested in DPDK allow for the scattering of packets without an offload and others enforce that you enable the scattered_rx offload in testpmd. The current version of the suite for testing support of scattering packets only tests the case where the NIC supports testing without the offload, so an expansion of coverage is needed to cover the second case as well. depends-on: series-32799 ("dts: add test skipping based on capabilities") Signed-off-by: Jeremy Spewock --- dts/tests/TestSuite_pmd_buffer_scatter.py | 47 +++ 1 file changed, 31 insertions(+), 16 deletions(-) diff --git a/dts/tests/TestSuite_pmd_buffer_scatter.py b/dts/tests/TestSuite_pmd_buffer_scatter.py index 64c48b0793..6704c04325 100644 --- a/dts/tests/TestSuite_pmd_buffer_scatter.py +++ b/dts/tests/TestSuite_pmd_buffer_scatter.py @@ -19,7 +19,7 @@ from scapy.layers.inet import IP # type: ignore[import-untyped] from scapy.layers.l2 import Ether # type: ignore[import-untyped] -from scapy.packet import Raw # type: ignore[import-untyped] +from scapy.packet import Packet, Raw # type: ignore[import-untyped] from scapy.utils import hexstr # type: ignore[import-untyped] from framework.params.testpmd import SimpleForwardingModes @@ -55,25 +55,25 @@ def set_up_suite(self) -> None: """Set up the test suite. Setup: -Increase the MTU of both ports on the traffic generator to 9000 -to support larger packet sizes. +The traffic generator needs to send and receive packets that are, at most, as large as +the mbuf size of the ports + 5 in each test case, so 9000 should more than suffice. """ self.tg_node.main_session.configure_port_mtu(9000, self._tg_port_egress) self.tg_node.main_session.configure_port_mtu(9000, self._tg_port_ingress) -def scatter_pktgen_send_packet(self, pktsize: int) -> str: +def scatter_pktgen_send_packet(self, pktsize: int) -> list[Packet]: """Generate and send a packet to the SUT then capture what is forwarded back. Generate an IP packet of a specific length and send it to the SUT, -then capture the resulting received packet and extract its payload. -The desired length of the packet is met by packing its payload +then capture the resulting received packets and filter them down to the ones that have the +correct layers. The desired length of the packet is met by packing its payload with the letter "X" in hexadecimal. Args: pktsize: Size of the packet to generate and send. Returns: -The payload of the received packet as a string. +The filtered down list of received packets. """ packet = Ether() / IP() / Raw() packet.getlayer(2).load = "" @@ -83,20 +83,27 @@ def scatter_pktgen_send_packet(self, pktsize: int) -> str: for X_in_hex in payload: packet.load += struct.pack("=B", int("%s%s" % (X_in_hex[0], X_in_hex[1]), 16)) received_packets = self.send_packet_and_capture(packet) +# filter down the list to packets that have the appropriate structure +received_packets = list( +filter(lambda p: Ether in p and IP in p and Raw in p, received_packets) +) self.verify(len(received_packets) > 0, "Did not receive any packets.") -load = hexstr(received_packets[0].getlayer(2), onlyhex=1) -return load +return received_packets -def pmd_scatter(self, mbsize: int) -> None: +def pmd_scatter(self, mbsize: int, enable_offload: bool = False) -> None: """Testpmd support of receiving and sending scattered multi-segment packets. Support for scattered packets is shown by sending 5 packets of differing length where the length of the packet is calculated by taking mbuf-size + an offset. 
The offsets used in the test are -1, 0, 1, 4, 5 respectively. +Args: +mbsize: Size to set memory buffers to when starting testpmd. +enable_offload: Whether or not to offload the scattering functionality in testpmd. + Test: -Start testpmd and run functional test with preset mbsize. +Start testpmd and run functional test with preset `mbsize`. """ with TestPmdShell( self.sut_node, @@ -105,16 +112,19 @@ def pmd_scatter(self, mbsize: int) -> None: mbuf_size=[mbsize], max_pkt_len=9000, tx_offloads=0x8000, +enable_scatter=True if enable_offload else None, ) as testpmd: testpmd.start() for offset in [-1, 0, 1, 4, 5]: -recv_payload = self.scatter_pktgen_send_packet(mbsize + offset) -self._logger.debug( -f"Payload of scattered packet after forwarding: \n{recv_payload}" -)
Re: [RFC 1/2] eal: add llc aware functions
On Tue, 27 Aug 2024 20:40:13 +0530 Vipin Varghese wrote: > + "ls -d /sys/bus/cpu/devices/cpu%u/cache/index[0-9] | sort -r > | grep -m1 index[0-9] | awk -F '[x]' '{print $2}' " NAK. Running shell commands from EAL is non-portable and likely to be flagged by security scanning tools. Do it in C, please.
Re: [PATCH v2 1/3] app/testpmd: add register keyword
On Tue, 27 Aug 2024 21:02:00 +0530 "Varghese, Vipin" wrote: > On 8/21/2024 8:25 PM, Stephen Hemminger wrote: > > Caution: This message originated from an External Source. Use proper > > caution when opening attachments, clicking links, or responding. > > > > > > On Wed, 21 Aug 2024 20:08:55 +0530 > > Vipin Varghese wrote: > > > >> diff --git a/app/test-pmd/macswap_sse.h b/app/test-pmd/macswap_sse.h > >> index 223f87a539..29088843b7 100644 > >> --- a/app/test-pmd/macswap_sse.h > >> +++ b/app/test-pmd/macswap_sse.h > >> @@ -16,13 +16,13 @@ do_macswap(struct rte_mbuf *pkts[], uint16_t nb, > >>uint64_t ol_flags; > >>int i; > >>int r; > >> - __m128i addr0, addr1, addr2, addr3; > >> + register __m128i addr0, addr1, addr2, addr3; > > Some compilers treat register as a no-op. Are you sure? Did you check with > > godbolt. > > Thank you Stephen, I have tested the code changes on Linux using GCC and > Clang compiler. > > In both cases in Linux environment, we have seen the the values loaded > onto register `xmm`. > > ``` > registerconst__m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2, > 1, 0, 11, 10, 9, 8, 7, 6); > vmovdqaxmm0, xmmwordptr[rip+ .LCPI0_0] > > ``` > > Both cases we have performance improvement. > > > Can you please help us understand if we have missed out something? Ok, not sure why compiler would not decide to already use a register here?
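A quick way to check is a minimal snippet on godbolt. The function below is a hypothetical reduction of the macswap mask setup (not code from the patch); build it once with and once without the `register` keyword and compare the generated assembly, keeping in mind that ISO C treats `register` only as a hint that compilers are free to ignore.

```
#include <immintrin.h>

/* Hypothetical reduction for comparing codegen with and without 'register'.
 * Build: gcc -O3 -mssse3 -S macswap_check.c (and again with 'register' removed). */
__m128i
swap_mac_addrs(__m128i addr)
{
	register const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12, 5, 4, 3, 2,
						       1, 0, 11, 10, 9, 8, 7, 6);
	return _mm_shuffle_epi8(addr, shfl_msk);
}
```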
RE: [PATCH v8 2/3] eventdev: add support for independent enqueue
Hi Mattias, I will update the patch tomorrow with the updated suggestion from Pravin. If I don’t hear from you, I will assume you are okay with it?
Re: [PATCH v3 01/12] dts: fix default device error handling mode
On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš wrote: > > The device_error_handling_mode of testpmd port may not be present, e.g. > in VM ports. > > Fixes: 61d5bc9bf974 ("dts: add port info command to testpmd shell") > > Signed-off-by: Juraj Linkeš > --- Reviewed-by: Nicholas Pratte
Re: [PATCH v3 02/12] dts: add the aenum dependency
On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš wrote: > > Regular Python enumerations create only one instance for members with > the same value, such as: > class MyEnum(Enum): > foo = 1 > bar = 1 > > MyEnum.foo and MyEnum.bar are aliases that return the same instance. > > DTS needs to return different instances in the above scenario so that we > can map capabilities with different names to the same function that > retrieves the capabilities. > > Signed-off-by: Juraj Linkeš Reviewed-by: Nicholas Pratte
RE: [RFC 1/2] eal: add llc aware functions
> -unsigned int rte_get_next_lcore(unsigned int i, int skip_main, int wrap) > +#define LCORE_GET_LLC \ > + "ls -d /sys/bus/cpu/devices/cpu%u/cache/index[0-9] | sort -r > | grep -m1 index[0-9] | awk -F '[x]' '{print $2}' " > This won't work for some SOCs. How do you ensure the index you got is for an LLC? Some SOCs may only show upper-level caches here, so it cannot be used blindly without knowing the SOC. Also, it is unacceptable to execute a shell script; consider implementing this in C. --wathsala
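As a rough illustration of both points (pure C, and not assuming a fixed index is the LLC), something along these lines could walk the sysfs cache directories and pick the index with the highest reported 'level'; the helper name and error handling are made up for the sketch:

```
#include <stdio.h>
#include <limits.h>

/* Hypothetical helper: return the cache/indexN number with the highest
 * 'level' attribute for a CPU, i.e. the last-level cache as the SoC
 * actually reports it, without spawning a shell. Returns -1 on failure. */
static int
cpu_find_llc_index(unsigned int cpu)
{
	int idx, level, best_idx = -1, best_level = -1;
	char path[PATH_MAX];
	FILE *f;

	for (idx = 0; ; idx++) {
		snprintf(path, sizeof(path),
			"/sys/bus/cpu/devices/cpu%u/cache/index%d/level",
			cpu, idx);
		f = fopen(path, "r");
		if (f == NULL)
			break;
		if (fscanf(f, "%d", &level) == 1 && level > best_level) {
			best_level = level;
			best_idx = idx;
		}
		fclose(f);
	}
	return best_idx;
}
```

The same directory also exposes 'type' and 'shared_cpu_list', so instruction caches can be skipped and the set of lcores sharing that cache derived without parsing 'ls' output.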
Re: [RFC 0/2] introduce LLC aware functions
On 2024-08-27 17:10, Vipin Varghese wrote: As core density continues to increase, chiplet-based core packing has become a key trend. In AMD SoC EPYC architectures, core complexes within the same chiplet share a Last-Level Cache (LLC). By packing logical cores within the same LLC, we can enhance pipeline processing stages due to reduced latency and improved data locality. To leverage these benefits, DPDK libraries and examples can utilize localized lcores. This approach ensures more consistent latencies by minimizing the dispersion of lcores across different chiplet complexes and enhances packet processing by ensuring that data for subsequent pipeline stages is likely to reside within the LLC. Shouldn't we have a separate CPU/cache hierarchy API instead? Could potentially be built on the 'hwloc' library. I much agree that cache/core topology may be of interest to the application (or a work scheduler, like a DPDK event device), but it's not limited to the LLC. It may well be worthwhile to care about which cores share an L2 cache, for example. Not sure the RTE_LCORE_FOREACH_* approach scales. < Function: Purpose > - rte_get_llc_first_lcores: Retrieves all the first lcores in the shared LLC. - rte_get_llc_lcore: Retrieves all lcores that share the LLC. - rte_get_llc_n_lcore: Retrieves the first n or skips the first n lcores in the shared LLC. < MACRO: Purpose > - RTE_LCORE_FOREACH_LLC_FIRST: iterates through all first lcores from each LLC. RTE_LCORE_FOREACH_LLC_FIRST_WORKER: iterates through all first worker lcores from each LLC. RTE_LCORE_FOREACH_LLC_WORKER: iterates lcores from an LLC based on a hint (lcore id). RTE_LCORE_FOREACH_LLC_SKIP_FIRST_WORKER: iterates lcores from an LLC while skipping the first worker. RTE_LCORE_FOREACH_LLC_FIRST_N_WORKER: iterates through `n` lcores from each LLC. RTE_LCORE_FOREACH_LLC_SKIP_N_WORKER: skips the first `n` lcores, then iterates through the remaining lcores in each LLC. Vipin Varghese (2): eal: add llc aware functions eal/lcore: add llc aware for each macro lib/eal/common/eal_common_lcore.c | 279 -- lib/eal/include/rte_lcore.h | 89 ++ 2 files changed, 356 insertions(+), 12 deletions(-)
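For reference, a rough sketch of what an hwloc-based lookup could look like (assuming hwloc 2.x and its public API; this is not part of the RFC), grouping processing units by the L3 they sit under and trivially adaptable to L2:

```
#include <stdio.h>
#include <hwloc.h>

/* Print the cpuset of processing units sharing each L3 cache.
 * Swap HWLOC_OBJ_L3CACHE for HWLOC_OBJ_L2CACHE to group by L2 instead. */
int
main(void)
{
	hwloc_topology_t topo;
	char cpuset_str[256];
	int i, n;

	hwloc_topology_init(&topo);
	hwloc_topology_load(topo);

	n = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_L3CACHE);
	for (i = 0; i < n; i++) {
		hwloc_obj_t l3 = hwloc_get_obj_by_type(topo, HWLOC_OBJ_L3CACHE, i);

		hwloc_bitmap_snprintf(cpuset_str, sizeof(cpuset_str), l3->cpuset);
		printf("L3 #%d: PUs %s\n", i, cpuset_str);
	}
	hwloc_topology_destroy(topo);
	return 0;
}
```

Built with the flags reported by pkg-config for hwloc, this enumerates the sharing groups without hard-coding any assumption about which cache level is "last".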
[PATCH v2 0/1] bbdev: removing unnecessary symbols
v2: Actually, several functions can be removed from the bbdev version map since they are inline and hence ABI versioning is not relevant. I checked with other libs (cryptodev/ethdev) and the same guideline is followed, with inline functions not part of version.map. Similarly, the script run as part of CI/CD does not enforce versioning for inline functions either. Discussed a bit with Maxime off list. Any thoughts? Good to clean it up now. v1: A few functions were somehow missing for the last few years in the version map file. Nicolas Chautru (1): bbdev: removing unnecessary symbols from version map lib/bbdev/rte_bbdev.h| 1 - lib/bbdev/rte_bbdev_op.h | 2 -- lib/bbdev/version.map| 24 +--- 3 files changed, 1 insertion(+), 26 deletions(-) -- 2.34.1
[PATCH v2 1/1] bbdev: removing unnecessary symbols from version map
A number of inline functions should not be in the version map since ABI versionning would be irrelevant. Signed-off-by: Nicolas Chautru --- lib/bbdev/rte_bbdev.h| 1 - lib/bbdev/rte_bbdev_op.h | 2 -- lib/bbdev/version.map| 24 +--- 3 files changed, 1 insertion(+), 26 deletions(-) diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index 25514c58ac..bd49a0a304 100644 --- a/lib/bbdev/rte_bbdev.h +++ b/lib/bbdev/rte_bbdev.h @@ -897,7 +897,6 @@ rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id, * The number of operations actually dequeued (this is the number of entries * copied into the @p ops array). */ -__rte_experimental static inline uint16_t rte_bbdev_dequeue_mldts_ops(uint16_t dev_id, uint16_t queue_id, struct rte_bbdev_mldts_op **ops, uint16_t num_ops) diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index 459631d0d0..5b862c13a6 100644 --- a/lib/bbdev/rte_bbdev_op.h +++ b/lib/bbdev/rte_bbdev_op.h @@ -1159,7 +1159,6 @@ rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool, * - 0 on success. * - EINVAL if invalid mempool is provided. */ -__rte_experimental static inline int rte_bbdev_mldts_op_alloc_bulk(struct rte_mempool *mempool, struct rte_bbdev_mldts_op **ops, uint16_t num_ops) @@ -1236,7 +1235,6 @@ rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops) * @param num_ops * Number of structures */ -__rte_experimental static inline void rte_bbdev_mldts_op_free_bulk(struct rte_bbdev_mldts_op **ops, unsigned int num_ops) { diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index e0d82ff752..2a5baacd90 100644 --- a/lib/bbdev/version.map +++ b/lib/bbdev/version.map @@ -6,21 +6,9 @@ DPDK_25 { rte_bbdev_callback_unregister; rte_bbdev_close; rte_bbdev_count; - rte_bbdev_dec_op_alloc_bulk; - rte_bbdev_dec_op_free_bulk; - rte_bbdev_dequeue_dec_ops; - rte_bbdev_dequeue_enc_ops; - rte_bbdev_dequeue_fft_ops; rte_bbdev_device_status_str; rte_bbdev_devices; - rte_bbdev_enc_op_alloc_bulk; - rte_bbdev_enc_op_free_bulk; - rte_bbdev_enqueue_dec_ops; - rte_bbdev_enqueue_enc_ops; - rte_bbdev_enqueue_fft_ops; rte_bbdev_enqueue_status_str; - rte_bbdev_fft_op_alloc_bulk; - rte_bbdev_fft_op_free_bulk; rte_bbdev_find_next; rte_bbdev_get_named_dev; rte_bbdev_info_get; @@ -44,14 +32,4 @@ DPDK_25 { rte_bbdev_stop; local: *; -}; - -EXPERIMENTAL { - global: - - # added in 23.11 - rte_bbdev_dequeue_mldts_ops; - rte_bbdev_enqueue_mldts_ops; - rte_bbdev_mldts_op_alloc_bulk; - rte_bbdev_mldts_op_free_bulk; -}; +}; \ No newline at end of file -- 2.34.1
[PATCH v3 1/1] bbdev: removing unnecessary symbols from version map
A number of inline functions should not be in the version map since ABI versioning would be irrelevant. Signed-off-by: Nicolas Chautru --- lib/bbdev/rte_bbdev.h| 1 - lib/bbdev/rte_bbdev_op.h | 2 -- lib/bbdev/version.map| 24 +--- 3 files changed, 1 insertion(+), 26 deletions(-) diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index 25514c58ac..bd49a0a304 100644 --- a/lib/bbdev/rte_bbdev.h +++ b/lib/bbdev/rte_bbdev.h @@ -897,7 +897,6 @@ rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id, * The number of operations actually dequeued (this is the number of entries * copied into the @p ops array). */ -__rte_experimental static inline uint16_t rte_bbdev_dequeue_mldts_ops(uint16_t dev_id, uint16_t queue_id, struct rte_bbdev_mldts_op **ops, uint16_t num_ops) diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index 459631d0d0..5b862c13a6 100644 --- a/lib/bbdev/rte_bbdev_op.h +++ b/lib/bbdev/rte_bbdev_op.h @@ -1159,7 +1159,6 @@ rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool, * - 0 on success. * - EINVAL if invalid mempool is provided. */ -__rte_experimental static inline int rte_bbdev_mldts_op_alloc_bulk(struct rte_mempool *mempool, struct rte_bbdev_mldts_op **ops, uint16_t num_ops) @@ -1236,7 +1235,6 @@ rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops) * @param num_ops * Number of structures */ -__rte_experimental static inline void rte_bbdev_mldts_op_free_bulk(struct rte_bbdev_mldts_op **ops, unsigned int num_ops) { diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index e0d82ff752..2a5baacd90 100644 --- a/lib/bbdev/version.map +++ b/lib/bbdev/version.map @@ -6,21 +6,9 @@ DPDK_25 { rte_bbdev_callback_unregister; rte_bbdev_close; rte_bbdev_count; - rte_bbdev_dec_op_alloc_bulk; - rte_bbdev_dec_op_free_bulk; - rte_bbdev_dequeue_dec_ops; - rte_bbdev_dequeue_enc_ops; - rte_bbdev_dequeue_fft_ops; rte_bbdev_device_status_str; rte_bbdev_devices; - rte_bbdev_enc_op_alloc_bulk; - rte_bbdev_enc_op_free_bulk; - rte_bbdev_enqueue_dec_ops; - rte_bbdev_enqueue_enc_ops; - rte_bbdev_enqueue_fft_ops; rte_bbdev_enqueue_status_str; - rte_bbdev_fft_op_alloc_bulk; - rte_bbdev_fft_op_free_bulk; rte_bbdev_find_next; rte_bbdev_get_named_dev; rte_bbdev_info_get; @@ -44,14 +32,4 @@ DPDK_25 { rte_bbdev_stop; local: *; -}; - -EXPERIMENTAL { - global: - - # added in 23.11 - rte_bbdev_dequeue_mldts_ops; - rte_bbdev_enqueue_mldts_ops; - rte_bbdev_mldts_op_alloc_bulk; - rte_bbdev_mldts_op_free_bulk; -}; +}; \ No newline at end of file -- 2.34.1
[PATCH v3 0/1] bbdev: removing unnecessary symbols
v3: typo fixes. v2: Actually, several functions can be removed from the bbdev version map since they are inline and hence ABI versioning is not relevant. I checked with other libs (cryptodev/ethdev) and the same guideline is followed, with inline functions not part of version.map. Similarly, the script run as part of CI/CD does not enforce versioning for inline functions either. Discussed a bit with Maxime off list. Any thoughts? Good to clean it up now. v1: A few functions were somehow missing for the last few years in the version map file. Nicolas Chautru (1): bbdev: removing unnecessary symbols from version map lib/bbdev/rte_bbdev.h| 1 - lib/bbdev/rte_bbdev_op.h | 2 -- lib/bbdev/version.map| 24 +--- 3 files changed, 1 insertion(+), 26 deletions(-) -- 2.34.1
RE: [PATCH] net/mana: support building the driver on arm64
> > Hi Long, > > > > I don't see this patch in the mail list and patchwork, it can be > > because of the email address it has been sent, fyi. > > The mail was waiting in the moderation queue. > Long, please fix your send-email setup, or send with a mail address registered > to the dev@ ml. I think the patch has made it to patchwork: https://patchwork.dpdk.org/project/dpdk/patch/1724711938-3108-1-git-send-email-lon...@linuxonhyperv.com/ The sender is lon...@linuxonhyperv.com, the author is lon...@microsoft.com. I hope that is okay. Long
[PATCH] net/ice: support for more flexible loading of DDP package
The "Dynamic Device Personalization" package is loaded at initialization time by the driver, but the specific package file loaded depends upon what package file is found first by searching through a hard-coded list of firmware paths. To enable greater control over the package loading, this commit two ways to support custom DDP packages: 1. Add device option to choose a specific DDP package file to load. For example: -a 80:00.0,ddp_pkg_file=/path/to/ice-version.pkg 2. Read firmware search path from "/sys/module/firmware_class/parameters/path" like the kernel behavior. Signed-off-by: Bruce Richardson Signed-off-by: Zhichao Zeng --- doc/guides/nics/ice.rst | 12 drivers/net/ice/ice_ethdev.c | 59 drivers/net/ice/ice_ethdev.h | 2 ++ 3 files changed, 73 insertions(+) diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst index ae975d19ad..0484fafbc1 100644 --- a/doc/guides/nics/ice.rst +++ b/doc/guides/nics/ice.rst @@ -108,6 +108,18 @@ Runtime Configuration -a 80:00.0,default-mac-disable=1 +- ``DDP Package File`` + + Rather than have the driver search for the DDP package to load, + or to override what package is used, + the ``ddp_pkg_file`` option can be used to provide the path to a specific package file. + For example:: + +-a 80:00.0,ddp_pkg_file=/path/to/ice-version.pkg + + There is also support for customizing the firmware search path, will read the search path + from "/sys/module/firmware_class/parameters/path" and try to load DDP package. + - ``Protocol extraction for per queue`` Configure the RX queues to do protocol extraction into mbuf for protocol diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 304f959b7e..a1b542c1af 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -36,6 +36,7 @@ #define ICE_ONE_PPS_OUT_ARG "pps_out" #define ICE_RX_LOW_LATENCY_ARG"rx_low_latency" #define ICE_MBUF_CHECK_ARG "mbuf_check" +#define ICE_DDP_FILENAME "ddp_pkg_file" #define ICE_CYCLECOUNTER_MASK 0xULL @@ -52,6 +53,7 @@ static const char * const ice_valid_args[] = { ICE_RX_LOW_LATENCY_ARG, ICE_DEFAULT_MAC_DISABLE, ICE_MBUF_CHECK_ARG, + ICE_DDP_FILENAME, NULL }; @@ -692,6 +694,18 @@ handle_field_name_arg(__rte_unused const char *key, const char *value, return 0; } +static int +handle_ddp_filename_arg(__rte_unused const char *key, const char *value, void *name_args) +{ + const char **filename = name_args; + if (strlen(value) >= ICE_MAX_PKG_FILENAME_SIZE) { + PMD_DRV_LOG(ERR, "The DDP package filename is too long : '%s'", value); + return -1; + } + *filename = strdup(value); + return 0; +} + static void ice_check_proto_xtr_support(struct ice_hw *hw) { @@ -1873,6 +1887,20 @@ ice_load_pkg_type(struct ice_hw *hw) return package_type; } +static int ice_read_customized_path(char *pkg_file) +{ + char buf[ICE_MAX_PKG_FILENAME_SIZE]; + FILE *fp = fopen(ICE_PKG_FILE_CUSTOMIZED_PATH, "r"); + if (fp == NULL) { + PMD_INIT_LOG(ERR, "Failed to read CUSTOMIZED_PATH"); + return -EIO; + } + fscanf(fp, "%s\n", buf); + strncpy(pkg_file, buf, ICE_MAX_PKG_FILENAME_SIZE); + + return 0; +} + int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) { struct ice_hw *hw = &adapter->hw; @@ -1882,12 +1910,28 @@ int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) size_t bufsz; int err; + if (adapter->devargs.ddp_filename != NULL) { + strlcpy(pkg_file, adapter->devargs.ddp_filename, sizeof(pkg_file)); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) { + goto load_fw; + } else { + PMD_INIT_LOG(ERR, "Cannot load DDP file: %s\n", 
pkg_file); + return -1; + } + } + if (!use_dsn) goto no_dsn; memset(opt_ddp_filename, 0, ICE_MAX_PKG_FILENAME_SIZE); snprintf(opt_ddp_filename, ICE_MAX_PKG_FILENAME_SIZE, "ice-%016" PRIx64 ".pkg", dsn); + + ice_read_customized_path(pkg_file); + strcat(pkg_file, opt_ddp_filename); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) + goto load_fw; + strncpy(pkg_file, ICE_PKG_FILE_SEARCH_PATH_UPDATES, ICE_MAX_PKG_FILENAME_SIZE); strcat(pkg_file, opt_ddp_filename); @@ -1901,6 +1945,10 @@ int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) goto load_fw; no_dsn: + ice_read_customized_path(pkg_file); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) + goto load_fw; + strncpy(pkg_file, ICE_PKG_FILE_UPDATES, ICE_MAX_PKG_FILENAME_SIZE
[PATCH] net/ice: support for more flexible loading of DDP package
The "Dynamic Device Personalization" package is loaded at initialization time by the driver, but the specific package file loaded depends upon what package file is found first by searching through a hard-coded list of firmware paths. To enable greater control over the package loading, this commit two ways to support custom DDP packages: 1. Add device option to choose a specific DDP package file to load. For example: -a 80:00.0,ddp_pkg_file=/path/to/ice-version.pkg 2. Read firmware search path from "/sys/module/firmware_class/parameters/path" like the kernel behavior. Signed-off-by: Bruce Richardson Signed-off-by: Zhichao Zeng --- doc/guides/nics/ice.rst | 12 +++ drivers/net/ice/ice_ethdev.c | 61 drivers/net/ice/ice_ethdev.h | 2 ++ 3 files changed, 75 insertions(+) diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst index ae975d19ad..0484fafbc1 100644 --- a/doc/guides/nics/ice.rst +++ b/doc/guides/nics/ice.rst @@ -108,6 +108,18 @@ Runtime Configuration -a 80:00.0,default-mac-disable=1 +- ``DDP Package File`` + + Rather than have the driver search for the DDP package to load, + or to override what package is used, + the ``ddp_pkg_file`` option can be used to provide the path to a specific package file. + For example:: + +-a 80:00.0,ddp_pkg_file=/path/to/ice-version.pkg + + There is also support for customizing the firmware search path, will read the search path + from "/sys/module/firmware_class/parameters/path" and try to load DDP package. + - ``Protocol extraction for per queue`` Configure the RX queues to do protocol extraction into mbuf for protocol diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index 304f959b7e..bd78c14000 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -36,6 +36,7 @@ #define ICE_ONE_PPS_OUT_ARG "pps_out" #define ICE_RX_LOW_LATENCY_ARG"rx_low_latency" #define ICE_MBUF_CHECK_ARG "mbuf_check" +#define ICE_DDP_FILENAME "ddp_pkg_file" #define ICE_CYCLECOUNTER_MASK 0xULL @@ -52,6 +53,7 @@ static const char * const ice_valid_args[] = { ICE_RX_LOW_LATENCY_ARG, ICE_DEFAULT_MAC_DISABLE, ICE_MBUF_CHECK_ARG, + ICE_DDP_FILENAME, NULL }; @@ -692,6 +694,18 @@ handle_field_name_arg(__rte_unused const char *key, const char *value, return 0; } +static int +handle_ddp_filename_arg(__rte_unused const char *key, const char *value, void *name_args) +{ + const char **filename = name_args; + if (strlen(value) >= ICE_MAX_PKG_FILENAME_SIZE) { + PMD_DRV_LOG(ERR, "The DDP package filename is too long : '%s'", value); + return -1; + } + *filename = strdup(value); + return 0; +} + static void ice_check_proto_xtr_support(struct ice_hw *hw) { @@ -1873,6 +1887,22 @@ ice_load_pkg_type(struct ice_hw *hw) return package_type; } +static int ice_read_customized_path(char *pkg_file) +{ + char buf[ICE_MAX_PKG_FILENAME_SIZE]; + FILE *fp = fopen(ICE_PKG_FILE_CUSTOMIZED_PATH, "r"); + if (fp == NULL) { + PMD_INIT_LOG(ERR, "Failed to read CUSTOMIZED_PATH"); + return -EIO; + } + if (fscanf(fp, "%s\n", buf) > 0) + strncpy(pkg_file, buf, ICE_MAX_PKG_FILENAME_SIZE); + else + return -EIO; + + return 0; +} + int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) { struct ice_hw *hw = &adapter->hw; @@ -1882,12 +1912,28 @@ int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) size_t bufsz; int err; + if (adapter->devargs.ddp_filename != NULL) { + strlcpy(pkg_file, adapter->devargs.ddp_filename, sizeof(pkg_file)); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) { + goto load_fw; + } else { + PMD_INIT_LOG(ERR, 
"Cannot load DDP file: %s\n", pkg_file); + return -1; + } + } + if (!use_dsn) goto no_dsn; memset(opt_ddp_filename, 0, ICE_MAX_PKG_FILENAME_SIZE); snprintf(opt_ddp_filename, ICE_MAX_PKG_FILENAME_SIZE, "ice-%016" PRIx64 ".pkg", dsn); + + ice_read_customized_path(pkg_file); + strcat(pkg_file, opt_ddp_filename); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) + goto load_fw; + strncpy(pkg_file, ICE_PKG_FILE_SEARCH_PATH_UPDATES, ICE_MAX_PKG_FILENAME_SIZE); strcat(pkg_file, opt_ddp_filename); @@ -1901,6 +1947,10 @@ int ice_load_pkg(struct ice_adapter *adapter, bool use_dsn, uint64_t dsn) goto load_fw; no_dsn: + ice_read_customized_path(pkg_file); + if (rte_firmware_read(pkg_file, &buf, &bufsz) == 0) + goto load_fw; + strncpy(
Re: [PATCH] net/mana: support building the driver on arm64
On Wed, Aug 28, 2024 at 1:59 AM Long Li wrote: > > The mail was waiting in the moderation queue. > > Long, please fix your send-email setup, or send with a mail address > > registered > > to the dev@ ml. > > I think the patch has made it to patchwork: > https://patchwork.dpdk.org/project/dpdk/patch/1724711938-3108-1-git-send-email-lon...@linuxonhyperv.com/ > I released this mail from the moderation queue, so the patch could make it to patchwork. > The sender is lon...@linuxonhyperv.com, the author is lon...@microsoft.com. I > hope that is okay. The @linuxonhyperv.com address is not registered to the dev ml, which is the reason why the mail got moderated. Hope it is clear now. -- David Marchand
Re: Bihash Support in DPDK
Thanks for the reply. By bihash I mean the bounded-index hash that VPP supports. I am looking for bucket-level lock support. Currently I am using a hash table shared by multiple processes or multiple cores/threads. So the write lock is taken by a single core and the read lock by multiple cores to read the values written to this hash table. Multiple readers are getting blocked due to this. I want to avoid this to increase performance. Let me know your thoughts on this. Regards Rajesh On Tue, 27 Aug, 2024, 14:44 Medvedkin, Vladimir, < vladimir.medved...@intel.com> wrote: > Hi Rajesh, > > > > Please clarify what do you mean by “bihash”? Bidirectional? Bounded index? > > > > As for concurrent lookup/updates, yes, DPDK hash table supports > multi-process/multi-thread, please see the documentation: > > https://doc.dpdk.org/guides/prog_guide/hash_lib.html#multi-process-support > > > > > > *From:* rajesh goel > *Sent:* Tuesday, August 27, 2024 7:04 AM > *To:* Ferruh Yigit > *Cc:* Wang, Yipeng1 ; Gobriel, Sameh < sameh.gobr...@intel.com>; Richardson, Bruce ; > Medvedkin, Vladimir ; dev@dpdk.org > *Subject:* Re: Bihash Support in DPDK > > > > Hi All, > > Can we get some reply. > > > > Thanks > > Rajesh > > > > On Thu, Aug 22, 2024 at 9:32 PM Ferruh Yigit wrote: > > On 8/22/2024 8:51 AM, rajesh goel wrote: > > Hi All, > > Need info if DPDK hash library supports bihash table where for multi- > > thread and multi-process we can update/del/lookup entries per bucket > level. > > > > > > + hash library maintainers. > >
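On the reader-blocking point specifically: rte_hash can be created with lock-free read/write concurrency, so lookups are not blocked by a writer, which may be closer to the bucket-level behaviour being asked about than a single coarse reader/writer lock. A minimal sketch (table name and sizing are arbitrary):

```
#include <rte_hash.h>
#include <rte_jhash.h>

/* Create a hash table whose lookups proceed lock-free while another
 * core adds or deletes entries, instead of serialising on one rwlock. */
static struct rte_hash *
create_lockfree_table(void)
{
	struct rte_hash_parameters params = {
		.name = "flow_table",
		.entries = 1 << 16,          /* arbitrary sizing for the sketch */
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.socket_id = 0,
		.extra_flag = RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF |
			      RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD,
	};

	return rte_hash_create(&params);
}
```

Readers then call rte_hash_lookup()/rte_hash_lookup_data() as usual; the hash library programmer's guide linked above describes the exact guarantees of the lock-free mode.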