Re: [PATCH v6 4/4] hash: add SVE support for bulk key lookup

Yoan Picchi Tue, 12 Mar 2024 08:08:24 -0700

On 3/12/24 03:57, fengchengwen wrote:

Hi Yoan,


On 2024/3/12 7:21, Yoan Picchi wrote:

- Implemented SVE code for comparing signatures in bulk lookup.
- Added Defines in code for SVE code support.
- Optimise NEON code


This commit does not include this part. Pls only describe the content in this 
commit.


Thank you. I forgot to edit that out after moving commit around.

- New SVE code is ~5% slower than optimized NEON for N2 processor.

Signed-off-by: Yoan Picchi <[email protected]>
Signed-off-by: Harjot Singh <[email protected]>
Reviewed-by: Nathan Brown <[email protected]>
Reviewed-by: Ruifeng Wang <[email protected]>
---
  lib/hash/arch/arm/compare_signatures.h | 58 ++++++++++++++++++++++++++
  lib/hash/rte_cuckoo_hash.c             |  2 +
  2 files changed, 60 insertions(+)

diff --git a/lib/hash/arch/arm/compare_signatures.h 
b/lib/hash/arch/arm/compare_signatures.h
index b5a457f936..8a0627e119 100644
--- a/lib/hash/arch/arm/compare_signatures.h
+++ b/lib/hash/arch/arm/compare_signatures.h
@@ -47,6 +47,64 @@ compare_signatures_dense(uint16_t *hitmask_buffer,
                *hitmask_buffer = vaddvq_u16(hit2);
                }
                break;
+#endif
+#if defined(RTE_HAS_SVE_ACLE)
+       case RTE_HASH_COMPARE_SVE: {

...

  #endif
        default:
                for (unsigned int i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index e41f03270a..7a474267f0 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -452,6 +452,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
  #elif defined(RTE_ARCH_ARM64)
        if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) {
                h->sig_cmp_fn = RTE_HASH_COMPARE_NEON;
+               if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SVE))
+                       h->sig_cmp_fn = RTE_HASH_COMPARE_SVE;


The RTE_HASH_COMPARE_SVE was defined in "PATCH v6 1/4] hash: pack the hitmask for 
hash in bulk lookup",
but its first use is in this commit, so I think it should defined in this 
commit.

If RTE_CPUFLAG_SVE and RTE_HAS_SVE_ACLE both set, then SVE impl will be chosen.
If RTE_CPUFLAG_SVE defined, but RTE_HAS_SVE_ACLE was not, then scalar will be 
chosen. --- in this case we could back to NEON impl.
So I suggest direct use "#if defined(RTE_HAS_SVE_ACLE)" here.


Sounds fair. I'll do it.

        }
        else
  #endif


Plus:
I notice the commit log said the SVE performance is slower than NEON.

And I also notice other platform SVE also lower than NEON,
1. b4ee9c07bd config/arm: disable SVE ACLE for CN10K
2. 4eea7c6461 config/arm: add SVE ACLE control flag

So maybe we should disable RTE_HAS_SVE_ACLE default by:
diff --git a/config/arm/meson.build b/config/arm/meson.build
index 9d6fb87d7f..a5b890d100 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -875,7 +875,7 @@ endif

  if cc.get_define('__ARM_FEATURE_SVE', args: machine_args) != ''
      compile_time_cpuflags += ['RTE_CPUFLAG_SVE']
-    if (cc.check_header('arm_sve.h') and soc_config.get('sve_acle', true))
+    if (cc.check_header('arm_sve.h') and soc_config.get('sve_acle', false))
          dpdk_conf.set('RTE_HAS_SVE_ACLE', 1)
      endif
  endif

If the platform verify SVE has higher performance, then it could enable SVE by add 
"sve_acle: true" in soc_xxx config.

Thanks

Here I kinda disagree. In this particular instance, SVE is a bit slowerwith narrow vectors (128b), but could be faster with some wider vectorsizes.Even in general SVE 128b is not just slower than neon. It's a case bycase basis. Sometime it's slower, sometime it's faster, so I don't thinkwe should just disable it by default. In any case, disabling it shouldbe its own patch with much discussion, not just a offhand thing weinclude in the middle of this patch.This SVE version is still faster than the upstream neon version. I justhappen to have improved the neon version even more.

Re: [PATCH v6 4/4] hash: add SVE support for bulk key lookup

Reply via email to