Hi Sunyuechi,

On 28/05/2025 18:00, u...@foxmail.com wrote:
From: sunyuechi<sunyue...@iscas.ac.cn>

bpi-f3:
     scalar: 5.7 cycles
     rvv:    2.4 cycles

Maybe runtime detection in LPM should be added for all architectures,
but this commit is only about the RVV part.

Iwouldadviseyouto lookintothe FIBlibrary,ithasexactlywhatyouare lookingfor.

Also,pleaseconsiderwritinga slightlymoreinformativeandexplanatorycommit message.

Signed-off-by: sunyuechi<sunyue...@iscas.ac.cn>
---
  MAINTAINERS           |  2 +
  lib/lpm/meson.build   |  1 +
  lib/lpm/rte_lpm.h     |  2 +
  lib/lpm/rte_lpm_rvv.h | 91 +++++++++++++++++++++++++++++++++++++++++++
  4 files changed, 96 insertions(+)
  create mode 100644 lib/lpm/rte_lpm_rvv.h

<snip>
+static inline void rte_lpm_lookupx4_rvv(
+       const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
+{
+       size_t vl = 4;
+
+       const uint32_t *tbl24_p = (const uint32_t *)lpm->tbl24;
+       uint32_t tbl_entries[4] = {
+               tbl24_p[((uint32_t)ip[0]) >> 8],
+               tbl24_p[((uint32_t)ip[1]) >> 8],
+               tbl24_p[((uint32_t)ip[2]) >> 8],
+               tbl24_p[((uint32_t)ip[3]) >> 8],
+       };

I'm notan expertinRISC-V,butwhyis itdonein a scalarwayinsteadofusing__riscv_vsrl_vx_u32m1()? Iassumeyou're relyingonthe compilerhere?

Also,have youredefinedthe xmm_t typeforproperindexaddressing?

+       vuint32m1_t vtbl_entry = __riscv_vle32_v_u32m1(tbl_entries, vl);
+
+       vbool32_t mask = __riscv_vmseq_vx_u32m1_b32(
+           __riscv_vand_vx_u32m1(vtbl_entry, RTE_LPM_VALID_EXT_ENTRY_BITMASK, 
vl),
+           RTE_LPM_VALID_EXT_ENTRY_BITMASK, vl);
<snip>
+
+static inline void rte_lpm_lookupx4(
+       const struct rte_lpm *lpm, xmm_t ip, uint32_t hop[4], uint32_t defv)
+{
+       lpm_lookupx4_impl(lpm, ip, hop, defv);
+}
+
+RTE_INIT(rte_lpm_init_alg)
+{
+       lpm_lookupx4_impl = rte_cpu_get_flag_enabled(RTE_CPUFLAG_RISCV_ISA_V)
+           ? rte_lpm_lookupx4_rvv
+           : rte_lpm_lookupx4_scalar;
+}
AsImentionedearlier,I'd recommendthat youuseFIBtoselectan implementationatruntime. All the rest LPM vector x4 implementations are done this way, and their code is inlined.
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_LPM_RVV_H_ */

--
Regards,
Vladimir

Reply via email to