On Monday, 17 July 2023 at 20:25:57 EEST, Rémi Denis-Courmont wrote:
> 1) Take the reductive sum out of the loop,
> leaving a regular vector addition in the loop.
> 2) Merge the addition and the multiplication.
> 3) Unroll.
>
> Before:
> scalarproduct_float_rvv_f32: 832.5
>
> After:
1) Take the reductive sum out of the loop,
leaving a regular vector addition in the loop.
2) Merge the addition and the multiplication.
3) Unroll.

Before:
scalarproduct_float_rvv_f32: 832.5

After:
scalarproduct_float_rvv_f32: 275.2
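
For readers who do not read RVV assembly, the three steps above can be
sketched in portable C. This is only an illustration of the idea, not the
code from the patch: LANES is a hypothetical stand-in for the hardware
vector length, and both function names are made up for this example. The
real routine would presumably use a vector multiply-accumulate instruction
(such as vfmacc.vv) where the sketch writes "acc[l] += a * b".

```c
#include <stddef.h>

#define LANES 4 /* hypothetical stand-in for the hardware vector length */

/* Baseline shape: every iteration folds its product into one scalar,
 * which corresponds to doing a reductive sum inside the loop. */
static float scalarproduct_naive(const float *a, const float *b, size_t len)
{
    float sum = 0.0f;

    for (size_t i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Restructured shape: the loop only does element-wise multiply-add into
 * per-lane accumulators; the single reduction happens after the loop. */
static float scalarproduct_lanes(const float *a, const float *b, size_t len)
{
    float acc[LANES] = { 0.0f };
    float sum = 0.0f;
    size_t i = 0;

    /* Main loop: one "vector" of LANES elements per iteration.
     * Multiplication and addition are merged into one step per lane. */
    for (; i + LANES <= len; i += LANES)
        for (size_t l = 0; l < LANES; l++)
            acc[l] += a[i + l] * b[i + l];

    /* Tail: fewer than LANES elements remain, one per lane. */
    for (size_t l = 0; i < len; i++, l++)
        acc[l] += a[i] * b[i];

    /* Reductive sum, taken out of the loop and done exactly once. */
    for (size_t l = 0; l < LANES; l++)
        sum += acc[l];
    return sum;
}
```

Note that the two functions add the products in a different order, so
their results can differ by floating-point rounding on general inputs.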
---
libavutil/riscv/float_dsp_rvv.S | 13 +++--
1 fil