> I think we can drop the 2x2 transforms. In all likelihood, scalar code will
> end up faster than vector code on future hardware, especially out-of-order
> pipelines.
I want to drop 2x2, but since there is only one function that handles all
sizes instead of 7*7 functions, how can I drop only 2x2?
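One option, as a rough sketch in plain C under simplified assumptions, would be to test for the 2x2 case up front and hand it to the scalar code, leaving every other size on the single vector entry point. The names avg_scalar, avg_vector_standin and avg_dispatch are hypothetical, the blocks are tightly packed 8-bit data without strides, and the "vector" function is only a C stand-in so the example runs anywhere; none of this is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts how often the stand-in vector path runs (for illustration only). */
static int vector_calls;

/* Scalar reference: rounded average of two tightly packed 8-bit blocks. */
static void avg_scalar(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                       int width, int height)
{
    for (int i = 0; i < width * height; i++)
        dst[i] = (uint8_t)((a[i] + b[i] + 1) >> 1);
}

/* Hypothetical stand-in for the one vector function that handles all sizes.
 * A real build would call the assembly routine here instead. */
static void avg_vector_standin(uint8_t *dst, const uint8_t *a,
                               const uint8_t *b, int width, int height)
{
    vector_calls++;
    avg_scalar(dst, a, b, width, height);
}

/* Wrapper: only 2x2 goes to the scalar code, everything else stays vector. */
static void avg_dispatch(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                         int width, int height)
{
    if (width == 2 && height == 2)
        avg_scalar(dst, a, b, width, height);
    else
        avg_vector_standin(dst, a, b, width, height);
}
```

The cost is one extra compare on every call, so whether this is worth it over simply keeping 2x2 in the vector function would have to be measured.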
On Saturday, 1 June 2024 at 21:01:16 EEST, u...@foxmail.com wrote:
> From: sunyuechi
>
>                     C908  X60
> avg_8_2x2_c:        1.0   1.0
> avg_8_2x2_rvv_i32:  1.0   1.
> In keeping with the rest of the project, that should probably go into
> **libavcodec/riscv/vvc/**
> Expanding the macro 49 times, with up to 14 **branches** to get there, is maybe not
> such a great idea. It might look nice on the checkasm µbenchmarks because the
> branches under test get
From: sunyuechi
                    C908  X60
avg_8_2x2_c:        1.0   1.0
avg_8_2x2_rvv_i32:  1.0   1.0
avg_8_2x4_c:        1.7   2.0
avg_8_2x4_rvv_i