On Wednesday, 31 July 2024, 13.36.00 EEST, flow gg wrote:
> I'm a bit confused because the calculation here goes up to 32 bits and then
> returns to 8 bits. It seems that the vmax and vnclipu instructions can't be
> removed by using round-related instructions?
You seem to be adding 64
I'm a bit confused because the calculation here goes up to 32 bits and then
returns to 8 bits. It seems that the vmax and vnclipu instructions can't be
removed by using round-related instructions?
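
As an illustration of the question above, here is a minimal scalar sketch in C. It is not taken from the patch. It assumes the rounding constant is 64 and the shift is 7, and it models "max, then narrowing clip" as a clamp at zero followed by a round-to-nearest-up shift with unsigned saturation. The function names filter_ref, filter_clip and count_mismatches are hypothetical.

```c
#include <stdint.h>

/* Reference: add 64 in 32 bits, shift right by 7, clip to 0..255.
 * (Assumed constants; names are hypothetical, not from the patch.) */
static uint8_t filter_ref(int32_t sum)
{
    int32_t v = sum + 64;
    if (v < 0)
        return 0;                 /* negative results clip to 0 */
    v >>= 7;
    return v > 255 ? 255 : (uint8_t)v;
}

/* Model of "max with 0, then rounding narrowing clip":
 * the rounding shift adds bit 6 back in instead of an explicit +64. */
static uint8_t filter_clip(int32_t sum)
{
    uint32_t u = sum < 0 ? 0 : (uint32_t)sum;   /* clamp at zero      */
    uint32_t v = (u >> 7) + ((u >> 6) & 1);     /* round-to-nearest-up */
    return v > 255 ? 255 : (uint8_t)v;          /* unsigned saturation */
}

/* Count inputs in [lo, hi] where the two formulations disagree. */
static long count_mismatches(int32_t lo, int32_t hi)
{
    long bad = 0;
    for (int32_t s = lo; s <= hi; s++)
        if (filter_ref(s) != filter_clip(s))
            bad++;
    return bad;
}
```

In this model the two functions agree for every input, so the explicit addition of 64 is absorbed by the rounding shift, while the clamp at zero and the saturating narrow remain.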
On Monday, 29 July 2024 at 23:21, Rémi Denis-Courmont wrote:
> On Tuesday, 23 July 2024, 11.51.48 EEST, u...@f
On Tuesday, 23 July 2024, 11.51.48 EEST, u...@foxmail.com wrote:
> From: sunyuechi
>
>                                       C908   X60
> vp9_avg_8tap_smooth_4h_8bpp_c       : 12.7   11.2
> vp9_avg_8tap_smooth_4h_8bpp_rvv_i32 :  4.7    4.
> TBH it is very hard to review this due to the large amount of code
> conditionals. This should be avoidable, at least partly. You can name macros
> for each filter and then expand those macros instead of using if's.
Do you mean that before the addition of .equ ff_vp9_subpel_filters_xxx,
epel_filter
From: sunyuechi
                                      C908   X60
vp9_avg_8tap_smooth_4h_8bpp_c       : 12.7   11.2
vp9_avg_8tap_smooth_4h_8bpp_rvv_i32 :  4.7    4.2
vp9_avg_8tap_smooth_4v_8bpp_c       : 29.7   12.5
vp9_avg_8tap_smo
On Saturday, 15 June 2024, 14.50.33 EEST, u...@foxmail.com wrote:
> From: sunyuechi
OK, so I realise that this review is very late, but...
TBH it is very hard to review this due to the large amount of code
conditionals. This should be avoidable, at least partly. You can name macros for
e
> You can directly LLA filters + 16 * 8 * 2 and save one add. Same below. You
> can also use .equ to alias the filter addresses, and avoid if's.
> That's a lot of address dependencies, which is going to hurt performance. It
> might help to just spill more S registers if needed.
> This can be done
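
The "filters + 16 * 8 * 2" remark quoted above can be read as plain offset arithmetic. The C sketch below is only an illustration under assumptions: it uses a hypothetical zero-filled table named filters with 3 banks of 16 phases of 8 taps, each tap a 16-bit integer, so that one bank spans 16 * 8 * 2 bytes. Only the 16 * 8 * 2 figure comes from the quote; the bank count, the names and the helper bank() are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed table shape: 3 banks x 16 phases x 8 taps (hypothetical). */
enum { N_TYPES = 3, N_PHASES = 16, N_TAPS = 8 };

/* Hypothetical stand-in for the filter table; contents are all zero. */
static const int16_t filters[N_TYPES][N_PHASES][N_TAPS] = {{{0}}};

/* Size of one bank in bytes: 16 * 8 * 2 when int16_t is 2 bytes wide. */
enum { BANK_BYTES = N_PHASES * N_TAPS * sizeof(int16_t) };

/* Address of bank `type`, computed as base + type * BANK_BYTES.
 * For a constant `type` the offset is a constant, so it can be folded
 * into the address itself instead of being added at run time. */
static const int16_t *bank(int type)
{
    return (const int16_t *)((const char *)filters
                             + (size_t)type * BANK_BYTES);
}
```

Because the offset is known before run time, a fixed bank address can be named once (the role an alias such as .equ would play in assembly) rather than selected with conditionals and a separate add.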