Initially, I tried using `vnclip.wi` with reference to h264,
-vwadd.wx v16, v16, t4
-vnsra.wi v16, v16, 4
+vnclip.wi v16, v16, 4
but couldn't find the correct way... I think there might be some overflow
issues that I didn't understand correctly. How do y
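
One possible source of the mismatch (an assumption, the thread does not confirm it) is that `vnclip.wi` saturates to the signed range, whereas `vnsra.wi` simply keeps the low bits. The scalar model below is only a sketch: the function names are made up for illustration, it assumes 8-bit pixels held in a 16-bit intermediate, that t4 holds the rounding constant 8, and that the fixed-point rounding mode is round-to-nearest-up so the clip adds that same 8 itself.

```python
def narrow_shift_truncate(x, shift=4, bits=8):
    """Model of: add rounding constant, arithmetic shift right, keep low `bits` bits."""
    rounded = (x + (1 << (shift - 1))) >> shift
    return rounded & ((1 << bits) - 1)


def narrow_clip(x, shift=4, bits=8, signed=True):
    """Model of a rounding narrowing shift that saturates instead of truncating."""
    rounded = (x + (1 << (shift - 1))) >> shift
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return max(lo, min(hi, rounded))


# Hypothetical intermediate for a bright pixel (200 scaled by 16).
x = 200 * 16
print(narrow_shift_truncate(x))      # truncating path
print(narrow_clip(x, signed=True))   # signed saturation
print(narrow_clip(x, signed=False))  # unsigned saturation
```

In this model the signed clip pins any result above 127 to 127, while the truncating path and the unsigned clip agree for every value that fits in 8 bits.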
On Saturday, 15 June 2024 at 14:50:32 EEST, u...@foxmail.com wrote:
> From: sunyuechi
>
> C908 X60
> vp9_avg_bilin_4hv_8bpp_c : 10.7 9.5
> vp9_avg_bilin_4hv_8bpp_rvv_i32 : 4.0 3.
> Copying vectors is rarely justified - mostly only before destructive
> instructions such as FMA.
It is slightly different from VP8. In VP8, many scalar values are positive,
so the related calculations can be easily replaced. However, in this
context of VP9, since t2 is a negative number, vwmaccs
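
To illustrate why the sign of the scalar matters, here is a scalar model of a widening multiply-accumulate in which an 8-bit scalar is read either as signed or as unsigned. This is a sketch only: the coefficient -3, the pixel and accumulator values, and the helper name are hypothetical and not taken from the patch.

```python
def widening_macc(acc, scalar, pixel, scalar_signed):
    """Return acc + scalar * pixel in a 16-bit accumulator.

    `pixel` is an unsigned 8-bit value; `scalar` is reduced to 8 bits and
    then interpreted as signed or unsigned depending on `scalar_signed`.
    """
    s = scalar & 0xFF
    if scalar_signed and s >= 0x80:
        s -= 0x100  # reinterpret the 8-bit pattern as a negative number
    return (acc + s * pixel) & 0xFFFF


# Hypothetical values: accumulator 1000, coefficient -3, pixel 200.
print(widening_macc(1000, -3, 200, scalar_signed=True))
print(widening_macc(1000, -3, 200, scalar_signed=False))
```

With a positive coefficient both interpretations give the same result, which is consistent with the remark that the VP8 case is easier to replace; with a negative coefficient they diverge.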
From: sunyuechi
C908 X60
vp9_avg_bilin_4hv_8bpp_c : 10.7 9.5
vp9_avg_bilin_4hv_8bpp_rvv_i32 : 4.0 3.5
vp9_avg_bilin_8hv_8bpp_c : 38.5 34.2
vp9_avg_bilin_8h