On Tuesday, 26 September 2023, 12:24:58 EEST, flow gg wrote:
> benchmark:
> fcmul_add_c: 19.7
> fcmul_add_rvv_f32: 6.7

With optimisations enabled and the benchmarking fix, I get this (on the same hardware, I believe):

fcmul_add_c: 3.5
fcmul_add_rvv_f32: 6.7

The unfortunate design limitations of the T-Head C910 are certainly to blame to no small extent. This is not the first RVV optimisation that turns out worse than scalar because of them, and I still honestly hope that newer (and conformant) IP will give saner results, but... I also believe that the code could be improved regardless.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel