This patchset fixes the issues raised against the previous patchset:
- move the sub instruction in vsad8_intra,
- remove unnecessary mov instructions,
- remove the single-lane extraction from the loop and place it at the end.
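
For readers unfamiliar with the last point, here is a scalar C model of the idea. It is only a sketch, not the NEON code from the patches: the function names are hypothetical, and the eight-element arrays stand in for vector lanes. The first variant reduces the lanes to a scalar on every row; the second keeps per-lane accumulators and reduces once after the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model: sum of vertical absolute differences over an
 * 8-pixel-wide block, reducing the lanes to a scalar on every row. */
static int vsad8_reduce_each_row(const uint8_t *s, ptrdiff_t stride, int h)
{
    int score = 0;
    for (int y = 1; y < h; y++) {
        uint16_t lanes[8];
        for (int x = 0; x < 8; x++)
            lanes[x] = (uint16_t)abs(s[x] - s[x + stride]);
        /* Horizontal reduction inside the loop (the costly part). */
        for (int x = 0; x < 8; x++)
            score += lanes[x];
        s += stride;
    }
    return score;
}

/* Same result, but the lanes are accumulated across rows and reduced
 * once at the end. Assumes h <= 258 so the 16-bit lanes cannot overflow. */
static int vsad8_reduce_once(const uint8_t *s, ptrdiff_t stride, int h)
{
    uint16_t acc[8] = { 0 };
    int score = 0;
    for (int y = 1; y < h; y++) {
        for (int x = 0; x < 8; x++)
            acc[x] = (uint16_t)(acc[x] + abs(s[x] - s[x + stride]));
        s += stride;
    }
    /* Single horizontal reduction after the loop. */
    for (int x = 0; x < 8; x++)
        score += acc[x];
    return score;
}
```

Both variants return the same score; the second simply moves the reduction off the per-row path.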

Removing the mov instructions from the pix_median_abs functions significantly
increased performance for both.

Hubert Mazur (3):
  lavc/aarch64: Add neon implementation for pix_median_abs16
  lavc/aarch64: Add neon implementation for vsad8_intra
  lavc/aarch64: Add neon implementation for pix_median_abs8

 libavcodec/aarch64/me_cmp_init_aarch64.c |  10 ++
 libavcodec/aarch64/me_cmp_neon.S         | 182 +++++++++++++++++++++++
 libavcodec/me_cmp.c                      |   5 +-
 3 files changed, 195 insertions(+), 2 deletions(-)

-- 
2.34.1

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".