Also adds a filter_line3 method which, on aarch64 NEON, yields approximately a 30% speedup over two filter_line calls plus a memcpy.
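For readers unfamiliar with the idea, here is a minimal sketch of what fusing "two filter_line calls plus a memcpy" into one call can look like. It is not the code from this series: the toy_* names and the averaging filter are hypothetical stand-ins, and the real bwdif filter uses more taps and more input fields. The sketch only shows that a fused pass can load the shared middle line once; whether that is where the measured gain comes from is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the real filter: rebuild one missing line as the
 * rounded average of the kept lines directly above and below it. */
static void toy_filter_line(uint8_t *dst, const uint8_t *cur, int w, int stride)
{
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((cur[x - stride] + cur[x + stride] + 1) >> 1);
}

/* Reference path: filter line y, copy line y+1, filter line y+2. */
static void toy_three_lines_ref(uint8_t *dst, int d_stride,
                                const uint8_t *cur, int s_stride, int w)
{
    toy_filter_line(dst, cur, w, s_stride);
    memcpy(dst + d_stride, cur + s_stride, (size_t)w);
    toy_filter_line(dst + 2 * d_stride, cur + 2 * s_stride, w, s_stride);
}

/* Fused path (hypothetical): one pass over the columns producing all three
 * output lines; the kept line y+1 is loaded once and reused three times. */
static void toy_filter_line3(uint8_t *dst, int d_stride,
                             const uint8_t *cur, int s_stride, int w)
{
    for (int x = 0; x < w; x++) {
        int above = cur[x - s_stride];      /* kept line y-1 */
        int mid   = cur[x + s_stride];      /* kept line y+1 */
        int below = cur[x + 3 * s_stride];  /* kept line y+3 */

        dst[x]                = (uint8_t)((above + mid + 1) >> 1);
        dst[x + d_stride]     = (uint8_t)mid;
        dst[x + 2 * d_stride] = (uint8_t)((mid + below + 1) >> 1);
    }
}
```

Both paths need one readable line above cur and three below it, and they produce identical output, so the reference path can serve as a check for a fused implementation.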

Differences from v2:
- coeffs moved into const segment
- number of patches reduced

John Cox (7):
  tests/checkasm: Add test for vf_bwdif filter_intra
  avfilter/vf_bwdif: Add neon for filter_intra
  tests/checkasm: Add test for vf_bwdif filter_edge
  avfilter/vf_bwdif: Add neon for filter_edge
  avfilter/vf_bwdif: Add neon for filter_line Exports C filter_line
    needed for tail fixup of neon code
  avfilter/vf_bwdif: Add a filter_line3 method for optimisation
  avfilter/vf_bwdif: Add neon for filter_line3

 libavfilter/aarch64/Makefile                |   2 +
 libavfilter/aarch64/vf_bwdif_init_aarch64.c | 125 +++
 libavfilter/aarch64/vf_bwdif_neon.S         | 793 ++++++++++++++++++++
 libavfilter/bwdif.h                         |  20 +
 libavfilter/vf_bwdif.c                      |  70 +-
 tests/checkasm/vf_bwdif.c                   | 172 +++++
 6 files changed, 1167 insertions(+), 15 deletions(-)
 create mode 100644 libavfilter/aarch64/vf_bwdif_init_aarch64.c
 create mode 100644 libavfilter/aarch64/vf_bwdif_neon.S

-- 
2.39.2