On Sat, 13 Aug 2022, Swinney, Jonathan wrote:
This specialization handles the case where filtersize is 4 mod 8, e.g.
12, 20, etc. AArch64 previously fell back to the C function for this case.
This implementation speeds up that case by roughly 4x:
hscale_8_to_15__fs_12_dstW_512_c: 6234.1
hscale_8_to_15__fs_12_dstW_512_neon: 1505.6
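For illustration only, the "4 mod 8" condition described above can be sketched as a small C predicate. The name filter_size_is_4_mod_8 is hypothetical and is not taken from the patch; the actual dispatch logic lives in libswscale/aarch64/swscale.c and may be structured differently. Note that a filter size of exactly 4 also satisfies the congruence; whether it shares this code path is not stated in this message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of FFmpeg: returns true when the
 * horizontal filter size leaves a remainder of 4 when divided by 8
 * (e.g. 12, 20, 28), which is the case the commit message describes.
 */
static bool filter_size_is_4_mod_8(int filter_size)
{
    assert(filter_size > 0); /* filter sizes are assumed to be positive */
    return filter_size % 8 == 4;
}
```

With this sketch, filter_size_is_4_mod_8(12) and filter_size_is_4_mod_8(20) are true, while sizes such as 8 or 16 are not matched.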
Signed-off-by: Jonathan Swinney <jswin...@amazon.com>
---
libswscale/aarch64/hscale.S | 107 +++++++++++++++++++++++++++++++++++
libswscale/aarch64/swscale.c | 18 +++---
2 files changed, 117 insertions(+), 8 deletions(-)
Thanks, this update looks fine to me, so I pushed it!
// Martin