On Sat, 13 Aug 2022, Swinney, Jonathan wrote:

This specialization handles the case where filtersize is 4 mod 8, e.g.
12, 20, etc. Aarch64 was previously using the c function for this case.
This implementation speeds up that case significantly.

hscale_8_to_15__fs_12_dstW_512_c: 6234.1
hscale_8_to_15__fs_12_dstW_512_neon: 1505.6

Signed-off-by: Jonathan Swinney <jswin...@amazon.com>
---
libswscale/aarch64/hscale.S  | 107 +++++++++++++++++++++++++++++++++++
libswscale/aarch64/swscale.c |  18 +++---
2 files changed, 117 insertions(+), 8 deletions(-)

Thanks, this update looks fine to me, so I pushed it!

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to