On Wed, 23 Apr 2025, Zhao Zhili wrote:
> From: Zhao Zhili <zhiliz...@tencent.com>
>
> On rpi5 (A76):
> put_hevc_pel_bi_w_pixels4_8_c: 90.0 ( 1.00x)
> put_hevc_pel_bi_w_pixels4_8_neon: 34.1 ( 2.64x)
> put_hevc_pel_bi_w_pixels6_8_c: 188.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels6_8_neon: 73.5 ( 2.56x)
> put_hevc_pel_bi_w_pixels8_8_c: 327.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels8_8_neon: 75.8 ( 4.32x)
> put_hevc_pel_bi_w_pixels12_8_c: 728.8 ( 1.00x)
> put_hevc_pel_bi_w_pixels12_8_neon: 186.1 ( 3.92x)
> put_hevc_pel_bi_w_pixels16_8_c: 1288.1 ( 1.00x)
> put_hevc_pel_bi_w_pixels16_8_neon: 268.5 ( 4.80x)
> put_hevc_pel_bi_w_pixels24_8_c: 2855.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels24_8_neon: 723.8 ( 3.95x)
> put_hevc_pel_bi_w_pixels32_8_c: 5095.3 ( 1.00x)
> put_hevc_pel_bi_w_pixels32_8_neon: 1165.0 ( 4.37x)
> put_hevc_pel_bi_w_pixels48_8_c: 11521.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels48_8_neon: 2856.0 ( 4.03x)
> put_hevc_pel_bi_w_pixels64_8_c: 21020.5 ( 1.00x)
> put_hevc_pel_bi_w_pixels64_8_neon: 4699.1 ( 4.47x)
> ---
> libavcodec/aarch64/h26x/dsp.h | 5 +
> libavcodec/aarch64/h26x/epel_neon.S | 373 ++++++++++++++++++++++
> libavcodec/aarch64/hevcdsp_init_aarch64.c | 13 +
> 3 files changed, 391 insertions(+)

This looks good overall, thanks!
It's quite regrettable how many duplicates of near-identical functions
there are in the h26x qpel/epel code; ideally we should be able to
produce most of these function variants with some sort of template instead
of having them all duplicated (with minor style differences).
// Martin
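
As a rough illustration of the templating idea mentioned above (not code from this patch, and in C rather than NEON assembly, where a GAS `.macro` would be the analogous tool): the sketch below stamps out one function per block width from a single shared body. All names here (`BI_W_PIXELS`, `WIDTHS`, `get_bi_w_fn`, `SRC2_STRIDE`) are hypothetical, and the arithmetic is a simplified 8-bit bi-weighted blend written for this example, not a verified copy of the FFmpeg reference code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed row pitch (in elements) of the 16-bit intermediate buffer. */
#define SRC2_STRIDE 64

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Shared body: the only thing that differs between variants is WIDTH. */
#define BI_W_PIXELS(WIDTH)                                                  \
static void bi_w_pixels##WIDTH(uint8_t *dst, ptrdiff_t dst_stride,          \
                               const uint8_t *src, ptrdiff_t src_stride,    \
                               const int16_t *src2, int height, int denom,  \
                               int wx0, int wx1, int ox0, int ox1)          \
{                                                                           \
    int log2wd = denom + 6; /* simplified 8-bit case */                     \
    for (int y = 0; y < height; y++) {                                      \
        for (int x = 0; x < WIDTH; x++) {                                   \
            int v = (src[x] << 6) * wx1 + src2[x] * wx0 +                   \
                    (ox0 + ox1 + 1) * (1 << log2wd);                        \
            dst[x] = clip_u8(v >> (log2wd + 1));                            \
        }                                                                   \
        dst  += dst_stride;                                                 \
        src  += src_stride;                                                 \
        src2 += SRC2_STRIDE;                                                \
    }                                                                       \
}

/* One list of widths drives both the definitions and the lookup table. */
#define WIDTHS(X) X(4) X(6) X(8) X(12) X(16) X(24) X(32) X(48) X(64)

WIDTHS(BI_W_PIXELS)

typedef void (*bi_w_fn)(uint8_t *dst, ptrdiff_t dst_stride,
                        const uint8_t *src, ptrdiff_t src_stride,
                        const int16_t *src2, int height, int denom,
                        int wx0, int wx1, int ox0, int ox1);

#define TABLE_ENTRY(W) { W, bi_w_pixels##W },
static const struct { int width; bi_w_fn fn; } bi_w_table[] = {
    WIDTHS(TABLE_ENTRY)
};

/* Returns the variant for the given width, or NULL if none was generated. */
static bi_w_fn get_bi_w_fn(int width)
{
    for (size_t i = 0; i < sizeof(bi_w_table) / sizeof(bi_w_table[0]); i++)
        if (bi_w_table[i].width == width)
            return bi_w_table[i].fn;
    return NULL;
}
```

With this shape, adding or removing a width is a one-token change in `WIDTHS`, and the variants cannot drift apart stylistically because they share a single body. Hand-written assembly variants usually differ in how each width is loaded and stored, so a real template would likely need per-width load/store helpers rather than a single loop bound.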