On Wed, 19 Feb 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:
> ---
>
> Before and after on A78
>
> dmvr_8_12x20_neon:        86.2 ( 6.90x)
> dmvr_8_20x12_neon:        94.8 ( 5.93x)
> dmvr_8_20x20_neon:       141.5 ( 6.50x)
> dmvr_12_12x20_neon:      158.0 ( 3.76x)
> dmvr_12_20x12_neon:      151.2 ( 3.73x)
> dmvr_12_20x20_neon:      247.2 ( 3.71x)
> dmvr_hv_8_12x20_neon:    423.2 ( 3.75x)
> dmvr_hv_8_20x12_neon:    434.0 ( 3.69x)
> dmvr_hv_8_20x20_neon:    706.0 ( 3.69x)
>
> dmvr_8_12x20_neon:        77.2 ( 7.70x)
> dmvr_8_20x12_neon:        66.5 ( 8.49x)
> dmvr_8_20x20_neon:        92.2 ( 9.90x)
> dmvr_12_12x20_neon:       80.2 ( 7.38x)
> dmvr_12_20x12_neon:       58.2 ( 9.59x)
> dmvr_12_20x20_neon:       90.0 (10.15x)
> dmvr_hv_8_12x20_neon:    369.0 ( 4.34x)
> dmvr_hv_8_20x12_neon:    355.8 ( 4.49x)
> dmvr_hv_8_20x20_neon:    574.2 ( 4.51x)
>
>  libavcodec/aarch64/vvc/inter.S | 72 ++++++++++------------------------
>  1 file changed, 20 insertions(+), 52 deletions(-)
>
> diff --git a/libavcodec/aarch64/vvc/inter.S b/libavcodec/aarch64/vvc/inter.S
> index c9d698ee29..45add44b6e 100644
> --- a/libavcodec/aarch64/vvc/inter.S
> +++ b/libavcodec/aarch64/vvc/inter.S
> @@ -369,22 +369,18 @@ function ff_vvc_dmvr_8_neon, export=1
>  1:
>          cbz             w15, 2f
>          ldr             q0, [src], #16
> -        uxtl            v1.8h, v0.8b
> -        uxtl2           v2.8h, v0.16b
> -        ushl            v1.8h, v1.8h, v16.8h
> -        ushl            v2.8h, v2.8h, v16.8h
> +        ushll           v1.8h, v0.8b, #2
> +        ushll2          v2.8h, v0.16b, #2
In addition to what's mentioned in the commit message, this bit is a semantically different change, so we should probably mention it in the commit message as well. If you're reposting patch 1/2 of this set, can you update the commit message on this one to mention this (and move the measurements into the actual commit message)?
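To illustrate why the quoted hunk is a different kind of change, here is a rough scalar model in C of the two instruction sequences. It is only a sketch: the function names are made up for this example, and it assumes that v16 holds 2 in every lane on this code path, which the hunk implies but does not show. ushl takes its shift amount from a register and shifts right when that amount is negative, whereas ushll bakes the left shift into the instruction as an immediate.

```c
#include <stdint.h>

/* Hypothetical scalar model of the old sequence: uxtl followed by ushl.
 * ushl reads a signed per-lane shift amount; a negative amount means a
 * truncating right shift. */
uint16_t widen_then_shift(uint8_t px, int8_t shift)
{
    uint16_t wide = px;                       /* uxtl: zero-extend 8 -> 16 bits */
    if (shift >= 16 || shift <= -16)
        return 0;                             /* every bit is shifted out */
    if (shift >= 0)
        return (uint16_t)(wide << shift);     /* ushl, left shift */
    return (uint16_t)(wide >> -shift);        /* ushl, right shift */
}

/* Hypothetical scalar model of the new sequence: ushll with immediate #2,
 * which widens and shifts left in one instruction. */
uint16_t widen_shift_imm2(uint8_t px)
{
    return (uint16_t)((uint16_t)px << 2);
}

/* Counts the 8-bit inputs on which the two models disagree, assuming the
 * register shift amount is 2. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int px = 0; px <= 255; px++)
        if (widen_then_shift((uint8_t)px, 2) != widen_shift_imm2((uint8_t)px))
            mismatches++;
    return mismatches;
}
```

Under that assumption the two agree for every 8-bit input, so the output is unchanged. The difference is that the shift amount is now fixed in the instruction instead of coming from a register.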
Other than that, this patch looks very good to me, thanks!

// Martin