On Wed, 19 Feb 2025, Krzysztof Pyrkosz via ffmpeg-devel wrote:

---

Before and after on A78

dmvr_8_12x20_neon:                                      86.2 ( 6.90x)
dmvr_8_20x12_neon:                                      94.8 ( 5.93x)
dmvr_8_20x20_neon:                                     141.5 ( 6.50x)
dmvr_12_12x20_neon:                                    158.0 ( 3.76x)
dmvr_12_20x12_neon:                                    151.2 ( 3.73x)
dmvr_12_20x20_neon:                                    247.2 ( 3.71x)
dmvr_hv_8_12x20_neon:                                  423.2 ( 3.75x)
dmvr_hv_8_20x12_neon:                                  434.0 ( 3.69x)
dmvr_hv_8_20x20_neon:                                  706.0 ( 3.69x)

dmvr_8_12x20_neon:                                      77.2 ( 7.70x)
dmvr_8_20x12_neon:                                      66.5 ( 8.49x)
dmvr_8_20x20_neon:                                      92.2 ( 9.90x)
dmvr_12_12x20_neon:                                     80.2 ( 7.38x)
dmvr_12_20x12_neon:                                     58.2 ( 9.59x)
dmvr_12_20x20_neon:                                     90.0 (10.15x)
dmvr_hv_8_12x20_neon:                                  369.0 ( 4.34x)
dmvr_hv_8_20x12_neon:                                  355.8 ( 4.49x)
dmvr_hv_8_20x20_neon:                                  574.2 ( 4.51x)

libavcodec/aarch64/vvc/inter.S | 72 ++++++++++------------------------
1 file changed, 20 insertions(+), 52 deletions(-)

diff --git a/libavcodec/aarch64/vvc/inter.S b/libavcodec/aarch64/vvc/inter.S
index c9d698ee29..45add44b6e 100644
--- a/libavcodec/aarch64/vvc/inter.S
+++ b/libavcodec/aarch64/vvc/inter.S
@@ -369,22 +369,18 @@ function ff_vvc_dmvr_8_neon, export=1
1:
        cbz             w15, 2f
        ldr             q0, [src], #16
-        uxtl            v1.8h, v0.8b
-        uxtl2           v2.8h, v0.16b
-        ushl            v1.8h, v1.8h, v16.8h
-        ushl            v2.8h, v2.8h, v16.8h
+        ushll           v1.8h, v0.8b, #2
+        ushll2          v2.8h, v0.16b, #2

In addition to what's mentioned in the commit message, this bit is a semantically different change, so we should probably mention it in the commit message as well. If you're reposting patch 1/2 of this set, can you update the commit message on this one to mention this (and move the measurements into the actual commit message)?
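
For reference, the hunk quoted above replaces a widen-then-shift pair (uxtl + ushl) with a single widening shift (ushll). A minimal scalar sketch of one lane of each sequence follows. It assumes that v16 holds the value 2 in every lane, which the quoted hunk does not show but the new immediate suggests. The function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* One lane of the old sequence: uxtl zero-extends u8 to u16, then ushl
 * shifts left by a per-lane count taken from a register.
 * Assumption: the count is non-negative (ushl shifts right for negative counts). */
uint16_t widen_then_shift(uint8_t px, int8_t count)
{
    uint16_t wide = px;                /* uxtl: zero-extend to 16 bits */
    return (uint16_t)(wide << count);  /* ushl: left shift by register count */
}

/* One lane of the new sequence: ushll #2 widens and shifts in one step. */
uint16_t widen_shift_fused(uint8_t px)
{
    return (uint16_t)((uint16_t)px << 2);
}

/* Compare both models over every possible 8-bit input.
 * Returns 1 if they agree everywhere, 0 otherwise. */
int sequences_match(void)
{
    for (int px = 0; px <= 255; px++) {
        if (widen_then_shift((uint8_t)px, 2) != widen_shift_fused((uint8_t)px))
            return 0;
    }
    return 1;
}
```

Under that assumption the two forms produce the same result for every 8-bit input. The largest input, 255, gives 1020, which fits in a 16-bit lane.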

Other than that, this patch looks very good to me, thanks!

// Martin
