Treat the 32 bit stride registers as signed. Alternatively, we could make the stride arguments ptrdiff_t instead of int, and changing all of the assembly to operate on these registers with their full 64 bit width, but that's a much larger and more intrusive change (and risks missing some operation, which would clamp the intermediates to 32 bit still).
Fixes: https://trac.ffmpeg.org/ticket/9985 Signed-off-by: Martin Storsjö <mar...@martin.st> --- libswscale/aarch64/yuv2rgb_neon.S | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/libswscale/aarch64/yuv2rgb_neon.S b/libswscale/aarch64/yuv2rgb_neon.S index f4b220fb60..f341268c5d 100644 --- a/libswscale/aarch64/yuv2rgb_neon.S +++ b/libswscale/aarch64/yuv2rgb_neon.S @@ -118,8 +118,8 @@ .endm .macro increment_yuv422p - add x6, x6, w7, UXTW // srcU += incU - add x13, x13, w14, UXTW // srcV += incV + add x6, x6, w7, SXTW // srcU += incU + add x13, x13, w14, SXTW // srcV += incV .endm .macro compute_rgba r1 g1 b1 a1 r2 g2 b2 a2 @@ -189,8 +189,8 @@ function ff_\ifmt\()_to_\ofmt\()_neon, export=1 st4 {v16.8B,v17.8B,v18.8B,v19.8B}, [x2], #32 subs w8, w8, #16 // width -= 16 b.gt 2b - add x2, x2, w3, UXTW // dst += padding - add x4, x4, w5, UXTW // srcY += paddingY + add x2, x2, w3, SXTW // dst += padding + add x4, x4, w5, SXTW // srcY += paddingY increment_\ifmt subs w1, w1, #1 // height -= 1 b.gt 1b -- 2.37.0 (Apple Git-136) _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".