On Fri, 25 Mar 2022, Ben Avison wrote:

Disable ff_add_pixels_clamped_arm, which was found to fail the test. As this
is normally only used for Arms prior to Armv6 (ARM11) it seems quite unlikely
that anyone is still using this, so I haven't put in the effort to debug it.

I had a look at this function, and I see that the overflow checks are using

        tst             r6,  #0x100

to see whether the addition overflowed (either above or below). However, if block[] was e.g. 0x200, it's possible to overflow without setting this bit at all.

If it would be the case that the valid range of block[] values would be e.g. [-255,255], then this kind of overflow checking would work though. (As there exists assembly for armv6, then this function probably hasn't been used much in modern times, so this doesn't say much about what values actually are used here.)

Secondly, the clamping seems to be done with

        movne           r6,  r5,  lsr #24

However that should use asr, not lsr, I think, to get proper clamping in both ends?


Thirdly - the added test also occasionally fails for the other existing functions (armv6, neon) and the newly added aarch64 neon version. If you have e.g. src[] = 32767, dst[] = 255, then the widening 8->16 addition will overflow, as there's no operation that both widens and clamps at the same time.

I think this is reason to limit the range of src[] at least somewhat in the test, since I don't think the full 16 bit signed range actually is relevant here.

// Martin

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to