On Sun, Dec 01, 2024 at 02:39:19AM +0100, Michael Niedermayer wrote:
> Hi Ramiro
>
> On Sat, Nov 30, 2024 at 04:23:36PM +0100, Ramiro Polla wrote:
> > For bit depths <= 14, the result is saturated to 15 bits.
> > For bit depths > 14, the result is saturated to 19 bits.
> >
> > x86_64:
> > chrRangeFromJpeg8_1920_c:     5827.4    5804.5  ( 1.00x)
> > chrRangeFromJpeg16_1920_c:    5793.2    5792.8  ( 1.00x)
> > chrRangeToJpeg8_1920_c:      11726.2    9388.6  ( 1.25x)
> > chrRangeToJpeg16_1920_c:     10610.8    5796.5  ( 1.83x)
> > lumRangeFromJpeg8_1920_c:     4165.7    4147.9  ( 1.00x)
> > lumRangeFromJpeg16_1920_c:    4530.0    4529.0  ( 1.00x)
> > lumRangeToJpeg8_1920_c:       6044.8    5694.1  ( 1.06x)
> > lumRangeToJpeg16_1920_c:      5343.6    5334.2  ( 1.00x)
> >
> > aarch64 A55:
> > chrRangeFromJpeg8_1920_c:    28839.3   28833.8  ( 1.00x)
> > chrRangeFromJpeg16_1920_c:   28843.8   28842.8  ( 1.00x)
> > chrRangeToJpeg8_1920_c:      44196.1   23070.6  ( 1.92x)
> > chrRangeToJpeg16_1920_c:     36526.7   17313.8  ( 2.11x)
> > lumRangeFromJpeg8_1920_c:    15384.3   15388.1  ( 1.00x)
> > lumRangeFromJpeg16_1920_c:   15390.1   15388.0  ( 1.00x)
> > lumRangeToJpeg8_1920_c:      23066.7   19226.2  ( 1.20x)
> > lumRangeToJpeg16_1920_c:     19224.6   19225.5  ( 1.00x)
> >
> > aarch64 A76:
> > chrRangeFromJpeg8_1920_c:     6316.2    6317.8  ( 1.00x)
> > chrRangeFromJpeg16_1920_c:    6321.9    6322.9  ( 1.00x)
> > chrRangeToJpeg8_1920_c:      11389.3    9287.1  ( 1.23x)
> > chrRangeToJpeg16_1920_c:      9514.4    6104.9  ( 1.56x)
> > lumRangeFromJpeg8_1920_c:     4376.0    4359.1  ( 1.00x)
> > lumRangeFromJpeg16_1920_c:    4437.9    4358.8  ( 1.02x)
> > lumRangeToJpeg8_1920_c:       6667.0    5957.2  ( 1.12x)
> > lumRangeToJpeg16_1920_c:      6062.5    6072.5  ( 1.00x)
> >
> > NOTE: all simd optimizations for range_convert have been disabled
> >       except for x86, which already had the same behaviour.
> >       they will be re-enabled when they are fixed for each architecture.
> > ---
> >  libswscale/aarch64/swscale.c                  |  5 +++++
> >  libswscale/loongarch/swscale_init_loongarch.c |  5 +++++
> >  libswscale/riscv/swscale.c                    |  5 +++++
> >  libswscale/swscale.c                          | 21 ++++++++++++-------
> >  libswscale/x86/range_convert.asm              |  3 ---
> >  5 files changed, 29 insertions(+), 10 deletions(-)
>
> [...]
>
> > @@ -160,8 +160,10 @@ static void chrRangeToJpeg_c(int16_t *dstU, int16_t *dstV, int width)
> >  {
> >      int i;
> >      for (i = 0; i < width; i++) {
> > -        dstU[i] = (FFMIN(dstU[i], 30775) * 4663 - 9289992) >> 12; // -264
> > -        dstV[i] = (FFMIN(dstV[i], 30775) * 4663 - 9289992) >> 12; // -264
> > +        int U = (dstU[i] * 4663 - 9289992) >> 12; // -264
> > +        int V = (dstV[i] * 4663 - 9289992) >> 12; // -264
>
> The way this is written it triggers undefined behavior if the input to the
> function is too large

I misread the code somehow. The FFMIN only protects the 16-bit output, which
the new code does too, so the change is OK.

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Rewriting code that is poorly written but fully understood is good.
Rewriting code that one doesn't understand is a sign that one is less smart
than the original author, trying to rewrite it will not make it better.
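
For reference, the point about FFMIN can be checked with a small standalone
sketch. It is not the swscale code: FFMIN is replaced by a local MIN_INT
macro, the helper names are made up for illustration, and the clamp in the
"new" form is an assumption, since the quoted hunk is cut off before the
store and only the commit message says the result is saturated to 15 bits.
The input range 0..32767 is likewise assumed.

```c
#include <stdint.h>

/* Local stand-in for FFMIN so the sketch needs no FFmpeg headers. */
#define MIN_INT(a, b) ((a) < (b) ? (a) : (b))

/* Old form from the quoted hunk: clamp the input, then scale. */
static int chr_to_jpeg_old(int16_t in)
{
    return (MIN_INT(in, 30775) * 4663 - 9289992) >> 12;
}

/* New form: scale in int first, then saturate to 15 bits.
 * ASSUMPTION: the clamp line is not visible in the quoted diff. */
static int chr_to_jpeg_new(int16_t in)
{
    int v = (in * 4663 - 9289992) >> 12;
    return MIN_INT(v, (1 << 15) - 1);
}

/* Count inputs in the assumed range 0..32767 where the two forms differ. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int in = 0; in <= INT16_MAX; in++)
        if (chr_to_jpeg_old((int16_t)in) != chr_to_jpeg_new((int16_t)in))
            mismatches++;
    return mismatches;
}
```

Under these assumptions count_mismatches() returns 0. The largest product,
32767 * 4663 = 152792521, is well inside a 32-bit int, so the multiplication
cannot overflow for int16_t input and the old input clamp only ever limited
the value stored back into the 16-bit buffer.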
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".