Hi all,

As part of my ongoing swscale rewrite, we have both the opportunity and the need to make a central decision about how to apply rounding and/or dithering.
Some particular cases I want to point out and gather feedback on include:

1. Should we dither and/or accurately round when scaling up full range content?

For example, say you are converting from full-range rgb24 to rgb30. The correct conversion is (rgb / 255 * 1023), which involves a rational factor of exactly 341 / 85, or roughly 4.01176. The fact that this factor is not an integer means that an exact conversion without dithering, while not strictly speaking *lossy*, necessarily introduces rounding error. An input value of 200, for example, gives 200 * 1023 / 255 = 802.35294..., which ought to be accurately dithered to a 65%/35% mix of 802 and 803.

This is not what current swscale (nor many other pieces of software) does; instead, they simply calculate the much easier (x << 2) | (x >> 6). The (x >> 6) term amounts to chopping off the lowest 6 bits, i.e. truncating down. With a little extra effort we can at least round correctly by adding the ((x >> 5) & 1) bit to the result.

This is especially problematic for the alpha channel: a correct upconversion of yuva444p to yuva444p10 would otherwise collapse to a simple left shift by 2, were it not for the alpha channel (which is always full range) and which would require a full float conversion, multiplication and dither pass.

2. At what bit depth does dithering become negligible?

For context, the generally quoted threshold of human visual perception is ~12 bits SDR and ~14 bits HDR. So for something like yuv444p16, we could get away with outputting the truncated results, with neither dithering nor accurate rounding, without the risk of human visible error. However, this does increase the risk of a *compounding* error as more and more conversions are performed.

3. Should we dither per-channel after conversion from grayscale to RGB?

For example, say I am converting gray10 to rgb24. The most performant way to do this would be to dither the gray channel down to gray8 and then copy it to all three channels, (R, G, B) = (Y8).
The more accurate way to do it, OTOH, would be to set (R10, G10, B10) = (Y10) and then dither each channel independently, with an offset dither mask per channel. This gives greater precision, which may matter especially when dithering to a very low bit depth (e.g. rgb8 or rgb4), but makes the conversion roughly 3x more expensive.

4. What should we make of the SWS_ACCURATE_RND and SWS_BITEXACT flags?

I am personally thinking that SWS_BITEXACT should become a no-op flag, with bit exact output being the default behavior of all new implementations. But what about SWS_ACCURATE_RND?

I am thinking that SWS_ACCURATE_RND should essentially be the switch that toggles our preferred resolution of question 1. In other words, with SWS_ACCURATE_RND specified, full range upconversions should go through an accurate dither pass, while being relaxed to the simple (x << 2) | (x >> 6) upconversion in the absence of this flag.

How should this flag relate to question 2? With the flag specified, I am thinking that we should also force dithering even at 16 bit depth, and skip dithering in this case only in the flag's absence.

If so, what bit depth should the cutoff threshold be, for when to skip accurate dithering? I am thinking to simply use the 12/14 bit SDR/HDR threshold as appropriate for the content type.
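To make the trade-off concrete, here is a minimal C sketch of the 8-bit to 10-bit full-range upconversion variants that SWS_ACCURATE_RND would select between. The function names are hypothetical and not existing swscale code, and the dither threshold t is assumed to come from some dither matrix indexed by pixel position, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Cheap path: replicate the top 2 bits of x into the new low bits. */
static inline uint16_t up8to10_bitrep(uint8_t x)
{
    return (uint16_t)((x << 2) | (x >> 6));
}

/* Exact full-range scaling x * 1023 / 255, rounded to nearest.
 * 255 is odd, so there are no ties and adding 127 before the
 * truncating division is sufficient. */
static inline uint16_t up8to10_round(uint8_t x)
{
    return (uint16_t)((x * 1023u + 127u) / 255u);
}

/* Same scaling with a dither threshold t in [0, 255) added before
 * truncation. Averaged over all 255 thresholds, the output equals
 * the exact value x * 1023 / 255. */
static inline uint16_t up8to10_dither(uint8_t x, unsigned t)
{
    assert(t < 255);
    return (uint16_t)((x * 1023u + t) / 255u);
}
```

For the input value 200 used in question 1, up8to10_bitrep() returns 803, up8to10_round() returns 802, and up8to10_dither() returns 803 for 90 of the 255 possible thresholds (about 35%) and 802 for the rest.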
This would lead to the following conversions, as an illustration:

SWS_ACCURATE_RND specified:
- rgb24 -> yuv420p10: full dithering
- rgb24 -> yuv420p12: full dithering
- rgb24 -> rgb30: full dithering
- rgb24 -> rgba64: full dithering
- yuva444p -> yuva444p10: scale YUV, dither alpha
- yuva444p14 -> yuva444p16: scale YUV, dither alpha
- yuv444p10 -> yuv444p14: left shift, no dithering needed

SWS_ACCURATE_RND absent:
- rgb24 -> yuv420p10: full dithering
- rgb24 -> yuv420p12: truncate if SDR, full dithering if HDR
- rgb24 -> rgb30: truncate
- rgb24 -> rgba64: truncate
- yuva444p -> yuva444p10: left shift YUV, truncate alpha
- yuva444p14 -> yuva444p16: left shift YUV, truncate alpha

Does this seem reasonable?
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".