On Tue, Mar 31, 2026 at 11:06 PM Kovac, Krunoslav <[email protected]> wrote:
> On 3/31/2026 16:13, Mario Kleiner wrote:
> > To clarify: I do agree with Michel and Kruno that in most typical use
> > cases you'd probably want to get all or as much of the internal HW
> > pipeline's precision as possible to the eyes of the person in front of
> > the display, or at least an approximation of it. My understanding of
> > current AMD hardware is that you can have an up to 16 bpc framebuffer,
> > which gets truncated/rounded down to 12 bpc somewhere in the pipeline
> > (gamma tables, color transformation matrices, etc.) and then retained
> > at 12 bpc until shortly before the actual output, which can be 6, 8,
> > 10 or 12 bpc depending on connection type, bandwidth, "max bpc" etc.
>
> At the start of the pipeline we immediately go into a 19 bpc space.
> Except around the 3DLUT, all color processing happens at this precision.
> It's near the end of the color pipeline that we go from 19 to 12 in a
> block called DENORM. Other than some specialized ABM block, we then go
> into FMT, which does 12 -> monitor bpc.

Interesting new info, good to know.

> I didn't see any difference with my capture HW for 8 bpc -> 8 bpc
> either, so I left spatial dithering as default. There's enough extra
> precision that apparently it doesn't matter here.

Yes, that satisfies both the high precision and identity passthrough use
cases.

> Maybe your case always has an FP16 plane, say compositing space?

It all depends on the application. My software is not a single-purpose,
self-contained app, but a toolkit - a set of extensions for
Matlab/Octave/Python with many wildly different use cases and
requirements. Think of it as SDL, but tailored to neuroscience research
and related bio-medical research, and for scientists with usually only
basic programming skills (Matlab scripting language or Python).
Output can be 8 bpc SDR; 10 bpc SDR or HDR; effective ~11 bpc unorm SDR
(hw that can do fp16 surfaces but not rgba16 unorm); 12 bpc SDR/HDR (AMD
hw with a rgba16 unorm framebuffer -> native 12 bpc or dithered 12 bpc);
or 14 bpc / 16 bpc SDR on special (and very expensive) neuroscience
display equipment. For high precision modes, the software often uses
fp32 "framebuffers" and its own composition pipeline with various GLSL
shader plugins for post-processing, then outputs to the actual
8/10/fp16/16 bpc framebuffer via OpenGL, Vulkan, or OpenXR for VR
applications. So it's difficult to explain the specific use case,
because there isn't a specific use case.

> > If the final output depth is lower than 12 bpc one would usually
> > still want an approximation of 12 bpc reaching the "eyes" of the
> > person (/animal/retina in some of the use cases of my research users)
> > in front of the display, and spatial dithering down from 12 bpc ->
> > 10/8/6 bpc is the way to go. That's also true for most of my users'
> > use cases, and especially for the use cases involving 16 bpc
> > framebuffers/surfaces.
> >
> > Some more special use cases will require an absolutely perfect
> > identity passthrough of pixel color values, where any kind of
> > transformation in the pipeline, including spatial dithering, would be
> > bad. Some of your customers seem to require this for 10 bpc output.
> > Some of my users require this for 8 bpc output of an 8 bpc
> > framebuffer. Specifically, some neuroscience research requires up to
> > 16 bpc color or luminance precision, but all graphics cards and
> > normal displays max out at 12 bpc. There exist special display
> > devices and converters that can do up to 16 bpc precision (native or
> > via some form of spatial or temporal dithering), e.g., the Bits# or
> > Display++ from Cambridge Research Systems (UK) and the Datapixx,
> > ViewPixx and ProPixx devices from VPixx in Canada.
> > These are essentially active DVI-D or DisplayPort 8 bpc to 14 bpc or
> > 16 bpc VGA analog converters with 14- or 16-bit DACs, or special
> > purpose LCD panels or DLP video projectors which can do 14/16 bpc
> > precision. Because commercially available gpus and PHYs do not
> > support true 16 bpc output (the DP and HDMI standards specify such
> > signal formats, but no actual transmitter hardware afaik), these
> > devices encode 16 bpc color content on top of an 8 bpc framebuffer
> > and link: The software renders 16 bpc unorm/fp or 32 bpc float
> > content and then uses GLSL shaders to split up 16 bpc into 8 MSB and
> > 8 LSB, and puts the 8 MSB into the 8 bpc red channel and the 8 LSB
> > into the 8 bpc green channel (and an 8 bpc color index overlay into
> > the 8 bpc blue channel) to false-color encode a pure grayscale image
> > + some 256 color index palette overlay. Or, for true color images, it
> > sacrifices half the horizontal resolution by putting the 8 MSB of
> > each color channel into the even pixel columns, and the 8 LSB into
> > the odd pixel columns. So: a false-color rgb8 framebuffer -> pixel
> > identity passthrough -> 8 bpc link output via DVI-D or DP, and the
> > video sink then decodes and reassembles again into 16 bpc
> > color/luminance content and uses special display hardware to get
> > these 16 bpc into the eyes of the being in front of the display. Some
> > medical imaging displays, e.g., for radiology use (e.g., cancer
> > screening) in hospitals or at eye doctors, also use such special
> > framebuffer encodings to get > 12 bpc content out of the gpu. Siemens
> > Medical and similar companies sell these for research and medical
> > use.
> >
> > Another use case of my users requiring perfect pixel identity
> > passthrough is to encode side-band signals into the scanlines of the
> > vactive area of an image, encoding binary control data and packets as
> > false-color pixel values, similar to the various info packets
> > transmitted inside vblank.
> > This is for control data that is very custom and not standardized in
> > any VESA or HDMI standard, e.g., in my case to control special
> > neuroscience hardware: sound synchronized to pictures with
> > microsecond precision, sending various analog waveforms to
> > electrophysiology equipment or haptic stimulation, digital trigger
> > signals to transcranial magnetic stimulators (magnetic pulses to
> > brain regions), or start/stop/synchronize commands for various
> > recording equipment (fMRI and MEG scanners, electrophysiological
> > recordings, video capture etc.)
> >
> > For the pixel identity passthrough, the difference is that I only
> > need it for 8 bpc framebuffers to 8 bpc (DVI-D or DP) outputs atm.,
> > and that works fine under OpenGL with an identity gamma table loaded,
> > despite spatial dithering down to 8 bpc being active. Right now, I
> > neither have the need nor the equipment to verify 10 bpc identity
> > passthrough, as my capture hw can only process 8 bpc signals.
> >
> > I don't think there is an automated way for the driver to guess the
> > proper configuration in all cases. The proper solution would be a drm
> > connector property that can be queried/set to control dithering
> > on/off/method/target depth, and plumb that through. Or maybe
> > something that could be derived from existing connector properties?
> > E.g., if a content property has something standardized that
> > essentially requires identity passthrough? In my case, it is
> > important that such settings still fully work under native X11 via
> > RandR properties. Something that is only realistically accessible via
> > an atomic client or Wayland server is insufficient for me.
> >
> > So yes, as Michel points out, there is a disconnect between the
> > framebuffer color depth and hw pipeline depth and what dither
> > settings should be used.
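As an illustration of the false-color grayscale encoding described above, here
is a minimal sketch in C. The channel layout (8 MSB -> red, 8 LSB -> green,
overlay palette index -> blue) and all names are my illustrative assumptions,
not the toolkit's actual GLSL code or any vendor API:

```c
/* Sketch of a 16 bpc-over-rgb8 grayscale encoding, as described in the
 * thread. Channel assignment (MSB->red, LSB->green, overlay->blue) is
 * an assumption for illustration. */
#include <assert.h>
#include <stdint.h>

struct rgb8 {
    uint8_t r, g, b;
};

/* Encode one 16-bit luminance sample plus an 8-bit overlay palette
 * index into an 8 bpc RGB pixel. */
static struct rgb8 mono16_encode(uint16_t lum, uint8_t overlay_index)
{
    struct rgb8 px;
    px.r = (uint8_t)(lum >> 8);   /* 8 MSB -> red channel   */
    px.g = (uint8_t)(lum & 0xff); /* 8 LSB -> green channel */
    px.b = overlay_index;         /* palette overlay -> blue */
    return px;
}

/* What the video sink does: reassemble the 16-bit value. This only
 * works if the RGB values survive the link unmodified - any dithering,
 * gamma or color transform on the way corrupts the LSB byte, which is
 * why these use cases need perfect pixel identity passthrough. */
static uint16_t mono16_decode(struct rgb8 px)
{
    return (uint16_t)(((uint16_t)px.r << 8) | px.g);
}
```

The round trip is exact as long as the pipeline is a true identity; a single
+/-1 dither step on the red channel alone would shift the decoded luminance by
256 of 65536 steps.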
> > But Harry's patch, if it worked, would be at least a good-enough
> > guess-o-matic or heuristic to make the situation better in the short
> > term, even if it is not optimal. Or at least for my users' use cases
> > it would make it better, as for my use cases the framebuffer color
> > depth usually corresponds to what my users need as effective output
> > precision. For me there is also the urgency of wanting a non-broken
> > situation for Linux 7.0 and the upcoming Ubuntu 26.04-LTS /
> > Fedora 44. If I have the choice between the current state and this
> > patch, I'd gladly take this patch as a step up.
> >
> > I hoped this patch would be still simple and contained and early
> > enough to make it into drm-fixes for Linux 7.0, and maybe be
> > backportable to older kernels, as all kernels since late 2023 are
> > impaired from my use cases' point of view. But as I said, my testing
> > didn't confirm the patch is actually working - it always ends up
> > enabling dithering. Which, to be sneaky, would also be a step up for
> > me, as that "only" breaks use cases that don't affect my users
> > specifically :/
> >
> > On Tue, Mar 31, 2026 at 9:16 AM Michel Dänzer
> > <[email protected]> wrote:
> >
> >> On 3/30/26 19:36, Harry Wentland wrote:
> >>> On 2026-03-30 12:20, Michel Dänzer wrote:
> >>>> On 3/24/26 20:20, Mario Kleiner wrote:
> >>>>> On Sun, Mar 22, 2026 at 7:11 PM Kovac, Krunoslav
> >>>>> <[email protected]> wrote:
> >>>
> >>>>>> I believe we don't have surface info in that code, but one way
> >>>>>> to work around it would be to use spatial dithering for
> >>>>>> FP16/ARGB16 and rounding for 10 bits. But if we just switch to
> >>>>>> spatial, some of the earlier complaints about 10-bit output
> >>>>>> having one-off bit errors will be coming back.
> >>>>>
> >>>>> Looking at all callers of
> >>>>> resource_build_bit_depth_reduction_params(), they all have access
> >>>>> to the associated "struct pipe_ctx", which should give access to
> >>>>> pipe_ctx->plane_state->format of an associated display plane. I
> >>>>> could prepare a patch that passes the pipe_ctx from each caller
> >>>>> into resource_build_bit_depth_reduction_params(), and that
> >>>>> function could check if a 16 bpc framebuffer is in use and switch
> >>>>> to spatial dithering down-to-10-bpc in this case, and leave the
> >>>>> rounding/truncation to 10 bpc otherwise.
> >>>>
> >>>> That doesn't really make sense, the output of the display HW
> >>>> colour pipeline has more than 10 bpc regardless of framebuffer
> >>>> format.
> >>>
> >>> The output will be determined by the link bandwidth,
> >>> display-advertised supported bpc, and the userspace-selected
> >>> "max bpc" on a drm_connector. This could very well be 10 bpc,
> >>> 8 bpc, even 6 bpc. Or are you referring to the internal DCN HW
> >>> representation of the values?
> >>
> >> I am indeed.
> >>
> >>> They're higher, but that's somewhat irrelevant.
> >>
> >> How so? Surely dithering is applied to those values, not to the
> >> original values sampled from the framebuffer.
> >>
> >>
> >> --
> >> Earthling Michel Dänzer    \    GNOME / Xwayland / Mesa developer
> >> https://redhat.com          \       Libre software enthusiast
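For readers following the truncation-vs-rounding-vs-dithering discussion in the
quoted thread, here is a small self-contained C sketch. It is not AMD DC code
and all names are made up; it only illustrates why truncation to 10 bpc can be
one code off from round-to-nearest, and how a spatial (ordered) dither recovers
average precision that either single-valued reduction discards:

```c
/* Illustrative 16 -> 10 bit depth reduction, three ways. A 2x2 Bayer
 * pattern trades spatial noise for ~2 extra bits of average precision.
 * Not driver code; a sketch for the argument in the thread only. */
#include <assert.h>
#include <stdint.h>

/* Drop the 6 LSB outright (can be one code below round-to-nearest). */
static uint16_t to10_truncate(uint16_t v16)
{
    return v16 >> 6;
}

/* Round to the nearest 10-bit code, clamping at full scale. */
static uint16_t to10_round(uint16_t v16)
{
    uint32_t v = ((uint32_t)v16 + 32) >> 6;
    return v > 1023 ? 1023 : (uint16_t)v;
}

/* 2x2 ordered dither: per-pixel threshold offsets 0..3, scaled into
 * the 6 discarded bits, decide whether to bump to the next code. */
static uint16_t to10_dither(uint16_t v16, int x, int y)
{
    static const int bayer2x2[2][2] = { { 0, 2 }, { 3, 1 } };
    uint32_t v = ((uint32_t)v16 + (uint32_t)bayer2x2[y & 1][x & 1] * 16) >> 6;
    return v > 1023 ? 1023 : (uint16_t)v;
}
```

For a flat field exactly halfway between two 10-bit codes (16-bit value
32800 = 512.5 * 64), truncation yields 512 everywhere, rounding 513
everywhere, while the dithered 2x2 pattern averages to exactly 512.5 -
which is the point of dithering down from the higher-precision pipeline
values, and also exactly the transformation that identity-passthrough
use cases cannot tolerate.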
