On 2025-06-03 06:51, Pekka Paalanen wrote:
> On Tue, 3 Jun 2025 08:30:23 +0000
> "Shankar, Uma" <uma.shan...@intel.com> wrote:
> 
>>> -----Original Message-----
>>> From: Pekka Paalanen <pekka.paala...@collabora.com>
>>> Sent: Friday, May 30, 2025 7:28 PM
>>> To: Shankar, Uma <uma.shan...@intel.com>
>>> Cc: Simon Ser <cont...@emersion.fr>; Harry Wentland
>>> <harry.wentl...@amd.com>; Alex Hung <alex.h...@amd.com>; dri-
>>> de...@lists.freedesktop.org; amd-...@lists.freedesktop.org; intel-
>>> g...@lists.freedesktop.org; wayland-de...@lists.freedesktop.org;
>>> leo....@amd.com; ville.syrj...@linux.intel.com; 
>>> pekka.paala...@collabora.com;
>>> m...@igalia.com; jad...@redhat.com; sebastian.w...@redhat.com;
>>> shashank.sha...@amd.com; ago...@nvidia.com; jos...@froggi.es;
>>> mdaen...@redhat.com; aleix...@kde.org; xaver.h...@gmail.com;
>>> victo...@system76.com; dan...@ffwll.ch; quic_nas...@quicinc.com;
>>> quic_cbr...@quicinc.com; quic_abhin...@quicinc.com; mar...@marcan.st;
>>> liviu.du...@arm.com; sashamcint...@google.com; Borah, Chaitanya Kumar
>>> <chaitanya.kumar.bo...@intel.com>; louis.chau...@bootlin.com
>>> Subject: Re: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type
>>>
>>> On Thu, 22 May 2025 11:33:00 +0000
>>> "Shankar, Uma" <uma.shan...@intel.com> wrote:
>>>   
>>>> One request though: Can we enhance the lut samples from existing
>>>> 16bits to 32bits as lut precision is going to be more than 16 in
>>>> certain hardware.
>>>> While adding the new UAPI, lets extend this to 32 to make it future proof.
>>>> Reference:
>>>> https://patchwork.freedesktop.org/patch/642592/?series=129811&rev=4
>>>>
>>>> +/**
>>>> + * struct drm_color_lut_32 - Represents high precision lut values
>>>> + *
>>>> + * Creating 32 bit palette entries for better data
>>>> + * precision. This will be required for HDR and
>>>> + * similar color processing usecases.
>>>> + */
>>>> +struct drm_color_lut_32 {
>>>> +  /*
>>>> +   * Data for high precision LUTs
>>>> +   */
>>>> +  __u32 red;
>>>> +  __u32 green;
>>>> +  __u32 blue;
>>>> +  __u32 reserved;
>>>> +};  
>>>
>>> Hi,
>>>
>>> I suppose you need this much precision for optical data? If so,
>>> floating-point would be much more appropriate and we could probably
>>> keep 16-bit storage.
>>>
>>> What does the "more than 16-bit" hardware actually use? ISTR at least AMD
>>> having some sort of float'ish point internal pipeline?
>>>
>>> This sounds the same thing as non-uniformly distributed taps in a LUT.
>>> That mimics floating-point input while this feels like floating-point
>>> output of a LUT.
>>>
>>> I've recently decided for myself (and Weston) that I will never store
>>> optical data in an integer format, because it is far too wasteful.
>>> That's why the electrical encodings like power-2.2 are so useful, not
>>> just for emulating a CRT.
>>
>> Hi Pekka,
>> Internal pipeline in hardware can operate at higher precision than the
>> input framebuffer to plane engines. So, in case we have optical data of
>> 16bits or 10bits precision, hardware can scale this up to higher
>> precision in internal pipeline in hardware to take care of rounding and
>> overflow issues. Even FP16 optical data will be normalized and converted
>> internally for further processing.
> 
> Is it integer or floating-point?
> 

For AMD the internal format is floating point with slightly
higher precision than FP16.

> If we take the full range of PQ as optical and put it into 16-bit
> integer format, the luminance step from code 1 to code 2 is 0.15 cd/m².
> That seems like a huge step in the dark end. Such a step would
> probably need to be divided over several taps in a LUT, which wouldn't
> be possible.
> 

Right, and with 32-bpc we'll get a luminance step size of
~0.0000023 cd/m^2, which seems plenty fine-grained.
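
For reference, both step sizes fall out of a one-line calculation if we
assume the optical values are spread linearly over the full 0..10000 cd/m^2
PQ range. A rough sketch; linear_step_nits() and PQ_PEAK_NITS are names made
up for this mail, not anything in the series:

```c
#include <assert.h>
#include <stdint.h>

/* Peak luminance of the PQ range in cd/m^2 (assumed full-range mapping). */
#define PQ_PEAK_NITS 10000.0

/*
 * Luminance difference between two adjacent integer codes when the range
 * 0..PQ_PEAK_NITS is mapped linearly onto an unsigned value of `bits` bits.
 * Hypothetical helper, for illustration only.
 */
static double linear_step_nits(unsigned bits)
{
	assert(bits >= 1 && bits <= 32);

	/* Largest code value, e.g. 65535 for 16 bits. */
	uint64_t max_code = (UINT64_C(1) << bits) - 1;

	return PQ_PEAK_NITS / (double)max_code;
}
```

linear_step_nits(16) gives roughly 0.15 and linear_step_nits(32) roughly
0.0000023, matching the figures quoted in this thread.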

> In that sense, if a LUT is used for the PQ EOTF, I totally agree that
> 16-bit integer won't be even nearly enough precision.
> 
> This actually points out the caveat that increasing the number of taps
> in a LUT can cause the LUT to become non-monotonic when the sample
> precision runs out. That is, consecutive taps don't always increase in
> value.
> 
>> Input to LUT hardware can be 16bits or even higher, so the look up table
>> we program can be of higher precision than 16 (certain cases 24 in Intel
>> pipeline). This is later truncated to bpc supported in output formats
>> from sync (10, 12 or 16), mostly for electrical value to be sent to sink.
>>
>> Hence requesting to increase the container from current u16 to u32, to
>> get advantage of higher precision luts.
> 
> My argument though is to use a floating-point format for the LUT samples
> instead of adding more and more integer bits. That naturally puts more
> precision where it is needed: near zero.
> 
> A driver can easily convert that to any format the hardware needs.
> 
> However, it might make best sense for a driver to expose a LUT with a
> format that best matches the hardware precision, especially
> floating-point vs. integer.
> 
> I guess we may eventually need both 32 bpc integer and 16 (or 32) bpc
> floating-point.
> 

While I like floating point better for representing these things,
I don't think it's a great idea to pass floating-point values
via IOCTLs. 32 bpc integer values make sense here.
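
A driver whose hardware takes fewer bits (Uma mentioned 24 in some Intel
cases) could then narrow each 32-bit sample with a rounding shift. A rough
sketch of what I have in mind; lut32_to_hw() is a made-up name, not an
existing helper:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reduce a 32-bit LUT sample to `hw_bits` bits of hardware precision,
 * rounding to nearest and clamping so that rounding up at the top of the
 * range cannot overflow. Hypothetical helper, for illustration only.
 */
static uint32_t lut32_to_hw(uint32_t sample, unsigned hw_bits)
{
	assert(hw_bits >= 1 && hw_bits <= 32);

	if (hw_bits == 32)
		return sample;

	unsigned shift = 32 - hw_bits;
	uint64_t max = (UINT64_C(1) << hw_bits) - 1;

	/* Add half of the discarded range before shifting to round. */
	uint64_t rounded = ((uint64_t)sample + (UINT64_C(1) << (shift - 1))) >> shift;

	return (uint32_t)(rounded > max ? max : rounded);
}
```

For example, lut32_to_hw(0xFFFFFFFF, 24) clamps to 0xFFFFFF rather than
wrapping around to zero.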

Thanks, Uma, for pushing on this.

Harry

> 
> Thanks,
> pq
