On 01/03/24 08:53, Pekka Paalanen wrote:
> On Thu, 29 Feb 2024 14:57:06 -0300
> Arthur Grillo <arthurgri...@riseup.net> wrote:
> 
>> On 29/02/24 09:12, Pekka Paalanen wrote:
>>> On Wed, 28 Feb 2024 22:52:09 -0300
>>> Arthur Grillo <arthurgri...@riseup.net> wrote:
>>>   
>>>> On 27/02/24 17:01, Arthur Grillo wrote:  
>>>>>
>>>>>
>>>>> On 27/02/24 12:02, Louis Chauvet wrote:    
>>>>>> Hi Pekka,
>>>>>>
>>>>>> For all the comment related to the conversion part, maybe Arthur have an 
>>>>>> opinion on it, I took his patch as a "black box" (I did not want to 
>>>>>> break (and debug) it).
>>>>>>
>>>>>> Le 26/02/24 - 14:19, Pekka Paalanen a écrit :    
>>>>>>> On Fri, 23 Feb 2024 12:37:26 +0100
>>>>>>> Louis Chauvet <louis.chau...@bootlin.com> wrote:
>>>>>>>    
>>>>>>>> From: Arthur Grillo <arthurgri...@riseup.net>
>>>>>>>>
>>>>>>>> Add support to the YUV formats bellow:
>>>>>>>>
>>>>>>>> - NV12
>>>>>>>> - NV16
>>>>>>>> - NV24
>>>>>>>> - NV21
>>>>>>>> - NV61
>>>>>>>> - NV42
>>>>>>>> - YUV420
>>>>>>>> - YUV422
>>>>>>>> - YUV444
>>>>>>>> - YVU420
>>>>>>>> - YVU422
>>>>>>>> - YVU444
>>>>>>>>
>>>>>>>> The conversion matrices of each encoding and range were obtained by
>>>>>>>> rounding the values of the original conversion matrices multiplied by
>>>>>>>> 2^8. This is done to avoid the use of fixed point operations.
>>>>>>>>
>>>>>>>> Signed-off-by: Arthur Grillo <arthurgri...@riseup.net>
>>>>>>>> [Louis Chauvet: Adapted Arthur's work and implemented the read_line_t
>>>>>>>> callbacks for yuv formats]
>>>>>>>> Signed-off-by: Louis Chauvet <louis.chau...@bootlin.com>
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/vkms/vkms_composer.c |   2 +-
>>>>>>>>  drivers/gpu/drm/vkms/vkms_drv.h      |   6 +-
>>>>>>>>  drivers/gpu/drm/vkms/vkms_formats.c  | 289 
>>>>>>>> +++++++++++++++++++++++++++++++++--
>>>>>>>>  drivers/gpu/drm/vkms/vkms_formats.h  |   4 +
>>>>>>>>  drivers/gpu/drm/vkms/vkms_plane.c    |  14 +-
>>>>>>>>  5 files changed, 295 insertions(+), 20 deletions(-)
> 
> ...
> 
>>>> diff --git a/drivers/gpu/drm/vkms/tests/vkms_format_test.c 
>>>> b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
>>>> index f66584549827..4cee3c2d8d84 100644
>>>> --- a/drivers/gpu/drm/vkms/tests/vkms_format_test.c
>>>> +++ b/drivers/gpu/drm/vkms/tests/vkms_format_test.c
>>>> @@ -59,7 +59,7 @@ static struct yuv_u8_to_argb_u16_case 
>>>> yuv_u8_to_argb_u16_cases[] = {
>>>>                    {"white", {0xff, 0x80, 0x80}, {0x0000, 0xffff, 0xffff, 
>>>> 0xffff}},
>>>>                    {"gray",  {0x80, 0x80, 0x80}, {0x0000, 0x8000, 0x8000, 
>>>> 0x8000}},
>>>>                    {"black", {0x00, 0x80, 0x80}, {0x0000, 0x0000, 0x0000, 
>>>> 0x0000}},
>>>> -                  {"red",   {0x35, 0x63, 0xff}, {0x0000, 0xffff, 0x0000, 
>>>> 0x0000}},
>>>> +                  {"red",   {0x36, 0x63, 0xff}, {0x0000, 0xffff, 0x0000, 
>>>> 0x0000}},
>>>>                    {"green", {0xb6, 0x1e, 0x0c}, {0x0000, 0x0000, 0xffff, 
>>>> 0x0000}},
>>>>                    {"blue",  {0x12, 0xff, 0x74}, {0x0000, 0x0000, 0x0000, 
>>>> 0xffff}},
>>>>            },
>>>> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c 
>>>> b/drivers/gpu/drm/vkms/vkms_formats.c
>>>> index e06bbd7c0a67..043f23dbf80d 100644
>>>> --- a/drivers/gpu/drm/vkms/vkms_formats.c
>>>> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
>>>> @@ -121,10 +121,12 @@ static void RGB565_to_argb_u16(u8 **src_pixels, 
>>>> struct pixel_argb_u16 *out_pixel
>>>>    out_pixel->b = drm_fixp2int_round(drm_fixp_mul(fp_b, fp_rb_ratio));
>>>>  }
>>>>  
>>>> -static void ycbcr2rgb(const s16 m[3][3], u8 y, u8 cb, u8 cr, u8 y_offset, 
>>>> u8 *r, u8 *g, u8 *b)
>>>> +#define BIT_DEPTH 32
>>>> +
>>>> +static void ycbcr2rgb(const s64 m[3][3], u8 y, u8 cb, u8 cr, u8 y_offset, 
>>>> u8 *r, u8 *g, u8 *b)
>>>>  {
>>>> -  s32 y_16, cb_16, cr_16;
>>>> -  s32 r_16, g_16, b_16;
>>>> +  s64 y_16, cb_16, cr_16;
>>>> +  s64 r_16, g_16, b_16;
>>>>  
>>>>    y_16 =  y - y_offset;
>>>>    cb_16 = cb - 128;
>>>> @@ -134,9 +136,18 @@ static void ycbcr2rgb(const s16 m[3][3], u8 y, u8 cb, 
>>>> u8 cr, u8 y_offset, u8 *r,
>>>>    g_16 = m[1][0] * y_16 + m[1][1] * cb_16 + m[1][2] * cr_16;
>>>>    b_16 = m[2][0] * y_16 + m[2][1] * cb_16 + m[2][2] * cr_16;
>>>>  
>>>> -  *r = clamp(r_16, 0, 0xffff) >> 8;
>>>> -  *g = clamp(g_16, 0, 0xffff) >> 8;
>>>> -  *b = clamp(b_16, 0, 0xffff) >> 8;
>>>> +  // rounding the values
>>>> +  r_16 = r_16 + (1LL << (BIT_DEPTH - 4));
>>>> +  g_16 = g_16 + (1LL << (BIT_DEPTH - 4));
>>>> +  b_16 = b_16 + (1LL << (BIT_DEPTH - 4));
>>>> +
>>>> +  r_16 = clamp(r_16, 0, (1LL << (BIT_DEPTH + 8)) - 1);
>>>> +  g_16 = clamp(g_16, 0, (1LL << (BIT_DEPTH + 8)) - 1);
>>>> +  b_16 = clamp(b_16, 0, (1LL << (BIT_DEPTH + 8)) - 1);  
>>>
>>> Where do the BIT_DEPTH - 4 and BIT_DEPTH + 8 come from?  
>>
>> Basically, the numbers are in this form in hex:
>>
>> 0xsspppppppp
>>
>> In the end, we only want the 's' bits.
>>
>> The matrix multiplication is not giving us perfect results, making some
>> of KUnit test not pass, This is because the values end up a little bit
>> off. KUnit expects 0xfe, but this functions is returning 0xfd.
>>
>> I noticed that before shifting the values to get the 's' bytes the
>> values were a lot close to what is expected, something like:
>>
>> 0xfdfe287312
>>     ^
>> So the rounding part adds 1 to this marked 'f' to round a bit the values
>> (drm_fixed.h does something similar on drm_fixp2int_round).
>> Like that:
>>
>>    0xfdfe287312
>> +  0x0010000000
>>    ------------
>>    0xfe0e287312
>>
>> That's why the BIT_DEPTH - 4.
> 
> I have a hard time deciphering this. There is some sort of strange
> combination of UNORM and fixed-point going on here, where you process
> the range 0.0 - 255.0 including 32-bit fraction. All variables being
> named "_16" does not help, I've no idea what that refers to.

Totally forgot to rename that, sorry.

> 
> Usually when you have unsigned pixel format type, it's UNORM, that is
> an unsigned integer representation that maps to [0.0, 1.0]. When
> converting UNORM properly to e.g. fixed-point, you don't have to
> consider the UNORM bit depth when computing in fixed-point.
> 
> There is a catch: since 0xff maps to 1.0, the divisor is 0xff, and not
> a bit shift by 8. This must be taken into account when converting
> between different depths of UNORM, or between UNORM and fixed-point.
> Converting between different depths of fixed-point does not have this
> problem.
> 
> If you want to proper rounding, meaning that 0.5 rounds up to 1.0 and
> 0.4999 rounds down to 0.0 when rounding to integers, you have to add
> 0.5 before truncating.
> 
> So in this case you need to add 0x0100_0000 / 2 = 0x0080_0000, not
> 0x0010_0000.

Thanks for the explanations, I will try to take all this into account.

> 
> I don't understand what drm_fixp2int_round() is even doing. The offset
> is not 0.5, it's 0.0000076.
> 
>> After that, the values need to be clamped to not get wrong results when
>> shifting this s64 and casting it to u8. We clamp it to the minimum
>> allowed value: 0, and to the maximum allowed value, which in this case
>> is all the (BIT_DEPTH + 8) bits set to 1, The '+ 8' is to account for
>> the size of the 's' bits.
> 
> Ok. You could also shift with >> BIT_DEPTH first, and then clamp to 0,
> 255.

Great idea! This makes more sense.

Best Regards,
~Arthur Grillo

> 
> 
> Thanks,
> pq
> 
>> After writing this, I think that maybe it would be good to add this
>> explanation as a comment on the code.
>>
>>>   
>>>> +
>>>> +  *r = r_16 >> BIT_DEPTH;
>>>> +  *g = g_16 >> BIT_DEPTH;
>>>> +  *b = b_16 >> BIT_DEPTH;
>>>>  }  

Reply via email to