On 01/02/24 14:38, Louis Chauvet wrote:
>>
>> /*
>> @@ -23,27 +23,25 @@ static size_t pixel_offset(const struct vkms_frame_info
>> *frame_info, int x, int
>> * @frame_info: Buffer metadata
>> * @x: The x(width) coordinate of the 2D buffer
>> * @y: The y(Heigth) coordinate of the 2D buffer
>> + * @index: The index of the plane on the 2D buffer
>> *
>> * Takes the information stored in the frame_info, a pair of coordinates,
>> and
>> - * returns the address of the first color channel.
>> - * This function assumes the channels are packed together, i.e. a color
>> channel
>> - * comes immediately after another in the memory. And therefore, this
>> function
>> - * doesn't work for YUV with chroma subsampling (e.g. YUV420 and NV21).
>> + * returns the address of the first color channel on the desired index.
>> */
>> static void *packed_pixels_addr(const struct vkms_frame_info *frame_info,
>> - int x, int y)
>> + int x, int y, size_t index)
>> {
>> - size_t offset = pixel_offset(frame_info, x, y);
>> + size_t offset = pixel_offset(frame_info, x, y, index);
>>
>> return (u8 *)frame_info->map[0].vaddr + offset;
>> }
>
> This implementation of packed_pixels_addr will only work with
> block_w == block_h == 1. For packed or tiled formats we will need to use
> x/y information to extract the correct address, and this address will not
> be a single pixel. See below my explanation.
You're right, currently, VKMS only supports non-packed/tiled formats. As
all the formats I plan to add are too not packed or tiled, I haven't
added support to it. But if you want to add it, please do :).
>> @@ -130,17 +128,28 @@ void vkms_compose_row(struct line_buffer
>> *stage_buffer, struct vkms_plane_state
>> {
>> struct pixel_argb_u16 *out_pixels = stage_buffer->pixels;
>> struct vkms_frame_info *frame_info = plane->frame_info;
>> - u8 *src_pixels = get_packed_src_addr(frame_info, y);
>> + const struct drm_format_info *frame_format = frame_info->fb->format;
>> int limit = min_t(size_t, drm_rect_width(&frame_info->dst),
>> stage_buffer->n_pixels);
>> + u8 *src_pixels[DRM_FORMAT_MAX_PLANES];
>>
>> - for (size_t x = 0; x < limit; x++, src_pixels +=
>> frame_info->fb->format->cpp[0]) {
>> + for (size_t i = 0; i < frame_format->num_planes; i++)
>> + src_pixels[i] = get_packed_src_addr(frame_info, y, i);
>> +
>> + for (size_t x = 0; x < limit; x++) {
>> int x_pos = get_x_position(frame_info, limit, x);
>>
>> - if (drm_rotation_90_or_270(frame_info->rotation))
>> - src_pixels = get_packed_src_addr(frame_info, x +
>> frame_info->rotated.y1)
>> - + frame_info->fb->format->cpp[0] * y;
>> + if (drm_rotation_90_or_270(frame_info->rotation)) {
>> + for (size_t i = 0; i < frame_format->num_planes; i++) {
>> + src_pixels[i] = get_packed_src_addr(frame_info,
>> + x +
>> frame_info->rotated.y1, i);
>> + src_pixels[i] += frame_format->cpp[i] * y;
>
> I find the current rotation management a bit complex to understand. This
> is not related to your patch, but as I had to understand this to create my
> second patch, I think this could be significanlty simplified.
I also found the rotation logic complex when implementing this. I would
appreciate it if it were simplified.
>
> Please see the below comment about frame_format->cpp, it applies here too.
> I think the "easy" way here is simply to reuse the method
> get_packed_src_addr every time you need a new pixel.
>
>> + }
>> + }
>>
>> plane->pixel_read(src_pixels, &out_pixels[x_pos]);
>> +
>
> The usage of cpp and pointer to specific pixel only work for non-packed
> and non-blocked pixels, but for example NV30 or Y0L0 need more
> informations about the exact location of the pixel to convert and write
> the correct pixel value (each pixel can't be referenced directly by a
> pointer). For example NV30 uses 5 bytes to store 3 pixels (10 bits each),
> so to access the "middle" one you need to read the 5 bytes and do a small
> computation to extract it's value.
Great explanation, I can see what is the problem here.
>
> I think a simple solution to handle most cases would be to profide two
> more parameters: the x and y positions of the pixel to copy, using
> "absolute coordinates" (i.e x=0,y=0 means the first byte of the src
> buffer, not the first pixel in the `drm_rect src`, this way the method
> `pixel_read` can extract the correct value).
>
> This way it become easy to manage "complex" pixel representations in this
> loop: simply increment x/y and let the pixel_read method handle
> everything.
>
> The second patch I will send is doing this. And as explained before, it
> will also simplify a lot the code related to rotation and translation (no
> more switch case everywhere to add offset to x/y, it simply use drm_rect_*
> helpers).
I like this, expect my review soon :).
>
> It's not optimal in term of performance (in some situation it will read
> the same block multiple time to generate different pixels), but I
> believe it still is an intersting trade-off.
>
> In the future, if performance is actally critical, the whole composition
> loop will have to be specialized for each pixel formats: some can be
> treated line by line (as it's done today), but with blocks or packed
> pixels it's more complex.
>
>> + for (size_t i = 0; i < frame_format->num_planes; i++)
>> + src_pixels[i] += frame_format->cpp[i];
>
> This is likely working with format with block_w != 1, see explanation
> above.
I think you meant that is _not_ working. Yeah, as I already explained,
it was never my plan to add support for packed or tiled formats.
Best Regards,
~Arthur Grillo