This is getting long, so tl;dr:

 - Pitch alignment *by itself* is manageable.
 - Adding constraints in modifiers quickly leads to combinatorial
   explosion.
 - drm_fourcc.h explicitly says "it's incorrect to encode pitch
   alignment in a modifier", for all the reasons Daniel raised. That
   needs addressing.

On Thu, Dec 19, 2024 at 07:33:07PM +0000, Marek Olšák wrote:
> On Thu, Dec 19, 2024 at 5:32 AM Brian Starkey <brian.star...@arm.com> wrote:
> 
> > On Wed, Dec 18, 2024 at 09:53:56PM +0000, Marek Olšák wrote:
> 
> The pitch doesn't always describe the layout. In practice, the pitch has no
> effect on any tiled layouts (on AMD), and it also has no effect on linear
> layouts if the pitch must be equal to a specifically rounded up width. In
> that case, the only function of the pitch is to reject importing a DMABUF
> if it's incorrect with respect to the width. In other cases, the pitch is a
> parameter of the modifier (i.e. the pitch augments the layout, so the
> layout is described by {modifier, width, height, bpp, pitch} instead of
> just {modifier, width, height, bpp}).

I'm only talking about LINEAR here.

The ALIGN modifier tells an allocator what values of pitch are
acceptable, but it doesn't add any information about the buffer layout
which isn't already communicated by {format, width, height, bpp,
pitch}.

The AMD driver doesn't need the ALIGN modifier to determine if a
buffer is valid, it can do it entirely based on {format, width,
height, bpp, pitch}.

These two buffers are identical, and a driver which accepts one should
accept both:

{
   .format = DRM_FORMAT_XRGB8888,
   .width = 64,
   .height = 64,
   .pitch = 256,
   .modifier = DRM_FORMAT_MOD_LINEAR,
}

{
   .format = DRM_FORMAT_XRGB8888,
   .width = 64,
   .height = 64,
   .pitch = 256,
   .modifier = DRM_FORMAT_MOD_LINEAR_PITCH_ALIGN_256B,
}

This new modifier is a clear and direct violation of the comment in
drm_fourcc.h:

 * Modifiers must uniquely encode buffer layout. In other words, a buffer must
 * match only a single modifier. A modifier must not be a subset of layouts of
 * another modifier. For instance, it's incorrect to encode pitch alignment in
 * a modifier: a buffer may match a 64-pixel aligned modifier and a 32-pixel
 * aligned modifier. That said, modifiers can have implicit minimal
 * requirements.

I assume the argument here is this doesn't apply in this case because
we're deprecating LINEAR / the current LINEAR definition is wrong; but
it smells bad - it implies that this isn't the right API.

> 
> >
> > There _is_ a problem with having a million modifiers. The opaque set
> > intersection means that all authors from all vendors need to expose
> > the correct sets. The harder that is to do, the less likely things are
> > to work.
> >
> 
> No, exposing the correct set is never required. You only expose your set,
> and then also expose those modifiers where you need interop. Interop
> between every pair of devices is generally unsupported (since LINEAR
> between devices is practically unsupported).
> 

How do I know where I need interop?

> 
> >
> > Look at GENERIC_16_16_TILE. We spotted that our layout was the same as
> > something already defined under SAMSUNG. If there were a million
> > modifiers, we wouldn't be able to spot that commonality, and you'd end
> > up with disjoint sets even when you have layouts in common.
> >
> 
> This is unrelated.

More modifiers makes maintenance of the list harder. That seems
entirely relevant in light of the combinatorial explosion I described
below.

> 
> >
> > For this specific case of pitch alignment it seems like the consensus
> > is we should add a modifier, but I still strongly disagree that
> > modifiers are the right place in-general for trying to describe device
> > buffer usage constraints.
> >
> > I'm worried that adding these alignment constraints without any
> > statement on future intention pushes us down the slippery slope, and
> > it's *very* slippery.
> >
> > Up-thread you mentioned offset alignment. If we start putting that in
> > modifiers then we have:
> >
> > * Pitch alignment
> >   * Arbitrary, 1 byte
> >   * At least 16 byte aligned, arbitrary padding (Arm needs this)
> >   * Exactly the next 64 bytes (AMD?)
> > * Offset alignment
> >   * Arbitrary, 1 byte
> >   * You mentioned 4096 bytes (AMD?)
> >   * Arm needs 16, 8, 4 or 2 bytes, depending on format. Oh and it's
> >     different for the chroma plane of planar YUV too, so it's more
> >     like 16, 8, 4, 2, 2Y_1CbCr
> >
> > We would need a new modifier value for each *combination* of
> > constraints, so 3 (pitch) * 7 (offset) gives 21 new LINEAR modifiers
> > which need defining, and a device with no pitch/offset constraints
> > needs to expose *all* of them to make sure it can interop with an
> > Arm/AMD device.
> >
> 
> No, it's not needed to expose all of them. Again, you just expose what you
> need to interop with.

How does a driver developer know what they need to interop with? I
want my display controller driver to work with any GPU.

It needs to expose PITCH_ALIGN_16B (the actual HW capability),
PITCH_ALIGN_256B (so it can interop with AMD) and any other values
which are >16B and aligned to 16B for interop with any other
producer. i.e. "all of them".

That's manageable for PITCH_ALIGN. It's not manageable if we start
adding other constraints to modifiers.

> 
> We know that the LINEAR modifier doesn't work with 1B pitch and offset
> alignment pretty much everywhere. What are you going to do about it?

Have an allocator use some "reasonable" pitch alignment (I think we
default to 64 bytes for RGB), and allocate well-aligned buffers. If
"reasonable" is 256B, use that. Better is to have userspace allocator
which knows the devices in the system, knows the buffer usage, and
allocates accordingly.

"How does it know?" --> The allocator just codes in what it needs to
interop with, obviously ;-)
I'd definitely rather bake that interop list in userspace than kernel
drivers.

I think it would be great to have a kernel interface to help
allocators "know". I don't think `IN_FORMATS` should be that
interface.

Cheers,
-Brian

> 
> Perhaps the solution is what Intel has done to interop with AMD: Intel's
> image allocator was changed to align the linear pitch to 256B. We can
> demand that all drivers must align the pitch to 256B in their allocators
> too. If they don't want to do it, they will likely be forced to do it by
> their management, which is likely why Intel did it. Is that the future we
> want? It's already happening.
> 
> Minimum alignment requirements (for AMD):
> * Offset: 256B
> * Pitch: 128B or 256B (only minimum or any multiple - different chips have
> different limits)
> * Slice size alignment: 256B
> * Contiguous pages (not visible to uAPI since the kernel can reallocate to
> enforce this constraint when needed)
> 
> Marek

Reply via email to