On Tue, 2025-09-02 at 11:54 +0200, Christian König wrote:
> On 02.09.25 09:42, Timur Kristóf wrote:
> > On Tue, 2025-09-02 at 08:39 +0200, Christian König wrote:
> > > On 01.09.25 12:00, Timur Kristóf wrote:
> > > > Technically not necessary, but clear the extra dwords too,
> > > > so that the command processors don't read uninitialized memory.
> > >
> > > That is most likely a really bad idea.
> > >
> > > The extra DWs are filled with a specific pattern for a HW
> > > workaround.
> >
> > I was unaware so it looked to me like the extra dwords just remain
> > uninitialized.
> >
> > Where in the code does that happen?
>
> See vcn_v4_0_init_ring_metadata() for an example.
I see, thanks for explaining that.
So basically it's up to the ring initialization to fill the extra
dwords with something, and it shouldn't be done when clearing the ring.
>
> > And what is the issue that is being worked around?
>
> IIRC on the VCN4 engine there is a HW bug which triggered when some
> metadata for the ring was not in the same X megabyte segment of the
> ring buffer. So we just put this small metadata structure directly
> after the ring.
>
> On some jpeg engine you had to insert commands after the end of the
> ring to actually make the ring work reliable. See
> jpeg_v1_0_decode_ring_set_patch_ring().
Thanks!
>
> > Also, while we are at it, how was it possible to initialize that
> > without causing VM faults? Considering that the allocated BO was
> > not
> > sufficiently large to hold the extra dwords. (That is fixed by the
> > first patch of this series.)
>
> BOs for VRAM and GTT are always rounded to the next 4KiB. The
> workaround need something like 100 bytes or 64 DW, so that always
> worked.
>
> BTW: I just found that vcn_v4_0_unified_ring_vm_funcs() is using
> extra_dw as bytes already :)
I can fix that and include the fix in the next version of this series.
>
> > >
> > > Clearing them to NOPs makes no sense at all and potentially even
> > > breaks the HW workaround.
> > >
> > > Regards,
> > > Christian.
> > >
> > > >
> > > > Fixes: c8c1a1d2ef04 ("drm/amdgpu: define and add extra dword
> > > > for
> > > > jpeg ring")
> > > > Signed-off-by: Timur Kristóf <[email protected]>
> > > > ---
> > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 5 +++++
> > > > 1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > index 7670f5d82b9e..6a55a85744a9 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
> > > > @@ -474,6 +474,11 @@ static inline void
> > > > amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
> > > > while (i <= ring->buf_mask)
> > > > ring->ring[i++] = ring->funcs->nop;
> > > >
> > > > + /* Technically not necessary, but clear the extra
> > > > dwords
> > > > too,
> > > > + * so that the command processors don't read
> > > > uninitialized
> > > > memory.
> > > > + */
> > > > + for (i = 0; i < ring->funcs->extra_dw; i++)
> > > > + ring->ring[ring->ring_size / 4 + i] = ring-
> > > > >funcs-
> > > > > nop;
> > > > }
> > > >
> > > > static inline void amdgpu_ring_write(struct amdgpu_ring *ring,
> > > > uint32_t v)