Re: [PATCH] drm/amdgpu: Raven: don't allow mixing GTT and VRAM

Thadeu Lima de Souza Cascardo Tue, 22 Jul 2025 04:21:56 -0700

On Fri, Jul 18, 2025 at 07:00:39PM -0400, Alex Deucher wrote:
> On Fri, Jul 18, 2025 at 6:01 PM Leo Li <sunpeng...@amd.com> wrote:
> >
> >
> >
> > On 2025-07-18 17:33, Alex Deucher wrote:
> > > On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng...@amd.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025-07-18 16:07, Alex Deucher wrote:
> > >>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgef...@google.com> wrote:
> > >>>>
> > >>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeuc...@gmail.com> wrote:
> > >>>>>
> > >>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgef...@google.com> wrote:
> > >>>>>>
> > >>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeuc...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgef...@google.com> wrote:
> > >>>>>>>>
> > >>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeuc...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgef...@google.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more
> > >>>>>>>>>> flexible (v2)") allowed newer ASICs to mix GTT and VRAM. That
> > >>>>>>>>>> change also noted that some older boards, such as Stoney and
> > >>>>>>>>>> Carrizo, do not support this. It appears that at least one
> > >>>>>>>>>> additional ASIC, Raven, does not support this either.
> > >>>>>>>>>>
> > >>>>>>>>>> We observed this issue when migrating a device from a 5.4 to
> > >>>>>>>>>> 6.6 kernel and have confirmed that Raven also needs to be
> > >>>>>>>>>> excluded from mixing GTT and VRAM.
> > >>>>>>>>>
> > >>>>>>>>> Can you elaborate a bit on what the problem is?  For carrizo and
> > >>>>>>>>> stoney this is a hardware limitation (all display buffers need
> > >>>>>>>>> to be in GTT or VRAM, but not both).  Raven and newer don't have
> > >>>>>>>>> this limitation and we tested raven pretty extensively at the
> > >>>>>>>>> time.
> > >>>>>>>>
> > >>>>>>>> Thanks for taking the time to look. We have automated testing and
> > >>>>>>>> a few igt-gpu-tools tests failed. After debugging, we found that
> > >>>>>>>> commit 81d0bcf99009 is what introduced the failures on this
> > >>>>>>>> hardware on 6.1+ kernels. The specific tests that fail are
> > >>>>>>>> kms_async_flips and kms_plane_alpha_blend; excluding Raven from
> > >>>>>>>> this sharing of GTT and VRAM buffers resolves the issue.
> > >>>>>>>
> > >>>>>>> + Harry and Leo
> > >>>>>>>
> > >>>>>>> This sounds like the memory placement issue we discussed last week.
> > >>>>>>> In that case, the issue is related to where the buffer ends up when
> > >>>>>>> we try to do an async flip.  In that case, we can't do an async
> > >>>>>>> flip without a full modeset if the buffer locations are different
> > >>>>>>> than the last modeset because we need to update more than just the
> > >>>>>>> buffer base addresses.  This change works around that limitation by
> > >>>>>>> always forcing display buffers into VRAM or GTT.  Adding raven to
> > >>>>>>> this case may fix those tests but will make the overall experience
> > >>>>>>> worse because we'll end up effectively not being able to fully
> > >>>>>>> utilize both gtt and vram for display which would reintroduce all
> > >>>>>>> of the problems fixed by 81d0bcf99009 ("drm/amdgpu: make display
> > >>>>>>> pinning more flexible (v2)").
> > >>>>>>
> > >>>>>> Thanks Alex, the thing is, we only observe this on Raven boards, why
> > >>>>>> would Raven only be impacted by this? It would seem that all devices
> > >>>>>> would have this issue, no? Also, I'm not familiar with how
> > >>>>>
> > >>>>> It depends on memory pressure and available memory in each pool.
> > >>>>> E.g., initially the display buffer is in VRAM when the initial mode
> > >>>>> set happens.  The watermarks, etc. are set for that scenario.  One of
> > >>>>> the next frames ends up in a pool different than the original.  Now
> > >>>>> the buffer is in GTT.  The async flip interface does a fast validation
> > >>>>> to try and flip as soon as possible, but that validation fails because
> > >>>>> the watermarks need to be updated which requires a full modeset.
> > >>
> > >> Huh, I'm not sure if this actually is an issue for APUs. The fix that
> > >> introduced a check for same memory placement on async flips was on a
> > >> system with a DGPU, for which VRAM placement does matter:
> > >> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
> > >>
> > >> Looking around in DM/DML, for APUs, I don't see any logic that changes
> > >> DCN bandwidth validation depending on memory placement. There's a
> > >> gpuvm_enable flag for SG, but it's statically set to 1 on APU DCN
> > >> versions. It sounds like for APUs specifically, we *should* be able to
> > >> ignore the mem placement check. I can spin up a patch to test this out.
> > >
> > > Is the gpu_vm_support flag ever set for dGPUs?  The allowed domains
> > > for display buffers are determined by
> > > amdgpu_display_supported_domains() and we only allow GTT as a domain
> > > if gpu_vm_support is set, which I think is just for APUs.  In that
> > > case, we would probably only need the checks specifically for
> > > CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> > > and GTT (only one or the other?).  dGPUs and really old APUs will
> > > always get VRAM, and newer APUs will get VRAM | GTT.
> >
> > It doesn't look like gpu_vm_support is set for DGPUs
> > https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866
> >
> > Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. 
> > Maybe it had stability issues?
> > https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450
> 
> We need to be a little careful here: asic_type == CHIP_RAVEN covers
> several variants:
> apu_flags & AMD_APU_IS_RAVEN - raven1 (gpu_vm_support = false)
> apu_flags & AMD_APU_IS_RAVEN2 - raven2 (gpu_vm_support = true)
> apu_flags & AMD_APU_IS_PICASSO - picasso (gpu_vm_support = true)
> 
> amdgpu_display_supported_domains() only sets AMDGPU_GEM_DOMAIN_GTT if
> gpu_vm_support is true.  So we'd never get into the check in
> amdgpu_bo_get_preferred_domain() for raven1.
> 
> Anyway, back to your suggestion, I think we can probably drop the
> checks as you should always get a compatible memory buffer due to
> amdgpu_bo_get_preferred_domain(). Pinning should fail if we can't pin
> in the required domain.  amdgpu_display_supported_domains() will
> ensure you always get VRAM or GTT or VRAM | GTT depending on what the
> chip supports.  Then amdgpu_bo_get_preferred_domain() will either
> leave that as is, or force VRAM or GTT for the STONEY/CARRIZO case.
> On the off chance we do get incompatible memory, something like the
> attached patch should do the trick.
> 
> Alex
>
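
For anyone following along, the placement rules Alex describes above can be
modeled in a few lines of standalone C. This is only a sketch under stated
assumptions, not the driver code: the struct, enum, and function names are
made up for illustration, and the domain bit values and the 256 MiB
threshold are assumed stand-ins for AMDGPU_GEM_DOMAIN_* and
AMDGPU_SG_THRESHOLD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for AMDGPU_GEM_DOMAIN_GTT / AMDGPU_GEM_DOMAIN_VRAM. */
#define DOMAIN_GTT  0x2u
#define DOMAIN_VRAM 0x4u
/* Assumed stand-in for AMDGPU_SG_THRESHOLD (256 MiB). */
#define SG_THRESHOLD_MODEL (256ull << 20)

/* Hypothetical chip classes, named after the variants discussed above. */
enum chip_model {
	MODEL_DGPU, MODEL_CARRIZO, MODEL_STONEY,
	MODEL_RAVEN1, MODEL_RAVEN2, MODEL_PICASSO
};

/* Hypothetical, reduced view of the device state the rules depend on. */
struct dev_model {
	enum chip_model chip;
	bool gpu_vm_support;     /* false for dGPUs, old APUs, raven1 */
	uint64_t real_vram_size; /* bytes */
};

/* Models the rule: GTT is only offered when gpu_vm_support is set. */
static uint32_t model_supported_domains(const struct dev_model *d)
{
	uint32_t domain = DOMAIN_VRAM;

	assert(d != NULL);
	if (d->gpu_vm_support)
		domain |= DOMAIN_GTT;
	return domain;
}

/* Models the rule: carrizo/stoney must collapse VRAM | GTT to one pool. */
static uint32_t model_preferred_domain(const struct dev_model *d,
				       uint32_t domain)
{
	assert(d != NULL);
	if (domain == (DOMAIN_VRAM | DOMAIN_GTT) &&
	    (d->chip == MODEL_CARRIZO || d->chip == MODEL_STONEY)) {
		domain = DOMAIN_VRAM;
		if (d->real_vram_size <= SG_THRESHOLD_MODEL)
			domain = DOMAIN_GTT;
	}
	return domain;
}
```

In this model a raven1-like device (gpu_vm_support false) only ever sees
VRAM, so it never reaches the carrizo/stoney branch, which matches the point
made above about raven1.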


Thanks for the patch, Alex.

I have tested it, and though kms_async_flips and kms_plane_alpha_blend
pass, kms_plane_cursor still fails.

I am going to investigate a little more today and send more details from my
findings.

Thanks.
Cascardo.

> 
> >
> > - Leo
> >
> > >
> > > Alex
> > >
> > >>
> > >> Thanks,
> > >> Leo
> > >>
> > >>>>>
> > >>>>> It's tricky to fix because you don't want to use the worst case
> > >>>>> watermarks all the time because that will limit the number available
> > >>>>> display options and you don't want to force everything to a particular
> > >>>>> memory pool because that will limit the amount of memory that can be
> > >>>>> used for display (which is what the patch in question fixed).  Ideally
> > >>>>> the caller would do a test commit before the page flip to determine
> > >>>>> whether or not it would succeed before issuing it and then we'd have
> > >>>>> some feedback mechanism to tell the caller that the commit would fail
> > >>>>> due to buffer placement so it would do a full modeset instead.  We
> > >>>>> discussed this feedback mechanism last week at the display hackfest.
> > >>>>>
> > >>>>>
> > >>>>>> kms_plane_alpha_blend works, but does this also support that test
> > >>>>>> failing as the cause?
> > >>>>>
> > >>>>> That may be related.  I'm not too familiar with that test either, but
> > >>>>> Leo or Harry can provide some guidance.
> > >>>>>
> > >>>>> Alex
> > >>>>
> > >>>> Thanks everyone for the input so far. I have a question for the
> > >>>> maintainers, given that it seems that this is functionally broken for
> > >>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
> > >>>> it make sense to extend this proposed patch to all iGPUs until a more
> > >>>> permanent fix can be identified? At the end of the day I'll take
> > >>>> functional correctness over performance.
> > >>>
> > >>> It's not functional correctness, it's usability.  All that is
> > >>> potentially broken is async flips (which depend on memory pressure and
> > >>> buffer placement), while if you effectively revert the patch, you end
> > >>> up limiting all display buffers to either VRAM or GTT which may end
> > >>> up causing the inability to display anything because there is not
> > >>> enough memory in that pool for the next modeset.  We'll start getting
> > >>> bug reports about blank screens and failure to set modes because of
> > >>> memory pressure.  I think if we want a short term fix, it would be to
> > >>> always set the worst case watermarks.  The downside to that is that it
> > >>> would possibly cause some working display setups to stop working if
> > >>> they were on the margins to begin with.
> > >>>
> > >>> Alex
> > >>>
> > >>>>
> > >>>> Brian
> > >>>>
> > >>>>>
> > >>>>>>
> > >>>>>> Thanks again,
> > >>>>>> Brian
> > >>>>>>
> > >>>>>>>
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>> Brian
> > >>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Alex
> > >>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> > >>>>>>>>>> Cc: Luben Tuikov <luben.tui...@amd.com>
> > >>>>>>>>>> Cc: Christian König <christian.koe...@amd.com>
> > >>>>>>>>>> Cc: Alex Deucher <alexander.deuc...@amd.com>
> > >>>>>>>>>> Cc: sta...@vger.kernel.org # 6.1+
> > >>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <casca...@igalia.com>
> > >>>>>>>>>> Signed-off-by: Brian Geffon <bgef...@google.com>
> > >>>>>>>>>> ---
> > >>>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> > >>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>>>>>>>>>
> > >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> > >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> > >>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> > >>>>>>>>>>                                             uint32_t domain)
> > >>>>>>>>>>  {
> > >>>>>>>>>>         if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> > >>>>>>>>>> -           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> > >>>>>>>>>> +           ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> > >>>>>>>>>> +            (adev->asic_type == CHIP_RAVEN))) {
> > >>>>>>>>>>                 domain = AMDGPU_GEM_DOMAIN_VRAM;
> > >>>>>>>>>>                 if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> > >>>>>>>>>>                         domain = AMDGPU_GEM_DOMAIN_GTT;
> > >>>>>>>>>> --
> > >>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> > >>>>>>>>>>
> > >>
> >

> From cce1652c62c42c858de64c306ea0ddc7af3bd0b1 Mon Sep 17 00:00:00 2001
> From: Alex Deucher <alexander.deuc...@amd.com>
> Date: Fri, 18 Jul 2025 18:40:26 -0400
> Subject: [PATCH] drm/amd/display: refine framebuffer placement checks
> 
> When we commit planes, we need to make sure the
> framebuffer memory locations are compatible. Various
> hardware has the following requirements for display buffers:
> dGPUs, old APUs, raven1 - must be in VRAM
> carrizo/stoney - must be in VRAM or GTT, but not both
> newer APUs (raven2/picasso and newer) - can be in VRAM or GTT
> 
> You should always get a compatible memory buffer due to
> amdgpu_bo_get_preferred_domain(). amdgpu_display_supported_domains()
> will ensure you always get VRAM or GTT or VRAM | GTT depending on
> what the chip supports.  Then amdgpu_bo_get_preferred_domain()
> will either leave that as is when pinning, or force VRAM or GTT
> for the STONEY/CARRIZO case.
> 
> As such the checks could probably be removed, but on the off chance
> we do end up getting different memory pools for the old
> and new framebuffers, refine the check to take into account the
> hardware capabilities.
> 
> Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates")
> Reported-by: Brian Geffon <bgef...@google.com>
> Cc: Leo Li <sunpeng...@amd.com>
> Signed-off-by: Alex Deucher <alexander.deuc...@amd.com>
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 129476b6d5fa9..de2bd789ec15b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -9288,6 +9288,18 @@ static void amdgpu_dm_enable_self_refresh(struct amdgpu_crtc *acrtc_attach,
>       }
>  }
>  
> +static bool amdgpu_dm_mem_type_compatible(struct amdgpu_device *adev,
> +                                       struct drm_framebuffer *old_fb,
> +                                       struct drm_framebuffer *new_fb)
> +{
> +     if (!adev->mode_info.gpu_vm_support ||
> +         (adev->asic_type == CHIP_CARRIZO) ||
> +         (adev->asic_type == CHIP_STONEY))
> +             return get_mem_type(old_fb) == get_mem_type(new_fb);
> +
> +     return true;
> +}
> +
>  static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>                                   struct drm_device *dev,
>                                   struct amdgpu_display_manager *dm,
> @@ -9465,7 +9477,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>                */
>               if (crtc->state->async_flip &&
>                   (acrtc_state->update_type != UPDATE_TYPE_FAST ||
> -                  get_mem_type(old_plane_state->fb) != get_mem_type(fb)))
> +                  !amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb)))
>                       drm_warn_once(state->dev,
>                                     "[PLANE:%d:%s] async flip with non-fast update\n",
>                                     plane->base.id, plane->name);
> @@ -9473,7 +9485,7 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
>               bundle->flip_addrs[planes_count].flip_immediate =
>                       crtc->state->async_flip &&
>                       acrtc_state->update_type == UPDATE_TYPE_FAST &&
> -                     get_mem_type(old_plane_state->fb) == get_mem_type(fb);
> +                     amdgpu_dm_mem_type_compatible(dm->adev, old_plane_state->fb, fb);
>  
>               timestamp_ns = ktime_get_ns();
>               bundle->flip_addrs[planes_count].flip_timestamp_in_us = div_u64(timestamp_ns, 1000);
> @@ -11760,6 +11772,7 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>                                           struct drm_atomic_state *state,
>                                           struct drm_crtc_state *crtc_state)
>  {
> +     struct amdgpu_device *adev = drm_to_adev(dev);
>       struct drm_plane *plane;
>       struct drm_plane_state *new_plane_state, *old_plane_state;
>  
> @@ -11773,7 +11786,8 @@ static bool amdgpu_dm_crtc_mem_type_changed(struct drm_device *dev,
>               }
>  
>               if (old_plane_state->fb && new_plane_state->fb &&
> -                 get_mem_type(old_plane_state->fb) != get_mem_type(new_plane_state->fb))
> +                 !amdgpu_dm_mem_type_compatible(adev, old_plane_state->fb,
> +                                                new_plane_state->fb))
>                       return true;
>       }
>  
> -- 
> 2.50.1
>
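
To make the effect of the attached patch easier to reason about outside the
kernel, here is a simplified standalone model of the refined check and of
how it feeds the immediate-flip decision. It is a sketch only: the types and
function names are hypothetical, the carrizo/stoney case is reduced to a
single boolean, and get_mem_type() is replaced by passing the old and new
placement directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placement of a framebuffer. */
enum mem_type_model { MEM_VRAM, MEM_GTT };

/* Hypothetical update classes; only "fast" allows an immediate flip. */
enum update_type_model { UPDATE_FAST, UPDATE_MED, UPDATE_FULL };

/* Hypothetical, reduced view of the display capabilities. */
struct disp_model {
	bool gpu_vm_support; /* false: buffers must be in VRAM */
	bool one_pool_only;  /* true for the carrizo/stoney case */
};

/*
 * Models the refined check: hardware that cannot scan out from mixed
 * pools needs the old and new buffers in the same pool; everything
 * else treats any combination as compatible.
 */
static bool model_mem_type_compatible(const struct disp_model *d,
				      enum mem_type_model old_mem,
				      enum mem_type_model new_mem)
{
	assert(d != NULL);
	if (!d->gpu_vm_support || d->one_pool_only)
		return old_mem == new_mem;
	return true;
}

/* Models the flip_immediate expression from the commit-planes hunk. */
static bool model_flip_immediate(const struct disp_model *d,
				 bool async_flip,
				 enum update_type_model update,
				 enum mem_type_model old_mem,
				 enum mem_type_model new_mem)
{
	return async_flip &&
	       update == UPDATE_FAST &&
	       model_mem_type_compatible(d, old_mem, new_mem);
}
```

Under this model, an APU with gpu_vm_support set and no single-pool
restriction keeps the immediate flip even when the new buffer lands in a
different pool, while a single-pool device does not.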
