On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng...@amd.com> wrote:
>
> On 2025-07-18 16:07, Alex Deucher wrote:
> > On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgef...@google.com> wrote:
> >>
> >> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeuc...@gmail.com> wrote:
> >>>
> >>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgef...@google.com> wrote:
> >>>>
> >>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeuc...@gmail.com> wrote:
> >>>>>
> >>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgef...@google.com> wrote:
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeuc...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgef...@google.com> wrote:
> >>>>>>>>
> >>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)") allowed newer ASICs to mix GTT and VRAM; it also noted that some older boards, such as Stoney and Carrizo, do not support this. It appears that at least one additional ASIC, Raven, does not support this either.
> >>>>>>>>
> >>>>>>>> We observed this issue when migrating a device from a 5.4 to a 6.6 kernel and have confirmed that Raven also needs to be excluded from mixing GTT and VRAM.
> >>>>>>>
> >>>>>>> Can you elaborate a bit on what the problem is? For Carrizo and Stoney this is a hardware limitation (all display buffers need to be in GTT or VRAM, but not both). Raven and newer don't have this limitation, and we tested Raven pretty extensively at the time.
> >>>>>>
> >>>>>> Thanks for taking the time to look. We have automated testing, and a few igt-gpu-tools tests failed; after debugging we found that commit 81d0bcf99009 is what introduced the failures on this hardware on 6.1+ kernels. The specific tests that fail are kms_async_flips and kms_plane_alpha_blend; excluding Raven from this sharing of GTT and VRAM buffers resolves the issue.
> >>>>>
> >>>>> + Harry and Leo
> >>>>>
> >>>>> This sounds like the memory placement issue we discussed last week. The issue is related to where the buffer ends up when we try to do an async flip: we can't do an async flip without a full modeset if the buffer's location differs from what it was at the last modeset, because we need to update more than just the buffer base addresses. This change works around that limitation by always forcing display buffers into VRAM or GTT. Adding Raven to this case may fix those tests, but it will make the overall experience worse because we'll effectively not be able to fully utilize both GTT and VRAM for display, which would reintroduce all of the problems fixed by 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
> >>>>
> >>>> Thanks Alex. The thing is, we only observe this on Raven boards; why would only Raven be impacted by this? It would seem that all devices would have this issue, no? Also, I'm not familiar with how
> >>>
> >>> It depends on memory pressure and available memory in each pool. E.g., initially the display buffer is in VRAM when the initial mode set happens. The watermarks, etc. are set for that scenario. One of the next frames ends up in a pool different than the original.
> >>> Now the buffer is in GTT. The async flip interface does a fast validation to try to flip as soon as possible, but that validation fails because the watermarks need to be updated, which requires a full modeset.
>
> Huh, I'm not sure if this actually is an issue for APUs. The fix that introduced a check for same memory placement on async flips was on a system with a dGPU, for which VRAM placement does matter:
> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
>
> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN bandwidth validation depending on memory placement. There's a gpuvm_enable flag for SG, but it's statically set to 1 on APU DCN versions. It sounds like for APUs specifically, we *should* be able to ignore the mem placement check. I can spin up a patch to test this out.
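
For reference, the check Leo points at is shaped roughly like the sketch below. This is a simplified illustration rather than the verbatim hunk from the commit linked above; the helper name and the exact call site in amdgpu_dm's atomic check path are from memory, and NULL-fb handling is omitted:

  /* Derive which TTM pool a framebuffer's BO currently lives in
   * (TTM_PL_VRAM vs. TTM_PL_TT, i.e. GTT). Illustrative helper name. */
  static inline uint32_t dm_fb_mem_type(struct drm_framebuffer *fb)
  {
          struct amdgpu_bo *abo = gem_to_amdgpu_bo(fb->obj[0]);

          return abo->tbo.resource ? abo->tbo.resource->mem_type : 0;
  }

  /* Called only when the new CRTC state requests an async flip: if the new
   * framebuffer sits in a different pool than the one being scanned out,
   * new watermarks (and therefore a full modeset) would be required, so the
   * fast async path has to be rejected. */
  static int dm_check_async_flip_placement(struct drm_crtc *crtc,
                                           struct drm_plane_state *old_plane_state,
                                           struct drm_plane_state *new_plane_state)
  {
          if (dm_fb_mem_type(old_plane_state->fb) !=
              dm_fb_mem_type(new_plane_state->fb)) {
                  drm_dbg_atomic(crtc->dev,
                                 "async flip rejected: fb memory placement changed\n");
                  return -EINVAL;
          }
          return 0;
  }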
Is the gpu_vm_support flag ever set for dGPUs? The allowed domains for display buffers are determined by amdgpu_display_supported_domains(), and we only allow GTT as a domain if gpu_vm_support is set, which I think is just for APUs. In that case, we probably only need the checks specifically for CHIP_CARRIZO and CHIP_STONEY, since IIRC they don't support mixed VRAM and GTT (only one or the other?). dGPUs and really old APUs will always get VRAM, and newer APUs will get VRAM | GTT.

Alex
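
The placement policy described here boils down to roughly the following. This is a condensed sketch, not the literal body of amdgpu_display_supported_domains(); the exact home of the gpu_vm_support flag is quoted from memory:

  uint32_t domain = AMDGPU_GEM_DOMAIN_VRAM;

  /* Newer APUs with gpu_vm_support can also scan out from system memory
   * via SG, so GTT becomes a valid display domain for them. dGPUs and
   * pre-Carrizo APUs never take this branch and stay VRAM-only. */
  if ((adev->flags & AMD_IS_APU) && adev->mode_info.gpu_vm_support)
          domain |= AMDGPU_GEM_DOMAIN_GTT;

  /* Carrizo/Stoney (and Raven, with the proposed patch) cannot mix pools,
   * so amdgpu_bo_get_preferred_domain() collapses VRAM | GTT down to a
   * single pool based on how much real VRAM the board has. */
  domain = amdgpu_bo_get_preferred_domain(adev, domain);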
>
> Thanks,
> Leo
>
> >>>
> >>> It's tricky to fix because you don't want to use the worst-case watermarks all the time, because that will limit the number of available display options, and you don't want to force everything into a particular memory pool, because that will limit the amount of memory that can be used for display (which is what the patch in question fixed). Ideally the caller would do a test commit before the page flip to determine whether or not it would succeed before issuing it, and then we'd have some feedback mechanism to tell the caller that the commit would fail due to buffer placement so it would do a full modeset instead. We discussed this feedback mechanism last week at the display hackfest.
> >>>
> >>>> kms_plane_alpha_blend works, but would this also explain that test failing?
> >>>
> >>> That may be related. I'm not too familiar with that test either, but Leo or Harry can provide some guidance.
> >>>
> >>> Alex
> >>
> >> Thanks everyone for the input so far. I have a question for the maintainers: given that this seems to be functionally broken for ASICs which are iGPUs, and there does not seem to be an easy fix, does it make sense to extend this proposed patch to all iGPUs until a more permanent fix can be identified? At the end of the day I'll take functional correctness over performance.
> >
> > It's not functional correctness, it's usability. All that is potentially broken is async flips (which depend on memory pressure and buffer placement), while if you effectively revert the patch, you end up limiting all display buffers to either VRAM or GTT, which may make it impossible to display anything at all because there is not enough memory in that pool for the next modeset. We'll start getting bug reports about blank screens and failures to set modes because of memory pressure. I think if we want a short-term fix, it would be to always set the worst-case watermarks. The downside is that it could cause some working display setups to stop working if they were on the margins to begin with.
> >
> > Alex
> >
> >>
> >> Brian
> >>
> >>>
> >>>>
> >>>> Thanks again,
> >>>> Brian
> >>>>
> >>>>>
> >>>>> Alex
> >>>>>
> >>>>>>
> >>>>>> Brian
> >>>>>>
> >>>>>>>
> >>>>>>> Alex
> >>>>>>>
> >>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
> >>>>>>>> Cc: Luben Tuikov <luben.tui...@amd.com>
> >>>>>>>> Cc: Christian König <christian.koe...@amd.com>
> >>>>>>>> Cc: Alex Deucher <alexander.deuc...@amd.com>
> >>>>>>>> Cc: sta...@vger.kernel.org # 6.1+
> >>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <casca...@igalia.com>
> >>>>>>>> Signed-off-by: Brian Geffon <bgef...@google.com>
> >>>>>>>> ---
> >>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
> >>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> index 73403744331a..5d7f13e25b7c 100644
> >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> >>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
> >>>>>>>>  					uint32_t domain)
> >>>>>>>>  {
> >>>>>>>>  	if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
> >>>>>>>> -	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
> >>>>>>>> +	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
> >>>>>>>> +	     (adev->asic_type == CHIP_RAVEN))) {
> >>>>>>>>  		domain = AMDGPU_GEM_DOMAIN_VRAM;
> >>>>>>>>  		if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
> >>>>>>>>  			domain = AMDGPU_GEM_DOMAIN_GTT;
> >>>>>>>> --
> >>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
> >>>>>>>>
>