On 2025-07-18 17:33, Alex Deucher wrote:
> On Fri, Jul 18, 2025 at 5:02 PM Leo Li <sunpeng...@amd.com> wrote:
>>
>>
>>
>> On 2025-07-18 16:07, Alex Deucher wrote:
>>> On Fri, Jul 18, 2025 at 1:57 PM Brian Geffon <bgef...@google.com> wrote:
>>>>
>>>> On Thu, Jul 17, 2025 at 10:59 AM Alex Deucher <alexdeuc...@gmail.com>
>>>> wrote:
>>>>>
>>>>> On Wed, Jul 16, 2025 at 8:13 PM Brian Geffon <bgef...@google.com> wrote:
>>>>>>
>>>>>> On Wed, Jul 16, 2025 at 5:03 PM Alex Deucher <alexdeuc...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Wed, Jul 16, 2025 at 12:40 PM Brian Geffon <bgef...@google.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Jul 16, 2025 at 12:33 PM Alex Deucher <alexdeuc...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 16, 2025 at 12:18 PM Brian Geffon <bgef...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Commit 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>> allowed newer ASICs to mix GTT and VRAM. That change also noted that
>>>>>>>>>> some older boards, such as Stoney and Carrizo, do not support this.
>>>>>>>>>> It appears that at least one additional ASIC, Raven, does not support
>>>>>>>>>> this either.
>>>>>>>>>>
>>>>>>>>>> We observed this issue when migrating a device from a 5.4 to a 6.6
>>>>>>>>>> kernel and have confirmed that Raven also needs to be excluded from
>>>>>>>>>> mixing GTT and VRAM.
>>>>>>>>>
>>>>>>>>> Can you elaborate a bit on what the problem is? For carrizo and
>>>>>>>>> stoney this is a hardware limitation (all display buffers need to be
>>>>>>>>> in GTT or VRAM, but not both). Raven and newer don't have this
>>>>>>>>> limitation and we tested raven pretty extensively at the time.
>>>>>>>>
>>>>>>>> Thanks for taking the time to look. We have automated testing, and a
>>>>>>>> few igt-gpu-tools tests failed. After debugging, we found that
>>>>>>>> commit 81d0bcf99009 is what introduced the failures on this hardware
>>>>>>>> on 6.1+ kernels. The specific tests that fail are kms_async_flips and
>>>>>>>> kms_plane_alpha_blend. Excluding Raven from this sharing of GTT and
>>>>>>>> VRAM buffers resolves the issue.
>>>>>>>
>>>>>>> + Harry and Leo
>>>>>>>
>>>>>>> This sounds like the memory placement issue we discussed last week.
>>>>>>> In that case, the issue is related to where the buffer ends up when we
>>>>>>> try to do an async flip. We can't do an async flip
>>>>>>> without a full modeset if the buffer locations are different than at the
>>>>>>> last modeset because we need to update more than just the buffer base
>>>>>>> addresses. This change works around that limitation by always forcing
>>>>>>> display buffers into VRAM or GTT. Adding Raven to this case may fix
>>>>>>> those tests but will make the overall experience worse because we'll
>>>>>>> end up effectively not being able to fully utilize both GTT and
>>>>>>> VRAM for display, which would reintroduce all of the problems fixed by
>>>>>>> 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)").
>>>>>>
>>>>>> Thanks Alex. The thing is, we only observe this on Raven boards. Why
>>>>>> would only Raven be impacted by this? It would seem that all devices
>>>>>> would have this issue, no? Also, I'm not familiar with how
>>>>>
>>>>> It depends on memory pressure and available memory in each pool.
>>>>> E.g., initially the display buffer is in VRAM when the initial mode
>>>>> set happens. The watermarks, etc. are set for that scenario. One of
>>>>> the next frames ends up in a pool different than the original. Now
>>>>> the buffer is in GTT. The async flip interface does a fast validation
>>>>> to try and flip as soon as possible, but that validation fails because
>>>>> the watermarks need to be updated which requires a full modeset.
>>
>> Huh, I'm not sure if this actually is an issue for APUs. The fix that
>> introduced a check for same memory placement on async flips was on a system
>> with a DGPU, for which VRAM placement does matter:
>> https://github.com/torvalds/linux/commit/a7c0cad0dc060bb77e9c9d235d68441b0fc69507
>>
>> Looking around in DM/DML, for APUs, I don't see any logic that changes DCN
>> bandwidth validation depending on memory placement. There's a gpuvm_enable
>> flag for SG, but it's statically set to 1 on APU DCN versions. It sounds like
>> for APUs specifically, we *should* be able to ignore the mem placement check.
>> I can spin up a patch to test this out.
>
> Is the gpu_vm_support flag ever set for dGPUs? The allowed domains
> for display buffers are determined by
> amdgpu_display_supported_domains() and we only allow GTT as a domain
> if gpu_vm_support is set, which I think is just for APUs. In that
> case, we would probably only need the checks specifically for
> CHIP_CARRIZO and CHIP_STONEY since IIRC, they don't support mixed VRAM
> and GTT (only one or the other?). dGPUs and really old APUs will
> always get VRAM, and newer APUs will get VRAM | GTT.

It doesn't look like gpu_vm_support is set for DGPUs:
https://elixir.bootlin.com/linux/v6.15.6/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L1866

Though interestingly, further up at #L1858, Raven has gpu_vm_support = 0. Maybe
it had stability issues?
https://github.com/torvalds/linux/commit/098c13079c6fdd44f10586b69132c392ebf87450
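
To make the domain selection discussed in this thread concrete, here is a
minimal sketch in plain C. It is a simplified model, not the kernel code:
struct device_model, supported_domains(), preferred_domain(), the DOMAIN_*
values and SG_THRESHOLD are illustrative stand-ins, and the 256 MiB threshold
is only a placeholder. It assumes GTT is allowed only when gpu_vm_support is
set, and that only Carrizo and Stoney are restricted to a single pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the AMDGPU_GEM_DOMAIN_* bits (assumed values). */
#define DOMAIN_GTT   0x2u
#define DOMAIN_VRAM  0x4u
/* Placeholder for AMDGPU_SG_THRESHOLD; the real value is not given here. */
#define SG_THRESHOLD (256ull * 1024 * 1024)

/* Hypothetical, reduced view of the ASIC types relevant to this discussion. */
enum asic_model { ASIC_CARRIZO, ASIC_STONEY, ASIC_OTHER };

/* Hypothetical, reduced view of the device state the decision depends on. */
struct device_model {
	enum asic_model asic;
	bool gpu_vm_support;     /* described in the thread as APU-only */
	uint64_t real_vram_size; /* bytes */
};

/* Models the rule "only allow GTT as a domain if gpu_vm_support is set". */
static uint32_t supported_domains(const struct device_model *dev)
{
	uint32_t domain = DOMAIN_VRAM;

	assert(dev);
	if (dev->gpu_vm_support)
		domain |= DOMAIN_GTT;
	return domain;
}

/* Models the unpatched check: Carrizo and Stoney must pick exactly one pool. */
static uint32_t preferred_domain(const struct device_model *dev, uint32_t domain)
{
	bool single_pool_only;

	assert(dev);
	single_pool_only = dev->asic == ASIC_CARRIZO || dev->asic == ASIC_STONEY;

	if (domain == (DOMAIN_VRAM | DOMAIN_GTT) && single_pool_only) {
		/* Prefer VRAM unless the carve-out is too small to be useful. */
		domain = DOMAIN_VRAM;
		if (dev->real_vram_size <= SG_THRESHOLD)
			domain = DOMAIN_GTT;
	}
	return domain;
}
```

In this model a DGPU (gpu_vm_support unset) always ends up with VRAM only, a
Carrizo or Stoney with a small carve-out ends up with GTT only, and any other
ASIC with gpu_vm_support set keeps VRAM | GTT.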
- Leo
>
> Alex
>
>>
>> Thanks,
>> Leo
>>
>>>>>
>>>>> It's tricky to fix because you don't want to use the worst case
>>>>> watermarks all the time because that will limit the number of available
>>>>> display options and you don't want to force everything to a particular
>>>>> memory pool because that will limit the amount of memory that can be
>>>>> used for display (which is what the patch in question fixed). Ideally
>>>>> the caller would do a test commit before the page flip to determine
>>>>> whether or not it would succeed before issuing it and then we'd have
>>>>> some feedback mechanism to tell the caller that the commit would fail
>>>>> due to buffer placement so it would do a full modeset instead. We
>>>>> discussed this feedback mechanism last week at the display hackfest.
>>>>>
>>>>>
>>>>>> kms_plane_alpha_blend works, but does this also support that test
>>>>>> failing as the cause?
>>>>>
>>>>> That may be related. I'm not too familiar with that test either, but
>>>>> Leo or Harry can provide some guidance.
>>>>>
>>>>> Alex
>>>>
>>>> Thanks everyone for the input so far. I have a question for the
>>>> maintainers: given that it seems that this is functionally broken for
>>>> ASICs which are iGPUs, and there does not seem to be an easy fix, does
>>>> it make sense to extend this proposed patch to all iGPUs until a more
>>>> permanent fix can be identified? At the end of the day I'll take
>>>> functional correctness over performance.
>>>
>>> It's not functional correctness, it's usability. All that is
>>> potentially broken is async flips (which depend on memory pressure and
>>> buffer placement), while if you effectively revert the patch, you end
>>> up limiting all display buffers to either VRAM or GTT which may end
>>> up causing the inability to display anything because there is not
>>> enough memory in that pool for the next modeset. We'll start getting
>>> bug reports about blank screens and failure to set modes because of
>>> memory pressure. I think if we want a short term fix, it would be to
>>> always set the worst case watermarks. The downside to that is that it
>>> would possibly cause some working display setups to stop working if
>>> they were on the margins to begin with.
>>>
>>> Alex
>>>
>>>>
>>>> Brian
>>>>
>>>>>
>>>>>>
>>>>>> Thanks again,
>>>>>> Brian
>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>>>
>>>>>>>>
>>>>>>>> Brian
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Alex
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)")
>>>>>>>>>> Cc: Luben Tuikov <luben.tui...@amd.com>
>>>>>>>>>> Cc: Christian König <christian.koe...@amd.com>
>>>>>>>>>> Cc: Alex Deucher <alexander.deuc...@amd.com>
>>>>>>>>>> Cc: sta...@vger.kernel.org # 6.1+
>>>>>>>>>> Tested-by: Thadeu Lima de Souza Cascardo <casca...@igalia.com>
>>>>>>>>>> Signed-off-by: Brian Geffon <bgef...@google.com>
>>>>>>>>>> ---
>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
>>>>>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> index 73403744331a..5d7f13e25b7c 100644
>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>>>>>>> @@ -1545,7 +1545,8 @@ uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>>>>>>>>  					    uint32_t domain)
>>>>>>>>>>  {
>>>>>>>>>>  	if ((domain == (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) &&
>>>>>>>>>> -	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY))) {
>>>>>>>>>> +	    ((adev->asic_type == CHIP_CARRIZO) || (adev->asic_type == CHIP_STONEY) ||
>>>>>>>>>> +	     (adev->asic_type == CHIP_RAVEN))) {
>>>>>>>>>>  		domain = AMDGPU_GEM_DOMAIN_VRAM;
>>>>>>>>>>  		if (adev->gmc.real_vram_size <= AMDGPU_SG_THRESHOLD)
>>>>>>>>>>  			domain = AMDGPU_GEM_DOMAIN_GTT;
>>>>>>>>>> --
>>>>>>>>>> 2.50.0.727.gbf7dc18ff4-goog
>>>>>>>>>>
>>