On 10/24/2024 8:24 AM, Liang, Prike wrote:
> [Public]
>
>> From: Lazar, Lijo <lijo.la...@amd.com>
>> Sent: Wednesday, October 23, 2024 6:55 PM
>> To: Liang, Prike <prike.li...@amd.com>; amd-gfx@lists.freedesktop.org
>> Cc: Deucher, Alexander <alexander.deuc...@amd.com>
>> Subject: Re: [PATCH v2 2/2] drm/amdgpu: clean up the suspend_complete
>>
>>
>>
>> On 10/14/2024 1:19 PM, Prike Liang wrote:
>>> To check the status of S3 suspend completion, use the PM core
>>> pm_suspend_global_flags bit(1) to detect S3 abort events. Therefore,
>>> clean up the AMDGPU driver's private flag suspend_complete.
>>>
>>> Signed-off-by: Prike Liang <prike.li...@amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 --
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 --
>>> drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 ++--
>>> drivers/gpu/drm/amd/amdgpu/soc15.c | 7 ++-----
>>> drivers/gpu/drm/amd/amdgpu/soc21.c | 2 +-
>>> 5 files changed, 5 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 48c9b9b06905..9b35763ae0a7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -1111,8 +1111,6 @@ struct amdgpu_device {
>>> bool in_s3;
>>> bool in_s4;
>>> bool in_s0ix;
>>> - /* indicate amdgpu suspension status */
>>> - bool suspend_complete;
>>>
>>> enum pp_mp1_state mp1_state;
>>> struct amdgpu_doorbell_index doorbell_index; diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index 680e44fdee6e..78972151b970 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -2501,7 +2501,6 @@ static int amdgpu_pmops_suspend(struct device *dev)
>>> struct drm_device *drm_dev = dev_get_drvdata(dev);
>>> struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>
>>> - adev->suspend_complete = false;
>>> if (amdgpu_acpi_is_s0ix_active(adev))
>>> adev->in_s0ix = true;
>>> else if (amdgpu_acpi_is_s3_active(adev)) @@ -2516,7 +2515,6 @@
>>> static int amdgpu_pmops_suspend_noirq(struct device *dev)
>>> struct drm_device *drm_dev = dev_get_drvdata(dev);
>>> struct amdgpu_device *adev = drm_to_adev(drm_dev);
>>>
>>> - adev->suspend_complete = true;
>>> if (amdgpu_acpi_should_gpu_reset(adev))
>>> return amdgpu_asic_reset(adev);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> index be320d753507..ba8e66744376 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>>> @@ -3276,8 +3276,8 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device
>> *adev)
>>> * confirmed that the APU gfx10/gfx11 needn't such update.
>>> */
>>> if (adev->flags & AMD_IS_APU &&
>>> - adev->in_s3 && !adev->suspend_complete) {
>>> - DRM_INFO(" Will skip the CSB packet resubmit\n");
>>> + adev->in_s3 && !pm_resume_via_firmware()) {
>>> + DRM_INFO("Will skip the CSB packet resubmit\n");
>>> return 0;
>>> }
>>> r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> index 12ff6cf568dc..d9d11131a744 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> @@ -584,13 +584,10 @@ static bool soc15_need_reset_on_resume(struct
>> amdgpu_device *adev)
>>> * performing pm core test.
>>> */
>>> if (adev->flags & AMD_IS_APU && adev->in_s3 &&
>>> - !pm_resume_via_firmware()) {
>>> - adev->suspend_complete = false;
>>> + !pm_resume_via_firmware())
>>> return true;
>>> - } else {
>>> - adev->suspend_complete = true;
>>> + else
>>> return false;
>>> - }
>>> }
>>>
>>> static int soc15_asic_reset(struct amdgpu_device *adev) diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> b/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> index c4b950e75133..7a47a21ef00f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
>>> @@ -904,7 +904,7 @@ static bool soc21_need_reset_on_resume(struct
>> amdgpu_device *adev)
>>> * 2) S3 suspend got aborted and TOS is active.
>>> */
>>> if (!(adev->flags & AMD_IS_APU) && adev->in_s3 &&
>>> - !adev->suspend_complete) {
>>> + !pm_resume_via_firmware()) {
>>
>> Looks like this will cover only ACPI based systems. Not sure if that
>> assumption is
>> valid for dGPU cases.
>>
>> Thanks,
>> Lijo
>
> Yes, the pm_set_resume_via_firmware() function is only called during the
> ACPI_STATE_S3 suspend process. However, ACPI-enabled systems are popular in
> the desktop world. If there are concerns about ACPI configuration, one option
> could be to check if the dGPU needs a reset by directly checking the SOL
> register. As far as I can see, when the dGPU completes its suspend process,
> the SOL value will remain zero until the dGPU is resumed. Conversely, in the
> case of a suspend abort, the SOL value will be non-zero.
>
in_s3 is set for dGPU in case of s0ix as well. Probably, that's the only
case where need the flag to avoid unnecessary reset. Otherwise SOL check
could be sufficient.
Thanks,
Lijo
> Thanks,
> Prike
>>
>>> sol_reg1 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);
>>> msleep(100);
>>> sol_reg2 = RREG32_SOC15(MP0, 0, regMP0_SMN_C2PMSG_81);