On 7/6/2025 10:28 PM, Lazar, Lijo wrote:
On 7/7/2025 2:04 AM, Mario Limonciello wrote:
On 7/4/2025 6:12 AM, Samuel Zhang wrote:
For normal hibernation, the GPU does not need to be resumed in thaw since
it is not involved in writing the hibernation image. Skipping the resume
in this case reduces the hibernation time.
Since you have the measurements would you mind including them in the
commit message for reference?
For cancelled hibernation, the GPU needs to be resumed.
If I'm following right you are actually handling two different things in
this patch aren't you?
1) A change in thaw() to only resume on aborted hibernation
2) A change in shutdown() to skip running if the device is in s4 when
shutdown() is called.
So I think it would be more logical to split this into two patches.
This is doing only one thing - Keep the device in suspended state for
thaw() operation during a successful hibernation. Splitting into two
could break hibernation during integration of the first part - it will
attempt another suspend during shutdown. I think we don't take care of
consecutive suspend calls.
Thanks,
Lijo
Got it; thanks for the clarification.
Signed-off-by: Samuel Zhang <guoqing.zh...@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 4f8632737574..e064816aae4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2541,6 +2541,10 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
if (amdgpu_ras_intr_triggered())
return;
+ /* device may not be resumed here, return immediately in this case */
+ if (adev->in_s4 && adev->in_suspend)
+ return;
+
/* if we are running in a VM, make sure the device
* torn down properly on reboot/shutdown.
* unfortunately we can't detect certain
@@ -2655,6 +2659,10 @@ static int amdgpu_pmops_thaw(struct device *dev)
{
struct drm_device *drm_dev = dev_get_drvdata(dev);
+ /* do not resume device for normal hibernation */
+ if (pm_transition.event == PM_EVENT_THAW)
+ return 0;
+
Without digging into pm.h documentation I think it's not going to be
very obvious next time we look at this code that amdgpu_device_resume()
is only intended for the aborted case.
How would you feel about a switch/case?
Something like this:
	switch (pm_transition.event) {
	/* normal hibernation */
	case PM_EVENT_THAW:
		return 0;
	/* for aborted hibernation */
	case PM_EVENT_RECOVER:
		return amdgpu_device_resume(drm_dev, true);
	default:
		return -EOPNOTSUPP;
	}
return amdgpu_device_resume(drm_dev, true);
}
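
To make Lijo's point about the two hunks concrete, here is a minimal userspace sketch of the interaction, not driver code. Everything in it is a hypothetical stand-in: `struct model_dev`, the `MODEL_EVENT_*` values and the `model_*()` helpers only mimic `adev->in_s4`, `adev->in_suspend`, `pm_transition.event` and the freeze/thaw/shutdown callbacks, and the way the flags are set and cleared is a simplifying assumption rather than what amdgpu actually does.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for PM_EVENT_THAW / PM_EVENT_RECOVER. */
enum model_pm_event {
	MODEL_EVENT_THAW,	/* image is being written: normal hibernation */
	MODEL_EVENT_RECOVER,	/* hibernation was aborted */
	MODEL_EVENT_OTHER,
};

/* Hypothetical model of the few device flags discussed in this thread. */
struct model_dev {
	bool in_s4;
	bool in_suspend;
	int suspend_calls;	/* how many times the device was suspended */
	int resume_calls;	/* how many times the device was resumed */
};

/* freeze(): suspend the device before the image is created. */
static int model_freeze(struct model_dev *d)
{
	d->in_s4 = true;
	d->in_suspend = true;
	d->suspend_calls++;
	return 0;
}

/* thaw(): resume only when hibernation was aborted. */
static int model_thaw(struct model_dev *d, enum model_pm_event ev)
{
	switch (ev) {
	case MODEL_EVENT_THAW:
		return 0;	/* stay suspended */
	case MODEL_EVENT_RECOVER:
		d->in_s4 = false;
		d->in_suspend = false;
		d->resume_calls++;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

/* shutdown(): must not suspend again if thaw() left the device suspended. */
static void model_shutdown(struct model_dev *d)
{
	if (d->in_s4 && d->in_suspend)
		return;
	d->in_suspend = true;
	d->suspend_calls++;
}
```

In this model, freeze, thaw with `MODEL_EVENT_THAW`, then shutdown leaves `suspend_calls` at 1. Dropping the early return in `model_shutdown()` would make it 2, which is the consecutive-suspend case that splitting the patch would expose.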