https://bugzilla.kernel.org/show_bug.cgi?id=220553

Bug ID: 220553
Summary: Suspend to mem fails on rx5600xt (regression in 6.16.2)
Product: Drivers
Version: 2.5
Hardware: AMD
OS: Linux
Status: NEW
Severity: normal
Priority: P3
Component: Video(DRI - non Intel)
Assignee: drivers_video-...@kernel-bugs.osdl.org
Reporter: rbmc...@gmail.com
Regression: No

Created attachment 308637
  --> https://bugzilla.kernel.org/attachment.cgi?id=308637&action=edit
git diff that demonstrates what fixed it for me

The GPU reset fails when attempting to suspend. I first noticed this in 6.16.2
(on Arch), and it still occurs with 6.17.0-rc5.

This is in the dmesg:

[   25.934105] amdgpu 0000:2d:00.0: amdgpu: MODE1 reset
[   25.934111] amdgpu 0000:2d:00.0: amdgpu: GPU mode1 reset
[   25.934172] amdgpu 0000:2d:00.0: amdgpu: GPU psp mode1 reset
[   26.656776] amdgpu 0000:2d:00.0: amdgpu: psp reg (0x16061) wait timed out, mask: 8000ffff, read: ffffffff exp: 80000000
[   26.656780] [drm] psp mode 1 reset failed!
[   26.656782] amdgpu 0000:2d:00.0: amdgpu: GPU mode1 reset failed
[   26.656783] amdgpu 0000:2d:00.0: PM: pci_pm_suspend_noirq(): amdgpu_pmops_suspend_noirq [amdgpu] returns -22
[   26.656944] amdgpu 0000:2d:00.0: PM: dpm_run_callback(): pci_pm_suspend_noirq returns -22
[   26.656949] amdgpu 0000:2d:00.0: PM: failed to suspend async noirq: error -22
[   26.706582] PM: noirq suspend of devices failed

I tracked it down to commit 8345a71fc54b, which replaces numbers with named
constants. The commit uses the wrong named constant in several places, and the
two MASKs it introduces are identical when only one of them should include the
STATUS_MASK.

Note that the preexisting comments on some of the psp_wait_for calls did not
actually reflect what the code was doing, and that may have caused the errors
in this commit.

Restoring the numbers to their original values solves my problem. To help out,
I have attached the patch I used.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
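
For readers unfamiliar with masked register waits, here is a minimal sketch of the comparison implied by the "wait timed out" line in the dmesg. It is not the amdgpu psp_wait_for() implementation: masked_match() and TOP_BIT_ONLY are hypothetical names made up for illustration, and only the three LOG_* values are taken from the log. It assumes the wait succeeds once (value & mask) equals the expected value, which is why a mask that wrongly includes or omits the status bits changes the outcome.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the amdgpu source): a masked wait is
 * assumed to succeed once the masked register value equals `expected`. */
static bool masked_match(uint32_t value, uint32_t expected, uint32_t mask)
{
    return (value & mask) == expected;
}

/* Values copied from the "psp reg (0x16061) wait timed out" dmesg line. */
#define LOG_MASK     0x8000ffffu
#define LOG_READ     0xffffffffu
#define LOG_EXPECTED 0x80000000u

/* Hypothetical narrower mask that checks bit 31 only, for comparison.
 * Whether this is the right mask for any given wait is an assumption. */
#define TOP_BIT_ONLY 0x80000000u
```

With the logged values, masked_match(LOG_READ, LOG_EXPECTED, LOG_MASK) is false because the low 16 bits of the read value are all set, while the same read compared under TOP_BIT_ONLY is true. This only illustrates how the choice of mask decides whether the wait can complete; the attached patch remains the reference for which constants the driver originally used.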