Since upgrading from 5.10 (self-compiled or Debian kernel) to 5.15
(self-compiled from the same config, or the Debian kernel), and also
with the Debian 5.16 kernel, I get this behavior:
1) The first suspend and resume works.
2) Every later suspend fails, with kernel exceptions in amdgpu:
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.803952] Hardware name: Micro-Star International Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.803956] Workqueue: pm pm_runtime_work
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.803966] RIP: 0010:dm_suspend+0x241/0x260 [amdgpu]
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804340] Code: 4c 89 e6 4c 89 ef e8 ee 8a 16 00 83 f8 01 74 21 89 c2 48 c7 c6 a0 59 69 c1 48 c7 c7 60 83 76 c1 e8 14 ba 06 ff e9 6d ff ff ff <0f> 0b e9 f8 fd ff ff 4c 89 e6 4c 89 ef e8 6d ba 15 00 e9 56 ff ff
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804344] RSP: 0018:ffffae6040647c90 EFLAGS: 00010286
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804348] RAX: 0000000000000000 RBX: ffff973088a40000 RCX: 0000000000000000
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804351] RDX: 000000000000000a RSI: 0000000000000000 RDI: ffff973088a40000
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804353] RBP: 0000000000000000 R08: 0000000003c0ca00 R09: 0000000080380002
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804356] R10: ffff9730860f25a0 R11: 000000000000005f R12: ffff973088a40000
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804358] R13: ffff97308132a0d0 R14: 0000000000000008 R15: 0000000000000000
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804361] FS: 0000000000000000(0000) GS:ffff97339f680000(0000) knlGS:0000000000000000
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804366] CR2: 00007fb8c9ce6000 CR3: 00000001115fa000 CR4: 0000000000350ee0
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804369] Call Trace:
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804375] <TASK>
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804380] ? nv_common_set_clockgating_state+0xa3/0xb0 [amdgpu]
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804693] amdgpu_device_ip_suspend_phase1+0x63/0xc0 [amdgpu]
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.804977] amdgpu_device_suspend+0x66/0x110 [amdgpu]
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805260] amdgpu_pmops_runtime_suspend+0xad/0x180 [amdgpu]
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805542] pci_pm_runtime_suspend+0x5a/0x160
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805549] ? pci_dev_put+0x20/0x20
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805553] __rpm_callback+0x44/0x150
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805558] ? pci_dev_put+0x20/0x20
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805561] rpm_callback+0x59/0x70
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805565] ? pci_dev_put+0x20/0x20
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805568] rpm_suspend+0x14a/0x720
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805572] ? _raw_spin_unlock+0x16/0x30
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805580] ? finish_task_switch.isra.0+0xc1/0x2f0
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805586] ? __switch_to+0x114/0x440
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805593] pm_runtime_work+0x94/0xa0
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805597] process_one_work+0x1e8/0x3c0
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805604] worker_thread+0x50/0x3b0
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805608] ? rescuer_thread+0x370/0x370
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805611] kthread+0x16b/0x190
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805616] ? set_kthread_struct+0x40/0x40
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805621] ret_from_fork+0x22/0x30
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805630] </TASK>
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805632] ---[ end trace 8d77579b410d926d ]---
Feb 14 15:40:58 pink-floyd3 kernel: [ 831.142133] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 14 15:40:58 pink-floyd3 kernel: [ 831.142444] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Feb 14 15:40:59 pink-floyd3 kernel: [ 831.462157] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 14 15:40:59 pink-floyd3 kernel: [ 831.462465] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Feb 14 15:40:59 pink-floyd3 kernel: [ 831.782375] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Feb 14 15:41:05 pink-floyd3 kernel: [ 837.297839] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000003A SMN_C2PMSG_82:0x00000000
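
To pull just the driver errors out of a longer log for comparison between kernel versions, something like the following may help. It is only a rough sketch: the function name amdgpu_errors and the regular expression are my own, and it assumes the log is saved as text with one message per line (not wrapped) in the syslog format shown above.

```python
import re

# Captures the kernel timestamp (seconds since boot) and the message text
# from a syslog-style line such as:
#   "Feb 14 15:40:58 host kernel: [ 831.142133] message..."
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(.*)")


def amdgpu_errors(log_text):
    """Return (timestamp, message) pairs for lines the driver flagged *ERROR*."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m and "*ERROR*" in m.group(2):
            hits.append((float(m.group(1)), m.group(2)))
    return hits


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
Feb 14 15:40:58 pink-floyd3 kernel: [ 831.142133] amdgpu 0000:03:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Feb 14 15:40:58 pink-floyd3 kernel: [ 831.142444] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Feb 14 15:40:59 pink-floyd3 kernel: [ 831.782375] [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Feb 14 15:40:58 pink-floyd3 kernel: [ 830.805630] </TASK>
"""

if __name__ == "__main__":
    for ts, msg in amdgpu_errors(SAMPLE):
        print(f"{ts:.6f} {msg}")
```

Run on the sample, this keeps the three *ERROR* lines and drops the </TASK> line; the same function can be fed the contents of a saved log file.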