Hello,
I have a PC that worked fine for many years and then sat unused for half
a year. Yesterday I wanted to use it again, but sway appears to crash
amdgpu in DRM. The components are:

- ASUS System Product Name/TUF GAMING B650M-PLUS
- AMD Ryzen 9 7950X 16-Core Processor
- Debian trixie

I already tried the following:

        - Upgrading to Debian forky
        - Debian trixie live cd
        - Installing the latest AMD GPU firmware
        - Updating the BIOS to the latest version

To reproduce the issue, I boot Linux, start sway and open an alacritty
terminal with tmux running inside. amdgpu crashes immediately. Below are
links to the full dmesg, the device coredump and a video.

https://tg.st/u/dmesg_9f62587406fb808dc4d91d41029ccf88ceeadf13e1f91d65c27b57536f375550.txt
https://tg.st/u/amdgpu_device_coredump_data_a25f2060c56260bb46ac95ee3123969d5127bf31b29ea3adfe3feeac67bf4edc.zst
https://tg.st/u/VID_20251222_071051104.mp4

[   57.342777] amdgpu 0000:0b:00.0: amdgpu: Dumping IP State
[   57.343822] amdgpu 0000:0b:00.0: amdgpu: Dumping IP State Completed
[   57.343869] amdgpu 0000:0b:00.0: amdgpu: [drm] AMDGPU device coredump file has been created
[   57.343871] amdgpu 0000:0b:00.0: amdgpu: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
[   57.343872] amdgpu 0000:0b:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=106, emitted seq=108
[   57.343873] amdgpu 0000:0b:00.0: amdgpu:  Process sway pid 2021 thread sway:cs0 pid 2317
[   57.343875] amdgpu 0000:0b:00.0: amdgpu: Starting gfx_0.0.0 ring reset
[   57.485168] amdgpu 0000:0b:00.0: amdgpu: Ring gfx_0.0.0 reset failed
[   57.485170] amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
[   57.609921] amdgpu 0000:0b:00.0: amdgpu: MODE2 reset
[   57.616920] amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
[   57.617008] [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
[   57.617024] amdgpu 0000:0b:00.0: amdgpu: PSP is resuming...
[   57.638326] amdgpu 0000:0b:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
[   57.832236] amdgpu 0000:0b:00.0: amdgpu: RAS: optional ras ta ucode is not available
[   57.837959] amdgpu 0000:0b:00.0: amdgpu: RAP: optional rap ta ucode is not available
[   57.837961] amdgpu 0000:0b:00.0: amdgpu: SECUREDISPLAY: optional securedisplay ta ucode is not available
[   57.837963] amdgpu 0000:0b:00.0: amdgpu: SMU is resuming...
[   57.838869] amdgpu 0000:0b:00.0: amdgpu: SMU is resumed successfully!
[   57.839132] amdgpu 0000:0b:00.0: amdgpu: kiq ring mec 2 pipe 1 q 0
[   57.842333] amdgpu 0000:0b:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x05002C00
[   57.944932] amdgpu 0000:0b:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[   57.944935] amdgpu 0000:0b:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
[   57.944936] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
[   57.944937] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
[   57.944938] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[   57.944938] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[   57.944939] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[   57.944939] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[   57.944940] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[   57.944940] amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[   57.944941] amdgpu 0000:0b:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
[   57.944941] amdgpu 0000:0b:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
[   57.944942] amdgpu 0000:0b:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[   57.944943] amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[   57.944943] amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[   57.944944] amdgpu 0000:0b:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[   57.948092] amdgpu 0000:0b:00.0: amdgpu: GPU reset(1) succeeded!
[   57.948107] amdgpu 0000:0b:00.0: [drm] device wedged, but recovered through reset
[   57.961832] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
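
In case it helps with going through the full dmesg linked above, here is
a minimal Python sketch that pulls the ring timeout events out of a log
like this one. It is not part of any amdgpu tooling. The helper names
(unwrap, find_timeouts) are made up for illustration, and the pattern is
only inferred from the timeout line in the excerpt above, so it may need
adjusting for other kernel versions.

```python
import re

STAMP_RE = re.compile(r"\[\s*\d+\.\d+\]")

# Pattern inferred from the "ring ... timeout" line in the excerpt above.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu (?P<dev>\S+): amdgpu: "
    r"ring (?P<ring>\S+) timeout, signaled seq=(?P<signaled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)


def unwrap(text):
    """Rejoin log lines that a mail client wrapped.

    A line that does not start with a "[ timestamp]" stamp is treated as
    the continuation of the previous line. Blank lines are skipped.
    """
    joined = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if STAMP_RE.match(line) or not joined:
            joined.append(line.rstrip())
        else:
            joined[-1] = joined[-1].rstrip() + " " + line.strip()
    return joined


def find_timeouts(text):
    """Return one dict per ring timeout found in the dmesg text."""
    events = []
    for line in unwrap(text):
        m = TIMEOUT_RE.search(line)
        if m:
            signaled = int(m.group("signaled"))
            emitted = int(m.group("emitted"))
            events.append({
                "time": float(m.group("ts")),
                "device": m.group("dev"),
                "ring": m.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                # Fences emitted but not yet signaled when the timeout hit.
                "pending": emitted - signaled,
            })
    return events
```

Calling find_timeouts() on the contents of the dmesg file returns one
entry per timeout, including how many emitted fences had not yet been
signaled, which makes it easier to see whether the same ring hangs every
time.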

I'd be grateful for any pointers toward resolving the issue, and I'm
available for further debugging.

Cheers,
        Thomas
