https://bugs.freedesktop.org/show_bug.cgi?id=108118

            Bug ID: 108118
           Summary: AMDGPU sometimes hangs forever when running graphical
                    applications
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: duoora...@gmail.com

Sometimes when running a graphical application the display freezes but
system sound continues. The machine is still functioning and is accessible
over SSH. Over SSH, the following related messages are found in the output of
dmesg:

[drm:amdgpu_job_timeout [amdgpu]] *ERROR* ring gfx timeout, signalled
seq=1256221, emitted seq=1256223
amdgpu 0000:10:00.0 GPU reset begin!
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:43:crtc-0] hw_done or
flip_done timed out
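
For anyone comparing logs, the two sequence numbers in the ring timeout message
can be pulled out with a short script. This is only a minimal sketch, assuming
the message wording quoted above (it may differ between kernel versions); the
function name parse_ring_timeout and the regular expression are illustrative
and not part of any driver tooling.

```python
import re

# Matches the ring timeout message as quoted in this report. The exact
# wording is an assumption and may vary between kernel versions.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, signall?ed seq=(?P<signalled>\d+), "
    r"emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(log_text):
    """Return (ring, signalled, emitted, pending), or None if no match."""
    # Pasted dmesg output is often line-wrapped, so collapse all
    # whitespace runs to single spaces before matching.
    flat = " ".join(log_text.split())
    match = TIMEOUT_RE.search(flat)
    if match is None:
        return None
    signalled = int(match.group("signalled"))
    emitted = int(match.group("emitted"))
    # pending = jobs emitted to the ring but not yet signalled as complete
    return match.group("ring"), signalled, emitted, emitted - signalled
```

For the message above this gives a difference of 2 between the emitted and
signalled sequence numbers on the gfx ring.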

This is the output with amdgpu.gpu_recovery=1 set on the kernel command line.
Without it, the output is the same except that the last two messages are
replaced with a message saying GPU recovery is disabled. Any attempt to
access /sys/kernel/debug/dri/0/amdgpu_gpu_recover in either configuration
hangs forever. Magic SysRq keys still work and processes can be killed over
SSH, but killing the game/Xorg/etc. does not bring the display back; a reset
is required.
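
To confirm which gpu_recovery value the running kernel actually picked up, the
module parameter can be read back from sysfs. This is a hedged sketch:
/sys/module/<module>/parameters/ is the standard location for module
parameters, but whether amdgpu exposes gpu_recovery there on a given kernel is
an assumption, and the helper name read_gpu_recovery is hypothetical.

```python
from pathlib import Path

# Standard sysfs location for module parameters. That amdgpu exposes
# gpu_recovery here on a given kernel is an assumption.
PARAM_PATH = Path("/sys/module/amdgpu/parameters/gpu_recovery")


def read_gpu_recovery(path=PARAM_PATH):
    """Return the parameter as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Module not loaded, parameter not exposed, or unexpected content.
        return None


if __name__ == "__main__":
    value = read_gpu_recovery()
    if value is None:
        print("gpu_recovery not readable (module not loaded or not exposed)")
    else:
        print(f"amdgpu.gpu_recovery = {value}")
```

This only reads a sysfs attribute; it does not touch the debugfs
amdgpu_gpu_recover file, which hangs as described above.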

This has been observed with both Xorg and KDE's Wayland compositor.

This has only been observed with Vulkan applications (native Dota 2 in Vulkan
mode and DXVK-backed Wine games), but I have not confirmed that it does not
occur with others.

This was observed with the libvulkan_radeon.so Mesa Vulkan driver. I couldn't
confirm the behavior with AMDVLK because graphical applications failed to
launch with it installed. I don't believe the GPU in question is faulty: I've
used it for long periods of time on Windows, where it is rock solid.

Observed on kernels 4.18.11 and 4.19.0-rc6. Searching around the Internet
suggests this may have started with 4.18.0 but I haven't confirmed that yet.

Hardware:
CPU: AMD R7 1800X
GPU: AMD RX Vega 64

