5.13.0-24.24 fixes it for me. With 5.13.0-23.23 my server doesn't boot at all:
it starts booting and then the monitor goes inactive, so I don't
even have a way to see an error message.
My configuration:
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Root Complex
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wani
[Radeon R5/R6/R7 Graphics] (rev 85)
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Host Bridge
00:02.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Root Port
00:02.5 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Root Port
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Host Bridge
00:08.0 Encryption controller: Advanced Micro Devices, Inc. [AMD] Carrizo
Platform Security Processor
00:09.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Carrizo Audio Dummy
Host Bridge
00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI
Controller (rev 20)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller
[AHCI mode] (rev 49)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI
Controller (rev 49)
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 4a)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 0
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 15h (Models
60h-6fh) Processor Function 5
01:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2
4-port SATA 6 Gb/s RAID Controller (rev 11)
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720
Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720
Gigabit Ethernet PCIe
--
https://bugs.launchpad.net/bugs/1956401
Title:
amdgpu hangs for 90 seconds at a time in 5.13.0-23, but 5.13.0-22
works
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Impish:
Confirmed
Bug description:
SRU Justification
Impact:
This does not occur with linux-image-5.13.0-22-generic, but does with
linux-image-5.13.0-23-generic.
On startup, I get about a 60 second hang, with the following in the kernel
dmesg:
Jan 4 15:26:36 inspiron-3505 kernel: [ 34.160572] amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Jan 4 15:26:56 inspiron-3505 kernel: [ 54.189055] amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
Jan 4 15:27:16 inspiron-3505 kernel: [ 74.329264] amdgpu 0000:04:00.0: amdgpu: failed to write reg 28b4 wait reg 28c6
Jan 4 15:27:36 inspiron-3505 kernel: [ 94.337904] amdgpu 0000:04:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706
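For anyone comparing their own logs against this one, here is a rough Python sketch that pulls the kernel timestamps out of the "failed to write reg" messages and prints the gaps between them. The sample lines are abbreviated copies of the four messages above (hostname and PCI address dropped), and the function names are made up for illustration.

```python
import re

# Abbreviated copies of the four messages quoted above; timestamps and
# register numbers are unchanged, hostname and PCI address are dropped.
LOG = """\
[   34.160572] amdgpu: failed to write reg 28b4 wait reg 28c6
[   54.189055] amdgpu: failed to write reg 1a6f4 wait reg 1a706
[   74.329264] amdgpu: failed to write reg 28b4 wait reg 28c6
[   94.337904] amdgpu: failed to write reg 1a6f4 wait reg 1a706
"""

# Matches the kernel timestamp and both register numbers on one line.
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*failed to write reg (?P<reg>[0-9a-f]+) "
    r"wait reg (?P<wait>[0-9a-f]+)"
)


def parse_failures(text):
    """Return (timestamp, reg, wait_reg) for each matching log line."""
    events = []
    for line in text.splitlines():
        m = PATTERN.search(line)
        if m:
            events.append((float(m["ts"]), m["reg"], m["wait"]))
    return events


def gaps(events):
    """Seconds between consecutive failure messages."""
    return [round(b[0] - a[0], 3) for a, b in zip(events, events[1:])]


events = parse_failures(LOG)
print("gaps between failures (s):", gaps(events))
print("first to last (s):", round(events[-1][0] - events[0][0], 3))
```

On the lines above the messages come out roughly 20 seconds apart, about 60 seconds from first to last, which matches the hang described.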
I have the following GPU:
04:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Picasso (rev c2) (prog-if 00 [VGA controller])
04:00.0 0300: 1002:15d8 (rev c2)
(This is a Ryzen 5 3450U CPU with Radeon Vega Mobile.)
I get a similar hang if I start firefox (when it's probing OpenGL
contexts), and even with glxgears and glxinfo. It seems that anything
that creates an OpenGL context triggers it. I also had a freeze when
I tried running firefox and glxgears together, along with odd "BUG:"
messages in the log (some are in the attached log).
I was running with "iommu=pt", but I tried with it removed and still
got the errors (I think the amdgpu driver uses the IOMMU even when
iommu=pt is set, though). See the attached log for some very odd
"[Hardware Error]" messages that were logged on one test run. I think
this was when I tried to run firestorm (a Second Life viewer), which
paused for a long time and then opened to a black window.
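When testing with and without "iommu=pt" it is easy to lose track of what the running kernel was actually booted with. A small sketch, assuming a Linux system where /proc/cmdline is readable; the helper names are hypothetical:

```python
def iommu_options(cmdline):
    """Return the kernel command-line tokens that mention the IOMMU."""
    return [tok for tok in cmdline.split() if "iommu" in tok.lower()]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()


try:
    found = iommu_options(read_cmdline())
    print("IOMMU options on this boot:", found or "none")
except OSError as exc:
    # Not on Linux, or /proc is not mounted.
    print("could not read kernel command line:", exc)
```

An empty result means no IOMMU option was passed, so the kernel is using its default IOMMU mode.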
Per Google, there was a similar bug that turned up in kernel
5.14.15 and was fixed in 5.14.17. See
https://gitlab.freedesktop.org/drm/amd/-/issues/1770
Thanks!
--Henry
Fix:
upstream commit afd18180c070 ("drm/amdkfd: fix boot failure when iommu is
disabled in Picasso.")
The patch was included in the Impish kernel in -proposed (5.13.0.24.24)
from an upstream patch set. Multiple users have confirmed that the problem is
resolved with the kernel in -proposed.
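To check quickly which side of the regression a machine is on, something like the following sketch could be used. The status table only reflects what was reported in this bug (5.13.0-22 works, 5.13.0-23 is affected, 5.13.0-24 carries the fix); it is not an authoritative list, and the function names are hypothetical.

```python
import platform
import re

# Status per Impish ABI number, as reported in this bug only.
REPORTED = {22: "works", 23: "affected", 24: "fixed (from -proposed)"}


def abi_number(release):
    """Extract the ABI number from a release such as '5.13.0-23-generic'."""
    m = re.match(r"5\.13\.0-(\d+)", release)
    return int(m.group(1)) if m else None


def reported_status(release):
    """Look up what this bug report says about the given kernel release."""
    return REPORTED.get(abi_number(release), "not covered by this report")


running = platform.release()
print(running, "->", reported_status(running))
```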