> Tainted: G
In the two upstream bugs it was noted that an amdgpu dkms package was
in place, and I believe that is the likely source of this issue. Commit
63a9ab264a8c came in 6.3-rc1, and the commit it fixes (b1a9557a7d00)
was also in 6.3-rc1.
So at least one of the issues is probably invalid in Ubuntu's 5.19, but
there are valid upstream bugs, including in 6.3 as there is still
another patch to test for one of the problems.
I'll adjust the tasks accordingly, as I think this should still be
tracked to fix in mantic.
** Also affects: linux (Ubuntu Mantic)
Importance: Undecided
Status: Confirmed
** Also affects: linux (Ubuntu Kinetic)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Kinetic)
Status: New => Invalid
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2018470
Title:
Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
Status in linux package in Ubuntu:
Confirmed
Status in linux source package in Kinetic:
Invalid
Status in linux source package in Mantic:
Confirmed
Bug description:
The day I updated from Ubuntu 22.04 to 22.10 some months ago, I had to
stick with Linux 5.15 because 5.19 was not working with my computer. I
spent the last two days looking for a way to run Linux 5.19, and found
one working version: 5.19.0-23.
Here are the versions I tested:
- 5.19.0-23
- 5.19.0-29
- 5.19.0-31
- 5.19.0-42
In that list, only Linux 5.19.0-23 is working with that computer.
There may be other versions that work I have not tested, but basically
the breakages occurred after 5.19.0-23.
I face two problems. Let's talk about the first one, the graphic one,
still present in 5.19.0-42. It starts to occur with 5.19.0-31
(5.19.0-29 is not affected): graphic breaks at the moment it should
switch from low resolution display to high resolution display at the
very beginning of startup. The computer is not completely broken, but
the graphic is dead (X11 cannot start, trying to use the framebuffer,
meaning the amdgpu driver is not functional).
The second bug is the one I get with the 5.19.0-29 version. Linux
5.19.0-29 doesn't experience the graphic bug but has another issue
that makes the computer unusable: some CPUs get locked up, a btrfs
process runs at 100% CPU, and syncing never ends, even preventing a
reboot. This bug is less important because I cannot reproduce it on
version 5.19.0-42, so if the graphic bug is fixed in 5.19.0-42, all
will be fine.
I have not updated to Ubuntu 23.04 yet because I'm afraid its newer
kernels would leave my computer totally unusable. I have run
Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today because of
that fear.
It actually took me two work days to test various combinations to boot
the computer, so I'm sticking with 5.19.0-23 for now, and I have limited
time to test other options. I also tried various BIOS options, and
also upgraded the BIOS…, and since that ThreadRipper PRO computer has
a very slow booting BIOS, trying various configurations or software
versions that require a reboot quickly eats up whole hours.
The attached logs may have traces of dkms modules like amdgpu-pro, but
the first time I experienced the bug I had none of them. I reproduced
the bug on a 5.19.0-42 kernel free of amdgpu-pro yesterday. I'm simply
opening the ticket from my working environment, and I decided not to
spend one more hour uninstalling amdgpu-pro and rebooting only to file
this ticket.
Here are some details on the hardware:
- MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named
WRX80PRO-F1 in dmidecode, dated 08/04/2022)
https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
- RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600
KSM32ED8/32ME 16Gbit Micron E
- CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
- GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
- GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
- GPU: ASPEED graphic Family rev 41
The ASPEED graphic is a small card integrated in the motherboard and
part of the BMC, so I cannot remove it. It may contribute to the
trouble.
When the graphic works (Linux 5.19.0-23, Linux 5.19.0-29), the boot is
displayed on all AMD and ASPEED graphic outputs, then at the moment the
graphic switches from low resolution to high resolution, the ASPEED
graphic goes off and the display continues on the AMD cards.
When the graphic doesn't work (5.19.0-31, 5.19.0-42), the boot is
displayed on all AMD and ASPEED graphic outputs, then at the moment the
graphic switches from low resolution to high resolution, the AMD cards
display garbage but the display continues on the ASPEED card. The
ASPEED card is a very basic integrated card without hardware
acceleration and with only one VGA output, so that setup is unusable.
As additional information, I know X11 never starts on the ASPEED card
if there are discrete cards plugged in (tested last year).
So right now that computer is sticking with Linux 5.19.0-23, which
doesn't have the graphic and btrfs bugs.
The last kernel to not feature the graphic bug is Linux 5.19.0-29.
Linux 5.19.0-31 is the first one reproducing the graphic bug (the
repository doesn't provide 5.19.0-30 for me to test).
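The regression window for the graphic bug can be worked out mechanically
from the test results listed in this report. What follows is only a
minimal sketch in Python, assuming a single good-to-bad transition; the
names graphics_ok, abi_number and regression_window are hypothetical
helpers made up for illustration, and the data is copied from the
description above.

```python
# Observed results for the graphic bug, as stated in this report:
# True = graphic works, False = graphic broken.
graphics_ok = {
    "5.19.0-23": True,
    "5.19.0-29": True,
    "5.19.0-31": False,
    "5.19.0-42": False,
}


def abi_number(version: str) -> int:
    """Return the Ubuntu ABI number, e.g. 23 for '5.19.0-23'."""
    return int(version.rsplit("-", 1)[1])


def regression_window(results: dict[str, bool]) -> tuple[str, str]:
    """Return (last_good, first_bad), assuming one good-to-bad transition."""
    last_good = None
    for version in sorted(results, key=abi_number):
        if results[version]:
            last_good = version
        elif last_good is not None:
            return last_good, version
    raise ValueError("no good-to-bad transition found")


last_good, first_bad = regression_window(graphics_ok)
print(f"last good: {last_good}, first bad: {first_bad}")
# prints "last good: 5.19.0-29, first bad: 5.19.0-31"
```

Any kernel build published between those two ABI numbers would narrow
the window further; none was available to test here.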
I also have reproduced the graphic bug when using the radeon driver
instead of the amdgpu one.
ProblemType: Bug
DistroRelease: Ubuntu 22.10
Package: linux-image-generic 5.19.0.42.38
ProcVersionSignature: Ubuntu 5.19.0-23.24-generic 5.19.7
Uname: Linux 5.19.0-23-generic x86_64
ApportVersion: 2.23.1-0ubuntu3.3
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: GNOME
Date: Thu May 4 11:52:02 2023
HibernationDevice: RESUME=none
MachineType: Default string Default string
ProcEnviron:
LANGUAGE=fr_FR:en
TERM=xterm-256color
PATH=(custom, no user)
LANG=fr_FR.UTF-8
SHELL=/bin/bash
ProcFB:
0 astdrmfb
1 amdgpudrmfb
2 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.19.0-23-generic
root=UUID=f35ecf77-511e-4dde-ac11-c1d848e97315 ro rootflags=subvol=@
amdgpu.si_support=1 radeon.si_support=0 amdgpu.cik_support=1
radeon.cik_support=0 amdgpu.exp_hw_support=1 amdgpu.gpu_recovery=1
amdgpu.ppfeaturemask=0xffffffff delayacct zswap.enabled=1
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No
PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
linux-restricted-modules-5.19.0-23-generic N/A
linux-backports-modules-5.19.0-23-generic N/A
linux-firmware 20220923.gitf09bebf3-0ubuntu1.6
RfKill:
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/04/2022
dmi.bios.release: 5.23
dmi.bios.vendor: American Megatrends International, LLC.
dmi.bios.version: WRX80PRO-F1
dmi.board.asset.tag: Default string
dmi.board.name: Default string
dmi.board.vendor: Default string
dmi.board.version: Default string
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias:
dmi:bvnAmericanMegatrendsInternational,LLC.:bvrWRX80PRO-F1:bd08/04/2022:br5.23:svnDefaultstring:pnDefaultstring:pvrDefaultstring:rvnDefaultstring:rnDefaultstring:rvrDefaultstring:cvnDefaultstring:ct3:cvrDefaultstring:skuDefaultstring:
dmi.product.family: Default string
dmi.product.name: Default string
dmi.product.sku: Default string
dmi.product.version: Default string
dmi.sys.vendor: Default string
modified.conffile..etc.default.apport: [modified]
mtime.conffile..etc.default.apport: 2018-06-16T17:39:00.798346
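The ProcKernelCmdLine field above carries the module options that steer
the GCN1 (SI) and GCN2 (CIK) cards to amdgpu instead of radeon. As a
rough sketch only, the following Python reads those options back out of
the command line string. The helper names module_params and
selected_driver are hypothetical, and the code only inspects the flags;
it does not check which driver actually bound to each card.

```python
# Command line copied from the ProcKernelCmdLine field of this report.
CMDLINE = (
    "BOOT_IMAGE=/@/boot/vmlinuz-5.19.0-23-generic "
    "root=UUID=f35ecf77-511e-4dde-ac11-c1d848e97315 ro rootflags=subvol=@ "
    "amdgpu.si_support=1 radeon.si_support=0 amdgpu.cik_support=1 "
    "radeon.cik_support=0 amdgpu.exp_hw_support=1 amdgpu.gpu_recovery=1 "
    "amdgpu.ppfeaturemask=0xffffffff delayacct zswap.enabled=1"
)


def module_params(cmdline: str) -> dict[str, dict[str, str]]:
    """Collect 'module.param=value' tokens into {module: {param: value}}."""
    params: dict[str, dict[str, str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        module, dot, name = key.partition(".")
        if sep and dot:  # skip plain flags and options without a module prefix
            params.setdefault(module, {})[name] = value
    return params


def selected_driver(params: dict[str, dict[str, str]], family: str) -> str:
    """Say which driver the flags request for a family ('si' or 'cik')."""
    option = f"{family}_support"
    if params.get("amdgpu", {}).get(option) == "1":
        return "amdgpu"
    if params.get("radeon", {}).get(option) == "1":
        return "radeon"
    return "kernel default"


params = module_params(CMDLINE)
for family in ("si", "cik"):
    print(family, "->", selected_driver(params, family))
```

With the command line from this report, both families are requested on
amdgpu, which matches the two amdgpudrmfb entries in ProcFB.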
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2018470/+subscriptions