Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: mesa (Ubuntu)
       Status: New => Confirmed

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to mesa in Ubuntu.
https://bugs.launchpad.net/bugs/1928393

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

Status in amd:
  New
Status in linux-firmware package in Ubuntu:
  Incomplete
Status in mesa package in Ubuntu:
  Confirmed

Bug description:
  After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent and severe GPU instability. When this happens, I see this error in dmesg:

  [20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
  [20061.061103] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x800000401000 from client 27
  [20061.061135] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
  [20061.061147] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
  [20061.061157] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
  [20061.061167] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
  [20061.061174] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
  [20061.061183] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
  [20061.061189] amdgpu 0000:03:00.0: amdgpu: RW: 0x0

  I'll attach a couple of full dmesgs that I collected.

  Often when this happens, the screen and keyboard freeze irreversibly (I tried waiting for more than 30 minutes, but it doesn't help). I can still log in via ssh though.

  When there's no freeze, I can continue using the computer normally, but the laptop fans are always running and the battery depletes fast. There's probably something stuck in a permanent loop either in the kernel or in the GPU.

  This bug happens several times a day, rendering the machine so unstable as to be almost unusable. It is a severe regression and I'm aghast that it passed AMD's Quality Assurance.

  After downgrading back to linux-firmware 1.190.5, the machine is back to the previous, mostly-reliable state. Which is to say, this bug is gone, I'm just left with the other amdgpu suspend bug I've learned to live with since I bought this computer.

  Please revert the amdgpu firmware in this package as soon as possible. This is unbearable.

  Relevant information:

  Ubuntu version: 21.04
  Linux kernel: 5.11.0-17-generic x86_64
  CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
  GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)
  Laptop model: Lenovo Ideapad S145

To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1928393/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
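
For anyone comparing their own logs against this report, the "retry page fault" lines quoted in the bug description can be picked out of a saved dmesg capture with a short script. This is only a rough sketch: the regular expressions are modeled on the two log lines quoted in this report, the function name `find_retry_faults` is made up for illustration, and other kernel versions may word these messages differently.

```python
import re

# Assumption: both patterns are modeled on the log lines quoted in this
# bug report; other kernels may format these messages differently.
FAULT_RE = re.compile(
    r"amdgpu: \[(?P<hub>\w+)\] retry page fault "
    r"\(src_id:(?P<src_id>\d+) ring:(?P<ring>\d+) vmid:(?P<vmid>\d+) "
    r"pasid:(?P<pasid>\d+), for process (?P<process>\S+) pid (?P<pid>\d+)"
)
ADDR_RE = re.compile(
    r"in page starting at address (?P<address>0x[0-9a-fA-F]+) "
    r"from client (?P<client>\d+)"
)


def find_retry_faults(dmesg_text):
    """Return one dict per 'retry page fault' line found in dmesg_text.

    The faulting address from the following 'in page starting at
    address' line is attached to the most recent fault when present.
    All values are kept as strings, exactly as they appear in the log.
    """
    faults = []
    for line in dmesg_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            faults.append(match.groupdict())
            continue
        match = ADDR_RE.search(line)
        if match and faults and "address" not in faults[-1]:
            faults[-1]["address"] = match.group("address")
            faults[-1]["client"] = match.group("client")
    return faults
```

Passing the text of a saved dmesg file to `find_retry_faults` gives a list whose length is the number of faults seen, which makes it easy to compare how often the error occurs before and after a firmware change.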