Why, when I view the kernels in my Update Manager, is 5.15.0-116 shown as "Installed" but not "Active"?
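For reference, the same distinction can be checked outside Update Manager. This is only a minimal sketch, and it assumes that "Active" simply means the kernel that is currently running and that installed kernels show up as `vmlinuz-<version>` files in `/boot`; both are assumptions, not something confirmed in this thread. The function names `installed_kernels` and `kernel_status` are made up for illustration.

```python
import platform
import re
from pathlib import Path


def installed_kernels(boot_files):
    """Return the kernel versions found in a list of /boot file names."""
    versions = []
    for name in boot_files:
        # Only names like "vmlinuz-5.15.0-116-generic" count; a bare
        # "vmlinuz" symlink or an initrd image is ignored.
        match = re.fullmatch(r"vmlinuz-(.+)", name)
        if match:
            versions.append(match.group(1))
    return sorted(versions)  # plain string sort, good enough for display


def kernel_status(boot_files, running):
    """Label each installed version 'Active' (running) or 'Installed'."""
    return {
        version: "Active" if version == running else "Installed"
        for version in installed_kernels(boot_files)
    }


if __name__ == "__main__":
    boot = Path("/boot")
    names = [p.name for p in boot.iterdir()] if boot.is_dir() else []
    # platform.release() reports the running kernel, like `uname -r`.
    for version, status in kernel_status(names, platform.release()).items():
        print(f"{version}: {status}")
```

On a machine that booted 5.15.0-107-generic with 5.15.0-116-generic also present in `/boot`, this would label 107 as "Active" and 116 as "Installed", which matches what I am seeing.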
And when powering up I get: "error: file `/boot/' not found". Maybe this is the problem and not the new kernel?

Thanks,
Ralph

Sent with Proton Mail secure email.

On Tuesday, July 16th, 2024 at 7:02 PM, Erv Bendiks <2068...@bugs.launchpad.net> wrote:

> I just booted my AMD 3200G Linux Mint box with 5.15.0-116-generic and it
> didn't hang with a blank screen. This box passed the test.
>
> On Tue, Jul 16, 2024 at 11:00 AM Roger Ramjet 2068...@bugs.launchpad.net
> wrote:
>
> > Not sure if I'm replying correctly, but this bug fix did not help me.
> >
> > Unfortunately, I still have the same problem. After updating, I power
> > down and restart, and I get:
> > error: file `/boot/' not found.
> >
> > I view the kernels in the Update Manager, and it shows 5.15.0-116 is
> > "installed" and "supported until April 2027".
> >
> > The kernel is loaded and installed but "not found".
> > Then another window opens and I must choose "Boot from next volume".
> > Then another window opens where I'm given the choice of booting from
> > 5.15.0-107-generic (on /dev/sda5).
> > The new kernel is not listed in this window; I'm not sure why not, as it
> > seems it should be.
> >
> > If I can give you more info, let me know.
> >
> > Ralph Goe
> >
> > Sent with Proton Mail secure email.
> >
> > On Tuesday, July 16th, 2024 at 9:56 AM, Ubuntu Kernel Bot
> > 2068...@bugs.launchpad.net wrote:
> >
> > > This bug is awaiting verification that the linux-gke/5.15.0-1063.69
> > > kernel in -proposed solves the problem. Please test the kernel and
> > > update this bug with the results. If the problem is solved, change the
> > > tag 'verification-needed-jammy-linux-gke' to
> > > 'verification-done-jammy-linux-gke'. If the problem still exists,
> > > change the tag 'verification-needed-jammy-linux-gke' to
> > > 'verification-failed-jammy-linux-gke'.
> > >
> > > If verification is not done by 5 working days from today, this fix will
> > > be dropped from the source code, and this bug will be closed.
> > >
> > > See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
> > > how to enable and use -proposed. Thank you!
> > >
> > > ** Tags added: kernel-spammed-jammy-linux-gke-v2
> > > verification-needed-jammy-linux-gke
> > >
> > > --
> > > You received this bug notification because you are subscribed to a
> > > duplicate bug report (2069485).
> > > https://bugs.launchpad.net/bugs/2068738
> > >
> > > Title:
> > > AMD GPUs fail with null pointer dereference when IOMMU enabled,
> > > leading to black screen
> > >
> > > Status in linux package in Ubuntu:
> > > Fix Released
> > > Status in linux source package in Jammy:
> > > Fix Released
> > >
> > > Bug description:
> > > BugLink: https://bugs.launchpad.net/bugs/2068738
> > >
> > > [Impact]
> > >
> > > On systems with AMD Picasso/Raven 2 GPU devices, when the IOMMU is
> > > enabled, the system fails to boot correctly, and all users see is a
> > > black screen.
> > >
> > > This is caused by a null pointer dereference when enabling the IOMMU
> > > after the device has been initialised. It should happen the other way
> > > around.
> > >
> > > AMD-Vi: AMD IOMMUv2 loaded and initialized
> > > ...
> > > amdgpu: Topology: Add APU node [0x15d8:0x1002]
> > > kfd kfd: amdgpu: added device 1002:15d8
> > > kfd kfd: amdgpu: Failed to resume IOMMU for device 1002:15d8
> > > ...
> > > amdgpu 0000:06:00.0: amdgpu: amdgpu_device_ip_init failed
> > > amdgpu 0000:06:00.0: amdgpu: Fatal error during GPU init
> > > amdgpu 0000:06:00.0: amdgpu: amdgpu: finishing device.
> > > ...
> > > BUG: kernel NULL pointer dereference, address: 000000000000013c
> > > ...
> > > CPU: 1 PID: 223 Comm: systemd-udevd Not tainted 5.15.0-112-generic #122-Ubuntu
> > > ...
> > > RIP: 0010:amdgpu_dm_fini+0x149/0x1f0 [amdgpu]
> > > ...
> > > Call Trace:
> > > <TASK>
> > > ? srso_return_thunk+0x5/0x10
> > > ? show_trace_log_lvl+0x28e/0x2ea
> > > ? show_trace_log_lvl+0x28e/0x2ea
> > > ? dm_hw_fini+0x23/0x30 [amdgpu]
> > > ? show_regs.part.0+0x23/0x29
> > > ? __die_body.cold+0x8/0xd
> > > ? __die+0x2b/0x37
> > > ? page_fault_oops+0x13b/0x170
> > > ? srso_return_thunk+0x5/0x10
> > > ? do_user_addr_fault+0x321/0x670
> > > ? srso_return_thunk+0x5/0x10
> > > ? __free_pages_ok+0x34a/0x4f0
> > > ? exc_page_fault+0x77/0x170
> > > ? asm_exc_page_fault+0x27/0x30
> > > ? amdgpu_dm_fini+0x149/0x1f0 [amdgpu]
> > > dm_hw_fini+0x23/0x30 [amdgpu]
> > > amdgpu_device_ip_fini_early.isra.0+0x278/0x312 [amdgpu]
> > > amdgpu_device_fini_hw+0x156/0x208 [amdgpu]
> > > amdgpu_driver_unload_kms+0x69/0x90 [amdgpu]
> > > amdgpu_driver_load_kms.cold+0x81/0x107 [amdgpu]
> > > amdgpu_pci_probe+0x1d1/0x290 [amdgpu]
> > > local_pci_probe+0x4b/0x90
> > > ? srso_return_thunk+0x5/0x10
> > > pci_device_probe+0x119/0x200
> > > really_probe+0x222/0x420
> > > __driver_probe_device+0xe8/0x140
> > > driver_probe_device+0x23/0xc0
> > > __driver_attach+0xf7/0x1f0
> > > ? __device_attach_driver+0x140/0x140
> > > bus_for_each_dev+0x7f/0xd0
> > > driver_attach+0x1e/0x30
> > > bus_add_driver+0x148/0x220
> > > ? srso_return_thunk+0x5/0x10
> > > driver_register+0x95/0x100
> > > __pci_register_driver+0x68/0x70
> > > amdgpu_init+0x7c/0x1000 [amdgpu]
> > > ? 0xffffffffc0e0b000
> > > do_one_initcall+0x49/0x1e0
> > > ? srso_return_thunk+0x5/0x10
> > > ? kmem_cache_alloc_trace+0x19e/0x2e0
> > > do_init_module+0x52/0x260
> > > load_module+0xb45/0xbe0
> > > __do_sys_finit_module+0xbf/0x120
> > > __x64_sys_finit_module+0x18/0x20
> > > x64_sys_call+0x1ac3/0x1fa0
> > > do_syscall_64+0x56/0xb0
> > > ...
> > > entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > >
> > > A workaround does exist. Users can add "nomodeset" or "amd_iommu=off"
> > > to GRUB_CMDLINE_LINUX_DEFAULT, run update-grub and reboot.
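
The workaround quoted above amounts to a one-line text edit. As a rough sketch only: the helper below appends a parameter to the `GRUB_CMDLINE_LINUX_DEFAULT` line in the text of a GRUB defaults file (on Ubuntu-based systems that variable normally lives in `/etc/default/grub`). The function name `add_kernel_param` and the sample file contents are hypothetical, and the sketch assumes the value is written in double quotes on a single line. It only transforms text; it does not write the file or run `update-grub`.

```python
import re


def add_kernel_param(grub_text, param):
    """Return grub_text with param appended to GRUB_CMDLINE_LINUX_DEFAULT.

    The text is returned unchanged if the variable is missing or the
    parameter is already present.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=)"([^"]*)"', re.MULTILINE
    )

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return f'{match.group(1)}"{" ".join(params)}"'

    return pattern.sub(repl, grub_text, count=1)


# Hypothetical file contents, for illustration only.
sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample, "amd_iommu=off"))
```

After saving an edit like this, the remaining steps are the ones the bug description names: run `update-grub` (as root) and reboot.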
> > >
> > > [Fix]
> > >
> > > The regression was caused by the following commit that landed in
> > > 5.15.0-112-generic, and 5.15.150 upstream:
> > >
> > > commit 3c7e53c0d4b43ffe6e7715414b5f2b3177881ecd ubuntu-jammy
> > > Author: Yifan Zhang yifan1.zh...@amd.com
> > > Date: Tue Sep 28 15:42:35 2021 +0800
> > > Subject: drm/amdgpu: init iommu after amdkfd device init
> > > Link: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit/?id=3c7e53c0d4b43ffe6e7715414b5f2b3177881ecd
> > >
> > > The fix is to revert this patch, as it was not supposed to be
> > > backported to 5.15 stable.
> > >
> > > The mailing list discussion with AMD developers is:
> > >
> > > https://lore.kernel.org/amd-gfx/20240523173031.4212-1-w_ar...@gmx.de/
> > >
> > > The fix hasn't been acknowledged by Greg KH or Sasha Levin yet, so
> > > sending as an Ubuntu SAUCE patch. If the upstream status changes, we
> > > can NAK and resend.
> > >
> > > [Testcase]
> > >
> > > You need a system with an AMD Picasso/Raven 2 device. It will likely
> > > be an APU, and not a discrete graphics card, but any AMD Picasso/Raven
> > > 2 device is affected.
> > >
> > > Install the kernel and boot. Make sure full modesetting is enabled.
> > >
> > > There is a test kernel available in the PPA below:
> > >
> > > https://launchpad.net/~mruffell/+archive/ubuntu/lp2068738-test
> > >
> > > If you install the test kernel, your system should boot successfully.
> > >
> > > [Where problems could occur]
> > >
> > > We are reverting a problematic patch and going back to how it was
> > > before 5.15.0-112-generic. This should not cause any issues for users.
> > >
> > > If a regression were to occur, users can add "nomodeset" or
> > > "amd_iommu=off" to GRUB_CMDLINE_LINUX_DEFAULT and reboot, or pin their
> > > kernel to a working one.
> > >
> > > The impact of a regression would be high, as users' displays could be
> > > blank.
> > >
> > > [Other Info]
> > >
> > > User reports:
> > > https://forums.linuxmint.com/viewtopic.php?t=421484
> > > https://forums.linuxmint.com/viewtopic.php?t=421441
> > > https://www.reddit.com/r/Ubuntu/comments/1d9uviz/had_to_purge_kernel_5150112_could_not_boot/
> > > https://www.reddit.com/r/linuxmint/comments/1d9w6c9/kernel_5150112_boot_failure/
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068735
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068793
> > > https://bugs.launchpad.net/bugs/2068812
> > >
> > > As bizarre as it is, this commit was actually originally included in
> > > 5.15-rc5:
> > >
> > > commit 714d9e4574d54596973ee3b0624ee4a16264d700
> > > Author: Yifan Zhang yifan1.zh...@amd.com
> > > Date: Tue Sep 28 15:42:35 2021 +0800
> > > Subject: drm/amdgpu: init iommu after amdkfd device init
> > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=714d9e4574d54596973ee3b0624ee4a16264d700
> > >
> > > It seems to have caused issues back then too, and was removed in the
> > > following fixups, in 5.16-rc1:
> > >
> > > commit 93cec184788b0cf3926bc1f7b47fed74ba87990c
> > > Author: James Zhu james....@amd.com
> > > Date: Tue Nov 2 21:33:50 2021 -0400
> > > Subject: drm/amdgpu: remove duplicated kfd_resume_iommu
> > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=93cec184788b0cf3926bc1f7b47fed74ba87990c
> > >
> > > commit 9f4f2c1a35248f56b2a9c1c004e0aaff3609b15d
> > > Author: shaoyunl shaoyun....@amd.com
> > > Date: Fri Nov 5 12:34:14 2021 -0400
> > > Subject: drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
> > > Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f4f2c1a35248f56b2a9c1c004e0aaff3609b15d
> > >
> > > I'm not exactly in favor of rewriting history twice, so I think we
> > > should just revert the upstream stable patch and move on.
> > >
> > > To manage notifications about this bug go to:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068738/+subscriptions
> >
> > --
> > You received this bug notification because you are subscribed to the bug
> > report.
> > https://bugs.launchpad.net/bugs/2068738
>
> --
> Erv Bendiks
>
> 416-816-9802

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2068738

Title:
  AMD GPUs fail with null pointer dereference when IOMMU enabled,
  leading to black screen

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2068738/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs