Public bug reported:

We operate a cluster of Linux-based edge routers in a highly dynamic environment where BGP-advertised routes change constantly as next hops appear and disappear. In regions with heavy IPv6 traffic, we periodically hit kernel soft lockups. Our investigation suggests that the lockups occur in the fib6_select_path function, while it iterates over the siblings in the circular multipath linked list to select the best path. The problem typically arises when the linked list is unexpectedly deleted during the traversal (its reference count reaches zero), so the loop never terminates. The resulting soft lockup, if not resolved, ends in a system panic triggered by the watchdog timer.

It's worth noting that this is a longstanding issue: we have been dealing with it for about a year, and it persists across the different kernel versions we have tried. The attached `.crash` report was generated by `apport-cli`.

Thanks,
Omid

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: linux-image-6.5.0-26-generic 6.5.0-26.26~22.04.1
ProcVersionSignature: Ubuntu 6.5.0-26.26~22.04.1-generic 6.5.13
Uname: Linux 6.5.0-26-generic x86_64
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
CasperMD5CheckResult: unknown
Date: Tue Apr 23 00:00:14 2024
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-hwe-6.5
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: linux-signed-hwe-6.5 (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug jammy uec-images

** Attachment added: "report extracted from the core dump by `apport-cli`"
   https://bugs.launchpad.net/bugs/2063495/+attachment/5770523/+files/linux-image-6.5.0-27-generic-202404231849.crash

--
https://bugs.launchpad.net/bugs/2063495

Title:
  Kernel soft lockup occurs in `fib6_select_path` function

Status in linux-signed-hwe-6.5 package in Ubuntu:
  New