[Kernel-packages] [Bug 2081685] Re: [Ubuntu 24.04-generic Kernel-6.8]Hard lockup on 8 Socket System, ThinkSystem SR950 V3.

Paolo Pisati Thu, 26 Jun 2025 08:11:06 -0700

Hi Shixiong / Shuang,

According to the 6.8.0-58.log in #59, you are no longer experiencing the
issue, am I correct?


As mentioned in #56, our bisect was at this point:

4d60b13f267d workqueue: Don't call cpumask_test_cpu() with -1 CPU in wq_update_node_max_active()
adc1b642f72f workqueue: Implement system-wide nr_active enforcement for unbound workqueues
929b7fbecbcc workqueue: Introduce struct wq_node_nr_active
afd774d513f5 workqueue: RCU protect wq->dfl_pwq and implement accessors for it
31a8e16645d7 workqueue: Make wq_adjust_max_active() round-robin pwqs while activating
e4bbec8ce062 workqueue: Move nr_active handling into helpers
865f7641cf47 workqueue: Replace pwq_activate_inactive_work() with [__]pwq_activate_work()
a88074533304 workqueue: Factor out pwq_is_empty()
5d378b3d47e1 workqueue: Move pwq->max_active to wq->max_active
eb182ba1f6cb workqueue.c: Increase workqueue name length
...
7fdb45c9bbbc (tag: Ubuntu-6.8.0-31.31, refs/bisect/good-7fdb45c9bbbc95a3300b4d8de3f751f4c05c98e2) UBUNTU: Ubuntu-6.8.0-31.31

In particular, all of those workqueue patches were reverted upstream in
v6.8.4:

https://github.com/gregkh/linux/commits/v6.8.4/

because they were causing several regressions, so any kernel that has those
reverts should be good.
Can you confirm that with 6.8.0-58 you are no longer experiencing this issue?
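
As a rough aid for checking whether a given tree carries those reverts, the
sketch below takes commit subjects (for example the output of
`git log --format=%s` on the kernel tree in question) and reports which of the
suspect workqueue patches are present without a matching revert. It assumes the
reverts keep git's default `Revert "<original subject>"` subject line, and it
ignores ordering, so a patch that was reverted and later re-applied would be
missed. The function name and the sample data are illustrative only.

```python
def unreverted(subjects, suspects):
    """Return suspect subjects that appear in `subjects` with no matching revert.

    Assumption: a revert is recorded with git's default subject,
    'Revert "<original subject>"'. Commit order is not considered.
    """
    present = set(subjects)
    prefix = 'Revert "'
    reverted = {
        s[len(prefix):-1]
        for s in subjects
        if s.startswith(prefix) and s.endswith('"')
    }
    return [s for s in suspects if s in present and s not in reverted]


if __name__ == "__main__":
    # Hypothetical log excerpt, not taken from any real tree.
    log_subjects = [
        'Revert "workqueue: Factor out pwq_is_empty()"',
        "workqueue: Introduce struct wq_node_nr_active",
        "workqueue: Factor out pwq_is_empty()",
    ]
    suspects = [
        "workqueue: Introduce struct wq_node_nr_active",
        "workqueue: Factor out pwq_is_empty()",
        "workqueue: Move nr_active handling into helpers",
    ]
    for subject in unreverted(log_subjects, suspects):
        print("still applied:", subject)
```

An empty result means every suspect patch is either absent or has a revert in
the log that was supplied; it does not by itself prove the lockup is gone.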

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2081685

Title:
  [Ubuntu 24.04-generic Kernel-6.8]Hard lockup on 8 Socket System,
  ThinkSystem SR950 V3.

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Noble:
  In Progress
Status in linux source package in Oracular:
  Confirmed

Bug description:
  A CPU hard lockup is detected under Ubuntu 24.04 LTS (kernel
  6.8.0-38). See attachment "dmesg0723-Lockup-Ubuntu24.04.log".

  ubuntu@SR950V3:~$ cat /var/log/dmesg | grep -i  lockup

  [   15.241164] kernel: watchdog: Watchdog detected hard LOCKUP on cpu 124

  [   15.241164] kernel:  ? watchdog_hardlockup_check+0x1cb/0x3b0
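
The same check can be scripted when comparing several saved logs. This is a
minimal sketch that assumes the watchdog message has the form shown in the
excerpt above ("Watchdog detected hard LOCKUP on cpu N"); the function name
and the file path in the usage note are illustrative.

```python
import re

# Matches the watchdog report seen in the log excerpt; \s+ also tolerates
# a CPU number that was wrapped onto the next line.
LOCKUP_RE = re.compile(r"Watchdog detected hard LOCKUP on cpu\s+(\d+)", re.IGNORECASE)


def hard_lockup_cpus(dmesg_text):
    """Return the sorted, de-duplicated CPU numbers reported as hard-locked."""
    return sorted({int(m.group(1)) for m in LOCKUP_RE.finditer(dmesg_text)})


if __name__ == "__main__":
    sample = (
        "[   15.241164] kernel: watchdog: Watchdog detected hard LOCKUP on cpu 124\n"
        "[   15.241164] kernel:  ? watchdog_hardlockup_check+0x1cb/0x3b0\n"
    )
    print(hard_lockup_cpus(sample))
```

To scan a saved log, read the file into a string first, for example
`hard_lockup_cpus(open("/var/log/dmesg").read())`; an empty list means no
hard lockup report was found in that text.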

  
  Besides, the issue does not occur on upstream kernels 6.8, 6.9, 6.10, or
  6.11-rc*, so it is an Ubuntu kernel issue only. See attachment
  "dmesg0923-No-Lockup-Kernel 6-10.log".
  According to the dmesg log, the "hard lockup" is not a real lockup:
  many CPUs try to take the cache_disable_lock spinlock at the same time
  during kernel boot, so the lock is heavily contended.
  Every CPU's TLB is flushed inside the critical section, TLB flushing is a
  time-consuming operation, and there are so many CPUs that the kernel
  reports a false "hard lockup". To avoid confusing customers, when will
  Canonical provide a fix?

  
  HW Config:
  ThinkSystem SR950 V3

  CPU: 8*  Intel(R) Xeon(R) Platinum 8490H 60 Core 3.5GHz

  MEM:  2TB = SK Hynix 356GB DDR5 4800MHz 3DS (2015.1GB)

  Raid: ThinkSystem RAID 940-8i 4GB Flash PCIe Gen4 12Gb Adapter

  Storage: Micron_7450_MTFDKBA960TFR *1

  Samsung 30.7TB 24Gbps SAS 2.5" SSD

  NIC: ThinkSystem Intel X710-T4L 10GBASE-T 4-Port OCP Ethernet Adapter

  OS: Ubuntu 24.04 LTS (kernel 6.8.0-38-generic)

