Possible workaround (not a fix): blacklist the i2c_i801 module. It works
for me.

Since I noticed a high amount of CPU time spent in interrupt handling, I looked
at /proc/interrupts (right after the slow boot and slow login):
$ cat /proc/interrupts
            CPU0       CPU1       
   0:          9          0  IR-IO-APIC    2-edge      timer
   1:          0        249  IR-IO-APIC    1-edge      i8042
   8:          1          0  IR-IO-APIC    8-fasteoi   rtc0
   9:          0       1017  IR-IO-APIC    9-fasteoi   acpi
  14:          0        591  IR-IO-APIC   14-fasteoi   INT3453:00, INT3453:01, INT3453:03
  15:          0          0  IR-IO-APIC   15-fasteoi   INT3453:02
  20:  190734634          0  IR-IO-APIC   20-fasteoi   i801_smbus
  31:       8350          0  IR-IO-APIC   31-fasteoi   idma64.0, i2c_designware.0
  39:          0      84628  IR-IO-APIC   39-fasteoi   mmc0
 120:          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0  IR-PCI-MSI 311296-edge      PCIe PME
 123:          0          0  IR-PCI-MSI 315392-edge      PCIe PME
 124:          0          0  IR-PCI-MSI 317440-edge      PCIe PME
 125:          0          0  IR-PCI-MSI 294912-edge      ahci[0000:00:12.0]
 126:          0          3  IR-PCI-MSI 1048576-edge      rtsx_pci
 127:       4171          0  IR-PCI-MSI 344064-edge      xhci_hcd
 128:          0        296  INT3453:00   18  ELAN0503:00
 129:          0          0  IR-PCI-MSI 1050624-edge      enp2s0f1
 130:          0         44  IR-PCI-MSI 245760-edge      mei_me
 131:      18279          0  IR-PCI-MSI 1572864-edge      ath10k_pci
 132:          0        669  IR-PCI-MSI 229376-edge      snd_hda_intel:card0
 NMI:        690         49   Non-maskable interrupts
 LOC:     693366     704015   Local timer interrupts
 SPU:          0          0   Spurious interrupts
 PMI:        690         49   Performance monitoring interrupts
 IWI:      31340      91937   IRQ work interrupts
 RTR:          0          0   APIC ICR read retries
 RES:      23071      21772   Rescheduling interrupts
 CAL:      10091       3666   Function call interrupts
 TLB:       2750       4570   TLB shootdowns
 TRM:          0          0   Thermal event interrupts
 THR:          0          0   Threshold APIC interrupts
 DFR:          0          0   Deferred Error APIC interrupts
 MCE:          0          0   Machine check exceptions
 MCP:         10         11   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0   Posted-interrupt notification event
 NPI:          0          0   Nested posted-interrupt event
 PIW:          0          0   Posted-interrupt wakeup event

This led me to the module i801_smbus, which depends on the i2c_i801 module
(found this out using lsmod).
Following this similar issue
(https://bbs.archlinux.org/viewtopic.php?id=254885), I decided to give
blacklisting i2c_i801 a try.
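
For anyone who wants to repeat the check on their own machine, the busiest
IRQ in a /proc/interrupts listing can be picked out with a short script. This
is only a rough sketch: the function names irq_totals and busiest are made up
for illustration, and it assumes the usual layout (a header row naming the
CPUs, then one row per interrupt with one counter per CPU).

```python
from pathlib import Path


def irq_totals(text):
    """Map (irq label, description) to the count summed over all CPUs."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # Rows such as ERR/MIS carry fewer counters than there are CPUs,
        # so stop at the first field that is not a plain number.
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if not counts:
            continue
        description = " ".join(fields[len(counts):])
        totals[(label.strip(), description)] = sum(counts)
    return totals


def busiest(text, n=3):
    """Return the n rows with the highest total count, largest first."""
    totals = irq_totals(text)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    path = Path("/proc/interrupts")
    if path.exists():
        for (label, description), total in busiest(path.read_text()):
            print(f"{total:>12}  {label:>4}  {description}")
```

On the listing above, IRQ 20 (i801_smbus) would come out far ahead of every
other row, which is what pointed me at that driver.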

I added "module_blacklist=i2c_i801" to the kernel parameters (via the edit
action in the GRUB boot menu), and voilà, the problem was gone.
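
To confirm that the parameter actually took effect after booting, one can
compare /proc/cmdline with /proc/modules. The sketch below is an assumption
about how to check this, not something from the bug report; the helper names
are hypothetical. If module_blacklist= appears more than once, it simply
merges the lists, which may differ from what the kernel does.

```python
from pathlib import Path


def blacklisted_on_cmdline(cmdline):
    """Collect module names from module_blacklist= parameters."""
    names = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == "module_blacklist":
            # The value is a comma-separated list of module names.
            names.update(name for name in value.split(",") if name)
    return names


def loaded_modules(proc_modules):
    """Return the module names found in /proc/modules-style text."""
    return {line.split()[0] for line in proc_modules.splitlines() if line.strip()}


if __name__ == "__main__":
    module = "i2c_i801"
    cmdline, modules = Path("/proc/cmdline"), Path("/proc/modules")
    if cmdline.exists() and modules.exists():
        print("on command line: ", module in blacklisted_on_cmdline(cmdline.read_text()))
        print("currently loaded:", module in loaded_modules(modules.read_text()))
```

With the workaround active, I would expect the first line to report True and
the second False.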

Note: I do not fully understand the consequences of not having the
i2c_i801 and i801_smbus modules loaded.

A possibly better way (not via kernel parameter) to blacklist the module
permanently seems to be described here:
https://www.thegeekdiary.com/centos-rhel-how-to-disable-and-blacklist-linux-kernel-module-to-prevent-it-from-loading-automatically/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-5.11 in Ubuntu.
https://bugs.launchpad.net/bugs/1931001

Title:
  kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s!

Status in linux-hwe-5.11 package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu 20.04 LTS and Ubuntu 21.04 occasionally boot with very poor
  performance and are very unresponsive to user input on the Lenovo laptop
  Lenovo 300e 2nd Gen 81M9 (LENOVO_MT_81M9_BU_idea_FM_300e 2nd G).

  When this happens, messages like these appear in the journal:

  ---
  root@alumne-1-58:~# journalctl | grep "BUG: soft"
  may 20 21:44:35 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [swapper/3:0]
  may 20 21:44:35 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [swapper/3:0]
  may 22 09:33:34 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
  may 24 16:45:14 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [prometheus-node:4220]
  may 24 16:45:14 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
  jun 03 00:01:09 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
  jun 03 00:01:09 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
  jun 03 00:01:09 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
  jun 03 00:01:09 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
  jun 03 00:02:15 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [swapper/0:0]
  jun 05 08:22:58 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [irq/138-iwlwifi:1044]
  jun 05 08:25:06 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
  jun 05 08:25:06 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [irq/138-iwlwifi:1044]
  jun 05 08:26:42 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [lxd:3975]
  jun 05 08:26:42 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/2:0]
  jun 05 08:26:42 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [irq/138-iwlwifi:1044]
  jun 05 08:27:38 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [irq/138-iwlwifi:1044]
  jun 05 08:28:34 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [irq/138-iwlwifi:1044]
  jun 05 08:29:46 alumne-1-58 kernel: watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [irq/138-iwlwifi:1044]
  root@alumne-1-58:~#
  ---

  Usually everything works fine after a reboot, but it is very annoying
  when it happens.
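
  The soft lockup messages quoted above can be tallied per CPU and per
  task to see whether one of them dominates. This is a minimal sketch,
  assuming the message format shown in the journal excerpt; the pattern
  and the summarize helper are illustrative only, and the two sample
  lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches e.g. "soft lockup - CPU#3 stuck for 22s! [swapper/3:0]".
# \s+ tolerates messages that a mail client wrapped across two lines.
LOCKUP = re.compile(
    r"soft lockup - CPU#(\d+) stuck\s+for (\d+)s!\s+\[([^\]:]+):(\d+)\]"
)


def summarize(log_text):
    """Count soft lockup reports per CPU number and per task name."""
    per_cpu, per_task = Counter(), Counter()
    for match in LOCKUP.finditer(log_text):
        cpu, _seconds, task, _pid = match.groups()
        per_cpu[int(cpu)] += 1
        per_task[task] += 1
    return per_cpu, per_task


if __name__ == "__main__":
    sample = (
        "may 20 21:44:35 alumne-1-58 kernel: watchdog: BUG: soft lockup - "
        "CPU#3 stuck for 22s! [swapper/3:0]\n"
        "jun 05 08:22:58 alumne-1-58 kernel: watchdog: BUG: soft lockup - "
        "CPU#3 stuck for 22s! [irq/138-iwlwifi:1044]\n"
    )
    per_cpu, per_task = summarize(sample)
    print("per CPU: ", dict(per_cpu))
    print("per task:", dict(per_task))
```

  To run it over a real log, pass the text of the journal (for example
  the saved output of journalctl) to summarize instead of the sample.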
