Dan Horák <d...@danny.cz> writes:

> Hi,
>
> after updating to Fedora built 6.15-rc2 kernel from 6.14 I am getting a
> soft lockup early in the boot and NVME related timeout/crash later
> (could it be related?). I am first checking if this is a known issue
> as I have not started bisecting yet.
>
> [ 2.866399] Memory: 63016960K/67108864K available (25152K kernel code,
> 4416K rwdata, 24000K rodata, 9792K init, 1796K bss, 476160K reserved,
> 3356672K cma-reserved)
> [ 2.874121] devtmpfs: initialized
> [ 24.037685] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
> [ 24.037690] CPU#0 Utilization every 4s during lockup:
> [ 24.037692] #1: 101% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 24.037697] #2: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 24.037701] #3: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 24.037704] #4: 101% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 24.037707] #5: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 24.037711] Modules linked in:
> [ 24.037716] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
> 6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY
> [ 24.037722] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202
> opal:skiboot-bc106a0 PowerNV
> [ 24.037725] NIP: c00000000308a72c LR: c00000000308a7d0 CTR:
> c0000000018012c0
> [ 24.037729] REGS: c000200006637a50 TRAP: 0900 Not tainted
> (6.15.0-0.rc2.22.fc43.ppc64le)
> [ 24.037733] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR:
> 48000828 XER: 00000000
> [ 24.037750] CFAR: 0000000000000000 IRQMASK: 0
> [ 24.037750] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00
> 0000000000000040
> [ 24.037750] GPR04: c0002007ff390b00 0000000000010000 0000000000000000
> c0002007ff3a0b00
> [ 24.037750] GPR08: 00000000002007ff 000000000012d092 0000000000000000
> 0000000000000000
> [ 24.037750] GPR12: 0000000000000000 c000000003fb0000 c000000000011320
> 0000000000000000
> [ 24.037750] GPR16: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 24.037750] GPR20: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 24.037750] GPR24: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 24.037750] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8
> 0000000000037940
> [ 24.037806] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> [ 24.037815] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> [ 24.037820] Call Trace:
> [ 24.037822] [c000200006637cf0] [c00000000308a7d0]
> memory_dev_init+0x158/0x194 (unreliable)
> [ 24.037830] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> [ 24.037836] [c000200006637d90] [c00000000300f628]
> kernel_init_freeable+0x204/0x288
> [ 24.037843] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> [ 24.037849] [c000200006637e50] [c00000000000debc]
> ret_from_kernel_user_thread+0x14/0x1c
> [ 24.037855] --- interrupt: 0 at 0x0
> [ 24.037858] Code: 7c651b78 40820010 3fa20195 3bbd61e0 48000080 3c62ff89
> 389e00c8 3863e510 4bf7a625 60000000 39290001 7c284840 <41800088> 792aaac2
> 7c2a2840 4080ffec
> [ 48.045039] watchdog: BUG: soft lockup - CPU#0 stuck for 44s! [swapper/0:1]
> [ 48.045043] CPU#0 Utilization every 4s during lockup:
> [ 48.045045] #1: 101% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 48.045049] #2: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 48.045053] #3: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 48.045056] #4: 101% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 48.045059] #5: 100% system, 0% softirq, 0% hardirq,
> 0% idle
> [ 48.045063] Modules linked in:
> [ 48.045067] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G L
> ------ --- 6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY
> [ 48.045073] Tainted: [L]=SOFTLOCKUP
> [ 48.045075] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202
> opal:skiboot-bc106a0 PowerNV
> [ 48.045077] NIP: c00000000308a72c LR: c00000000308a7d0 CTR:
> c0000000018012c0
> [ 48.045081] REGS: c000200006637a50 TRAP: 0900 Tainted: G L
> ------ --- (6.15.0-0.rc2.22.fc43.ppc64le)
> [ 48.045085] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR:
> 48000828 XER: 00000000
> [ 48.045100] CFAR: 0000000000000000 IRQMASK: 0
> [ 48.045100] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00
> 0000000000000040
> [ 48.045100] GPR04: c0002007ff390b00 0000000000010000 0000000000000000
> c0002007ff3a0b00
> [ 48.045100] GPR08: 00000000002007ff 00000000000a65fd 0000000000000000
> 0000000000000000
> [ 48.045100] GPR12: 0000000000000000 c000000003fb0000 c000000000011320
> 0000000000000000
> [ 48.045100] GPR16: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 48.045100] GPR20: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 48.045100] GPR24: 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000
> [ 48.045100] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8
> 000000000007f880
> [ 48.045155] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> [ 48.045161] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> [ 48.045166] Call Trace:
> [ 48.045167] [c000200006637cf0] [c00000000308a7d0]
> memory_dev_init+0x158/0x194 (unreliable)
> [ 48.045175] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> [ 48.045181] [c000200006637d90] [c00000000300f628]
> kernel_init_freeable+0x204/0x288
> [ 48.045187] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> [ 48.045193] [c000200006637e50] [c00000000000debc]
> ret_from_kernel_user_thread+0x14/0x1c
> [ 48.045199] --- interrupt: 0 at 0x0

The above looks similar to
https://lore.kernel.org/all/20250410125110.1232329-1-gs...@redhat.com/

Maybe you can give this patch a try for the soft lockup above.

-ritesh