You have been subscribed to a public bug:

---Problem Description---
kdump fails to take dump with smt set to 2, hmc dumpstart

---Issue observed---
[ 0.004111] Oops: Exception in kernel mode, sig: 4 [#1]
[ 0.004118] SMP NR_CPUS=2048
[ 0.004120] NUMA
[ 0.004125] pSeries
[ 0.004132] Modules linked in:
[ 0.004142] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-12-generic #13-Ubuntu
[ 0.004153] task: c000000046715900 task.stack: c000000046134000
[ 0.004162] NIP: c000000000006468 LR: c00000000801764c CTR: 00000000006cdc70
[ 0.004173] REGS: c000000047fe3ce0 TRAP: 0700 Not tainted (4.13.0-12-generic)
[ 0.004181] MSR: 8000000000081031 <SF,ME,IR,DR,LE>
[ 0.004193] CR: 88042222 XER: 20000003
[ 0.004204] CFAR: c000000000006454 SOFTE: 0
[ 0.004204] GPR00: c00000000801764c c000000047fe3f60 c0000000095e3000 0000000000000000
[ 0.004204] GPR04: 0000000000000001 0000000000000002 ffffffffffffffff ffffffffffffffdf
[ 0.004204] GPR08: 0000000000000000 0000000028042222 0000000000000002 0000000000000002
[ 0.004204] GPR12: 0000000000000000 c00000000fff0000 c000000046137f90 000000000b5452d8
[ 0.004204] GPR16: fffffffffffffffd 00000000089ffd10 0000000001360000 000000000b55d378
[ 0.004204] GPR20: 0000000000000060 000000001eca0000 000000000a6c0000 0000000000000007
[ 0.004204] GPR24: 0000000000000000 0000000000000000 c000000009621ed0 0000000000000000
[ 0.004204] GPR28: 0000000000000000 c000000046134000 c000000046137c80 c000000009105df8
[ 0.004328] NIP [c000000000006468] 0xc000000000006468
[ 0.004338] LR [c00000000801764c] __do_irq+0x4c/0x1c0
[ 0.004345] Call Trace:
[ 0.004354] [c000000047fe3f60] [c00000000801764c] __do_irq+0x4c/0x1c0 (unreliable)
[ 0.004368] [c000000047fe3f90] [c00000000802ab70] call_do_irq+0x14/0x24
[ 0.004380] [c000000046137bc0] [c00000000801785c] do_IRQ+0x9c/0x130
[ 0.004393] [c000000046137c10] [c000000008008ac4] hardware_interrupt_common+0x114/0x120
[ 0.004409] --- interrupt: 501 at arch_local_irq_restore+0x5c/0x90
[ 0.004409] LR = arch_local_irq_restore+0x40/0x90
[ 0.004423] [c000000046137f00] [0000000000000005] 0x5 (unreliable)
[ 0.004436] [c000000046137f20] [c000000008049824] start_secondary+0x324/0x350
[ 0.004450] [c000000046137f90] [c00000000800aa6c] start_secondary_prolog+0x10/0x14
[ 0.004460] Instruction dump:
[ 0.004467] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 0.004484] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 0.004506] ---[ end trace 3e5a2a9047ef3cd0 ]---
[ 0.004512]
[ 0.004518] Oops: Exception in kernel mode, sig: 4 [#2]
[ 0.004525] SMP NR_CPUS=2048
[ 0.004526] NUMA
[ 0.004532] pSeries
[ 0.004540] Modules linked in:
[ 0.004550] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 4.13.0-12-generic #13-Ubuntu
[ 0.004561] task: c000000009579f00 task.stack: c0000000095dc000
[ 0.004569] NIP: c000000000006460 LR: c0000000080b6e80 CTR: 0000000000000000
[ 0.004580] REGS: c0000000095dfb20 TRAP: 0700 Tainted: G D (4.13.0-12-generic)
[ 0.004589] MSR: 8000000000081031 <SF,ME,IR,DR,LE>
[ 0.004599] CR: 22002228 XER: 20000004
[ 0.004611] CFAR: c00000000000493c SOFTE: 0
[ 0.004611] GPR00: 0000000000000000 c0000000095dfda0 c0000000095e3000 0000000000000000
[ 0.004611] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.004611] GPR08: 0000000000000000 0000000022002228 000000007fffffff 0000000000000008
[ 0.004611] GPR12: 000000000000ffff c00000000fff0a80 c000000c7e137f90 0000000009980600
[ 0.004611] GPR16: 000000001ec70000 0000000000000001 0000000000000000 0000000000000000
[ 0.004611] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 0.004611] GPR24: 0000000000000008 c000000008000000 0000000008000000 0000000000000000
[ 0.004611] GPR28: 0000000000000000 0000000000000008 c000000009621ed0 c000000009622354
[ 0.004729] NIP [c000000000006460] 0xc000000000006460
[ 0.004739] LR [c0000000080b6e80] pseries_lpar_idle+0x30/0x50
[ 0.004746] Call Trace:
[ 0.004756] [c0000000095dfda0] [c0000000095dfe90] init_thread_union+0x3e90/0x4000 (unreliable)
[ 0.004771] [c0000000095dfe00] [c00000000801e314] arch_cpu_idle+0x54/0x160
[ 0.004784] [c0000000095dfe30] [c000000008c6b92c] default_idle_call+0x4c/0x7c
[ 0.004798] [c0000000095dfe50] [c00000000815da14] do_idle+0x244/0x320
[ 0.004810] [c0000000095dfea0] [c00000000815dd28] cpu_startup_entry+0x38/0x50
[ 0.004823] [c0000000095dfed0] [c00000000800d2dc] rest_init+0xec/0x110
[ 0.004835] [c0000000095dff00] [c000000008fe40fc] start_kernel+0x584/0x5a4
[ 0.004848] [c0000000095dff90] [c00000000800ab7c] start_here_common+0x1c/0x520
[ 0.004857] Instruction dump:
[ 0.004864] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 0.004881] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 0.004899] ---[ end trace 3e5a2a9047ef3cd1 ]---
[ 0.004906]
[ 3.949808] Kernel panic - not syncing: Fatal exception in interrupt
[ 4.179808] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

When tried with maxcpus=1, the following is observed:

[ 3992.056997] Modules linked in: async_tx raid6_pq raid1 raid0 multipath linear ibmvscsi(+) crc32c_vpmsum
[ 3992.136992] CPU: 1 PID: 207 Comm: modprobe Not tainted 4.13.0-12-generic #13-Ubuntu
[ 3992.166991] task: c000000043719e00 task.stack: c0000000437c8000
[ 3992.206994] NIP: c0000000086d2530 LR: c0000000086d46f0 CTR: 0000000000000013
[ 3992.246996] REGS: c0000000437cb260 TRAP: 0901 Not tainted (4.13.0-12-generic)
[ 3992.276994] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
[ 3992.306995] CR: 24844442 XER: 20000000
[ 3992.366993] CFAR: c0000000086d2570 SOFTE: 1
[ 3992.366993] GPR00: ffffffffffffff68 c0000000437cb4e0 c0000000095e3000 c000000043c67e80
[ 3992.366993] GPR04: c000000043c67e80 c000000043c6bc00 ffffffffffffffed 39077b9925c55abe
[ 3992.366993] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000060
[ 3992.366993] GPR12: ffffffffffffff00 c00000000fac0a80
[ 3992.546994] NIP [c0000000086d2530] mpihelp_add_n+0x30/0x80
[ 3992.586990] LR [c0000000086d46f0] mpih_sqr_n+0x230/0x460
[ 3992.606991] Call Trace:
[ 3992.617082] [c0000000437cb4e0] [c0000000086d48c4] mpih_sqr_n+0x404/0x460 (unreliable)
[ 3992.636996] [c0000000437cb560] [c0000000086d4844] mpih_sqr_n+0x384/0x460
[ 3992.676996] [c0000000437cb5e0] [c0000000086d5778] mpi_powm+0x678/0xe50
[ 3992.716992] [c0000000437cb720] [c000000008619d40] _rsa_dec.isra.1+0x80/0xc0
[ 3992.746996] [c0000000437cb760] [c00000000861a094] rsa_verify+0x94/0x140
[ 3992.786994] [c0000000437cb7c0] [c00000000861af44] pkcs1pad_verify+0xd4/0x160
[ 3992.856995] [c0000000437cb800] [c000000008631510] public_key_verify_signature+0x240/0x4b0
[ 3992.896992] [c0000000437cb9a0] [c0000000086311d4] verify_signature+0x64/0x90
[ 3992.926997] [c0000000437cb9c0] [c000000008634690] pkcs7_validate_trust+0x190/0x2c0
[ 3992.976992] [c0000000437cba20] [c0000000082b2e30] verify_pkcs7_signature+0xc0/0x1f0
[ 3993.036993] [c0000000437cbad0] [c0000000081c8414] mod_verify_sig+0x94/0x100
[ 3993.076996] [c0000000437cbb40] [c0000000081c5054] load_module+0x264/0x1fc0
[ 3993.116992] [c0000000437cbd30] [c0000000081c70b4] SyS_finit_module+0xc4/0x130
[ 3993.176992] [c0000000437cbe30] [c00000000800b184] system_call+0x58/0x6c
[ 3993.226990] Instruction dump:
[ 3993.237018] 39400000 7cc600d0 7cc607b4 7cc930f8 78c01f24 79290020 7c0c0378 39290001
[ 3993.336994] 7d2903a6 60000000 60000000 60420000 <7d6c0050> 38c60001 7cc607b4 7d25582a
[ 4028.156997] xor: measuring software checksum speed
[ 4029.376998] 8regs : 16.000 MB/sec
[ 4030.676992] 8regs_prefetch: 16.000 MB/sec
[ 4031.716993] 32regs : 16.000 MB/sec
[ 4032.886994] 32regs_prefetch: 16.000 MB/sec
[ 4034.256993] altivec : 16.000 MB/sec
[ 4034.316994] xor: using function: altivec (16.000 MB/sec)
[ 4076.016995] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [modprobe:207]
[ 4076.046994] Modules linked in: xor async_tx raid6_pq raid1 raid0 multipath linear ibmvscsi(+) crc32c_vpmsum
[ 4076.126994] CPU: 1 PID: 207 Comm: modprobe Tainted: G L 4.13.0-12-generic #13-Ubuntu
[ 4076.186995] task: c000000043719e00 task.stack: c0000000437c8000
[ 4076.226993] NIP: c0000000086d224c LR: c0000000086d4404 CTR: 0000000000000008
[ 4076.256991] REGS: c0000000437cb190 TRAP: 0901 Tainted: G L (4.13.0-12-generic)
[ 4076.286994] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>
[ 4076.326993] CR: 24884444 XER: 20000000
[ 4076.356998] CFAR: c0000000086d4400 SOFTE: 1
[ 4076.356998] GPR00: 5ebfd337ad53c297 c0000000437cb410 c0000000095e3000 c000000043c62910
[ 4076.356998] GPR04: c000000043c62800 fffffffffffffff8 00000000c68de1f2 0000000000000000
[ 4076.356998] GPR08: 761ab85da0153bf8 0000000000000008 0000000063cfb2b3 026231001e934591
[ 4076.356998] GPR12: 0000000000000038 c00000000fac0a80
[ 4076.556992] NIP [c0000000086d224c] mpihelp_addmul_1+0x4c/0xf0
[ 4076.596990] LR [c0000000086d4404] mpih_sqr_n_basecase+0xd4/0x190
[ 4076.607012] Call Trace:
[ 4076.636994] [c0000000437cb410] [0000000000000901] 0x901 (unreliable)
[ 4076.676992] [c0000000437cb460] [c0000000086d4644] mpih_sqr_n+0x184/0x460
[ 4076.736992] [c0000000437cb4e0] [c0000000086d4890] mpih_sqr_n+0x3d0/0x460
[ 4076.756995] [c0000000437cb560] [c0000000086d4844] mpih_sqr_n+0x384/0x460
[ 4076.816995] [c0000000437cb5e0] [c0000000086d5778] mpi_powm+0x678/0xe50
[ 4076.846996] [c0000000437cb720] [c000000008619d40] _rsa_dec.isra.1+0x80/0xc0
[ 4076.896992] [c0000000437cb760] [c00000000861a094] rsa_verify+0x94/0x140
[ 4076.946997] [c0000000437cb7c0] [c00000000861af44] pkcs1pad_verify+0xd4/0x160
[ 4076.976996] [c0000000437cb800] [c000000008631510] public_key_verify_signature+0x240/0x4b0
[ 4077.016993] [c0000000437cb9a0] [c0000000086311d4] verify_signature+0x64/0x90
[ 4077.046995] [c0000000437cb9c0] [c000000008634690] pkcs7_validate_trust+0x190/0x2c0
[ 4077.086997] [c0000000437cba20] [c0000000082b2e30] verify_pkcs7_signature+0xc0/0x1f0
[ 4077.136995] [c0000000437cbad0] [c0000000081c8414] mod_verify_sig+0x94/0x100
[ 4077.196996] [c0000000437cbb40] [c0000000081c5054] load_module+0x264/0x1fc0
[ 4077.236996] [c0000000437cbd30] [c0000000081c70b4] SyS_finit_module+0xc4/0x130
[ 4077.286997] [c0000000437cbe30] [c00000000800b184] system_call+0x58/0x6c
[ 4077.337015] Instruction dump:
[ 4077.366995] 7ca507b4 78c60020 7ca928f8 78bf1f24 79290020 7ffdfb78 39290001 38e00000
[ 4077.426994] 7d2903a6 7b9c83e4 60000000 60000000 <60420000> 7d9df850 38a50001 7ca507b4

Contact Information = hasri...@in.ibm.com

---uname output---
Linux ltcalpine-lp9 4.13.0-12-generic #13-Ubuntu SMP Fri Sep 22 20:52:52 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Machine Type/Model = Power 8 pVM/8408-E8E

---Additional Info---
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.13.0-12-generic root=UUID=861097e8-43d3-4335-83d3-6db421e20564 ro crashkernel=2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M

---Steps to Reproduce---
1. Install linux-crashdump and the debug kernel.
2. Edit the crashkernel cmdline in kdump-tools.cfg to the value above.
3. update-grub
4. Reboot once.
5. Make sure kdump is enabled.
6. ppc64_cpu --smt=2
7. Log in to the HMC and trigger dumpstart:
   chsysstate -r lpar -m <Server-name> -n <lpar-name> -o dumprestart

A soft lockup is observed when maxcpus=1 is used in kdump instead of nr_cpus=1.
The dump is not taken and the kernel boot stops. The full log is attached.

Expected: the dump is taken and the system boots back to the host kernel.

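For readers unfamiliar with the ranged crashkernel= syntax on the command line above, the following is a minimal Python sketch of how such a spec maps installed RAM to a reservation size. It assumes the range semantics described in the kernel's kdump documentation (start inclusive, end exclusive, open-ended last range). The helper names parse_size and crashkernel_reservation are hypothetical, and the 16 GiB figure is only an illustration; the report does not state the LPAR's memory size.

```python
import re

# Binary size suffixes accepted in this sketch (an assumption; the kernel's
# own parser accepts further suffixes).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a string such as '512M' or '4G' into a byte count."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def crashkernel_reservation(spec, ram_bytes):
    """Return the reservation (bytes) a ranged crashkernel= spec selects.

    Each comma-separated entry has the form 'start-[end]:size'. The first
    entry with start <= ram_bytes < end wins; 0 means nothing is reserved.
    """
    for entry in spec.split(","):
        range_part, size_part = entry.split(":")
        start_text, _, end_text = range_part.partition("-")
        start = parse_size(start_text)
        end = parse_size(end_text) if end_text else None  # open-ended range
        if ram_bytes >= start and (end is None or ram_bytes < end):
            return parse_size(size_part)
    return 0


# The spec from /proc/cmdline in this report.
SPEC = "2G-4G:320M,4G-32G:512M,32G-64G:1024M,64G-128G:2048M,128G-:4096M"

if __name__ == "__main__":
    # Hypothetical 16 GiB system, chosen only for illustration.
    reserved = crashkernel_reservation(SPEC, 16 << 30)
    print(f"reserved: {reserved >> 20} MiB")
```

The sketch only illustrates how the reservation size is selected; it says nothing about the failure itself, which occurs while the capture kernel brings up CPUs.
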
== Comment: #4 - Hari Krishna Bathini <hbath...@in.ibm.com> - 2018-06-11 06:22:57 ==
The below upstream patches should resolve this issue:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=04b9c96eae72
("powerpc/crash: Remove the test for cpu_online in the IPI callback")

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4388c9b3a6ee
("powerpc: Do not send system reset request through the oops path")

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4552d128c26e
("powerpc: System reset avoid interleaving oops using die synchronisation")

Thanks
Hari

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New

** Tags: architecture-ppc64le bugnameltc-159691 severity-high targetmilestone-inin---

-- 
kdump fails to take dump with smt set to 2, hmc dumpstart
https://bugs.launchpad.net/bugs/1776211
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp