** Package changed: ubuntu => linux (Ubuntu)

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1693566

Title:
Ubuntu 16.04.03: "NMI watchdog: BUG: soft lockup" occurs while running stress-ng on PowerNV machine.

Status in linux package in Ubuntu:
New

Bug description:

== Comment: #0 - PAVITHRA R. PRAKASH <pavra...@in.ibm.com> - 2017-05-17 05:55:38 ==

---Problem description---
Ubuntu 16.04.03: "NMI watchdog: BUG: soft lockup" occurs while running stress-ng on a PowerNV machine.

---Steps to recreate---
1. Install Ubuntu 16.04.03.
2. Run "stress-ng -a 0".

Logs:
====
[ 2660.437087] INFO: rcu_sched self-detected stall on CPU
[ 2660.437111] 22-...: (5247 ticks this GP) idle=e19/140000000000001/0 softirq=905/905 fqs=2380
[ 2660.437114] (t=5251 jiffies g=95606 c=95605 q=2545946)
[ 2660.437750] 24-...: (5250 ticks this GP) idle=0b7/140000000000001/0 softirq=5805/5805 fqs=2380
[ 2660.437859]
[ 2664.172796] NMI watchdog: BUG: soft lockup - CPU#22 stuck for 23s! [stress-ng-mmap:3509]
[ 2664.172808] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 23s! [stress-ng-mrema:3536]
[ 2674.848037] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 33s! [stress-ng-fork:3381]
[ 2676.172894] NMI watchdog: BUG: soft lockup - CPU#30 stuck for 22s! [kswapd0:992]
[ 2680.336844] NMI watchdog: BUG: soft lockup - CPU#98 stuck for 23s! [stress-ng-clock:5099]
[ 2686.140931] NMI watchdog: BUG: soft lockup - CPU#16 stuck for 39s! [stress-ng-clone:3366]
[ 2686.987192] xhci_hcd 0003:09:00.0: HC died; cleaning up
[ 2686.987212] usb 1-3-port3: cannot reset (err = -108)

After a few hours the machine becomes completely unresponsive:

[pavithra@localhost ~]$ ping 9.47.69.255
PING 9.47.69.255 (9.47.69.255) 56(84) bytes of data.
^C
--- 9.47.69.255 ping statistics ---
12 packets transmitted, 0 received, 100% packet loss, time 11000ms

Thanks,
Pavithra

== Comment: #6 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-05-25 03:32:26 ==

ubuntu@ltc-firep2:~$ hostname -i
9.47.69.255
ubuntu@ltc-firep2:~$ uname -a
Linux ltc-firep2 4.10.0-21-generic #23~16.04.1-Ubuntu SMP Tue May 2 12:54:57 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
ubuntu@ltc-firep2:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
ubuntu@ltc-firep2:~$ tail /proc/cpuinfo
processor : 159
cpu : POWER8 (raw), altivec supported
clock : 2061.000000MHz
revision : 2.0 (pvr 004d 0200)
timebase : 512000000
platform : PowerNV
model : 8335-GTA
machine : PowerNV 8335-GTA
firmware : OPAL
ubuntu@ltc-firep2:~$

== Comment: #11 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-05-25 06:27:12 ==

System Memory stats
==============
ubuntu@ltc-firep2:~$ numactl -H
available: 2 nodes (0,8)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 0 size: 61321 MB
node 0 free: 60297 MB
node 8 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
node 8 size: 65303 MB
node 8 free: 64923 MB
node distances:
node   0   8
  0:  10  40
  8:  40  10
ubuntu@ltc-firep2:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:           123G        534M        122G         20M        868M        121G
Swap:           37G          0B         37G
ubuntu@ltc-firep2:~$ sudo sysctl vm | grep free
vm.min_free_kbytes = 360448
ubuntu@ltc-firep2:~$

The host has 123 GB of memory spread across two NUMA nodes.
Swap is configured at 37 GB and vm.min_free_kbytes is set to 360448 (about 360 MB).

== Comment: #13 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-05-25 07:25:35 ==

[ 280.494345] NMI watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [stress-ng-mmap:4172]
[ 280.495250] CPU: 5 PID: 4172 Comm: stress-ng-mmap Not tainted 4.10.0-21-generic #23~16.04.1-Ubuntu
[ 280.495262] task: c000000fe318c600 task.stack: c000000fc0d7c000
[ 280.495271] NIP: c0000000001a3248 LR: c0000000001a3204 CTR: c0000000000871f0
[ 280.495285] REGS: c000000fc0d7f7d0 TRAP: 0901 Not tainted (4.10.0-21-generic)
[ 280.495299] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 280.495408] CR: 44424444 XER: 20000000
[ 280.495416] CFAR: c0000000001a3250 SOFTE: 1
[ 280.495624] NIP [c0000000001a3248] smp_call_function_many+0x358/0x3f0
[ 280.495636] LR [c0000000001a3204] smp_call_function_many+0x314/0x3f0
[ 280.495645] Call Trace:
[ 280.495660] [c000000fc0d7fa50] [c0000000001a31e4] smp_call_function_many+0x2f4/0x3f0 (unreliable)
[ 280.495697] [c000000fc0d7fac0] [c0000000001a3430] kick_all_cpus_sync+0x40/0x50
[ 280.495726] [c000000fc0d7fae0] [c000000000069728] hash__pmdp_huge_get_and_clear+0xa8/0xf0
[ 280.495742] [c000000fc0d7fb10] [c00000000032b600] change_huge_pmd+0x210/0x2d0
[ 280.495762] [c000000fc0d7fb80] [c0000000002df638] change_protection_range+0xb38/0xe60
[ 280.495789] [c000000fc0d7fcc0] [c00000000030994c] change_prot_numa+0x3c/0xc0
[ 280.495815] [c000000fc0d7fcf0] [c00000000012e854] task_numa_work+0x2d4/0x3f0
[ 280.495844] [c000000fc0d7fdb0] [c00000000010f330] task_work_run+0x140/0x1a0
[ 280.495868] [c000000fc0d7fe00] [c00000000001db04] do_notify_resume+0xe4/0xf0
[ 280.495885] [c000000fc0d7fe30] [c00000000000b744] ret_from_except_lite+0x70/0x74
[ 280.495909] Instruction dump:
[ 280.495925] 3d020003 78691f24 39480fe0 7d2a482a e95d0000 7d4a4a14 812a0018 792707e1
[ 280.496022] 4182001c 60420000 7c210b78 7c421378 <812a0018> 792807e1 4082fff0 7c2004ac

[ 636.509312] NMI watchdog: BUG: soft lockup - CPU#29 stuck for 22s! [stress-ng-mrema:4205]
[ 636.510076] CPU: 29 PID: 4205 Comm: stress-ng-mrema Tainted: G L 4.10.0-21-generic #23~16.04.1-Ubuntu
[ 636.510090] task: c000000fdef86e00 task.stack: c000000fdd074000
[ 636.510104] NIP: c0000000001a3244 LR: c0000000001a3204 CTR: c0000000000871f0
[ 636.510136] REGS: c000000fdd077760 TRAP: 0901 Tainted: G L (4.10.0-21-generic)
[ 636.510146] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 636.510302] CR: 44484824 XER: 20000000
[ 636.510319] CFAR: c0000000001a3250 SOFTE: 1
[ 636.510620] NIP [c0000000001a3244] smp_call_function_many+0x354/0x3f0
[ 636.510647] LR [c0000000001a3204] smp_call_function_many+0x314/0x3f0
[ 636.510658] Call Trace:
[ 636.510676] [c000000fdd0779e0] [c0000000001a31e4] smp_call_function_many+0x2f4/0x3f0 (unreliable)
[ 636.510759] [c000000fdd077a50] [c0000000001a3430] kick_all_cpus_sync+0x40/0x50
[ 636.510791] [c000000fdd077a70] [c00000000006f350] pmdp_invalidate+0x80/0xc0
[ 636.510820] [c000000fdd077aa0] [c000000000327d7c] __split_huge_pmd_locked+0x5bc/0xaa0
[ 636.510842] [c000000fdd077b60] [c00000000032b834] __split_huge_pmd+0x174/0x280
[ 636.510876] [c000000fdd077bc0] [c00000000032bc04] vma_adjust_trans_huge+0x134/0x1a0
[ 636.510909] [c000000fdd077c10] [c0000000002da1e4] __vma_adjust+0x114/0x8e0
[ 636.510932] [c000000fdd077cf0] [c0000000002dac2c] __split_vma.isra.5+0x27c/0x2a0
[ 636.510969] [c000000fdd077d40] [c0000000002dbb34] do_munmap+0x134/0x480
[ 636.510991] [c000000fdd077db0] [c0000000002e1550] SyS_mremap+0x1f0/0x550
[ 636.511029] [c000000fdd077e30] [c00000000000b184] system_call+0x38/0xe0
[ 636.511048] Instruction dump:
[ 636.511065] 409dfda4 3d020003 78691f24 39480fe0 7d2a482a e95d0000 7d4a4a14 812a0018
[ 636.511184] 792707e1 4182001c 60420000 7c210b78 <7c421378> 812a0018 792807e1 4082fff0

Even after increasing vm.min_free_kbytes to 2 GB, the soft lockups and the hang are still seen when running stress-ng.
This appears to be a kernel issue.
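
For anyone retrying this, the vm.min_free_kbytes change and a count of the lockup reports can be sketched as below. This is only an illustration: the helper names gib_to_kb and count_soft_lockups are hypothetical, the exact value used in the comment above is not recorded (2 GB is assumed here to mean 2 GiB, i.e. 2097152 kB), and the sysctl command is printed rather than executed because it needs root.

```shell
#!/bin/sh
# Hypothetical helper: convert a size in GiB to kB, the unit that
# vm.min_free_kbytes expects.
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: count "soft lockup" reports in a saved kernel log
# (for example, the output of dmesg redirected to a file).
count_soft_lockups() {
    grep -c 'NMI watchdog: BUG: soft lockup' "$1"
}

# Print the command that would raise the reserve to 2 GiB.
# It is only echoed here, not run.
echo "sudo sysctl -w vm.min_free_kbytes=$(gib_to_kb 2)"
```

With the kernel log saved to a file, count_soft_lockups gives a quick way to compare how many lockups are reported before and after changing the setting.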

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1693566/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp