*** This bug is a duplicate of bug 1927076 *** https://bugs.launchpad.net/bugs/1927076
** This bug is no longer a duplicate of bug 1909286
   ubuntu_kernel_selftest will be interrupted with the reuseport_bpf_cpu / reuseport_bpf_numa test in net (BUG: Unable to handle kernel instruction fetch (NULL pointer?))
** This bug has been marked a duplicate of bug 1927076
   IPv6 TCP in reuseport_bpf_cpu from ubuntu_kernel_selftests/net crash P8 node entei on 5.8 kernel (Oops: Exception in kernel mode, sig: 4 [#1])

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1867155

Title:
  P8 node modoc will reboot automatically when running the sru_misc test suite

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Confirmed

Bug description:
Tested with 5 attempts; 4 of them hung around the following test in the ubuntu_kernel_selftests net sub-category:
# selftests: net: reuseport_bpf_cpu

First attempt:
23:21:32 DEBUG| [stdout] ok 2 selftests: net: reuseport_bpf_cpu
23:21:32 DEBUG| [stdout] # selftests: net: reuseport_bpf_numa
23:21:32 DEBUG| [stdout] # ---- IPv4 UDP ----
(hang here)

Second attempt:
10:17:35 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
10:17:35 DEBUG| [stdout] # ---- IPv4 UDP ----
10:17:35 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
10:17:35 DEBUG| [stdout] # send cpu 159, receive socket 159
10:17:35 DEBUG| [stdout] # ---- IPv6 TCP ----
(hang here)

Third attempt failed because of test timeout:
12:46:16 DEBUG| [stdout] # [FAIL]
12:46:16 DEBUG| [stdout] # --------------------
12:46:16 DEBUG| [stdout] # running psock_tpacket test
12:46:16 DEBUG| [stdout] # --------------------
13:14:13 INFO | Timer expired (1800 sec.), nuking pid 161853

Fourth attempt:
07:41:51 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
07:41:51 DEBUG| [stdout] # ---- IPv4 UDP ----
07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
07:41:51 DEBUG| [stdout] # ---- IPv6 UDP ----
07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
07:41:51 DEBUG| [stdout] # send cpu 1, receive socket 1
(lines skipped)
07:41:51 DEBUG| [stdout] # send cpu 157, receive socket 157
07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
07:41:51 DEBUG| [stdout] # ---- IPv4 TCP ----
(test hang here)

Fifth attempt:
04:29:17 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
04:29:17 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
04:29:17 DEBUG| [stdout] # ---- IPv4 UDP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
04:29:17 DEBUG| [stdout] # ---- IPv6 UDP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
04:29:17 DEBUG| [stdout] # ---- IPv4 TCP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 15, receive socket 15
(test hang here)

I tried running the tests in this sru-misc suite one by one on this node, in the following order, but I can't reproduce the issue that way:
'hwclock',
'ubuntu_bpf',
'ubuntu_bpf_jit',
'ubuntu_kernel_selftests',
'ubuntu_lxc',
'ubuntu_seccomp',
'ubuntu_unionmount_ovlfs',
'ubuntu_cts_kernel',
'ubuntu_kvm_unit_tests',

I tried to watch dmesg when this happens, but nothing is logged there; the system just reboots silently.

This is what you can see from syslog after the reboot:
Mar 12 04:27:39 modoc kernel: [ 536.668305] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.684547] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.700907] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.717246] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.719288] page:c00c000000c4f000 refcount:1 mapcount:0 mapping:c000000f8cfe0fd1 index:0x7611c3e
Mar 12 04:27:39 modoc kernel: [ 536.719289] anon
Mar 12 04:27:39 modoc kernel: [ 536.719291] flags: 0x3ffff800080024(uptodate|active|swapbacked)
Mar 12 04:27:39 modoc kernel: [ 536.719294] raw: 003ffff800080024 5deadbeef0000100 5deadbeef0000122 c000000f8cfe0fd1
Mar 12 04:27:39 modoc kernel: [ 536.719295] raw: 0000000007611c3e 0000000000000000 00000001ffffffff c000000fcfd1c000
Mar 12 04:27:39 modoc kernel: [ 536.719296] page dumped because: unmovable page
Mar 12 04:27:39 modoc kernel: [ 536.719296] page->mem_cgroup:c000000fcfd1c000
Mar 12 04:27:39 modoc kernel: [ 536.735465] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.751848] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.768210] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.784450] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.800756] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.817006] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.833133] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.849205] Injecting error (-12) to MEM_GOING_OFFLINE
Mar 12 04:27:39 modoc kernel: [ 536.865448] Injecting error (-12) to MEM_GOING_OFFLINE
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Mar 12 04:35:41 modoc systemd[1]: Starting Flush Journal to Persistent Storage... Mar 12 04:35:41 modoc kernel: [ 0.000000] hash-mmu: Page sizes from device-tree: Mar 12 04:35:41 modoc kernel: [ 0.000000] hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0 Mar 12 04:35:41 modoc kernel: [ 0.000000] hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7 Mar 12 04:35:41 modoc kernel: [ 0.000000] hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56 Mar 12 04:35:41 modoc systemd[1]: Started udev Kernel Device Manager. From the log above, line "^@^@^@^@^@^" indicates the reboot. It looks like it's running the memory-hotplug test. Maybe we need to use IPMI to see if there is anything on the console. 
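
As a rough aid for spotting the same pattern in other syslog files, the reboot marker can be searched for mechanically. The following is only a minimal sketch, under the assumption that the "^@" characters shown above are the usual caret-notation rendering of NUL bytes left in the file by the unclean shutdown. The function name `find_unclean_reboots` and the `min_run` threshold are hypothetical and not part of any existing tool.

```python
import re

# Syslog lines in the excerpt start with a timestamp such as "Mar 12 04:27:39".
TIMESTAMP = re.compile(r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}")


def find_unclean_reboots(log_text, min_run=16):
    """Find runs of NUL bytes in syslog text.

    Returns one tuple per run: (last timestamp before the run,
    first timestamp after the run, number of NUL bytes).
    min_run is an arbitrary threshold chosen for this sketch.
    """
    gaps = []
    for match in re.finditer(r"\x00{%d,}" % min_run, log_text):
        # Last timestamp logged before the NUL run (None if there is none).
        before = TIMESTAMP.findall(log_text[:match.start()])
        # First timestamp logged after the NUL run (None if there is none).
        after = TIMESTAMP.search(log_text, match.end())
        gaps.append((
            before[-1] if before else None,
            after.group(0) if after else None,
            match.end() - match.start(),
        ))
    return gaps
```

It could be used on a saved copy of the log, for example `find_unclean_reboots(open("syslog", errors="replace").read())`, where the file name is a placeholder. For the excerpt above it would pair the last 04:27:39 line with the first 04:35:41 line, which bounds the reboot window but says nothing about its cause.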
ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.3.0-42-generic 5.3.0-42.34
ProcVersionSignature: Ubuntu 5.3.0-42.34-generic 5.3.18
Uname: Linux 5.3.0-42-generic ppc64le
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Mar 12 04:33 seq
 crw-rw---- 1 root audio 116, 33 Mar 12 04:33 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.11-0ubuntu8.5
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Mar 12 09:42:24 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PciMultimedia:
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: root=UUID=b2a867ce-7813-4785-8861-4e7de2ac39b4 ro console=hvc0
ProcLoadAvg: 0.07 0.02 0.00 1/1461 86637
ProcLocks:
 1: POSIX ADVISORY WRITE 3799 00:18:841 0 EOF
 2: POSIX ADVISORY WRITE 3526 00:18:743 0 EOF
 3: FLOCK ADVISORY WRITE 3720 00:18:837 0 EOF
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -2
ProcVersion: Linux version 5.3.0-42-generic (buildd@bos02-ppc64el-006) (gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2)) #34-Ubuntu SMP Fri Feb 28 05:49:17 UTC 2020
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-42-generic N/A
 linux-backports-modules-5.3.0-42-generic N/A
 linux-firmware 1.183.4
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq:
 min: 3.694 GHz (cpu 159)
 max: 3.695 GHz (cpu 1)
 avg: 3.694 GHz
cpu_runmode: Could not retrieve current diagnostics mode, No kernel interface to firmware
cpu_smt: SMT=8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1867155/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp