** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New

** Changed in: ubuntu-power-systems
       Status: New => Triaged

** Changed in: ubuntu-power-systems
   Importance: Undecided => Critical

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1761729

Title:
  Ubuntu 18.04 Machine crashed while running ltp.

Status in The Ubuntu-power-systems project:
  Triaged
Status in linux package in Ubuntu:
  New

Bug description:
  ---Problem Description---
  Ubuntu 18.04 [ Briggs P8 ]: Machine crashed while running ltp.

  ---Environment--
  Kernel Build: Ubuntu 18.04
  System Name : ltc-briggs2
  Model/Type : P8
  Platform : BML

  ---Uname output---
  root@ltc-briggs2:~# uname -a
  Linux ltc-briggs2 4.15.0-13-generic #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

  ---Steps to reproduce--
  $ git clone https://github.com/linux-test-project/ltp.git
  $ cd ltp
  $ make autotools
  $ ./configure
  $ make
  $ make install ltp
  =====
  root@ltc-briggs2:~#
  root@ltc-briggs2:~#
  [10781.098337] LTP: starting fs_inod01 (fs_inod $TMPDIR 10 10 10)
  [10782.837910] LTP: starting linker01 (linktest.sh 1000 1000)
  [10784.504474] LTP: starting openfile01 (openfile -f10 -t10)
  [10784.534953] LTP: starting inode01
  [10784.550767] LTP: starting inode02
  [10784.739104] LTP: starting stream01
  [10784.740840] LTP: starting stream02
  [10784.742487] LTP: starting stream03
  [10784.744532] LTP: starting stream04
  [10784.746087] LTP: starting stream05
  [10784.747722] LTP: starting ftest01
  [10785.142054] LTP: starting ftest02
  [10785.158852] LTP: starting ftest03
  [10785.404760] LTP: starting ftest04
  [10785.527197] LTP: starting ftest05
  [10785.937164] LTP: starting ftest06
  [10785.958360] LTP: starting ftest07
  [10786.463382] LTP: starting ftest08
  [10786.592998] LTP: starting lftest01 (lftest 100)
  [10786.672707] LTP: starting writetest01 (writetest)
  [10786.774292] LTP: starting fs_di (fs_di -d $TMPDIR)
  [10792.973510] LTP: starting proc01 (proc01 -m 128)
  [10793.865686] ICMPv6: process `proc01' is using deprecated sysctl (syscall) net.ipv6.neigh.default.base_reachable_time - use net.ipv6.neigh.default.base_reachable_time_ms instead
  [10795.785593] LTP: starting read_all_dev (read_all -d /dev -e '/dev/watchdog?(0)' -q -r 10)
  [10795.895774] NET: Registered protocol family 40
  [10795.918763] Bluetooth: Core ver 2.22
  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!
  [10798.375041] Oops: Exception in kernel mode, sig: 5 [#1]
  [10798.375080] LE SMP NR_CPUS=2048 [10871.343999650,5] OPAL: Switch to big-endian OS
  NUMA PowerNV
  [10798.375117] [10876.190849323,5] OPAL: Switch to little-endian OS
  Modules linked in: hci_vhci bluetooth ecdh_generic vhost_vsock cuse vmw_vsock_virtio_transport_common userio vsock uhid vhost_net vhost tap snd_seq snd_seq_device snd_timer snd soundcore binfmt_misc sctp quota_v2 quota_tree nls_iso8859_1 ntfs xfs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm idt_89hpesx vmx_crypto ofpart cmdlinepart ipmi_powernv powernv_flash ipmi_devintf mtd ipmi_msghandler ibmpowernv opal_prd at24 powernv_rng joydev input_leds mac_hid uio_pdrv_genirq uio sch_fq_codel nfsd ib_iser rdma_cm auth_rpcgss iw_cm nfs_acl lockd ib_cm grace iscsi_tcp
  [10798.375636] libiscsi_tcp libiscsi sunrpc scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ses enclosure scsi_transport_sas hid_generic usbhid hid ib_core qla2xxx ast i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvme_fc crct10dif_vpmsum nvme_fabrics ahci mlxfw crc32c_vpmsum i40e drm devlink scsi_transport_fc megaraid_sas libahci
  [10798.375961] CPU: 87 PID: 4085 Comm: read_all Not tainted 4.15.0-13-generic #14-Ubuntu
  [10798.376013] NIP: c0000000003c76f0 LR: c0000000003c76ec CTR: 00000000300378e8
  [10798.376068] REGS: c0000076c63aba00 TRAP: 0700 Not tainted (4.15.0-13-generic)
  [10798.376120] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28002222 XER: 20000000
  [10798.376176] CFAR: c00000000018cce4 SOFTE: 1
  [10798.376176] GPR00: c0000000003c76ec c0000076c63abc80 c0000000016eaf00 0000000000000064
  [10798.376176] GPR04: c000007ffc1cce18 c000007ffc1e4368 9000000000009033 000000000000040f
  [10798.376176] GPR08: 0000000000000007 c0000000011c3a74 0000007ffb010000 9000000000001003
  [10798.376176] GPR12: 0000000000002200 c000000007a8bd00 0000000000000000 0000000000000000
  [10798.376176] GPR16: 0000000000000000 0000000000000000 0000000000000006 00007ffff7a0a018
  [10798.376176] GPR20: 000008bb551c8908 000008bb551c88f8 000008bb551c88c8 c0000076c63abe00
  [10798.376176] GPR24: 0000000000010000 0000000000000000 00007ffff7a0a018 c0000076c63abe00
  [10798.376176] GPR28: c0000000000003ff 0000000000000001 00000000000003ff c000000000000000
  [10798.376619] NIP [c0000000003c76f0] __check_object_size+0x140/0x270
  [10798.376662] LR [c0000000003c76ec] __check_object_size+0x13c/0x270
  [10798.376706] Call Trace:
  [10798.376724] [c0000076c63abc80] [c0000000003c76ec] __check_object_size+0x13c/0x270 (unreliable)
  [10798.376787] [c0000076c63abd00] [c0000000008268a4] read_mem+0x84/0x220
  [10798.376835] [c0000076c63abd70] [c0000000003d109c] __vfs_read+0x3c/0x70
  [10798.376880] [c0000076c63abd90] [c0000000003d118c] vfs_read+0xbc/0x1b0
  [10798.376925] [c0000076c63abde0] [c0000000003d1788] SyS_read+0x68/0x110
  [10798.377012] [c0000076c63abe30] [c00000000000b184] system_call+0x58/0x6c
  [10798.377057] Instruction dump:
  [10798.377086] 2fbd0000 419e010c 3c82ff8b 3ca2ff94 3884c360 38a5ad68 3c62ff8b 7fc8f378
  [10798.377140] 7fe6fb78 3863c370 4bdc55b5 60000000 <0fe00000> 60000000 60000000 60420000
  [10798.377195] ---[ end trace 21abd4753a69334c ]---
  [10798.445038]
  [10798.445135] Sending IPI to other CPUs
  [10798.446688] IPI complete
  [10798.449081] kexec: waiting for cpu 0 (physical 16) to enter OPAL
  [10798.450224] kexec: waiting for cpu 23 (physical 47) to enter OPAL
  [10798.451396] kexec: waiting for cpu 54 (physical 94) to enter OPAL
  [10800.049202] kexec: Starting switchover sequence.
  [ 1.078053] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [ 1.078057] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [ 1.165219] vio vio: uevent: failed to send synthetic uevent
  /dev/nvme0n1p2: recovering journal
  /dev/nvme0n1p2: clean, 14017353/122101760 files, 57953106/488376576 blocks
  -.mount sys-kernel-debug.mount setvtrgb.service dev-hugepages.mount dev-mqueue.mount kmod-static-nodes.service lvm2-lvmetad.service systemd-remount-fs.service systemd-tmpfiles-setup-dev.service systemd-random-seed.service lvm2-monitor.service systemd-udevd.service systemd-modules-load.service sys-fs-fuse-connections.mount sys-kernel-config.mount systemd-sysctl.service systemd-networkd.service swapfile.swap
  [ 5.177490] vio vio: uevent: failed to send synthetic uevent
  systemd-udev-trigger.service keyboard-setup.service systemd-journald.service
  [ 5.458352] qla2xxx [0020:01:00.0]-00c6:17: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
  apparmor.service systemd-journal-flush.service systemd-tmpfiles-setup.service systemd-update-utmp.service
  [ 6.119284] qla2xxx [0020:01:00.1]-00c6:18: MSI-X: Failed to enable support with 32 vectors, using 10 vectors.
  systemd-timesyncd.service
  [ 10.052141] megaraid_sas 0001:03:00.0: Init cmd return status SUCCESS for SCSI host 1
  systemd-networkd-wait-online.service iscsid.service blk-availability.service
  [ 10.675964] kdump-tools[2222]: Starting kdump-tools: * running makedumpfile -c -d 31 /proc/vmcore /var/crash/201804050340/dump-incomplete
  lvm2-pvscan@8:195.service lvm2-pvscan@8:179.service
  Copying data : [100.0 %] / eta: 0s
  [ 55.227083] kdump-tools[2222]: The kernel version is not supported.
  [ 55.227300] kdump-tools[2222]: The makedumpfile operation may be incomplete.
  [ 55.227471] kdump-tools[2222]: The dumpfile is saved to /var/crash/201804050340/dump-incomplete.
  [ 55.227583] kdump-tools[2222]: makedumpfile Completed.
  [ 55.230250] kdump-tools[2222]: * kdump-tools: saved vmcore in /var/crash/201804050340
  [ 55.311695] kdump-tools[2222]: * running makedumpfile --dump-dmesg /proc/vmcore /var/crash/201804050340/dmesg.201804050340
  [ 55.330032] kdump-tools[2222]: The kernel version is not supported.
  [ 55.330206] kdump-tools[2222]: The makedumpfile operation may be incomplete.
  [ 55.330302] kdump-tools[2222]: The dmesg log is saved to /var/crash/201804050340/dmesg.201804050340.
  [ 55.330416] kdump-tools[2222]: makedumpfile Completed.
  [ 55.330533] kdump-tools[2222]: * kdump-tools: saved dmesg content in /var/crash/201804050340
  [ 55.334722] kdump-tools[2222]: Thu, 05 Apr 2018 03:40:44 -0500
  [ 55.338419] kdump-tools[2222]: Rebooting.
  [ 55.546343] mlx5_core 0021:01:00.1: mlx5_enter_error_state:121:(pid 2715): start
  [ 55.546414] mlx5_core 0021:01:00.1: mlx5_enter_error_state:128:(pid 2715): end
  [ 55.942498] mlx5_core 0021:01:00.0: mlx5_enter_error_state:121:(pid 2715): start
  [ 55.942631] mlx5_core 0021:01:00.0: mlx5_enter_error_state:128:(pid 2715): end
  [ 59.836381] reboot: Restarting system
  [10963.485916127,5] OPAL: Reboot request...
  5.31149|Ignoring boot flags, incorrect version 0x0
  5.52090|ISTEP 6. 3
  6.16670|ISTEP 6. 4
  6.16957|ISTEP 6. 5
  8.74865|HWAS|PRESENT> DIMM[03]=00AA00AA00AA00AA
  8.74865|HWAS|PRESENT> Membuf[04]=4444000000000000
  8.74866|HWAS|PRESENT> Proc[05]=C000000000000000
  14.03690|ISTEP 6. 6
  14.11948|ISTEP 6. 7
  16.75478|ISTEP 6. 8
  16.91585|ISTEP 6. 9
  17.47534|ISTEP 6.10
  17.55249|ISTEP 6.11
  19.29629|ISTEP 6.12
  19.29926|ISTEP 6.13
  19.30139|ISTEP 7. 1
  19.51889|ISTEP 7. 2

  == Comment: #7 - Vaishnavi Bhat <vaish...@in.ibm.com> - 2018-04-06 04:52:31 ==
  kernel memory exposure attempt detected and the BUG() is called from the below code snippet:
  mm/usercopy.c:72

  KERNEL: /usr/lib/debug/boot/vmlinux-4.15.0-13-generic
  DUMPFILE: dump.201804050340 [PARTIAL DUMP]
  CPUS: 160
  DATE: Thu Apr 5 03:39:16 2018
  UPTIME: 00:48:44
  LOAD AVERAGE: 2.78, 11.61, 106.19
  TASKS: 1748
  NODENAME: ltc-briggs2
  RELEASE: 4.15.0-13-generic
  VERSION: #14-Ubuntu SMP Sat Mar 17 13:43:15 UTC 2018
  MACHINE: ppc64le (2926 Mhz)
  MEMORY: 512 GB
  PANIC: "kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!"
  PID: 4085
  COMMAND: "read_all"
  TASK: c000007659f23f00 [THREAD_INFO: c0000076c63a8000]
  CPU: 87
  STATE: TASK_RUNNING (PANIC)

  crash> bt
  PID: 4085 TASK: c000007659f23f00 CPU: 87 COMMAND: "read_all"
  #0 [c0000076c63ab740] crash_kexec at c0000000001e22b0
  #1 [c0000076c63ab780] oops_end at c000000000025888
  #2 [c0000076c63ab800] _exception at c000000000026684
  #3 [c0000076c63ab990] program_check_common at c000000000008da4
  Program Check [700] exception frame:
  R0: c0000000003c76ec R1: c0000076c63abc80 R2: c0000000016eaf00
  R3: 0000000000000064 R4: c000007ffc1cce18 R5: c000007ffc1e4368
  R6: 9000000000009033 R7: 000000000000040f R8: 0000000000000007
  R9: c0000000011c3a74 R10: 0000007ffb010000 R11: 9000000000001003
  R12: 0000000000002200 R13: c000000007a8bd00 R14: 0000000000000000
  R15: 0000000000000000 R16: 0000000000000000 R17: 0000000000000000
  R18: 0000000000000006 R19: 00007ffff7a0a018 R20: 000008bb551c8908
  R21: 000008bb551c88f8 R22: 000008bb551c88c8 R23: c0000076c63abe00
  R24: 0000000000010000 R25: 0000000000000000 R26: 00007ffff7a0a018
  R27: c0000076c63abe00 R28: c0000000000003ff R29: 0000000000000001
  R30: 00000000000003ff R31: c000000000000000
  NIP: c0000000003c76f0 MSR: 9000000000029033 OR3: c00000000018cce4
  CTR: 00000000300378e8 LR: c0000000003c76ec XER: 0000000020000000
  CCR: 0000000028002222 MQ: 0000000000000001 DAR: 0000000000000000
  DSISR: 0000000000000000 Syscall Result: 0000000000000000

  #4 [c0000076c63abc80] __check_object_size at c0000000003c76f0
  [Link Register] [c0000076c63abc80] __check_object_size at c0000000003c76ec (unreliable)
  #5 [c0000076c63abd00] read_mem at c0000000008268a4
  #6 [c0000076c63abd70] __vfs_read at c0000000003d109c
  #7 [c0000076c63abd90] vfs_read at c0000000003d118c
  #8 [c0000076c63abde0] sys_read at c0000000003d1788
  #9 [c0000076c63abe30] system_call at c00000000000b184
  System Call [c01] exception frame:
  R0: 0000000000000003 R1: 00007ffff7a09ae0 R2: 0000753ec21b7f00
  R3: 0000000000000006 R4: 00007ffff7a0a018 R5: 00000000000003ff
  R6: 0000000000004000 R7: 0000753ec21898c4 R8: 900000010000d033
  R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000
  R12: 0000000000000000 R13: 0000753ec224a8d0
  NIP: 0000753ec2188580 MSR: 900000010000d033 OR3: 0000000000000006
  CTR: 0000000000000000 LR: 000008bb551b5f20 XER: 0000000000000000
  CCR: 0000000042002244 MQ: 0000000000000001 DAR: 0000753ec21affa8
  DSISR: 0000000040000000 Syscall Result: 0000000000000006

  crash> dis -s c0000000003c76f0
  FILE: /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c
  LINE: 72

  static void report_usercopy(const void *ptr, unsigned long len,
                              bool to_user, const char *type)
  {
          pr_emerg("kernel memory %s attempt detected %s %p (%s) (%lu bytes)\n",
                  to_user ? "exposure" : "overwrite",
                  to_user ? "from" : "to", ptr, type ? : "unknown", len);
          /*
           * For greater effect, it would be nice to do do_group_exit(),
           * but BUG() actually hooks all the lock-breaking and per-arch
           * Oops code, so that is used here instead.
           */
          BUG();
  }

  From the logs, I see that the memory exposure happens after the bluetooth driver is initialized.
  This might be an issue with the default bluetooth driver provided by the distro.

  [10795.918866] NET: Registered protocol family 31
  [10795.918909] Bluetooth: HCI device and connection manager initialized
  [10795.918955] Bluetooth: HCI socket layer initialized
  [10795.918991] Bluetooth: L2CAP socket layer initialized
  [10795.919032] Bluetooth: SCO socket layer initialized
  [10798.374850] usercopy: kernel memory exposure attempt detected from 0000000029431ea4 (<kernel text>) (1023 bytes)
  [10798.374952] ------------[ cut here ]------------
  [10798.374988] kernel BUG at /build/linux-2BXDjB/linux-4.15.0/mm/usercopy.c:72!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1761729/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp