I'm afraid that one fails, too, on the second run when bringing CPU10 back online. Here's the dmesg output:

[ 154.987312] smpboot: Booting Node 1 Processor 10 APIC 0x14
[ 154.992953] BUG: unable to handle kernel paging request at 0000317865646e69
[ 154.993932] IP: __kmalloc_track_caller+0x97/0x1f0
[ 154.994847] PGD 0
[ 154.994848] P4D 0
[ 154.997397] Oops: 0000 [#1] SMP
[ 154.998250] Modules linked in: nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm joydev input_leds ipmi_ssif irqbypass mac_hid ipmi_si shpchp intel_cstate intel_rapl_perf acpi_power_meter ipmi_devintf acpi_pad mei_me lpc_ich ipmi_msghandler mei ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas mgag200 ttm fnic hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drm_kms_helper pcbc usbhid syscopyarea igb sysfillrect libfcoe aesni_intel sysimgblt dca fb_sys_fops i2c_algo_bit aes_x86_64 hid crypto_simd glue_helper libfc ptp mxm_wmi ahci drm cryptd
[ 155.005714]  libahci pps_core scsi_transport_fc enic megaraid_sas wmi
[ 155.006913] CPU: 10 PID: 69 Comm: cpuhp/10 Not tainted 4.13.0-13-generic #14~lp1733662Commitac2fc5adab0f4
[ 155.008154] Hardware name: Cisco Systems Inc UCSC-C240-M4L/UCSC-C240-M4L, BIOS C240M4.2.0.10c.0.032320160820 03/23/2016
[ 155.009427] task: ffff91c7b8785d00 task.stack: ffffa8760c7e8000
[ 155.010718] RIP: 0010:__kmalloc_track_caller+0x97/0x1f0
[ 155.012014] RSP: 0000:ffffa8760c7ebc48 EFLAGS: 00010206
[ 155.013308] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000014b9
[ 155.014618] RDX: 00000000000014b8 RSI: 0000000000000000 RDI: 000000000001f3e0
[ 155.015946] RBP: ffffa8760c7ebc80 R08: ffff91c7bf29f3e0 R09: ffff91a7bf807c00
[ 155.017284] R10: ffffa8760c7ebce0 R11: 0000000000000006 R12: 0000317865646e69
[ 155.018620] R13: 00000000014000c0 R14: 0000000000000007 R15: ffff91a7bf807c00
[ 155.019965] FS: 0000000000000000(0000) GS:ffff91c7bf280000(0000) knlGS:0000000000000000
[ 155.021329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 155.022710] CR2: 0000317865646e69 CR3: 0000000ec6c09000 CR4: 00000000001406e0
[ 155.024101] Call Trace:
[ 155.025490]  ? kvasprintf_const+0x45/0xa0
[ 155.026906]  kvasprintf+0x66/0xd0
[ 155.028304]  kvasprintf_const+0x45/0xa0
[ 155.029703]  kobject_set_name_vargs+0x23/0x90
[ 155.031101]  cpu_device_create+0xa4/0x100
[ 155.032485]  ? smp_call_function_single+0xb9/0xe0
[ 155.033891]  cacheinfo_cpu_online+0x2ac/0x400
[ 155.035295]  ? get_cpu_cacheinfo+0x50/0x50
[ 155.036709]  cpuhp_invoke_callback+0x84/0x3b0
[ 155.038101]  cpuhp_up_callbacks+0x36/0xc0
[ 155.039513]  cpuhp_thread_fun+0xd4/0xe0
[ 155.040923]  smpboot_thread_fn+0xec/0x160
[ 155.042319]  kthread+0x125/0x140
[ 155.043706]  ? sort_range+0x30/0x30
[ 155.045107]  ? kthread_create_on_node+0x70/0x70
[ 155.046515]  ret_from_fork+0x25/0x30
[ 155.047906] Code: 08 65 4c 03 05 ab e5 7d 5b 49 83 78 10 00 4d 8b 20 0f 84 ef 00 00 00 4d 85 e4 0f 84 e6 00 00 00 49 63 41 20 49 8b 39 48 8d 4a 01 <49> 8b 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb 49 63
[ 155.050922] RIP: __kmalloc_track_caller+0x97/0x1f0 RSP: ffffa8760c7ebc48
[ 155.052426] CR2: 0000317865646e69
[ 155.053914] ---[ end trace f7bb4aa3c197a453 ]---

To be sure, here's the kernel version information:

$ uname -a
Linux oil-boldore 4.13.0-13-generic #14~lp1733662Commitac2fc5adab0f4 SMP Fri Jan 5 15:31:13 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1733662

Title:
  System hang with Linux kernel 4.13, not with 4.10

Status in linux package in Ubuntu:
  In Progress
Status in linux-hwe package in Ubuntu:
  New
Status in linux source package in Artful:
  In Progress
Status in linux-hwe source package in Artful:
  New
Status in linux source package in Bionic:
  In Progress
Status in linux-hwe source package in Bionic:
  New

Bug description:
  In doing Ubuntu 17.10 regression testing, we've encountered one
  computer (boldore, a Cisco UCS C240 M4 [VIC]), that hangs about one in
  four times when running our cpu_offlining test. This test attempts to
  take all the CPU cores offline except one, then brings them back
  online again. This test ran successfully on boldore with previous
  releases, but with 17.10, the system sometimes (about one in four
  runs) hangs. Reverting to Ubuntu 16.04.3, I found no problems; but
  when I upgraded the 16.04.3 installation to
  linux-image-4.13.0-16-generic, the problem appeared again, so I'm
  confident this is a problem with the kernel.

  I'm attaching two files, dmesg-output-4.10.txt and
  dmesg-output-4.13.txt, which show the dmesg output that appears when
  running the cpu_offlining test with 4.10.0-38 and 4.13.0-16 kernels,
  respectively; the system hung on the 4.13 run. (I was running "dmesg
  -w" in a second SSH login; the files are cut-and-pasted from that.)

  I initiated this bug report from an Ubuntu 16.04.3 installation
  running a 4.10 kernel; but as I said, this applies to the 4.13 kernel.

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: linux-image-4.10.0-38-generic 4.10.0-38.42~16.04.1
  ProcVersionSignature: User Name 4.10.0-38.42~16.04.1-generic 4.10.17
  Uname: Linux 4.10.0-38-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2.10
  Architecture: amd64
  Date: Tue Nov 21 17:36:06 2017
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  SourcePackage: linux-hwe
  UpgradeStatus: No upgrade log present (probably fresh install)
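
For readers who want to reproduce the offline/online cycle that the bug description attributes to the cpu_offlining test, the following is a minimal sketch only, not the actual test script (which is not shown in this report). It assumes the standard Linux sysfs hotplug interface, where writing 0 or 1 to /sys/devices/system/cpu/cpuN/online takes a CPU offline or brings it back. The function names and the sysfs_root parameter are hypothetical, introduced here so the logic can be tried against a scratch directory before running it as root on a real machine.

```python
import os
import re

SYSFS_CPU = "/sys/devices/system/cpu"  # standard sysfs location on Linux


def toggleable_cpus(sysfs_root=SYSFS_CPU):
    """Return the CPU numbers this sketch will toggle, in ascending order.

    Only cpuN directories with an 'online' control file are considered.
    If every CPU has one, the lowest-numbered CPU is left alone so that
    one core always stays online, as the bug description requires.
    """
    all_cpus, hotplug = [], []
    for name in os.listdir(sysfs_root):
        match = re.fullmatch(r"cpu(\d+)", name)
        if not match:
            continue  # skips entries such as cpufreq or cpuidle
        cpu = int(match.group(1))
        all_cpus.append(cpu)
        if os.path.exists(os.path.join(sysfs_root, name, "online")):
            hotplug.append(cpu)
    hotplug.sort()
    if hotplug and len(hotplug) == len(all_cpus):
        hotplug = hotplug[1:]
    return hotplug


def set_online(cpu, online, sysfs_root=SYSFS_CPU):
    """Write 1 (online) or 0 (offline) to the CPU's control file."""
    path = os.path.join(sysfs_root, f"cpu{cpu}", "online")
    with open(path, "w") as f:
        f.write("1" if online else "0")


def is_online(cpu, sysfs_root=SYSFS_CPU):
    """Read the CPU's control file back and report its state."""
    path = os.path.join(sysfs_root, f"cpu{cpu}", "online")
    with open(path) as f:
        return f.read().strip() == "1"


def offline_online_cycle(sysfs_root=SYSFS_CPU):
    """Take the toggleable CPUs offline, then bring them back online.

    Returns (cpus, offline_ok, online_ok), where the two flags say
    whether every CPU reported the expected state after each phase.
    """
    cpus = toggleable_cpus(sysfs_root)
    for cpu in cpus:
        set_online(cpu, False, sysfs_root)
    offline_ok = all(not is_online(cpu, sysfs_root) for cpu in cpus)
    for cpu in cpus:
        set_online(cpu, True, sysfs_root)
    online_ok = all(is_online(cpu, sysfs_root) for cpu in cpus)
    return cpus, offline_ok, online_ok
```

Calling offline_online_cycle() with no arguments as root performs one real cycle. Since the hang reported here shows up in roughly one run in four, repeating the call several times while watching "dmesg -w" in a second login is probably needed to trigger it.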