Quoting Thomas Gleixner (2019-09-07 16:00:17)
> Does this only happen with that CPU0 hotplug stuff enabled or on CPUs other
> than CPU0 as well? That hotplug CPU0 stuff is a bandaid so I wouldn't be
> surprised if we broke that somehow.

If I ignore cpu0 in that test and so use

[  133.847187] smpboot: CPU 1 is now offline
[  134.861861] x86: Booting SMP configuration:
[  134.861875] smpboot: Booting Node 0 Processor 1 APIC 0x2
[  134.880218] smpboot: CPU 2 is now offline
[  135.893806] smpboot: Booting Node 0 Processor 2 APIC 0x1
[  135.935115] smpboot: CPU 3 is now offline
[  136.949760] smpboot: Booting Node 0 Processor 3 APIC 0x3

that has run for 10 minutes without failure, so it seems confined to cpu0
hotplugging.

All we are doing in the test to generate the hotplugs is:

	for (int cpu = 0;; cpu++) {
		char name[128];
		int cpufd;

		snprintf(name, sizeof(name),
			 "/sys/devices/system/cpu/cpu%d/online", cpu);
		cpufd = open(name, O_WRONLY);
		if (cpufd < 0)
			break;

		write(cpufd, "0", 2);
		usleep(1e6);
		write(cpufd, "1", 2);

		close(cpufd);
	}
-Chris