On Thu, Jan 24, 2019 at 04:57:13PM +0100, Igor Mammedov wrote: > In case guest is booted with one CPU present and then later > a sibling CPU is hotplugged [1], it stays offline since SMT > is disabled. > > Bisects to > 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") > which used __max_smt_threads to decide disabling SMT and in > case [1] only primary CPU thread is present hence SMT > is disabled. > > Later bc2d8d262cba (cpu/hotplug: Fix SMT supported evaluation), > rewrites code path but evaluation criteria still depends on > sibling thread being present at boot time, so problem persist. > > 1) QEMU -smp 1,sockets=2,cores=1,threads=2 -monitor stdio ... > # hotplug sibling thread > (qemu) device_add qemu64-x86_64-cpu,socket-id=0,core-id=0,thread-id=1 > > I've failed to find reasoning behind statement: > " > cpu/hotplug: detect SMT disabled by BIOS > > If SMT is disabled in BIOS, the CPU code doesn't properly detect it. > " > > Question is > 1: why cpu_smt_check_topology_early() at check_bugs() > wasn't sufficient to detect SMT disabled in BIOS and > 2: why side-effect of present at boot siblings were used > to keep SMT enabled?
Hi Igor, Thanks for reporting this. The original problems with SMT disabled in BIOS were: 1) /sys/devices/system/cpu/smt showed "on"; and 2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning. So 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS") was intended to fix that, by reporting SMT as permanently disabled, when the BIOS has done so. The problem is, I don't think there's a way to detect a difference between the HW "SMT disabled by BIOS" case and the virt "Sibling not yet plugged in" case. I'd propose that we consider #1 above to *not* be a problem. Because in the virt case, it's possible that a sibling thread can be brought online later. So it makes sense to default with smt control "on" to allow for that possibility. The real problem is #2. Which is a simple fix: change vmx_vm_init() to query the current SMT state with sched_smt_active(), instead of looking at the "control" sysfs value. How about this patch? It's just a revert of 73d5e2b47264 and bc2d8d262cba, plus the 1-line vmx_vm_init() change. If it looks ok to you, I can clean it up and submit an official version. diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c index 453e66291e38..2a4db86c2780 100644 --- a/arch/x86/kernel/cpu/bugs.c +++ b/arch/x86/kernel/cpu/bugs.c @@ -70,7 +70,7 @@ void __init check_bugs(void) * identify_boot_cpu() initialized SMT support information, let the * core code know. */ - cpu_smt_check_topology_early(); + cpu_smt_check_topology(); if (!IS_ENABLED(CONFIG_SMP)) { pr_info("CPU: "); diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 886ad6db0926..494f32ae6e6f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11102,7 +11102,7 @@ static int vmx_vm_init(struct kvm *kvm) * Warn upon starting the first VM in a potentially * insecure environment. */ - if (cpu_smt_control == CPU_SMT_ENABLED) + if (sched_smt_active()) pr_warn_once(L1TF_MSG_SMT); if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER) pr_warn_once(L1TF_MSG_L1D); diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 218df7f4d3e1..5041357d0297 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -180,12 +180,10 @@ enum cpuhp_smt_control { #if defined(CONFIG_SMP) && defined(CONFIG_HOTPLUG_SMT) extern enum cpuhp_smt_control cpu_smt_control; extern void cpu_smt_disable(bool force); -extern void cpu_smt_check_topology_early(void); extern void cpu_smt_check_topology(void); #else # define cpu_smt_control (CPU_SMT_ENABLED) static inline void cpu_smt_disable(bool force) { } -static inline void cpu_smt_check_topology_early(void) { } static inline void cpu_smt_check_topology(void) { } #endif diff --git a/kernel/cpu.c b/kernel/cpu.c index d7c2e48ed10d..240fb4ab3f38 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -360,8 +360,6 @@ void __weak arch_smt_update(void) { } enum cpuhp_smt_control cpu_smt_control __read_mostly = CPU_SMT_ENABLED; EXPORT_SYMBOL_GPL(cpu_smt_control); -static bool cpu_smt_available __read_mostly; - void __init cpu_smt_disable(bool force) { if (cpu_smt_control == CPU_SMT_FORCE_DISABLED || @@ -378,25 +376,11 @@ void __init cpu_smt_disable(bool force) /* * The decision whether SMT is supported can only be done after the full - * CPU identification. Called from architecture code before non boot CPUs - * are brought up. - */ -void __init cpu_smt_check_topology_early(void) -{ - if (!topology_smt_supported()) - cpu_smt_control = CPU_SMT_NOT_SUPPORTED; -} - -/* - * If SMT was disabled by BIOS, detect it here, after the CPUs have been - * brought online. This ensures the smt/l1tf sysfs entries are consistent - * with reality. cpu_smt_available is set to true during the bringup of non - * boot CPUs when a SMT sibling is detected. Note, this may overwrite - * cpu_smt_control's previous setting. + * CPU identification. Called from architecture code. */ void __init cpu_smt_check_topology(void) { - if (!cpu_smt_available) + if (!topology_smt_supported()) cpu_smt_control = CPU_SMT_NOT_SUPPORTED; } @@ -409,18 +393,10 @@ early_param("nosmt", smt_cmdline_disable); static inline bool cpu_smt_allowed(unsigned int cpu) { - if (topology_is_primary_thread(cpu)) + if (cpu_smt_control == CPU_SMT_ENABLED) return true; - /* - * If the CPU is not a 'primary' thread and the booted_once bit is - * set then the processor has SMT support. Store this information - * for the late check of SMT support in cpu_smt_check_topology(). - */ - if (per_cpu(cpuhp_state, cpu).booted_once) - cpu_smt_available = true; - - if (cpu_smt_control == CPU_SMT_ENABLED) + if (topology_is_primary_thread(cpu)) return true; /* diff --git a/kernel/smp.c b/kernel/smp.c index d86eec5f51c1..084c8b3a2681 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -584,8 +584,6 @@ void __init smp_init(void) num_nodes, (num_nodes > 1 ? "s" : ""), num_cpus, (num_cpus > 1 ? "s" : "")); - /* Final decision about SMT support */ - cpu_smt_check_topology(); /* Any cleanup work */ smp_cpus_done(setup_max_cpus); }