On 11/15/16 16:45, Michael S. Tsirkin wrote:
> On Tue, Nov 15, 2016 at 04:39:04PM +0100, Laszlo Ersek wrote:
>> On 11/15/16 14:59, Paolo Bonzini wrote:
>>>
>>>
>>> On 15/11/2016 02:50, Laszlo Ersek wrote:
>>>> The generic edk2 SMM infrastructure prefers
>>>> EFI_SMM_CONTROL2_PROTOCOL.Trigger() to inject an SMI on each processor. If
>>>> Trigger() only brings the current processor into SMM, then edk2 handles it
>>>> in the following ways:
>>>>
>>>> (1) If Trigger() is executed by the BSP (which is guaranteed before
>>>>     ExitBootServices(), but is not necessarily true at runtime), then:
>>>>
>>>>     (a) If edk2 has been configured for "traditional" SMM synchronization,
>>>>         then the BSP sends directed SMIs to the APs with APIC delivery,
>>>>         bringing them into SMM individually. Then the BSP runs the SMI
>>>>         handler / dispatcher.
>>>>
>>>>     (b) If edk2 has been configured for "relaxed" SMM synchronization,
>>>>         then the APs that are not already in SMM are not brought in, and
>>>>         the BSP runs the SMI handler / dispatcher.
>>>>
>>>> (2) If Trigger() is executed by an AP (which is possible after
>>>>     ExitBootServices(), and can be forced e.g. by "taskset -c 1
>>>>     efibootmgr"), then the AP in question brings in the BSP with a
>>>>     directed SMI, and the BSP runs the SMI handler / dispatcher.
>>>>
>>>> The smaller problem with (1a) and (2) is that the BSP and AP
>>>> synchronization is slow. For example, the "taskset -c 1 efibootmgr"
>>>> command from (2) can take more than 3 seconds to complete, because
>>>> efibootmgr accesses non-volatile UEFI variables intensively.
>>>>
>>>> The larger problem is that QEMU's current behavior diverges from the
>>>> behavior usually seen on physical hardware, and that keeps exposing
>>>> obscure corner cases, race conditions and other instabilities in edk2,
>>>> which generally expects / prefers a software SMI to affect all CPUs at
>>>> once.
>>>>
>>>> Therefore introduce a special APM_STS value (0x51) that causes QEMU to
>>>> inject the SMI on all VCPUs. OVMF's EFI_SMM_CONTROL2_PROTOCOL.Trigger()
>>>> can utilize this to accommodate edk2's preference about "broadcast" SMI.
>>>>
>>>> SeaBIOS uses values 0x00 and 0x01 for APM_STS (called PORT_SMI_STATUS in
>>>> the SeaBIOS code), so this change should be transparent to it.
>>>>
>>>> While the original posting of this patch
>>>> <http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg05658.html>
>>>> only intended to speed up (2), based on our recent "stress testing" of SMM
>>>> this patch actually provides functional improvements. (There are no code
>>>> changes relative to the original posting.)
>>>>
>>>> Cc: "Michael S. Tsirkin" <m...@redhat.com>
>>>> Cc: Paolo Bonzini <pbonz...@redhat.com>
>>>> Also-suggested-by: Paolo Bonzini <pbonz...@redhat.com>
>>>> Signed-off-by: Laszlo Ersek <ler...@redhat.com>
>>>
>>> I'm queuing this for 2.8,
>>
>> Thank you!
>>
>>> but I have a question---should this feature be
>>> detectable, and if so how?
>>
>> That's the exact question I wanted to ask you. :)
>>
>> ... How about an fw_cfg file? For example:
>>
>> - name: etc/broadcast_smi
>> - value type: uint8_t
>> - role: if the file exists, the broadcast SMI capability exists. Read
>>   the uint8_t value from the fw_cfg file. Write the uint8_t value read to
>>   ICH9_APM_STS first, before triggering the SMI via ICH9_APM_CNT, to
>>   request a broadcast SMI. The values 0 and 1 are reserved for SeaBIOS, so
>>   if the fw_cfg file exists, those values will never be provided.
>>
>> This would allow me to make the OVMF patches more robust:
>> - first, I don't have to hardcode the value 'Q' in SmmControl2Dxe,
>> - second, I can check for the presence of "etc/broadcast_smi" in
>>   PlatformPei, and set PcdCpuSmmApSyncTimeout and PcdCpuSmmSyncMode
>>   dynamically.
>>
>> I would quite like this approach, as simply reverting the
>> PcdCpuSmmApSyncTimeout and PcdCpuSmmSyncMode settings in OVMF will break
>> SMM hard on QEMUs that do not support the broadcast SMI feature. The
>> default would remain the current setting.
>>
>> Also, if we add the fw_cfg item, we don't need to rush this into 2.8 I
>> think (of course I wouldn't mind making 2.8 nevertheless).
>>
>> Do you think we should make the broadcast SMI capability (and the
>> descriptor fw_cfg file) machine-type dependent? I do think so (if for
>> nothing else then for "rather safe than sorry"), but I always struggle
>> with this kind of question, so any advice is most welcome...
>>
>> Thank you!
>> Laszlo
>
> Hmm it's a bugfix so why not just backport the fix?
> I don't see why do we want work-arounds in UEFI -
> does not seem much easier.

If the consensus is that the patch is a QEMU bugfix (as opposed to a
feature) and that it is eligible for the currently supported upstream
stable branches, that's the best, no doubt.

For reference, the OVMF documentation recommends QEMU 2.5+ for SMM. The
SMM enablement in libvirt enforces QEMU 2.4+. (Libvirt is actually
correct; when I was writing the OVMF docs, I must have misunderstood the
requirements and needlessly required 2.5+; 2.4+ should have been fine.)

Which means the fix should be backported as far as stable-2.4.

Should we proceed with that?

CC'ing Mike Roth and the stable list.

Thanks!
Laszlo

>
>
>>>
>>> Paolo
>>>
>>>> ---
>>>>  hw/isa/lpc_ich9.c | 12 +++++++++++-
>>>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
>>>> index 10d1ee8b9310..f2fe644fdaa4 100644
>>>> --- a/hw/isa/lpc_ich9.c
>>>> +++ b/hw/isa/lpc_ich9.c
>>>> @@ -372,6 +372,8 @@ void ich9_lpc_pm_init(PCIDevice *lpc_pci, bool smm_enabled)
>>>>
>>>>  /* APM */
>>>>
>>>> +#define QEMU_ICH9_APM_STS_BROADCAST_SMI 'Q'
>>>> +
>>>>  static void ich9_apm_ctrl_changed(uint32_t val, void *arg)
>>>>  {
>>>>      ICH9LPCState *lpc = arg;
>>>> @@ -386,7 +388,15 @@ static void ich9_apm_ctrl_changed(uint32_t val, void *arg)
>>>>
>>>>      /* SMI_EN = PMBASE + 30. SMI control and enable register */
>>>>      if (lpc->pm.smi_en & ICH9_PMIO_SMI_EN_APMC_EN) {
>>>> -        cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>>>> +        if (lpc->apm.apms == QEMU_ICH9_APM_STS_BROADCAST_SMI) {
>>>> +            CPUState *cs;
>>>> +
>>>> +            CPU_FOREACH(cs) {
>>>> +                cpu_interrupt(cs, CPU_INTERRUPT_SMI);
>>>> +            }
>>>> +        } else {
>>>> +            cpu_interrupt(current_cpu, CPU_INTERRUPT_SMI);
>>>> +        }
>>>>      }
>>>>  }
>>>>
>>>>
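
For anyone who wants to reason about the behavior change without building
QEMU, here is a minimal standalone sketch of the dispatch logic in the
quoted hunk. It is a toy model, not QEMU code: ModelMachine and the
model_*() helpers are hypothetical names made up for illustration, and
only the QEMU_ICH9_APM_STS_BROADCAST_SMI value ('Q', i.e. 0x51) and the
"broadcast vs. current CPU only" branch are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value taken from the quoted patch: 'Q' == 0x51. */
#define QEMU_ICH9_APM_STS_BROADCAST_SMI 'Q'

/* Hypothetical upper bound for this toy model only. */
#define MODEL_MAX_CPUS 8

/* Hypothetical stand-in for the machine state; NOT QEMU's ICH9LPCState. */
typedef struct {
    uint8_t apm_sts;                     /* last value written to APM_STS */
    bool    apmc_en;                     /* models ICH9_PMIO_SMI_EN_APMC_EN */
    size_t  num_cpus;                    /* number of modeled VCPUs */
    bool    smi_pending[MODEL_MAX_CPUS]; /* one pending-SMI flag per VCPU */
} ModelMachine;

/* Reset the model: APMC SMIs enabled, no SMI pending, APM_STS == 0. */
void model_init(ModelMachine *m, size_t num_cpus)
{
    size_t i;

    assert(num_cpus > 0 && num_cpus <= MODEL_MAX_CPUS);
    m->apm_sts = 0;
    m->apmc_en = true;
    m->num_cpus = num_cpus;
    for (i = 0; i < MODEL_MAX_CPUS; i++) {
        m->smi_pending[i] = false;
    }
}

/* Models a guest write to the APM status register. */
void model_write_apm_sts(ModelMachine *m, uint8_t val)
{
    m->apm_sts = val;
}

/*
 * Models a guest write to the APM control register issued by current_cpu.
 * Mirrors only the SMI branch of the patched ich9_apm_ctrl_changed().
 */
void model_write_apm_cnt(ModelMachine *m, size_t current_cpu)
{
    size_t i;

    assert(current_cpu < m->num_cpus);
    if (!m->apmc_en) {
        return; /* software SMIs disabled: nothing is injected */
    }
    if (m->apm_sts == QEMU_ICH9_APM_STS_BROADCAST_SMI) {
        /* Broadcast: every VCPU gets the SMI. */
        for (i = 0; i < m->num_cpus; i++) {
            m->smi_pending[i] = true;
        }
    } else {
        /* Pre-patch behavior: only the triggering VCPU gets the SMI. */
        m->smi_pending[current_cpu] = true;
    }
}

/* Count the VCPUs that have an SMI pending. */
size_t model_count_pending(const ModelMachine *m)
{
    size_t i, n = 0;

    for (i = 0; i < m->num_cpus; i++) {
        n += m->smi_pending[i] ? 1 : 0;
    }
    return n;
}
```

In this model, writing 0x00 or 0x01 to APM_STS (the values SeaBIOS uses)
before the APM_CNT write leaves exactly one VCPU with a pending SMI, while
writing 'Q' first marks all of them, which is the point of the patch.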