On Thu, Jul 04, 2019 at 10:33:11AM +0530, Aravinda Prasad wrote: > > > On Thursday 04 July 2019 06:37 AM, David Gibson wrote: > > On Wed, Jul 03, 2019 at 02:58:24PM +0530, Aravinda Prasad wrote: > >> > >> > >> On Wednesday 03 July 2019 08:33 AM, David Gibson wrote: > >>> On Tue, Jul 02, 2019 at 11:54:26AM +0530, Aravinda Prasad wrote: > >>>> > >>>> > >>>> On Tuesday 02 July 2019 09:21 AM, David Gibson wrote: > >>>>> On Wed, Jun 12, 2019 at 02:51:04PM +0530, Aravinda Prasad wrote: > >>>>>> Introduce the KVM capability KVM_CAP_PPC_FWNMI so that > >>>>>> the KVM causes guest exit with NMI as exit reason > >>>>>> when it encounters a machine check exception on the > >>>>>> address belonging to a guest. Without this capability > >>>>>> enabled, KVM redirects machine check exceptions to > >>>>>> guest's 0x200 vector. > >>>>>> > >>>>>> This patch also introduces fwnmi-mce capability to > >>>>>> deal with the case when a guest with the > >>>>>> KVM_CAP_PPC_FWNMI capability enabled is attempted > >>>>>> to migrate to a host that does not support this > >>>>>> capability. > >>>>>> > >>>>>> Signed-off-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com> > >>>>>> --- > >>>>>> hw/ppc/spapr.c | 1 + > >>>>>> hw/ppc/spapr_caps.c | 26 ++++++++++++++++++++++++++ > >>>>>> include/hw/ppc/spapr.h | 4 +++- > >>>>>> target/ppc/kvm.c | 19 +++++++++++++++++++ > >>>>>> target/ppc/kvm_ppc.h | 12 ++++++++++++ > >>>>>> 5 files changed, 61 insertions(+), 1 deletion(-) > >>>>>> > >>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c > >>>>>> index 6dd8aaa..2ef86aa 100644 > >>>>>> --- a/hw/ppc/spapr.c > >>>>>> +++ b/hw/ppc/spapr.c > >>>>>> @@ -4360,6 +4360,7 @@ static void spapr_machine_class_init(ObjectClass > >>>>>> *oc, void *data) > >>>>>> smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF; > >>>>>> smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = > >>>>>> SPAPR_CAP_ON; > >>>>>> smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF; > >>>>>> + smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF; > >>>>>> spapr_caps_add_properties(smc, &error_abort); > >>>>>> smc->irq = &spapr_irq_dual; > >>>>>> smc->dr_phb_enabled = true; > >>>>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c > >>>>>> index 31b4661..2e92eb6 100644 > >>>>>> --- a/hw/ppc/spapr_caps.c > >>>>>> +++ b/hw/ppc/spapr_caps.c > >>>>>> @@ -479,6 +479,22 @@ static void > >>>>>> cap_ccf_assist_apply(SpaprMachineState *spapr, uint8_t val, > >>>>>> } > >>>>>> } > >>>>>> > >>>>>> +static void cap_fwnmi_mce_apply(SpaprMachineState *spapr, uint8_t val, > >>>>>> + Error **errp) > >>>>>> +{ > >>>>>> + if (!val) { > >>>>>> + return; /* Disabled by default */ > >>>>>> + } > >>>>>> + > >>>>>> + if (tcg_enabled()) { > >>>>>> + error_setg(errp, > >>>>>> +"No Firmware Assisted Non-Maskable Interrupts support in TCG, try > >>>>>> cap-fwnmi-mce=off"); > >>>>> > >>>>> Not allowing this for TCG creates an awkward incompatibility between > >>>>> KVM and TCG guests. I can't actually see any reason to ban it for TCG > >>>>> - with the current code TCG won't ever generate NMIs, but I don't see > >>>>> that anything will actually break. > >>>>> > >>>>> In fact, we do have an nmi monitor command, currently wired to the > >>>>> spapr_nmi() function which resets each cpu, but it probably makes > >>>>> sense to wire it up to the fwnmi stuff when present. > >>>> > >>>> Yes, but that nmi support is not enough to inject a synchronous error > >>>> into the guest kernel. For example, we should provide the faulty address > >>>> along with other information such as the type of error (slb multi-hit, > >>>> memory error, TLB multi-hit) and when the error occurred (load/store) > >>>> and whether the error was completely recovered or not. Without such > >>>> information we cannot build the error log and pass it on to the guest > >>>> kernel. Right now nmi monitor command takes cpu number as the only > >>>> argument. > >>> > >>> Obviously we can't inject an arbitrary MCE event with that monitor > >>> command. But isn't there some sort of catch-all / unknown type of MCE > >>> event which we could inject? > >> > >> We have "unknown" type of error, but we should also pass an address in > >> the MCE event log. Strictly speaking this address should be a valid > >> address in the current CPU context as MCEs are synchronous errors > >> triggered when we touch a bad address. > > > > Well, some of them are. At least historically both synchronous and > > asnchronous MCEs were possible. Are there really no versions where > > you can report an MCE with unknown address? > > I am not aware of any such versions. Will cross check. > > > > >> We can pass a default address with every nmi, but I am not sure whether > >> that will be practically helpful. > >> > >>> It seems very confusing to me to have 2 totally separate "nmi" > >>> mechanisms. > >>> > >>>> So I think TCG support should be a separate patch by itself. > >>> > >>> Even if we don't wire up the monitor command, I still don't see > >>> anything that this patch breaks - we can support the nmi-register and > >>> nmi-interlock calls without ever actually creating MCE events. > >> > >> If we support nmi-register and nmi-interlock calls without the monitor > >> command wire-up then we will be falsely claiming the nmi support to the > >> guest while it is not actually supported. > > > > How so? AFAICT, from the point of view of the guest this is not > > observably different from supporting the NMI mechanism but NMIs never > > occurring. > > A guest inserting a duplicate SLB will expect the machine check > exception delivered to the handler registered via nmi,register. > But we actually don't do that in TCG.
Ah, true, I was thinking of external hardware fault triggered MCEs rather than software error ones like duplicate SLB. That said, I strongly suspect TCG is buggy enough at present that exact behaviour in rare error conditions like duplicate SLB is not really a big problem in the scheme of things. I really don't think we can enable this by default until we allow it for TCG - we don't want starting a TCG guest to involve manually switching other options. We could consider allowing it for TCG but just printing a warning that the behaviour may not be correct in some conditions - we do something similar for some of the Spectre workarounds already. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature