On Wednesday 03 July 2019 08:33 AM, David Gibson wrote:
> On Tue, Jul 02, 2019 at 11:54:26AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Tuesday 02 July 2019 09:21 AM, David Gibson wrote:
>>> On Wed, Jun 12, 2019 at 02:51:04PM +0530, Aravinda Prasad wrote:
>>>> Introduce the KVM capability KVM_CAP_PPC_FWNMI so that
>>>> KVM causes a guest exit with NMI as the exit reason
>>>> when it encounters a machine check exception on an
>>>> address belonging to a guest. Without this capability
>>>> enabled, KVM redirects machine check exceptions to
>>>> the guest's 0x200 vector.
>>>>
>>>> This patch also introduces the fwnmi-mce capability to
>>>> deal with the case where a guest with the
>>>> KVM_CAP_PPC_FWNMI capability enabled is migrated
>>>> to a host that does not support this
>>>> capability.
>>>>
>>>> Signed-off-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com>
>>>> ---
>>>> hw/ppc/spapr.c | 1 +
>>>> hw/ppc/spapr_caps.c | 26 ++++++++++++++++++++++++++
>>>> include/hw/ppc/spapr.h | 4 +++-
>>>> target/ppc/kvm.c | 19 +++++++++++++++++++
>>>> target/ppc/kvm_ppc.h | 12 ++++++++++++
>>>> 5 files changed, 61 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index 6dd8aaa..2ef86aa 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -4360,6 +4360,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>>>> smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
>>>> smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
>>>> smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
>>>> + smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF;
>>>> spapr_caps_add_properties(smc, &error_abort);
>>>> smc->irq = &spapr_irq_dual;
>>>> smc->dr_phb_enabled = true;
>>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>>>> index 31b4661..2e92eb6 100644
>>>> --- a/hw/ppc/spapr_caps.c
>>>> +++ b/hw/ppc/spapr_caps.c
>>>> @@ -479,6 +479,22 @@ static void cap_ccf_assist_apply(SpaprMachineState *spapr, uint8_t val,
>>>> }
>>>> }
>>>>
>>>> +static void cap_fwnmi_mce_apply(SpaprMachineState *spapr, uint8_t val,
>>>> + Error **errp)
>>>> +{
>>>> + if (!val) {
>>>> + return; /* Disabled by default */
>>>> + }
>>>> +
>>>> + if (tcg_enabled()) {
>>>> + error_setg(errp,
>>>> +"No Firmware Assisted Non-Maskable Interrupts support in TCG, try cap-fwnmi-mce=off");
>>>
>>> Not allowing this for TCG creates an awkward incompatibility between
>>> KVM and TCG guests. I can't actually see any reason to ban it for TCG
>>> - with the current code TCG won't ever generate NMIs, but I don't see
>>> that anything will actually break.
>>>
>>> In fact, we do have an nmi monitor command, currently wired to the
>>> spapr_nmi() function which resets each cpu, but it probably makes
>>> sense to wire it up to the fwnmi stuff when present.
>>
>> Yes, but that nmi support is not enough to inject a synchronous error
>> into the guest kernel. For example, we should provide the faulty address
>> along with other information such as the type of error (slb multi-hit,
>> memory error, TLB multi-hit) and when the error occurred (load/store)
>> and whether the error was completely recovered or not. Without such
>> information we cannot build the error log and pass it on to the guest
>> kernel. Right now nmi monitor command takes cpu number as the only argument.
>
> Obviously we can't inject an arbitrary MCE event with that monitor
> command. But isn't there some sort of catch-all / unknown type of MCE
> event which we could inject?

We do have an "unknown" error type, but we should also pass an address in
the MCE event log. Strictly speaking, this address should be valid
in the current CPU context, because MCEs are synchronous errors
triggered when we touch a bad address.

We could pass a default address with every NMI, but I am not sure whether
that would be of any practical help.
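
To make the gap concrete, here is a minimal sketch of the information an MCE event would have to carry, next to what the monitor command can actually supply (only a CPU number). All names here (struct mce_event, MCE_DEFAULT_ADDR, mce_event_from_monitor) are hypothetical and used for illustration only; this is neither the RTAS error log layout nor existing QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error classes; these values are not the PAPR/RTAS encoding. */
enum mce_kind {
    MCE_KIND_UNKNOWN = 0,
    MCE_KIND_SLB_MULTIHIT,
    MCE_KIND_TLB_MULTIHIT,
    MCE_KIND_MEMORY_ERROR,
};

/* Whether the faulting access was a load or a store, if known. */
enum mce_access {
    MCE_ACCESS_NONE = 0,
    MCE_ACCESS_LOAD,
    MCE_ACCESS_STORE,
};

/* Hypothetical record of what the guest's error log needs to describe. */
struct mce_event {
    int cpu_index;
    enum mce_kind kind;
    enum mce_access access;
    bool recovered;     /* was the error fully recovered? */
    bool addr_valid;    /* does addr refer to a real faulting address? */
    uint64_t addr;
};

/* Assumed placeholder address for events with no real faulting address. */
#define MCE_DEFAULT_ADDR UINT64_C(0)

/*
 * Build the only event the monitor command could describe today: it knows
 * the CPU number and nothing else, so every other field is a default.
 */
struct mce_event mce_event_from_monitor(int cpu_index)
{
    assert(cpu_index >= 0);

    struct mce_event ev = {
        .cpu_index = cpu_index,
        .kind = MCE_KIND_UNKNOWN,
        .access = MCE_ACCESS_NONE,
        .recovered = false,
        .addr_valid = false,
        .addr = MCE_DEFAULT_ADDR,
    };
    return ev;
}
```

Calling mce_event_from_monitor(2) yields an event of unknown type with addr_valid cleared, which is why I doubt such an event tells the guest kernel anything useful.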
>
> It seems very confusing to me to have 2 totally separate "nmi"
> mechanisms.
>
>> So I think TCG support should be a separate patch by itself.
>
> Even if we don't wire up the monitor command, I still don't see
> anything that this patch breaks - we can support the nmi-register and
> nmi-interlock calls without ever actually creating MCE events.

If we support the nmi-register and nmi-interlock calls without wiring up
the monitor command, we will be falsely claiming NMI support to the
guest when it is not actually supported.
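
The policy I am arguing for can be sketched as a standalone check: the capability is accepted only when the accelerator can really deliver machine checks to the guest. This is a simplified model, not the QEMU code. The names accel_kind and fwnmi_cap_check, the kvm_has_fwnmi flag and the second error string are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accelerator selector standing in for the real checks. */
enum accel_kind { ACCEL_KVM, ACCEL_TCG };

/*
 * Returns NULL when the requested setting is acceptable, otherwise a
 * message explaining why the capability cannot be turned on.
 */
const char *fwnmi_cap_check(bool cap_on, enum accel_kind accel,
                            bool kvm_has_fwnmi)
{
    assert(accel == ACCEL_KVM || accel == ACCEL_TCG);

    if (!cap_on) {
        return NULL; /* disabled by default, nothing to verify */
    }
    if (accel == ACCEL_TCG) {
        /* No path delivers a machine check to the guest under TCG. */
        return "No FWNMI support in TCG, try cap-fwnmi-mce=off";
    }
    if (!kvm_has_fwnmi) {
        /* Assumed message: the host kernel lacks KVM_CAP_PPC_FWNMI. */
        return "KVM does not support FWNMI, try cap-fwnmi-mce=off";
    }
    return NULL;
}
```

With this check, fwnmi_cap_check(true, ACCEL_TCG, false) returns an error instead of letting the guest register a handler that would never be called.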
Regards,
Aravinda