On Thursday 16 May 2019 07:15 AM, David Gibson wrote:
> On Tue, May 14, 2019 at 11:02:07AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Tuesday 14 May 2019 10:17 AM, David Gibson wrote:
>>> On Mon, May 13, 2019 at 04:00:43PM +0530, Aravinda Prasad wrote:
>>>>
>>>>
>>>> On Friday 10 May 2019 03:23 PM, David Gibson wrote:
>>>>> On Fri, May 10, 2019 at 12:45:29PM +0530, Aravinda Prasad wrote:
>>>>>>
>>>>>>
>>>>>> On Friday 10 May 2019 12:16 PM, David Gibson wrote:
>>>>>>> On Mon, Apr 22, 2019 at 12:33:35PM +0530, Aravinda Prasad wrote:
>>>>>>>> Enable the KVM capability KVM_CAP_PPC_FWNMI so that
>>>>>>>> the KVM causes guest exit with NMI as exit reason
>>>>>>>> when it encounters a machine check exception on the
>>>>>>>> address belonging to a guest. Without this capability
>>>>>>>> enabled, KVM redirects machine check exceptions to
>>>>>>>> guest's 0x200 vector.
>>>>>>>>
>>>>>>>> This patch also deals with the case when a guest with
>>>>>>>> the KVM_CAP_PPC_FWNMI capability enabled is attempted
>>>>>>>> to migrate to a host that does not support this
>>>>>>>> capability.
>>>>>>>>
>>>>>>>> Signed-off-by: Aravinda Prasad <aravi...@linux.vnet.ibm.com>
>>>>>>>> ---
>>>>>>>> hw/ppc/spapr.c | 1 +
>>>>>>>> hw/ppc/spapr_caps.c | 26 ++++++++++++++++++++++++++
>>>>>>>> hw/ppc/spapr_rtas.c | 14 ++++++++++++++
>>>>>>>> include/hw/ppc/spapr.h | 4 +++-
>>>>>>>> target/ppc/kvm.c | 14 ++++++++++++++
>>>>>>>> target/ppc/kvm_ppc.h | 6 ++++++
>>>>>>>> 6 files changed, 64 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>>>>> index ffd1715..44e09bb 100644
>>>>>>>> --- a/hw/ppc/spapr.c
>>>>>>>> +++ b/hw/ppc/spapr.c
>>>>>>>> @@ -4372,6 +4372,7 @@ static void spapr_machine_class_init(ObjectClass
>>>>>>>> *oc, void *data)
>>>>>>>> smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
>>>>>>>> smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] =
>>>>>>>> SPAPR_CAP_ON;
>>>>>>>> smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_OFF;
>>>>>>>> + smc->default_caps.caps[SPAPR_CAP_FWNMI_MCE] = SPAPR_CAP_OFF;
>>>>>>>> spapr_caps_add_properties(smc, &error_abort);
>>>>>>>> smc->irq = &spapr_irq_xics;
>>>>>>>> smc->dr_phb_enabled = true;
>>>>>>>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>>>>>>>> index edc5ed0..5b3af04 100644
>>>>>>>> --- a/hw/ppc/spapr_caps.c
>>>>>>>> +++ b/hw/ppc/spapr_caps.c
>>>>>>>> @@ -473,6 +473,22 @@ static void
>>>>>>>> cap_ccf_assist_apply(SpaprMachineState *spapr, uint8_t val,
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> +static void cap_fwnmi_mce_apply(SpaprMachineState *spapr, uint8_t val,
>>>>>>>> + Error **errp)
>>>>>>>> +{
>>>>>>>> + PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
>>>>>>>> +
>>>>>>>> + if (!val) {
>>>>>>>> + return; /* Disabled by default */
>>>>>>>> + }
>>>>>>>> +
>>>>>>>> + if (kvm_enabled()) {
>>>>>>>> + if (kvmppc_fwnmi_enable(cpu)) {
>>>>>>>> + error_setg(errp, "Requested fwnmi capability not support
>>>>>>>> by KVM");
>>>>>>>> + }
>>>>>>>> + }
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> SpaprCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
>>>>>>>> [SPAPR_CAP_HTM] = {
>>>>>>>> .name = "htm",
>>>>>>>> @@ -571,6 +587,15 @@ SpaprCapabilityInfo
>>>>>>>> capability_table[SPAPR_CAP_NUM] = {
>>>>>>>> .type = "bool",
>>>>>>>> .apply = cap_ccf_assist_apply,
>>>>>>>> },
>>>>>>>> + [SPAPR_CAP_FWNMI_MCE] = {
>>>>>>>> + .name = "fwnmi-mce",
>>>>>>>> + .description = "Handle fwnmi machine check exceptions",
>>>>>>>> + .index = SPAPR_CAP_FWNMI_MCE,
>>>>>>>> + .get = spapr_cap_get_bool,
>>>>>>>> + .set = spapr_cap_set_bool,
>>>>>>>> + .type = "bool",
>>>>>>>> + .apply = cap_fwnmi_mce_apply,
>>>>>>>> + },
>>>>>>>> };
>>>>>>>>
>>>>>>>> static SpaprCapabilities default_caps_with_cpu(SpaprMachineState
>>>>>>>> *spapr,
>>>>>>>> @@ -706,6 +731,7 @@ SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
>>>>>>>> SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
>>>>>>>> SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
>>>>>>>> SPAPR_CAP_MIG_STATE(ccf_assist, SPAPR_CAP_CCF_ASSIST);
>>>>>>>> +SPAPR_CAP_MIG_STATE(fwnmi, SPAPR_CAP_FWNMI_MCE);
>>>>>>>>
>>>>>>>> void spapr_caps_init(SpaprMachineState *spapr)
>>>>>>>> {
>>>>>>>> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
>>>>>>>> index d3499f9..997cf19 100644
>>>>>>>> --- a/hw/ppc/spapr_rtas.c
>>>>>>>> +++ b/hw/ppc/spapr_rtas.c
>>>>>>>> @@ -49,6 +49,7 @@
>>>>>>>> #include "hw/ppc/fdt.h"
>>>>>>>> #include "target/ppc/mmu-hash64.h"
>>>>>>>> #include "target/ppc/mmu-book3s-v3.h"
>>>>>>>> +#include "kvm_ppc.h"
>>>>>>>>
>>>>>>>> static void rtas_display_character(PowerPCCPU *cpu, SpaprMachineState
>>>>>>>> *spapr,
>>>>>>>> uint32_t token, uint32_t nargs,
>>>>>>>> @@ -354,6 +355,7 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>>>>>>>> target_ulong args,
>>>>>>>> uint32_t nret, target_ulong rets)
>>>>>>>> {
>>>>>>>> + int ret;
>>>>>>>> uint64_t rtas_addr = spapr_get_rtas_addr();
>>>>>>>>
>>>>>>>> if (!rtas_addr) {
>>>>>>>> @@ -361,6 +363,18 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>>>>>>>> return;
>>>>>>>> }
>>>>>>>>
>>>>>>>> + ret = kvmppc_fwnmi_enable(cpu);
>>>>>>>
>>>>>>> You shouldn't need this here as well as in cap_fwnmi_mce_apply().
>>>>>>>
>>>>>>> Instead, you should unconditionally fail the nmi-register if the
>>>>>>> capability is not enabled.
>>>>>>
>>>>>> cap_fwnmi is not enabled by default, because if it is enabled by default
>>>>>> them KVM will start routing machine check exceptions via guest exit
>>>>>> instead of routing it to guest's 0x200.
>>>>>>
>>>>>> During early boot since guest has not yet issued nmi-register, KVM is
>>>>>> expected to route exceptions to 0x200. Therefore we enable cap_fwnmi
>>>>>> only when a guest issues nmi-register.
>>>>>
>>>>> Except that's not true - you enable it in cap_fwnmi_mce_apply() which
>>>>> will be executed whenever the machine capability is enabled.
>>>>
>>>> I enable cap_fwnmi in cap_fwnmi_mce_apply() only when the "val" argument
>>>> (which is the effective cap value) is set. In early boot "val" is not
>>>> set as cap_fwnmi by default is not set, hence cap_fwnmi is not
>>>> enabled.
>>>
>>> Uh.. if that's true, something else is horribly wrong. SPAPR caps are
>>> designed to have a fixed value for the lifetime of the VM. Otherwise
>>> they will fail in their purpose of making sure we have a consistent
>>> environment across migrations. So if the 'val' changes after the
>>> first call to apply(), then something is broken.
>>
>> If SPAPR caps are initialized before boot that do not change later, then
>> for cap_fwnmi, the default value is off at boot and the cap is enabled
>> only when guest issues "ibm,nmi-register" call. Should SPAPR caps be
>> updated when "ibm,nmi-register" is called?
>
> So the confusing thing here is that there are spapr machine caps, and
> those are separate from the KVM caps for the VM. Then the KVM caps
> also have whether the cap is possible and whether it is current
> activated.
>
> The spapr machine caps *must* remain static for the VM's lifetime and
> only cover possibilities, not runtime configuration. KVM caps may be
> activated as necessary.
>
> So you can leave activating the KVM cap until nmi-register. But if
> the spapr cap is disabled you must prohibit nmi-register.
>
> The apply() functions are responsible for checking if the spapr caps
> are possible on the KVM implementation. So if cap_fwnmi_mci_apply()
> is called with val==1 and KVM doesn't support the fwnmi extensions, it
> must fail outright.
I see, this clears my confusion on spapr machine caps and KVM caps..
>
>>>> My understanding is that, cap_fwnmi_mce_apply() is also called during
>>>> migration on the target machine.
>>>
>>> Only in the sense that the machine is initialized before processing
>>> the incoming migration. The capability values must be equal on either
>>> side of the migration (that's checked elsewhere). Well, actually,
>>> you're allowed to increase the cap value across a migration, just not
>>> decrease it.
>>
>> Ah.. ok.. I am still familiarizing myself with the migration code..
>>
>>>
>>>> If effective cap for cap_fwnmi is
>>>> enabled on source machine than I think "val" will be set when
>>>> cap_fwnmi_mce_apply() is called on target machine.
>>>
>>> Nope. The migrated value of the cap will be *validated* against the
>>> value set on the destination setup, but it won't *alter* the value on
>>> the destination (the result is that you have it enabled on the source,
>>> but not the destination, the migration will fail).
>>
>> But if cap_fwnmi is set on the host, which function is responsible to
>
> I'm not sure what you mean by "the host" here.
>
>> enable it on the destination? I think cap_fwnmi_mce_apply() is
>> responsible for enabling it on the destination.
>
> Enabling the spapr cap? It is set based on the command line and
> machine type, just like on the source machine.
>
>> If that is the case
>> cap_fwnmi_mce_apply() should know if cap_fwnmi is set on the host and
>> the only way it can check that is based on the "val" argument passed on
>> to it.
>>
>> Or am I missing something here?
>
> Probably, but I'm not sure exactly what.
>
> The val argument to apply() is set to the value of the spapr cap.
> This is based on the qemu command line and machine type, and must be
> the same on source and destination. As a general rule, qemu requires
> that the same machine options are used on source and destination.
Please ignore my previous statements this was made without clear
understanding of spapr machine caps and KVM caps.
I will resend the patches with the modifications.
>
--
Regards,
Aravinda