Amit Machhiwal <[email protected]> writes:

> On IBM POWER systems, newer processor generations can operate in
> compatibility modes corresponding to earlier generations. This becomes
> relevant for nested virtualization, where nested KVM guests may need to
> run with a specific processor compatibility level.
>
> Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> logical partition (L1) booted in Power10 compatibility mode, the guest
> fails to boot while setting 'arch_compat'. This happens because the CPU
> class is derived from the hardware PVR (via mfspr()), which reflects the
> physical processor generation (Power11), rather than the effective
> compatibility mode (Power10).
>
> As a result, userspace may request a Power11 arch_compat for the L2
> guest. However, the L1 partition, running in Power10 compatibility, has
> only negotiated support up to Power10 with the Power Hypervisor (L0).
> When H_SET_STATE is invoked with a Power11 Logical PVR, the hypervisor

s/H_SET_STATE/H_GUEST_SET_STATE 

> rejects the request, leading to a late guest boot failure:
>
>   KVM-NESTEDv2: couldn't set guest wide elements
>   [..KVM reg dump..]
>

I think irrespective of the other UAPI changes, we should still get this
fixed - so that we don't see a late KVM guest boot failure msgs.

So, in this review, I would like to mainly look at fixing this issue
first and would request if we can defer the UAPI changes as a separate
patch series please.


> This situation should be detected earlier. Rejecting unsupported
> 'arch_compat' values in 'kvmppc_set_arch_compat()' avoids issuing an
> invalid H_SET_STATE hcall and provides a clearer failure mode.

s/H_SET_STATE/H_GUEST_SET_STATE

>
> Add a check to reject Power11 'arch_compat' requests when the host is
> running in Power10 compatibility mode, returning -EINVAL early instead
> of deferring the failure to the hypervisor.
>
> Suggested-by: Vaibhav Jain <[email protected]>
> Tested-by: Anushree Mathur <[email protected]>
> Signed-off-by: Amit Machhiwal <[email protected]>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 61dbeea317f3..249d1f2e4e2c 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -446,7 +446,19 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, 
> u32 arch_compat)
>                       guest_pcr_bit = PCR_ARCH_300;
>                       break;
>               case PVR_ARCH_31:
> +                     guest_pcr_bit = PCR_ARCH_31;
> +                     break;
>               case PVR_ARCH_31_P11:
> +                     /*
> +                      * Need to check this for ISA 3.1, as Power10 and
> +                      * Power11 share the same PCR. For any subsequent ISA
> +                      * versions, this will be taken care of by the guest vs
> +                      * host PCR comparison below.
> +                      */
> +                     if ((PVR_ARCH_31 & cur_cpu_spec->pvr_mask) ==
> +                             cur_cpu_spec->pvr_value) {
> +                             return -EINVAL;
> +                     }

Instead of the complicated check can we simply do this?
                        if (!cpu_has_feature(CPU_FTR_P11_PVR))
                                return -EINVAL;

which means that if the Qemu is trying to set the arch_compat with P11
PVR (arch_compat) and if the host cpu FTR doesn't support P11 PVR, then
simply return -EINVAL

-ritesh


Reply via email to