On POWER systems, the host CPU may run in a compatibility mode (e.g., a Power11 processor operating in Power10 compatibility mode). In such cases, the effective CPU level exposed to guests differs from the physical processor generation.

When running nested KVM guests, QEMU derives the host CPU type using
mfpvr(), which reflects the physical processor version. This can result
in a mismatch between the CPU model selected by QEMU and the
compatibility mode enforced by the host, leading to guest boot failures.

For example, booting a nested guest on a Power11 LPAR configured in
Power10 compatibility mode fails with:

  KVM-NESTEDv2: couldn't set guest wide elements
  [..KVM reg dump..]

This occurs because QEMU selects a CPU model corresponding to the
physical processor (via mfpvr()), while the host operates in a lower
compatibility mode. As a result, KVM rejects the requested compatibility
level during guest initialization.

On pseries nestedv2 systems, add support for retrieving host CPU
compatibility capabilities for nested guests on PowerVM. The capability
bitmap reflects the processor modes negotiated between the Power
hypervisor (L0) and the host partition (L1) via the
H_GUEST_GET_CAPABILITIES hcall, but is retrieved from the cached
nested_capabilities value populated during module initialization,
avoiding repeated hypervisor calls. A WARN_ON_ONCE() flags the
unexpected case where nested_capabilities is zero on a nestedv2 system.

The implementation defines KVM-specific capability constants
(KVM_PPC_COMPAT_CAP_POWER9/10/11), masks unsupported bits, and exposes
the result through the KVM_PPC_GET_COMPAT_CAPS ioctl.

Hook the implementation into the Book3S HV kvmppc_ops so that it can be
invoked by the generic KVM ioctl handling code.

Suggested-by: Vaibhav Jain <[email protected]>
Signed-off-by: Amit Machhiwal <[email protected]>
---
Changes in this version:
- Updated PowerVM implementation to use cached nested_capabilities
  instead of making a live H_GUEST_GET_CAPABILITIES hcall on every
  ioctl call
- Added WARN_ON_ONCE(!nested_capabilities); sanity check when
  nested_capabilities is unexpectedly zero on a nestedv2 system

 arch/powerpc/include/uapi/asm/kvm.h | 10 ++++++++++
 arch/powerpc/kvm/book3s_hv.c        | 20 ++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 19e53d5ae540..913a64b901a3 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -445,6 +445,16 @@ struct kvm_ppc_compat_caps {
 };
 #define KVM_PPC_COMPAT_CAPS_SIZE_VER0	24 /* sizeof first published struct */
 
+/*
+ * Capability bits for compat_capabilities field in kvm_ppc_compat_caps.
+ * These bits indicate which processor compatibility modes are supported.
+ */
+#define KVM_PPC_COMPAT_CAP_POWER9	(1ULL << 62)
+#define KVM_PPC_COMPAT_CAP_POWER10	(1ULL << 61)
+#define KVM_PPC_COMPAT_CAP_POWER11	(1ULL << 60)
+#define KVM_PPC_COMPAT_BITMASK		(KVM_PPC_COMPAT_CAP_POWER9 | \
+					 KVM_PPC_COMPAT_CAP_POWER10 | \
+					 KVM_PPC_COMPAT_CAP_POWER11)
 /*
  * Values for character and character_mask.
  * These are identical to the values used by H_GET_CPU_CHARACTERISTICS.
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f9380ef65750..152cd08a5b38 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -6523,6 +6523,25 @@ static bool kvmppc_hash_v3_possible(void)
 	return true;
 }
 
+
+static int kvmppc_get_compat_caps(struct kvm_ppc_compat_caps *host_caps)
+{
+	unsigned long capabilities = 0;
+	long rc = -EINVAL;
+
+	if (kvmhv_on_pseries()) {
+		if (kvmhv_is_nestedv2()) {
+			WARN_ON_ONCE(!nested_capabilities);
+			capabilities = nested_capabilities;
+			rc = 0;
+		}
+	}
+
+	host_caps->compat_capabilities = capabilities & KVM_PPC_COMPAT_BITMASK;
+
+	return rc;
+}
+
 static struct kvmppc_ops kvm_ops_hv = {
 	.get_sregs = kvm_arch_vcpu_ioctl_get_sregs_hv,
 	.set_sregs = kvm_arch_vcpu_ioctl_set_sregs_hv,
@@ -6565,6 +6584,7 @@ static struct kvmppc_ops kvm_ops_hv = {
 	.hash_v3_possible = kvmppc_hash_v3_possible,
 	.create_vcpu_debugfs = kvmppc_arch_create_vcpu_debugfs_hv,
 	.create_vm_debugfs = kvmppc_arch_create_vm_debugfs_hv,
+	.get_compat_caps = kvmppc_get_compat_caps,
 };
 
 static int kvm_init_subcore_bitmap(void)
-- 
2.50.1 (Apple Git-155)
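
For illustration only, a minimal sketch of how a consumer of the
compat_capabilities bitmap might decode it. The bit values are copied
from the uapi hunk above; the helper compat_caps_highest_level() is
hypothetical and not part of this patch, and the sketch takes the bitmap
as a plain integer instead of issuing the KVM_PPC_GET_COMPAT_CAPS ioctl,
since the full layout of struct kvm_ppc_compat_caps is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the uapi hunk in this patch. */
#define KVM_PPC_COMPAT_CAP_POWER9	(1ULL << 62)
#define KVM_PPC_COMPAT_CAP_POWER10	(1ULL << 61)
#define KVM_PPC_COMPAT_CAP_POWER11	(1ULL << 60)
#define KVM_PPC_COMPAT_BITMASK		(KVM_PPC_COMPAT_CAP_POWER9 | \
					 KVM_PPC_COMPAT_CAP_POWER10 | \
					 KVM_PPC_COMPAT_CAP_POWER11)

/* The three mode bits are adjacent and occupy bits 60..62. */
static_assert((KVM_PPC_COMPAT_BITMASK >> 60) == 0x7ULL,
	      "compat mode bits are expected in bits 60..62");

/*
 * Hypothetical helper (not part of the patch): return the newest
 * compatibility level advertised in caps as 11, 10 or 9, or 0 when
 * none of the known mode bits is set. Unknown bits are ignored, in the
 * same way the kernel side masks with KVM_PPC_COMPAT_BITMASK.
 */
static int compat_caps_highest_level(uint64_t caps)
{
	caps &= KVM_PPC_COMPAT_BITMASK;

	if (caps & KVM_PPC_COMPAT_CAP_POWER11)
		return 11;
	if (caps & KVM_PPC_COMPAT_CAP_POWER10)
		return 10;
	if (caps & KVM_PPC_COMPAT_CAP_POWER9)
		return 9;

	return 0;
}
```

Assuming a Power11 LPAR in Power10 compatibility mode reports only the
POWER9 and POWER10 bits, the helper would return 10 for that bitmap,
which is the level a VMM could use in place of the mfpvr()-derived one.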
