Currently, when booting a compatibility-mode KVM guest (L1) on a PowerNV
hypervisor (L0), the guest runs with the expected processor compatibility
level. However, when booting a nested KVM guest (L2) inside the L1, QEMU
derives the CPU model from the raw host PVR and attempts to run the nested
guest at that level, instead of honoring the compatibility mode of the L1.

Extend host CPU compatibility capability reporting to support nested
virtualization on PowerNV systems (PAPR nested API v1).

For nested API v2 (PowerVM), compatibility capabilities are served from
the cached nested_capabilities value (populated at module init via
kvmhv_nested_init() using the H_GUEST_GET_CAPABILITIES hcall). This
information is not available on PowerNV systems.

For nested API v1, derive the compatibility capabilities from the L1 guest
by reading the "cpu-version" property from the device tree, which reflects
the effective (logical) processor compatibility level. Map this value to
the corresponding compatibility capability bitmap using KVM-specific
constants. The mapping is cumulative: a system running at a given
compatibility level is assumed to also support older generations down the
supported chain.

Note that unlike KVM on PowerVM (nested API v2), KVM on PowerNV currently
does not strictly enforce older generation compatibility modes for nested
guests - the reported capabilities reflect what the host CPU can present,
not what the hypervisor independently validates.

Introduce a helper kvmppc_map_compat_capabilities() to translate CPU
version values into KVM_PPC_COMPAT_CAP bits using a fallthrough switch,
and integrate it into kvmppc_get_compat_caps(). The implementation applies
masking to ensure only supported processor modes are exposed.

This allows userspace to query host CPU compatibility modes on both KVM on
PowerVM and on PowerNV platforms via the KVM_PPC_GET_COMPAT_CAPS ioctl.
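
The cumulative mapping described above can be illustrated outside the
kernel with a minimal userspace sketch. Everything prefixed DEMO_ or demo_
is hypothetical: the PVR values are assumed to mirror the logical PVRs in
arch/powerpc/include/asm/reg.h and should be checked against the tree, and
the capability bit positions are placeholders, not the real
KVM_PPC_COMPAT_CAP_* values.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed logical PVR values; verify against asm/reg.h before relying on them. */
#define DEMO_PVR_ARCH_300	0x0f000005u
#define DEMO_PVR_ARCH_31	0x0f000006u
#define DEMO_PVR_ARCH_31_P11	0x0f000007u

/* Placeholder bit positions, NOT the real KVM_PPC_COMPAT_CAP_* values. */
#define DEMO_CAP_POWER9		(1ul << 0)
#define DEMO_CAP_POWER10	(1ul << 1)
#define DEMO_CAP_POWER11	(1ul << 2)

/*
 * Translate a CPU-endian "cpu-version" value into a cumulative bitmap:
 * each case sets its own bit and then falls through to every older
 * generation, so a newer level implies all older supported levels.
 * Returns 0 on success, -EINVAL for an unrecognised value (in which
 * case *caps is left untouched).
 */
int demo_map_compat_caps(uint32_t cpu_version, unsigned long *caps)
{
	switch (cpu_version) {
	case DEMO_PVR_ARCH_31_P11:
		*caps |= DEMO_CAP_POWER11;
		/* fall through */
	case DEMO_PVR_ARCH_31:
		*caps |= DEMO_CAP_POWER10;
		/* fall through */
	case DEMO_PVR_ARCH_300:
		*caps |= DEMO_CAP_POWER9;
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
```

With these placeholder bits, an L1 running at the POWER10 level would
report POWER10 and POWER9 but not POWER11, and any level older than
POWER9 is rejected with -EINVAL rather than reported as an empty bitmap.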

Suggested-by: Vaibhav Jain <[email protected]>
Signed-off-by: Amit Machhiwal <[email protected]>
---
Changes in this version:
- Converted switch in kvmppc_map_compat_capabilities() to use fallthrough
  for cumulative compat mode reporting
- Check for 'rc' error before assigning 'capabilities' to
  'host_caps->compat_capabilities'
- Call of_node_put(np) before break in for_each_node_by_type() loop to
  avoid leaking the OF node reference

 arch/powerpc/kvm/book3s_hv.c | 38 ++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 152cd08a5b38..ba4b2b3aaf4e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -6523,20 +6523,58 @@ static bool kvmppc_hash_v3_possible(void)
 	return true;
 }
 
+static int kvmppc_map_compat_capabilities(u32 cpu_version,
+					  unsigned long *capabilities)
+{
+	switch (cpu_version) {
+	case PVR_ARCH_31_P11:
+		*capabilities |= KVM_PPC_COMPAT_CAP_POWER11;
+		fallthrough;
+	case PVR_ARCH_31:
+		*capabilities |= KVM_PPC_COMPAT_CAP_POWER10;
+		fallthrough;
+	case PVR_ARCH_300:
+		*capabilities |= KVM_PPC_COMPAT_CAP_POWER9;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
 
 static int kvmppc_get_compat_caps(struct kvm_ppc_compat_caps *host_caps)
 {
+	struct device_node *np;
 	unsigned long capabilities = 0;
+	const __be32 *prop = NULL;
 	long rc = -EINVAL;
+	u32 cpu_version;
 
 	if (kvmhv_on_pseries()) {
 		if (kvmhv_is_nestedv2()) {
 			WARN_ON_ONCE(!nested_capabilities);
 			capabilities = nested_capabilities;
 			rc = 0;
+		} else {
+			for_each_node_by_type(np, "cpu") {
+				prop = of_get_property(np, "cpu-version", NULL);
+				if (prop) {
+					cpu_version = be32_to_cpup(prop);
+					of_node_put(np);
+					break;
+				}
+			}
+			if (!prop)
+				return -EINVAL;
+			rc = kvmppc_map_compat_capabilities(cpu_version,
+							    &capabilities);
 		}
 	}
 
+	if (rc < 0)
+		return rc;
+
 	host_caps->compat_capabilities = capabilities & KVM_PPC_COMPAT_BITMASK;
 
 	return rc;
-- 
2.50.1 (Apple Git-155)
