On 5/16/2025 9:33 PM, Sean Christopherson wrote:
> This shortlog is unnecessarily confusing. It reads as if support for running
> L2 in a vCPU with a mediated PMU is somehow lacking.
>
> On Mon, Mar 24, 2025, Mingwei Zhang wrote:
>> Add nested virtualization support for mediated PMU by combining the MSR
>> interception bitmaps of vmcs01 and vmcs12.
>
> Do not phrase changelogs related to nested virtualization in terms of enabling
> a _KVM_ feature. KVM has no control over what hypervisor runs in L1. It's
> a-ok to provide example use cases, but they need to be just that, examples.
>
>> Readers may argue that even without this patch, nested virtualization works
>> for mediated PMU, because L1 will see Perfmon v2 and will have to use the
>> legacy vPMU implementation if it is Linux. However, any assumption made
>> about L1 may be invalid, e.g., L1 may not even be Linux.
>>
>> If both L0 and L1 pass through PMU MSRs, the correct behavior is to let MSR
>> accesses from L2 directly touch the HW MSRs, since both L0 and L1 pass
>> through the access.
>>
>> However, in the current implementation, without adding anything for nested,
>> KVM always sets the MSR interception bits in vmcs02. As a result, L0
>> emulates all MSR reads/writes for L2, leading to errors, since the current
>> mediated vPMU never implements set_msr() and get_msr() for any counter
>> access except counter accesses from the VMM side.
>>
>> So fix the issue by setting up the correct MSR interception for PMU MSRs.
>
> This is not a fix.
>
>   KVM: nVMX: Disable PMU MSR interception as appropriate while running L2
>
>   Merge KVM's PMU MSR interception bitmaps with those of L1, i.e. merge the
>   bitmaps of vmcs01 and vmcs12, e.g. so that KVM doesn't interpose on MSR
>   accesses unnecessarily if L1 exposes a mediated PMU (or equivalent) to L2.
Sure, thanks.

>
>> Signed-off-by: Mingwei Zhang <mizh...@google.com>
>> Co-developed-by: Dapeng Mi <dapeng1...@linux.intel.com>
>> Signed-off-by: Dapeng Mi <dapeng1...@linux.intel.com>
>> ---
>>  arch/x86/kvm/vmx/nested.c | 32 ++++++++++++++++++++++++++++++++
>>  1 file changed, 32 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index cf557acf91f8..dbec40cb55bc 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -626,6 +626,36 @@ static inline void nested_vmx_set_intercept_for_msr(struct vcpu_vmx *vmx,
>>  #define nested_vmx_merge_msr_bitmaps_rw(msr) \
>>  	nested_vmx_merge_msr_bitmaps(msr, MSR_TYPE_RW)
>>
>> +/*
>> + * Disable PMU MSRs interception for nested VM if L0 and L1 are
>> + * both mediated vPMU.
>> + */
>
> Again, KVM has no idea what is running in L1. Drop this.

Yes, thanks.

>
>> +static void nested_vmx_merge_pmu_msr_bitmaps(struct kvm_vcpu *vcpu,
>> +					     unsigned long *msr_bitmap_l1,
>> +					     unsigned long *msr_bitmap_l0)
>> +{
>> +	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +	int i;
>> +
>> +	if (!kvm_mediated_pmu_enabled(vcpu))
>
> This is a worthwhile check, but a comment would be helpful:
>
> 	/*
> 	 * Skip the merges if the vCPU doesn't have a mediated PMU MSR, i.e. if
> 	 * none of the MSRs can possibly be passed through to L1.
> 	 */
> 	if (!kvm_vcpu_has_mediated_pmu(vcpu))
> 		return;

Sure, thanks.

>
>> +		return;
>> +
>> +	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
>> +		nested_vmx_merge_msr_bitmaps_rw(MSR_ARCH_PERFMON_EVENTSEL0 + i);
>
> This is unnecessary, KVM always intercepts event selectors.

Yes.

>
>> +		nested_vmx_merge_msr_bitmaps_rw(MSR_IA32_PERFCTR0 + i);
>> +		nested_vmx_merge_msr_bitmaps_rw(MSR_IA32_PMC0 + i);
>> +	}
>> +
>> +	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
>> +		nested_vmx_merge_msr_bitmaps_rw(MSR_CORE_PERF_FIXED_CTR0 + i);
>> +
>> +	nested_vmx_merge_msr_bitmaps_rw(MSR_CORE_PERF_FIXED_CTR_CTRL);
>
> Same thing here.

Yes.
>
>> +	nested_vmx_merge_msr_bitmaps_rw(MSR_CORE_PERF_GLOBAL_CTRL);
>> +	nested_vmx_merge_msr_bitmaps_read(MSR_CORE_PERF_GLOBAL_STATUS);
>> +	nested_vmx_merge_msr_bitmaps_write(MSR_CORE_PERF_GLOBAL_OVF_CTRL);
>> +}