On Wed, Jan 21, 2026 at 5:21 PM Yosry Ahmed <[email protected]> wrote:
>
> On Thu, Jan 15, 2026 at 03:21:40PM -0800, Jim Mattson wrote:
> > When the vCPU is in guest mode with nested NPT enabled, guest accesses to
> > IA32_PAT are redirected to the gPAT register, which is stored in
> > vmcb02->save.g_pat.
> >
> > Non-guest accesses (e.g. from userspace) to IA32_PAT are always redirected
> > to hPAT, which is stored in vcpu->arch.pat.
> >
> > This is architected behavior. It also makes it possible to restore a new
> > checkpoint on an old kernel with reasonable semantics. After the restore,
> > gPAT will be lost, and L2 will run on L1's PAT. Note that the old kernel
> > would have always run L2 on L1's PAT.
>
> This creates a difference in MSR_IA32_CR_PAT handling with nested SVM
> and nested VMX, right?

Correct.

> AFAICT, reading MSR_IA32_CR_PAT while an L2 VMX guest is running will
> read L2's PAT. With this change, the same scenario on SVM will read L1's
> PAT. We can claim that it was always L1's PAT though, because we've
> always been running L2 with L1's PAT.
>
> I am just raising this in case it's a problem to have different behavior
> for SVM and VMX. I understand that we need to do this to be able to
> save/restore L1's PAT with SVM in guest mode and maintain backward
> compatibility.
>
> IIUC VMX does not have the same issue because host and guest PAT are
> both in vmcs12 and are both saved/restored appropriately.

Correct.

> >
> > Signed-off-by: Jim Mattson <[email protected]>
> > ---
> >  arch/x86/kvm/svm/svm.c | 31 ++++++++++++++++++++++++-------
> >  1 file changed, 24 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 7041498a8091..3f8581adf0c1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2846,6 +2846,13 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct 
> > msr_data *msr_info)
> >       case MSR_AMD64_DE_CFG:
> >               msr_info->data = svm->msr_decfg;
> >               break;
> > +     case MSR_IA32_CR_PAT:
> > +             if (!msr_info->host_initiated && is_guest_mode(vcpu) &&
> > +                 nested_npt_enabled(svm))
> > +                     msr_info->data = svm->vmcb->save.g_pat; /* gPAT */
> > +             else
> > +                     msr_info->data = vcpu->arch.pat; /* hPAT */
> > +             break;
> >       default:
> >               return kvm_get_msr_common(vcpu, msr_info);
> >       }
> > @@ -2929,14 +2936,24 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, 
> > struct msr_data *msr)
> >
> >               break;
> >       case MSR_IA32_CR_PAT:
> > -             ret = kvm_set_msr_common(vcpu, msr);
> > -             if (ret)
> > -                     break;
> > +             if (!kvm_pat_valid(data))
> > +                     return 1;
> >
> > -             svm->vmcb01.ptr->save.g_pat = data;
>
> This is a bug fix, L2 is now able to alter L1's PAT, right?

Yes, L1 and L2 share a PAT today. The whole series is fixing that mistake.

> > -             if (is_guest_mode(vcpu))
> > -                     nested_vmcb02_compute_g_pat(svm);
> > -             vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
>
> This looks like another bug fix, it seems like we'll update vmcb01 but
> clear the clean bit in vmcb02 if we're in guest mode.

I will split this out into a separate patch, since this is tangential
to the rest of the series.

> Probably worth calling these out (and CC:stable, Fixes:.., etc)?
>
> We probably need a comment here explaining the gPAT vs hPAT case, I
> don't think it's super obvious why we only redirect L2's own accesses to
> its PAT but not userspace's.

Point taken.

> > +             if (!msr->host_initiated && is_guest_mode(vcpu) &&
> > +                 nested_npt_enabled(svm)) {
> > +                     svm->vmcb->save.g_pat = data; /* gPAT */
> > +                     vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> > +             } else {
> > +                     vcpu->arch.pat = data; /* hPAT */
>
> Should we call kvm_set_msr_common() here instead of setting
> vcpu->arch.pat? The kvm_pat_valid() call would be redundant but that
> should be fine. My main concern is if kvm_set_msr_common() gains more
> logic for MSR_IA32_CR_PAT that isn't reflected here. Probably unlikely
> tho..

There are already three open-coded assignments to vcpu->arch.pat on
the VMX side. Perhaps this should be avoided, but it's tangential to
this series.

> > +                     if (npt_enabled) {
> > +                             svm->vmcb01.ptr->save.g_pat = data;
> > +                             vmcb_mark_dirty(svm->vmcb01.ptr, VMCB_NPT);
> > +                             if (is_guest_mode(vcpu)) {
>
> IIUC (with the fix you mentioned) this is because L1 and L2 share the
> PAT if nested NPT is disabled, and if we're already in guest mode then
> we also need to update vmcb02 (as it was already created based on vmcb01
> with the old PAT). Probably worth a comment.

Noted.

> > +                                     svm->vmcb->save.g_pat = data;
> > +                                     vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
> > +                             }
> > +                     }
> > +             }
> >               break;
> >       case MSR_IA32_SPEC_CTRL:
> >               if (!msr->host_initiated &&
> > --
> > 2.52.0.457.g6b5491de43-goog
> >

Reply via email to