On Fri, 2026-06-26 at 06:40 -0700, Sean Christopherson wrote:
> On Fri, Jun 26, 2026, David Woodhouse wrote:
> > On Thu, 2026-06-25 at 16:09 -0700, Sean Christopherson wrote:
> > > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > > > index 91fd3673c09a..c16b4560c9e7 100644
> > > > --- a/arch/x86/kvm/xen.c
> > > > +++ b/arch/x86/kvm/xen.c
> > > > @@ -907,6 +907,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu,
> > > > struct kvm_xen_vcpu_attr *data)
> > > > {
> > > > int idx, r = -ENOENT;
> > > >
> > > > + /*
> > > > + * kvm_xen_write_hypercall_page() manages its own locking.
> > > > + * Handle it before taking xen_lock to avoid a deadlock.
> > >
> > > Do we actually want the side effects that necessitate taking
> > > xen.xen_lock? From
> > > a uAPI perspective, it's odd to effectively bundle
> > > KVM_XEN_ATTR_TYPE_LONG_MODE
> > > into KVM_XEN_VCPU_ATTR_TYPE_WRITE_HYPERCALL_PAGE.
> >
> > That's *guest* ABI, and it's derived from Xen behaviour. Xen will
> > 'latch' its idea of whether a guest VM is 32-bit or 64-bit, for the
> > purpose of shared data structures (shared_info page, vcpu_info,
> > runstate).
> >
> > Xen latches this from the current mode of the running vCPU in *two*
> > places:
> > • When the hypercall MSR is invoked
> > • When the guest sets the event channel GSI (HVM_PARAM_CALLBACK_IRQ).
> >
> > Thus far, the former has been handled in the kernel (in the code you're
> > looking at), while the latter is why we have the ioctl to explicitly
> > latch the guest's long_mode from userspace too, as userspace handles
> > the HVMOP_set_param calls.
>
> Right, and I'm pointing out that from a KVM uAPI perspective, bundling the
> first
> one in a "write hypercall page" call is rather odd, especially since there's
> already uAPI to handle the latching.Mostly just because they came in the opposite order. The kernel was writing the hypercall page for itself for about a decade before there was anything to latch. And userspace wasn't *involved* when it happened (initially because there was no facility for userspace intercepting MSR writes). > > > The other question is, why does kvm_xen_write_hypercall_page() drop > > > xen_lock > > > when writing guest memory? That seems odd and unnecessary. > > > > Huh? It takes the lock to do the thing that needs the lock, then drops > > it. That is not "odd and unnecessary" at all. > > > > You've been spending too long with these scope-guarded locks. > > No, I'm asking why KVM doesn't serialize the writes to guest memory. Usually > when KVM writes to guest memory, KVM is emulating something that is very much > vCPU-specific, and so if there are races it's the guest's problem to deal > with. > > The Xen MSR here is clearly VM-scoped though, which is why it feels odd to > take > a per-VM lock, and then deliberately drop the lock before completing the > operation, > In practice it shouldn't matter, since it sounds like the same repeating 16 > byte > pattern will be written every time, but it was a bit head-scratching when > reading > the code. The lock protects the long_mode latching. It wouldn't make sense to hold it around the actual writing of the hypercall page. It's not intended to be an *atomic* write, and holding some random lock around it wouldn't make it atomic with respect to all the other things that can write guest memory anyway. > We can/should return whatever kvm_vcpu_write_guest() returns, i.e. literally > return its result directly. Which of course is only ever going to be -EFAULT, > but in the extremely unlikely case that ever changes, we won't have to worry > about creating misleading behavior in the Xen code. Yes, and if we do it with a wrapper as you suggested, we can return a proper error code and the MSR handler can convert that to 0/1. As it is, we were doing the conversion the other way round, which is why we were making up error numbers for the failure cause.
smime.p7s
Description: S/MIME cryptographic signature

