On Fri, 2026-06-26 at 06:40 -0700, Sean Christopherson wrote:
> On Fri, Jun 26, 2026, David Woodhouse wrote:
> > On Thu, 2026-06-25 at 16:09 -0700, Sean Christopherson wrote:
> > > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > > > index 91fd3673c09a..c16b4560c9e7 100644
> > > > --- a/arch/x86/kvm/xen.c
> > > > +++ b/arch/x86/kvm/xen.c
> > > > @@ -907,6 +907,13 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, 
> > > > struct kvm_xen_vcpu_attr *data)
> > > >  {
> > > >         int idx, r = -ENOENT;
> > > >  
> > > > +       /*
> > > > +        * kvm_xen_write_hypercall_page() manages its own locking.
> > > > +        * Handle it before taking xen_lock to avoid a deadlock.
> > > 
> > > Do we actually want the side effects that necessitate taking 
> > > xen.xen_lock?  From
> > > a uAPI perspective, it's odd to effectively bundle 
> > > KVM_XEN_ATTR_TYPE_LONG_MODE
> > > into KVM_XEN_VCPU_ATTR_TYPE_WRITE_HYPERCALL_PAGE.
> > 
> > That's *guest* ABI, and it's derived from Xen behaviour. Xen will
> > 'latch' its idea of whether a guest VM is 32-bit or 64-bit, for the
> > purpose of shared data structures (shared_info page, vcpu_info,
> > runstate).
> > 
> > Xen latches this from the current mode of the running vCPU in *two*
> > places:
> >  • When the hypercall MSR is invoked
> >  • When the guest sets the event channel GSI (HVM_PARAM_CALLBACK_IRQ).
> > 
> > Thus far, the former has been handled in the kernel (in the code you're
> > looking at), while the latter is why we have the ioctl to explicitly
> > latch the guest's long_mode from userspace too, as userspace handles
> > the HVMOP_set_param calls.
> 
> Right, and I'm pointing out that from a KVM uAPI perspective, bundling the 
> first
> one in a "write hypercall page" call is rather odd, especially since there's
> already uAPI to handle the latching.

Mostly just because they came in the opposite order. The kernel was
writing the hypercall page for itself for about a decade before there
was anything to latch. And userspace wasn't *involved* when it happened
(initially because there was no facility for userspace intercepting MSR
writes).

> > > The other question is, why does kvm_xen_write_hypercall_page() drop 
> > > xen_lock
> > > when writing guest memory?  That seems odd and unnecessary.
> > 
> > Huh? It takes the lock to do the thing that needs the lock, then drops
> > it. That is not "odd and unnecessary" at all.
> > 
> > You've been spending too long with these scope-guarded locks.
> 
> No, I'm asking why KVM doesn't serialize the writes to guest memory.  Usually
> when KVM writes to guest memory, KVM is emulating something that is very much
> vCPU-specific, and so if there are races it's the guest's problem to deal 
> with.
> 
> The Xen MSR here is clearly VM-scoped though, which is why it feels odd to 
> take
> a per-VM lock, and then deliberately drop the lock before completing the 
> operation,
> In practice it shouldn't matter, since it sounds like the same repeating 16 
> byte
> pattern will be written every time, but it was a bit head-scratching when 
> reading
> the code.

The lock protects the long_mode latching. It wouldn't make sense to
hold it around the actual writing of the hypercall page. It's not
intended to be an *atomic* write, and holding some random lock around
it wouldn't make it atomic with respect to all the other things that
can write guest memory anyway.

> We can/should return whatever kvm_vcpu_write_guest() returns, i.e. literally
> return its result directly.  Which of course is only ever going to be -EFAULT,
> but in the extremely unlikely case that ever changes, we won't have to worry
> about creating misleading behavior in the Xen code.

Yes, and if we do it with a wrapper as you suggested, we can return a
proper error code and the MSR handler can convert that to 0/1.

As it is, we were doing the conversion the other way round, which is
why we were making up error numbers for the failure cause.

Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to