On Fri, May 28, 2021 at 11:48:51AM +0100, Julien Grall wrote:
> Hi Jan,
> 
> On 28/05/2021 11:23, Jan Beulich wrote:
> > On 28.05.2021 10:30, Roger Pau Monné wrote:
> > > On Thu, May 27, 2021 at 07:48:41PM +0100, Julien Grall wrote:
> > > > On 27/05/2021 12:28, Jan Beulich wrote:
> > > > > port_is_valid() and evtchn_from_port() are fine to use without holding
> > > > > any locks. Accordingly acquire the per-domain lock slightly later in
> > > > > evtchn_close() and evtchn_bind_vcpu().
> > > > 
> > > > So I agree that port_is_valid() and evtchn_from_port() are fine to use
> > > > without holding any locks in evtchn_bind_vcpu(). However, this is 
> > > > misleading
> > > > to say there is no problem with evtchn_close().
> > > > 
> > > > evtchn_close() can be called with current != d and therefore, there is a
> > > 
> > > The only instances evtchn_close is called with current != d and the
> > > domain could be unpaused is in free_xen_event_channel AFAICT.
> > 
> > As long as the domain is not paused, ->valid_evtchns can't ever
> > decrease: The only point where this gets done is in evtchn_destroy().
> > Hence ...
> > 
> > > > risk that port_is_valid() may be valid and then invalid because
> > > > d->valid_evtchns is decremented in evtchn_destroy().
> > > 
> > > Hm, I guess you could indeed have parallel calls to
> > > free_xen_event_channel and evtchn_destroy in a way that
> > > free_xen_event_channel could race with valid_evtchns getting
> > > decreased?
> > 
> > ... I don't see this as relevant.
> > 
> > > > Thankfully the memory is still there. So the current code is okayish 
> > > > and I
> > > > could reluctantly accept this behavior to be spread. However, I don't 
> > > > think
> > > > this should be left uncommented in both the code (maybe on top of
> > > > port_is_valid()?) and the commit message.
> > > 
> > > Indeed, I think we need some expansion of the comment in port_is_valid
> > > to clarify all this. I'm not sure I understand it properly myself when
> > > it's fine to use port_is_valid without holding the per domain event
> > > lock.
> > 
> > Because of the above property plus the fact that even if
> > ->valid_evtchns decreases, the underlying struct evtchn instance
> > will remain valid (i.e. won't get de-allocated, which happens only
> > in evtchn_destroy_final()), it is always fine to use it without
> > lock. With this I'm having trouble seeing what would need adding
> > to port_is_valid()'s commentary.
> 
> Lets take the example of free_xen_event_channel(). The function is checking
> if the port is valid. If it is, then evtchn_close() will be called.
> 
> At this point, it would be fair for a developper to assume that
> port_is_valid() will also return true in event_close().
> 
> To push to the extreme, if free_xen_event_channel() was the only caller of
> evtchn_close(), one could argue that the check in evtchn_close() could be a
> BUG_ON().
> 
> However, this can't be because this would sooner or later turn to an XSA.
> 
> Effectively, it means that is_port_valid() *cannot* be used in an
> ASSERT()/BUG_ON() and every caller should check the return even if the port
> was previously validated.

We already have cases of is_port_valid being used in ASSERTs (in the
shim) and a BUG_ON (with the domain event lock held in evtchn_close).

> So I think a comment on top of is_port_valid() would be really useful before
> someone rediscover it the hard way.

I think I'm being extremely dull here, sorry. From your text I
understand that the value returned by is_port_valid could be stale by
the time the user reads it?

I think there's some condition that makes this value stale, and it's
not the common case?

Thanks, Roger.

Reply via email to