On 04.12.2020 16:09, Julien Grall wrote: > On 04/12/2020 12:01, Jan Beulich wrote: >> On 04.12.2020 12:51, Julien Grall wrote: >>> On 04/12/2020 11:48, Jan Beulich wrote: >>>> On 04.12.2020 12:28, Julien Grall wrote: >>>>> On 03/12/2020 10:09, Jan Beulich wrote: >>>>>> On 02.12.2020 22:10, Julien Grall wrote: >>>>>>> So shouldn't we handle this issue properly in VM event? >>>>>> >>>>>> I suppose that's a question to the VM event folks rather than me? >>>>> >>>>> Yes. From my understanding of Tamas's e-mail, they are relying on the >>>>> monitoring software to do the right thing. >>>>> >>>>> I will refrain to comment on this approach. However, given the race is >>>>> much wider than the event channel, I would recommend to not add more >>>>> code in the event channel to deal with such problem. >>>>> >>>>> Instead, this should be fixed in the VM event code when someone has time >>>>> to harden the subsystem. >>>> >>>> Are effectively saying I should now undo the addition of the >>>> refcounting, which was added in response to feedback from you? >>> >>> Please point out where I made the request to use the refcounting... >> >> You didn't ask for this directly, sure, but ... >> >>> I pointed out there was an issue with the VM event code. >> >> ... this has ultimately led to the decision to use refcounting >> (iirc there was one alternative that I had proposed, besides >> the option of doing nothing). > > One other option that was discussed (maybe only on secur...@xen.org) is > to move the spinlock outside of the structure so it is always allocated.
Oh, right - forgot about that one, because that's nothing I would ever have taken on actually carrying out. >>> This was latter >>> analysed as a wider issue. The VM event folks doesn't seem to be very >>> concerned on the race, so I don't see the reason to try to fix it in the >>> event channel code. >> >> And you won't need the refcount for vpl011 then? > > I don't believe we need it for the vpl011 as the spin lock protecting > the code should always be allocated. The problem today is the lock is > initialized too late. > >> I can certainly >> drop it again, but it feels odd to go back to an earlier version >> under the circumstances ... > > The code introduced doesn't look necessary outside of the VM event code. > So I think it would be wrong to merge it if it is just papering over a > bigger problem. So to translate this to a clear course of action: You want me to go back to the earlier version by dropping the refcounting again? (I don't view this as "papering over" btw, but a tiny step towards a solution.) Jan