On Thu, Jan 09, 2020 at 12:24:05PM +0100, Roger Pau Monné wrote:
> On Thu, Jan 09, 2020 at 10:43:01AM +0100, Jan Beulich wrote:
> > On 08.01.2020 19:14, Roger Pau Monné wrote:
> > > On Wed, Jan 08, 2020 at 02:54:57PM +0100, Jan Beulich wrote:
> > >> On 08.01.2020 14:30, Roger Pau Monné  wrote:
> > >>> On Fri, Jan 03, 2020 at 01:55:51PM +0100, Jan Beulich wrote:
> > >>>> On 03.01.2020 13:34, Roger Pau Monné wrote:
> > >>>>> On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote:
> > >>>>>> On 24.12.2019 13:44, Roger Pau Monne wrote:
> > >>>>>> Further a question on lock nesting: Since the commit message
> > >>>>>> doesn't say anything in this regard, did you check there are no
> > >>>>>> TLB flush invocations with the get_cpu_maps() lock held?
> > >>>>>
> > >>>>> The CPU maps lock is a recursive one, so it should be fine to attempt
> > >>>>> a TLB flush with the lock already held.
> > >>>>
> > >>>> When already held by the same CPU - sure. It being a recursive
> > >>>> one (which I paid attention to when writing my earlier reply)
> > >>>> doesn't make it (together with any other one) immune against
> > >>>> ABBA deadlocks, though.
> > >>>
> > >>> There's no possibility of a deadlock here because get_cpu_maps does a
> > >>> trylock, so if another cpu is holding the lock the flush will just
> > >>> fallback to not using the shorthand.
> > >>
> > >> Well, with the _exact_ arrangements (flush_lock used only in one
> > >> place, and cpu_add_remove_lock only used in ways similar to how it
> > >> is used now), there's no such risk, I agree. But there's nothing
> > >> at all making sure this doesn't change. Hence, as said, at the very
> > >> least this needs reasoning about in the description (or a code
> > >> comment).
> > > 
> > > I'm afraid you will have to bear with me, but I'm still not sure how
> > > the addition of get_cpu_maps in flush_area_mask can lead to deadlocks.
> > > As said above get_cpu_maps does a trylock, which means that it will
> > > never deadlock, and that's the only way to lock cpu_add_remove_lock.
> > 
> > That's why I said "cpu_add_remove_lock only used in ways similar to
> > how it is used now" - any non-trylock addition would break the
> > assumptions you make afaict. And you can't really expect people
> > adding another (slightly different) use of an existing lock to be
> > aware they now need to check whether this introduces deadlock
> > scenarios on unrelated pre-existing code paths. Hence my request to
> > at least discuss this in the description (raising awareness, and
> > hence allowing reviewers to judge whether further precautionary
> > measures should be taken).
> 
> Thanks for the clarification, I did indeed misunderstood your
> complain.
> 
> Regarding the generalization of the shorthand and integration into
> send_IPI_mask, I've found an issue related to locking. send_IPI_mask
> is called with both interrupts enabled and disabled, and so
> get_cpu_maps panics because of the lock checking.
> 
> I however don't think that such panic is accurate: the logic in
> check_lock specifically relates to the spinning done when picking a
> lock, but I would say the call to check_lock in
> _spin_trylock{_recursive} is not strictly needed, the scenario
> described in check_lock doesn't apply when using trylock.

Never mind, this is not actually true. You can still trigger the
deadlock if you mix trylock with regular locking, so I guess I will
leave the shorthand usage to flush_area_mask unless I can make all the
callers of send_IPI_mask use the same irq status.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to