On March 11, 2016 11:25am, <men...@cis.upenn.edu> wrote: > On Wed, Mar 9, 2016 at 8:17 AM, Quan Xu <quan...@intel.com> wrote: > > pcidevs_lock should be held with interrupt enabled. However there > > remains an exception in AMD IOMMU code, where the lock is acquired > > with interrupt disabled. This inconsistency might lead to deadlock. > > Why will this inconsistency lead to deadlock? > I understand the difference between spin_lock_irqsave(), which disable > interrupt, > and spin_lock(), which allows interrupt, however, I didn't get why misuse the > spin_lock_irqsave() at the place of spin_lock() could potentially lead to a > deadlock?
For above 2 questions: Mix is just one of _possible_ scenarios(IMO, not 100% leading to a deadlock): -where an interrupt tries to lock an already locked variable. This is ok if the other interrupt happens on another CPU, but it is _not_ ok if the interrupt happens on the same CPU that already holds the lock I got it from the bellow 2 points: 1).As Jan mentioned, The implication from disabling interrupts while acquiring a lock is that the lock is also being acquired by some interrupt handler. If you mix acquire types, the one not disabling interrupts is prone to be interrupted, and the interrupt trying to get hold of the lock the same CPU already owns. 2). Comment inside check_lock(), we partition locks into IRQ-safe (always held with IRQs disabled) and IRQ-unsafe (always held with IRQs enabled) types. The convention for every lock must be consistently observed else we can deadlock in IRQ-context rendezvous functions (__a rendezvous which gets every CPU into IRQ context before any CPU is released from the rendezvous__). If we can mix IRQ-disabled and IRQ-enabled callers, the following can happen: * Lock is held by CPU A, with IRQs enabled * CPU B is spinning on same lock, with IRQs disabled * Rendezvous starts -- CPU A takes interrupt and enters rendezbous spin * DEADLOCK -- CPU B will never enter rendezvous, CPU A will never exit the rendezvous, and will hence never release the lock. To guard against this subtle bug we latch the IRQ safety of every spinlock in the system, on first use. > Would you minding pointing me to somewhere I can find the reason or > enlighten me? > Some reference material: * (look at "Lesson 3: spinlocks revisited.") https://www.kernel.org/doc/Documentation/locking/spinlocks.txt * comment inside check_lock(), in xen/common/spinlock.c, in Xen's codebase. * the "Linux Device Drivers, 3rd Edition" book , http://www.makelinux.net/ldd3/chp-5-sect-5 Quan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel