On Wed, Sep 14, 2022 at 12:13:49PM +0200, Jan Beulich wrote:
> On 14.09.2022 11:13, Roger Pau Monné wrote:
> > On Wed, Sep 14, 2022 at 10:31:34AM +0200, Jan Beulich wrote:
> >> On 14.09.2022 10:14, Jan Beulich wrote:
> >>> On 13.09.2022 16:50, Roger Pau Monné wrote:
> >>>> On Mon, Dec 13, 2021 at 04:12:55PM +0100, Jan Beulich wrote:
> >>>>> show_hvm_stack() requires interrupts to be enabled to avoids triggering
> >>>>> the consistency check in check_lock() for the p2m lock. To do so in
> >>>>> spurious_interrupt() requires adding reentrancy protection / handling
> >>>>> there.
> >>>>
> >>>> There's also an ASSERT(!in_irq()) in _percpu_write_lock() that will
> >>>> trigger when trying to acquire the p2m lock from spurious_interrupt()
> >>>> context, as p2m_lock() -> mm_write_lock() -> _mm_write_lock ->
> >>>> percpu_write_lock().
> >>>
> >>> s/will/may/ since spurious_interrupt() doesn't itself use irq_enter(),
> > 
> > do_IRQ() does call irq_enter(), and that's the caller of
> > spurious_interrupt() AFAICT.
> 
> Hmm, you're right. I was mislead by smp_call_function_interrupt()
> explicitly using irq_{enter,exit}(). I guess that should have been
> removed in b57458c1d02b ("x86: All vectored interrupts go through
> do_IRQ()"). I guess I need to either open-code the variant of in_irq()
> I'd need, or (perhaps better for overall state) explicitly irq_exit()
> before the check and irq_enter() after the call. Thoughts?

Well, it's ugly but it's likely the easier way to get this working.

> >>> but yes - we could nest inside a lower priority interrupt. I'll make
> >>> local_irq_enable() depend on !in_irq().
> >>
> >> Upon further thought I guess more precautions are necessary: We might
> >> have interrupted code holding the P2M lock already, and we might also
> >> have interrupted code holding another MM lock precluding acquiring of
> >> the P2M lock. All of this probably plays into Andrew's concerns, yet
> >> still I don't view it as a viable route to omit the stack dump for HVM
> >> domains, and in particular for PVH Dom0. Sadly I can't think of any
> >> better approach ...
> > 
> > Yes, I also had those concerns.  The mm locks are recursive, but
> > spurious_interrupt() hitting in the middle of code already holding any
> > mm lock is likely to end up triggering the mm lock order checker.
> 
> Guarding against this is possible, while ...
> 
> > One (likely very risky option ATM) is to introduce a per pCPU flag
> > that when set will turn all mm locks into noops, and use it here in
> > order to avoid any locking issues.  This could introduce two issues at
> > least: first one is how resilient page walking routines are against
> > page tables changing under their feet.  The second one is that any
> > page table walker p2m helper should avoid doing modifications to the
> > p2m, so no P2M_ALLOC or P2M_UNSHARE flags could be used.
> 
> ... personally I view this as too risky.

Is the dump of the stack only used for the debug key handler, or there
are other places this is also used?

Thanks, Roger.

Reply via email to