Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-20 Thread Neil Horman
On Tue, Feb 12, 2008 at 04:08:16PM -0500, Neil Horman wrote: > > > > Neil, is it possible to do some serial console debugging to find out > > where exactly we are hanging? Beats me, what's that operation which can > > not be executed while being in NMI handler and makes system to hang. I am > > al

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-15 Thread Eric W. Biederman
Neil Horman <[EMAIL PROTECTED]> writes: >> >> Neil, is it possible to do some serial console debugging to find out >> where exactly we are hanging? Beats me, what's that operation which can >> not be executed while being in NMI handler and makes system to hang. I am >> also curious to know if it

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-12 Thread Neil Horman
> > Neil, is it possible to do some serial console debugging to find out > where exactly we are hanging? Beats me, what's that operation which can > not be executed while being in NMI handler and makes system to hang. I am > also curious to know if it is nested NMI case. > > Thanks > Vivek > H

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Neil Horman
On Fri, Feb 08, 2008 at 11:45:44AM -0500, Vivek Goyal wrote: > On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote: > > On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > > > Ingo noted a few posts down the nmi_exit

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Andi Kleen
Ingo Molnar <[EMAIL PROTECTED]> writes: > > try a dummy iret, something like: > > asm volatile ("pushf; push $1f; iret; 1: \n"); > > to get the CPU out of its 'nested NMI' state. (totally untested) Just if you do this while running on the NMI stack (and I think you do if you insert it at the s

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Vivek Goyal
On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote: > On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > > APIC EOI register, so yeah, I agree,

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > APIC EOI register, so yeah, I agree, its bogus (and I apologize, I > > should have checked that more careful

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > APIC EOI register, so yeah, I agree, its bogus (and I apologize, I > > should have checked that more careful

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Ingo Molnar
* Neil Horman <[EMAIL PROTECTED]> wrote: > Ingo noted a few posts down the nmi_exit doesn't actually write to the > APIC EOI register, so yeah, I agree, its bogus (and I apologize, I > should have checked that more carefully). Nevertheless, this patch > consistently allowed a hanging machine

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Wed, Feb 06, 2008 at 05:31:11PM -0700, Eric W. Biederman wrote: > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > > > >>> I am wondering if interrupts are disabled on crashing cpu or if > >>> crashing cpu is inside die_nmi(), how would it stop/preven

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > >> Looking at the patch the local_irq_enable() is totally bogus. As soon >> as we hit machine_crash_shutdown the first thing we do is disable >> irqs. > > yeah. > >> I'm wondering if someone was using th

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Eric W. Biederman <[EMAIL PROTECTED]> wrote: > Looking at the patch the local_irq_enable() is totally bogus. As soon > as we hit machine_crash_shutdown the first thing we do is disable > irqs. yeah. > I'm wondering if someone was using the switch cpus on crash patch that > was floating a

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > >>> I am wondering if interrupts are disabled on crashing cpu or if >>> crashing cpu is inside die_nmi(), how would it stop/prevent delivery >>> of NMI IPI to other cpus. >> >> I don't see how it would. > > c

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Thu, Feb 07, 2008 at 12:36:57AM +0100, Ingo Molnar wrote: > > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > > >> I am wondering if interrupts are disabled on crashing cpu or if > >> crashing cpu is inside die_nmi(), how would it stop/prevent delivery > >> of NMI IPI to other cpus. > > > > I

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* H. Peter Anvin <[EMAIL PROTECTED]> wrote: >> I am wondering if interrupts are disabled on crashing cpu or if >> crashing cpu is inside die_nmi(), how would it stop/prevent delivery >> of NMI IPI to other cpus. > > I don't see how it would. cross-CPU IPIs are a bit fragile on some PC platform

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread H. Peter Anvin
Vivek Goyal wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of NMI IPI to other cpus. I don't see how it would. -hpa

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Vivek Goyal <[EMAIL PROTECTED]> wrote: > On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > if (!user_mode_vm(regs)) { > > > + nmi_exit(); > > > + local_irq_enable(); > > > current->thread.trap_no

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > if (!user_mode_vm(regs)) { > > + nmi_exit(); > > + local_irq_enable(); > > current->thread.trap_no = 2; > > crash_kexec(regs); > > looks

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Neil Horman <[EMAIL PROTECTED]> wrote: > if (!user_mode_vm(regs)) { > + nmi_exit(); > + local_irq_enable(); > current->thread.trap_no = 2; > crash_kexec(regs); looks good to me, but please move the local_irq_enable() to within crash_ke

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
On Wed, Feb 06, 2008 at 12:21:30PM -0800, H. Peter Anvin wrote: > Neil Horman wrote: > >Can an APIC accept an NMI while already handling an NMI? I didn't think > >they > >would interrupt one another, but rather, pend until such time as the > >previous > >NMI was cleared > > The CPU certainly wo

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 03:12:23PM -0500, Neil Horman wrote: > On Wed, Feb 06, 2008 at 02:40:40PM -0500, Vivek Goyal wrote: > > On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > > > Hey all- > > > A hang on kdump was reported to me awhile back, only when systems died > > > via nmi wa

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread H. Peter Anvin
Neil Horman wrote: Can an APIC accept an NMI while already handling an NMI? I didn't think they would interrupt one another, but rather, pend until such time as the previous NMI was cleared The CPU certainly won't (there is a hidden flag that's cleared on IRET which disables NMI; it's possibl

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
On Wed, Feb 06, 2008 at 02:40:40PM -0500, Vivek Goyal wrote: > On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > > Hey all- > > A hang on kdump was reported to me awhile back, only when systems died > > via nmi watchdog panic. The hang wouldn't always be in the same place, but >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > Hey all- > A hang on kdump was reported to me awhile back, only when systems died > via nmi watchdog panic. The hang wouldn't always be in the same place, but it > would usually be somewhere down in purgatory. In looking at the