From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Fri, 13 Apr 2007 18:34:23 -0700 (PDT)

Let's see how related these two might actually be.

> On Sat, 14 Apr 2007, Adrian Bunk wrote:
> >
> > Subject    : laptops with e1000: lockups
> > References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229603
> > Submitter  : Dave Jones <[EMAIL PROTECTED]>
> > Handled-By : Jesse Brandeburg <[EMAIL PROTECTED]>
> > Status     : problem is being debugged

In this case the entire machine hangs and sometimes spits out an NMI message. The user confirms that using another network interface (albeit wireless) works properly. The Intel folks can reproduce this one in-house and will look more deeply into it on Monday.

> > Subject    : forcedeth: interface hangs under load
> > References : http://lkml.org/lkml/2007/4/3/39
> > Submitter  : Ingo Molnar <[EMAIL PROTECTED]>
> > Handled-By : Ingo Molnar <[EMAIL PROTECTED]>
> >              Ayaz Abdulla <[EMAIL PROTECTED]>
> > Status     : problem is being debugged

In Ingo's case here the interface stops working entirely, but his system is still otherwise operational.

I looked at the interrupt handler for this driver and it is absolutely awful, especially in the NAPI-enabled case. It tries to handle TX done interrupts and other status events in the HW irq handler, and the RX packet processing via NAPI ->poll(). Time has shown that this is a faulty way to use NAPI and that all event types should be handled in the NAPI ->poll() handler, not just RX packet processing.

The way the loop is coded now, it will keep prodding at the interrupt status register in the HW irq handler loop even after the RX packet processing has been deferred to NAPI ->poll(). It seems likely that since the RX packets aren't being processed there, the RX irq event status should keep showing as set as new packets arrive.

Really, the interrupt status should be checked exactly once, all the work deferred to NAPI's ->poll(), and then the HW interrupt handler should return immediately.
This is what e1000 and tg3 do, and it is therefore the most well-tested manner in which to use NAPI in a network driver. Anything else is racy and error-prone. This would also eliminate the max_interrupt_work hack, which is a side effect of the way the interrupt handler is implemented in this driver.