From: David Woodhouse <dw...@infradead.org>
Date: Mon, 21 Sep 2015 15:01:49 +0100

> From: David Woodhouse <david.woodho...@intel.com>
>
> The TX timeout handling has been observed to trigger RX IRQ storms. And
> since cp_interrupt() just keeps saying that it handled the interrupt,
> the machine then dies. Fix the return value from cp_interrupt(), and
> the offending IRQ gets disabled and the machine survives.
>
> Signed-off-by: David Woodhouse <david.woodho...@intel.com>

Like Francois, I don't like this.

First of all, there are only 3 bits not handled explicitly by
cp_interrupt(). And for those, if they are set and no other condition
was raised, you should report the event and the status bits set, and
then forcibly clear the interrupt.

And if we are getting Rx* interrupts with napi_schedule_prep()
returning false, that's a serious problem. It can mean that the TX
timeout handler's resetting of the chip is either miscoded or is
racing with either NAPI polling or this interrupt handler.

And if that's the case, your patch is making the chip's IRQ line get
disabled when this race triggers.

This change is even worse, in my opinion, if patch #2 indeed makes the
problem go away.
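
To make the "report the event and the status bits set, then forcibly
clear the interrupt" suggestion concrete, here is a rough userspace
sketch of that control flow. It is not the 8139cp code: struct
fake_nic, sketch_interrupt() and the EV_* bit names are all made up
for illustration, the bit layout is not the chip's, and the return
values 1/0 merely stand in for IRQ_HANDLED/IRQ_NONE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical status bits for illustration; NOT the 8139cp register layout. */
enum {
	EV_RX      = 1 << 0,
	EV_TX      = 1 << 1,
	EV_LINK    = 1 << 2,
	EV_SPARE_A = 1 << 3,	/* stands in for a bit with no explicit handler */
	EV_SPARE_B = 1 << 4,
};

#define EV_HANDLED_MASK (EV_RX | EV_TX | EV_LINK)

/* Toy device model: 'status' plays the role of the interrupt status register. */
struct fake_nic {
	uint16_t status;
	unsigned int reports;	/* how many times an unexpected event was logged */
};

/* Returns 1 if the interrupt was ours (think IRQ_HANDLED), 0 if not (IRQ_NONE). */
static int sketch_interrupt(struct fake_nic *nic)
{
	uint16_t status, unhandled;

	assert(nic != NULL);

	status = nic->status;
	if (!status)
		return 0;	/* nothing pending: not our interrupt */

	unhandled = (uint16_t)(status & ~EV_HANDLED_MASK);

	if (status & EV_HANDLED_MASK) {
		/* Normal RX/TX/link processing would go here (elided). */
	} else if (unhandled) {
		/* Only bits without an explicit handler are set: report them. */
		fprintf(stderr, "unexpected irq status %04x\n",
			(unsigned int)unhandled);
		nic->reports++;
	}

	/* Forcibly acknowledge everything we saw so the line deasserts. */
	nic->status = (uint16_t)(nic->status & ~status);
	return 1;
}
```

The point of the sketch is only that the handler returns "not ours"
when no status bit is pending at all, and otherwise logs and clears
whatever it saw, so an unexpected bit cannot keep the line asserted.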