From: Andrew Lunn <and...@lunn.ch>
Date: Thu, 14 Feb 2019 03:07:23 +0100

> On Mon, Feb 11, 2019 at 01:40:21PM -0500, John David Anglin wrote:
>> The GPIO interrupt controller on the espressobin board only supports
>> edge interrupts. If one enables the use of hardware interrupts in the
>> device tree for the 88E6341, it is possible to miss an edge. When this
>> happens, the INTn pin on the Marvell switch is stuck low and no
>> further interrupts occur.
>>
>> I found after adding debug statements to
>> mv88e6xxx_g1_irq_thread_work() that there is a race in handling
>> device interrupts (e.g. PHY link interrupts). Some interrupts are
>> directly cleared by reading the Global 1 status register. However,
>> the device interrupt flag, for example, is not cleared until all the
>> unmasked SERDES and PHY ports are serviced. This is done by reading
>> the relevant SERDES and PHY status register.
>>
>> The code only services interrupts whose status bit is set at the time
>> of reading its status register. If an interrupt event occurs after
>> its status is read and before all interrupts are serviced, then this
>> event will not be serviced and the INTn output pin will remain low.
>>
>> This is not a problem with polling or level interrupts since the
>> handler will be called again to process the event. However, it's a
>> big problem when using edge interrupts.
>>
>> The fix presented here is to add a loop around the code servicing
>> switch interrupts. If any pending interrupts remain after the current
>> set has been handled, we loop and process the new set. If there are
>> no pending interrupts after servicing, we are sure that INTn has gone
>> high and we will get an edge when a new event occurs.
>>
>> Tested on espressobin board.
>>
>> Signed-off-by: John David Anglin <dave.ang...@bell.net>
>
> Fixes: dc30c35be720 ("net: dsa: mv88e6xxx: Implement interrupt support.")
>
> Tested-by: Andrew Lunn <and...@lunn.ch>
>
> David, please ensure that Heiner's patch:
>
> net: phy: fix interrupt handling in non-started states
>
> is applied first. Otherwise we can get into an interrupt storm.

Ok, all done.

Should I queue just this one for -stable? I didn't queue up Heiner's
change for -stable because it fixes a 5.0-rcX regression.
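
For anyone following the thread without the driver source at hand, the
race and the loop described in the patch can be sketched as a small
user-space model. This is not the mv88e6xxx code: struct fake_switch,
service_snapshot(), irq_single_pass() and irq_looped() are hypothetical
names, and the "late event" field is only an assumed way of simulating
an interrupt that fires after the status register was read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a switch with latched interrupt sources. */
struct fake_switch {
	uint16_t pending;      /* sources currently asserting INTn */
	uint16_t late_event;   /* source that fires while servicing is under way */
	unsigned int serviced; /* number of sources serviced so far */
};

/* INTn is active low: it stays low while any source is pending. */
static bool intn_is_low(const struct fake_switch *sw)
{
	return sw->pending != 0;
}

/* Service only the sources set in the snapshot taken by the caller. */
static void service_snapshot(struct fake_switch *sw, uint16_t snapshot)
{
	for (unsigned int bit = 0; bit < 16; bit++) {
		uint16_t mask = (uint16_t)(1u << bit);

		if (!(snapshot & mask))
			continue;

		sw->pending &= (uint16_t)~mask;
		sw->serviced++;

		/* Simulated race: a new event lands after the status read. */
		sw->pending |= sw->late_event;
		sw->late_event = 0;
	}
}

/* Single pass: handles only what was pending when the status was read. */
static void irq_single_pass(struct fake_switch *sw)
{
	service_snapshot(sw, sw->pending);
}

/* Looped: re-read the status until no source is pending. */
static void irq_looped(struct fake_switch *sw)
{
	uint16_t snapshot;

	while ((snapshot = sw->pending) != 0)
		service_snapshot(sw, snapshot);

	/* INTn is high again, so the next event produces a fresh edge. */
	assert(!intn_is_low(sw));
}
```

With pending = 0x1 and late_event = 0x2, the single-pass handler returns
with INTn still low, which an edge-only controller never reports again.
The looped handler picks up the late source on its second iteration and
returns with INTn high.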