On 2019-02-04 9:21 p.m., Andrew Lunn wrote: >> The problem is INTn can go low before the interrupt handler for it is >> registered and enabled. >> This can't happen. The domain is setup immediately after registering >> the GPIO interrupt. >> The interrupt can't fire until one of the enables is set. > These two statement seem to contradict each other? I don't think so. In the first, I am referring to the enabling of the GPIO pin interrupt in the Armada 3720 CPU. Specifically, this would be the setting of the enable in the South Bridge GPIO2 Interrupt Enable Register (RD0018C00). In the second, I'm referring to the enables in the switch's Global Control Register. Setting these enables to zero forces the the switch's INTn output high (assuming there isn't a short). This shouldn't cause a CPU interrupt if the South Bridge GPIO2 Polarity Control is set correctly at the time. INTn goes low after a switch reset because EEIntEn is set on reset, so clearing the enables will causes a rising edge on INTn. > >> These are set >> by mv88e6xxx_g2_irq_setup(), >> mv88e6xxx_g1_atu_prob_irq_setup() and >> mv88e6xxx_g1_vtu_prob_irq_setup(). These irqs >> are setup after mv88e6xxx_g1_irq_setup()/mv88e6xxx_irq_poll_setup() is >> called. Thus, the >> irq domain is setup before the GPIO interrupt can fire. > At what point is INTn going low, causing you all these problems? I've > yet to see a real description of the race. Please give us a blow by > blow of how the race happens. And then explain how your fix actually > fixes this. I'm going to withdraw my proposed change. I had thought that calling request_threaded_irq() earlier fixed the issue at boot. But given your mail, I reverted the change in my 4.20.6 build and I wasn't able to reproduce the problem. Yet, my 4.20.4 build fails every time doing a power on boot.
We have the following interrupt status when it fails: 44: 2 0 GPIO2 23 Edge d0032004.mdio-mii:01 48: 0 0 mv88e6xxx-g1 3 Edge mv88e6xxx-g1-atu-prob 50: 0 0 mv88e6xxx-g1 5 Edge mv88e6xxx-g1-vtu-prob 52: 0 1 mv88e6xxx-g1 7 Edge mv88e6xxx-g2 55: 0 1 mv88e6xxx-g2 1 Edge !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:11 56: 0 0 mv88e6xxx-g2 2 Edge !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:12 57: 0 0 mv88e6xxx-g2 3 Edge !soc!internal-regs@d0000000!mdio@32004!switch0@1!mdio:13 69: 0 0 mv88e6xxx-g2 15 Edge mv88e6xxx-watchdog We have the following values in the Global1 registers: MCLI> sys dumpGlobal1 ------------------------------------------------------ Switch Global Status(0x0) c881 ATU FID Register(0x1) 0000 VTU FID Register(0x2) 0000 VTU SID Register(0x3) 0000 Switch Global Control Register(0x4) 40a8 The DeviceInt bit is set in the Global Status register. It would appear we have missed an edge on GPIO2 23. Yet, we have handled 2 interrupts on it. So, this isn't the setup issue that I thought it was. > > Also, i'm not yet convinced this hardware can actually work correctly > with edge interrupts. You can probably reduce the size of the race > window, but i don't think you can eliminate it. And if you cannot > eliminate it, at some point it is going to hit you. You could be right but I don't want to give up just yet. I need to go back and rebuild v4.20.4 and retest. My hunch is the second hunk of the original patch will fix this. Dave -- John David Anglin dave.ang...@bell.net