Hi Russell,

On Sun, Aug 11, 2019 at 10:37 AM Russell King - ARM Linux admin
<li...@armlinux.org.uk> wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
>
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
>
> This is on a dual-core iMX6.
>
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)
>
> From what I can see, the OOM events happened on CPU1, CPU1 was
> allocated the FEC interrupt, and the PHY polling that suffered the
> MDIO timeout was on CPU0.
>
> Given that IEVENT is zero, it seems that CPU1 had read serviced the
> interrupt, but it is not clear how far through processing that it
> was - it may be that fec_enet_interrupt() had been delayed by the
> OOM condition.
>
> This seems rather fragile - as the system slowing down due to OOM
> triggers the network to completely collapse by phylib taking the
> PHY offline, making the system inaccessible except through the
> console.
>
> In my case, even serial console wasn't operational (except for
> magic sysrq). Not sure what agetty was playing at... so the only
> way I could recover any information from the system was to connect
> the HDMI and plug in a USB keyboard.
>
> Any thoughts on how FEC MDIO accesses could be made more robust?
Sorry for the delay. I am currently on vacation with limited e-mail access.

I think it is worth trying Andrew's suggestion to increase FEC_MII_TIMEOUT.

Thanks
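
For a rough feel of how coarse such a timeout is in scheduler ticks, here is a
small untested sketch. It assumes the timeout is expressed in microseconds and
is rounded up to whole ticks before the wait. The helper name
timeout_us_to_ticks() is made up for illustration and is not a kernel
function, and the 30000 us used in the note below is only an example value, so
please check the driver source for the real one.

```c
/*
 * Hypothetical helper, NOT the kernel's implementation: convert a timeout
 * in microseconds to scheduler ticks, rounding up so that a non-zero
 * timeout never becomes zero ticks.
 *
 * Assumption: hz divides 1000000 evenly (true for 100, 250 and 1000).
 */
unsigned long timeout_us_to_ticks(unsigned long usecs, unsigned long hz)
{
	unsigned long us_per_tick = 1000000UL / hz;

	/* Ceiling division: a partial tick counts as a full one. */
	return (usecs + us_per_tick - 1) / us_per_tick;
}
```

With HZ=100, a 30000 us timeout comes out as only 3 ticks under these
assumptions, so a CPU that is stalled for a few ticks (for example by the OOM
handling described above) could plausibly miss the window. Raising the timeout
widens that window, but does not remove the underlying race.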