Re: [BUG] fec mdio times out under system stress

2019-08-12 Thread Fabio Estevam
Hi Russell,

On Sun, Aug 11, 2019 at 10:37 AM Russell King - ARM Linux admin wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at

Re: [BUG] fec mdio times out under system stress

2019-08-11 Thread Andrew Lunn
> Maybe phylib should retry a number of times - but with read-sensitive
> registers, if the read has already completed successfully, and its
> just a problem with the FEC MDIO hardware, that could cause issues.

Hi Russell

At the bus level, MDIO cannot fail. The bits get clocked out, and the bits

Re: [BUG] fec mdio times out under system stress

2019-08-11 Thread Andrew Lunn
On Sun, Aug 11, 2019 at 02:37:07PM +0100, Russell King - ARM Linux admin wrote:
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> tha

Re: [BUG] fec mdio times out under system stress

2019-08-11 Thread Andrew Lunn
> I think a better question is why is the FEC MDIO controller configured
> to emit interrupts anyway (especially since the API built on top does
> not benefit in any way from this)? Hubert (copied) sent an interesting
> email very recently where he pointed out that this is one of the main
> sources

Re: [BUG] fec mdio times out under system stress

2019-08-11 Thread Vladimir Oltean
Hi Russell, Fabio,

On Sun, 11 Aug 2019 at 16:42, Russell King - ARM Linux admin wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, an

Re: [BUG] fec mdio times out under system stress

2019-08-11 Thread Russell King - ARM Linux admin
On Sun, Aug 11, 2019 at 02:37:07PM +0100, Russell King - ARM Linux admin wrote:
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> tha

[BUG] fec mdio times out under system stress

2019-08-11 Thread Russell King - ARM Linux admin
Hi Fabio,

When I woke up this morning, I found that one of the Hummingboards
had gone offline (as in, lost network link) during the night.
Investigating, I find that the system had gone into OOM, and at
that time, triggered an unrelated:

[4111697.698776] fec 2188000.ethernet eth0: MDIO read timeo