On Wed, Feb 07, 2018 at 09:56:37PM +0100, Heiner Kallweit wrote: > Am 04.02.2018 um 03:48 schrieb Florian Fainelli: > > > > > > On 02/03/2018 03:58 PM, Heiner Kallweit wrote: > >> Am 03.02.2018 um 21:17 schrieb Andrew Lunn: > >>> On Sat, Feb 03, 2018 at 05:41:54PM +0100, Heiner Kallweit wrote: > >>>> This commit forces callers of phy_resume() and phy_suspend() to hold > >>>> mutex phydev->lock. This was done for calls to phy_resume() and > >>>> phy_suspend() in phylib, however there are more callers in network > >>>> drivers. I'd assume that these other calls issue a warning now > >>>> because of the lock not being held. > >>>> So is there something I miss or would this have to be fixed? > >>> > >>> Hi Heiner > >>> > >>> This is a good point. > >>> > >>> Yes, it looks like some fixes are needed. But what exactly? > >>> > >>> The phy state machine will suspend and resume the phy is you call > >>> phy_stop() and phy_start() in the MAC suspend and resume functions. > >>> > >> AFAICS phy_stop() doesn't suspend the PHY. It just sets the state > >> to PHY_HALTED and (at least if we're not in polling mode) doesn't > >> call phy_suspend(). Maybe a call to phy_trigger_machine() is > >> needed like in phy_start() ? Then the state machine would call > >> phy_suspend(), provided the link is still up. > > > > Right, phy_stop() merely just moves the state machine to PHY_HALTED and > > this is actually a great source of problems which I tried to address here: > > > > https://www.mail-archive.com/netdev@vger.kernel.org/msg196061.html > > > > because phy_stop() is not a synchronous call, so when it returns the > > state machine might still be running (it can take up to a 1 HZ depending > > on when you called phy_stop()) and so if you took that as a > > synchronization point to e.g: turn of your Ethernet MAC/MDIO bus clocks, > > you will likely see problems. phy_stop_machine() would provide that > > synchronization point, but is not currently exported, despite being a > > global symbol. This patch series above is all well and good, except that > > Geert reported issues with suspend/resume interactions which I have not > > been able to track down. > > > To not confuse readers I changed the subject of the mail to reflect that > the discussion isn't about the original topic any longer. > > It seems to me that (at least one) reason for the issues is that pm > callbacks for the phy device and the network device interfere. > phy device pm (mdio_bus_phy_suspend et al) feels responsible for dealing > with suspending/resuming the PHY, and the network driver pm callbacks > as well. > > Maybe, if the network driver takes care, it should inform phy device pm > to do nothing. For this we could add a flag to phydev.mdio.flags. > If the network driver sets this flag then mdio_bus_phy_suspend() > and mdio_bus_phy_resume() could turn into no-ops. > Not totally sure yet about mdio_bus_phy_restore() ..
What if the MDIO bus is handled by a separate device and the MDIO bus is suspended prior to the network driver, thereby making the PHY inaccessible? -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up