Hi Charistian, > -----Original Message----- > From: Christian Melki <christian.me...@t2data.com> > Sent: 2021年4月5日 16:44 > To: Heiner Kallweit <hkallwe...@gmail.com>; Joakim Zhang > <qiangqing.zh...@nxp.com>; and...@lunn.ch; li...@armlinux.org.uk; > da...@davemloft.net; k...@kernel.org > Cc: netdev@vger.kernel.org; linux-ker...@vger.kernel.org; dl-linux-imx > <linux-...@nxp.com> > Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume > back > > On 4/5/21 12:48 AM, Heiner Kallweit wrote: > > On 04.04.2021 16:09, Heiner Kallweit wrote: > >> On 04.04.2021 12:07, Joakim Zhang wrote: > >>> commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut > >>> off PHY power") invokes phy_init_hw() when MDIO bus resume, it will > >>> soft reset PHY if PHY driver implements soft_reset callback. > >>> commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to > >>> genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After > >>> these two patches, I found i.MX6UL 14x14 EVK which connected to > >>> KSZ8081RNB doesn't work any more when system resume back, MAC > driver is fec_main.c. > >>> > >>> It's obvious that initializing PHY hardware when MDIO bus resume > >>> back would introduce some regression when PHY implements soft_reset. > >>> When I > >> > >> Why is this obvious? Please elaborate on why a soft reset should > >> break something. > >> > >>> am debugging, I found PHY works fine if MAC doesn't support > >>> suspend/resume or phy_stop()/phy_start() doesn't been called during > >>> suspend/resume. This let me realize, PHY state machine > >>> phy_state_machine() could do something breaks the PHY. > >>> > >>> As we known, MAC resume first and then MDIO bus resume when system > >>> resume back from suspend. When MAC resume, usually it will invoke > >>> phy_start() where to change PHY state to PHY_UP, then trigger the > >>> stat> machine to run now. In phy_state_machine(), it will > >>> start/config auto-nego, then change PHY state to PHY_NOLINK, what to > >>> next is periodically check PHY link status. When MDIO bus resume, it > >>> will initialize PHY hardware, including soft_reset, what would > >>> soft_reset affect seems various from different PHYs. For KSZ8081RNB, > >>> when it in PHY_NOLINK state and then perform a soft reset, it will never > complete auto-nego. > >> > >> Why? That would need to be checked in detail. Maybe chip errata > >> documentation provides a hint. > >> > > > > The KSZ8081 spec says the following about bit BMCR_PDOWN: > > > > If software reset (Register 0.15) is > > used to exit power-down mode > > (Register 0.11 = 1), two software > > reset writes (Register 0.15 = 1) are > > required. The first write clears > > power-down mode; the second > > write resets the chip and re-latches > > the pin strapping pin values. > > > > Maybe this causes the issue you see and genphy_soft_reset() isn't > > appropriate for this PHY. Please re-test with the KSZ8081 soft reset > > following the spec comment. > > > > Interesting. Never expected that behavior. > Thanks for catching it. Skimmed through the datasheets/erratas. > This is what I found (micrel.c): > > 10/100: > 8001 - Unaffected? > 8021/8031 - Double reset after PDOWN. > 8041 - Errata. PDOWN broken. Recommended do not use. Unclear if reset > solves the issue since errata says no error after reset but is also claiming > that > only toggling PDOWN (may) or power will help. > 8051 - Double reset after PDOWN. > 8061 - Double reset after PDOWN. > 8081 - Double reset after PDOWN. > 8091 - Double reset after PDOWN. > > 10/100/1000: > Nothing in gigabit afaics. > > Switches: > 8862 - Not affected? > 8863 - Errata. PDOWN broken. Reset will not help. Workaround exists. > 8864 - Not affected? > 8873 - Errata. PDOWN broken. Reset will not help. Workaround exists. > 9477 - Errata. PDOWN broken. Will randomly cause link failure on adjacent > links. > Do not use. > > This certainly explains a lot.
Thanks for digging into it. As I discussed with you before, there is no problem with these two fixes if I did ifdown/ifup. Almost the same route with suspend/resume. Difference is that it will start state machine after initializing PHY. But when resume back, state machine is running before initializing PHY. I think the key is to figure out what would soft reset affect in 8081, there is no any hint in spec. Best Regards, Joakim Zhang > >>> > >>> This patch changes PHY state to PHY_UP when MDIO bus resume back, it > >>> should be reasonable after PHY hardware re-initialized. Also give > >>> state machine a chance to start/config auto-nego again. > >>> > >> > >> If the MAC driver calls phy_stop() on suspend, then phydev->suspended > >> is true and mdio_bus_phy_may_suspend() returns false. As a > >> consequence > >> phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume() > >> skips the PHY hw initialization. > >> Please also note that mdio_bus_phy_suspend() calls phy_stop_machine() > >> that sets the state to PHY_UP. > >> > > > > Forgot that MDIO bus suspend is done before MAC driver suspend. > > Therefore disregard this part for now. > > > >> Having said that the current argumentation isn't convincing. I'm not > >> aware of such issues on other systems, therefore it's likely that > >> something is system-dependent. > >> > >> Please check the exact call sequence on your system, maybe it > >> provides a hint. > >> > >>> Signed-off-by: Joakim Zhang <qiangqing.zh...@nxp.com> > >>> --- > >>> drivers/net/phy/phy_device.c | 7 +++++++ > >>> 1 file changed, 7 insertions(+) > >>> > >>> diff --git a/drivers/net/phy/phy_device.c > >>> b/drivers/net/phy/phy_device.c index cc38e326405a..312a6f662481 > >>> 100644 > >>> --- a/drivers/net/phy/phy_device.c > >>> +++ b/drivers/net/phy/phy_device.c > >>> @@ -306,6 +306,13 @@ static __maybe_unused int > mdio_bus_phy_resume(struct device *dev) > >>> ret = phy_resume(phydev); > >>> if (ret < 0) > >>> return ret; > >>> + > >>> + /* PHY state could be changed to PHY_NOLINK from MAC controller > resume > >>> + * rounte with phy_start(), here change to PHY_UP after > re-initializing > >>> + * PHY hardware, let PHY state machine to start/config auto-nego > again. > >>> + */ > >>> + phydev->state = PHY_UP; > >>> + > >>> no_resume: > >>> if (phydev->attached_dev && phydev->adjust_link) > >>> phy_start_machine(phydev); > >>> > >> > >