Russell reported the following race in the phylib state machine
(quoting from his mail):

if (phy_polling_mode(phydev) && phy_is_started(phydev))
        phy_queue_state_machine(phydev, PHY_STATE_TIME);

state = PHY_UP
thread 0                        thread 1
                                phy_disconnect()
                                +-phy_is_started()
phy_is_started()                |
                                `-phy_stop()
                                  +-phydev->state = PHY_HALTED
                                  `-phy_stop_machine()
                                    `-cancel_delayed_work_sync()
phy_queue_state_machine()
`-mod_delayed_work()

At this point, the phydev->state_queue() has been added back onto the
system workqueue despite phy_stop_machine() having been called and
cancel_delayed_work_sync() called on it.

Fix this by protecting the complete operation in thread 0.

Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Reported-by: Russell King - ARM Linux admin <li...@armlinux.org.uk>
Signed-off-by: Heiner Kallweit <hkallwe...@gmail.com>
---
 drivers/net/phy/phy.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 602816d70281..c5675df5fc6f 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -985,8 +985,10 @@ void phy_state_machine(struct work_struct *work)
         * state machine would be pointless and possibly error prone when
         * called from phy_disconnect() synchronously.
         */
+       mutex_lock(&phydev->lock);
        if (phy_polling_mode(phydev) && phy_is_started(phydev))
                phy_queue_state_machine(phydev, PHY_STATE_TIME);
+       mutex_unlock(&phydev->lock);
 }
 
 /**
-- 
2.20.1


Reply via email to