On 01/04/2017 04:13 PM, Florian Fainelli wrote:
> 
> 
> On 01/04/2017 07:04 AM, Zefir Kurtisi wrote:
>> While in RUNNING state, phy_state_machine() checks for link changes by
>> comparing phydev->link before and after calling phy_read_status().
>> This works as long as it is guaranteed that phydev->link is never
>> changed outside the phy_state_machine().
>>
>> If in some setups this happens, it causes the state machine to miss
>> a link loss and remain RUNNING despite phydev->link being 0.
>>
>> This has been observed running a dsa setup with a process continuously
>> polling the link states over ethtool each second (SNMPD RFC-1213
>> agent). Disconnecting the link on a phy followed by an ETHTOOL_GSET
>> causes dsa_slave_get_settings() / dsa_slave_get_link_ksettings() to
>> call phy_read_status() and with that modify the link status - and
>> with that bricking the phy state machine.
> 
> That's the interesting part of the analysis, how does this brick the PHY
> state machine? Is the PHY driver changing the link status in the
> read_status callback that it implements?
> 
phydev->read_status points to genphy_read_status(), whose first step is a call to
genphy_update_link(), which updates phydev->link.

Once that has happened, phydev->link is already 0 when phy_state_machine():RUNNING
samples its "before" value, so the before/after comparison sees no change. The
link loss goes undetected unless the link state changes again.
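
To illustrate the sequence described above, here is a toy user-space model. It is
not kernel code: struct toy_phy and all toy_* functions are hypothetical stand-ins
that only mimic the before/after comparison, under the assumption that an external
caller refreshes the cached link state between the link loss and the next poll.

```c
/* Toy model only; none of these names exist in the kernel. */
struct toy_phy {
	int link;     /* cached link state, standing in for phydev->link */
	int hw_link;  /* what the hardware would report right now */
	int running;  /* 1 while the toy state machine is in RUNNING */
};

/* Stand-in for phy_read_status(): refresh the cached link state. */
static void toy_read_status(struct toy_phy *phy)
{
	phy->link = phy->hw_link;
}

/* Stand-in for the RUNNING case: compare link before and after the read. */
static void toy_state_machine_poll(struct toy_phy *phy)
{
	int old_link = phy->link;

	toy_read_status(phy);
	if (old_link != phy->link && !phy->link)
		phy->running = 0;  /* link loss detected, leave RUNNING */
}

/*
 * Pull the cable, optionally let an external caller (e.g. an ethtool
 * query) read the status first, then poll once.
 * Returns 1 if the toy state machine is still RUNNING afterwards.
 */
static int toy_simulate(int external_read_first)
{
	struct toy_phy phy = { .link = 1, .hw_link = 1, .running = 1 };

	phy.hw_link = 0;                 /* a) physically disconnect link */
	if (external_read_first)
		toy_read_status(&phy);   /* b) status read from outside */
	toy_state_machine_poll(&phy);
	return phy.running;
}
```

In this model toy_simulate(0) returns 0 (the link loss is noticed), while
toy_simulate(1) returns 1 (still RUNNING although the cached link is 0).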


I was trying to figure out if there is a rule that forbids changing phydev->link
from outside the state machine, but found several places where it happens (either
directly, via genphy_read_status(), or via genphy_update_link()).

I am curious why this did not show up before, since in a dsa setup it is very
easy to trigger:
a) physically disconnect the link
b) within one second, run ethtool ethX


Cheers,
Zefir
