On Wed, 2025-03-12 at 23:25 +0100, Andrew Lunn wrote: > > Hi Andrew, > > > > Thanks for your feedback. I'll try and give more detail about > > what's > > happening with a concrete example. > > > > If we start with the device in a state where it is advertising: > > 1000BaseT Full > > 100baseT Full > > 100baseT Half > > 10baseT Full > > 10baseT Half > > I see the following settings in the autoneg related registers: > > 0.4 = 0x0de1 (PHY_AUTONEG_ADV) > > 0.9 = 0x0200 (PHY_1000T_CTRL) > > > > EEE is disabled. > > > > If I then adjust the advertisement to only advertise 1000BaseT Full > > and > > 100baseT Full with: > > # ethtool -s eth0 advertise 0x28 > > I see the following writes to the registers: > > 1. In igb_phy_setup_autoneg() PHY_AUTONEG_ADV is written with > > 0x0101 > > (the correct value) > > 2. Later in igb_phy_setup_autoneg() PHY_1000T_CTRL is written with > > 0x0200 (correct) > > 3. Autoneg gets restarted in igb_copper_link_autoneg() with > > PHY_CONTROL > > (0.0) being written with 0x1340 > > (everything looks fine up until here) > > 4. Now we reach igb_set_eee_i350(). Here we read in IPCNFG and it > > has > > value 0xf. EEE is disabled so we hit the 'else' case and remove > > E1000_IPCNFG_EEE_1G_AN and E1000_IPCNFG_EEE_100M_AN from the > > 'ipcnfg' > > value. We then write this back as 0x3. At this point, if you re- > > read > > PHY_AUTONEG_ADV you will see it's contents has been reset to > > 0x0de1. > > Thanks for the additional details. These should go into the commit > message. OK, I'll update into a v2 at some point.
> > > If you run the same example above but with EEE enabled (ethtool -- > > set- > > eee eth0 eee on; ethtool -s eth0 advertise 0x28) the issue is not > > seen. > > In this case the contents of IPCNFG are written back unmodified as > > 0xf. > > This seems important to avoid the bug. > > Yes, it does seem like the PHY is broken. > > > > > It seems that any case where EEE is disabled will lead to the > > undesirable behaviour where the contents of PHY_AUTONEG_ADV is > > reset to > > 0x0de1. The key trigger for this appears to be changes to either or > > both of EEE_100M_AN and EEE_1G_AN in IPCNFG. The datasheet does > > note > > that "Changing value of bit causes link drop and re-negotiation" > > Which is what you would expect, since EEE is negotiated. But > implicitly changing the link modes advertised is not what you would > expect. > > By the way, what PHY is this? I don't remember seeing any errata for > Linux PHY drivers resembling this. The phy is i210. I've also checked errata and see nothing even vaguely related to this behaviour. > > > What's your opinion on that less invasive fix (i.e remove "ipcnfg > > &= > > ~(E1000_IPCNFG_EEE_1G_AN | E1000_IPCNFG_EEE_100M_AN);" )? Is it > > sufficient to rely on the EEER settings to control disabling EEE > > with > > the IPCNFG register still set to advertise those modes? > > I actually think you need to do more testing. Assuming the PHY is not > even more broken than we think, it should not matter if it advertises > EEE mode for link modes which are not advertised. The link partner > should ignore them. It would be good if you tested out various EEE > combinations from both link partners sides. OK, I'll do some more testing of this alternate potential fix around EEE behaviour as you suggest. > > However, setting EEE advertisement and then always setting link mode > advertisement does seem like a good workaround. It would however be > good to get some sort of feedback from the PHY silicon vendor about > this odd behaviour. Yes, I'm hoping Tony Nguyen and/or Przemek Kitszel from Intel might be able to shed more light on the underlying problem. I'll wait for them or anyone else to weigh in. Thanks, Hamish M