On Wed, 2025-03-12 at 23:25 +0100, Andrew Lunn wrote:
> > Hi Andrew,
> > 
> > Thanks for your feedback. I'll try and give more detail about
> > what's
> > happening with a concrete example.
> > 
> > If we start with the device in a state where it is advertising:
> > 1000BaseT Full
> > 100baseT Full
> > 100baseT Half
> > 10baseT Full
> > 10baseT Half
> > I see the following settings in the autoneg related registers:
> > 0.4 = 0x0de1 (PHY_AUTONEG_ADV)
> > 0.9 = 0x0200 (PHY_1000T_CTRL)
> > 
> > EEE is disabled.
> > 
> > If I then adjust the advertisement to only advertise 1000BaseT Full
> > and
> > 100baseT Full with:
> > # ethtool -s eth0 advertise 0x28
> > I see the following writes to the registers:
> > 1. In igb_phy_setup_autoneg() PHY_AUTONEG_ADV is written with
> > 0x0101
> > (the correct value)
> > 2. Later in igb_phy_setup_autoneg() PHY_1000T_CTRL is written with
> > 0x0200 (correct)
> > 3. Autoneg gets restarted in igb_copper_link_autoneg() with
> > PHY_CONTROL
> > (0.0) being written with 0x1340
> > (everything looks fine up until here)
> > 4. Now we reach igb_set_eee_i350(). Here we read in IPCNFG and it
> > has
> > value 0xf. EEE is disabled so we hit the 'else' case and remove
> > E1000_IPCNFG_EEE_1G_AN and E1000_IPCNFG_EEE_100M_AN from the
> > 'ipcnfg'
> > value. We then write this back as 0x3. At this point, if you re-
> > read
> > PHY_AUTONEG_ADV you will see it's contents has been reset to
> > 0x0de1.
> 
> Thanks for the additional details. These should go into the commit
> message.
OK, I'll update into a v2 at some point.

> 
> > If you run the same example above but with EEE enabled (ethtool --
> > set-
> > eee eth0 eee on; ethtool -s eth0 advertise 0x28) the issue is not
> > seen.
> > In this case the contents of IPCNFG are written back unmodified as
> > 0xf.
> > This seems important to avoid the bug.
> 
> Yes, it does seem like the PHY is broken.
> 
> > 
> > It seems that any case where EEE is disabled will lead to the
> > undesirable behaviour where the contents of PHY_AUTONEG_ADV is
> > reset to
> > 0x0de1. The key trigger for this appears to be changes to either or
> > both of EEE_100M_AN and EEE_1G_AN in IPCNFG. The datasheet does
> > note
> > that "Changing value of bit causes link drop and re-negotiation"
> 
> Which is what you would expect, since EEE is negotiated. But
> implicitly changing the link modes advertised is not what you would
> expect.
> 
> By the way, what PHY is this? I don't remember seeing any errata for
> Linux PHY drivers resembling this.
The phy is i210. I've also checked errata and see nothing even vaguely
related to this behaviour.

> 
> > What's your opinion on that less invasive fix (i.e remove "ipcnfg
> > &=
> > ~(E1000_IPCNFG_EEE_1G_AN | E1000_IPCNFG_EEE_100M_AN);" )? Is it
> > sufficient to rely on the EEER settings to control disabling EEE
> > with
> > the IPCNFG register still set to advertise those modes?
> 
> I actually think you need to do more testing. Assuming the PHY is not
> even more broken than we think, it should not matter if it advertises
> EEE mode for link modes which are not advertised. The link partner
> should ignore them. It would be good if you tested out various EEE
> combinations from both link partners sides.
OK, I'll do some more testing of this alternate potential fix around
EEE behaviour as you suggest.

> 
> However, setting EEE advertisement and then always setting link mode
> advertisement does seem like a good workaround. It would however be
> good to get some sort of feedback from the PHY silicon vendor about
> this odd behaviour.
Yes, I'm hoping Tony Nguyen and/or Przemek Kitszel from Intel might be
able to shed more light on the underlying problem. I'll wait for them
or anyone else to weigh in.

Thanks,
Hamish M

Reply via email to