On 8/21/2024 8:09 AM, Jiri Slaby wrote:
On 20. 08. 24, 23:30, Bjorn Helgaas wrote:
On Tue, Aug 20, 2024 at 11:13:54PM +0200, Petr Valenta wrote:
Dne 20. 08. 24 v 20:09 Bjorn Helgaas napsal(a):
On Mon, Aug 19, 2024 at 07:23:42AM +0200, Jiri Slaby wrote:
On 19. 08. 24, 6:50, Jiri Slaby wrote:
CC e1000e guys + Jesse (due to 75a3f93b5383) + Bjorn (due to
b2c289415b2b)
Bjorn,
I am confused by these changes:
==========================================
@@ -291,16 +288,13 @@ static int e1000_set_link_ksettings(struct
net_device
*net
dev,
* duplex is forced.
*/
if (cmd->base.eth_tp_mdix_ctrl) {
- if (hw->phy.media_type != e1000_media_type_copper) {
- ret_val = -EOPNOTSUPP;
- goto out;
- }
+ if (hw->phy.media_type != e1000_media_type_copper)
+ return -EOPNOTSUPP;
if ((cmd->base.eth_tp_mdix_ctrl !=
ETH_TP_MDI_AUTO) &&
(cmd->base.autoneg != AUTONEG_ENABLE)) {
e_err("forcing MDI/MDI-X state is not
supported when
lin
k speed and/or duplex are forced\n");
- ret_val = -EINVAL;
- goto out;
+ return -EINVAL;
}
}
@@ -347,7 +341,6 @@ static int e1000_set_link_ksettings(struct
net_device
*netde
v,
}
out:
- pm_runtime_put_sync(netdev->dev.parent);
clear_bit(__E1000_RESETTING, &adapter->state);
return ret_val;
}
==========================================
So no more clear_bit(__E1000_RESETTING in the above fail paths. Is
that
intentional?
Not intentional. Petr, do you have the ability to test the patch
below? I'm not sure it's the correct fix, but it reverts the pieces
of b2c289415b2b that Jiri pointed out.
I tested the patch below but it didn't help. After the first boot
with new
kernel it looked promising as the irq storm only appeared for a few
seconds,
but with subsequent reboots it was the same as without the patch.
Thank you very much for testing that!
To be sure, I also send the md5sum of ethtool.c after applying the
patch:
a25c003257538f16994b4d7c4260e894 ethtool.c
Thanks, that matches what I get when applying the patch on v6.10.
I'm at a loss. You could try reverting the entire b2c289415b2b commit
(patch for that is below).
FWIW he already tested with b2c289415b2b reverted (I provided him with
a built kernel). It behaves the same. So you are not the breaker.
If that doesn't help, I guess you could try reverting the other
commits Jiri mentioned:
76a0a3f9cc2f e1000e: fix force smbus during suspend flow
c93a6f62cb1b e1000e: Fix S0ix residency on corporate systems
bfd546a552e1 e1000e: move force SMBUS near the end of enable_ulp
function
6918107e2540 net: e1000e & ixgbe: Remove PCI_HEADER_TYPE_MFD
duplicates
1eb2cded45b3 net: annotate writes on dev->mtu from ndo_change_mtu()
b2c289415b2b e1000e: Remove redundant runtime resume for ethtool_ops
75a3f93b5383 net: intel: implement modern PM ops declarations
If you do this, I would revert 76a0a3f9cc2f, test, then revert
c93a6f62cb1b in addition, test, then revert bfd546a552e1 in addition,
etc.
Or perhaps easier to do:
git bisect v6.10 v6.9 -- drivers/net/ethernet/intel/e1000e/
directly. But that assumes one of the above commits broke it. If they
did not, as a last resort, you can still do full bisect (without the
"-- drivers" part).
I would take v6.10 suses config.
Would boot 6.10.
do lsmod > /tmp/lsmod
make LSMOD=/tmp/lsmod localyesconfig
make bzImage
and use that bzImage.
Note that graphics, wireless and other stuff will be defunct unless
you build in firmwares for them (EXTRA_FIRMWARE config). Alternatively
use localmodconfig and build and install also modules (now limited to
your machine).
thanks,
From all the data in the Suse Bugzilla I understood that the issue
happens with a cable disconnected.
Does it reproduce with a connected cable?
Also, I suspect that it might be related to wake up interrupts that the
I219 device gets which might cause the IRQ storm.
I would like to ask you to enable the debug prints in the e1000e driver
and share the dmesg log. I would like to see if we can get the values of
WUFC and WUS on on resume.
For getting these values please comment out the following in
__e1000_resume function in netdev.c file:
/* report the system wakeup cause from S3/S4 */
// if (adapter->flags2 & FLAG2_HAS_PHY_WAKEUP) {
u16 phy_data;
e1e_rphy(&adapter->hw, BM_WUS, &phy_data);
if (phy_data) {
e_info("PHY Wakeup cause - %s\n",
phy_data & E1000_WUS_EX ? "Unicast Packet" :
phy_data & E1000_WUS_MC ? "Multicast
Packet" :
phy_data & E1000_WUS_BC ? "Broadcast
Packet" :
phy_data & E1000_WUS_MAG ? "Magic Packet" :
phy_data & E1000_WUS_LNKC ?
"Link Status Change" : "other");
}
e1e_wphy(&adapter->hw, BM_WUS, ~0);
// } else {
u32 wus = er32(WUS);
if (wus) {
e_info("MAC Wakeup cause - %s\n",
wus & E1000_WUS_EX ? "Unicast Packet" :
wus & E1000_WUS_MC ? "Multicast Packet" :
wus & E1000_WUS_BC ? "Broadcast Packet" :
wus & E1000_WUS_MAG ? "Magic Packet" :
wus & E1000_WUS_LNKC ? "Link Status
Change" :
"other");
// }
ew32(WUS, ~0);
To enable the debug prints you can run:
echo "module e1000e +p" | sudo tee /sys/kernel/debug/dynamic_debug/control