Hi there

Over the past couple of months I've occasionally observed my machine
loosing its ethernet connection.

It usually only happens after I've been using the machine for a couple
of hours and it only happens around 3-4 times per month.
Every time (previously) I've just rebooted the machine and then things
were fine when it came back up, but the last time it happened I took a
look at 'dmesg' to see if there was a clue and I found this:

[   11.474412] igc 0000:0c:00.0 eno1: NIC Link is Up 2500 Mbps Full
Duplex, Flow Control: RX/TX
[   11.475554] igc 0000:0c:00.0 eno1: Force mode currently not supported
[   14.363040] usbcore: registered new interface driver snd-usb-audio
[   15.934429] igc 0000:0c:00.0 eno1: NIC Link is Up 2500 Mbps Full
Duplex, Flow Control: RX/TX
[   37.250435] systemd-journald[569]: Time jumped backwards, rotating.
[   38.593498] warning: `kdeconnectd' uses wireless extensions which
will stop working for Wi-Fi 7 hardware; use nl80211
[  352.786791] usb 3-2: new high-speed USB device number 7 using xhci_hcd
[15656.628279] igc 0000:0c:00.0 eno1: PCIe link lost, device now detached
[15656.628287] ------------[ cut here ]------------
[15656.628287] igc: Failed to read reg 0xc030!
[15656.628306] WARNING: CPU: 2 PID: 2383 at
drivers/net/ethernet/intel/igc/igc_main.c:6752 igc_rd32+0x88/0xa0
[igc]
[15656.628313] Modules linked in: snd_usb_audio snd_usbmidi_lib
snd_ump snd_rawmidi snd_seq_device mc vfat fat amd_atl intel_rapl_msr
intel_rapl_common kvm_amd iwlmvm eeepc_wmi asus_nb_wmi
kvm asus_wmi crct10dif_pclmul platform_profile mac80211
snd_hda_codec_hdmi crc32_pclmul snd_hda_intel polyval_clmulni libarc4
polyval_generic snd_intel_dspcfg gf128mul snd_intel_sdw_acpi bt
usb ghash_clmulni_intel snd_hda_codec btrtl iwlwifi sha512_ssse3
btintel sha256_ssse3 snd_hda_core sha1_ssse3 btbcm aesni_intel
snd_hwdep btmtk crypto_simd i8042 snd_pcm cryptd cfg80211 spa
rse_keymap bluetooth sp5100_tco snd_timer serio wmi_bmof rapl pcspkr
k10temp ccp igc i2c_piix4 snd ptp soundcore rfkill joydev pps_core
mousedev gpio_amdpt gpio_generic mac_hid i2c_dev cryp
to_user loop dm_mod nfnetlink ip_tables x_tables ext4 crc32c_generic
crc16 mbcache jbd2 hid_generic usbhid amdgpu amdxcp i2c_algo_bit
drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_help
er nvme drm_buddy drm_display_helper crc32c_intel nvme_core xhci_pci
xhci_pci_renesas cec
[15656.628364]  video nvme_auth wmi
[15656.628368] CPU: 2 PID: 2383 Comm: btop Not tainted 6.10.8-arch1-1
#1 a95ab4cbeff058332c57c6b7bbc94a2b00a74ca7
[15656.628370] Hardware name: ASUS System Product Name/ROG STRIX
X670E-E GAMING WIFI, BIOS 2007 04/12/2024
[15656.628371] RIP: 0010:igc_rd32+0x88/0xa0 [igc]
[15656.628374] Code: 48 c7 c6 30 f9 7e c1 e8 56 3a 27 d3 48 8b bd 28
ff ff ff e8 ba 26 ba d2 84 c0 74 c5 89 de 48 c7 c7 58 f9 7e c1 e8 48
c5 4e d2 <0f> 0b eb b3 83 c8 ff e9 47 74 53 d3 66 6
6 2e 0f 1f 84 00 00 00 00
[15656.628375] RSP: 0018:ffffb74248adf338 EFLAGS: 00010286
[15656.628377] RAX: 0000000000000000 RBX: 000000000000c030 RCX: 0000000000000027
[15656.628378] RDX: ffff97821d9219c8 RSI: 0000000000000001 RDI: ffff97821d9219c0
[15656.628379] RBP: ffff9773078cece8 R08: 0000000000000000 R09: ffffb74248adf1b8
[15656.628379] R10: ffff97821d7fffa8 R11: 0000000000000003 R12: 0000000000000000
[15656.628380] R13: 0000000000000000 R14: ffff97731260abc0 R15: 000000000000c030
[15656.628381] FS:  00007113194006c0(0000) GS:ffff97821d900000(0000)
knlGS:0000000000000000
[15656.628382] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[15656.628383] CR2: 00007e621ecb2000 CR3: 00000001153b6000 CR4: 0000000000f50ef0
[15656.628384] PKRU: 55555554
[15656.628385] Call Trace:
[15656.628387]  <TASK>
[15656.628388]  ? igc_rd32+0x88/0xa0 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628391]  ? __warn.cold+0x8e/0xe8
[15656.628393]  ? igc_rd32+0x88/0xa0 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628398]  ? report_bug+0xff/0x140
[15656.628400]  ? console_unlock+0x84/0x130
[15656.628402]  ? handle_bug+0x3c/0x80
[15656.628404]  ? exc_invalid_op+0x17/0x70
[15656.628405]  ? asm_exc_invalid_op+0x1a/0x20
[15656.628408]  ? igc_rd32+0x88/0xa0 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628411]  ? igc_rd32+0x88/0xa0 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628414]  igc_update_stats+0x8a/0x6d0 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628417]  igc_get_stats64+0x85/0x90 [igc
22e0a697bfd5a86bd5c20d279bfffd131de6bb32]
[15656.628420]  dev_get_stats+0x5d/0x130
[15656.628422]  rtnl_fill_stats+0x3b/0x130
[15656.628425]  rtnl_fill_ifinfo.isra.0+0x779/0x1520
[15656.628426]  ? nla_reserve_64bit+0x30/0x40
[15656.628430]  rtnl_dump_ifinfo+0x4af/0x650
[15656.628438]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628439]  ? kmalloc_reserve+0x62/0xf0
[15656.628442]  rtnl_dumpit+0x1c/0x60
[15656.628444]  netlink_dump+0x347/0x3b0
[15656.628449]  __netlink_dump_start+0x1eb/0x310
[15656.628451]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[15656.628452]  rtnetlink_rcv_msg+0x2aa/0x3f0
[15656.628454]  ? __pfx_rtnl_dumpit+0x10/0x10
[15656.628456]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[15656.628457]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[15656.628459]  netlink_rcv_skb+0x50/0x100
[15656.628463]  netlink_unicast+0x240/0x370
[15656.628465]  netlink_sendmsg+0x21b/0x470
[15656.628468]  __sys_sendto+0x201/0x210
[15656.628473]  __x64_sys_sendto+0x24/0x30
[15656.628474]  do_syscall_64+0x82/0x190
[15656.628476]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628477]  ? syscall_exit_to_user_mode+0x72/0x200
[15656.628479]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628480]  ? do_syscall_64+0x8e/0x190
[15656.628482]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628483]  ? seq_read_iter+0x208/0x460
[15656.628485]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628486]  ? update_curr+0x26/0x1f0
[15656.628488]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628489]  ? reweight_entity+0x1c4/0x260
[15656.628490]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628492]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628493]  ? task_tick_fair+0x40/0x420
[15656.628494]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628495]  ? sched_use_asym_prio+0x66/0x90
[15656.628496]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628497]  ? sched_balance_trigger+0x14c/0x340
[15656.628499]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628500]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628501]  ? rcu_accelerate_cbs+0x7a/0x80
[15656.628503]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628504]  ? __note_gp_changes+0x18b/0x1a0
[15656.628506]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628507]  ? note_gp_changes+0x6c/0x80
[15656.628508]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628509]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628510]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628511]  ? __rseq_handle_notify_resume+0xa6/0x490
[15656.628514]  ? srso_alias_return_thunk+0x5/0xfbef5
[15656.628515]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[15656.628517] RIP: 0033:0x71131b12a8e4
[15656.628531] Code: 7d e8 89 4d d4 e8 fc 49 f7 ff 44 8b 4d d0 4c 8b
45 c8 89 c3 44 8b 55 d4 8b 7d e8 b8 2c 00 00 00 48 8b 55 d8 48 8b 75
e0 0f 05 <48> 3d 00 f0 ff ff 77 34 89 df 48 89 45 e
8 e8 49 4a f7 ff 48 8b 45
[15656.628532] RSP: 002b:00007113193ff0b0 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
[15656.628533] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000071131b12a8e4
[15656.628534] RDX: 0000000000000014 RSI: 00007113193ff180 RDI: 0000000000000003
[15656.628535] RBP: 00007113193ff0f0 R08: 00007113193ff140 R09: 000000000000000c
[15656.628536] R10: 0000000000000000 R11: 0000000000000293 R12: 00007113193ff270
[15656.628536] R13: 00007113193ff180 R14: 00007113193ffca8 R15: 00007113193ff780
[15656.628539]  </TASK>
[15656.628540] ---[ end trace 0000000000000000 ]---

I tried reloading the 'igc' module, but that didn't resolve the issue
- then I rebooted as usual and everything was fine again.

My NIC is (from `lspci -vvv`):
0c:00.0 Ethernet controller: Intel Corporation Ethernet Controller
I225-V (rev 03)
       DeviceName: Intel 2.5G LAN
       Subsystem: ASUSTeK Computer Inc. Device 87d2
       Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
       Latency: 0, Cache Line Size: 64 bytes
       Interrupt: pin A routed to IRQ 36
       IOMMU group: 19
       Region 0: Memory at 80100000 (32-bit, non-prefetchable) [size=1M]
       Region 3: Memory at 80200000 (32-bit, non-prefetchable) [size=16K]
       Capabilities: [40] Power Management version 3
               Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
               Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
       Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
               Address: 0000000000000000  Data: 0000
               Masking: 00000000  Pending: 00000000
       Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
               Vector table: BAR=3 offset=00000000
               PBA: BAR=3 offset=00002000
       Capabilities: [a0] Express (v2) Endpoint, IntMsgNum 0
               DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s
<512ns, L1 <64us
                       ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
SlotPowerLimit 0W TEE-IO-
               DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                       RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                       MaxPayload 128 bytes, MaxReadReq 512 bytes
               DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq-
AuxPwr+ TransPend-
               LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L1, Exit
Latency L1 <4us
                       ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
               LnkCtl: ASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
                       ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
               LnkSta: Speed 5GT/s, Width x1
                       TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
               DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
NROPrPrP- LTR+
                        10BitTagComp- 10BitTagReq- OBFF Not Supported,
ExtFmt- EETLPPrefix-
                        EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
                        FRS- TPHComp- ExtTPHComp-
                        AtomicOpsCap: 32bit- 64bit- 128bitCAS-
               DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                        AtomicOpsCtl: ReqEn-
                        IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
                        10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
               LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
                        Compliance Preset/De-emphasis: -6dB
de-emphasis, 0dB preshoot
               LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete- EqualizationPhase1-
                        EqualizationPhase2- EqualizationPhase3-
LinkEqualizationRequest-
                        Retimer- 2Retimers- CrosslinkRes: unsupported
       Capabilities: [100 v2] Advanced Error Reporting
               UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP-
                       ECRC- UnsupReq- ACSViol- UncorrIntErr-
BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                       PoisonTLPBlocked- DMWrReqBlocked- IDECheck-
MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
               UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP-
                       ECRC- UnsupReq- ACSViol- UncorrIntErr-
BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                       PoisonTLPBlocked- DMWrReqBlocked- IDECheck-
MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
               UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+
                       ECRC- UnsupReq- ACSViol- UncorrIntErr+
BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                       PoisonTLPBlocked- DMWrReqBlocked- IDECheck-
MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
               CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr- CorrIntErr- HeaderOF-
               CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+ CorrIntErr- HeaderOF-
               AERCap: First Error Pointer: 14, ECRCGenCap+ ECRCGenEn-
ECRCChkCap+ ECRCChkEn-
                       MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
               HeaderLog: 40001001 0000000f 8020000c 8020000c
       Capabilities: [140 v1] Device Serial Number a0-36-bc-ff-ff-ac-b3-b6
       Capabilities: [1c0 v1] Latency Tolerance Reporting
               Max snoop latency: 0ns
               Max no snoop latency: 0ns
       Capabilities: [1f0 v1] Precision Time Measurement
               PTMCap: Requester+ Responder- Root-
               PTMClockGranularity: 4ns
               PTMControl: Enabled- RootSelected-
               PTMEffectiveGranularity: Unknown
       Capabilities: [1e0 v1] L1 PM Substates
               L1SubCap: PCI-PM_L1.2- PCI-PM_L1.1+ ASPM_L1.2-
ASPM_L1.1+ L1_PM_Substates+
               L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               L1SubCtl2:
       Kernel driver in use: igc

My distribution is Arch Linux.

My motherboard is a ASUS X670E-E running a AMD 7950X CPU and using 64G
of RAM at EXPO 6000 speed.

My kernel is: 6.10.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 09 Sep 2024
02:38:45 +0000 x86_64 GNU/Linux

I'm connected to a 2.5GiB/sec switch that doesn't seem to have any
problems serving other machines when this happens.

I can provide further hardware details upon request, just let me know
what info you need.
I'm perfectly willing to try custom kernels and/or patches, just let
me know what you need me to try/build/test.

Kind regards,
 Jesper Juhl

Reply via email to