>-----Original Message-----
>From: linux-kernel-ow...@vger.kernel.org
>[mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Justin Piszcz
>Sent: Sunday, February 22, 2015 4:01 AM
>To: linux-kernel@vger.kernel.org
>Subject: 3.19: ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
>
>Hello,
>
>Kernel: 3.19.0
>Issue: When using robocopy to copy files (from Windows 8/8.1) to
>Linux/samba, the 10GbE NIC resets - dmesg [1] below. To get it back working
>again, I have to down/up the interface. Jumbo frames are being used (mtu of
>9014) on each side. The lspci output is listed below. Are there any other
>recommended workarounds for this issue as LRO is already off for me as shown
>below. When using Linux<->Linux with rsync or NFS, there are no errors with
>10GbE. When using Samba<->Windows 8 over 10GbE, this issue occurs
>persistently as shown below when a copy is running.
>
># ethtool -k eth4|grep large
>large-receive-offload: off [fixed]

The issue is a Tx timeout, so LRO is unlikely to have an effect. Is the
interface that hangs (eth4) mostly receiving or transmitting? Posting the
stats (ethtool -S eth4) would help here.

>There is/was a similar issue as reported here:
>https://communities.intel.com/message/207408
>
> [1] dmesg
>
> [538576.098186] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
> [541013.223961] ------------[ cut here ]------------
> [541013.223970] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:303
> dev_watchdog+0x227/0x230()
> [541013.223971] NETDEV WATCHDOG: eth4 (ixgbe): transmit queue 0 timed out
> [541013.223972] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.19.0 #2
> [541013.223973] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.0a
> 12/05/2013
> [541013.223974] ffffffff81d3a6ae ffff88107fc03da8 ffffffff819d07d7
> ffffffff81e34d98
> [541013.223976] ffff88107fc03df8 ffff88107fc03de8 ffffffff810dbdab
> 0000000000000000
> [541013.223977] 0000000000000000 ffff881036304000 0000000000000000
> 0000000000000010
> [541013.223979] Call Trace:
> [541013.223979] <IRQ> [<ffffffff819d07d7>] dump_stack+0x45/0x57
> [541013.223985] [<ffffffff810dbdab>] warn_slowpath_common+0x7b/0xc0
> [541013.223987] [<ffffffff810dbe61>] warn_slowpath_fmt+0x41/0x50
> [541013.223990] [<ffffffff810eec4c>] ? __queue_work+0xfc/0x290
> [541013.223996] [<ffffffff818ef0a7>] dev_watchdog+0x227/0x230
> [541013.223997] [<ffffffff818eee80>] ? qdisc_rcu_free+0x40/0x40
> [541013.223998] [<ffffffff818eee80>] ? qdisc_rcu_free+0x40/0x40
> [541013.224001] [<ffffffff811251f7>] call_timer_fn.isra.29+0x17/0x80
> [541013.224002] [<ffffffff81125429>] run_timer_softirq+0x1c9/0x280
> [541013.224004] [<ffffffff810dec7f>] __do_softirq+0xff/0x200
> [541013.224005] [<ffffffff810deea6>] irq_exit+0x76/0xa0
> [541013.224007] [<ffffffff8106ac11>] smp_apic_timer_interrupt+0x41/0x50
> [541013.224009] [<ffffffff819da6aa>] apic_timer_interrupt+0x6a/0x70
> [541013.224009] <EOI> [<ffffffff8184e8f8>] ?
> cpuidle_enter_state+0x48/0xc0
> [541013.224013] [<ffffffff8184e8ed>] ? cpuidle_enter_state+0x3d/0xc0
> [541013.224014] [<ffffffff8184ea42>] cpuidle_enter+0x12/0x20
> [541013.224017] [<ffffffff8110f222>] cpu_startup_entry+0x272/0x2f0
> [541013.224018] [<ffffffff819cdd5d>] rest_init+0x6d/0x70
> [541013.224021] [<ffffffff81ef0dbb>] start_kernel+0x353/0x360
> [541013.224022] [<ffffffff81ef0495>] x86_64_start_reservations+0x2a/0x2c
> [541013.224023] [<ffffffff81ef055f>] x86_64_start_kernel+0xc8/0xcc
> [541013.224024] ---[ end trace 59877113cf8b7358 ]---
> [541013.224026] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
> [541013.224036] ixgbe 0000:01:00.0 eth4: Reset adapter
> [541020.099402] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
>
> ( .. it continue but without the trace later .. )
>
> [567457.771728] ixgbe 0000:01:00.0 eth4: NIC Link is Down
> [567458.140112] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
> [567561.611941] ixgbe 0000:01:00.0 eth4: NIC Link is Down
> [567568.188422] ixgbe 0000:01:00.0 eth4: NIC Link is Up 10 Gbps, Flow
> Control: RX/TX
> [570130.483823] ixgbe 0000:01:00.0 eth4: initiating reset due to tx timeout
> [570130.483924] ixgbe 0000:01:00.0 eth4: Reset adapter

The reset is a side effect of the Tx hang - the driver is trying to recover
from the hang by resetting the interface.

If you could open up a ticket at e1000.sf.net with details about your setup
and how you configure the interfaces that would help us get a better idea of
the issue. You can also upload the stats, kernel config and any other logs
that may be relevant.

Thanks,
Emil
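
P.S. For anyone gathering the stats requested above, here is a rough Python sketch of one way to summarize `ethtool -S eth4` output: it parses the "name: value" lines, compares rx_bytes with tx_bytes to answer the "mostly receiving or transmitting" question, and lists the non-zero tx_* counters. The function names and the sample values are made up for illustration, and it assumes the driver exposes rx_bytes and tx_bytes counters under those names. It is not a substitute for posting the full stats.

```python
import re

# Made-up sample in the general shape of `ethtool -S <iface>` output.
# The values are illustrative only, not taken from the reported system.
SAMPLE = """NIC statistics:
     rx_packets: 1000
     tx_packets: 200
     rx_bytes: 9000000
     tx_bytes: 120000
     tx_timeout_count: 2
"""


def parse_ethtool_stats(text):
    """Parse 'name: value' counter lines into a {name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # The "NIC statistics:" header has no integer value and is skipped.
        match = re.match(r"^\s*([^:]+):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def traffic_direction(stats):
    """Classify the interface by comparing byte counters (assumed names)."""
    rx = stats.get("rx_bytes", 0)
    tx = stats.get("tx_bytes", 0)
    if rx == tx:
        return "balanced"
    return "mostly receiving" if rx > tx else "mostly transmitting"


def nonzero_tx_counters(stats):
    """Return non-zero tx_* counters other than the plain traffic totals."""
    totals = {"tx_bytes", "tx_packets"}
    return {
        name: value
        for name, value in stats.items()
        if name.startswith("tx_") and name not in totals and value != 0
    }


if __name__ == "__main__":
    parsed = parse_ethtool_stats(SAMPLE)
    print(traffic_direction(parsed))
    print(nonzero_tx_counters(parsed))
```

To run it against a live interface, the text could be read from the command itself, e.g. subprocess.run(["ethtool", "-S", "eth4"], capture_output=True, text=True).stdout, and passed to parse_ethtool_stats().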