Hi, all. I've been having sporadic and serious problems with the Realtek gigabit interface built into my motherboard. Periodically, it just freezes up. I've tried several things to no avail: turning on DEVICE_POLLING, frobbing bootloader options and sysctl settings, etc.
I had a solid week of function with the following:

    hw.re.msi_disable="1"
    hw.re.msix_disable="1"
    dev.re.0.int_rx_mod=0   <-- this one says it can be a loader tuneable,
                                but it didn't work that way - I had to set
                                it from sysctl.conf

And then after a reboot, I locked up again on pushing the interface a little with an rsync. However, I've seen interactive sessions lock the thing up too. It's not just when I'm doing big transfers.

It's not clear what's happening. I have been capturing stats periodically with 'sysctl dev.re.0.stats=1', but that doesn't always show a problem. For instance, during one of the lock-ups last night, after a reboot, I got this:

    re0 statistics:
    Tx frames : 171306
    Rx frames : 20271
    Tx errors : 0
    Rx errors : 0
    Rx missed frames : 0
    Rx frame alignment errs : 0
    Tx single collisions : 0
    Tx multiple collisions : 0
    Rx unicast frames : 20271
    Rx broadcast frames : 0
    Rx multicast frames : 0
    Tx aborts : 0
    Tx underruns : 0

After running overnight, with sporadic automated transfers:

    re0 statistics:
    Tx frames : 4658945
    Rx frames : 1258514
    Tx errors : 0
    Rx errors : 33
    Rx missed frames : 0
    Rx frame alignment errs : 3591
    Tx single collisions : 0
    Tx multiple collisions : 0
    Rx unicast frames : 1255880
    Rx broadcast frames : 2411
    Rx multicast frames : 223
    Tx aborts : 0
    Tx underruns : 0

I was seeing the "Rx multicast frames" creep up each time I saw a freeze last night, which was confusing in that I'm not sure why there'd be any multicast traffic.

Here's the card from dmesg, with MSI/X turned off:

    re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xe800-0xe8ff mem 0xfbfff000-0xfbffffff,0xfbff8000-0xfbffbfff irq 18 at device 0.0 on pci2
    re0: Chip rev. 0x2c000000
    re0: MAC rev. 0x00200000
    miibus0: <MII bus> on re0
    rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
    rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
    re0: Ethernet address: bc:ae:c5:bd:44:e7

The motherboard with this included:

    Base Board Information
        Manufacturer: ASUSTeK Computer INC.
        Product Name: M4A88T-M
        Version: Rev X.0x
        Serial Number: MF70B1G04201588
        Asset Tag: To Be Filled By O.E.M.
        Features:
            Board is a hosting board
            Board is replaceable
        Location In Chassis: To Be Filled By O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

In general I've been saying "ifconfig re0 down ; ifconfig re0 up" to kick the interface, but last night a friendly person from IRC mentioned that I could work around this by running a steady ping and frobbing mediatype when I see the pings fail. So, I've got this running:

    while true
    do
        # one ping with a one-second timeout; failure means the link is stuck
        ping -c 1 -t 1 firewall > /dev/null 2>&1
        if [ $? -ne 0 ]; then
            date
            echo "toggling re0"
            echo
            # force a media change, then return to autoselect
            ifconfig re0 media 1000baseT mediaopt full-duplex,flowcontrol,master
            ifconfig re0 media autoselect mediaopt flowcontrol
            sleep 3
        fi
        sleep 1
    done

This has been noting failures sporadically throughout the day, but it's allowing traffic to continue moving, albeit with the occasional hiccough.

This hardware has been running Debian for a couple years, and it's never had so much as a short hiccough, so I have confidence that the hardware is fine. It suggests that there's something the Linux driver is doing to handle this hardware that FreeBSD isn't doing. For a while I was dual-booting and I'd see errors with FreeBSD running that weren't there under Debian.
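For what it's worth, here is a minimal sketch of how two of the statistics dumps quoted above could be compared to see which counters moved across a freeze. It assumes each dump has been copied into its own file, one "Name : value" pair per line as shown earlier; the function name stats_diff and the file names are hypothetical, not part of any FreeBSD tool.

```sh
#!/bin/sh
# stats_diff BEFORE AFTER
# Print every counter whose value differs between two saved
# "re0 statistics:" dumps. (Hypothetical helper, for illustration only.)
stats_diff() {
    awk -F ' *: *' '
        # Skip the header line and anything that is not "name : integer".
        NF != 2 || $2 !~ /^[0-9]+[ \t]*$/ { next }
        {
            key = $1
            sub(/^[ \t]+/, "", key)      # tolerate indented dumps
            val = $2 + 0                 # force a numeric comparison
        }
        # First file: remember each counter.
        NR == FNR { before[key] = val; next }
        # Second file: report counters that changed.
        (key in before) && val != before[key] {
            printf "%s: %d -> %d (delta %d)\n", key, before[key], val, val - before[key]
        }
    ' "$1" "$2"
}
```

Usage would be something like "stats_diff before.txt after.txt". Counters that are unchanged are not printed, so a quiet interface produces no output.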
I'd started diving into the source, both Linux and FreeBSD, but I lack sufficient exposure to ethernet driver code to be able to get a high-level picture of what they're doing, and as such I haven't yet noticed any special-case or hardware glitch handling that we're missing, although I might find something eventually.

I'm struggling with finding a way to see what's actually happening with this. I've toggled MSI and MSI-X handling, I've turned down interrupt handling delays, I've tried both I/O and memory register transfers, although I'm not actually clear what's happening differently there. I've had polling variously enabled and disabled.

One thing to note is that last night's horror while I was trying to move some back-up data was after rebooting from Windows. (Installed on a partition for gaming...) It made me wonder if we're not fully setting up some state on the card. I'd had what felt like a solid, glitchless week before that.

FWIW, I'm running 10.1-RC3 on this box and I've seen issues from early on while I was still running 10.0-RELEASE.

Thanks in advance for clues. This is a showstopper for further deployment for me, as I've got these Realtek on-board cards in several boxes, and while the media frobbing largely works, it's not something I can inflict on my users.

-- 
Mason Loring Bliss    (( If I have not seen as far as others, it is because
ma...@blisses.org     )) giants were standing on my shoulders. - Hal Abelson
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"