[Bug 236962] Realtek RTL8111/8168/8411 erratically drops network connection

2019-04-27 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236962

--- Comment #3 from mar...@herrbischoff.com ---
Rodney, thanks for the information. I cannot change the NICs since the affected
machines are dedicated servers provided by a commercial data center. Their
pricing is quite competitive, which is probably reflected in the hardware.
However, I have Linux installations running on similar hardware and they
have never displayed this behavior.

I'm aware that FreeBSD values clean implementation over quick hacks and that
issues like this one are probably hard to troubleshoot. On average, the issue
comes up every two months and I haven't been able to reproduce it on demand.
From what I gather, this will likely remain unfixed.

I have asked the data center to switch hardware and see where this gets me. If
this is not possible, I guess I'm up for some long-tail debugging, provided a
team member like you feels this would benefit the project and is prepared to
dive into this with me.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Eugene Grosbein  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eu...@freebsd.org

--- Comment #4 from Eugene Grosbein  ---
(In reply to marcel from comment #3)

I was in exactly the same position, with a cheap hoster's hardware and re0
watchdog timeouts. There is a simple workaround that may be acceptable if the
problem is rare. Add a single line to /etc/syslog.conf:

kern.* |/root/bin/monitor_nic

The simple script /root/bin/monitor_nic just does what the driver is supposed
to do in such a case: it resets the interface to revive it.

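#!/bin/sh
PATH=/bin:/sbin:/usr/bin:/usr/sbin
# syslogd pipes each matching log line to stdin; split it into fields.
while read month day time s host kernel rest
do
  case "$rest" in
  "re0: watchdog timeout")
    # Wait briefly, then take the interface down and bring it back up.
    sleep 5
    ifconfig re0 down
    sleep 1
    ifconfig re0 up
    # Pause before reading the next log line.
    sleep 30
    ;;
  esac
done
# EOF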
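
Before wiring the script into syslogd, the field split can be checked without touching the interface. The following is only a dry-run sketch: it reuses the same read field list as monitor_nic but prints an action instead of calling ifconfig. Both sample lines are hypothetical, and the guess that the extra "s" field holds a priority tag is an assumption, not something stated in this thread.

```shell
#!/bin/sh
# Dry-run version of the matching logic: prints the action instead of
# calling ifconfig, so it is safe to run on any machine.
classify() {
  # Same field split as monitor_nic; "s" absorbs one extra field that
  # sits between the timestamp and the host name in some log formats.
  while read month day time s host kernel rest
  do
    case "$rest" in
    "re0: watchdog timeout")
      echo "reset"
      ;;
    *)
      echo "ignore"
      ;;
    esac
  done
}

# Hypothetical sample lines, not taken from a real log.
with_extra='Apr 27 03:14:15 <kern.crit> myhost kernel: re0: watchdog timeout'
without_extra='Apr 27 03:14:15 myhost kernel: re0: watchdog timeout'

echo "$with_extra" | classify      # extra field present: pattern matches
echo "$without_extra" | classify   # extra field absent: fields shift, no match
```

If the second form is what your syslogd produces, dropping "s" from the read list should line the fields up again. The script also has to be executable, and syslogd has to reread /etc/syslog.conf before the pipe takes effect.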

--- Comment #5 from Eugene Grosbein  ---
You may need to adjust the pattern matching, as your logs have a different
format compared to the logs generated on my FreeBSD 11.2 boxes.
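
One way to make the match less sensitive to the log format is to test the whole line for the message instead of relying on field positions. This is a sketch only; the helper name is_watchdog_timeout is made up for illustration and is not part of the script above.

```shell
#!/bin/sh
# Return 0 if the line mentions a watchdog timeout on the given
# interface, no matter how many fields precede the message.
# is_watchdog_timeout is a hypothetical helper name.
is_watchdog_timeout() {
  iface=$1
  line=$2
  case "$line" in
  *"${iface}: watchdog timeout"*) return 0 ;;
  *) return 1 ;;
  esac
}

# Possible use inside the monitor loop (sketch):
#   while read -r line; do
#     if is_watchdog_timeout re0 "$line"; then
#       ifconfig re0 down; sleep 1; ifconfig re0 up
#     fi
#   done
```

The trade-off is that a substring match is looser: any logged line that happens to contain the same text would also trigger a reset.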

Konstantin Belousov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@freebsd.org

--- Comment #6 from Konstantin Belousov  ---
Some time ago I started using
https://www.gigabyte.com/Motherboard/GA-J3455N-D3H-rev-10#ov
for my home server.  The in-tree driver stops operating with the dreaded
'device timeout', and the official Realtek driver caused some weird hangs of
the whole machine.

I was not able to figure out what is missing in the in-tree driver.  But
for the Realtek code, the cause turned out to be quite silly.  Since the chips
can do jumbo frames but not scatter-gather, the vendor driver always allocated
9K clusters for rx fill, even if the interface was configured for the standard
1500 MTU.  After some time (2-3 weeks for my workload) memory becomes
fragmented enough that the driver cannot refill rx, and due to the interface
mutex, this cascades to everything that touches the network.
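
A rough way to look for this condition on a running system is to watch the allocation failure counter of the 9K cluster zone. The sketch below is not from the report: it assumes the zone is named mbuf_jumbo_9k and that "vmstat -z" prints rows of the form "name: size, limit, used, free, req, fail, sleep". Check both against your release before relying on it.

```shell
#!/bin/sh
# Print the FAIL counter for one UMA zone from "vmstat -z" style input.
# Assumed row layout: "name: size, limit, used, free, req, fail, sleep".
zone_failures() {
  zone=$1
  awk -F'[:,]' -v zone="$zone" '
    {
      # Strip surrounding blanks from the zone name in column 1.
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)
    }
    name == zone {
      # Column 7 is the failure count under the assumed layout.
      fail = $7
      gsub(/[ \t]/, "", fail)
      print fail
    }
  '
}

# On a FreeBSD host (assumption: the zone is called mbuf_jumbo_9k):
#   vmstat -z | zone_failures mbuf_jumbo_9k
```

A counter that keeps growing while the MTU is 1500 would be consistent with the fragmentation described above, though it does not prove that this is the cause.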

I added a knob to disable jumbo and re-imported several revisions of the
vendor driver here:
https://github.com/kostikbel/rere

Since then I have been quite happy running stable/11 for a year without an
issue.
