https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209351
Bug ID: 209351
Summary: VLAN TX errors, possible performance regression after 10.1-STABLE (r281235)
Product: Base System
Version: 11.0-CURRENT
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: zclau...@bsd.com.br
CC: freebsd-am...@freebsd.org

I have a BGP router running FreeBSD 10.1-STABLE, r281235, and it has worked fine for several years. After upgrading to any newer version I start getting VLAN TX errors on the exact same hardware, just by booting an SSD with a newer system.

Details:

We route around 4Gbit/s and 1.8Mpps at peak, while each port peaks at 300Kpps.

Our quality metrics are measured with:

ping -s 1472 -i 0.1 <our-other-ibgp-router>

as well as bidirectional iperf.

Systems working without problems:
- 10.1-STABLE / r281235

Systems tested with drops:
- 10.2-STABLE / r292035M
- 10.3-STABLE / r298705
- 11.0-CURRENT / r295683 (downloaded snapshot from ftp.freebsd.org)
- 11.0-CURRENT Melifaro Routing Branch / r297731M

While testing, when errors happen I can see output errs on the vlan port in the output of "netstat -w1 -I vlan6":

            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0

Problems may happen under high load (~200Kpps) or low load (~30Kpps) on a vlan port.

The observed frame loss never happens on untagged ports, only on VLAN interfaces.

The loss occurs with packets of 900 bytes and above, and the loss rate is noticeably higher with packets close to 1400 bytes (1472 is my reference size).

The loss rate on all listed systems other than r281235 is 9-19% with ping(1) and iperf, while it is 0% (no loss or negligible loss) on r281235.
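The per-second samples above can be reduced to a single output error rate. The following is a minimal sketch, not part of the original measurements: it assumes the eight-column layout of "netstat -w1 -I vlan6" shown in this report (input packets, errs, idrops, bytes, then output packets, errs, bytes, colls), and the function names parse_netstat_rows and output_error_rate are hypothetical helpers introduced only for illustration.

```python
# Sketch: summarize output errors from `netstat -w1 -I <ifname>` samples.
# Assumption: each data row has exactly eight numeric columns, in the order
# input(packets errs idrops bytes) output(packets errs bytes colls).

SAMPLE = """\
            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0
"""


def parse_netstat_rows(text):
    """Return a list of (output_packets, output_errs) tuples, one per sample."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the two header lines and anything that is not a full data row.
        if len(fields) != 8 or not all(f.isdigit() for f in fields):
            continue
        values = [int(f) for f in fields]
        rows.append((values[4], values[5]))
    return rows


def output_error_rate(rows):
    """Return (total_packets, total_errs, errs / packets) over all samples."""
    total_packets = sum(packets for packets, _ in rows)
    total_errs = sum(errs for _, errs in rows)
    rate = total_errs / total_packets if total_packets else 0.0
    return total_packets, total_errs, rate


if __name__ == "__main__":
    packets, errs, rate = output_error_rate(parse_netstat_rows(SAMPLE))
    print(f"output packets={packets} errs={errs} rate={rate:.6%}")
```

This only summarizes the interface's output error counter. It is a different measurement from the ping(1) and iperf loss rate quoted above, so the two numbers should not be expected to match.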
Hardware tried:
- Intel 82599EB 10-Gigabit SFI/SFP+ Network Connection (2x2 on x8 PCIe bus, total 4x10G)
- Chelsio T520, 2x2 on x8 PCIe bus, total 4x10G

Both show exactly the same behavior, so the problem is not Intel-specific.

Same hardware:

I always test on the very same hardware. I have two SSD drives in this router: one with 10.1, which runs fine, and the other to test the various versions of FreeBSD.

Sysctl/loader:

Only minor loader and sysctl settings are tweaked:

kern.hz=2000
net.inet.ip.redirect=1 # do not send IP redirects
net.inet.ip.accept_sourceroute=0 # drop source routed packets since they ca
net.inet.ip.sourceroute=0 # if source routed packets are accepted th
net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on initial c
net.inet.udp.blackhole=1 # drop udp packets destined for closed soc
net.inet.tcp.blackhole=2 # drop tcp packets destined for closed por
security.bsd.see_other_uids=0

Netstat output when errors happen:

            input          vlan6           output
   packets  errs idrops      bytes    packets  errs      bytes colls
         1     0      0         66      30557     2   33310968     0
         1     0      0        105      31458     3   33912219     0
         2     0      0       2954      32001     8   34983986     0
         1     0      0       1512      33150     6   35942558     0
         1     0      0       1512      33654     4   37311862     0
         1     0      0       1512      34825     3   38213793     0
         3     0      0       1683      35376     4   39488912     0
         5     0      0       7280      32423     3   35551869     0

No relevant errors happen on the physical ix(4) or cxl(4) ports.

The problem is very easy to reproduce in my environment: I just need to boot a newer system, and very soon some VLANs start to drop packets that are not dropped on 10.1-STABLE. I can be contacted if a developer wants to ssh in. I can also update this PR with more information if needed.