https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217606
Bug ID: 217606
Summary: Bridge stops working after some days
Product: Base System
Version: 11.0-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: freebsd-bugs@FreeBSD.org
Reporter: a...@torrentkino.de

Hello,

we recently upgraded our bridging firewalls from 10.1-RELEASE-pxx to 11.0-RELEASE-p8, and since then they stop passing traffic after some time, in this case after ~4 days. One of them stopped yesterday evening. (We have a failover mechanism to reduce the impact.)

$ uptime
9:26AM up 4 days, 19:22, 2 users, load averages: 0.12, 0.06, 0.01

bridge0 consists of ix0/ix1:

ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k> port 0xecc0-0xecdf mem 0xd9e80000-0xd9efffff,0xd9ff8000-0xd9ffbfff irq 48 at device 0.0 numa-domain 0 on pci2
ix1: <Intel(R) PRO/10GbE PCI-Express Network Driver, Version - 3.1.13-k> port 0xece0-0xecff mem 0xd9f00000-0xd9f7ffff,0xd9ffc000-0xd9ffffff irq 52 at device 0.1 numa-domain 0 on pci2

When the error occurs, I see the following for IPv4. The bridge carries IPv6 as well, with the same problem.

ix0: A load balancer is asking for its default GW. No reply:

$ tcpdump -i ix0 \( arp \)
09:37:47.330361 ARP, Request who-has A.A.A.A tell B.B.B.B, length 46

ix1: The default GW actually sends a reply. I can see it on ix1:

$ tcpdump -i ix1 \( arp \)
09:38:59.328956 ARP, Request who-has A.A.A.A tell B.B.B.B, length 46
09:38:59.329374 ARP, Reply A.A.A.A is-at 00:00:0a:0b:0c:0d (oui Cisco), length 46

A tcpdump on bridge0 shows the same as ix1, so the reply reaches the bridge but apparently is never forwarded back out via ix0.

Some numbers from the currently non-working system:

$ netstat -m
82409/6901/89310 mbufs in use (current/cache/total)
38692/4094/42786/1015426 mbuf clusters in use (current/cache/total/max)
38692/4065 mbuf+clusters out of packet secondary zone in use (current/cache)
0/192/192/507713 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/150433 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84618 16k jumbo clusters in use (current/cache/total/max)
97986K/10681K/108667K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

$ netstat -b -d -h -i bridge0
Name    Mtu Network   Address              Ipkts Ierrs Idrop Ibytes  Opkts Oerrs Obytes Coll Drop
ix0    1.5K <Link#1>  00:00:00:00:00:0a      12G     0     0    11T   7.9G     0   1.1T    0 335k
ix1    1.5K <Link#2>  00:00:00:00:00:0b     7.9G     0     0   1.2T    12G     0    11T    0    0
bridg  1.5K <Link#8>  00:00:00:00:00:0c      20G     0     0    12T    20G  335k    12T    0    0

What I did so far:

# Disable Ethernet flow control
# https://wiki.freebsd.org/10gFreeBSD/Router
dev.ix.0.fc=0
dev.ix.1.fc=0

# Disable TSO
cloned_interfaces="bridge0"
ifconfig_bridge0="addm ix0 addm ix1 up"
ifconfig_ix0="up -tso"
ifconfig_ix1="up -tso"
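The next time one of the bridges locks up I plan to also capture the bridge's learned-address table, the mbuf/UMA zone statistics and the bridge sysctls. Roughly along these lines (listed here for reference only; I have not yet collected this from a box in the failed state):

$ ifconfig bridge0 addr
$ vmstat -z | egrep -i 'mbuf|cluster'
$ sysctl net.link.bridge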
I found the following bug reports:

2004: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=185633
2016: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212749

And since this system uses PF and scrubbing, I applied this patch manually, but with no success so far:

https://reviews.freebsd.org/D7780

Shutting ix0/ix1 down and bringing them back up makes bridge0 responsive again, but time is now working against me.

netstat after that procedure:

$ netstat -m
33281/56284/89565 mbufs in use (current/cache/total)
33280/9756/43036/2015426 mbuf clusters in use (current/cache/total/max)
33280/9730 mbuf+clusters out of packet secondary zone in use (current/cache)
0/192/192/507713 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/150433 9k jumbo clusters in use (current/cache/total/max)
0/0/0/84618 16k jumbo clusters in use (current/cache/total/max)
74880K/34351K/109231K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 sendfile syscalls
0 sendfile syscalls completed without I/O request
0 requests for I/O initiated by sendfile
0 pages read by sendfile as part of a request
0 pages were valid at time of a sendfile request
0 pages were requested for read ahead by applications
0 pages were read ahead by sendfile
0 times sendfile encountered an already busy page
0 requests for sfbufs denied
0 requests for sfbufs delayed

Kind regards,
Aiko
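PS: For reference, the workaround when a bridge locks up is just a manual down/up cycle of the two member interfaces, roughly like this (as root):

# ifconfig ix0 down
# ifconfig ix1 down
# ifconfig ix0 up
# ifconfig ix1 up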