Just building off my last message. Answering Ryan's questions first:

- Do you have dedicated addresses on the carp parent interfaces?
  For sure.

- Are all the carp devices on the master firewall MASTER; what about the backup?
  Before and after the network dies, the primary firewall is all MASTER and the secondary stays as BACKUP.

- Can you reach the 'disappearing' network from the backup firewall?
  Yes.

- Is preemption enabled? (sysctl net.inet.carp.preempt=1)
  Yes.

- What is the output of 'netstat -sp carp' on both the master and backup firewalls?
  Have it below.

- What about the output of 'netstat -i'? Are there output errors on the offending interface?
  Exact output below, but no errors in or out, before or after.

- Have you tried running with carp debugging turned on? (sysctl net.inet.carp.log=1)
  Did this on both firewalls, didn't see output one way or the other. Restarted with it in sysctl.conf just to be sure, but still didn't see anything.

What further I know:

- set debug loud: lots of output, but nothing looks different while the problem is present.

- From the "dead" network, if I ping the firewall, tcpdump shows the firewall making an arp request for the originating machine:

        18:17:50.015307 arp who-has 172.168.120.50 tell 172.168.120.2

  172.168.120.50 is the machine on the dead network, which was trying to ping the firewall. This would lead me to believe the firewall saw -something-. Lots of traffic is trying to go to that network, but none comes back from it.

- I can ping the dead interface locally.

- Bringing the interface down and up doesn't help.

- From the firewall itself, I can hang that interface. Before, I was doing it from my desktop, through the firewall.

Ifconfig explanation:

gem0 - external
gem1 - 120.x - network that "disappears"
hme0 - 0.x - pfsync traffic
hme1 - 121.x - Network my terminal is on
hme2 - 119.x

My ifconfig -A output from the master firewall:

$ ifconfig -A
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33192
        groups: lo
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
gem0: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:03:ba:f2:bc:1c
        groups: egress
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 216.2.22.123 netmask 0xffffffe0 broadcast 216.82.41.127
        inet6 fe80::203:baff:fef2:bc1c%gem0 prefixlen 64 scopeid 0x1
gem1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:03:ba:f2:bc:1d
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.168.120.2 netmask 0xffffff00 broadcast 172.168.120.255
        inet6 fe80::203:baff:fef2:bc1d%gem1 prefixlen 64 scopeid 0x2
hme0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:ee:66:60
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
        inet6 fe80::a00:20ff:feee:6660%hme0 prefixlen 64 scopeid 0x3
hme1: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:ee:66:61
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.168.121.2 netmask 0xffffff00 broadcast 172.168.121.255
        inet6 fe80::a00:20ff:feee:6661%hme1 prefixlen 64 scopeid 0x4
hme2: flags=8b63<UP,BROADCAST,NOTRAILERS,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:ee:66:62
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet 172.168.119.2 netmask 0xffffff00 broadcast 172.168.119.255
        inet6 fe80::a00:20ff:feee:6662%hme2 prefixlen 64 scopeid 0x5
hme3: flags=8822<BROADCAST,NOTRAILERS,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:20:ee:66:63
        media: Ethernet autoselect
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33192
pfsync0: flags=41<UP,RUNNING> mtu 1348
        pfsync: syncdev: hme0 maxupd: 128
enc0: flags=0<> mtu 1536
tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
        groups: tun
        inet 172.168.123.1 --> 172.168.123.2 netmask 0xffffffff
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev gem0 vhid 1 advbase 1 advskew 0
        groups: carp
        inet 216.82.41.116 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.97 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.98 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.117 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.118 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.119 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.120 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.125 netmask 0xffffffe0 broadcast 216.82.41.127
        inet 216.82.41.126 netmask 0xffffffe0 broadcast 216.82.41.127
carp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev gem1 vhid 2 advbase 1 advskew 0
        groups: carp
        inet 172.168.120.1 netmask 0xffffff00 broadcast 172.168.120.255
carp2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev hme1 vhid 3 advbase 1 advskew 0
        groups: carp
        inet 172.168.121.1 netmask 0xffffff00 broadcast 172.168.121.255
carp3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        carp: MASTER carpdev hme2 vhid 4 advbase 1 advskew 0
        groups: carp
        inet 172.168.119.1 netmask 0xffffff00 broadcast 172.168.119.255

Now for gratuitous output.

On the MASTER firewall, before killing it:

$ sudo pfctl -s info
Status: Enabled for 0 days 00:04:13           Debug: Urgent

Interface Stats for gem0              IPv4             IPv6
  Bytes In                        10942552                0
  Bytes Out                         376220              352
  Packets In
    Passed                            9496                0
    Blocked                            359                0
  Packets Out
    Passed                            6099                3
    Blocked                              0                2

State Table                          Total             Rate
  current entries                      318
  searches                           37680          148.9/s
  inserts                              640            2.5/s
  removals                             322            1.3/s
Counters
  match                               1160            4.6/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                         0            0.0/s
  state-insert                           1            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

$ netstat -sp carp
carp:
        8 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                0 discarded for bad vhid
                0 discarded because of a bad address list
        1040 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error

$ netstat -i
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
...
gem1    1500  <Link>      00:03:ba:f2:bc:1d      752     0     1122     0     0
gem1    1500  172.168.120 carp0                  752     0     1122     0     0
gem1    1500  fe80::%gem1 fe80::203:baff:fe      752     0     1122     0     0
...

On the MASTER after killing it:

$ sudo pfctl -s info
Status: Enabled for 0 days 00:07:41           Debug: Urgent

Interface Stats for gem0              IPv4             IPv6
  Bytes In                        16115704                0
  Bytes Out                         557189              352
  Packets In
    Passed                           14501                0
    Blocked                            670                0
  Packets Out
    Passed                            9332                3
    Blocked                              0                2

State Table                          Total             Rate
  current entries                      240
  searches                           63819          138.4/s
  inserts                              770            1.7/s
  removals                             530            1.1/s
Counters
  match                               1887            4.1/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                            0            0.0/s
  state-mismatch                         9            0.0/s
  state-insert                           5            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

$ netstat -sp carp
carp:
        8 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                0 discarded for bad vhid
                0 discarded because of a bad address list
        1896 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error

$ netstat -ni
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
...
gem1    1500  <Link>      00:03:ba:f2:bc:1d    10313     0     2881     0     0
gem1    1500  172.168.120 172.168.120.2        10313     0     2881     0     0
gem1    1500  fe80::%gem1 fe80::203:baff:fe    10313     0     2881     0     0
...

BACKUP firewall before:

$ netstat -sp carp
carp:
        1084 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                0 discarded for bad vhid
                0 discarded because of a bad address list
        168 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error

BACKUP after:

$ netstat -sp carp
carp:
        2512 packets received (IPv4)
        0 packets received (IPv6)
                0 packets discarded for bad interface
                0 packets discarded for wrong TTL
                0 packets shorter than header
                0 discarded for bad checksums
                0 discarded packets with a bad version
                0 discarded because packet too short
                0 discarded for bad authentication
                0 discarded for bad vhid
                0 discarded because of a bad address list
        168 packets sent (IPv4)
        0 packets sent (IPv6)
                0 send failed due to mbuf memory error
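
For anyone who wants to compare the before/after 'netstat -sp carp' dumps without eyeballing them, here is a rough Python sketch that parses two dumps and prints only the counters that changed. The function names (parse_carp_stats, diff_stats) are made up for illustration, and the sample strings are trimmed to the received/sent lines from the master and backup output above. It assumes every counter line has the form "<number> <description>", as in the output shown.

```python
import re

def parse_carp_stats(text):
    """Map each counter description to its integer value.

    Assumes each counter line looks like '<number> <description>';
    lines that do not start with a number (e.g. 'carp:') are skipped.
    """
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats

def diff_stats(before, after):
    """Return {description: after - before} for counters that changed."""
    changed = {}
    for key in sorted(before.keys() | after.keys()):
        delta = after.get(key, 0) - before.get(key, 0)
        if delta != 0:
            changed[key] = delta
    return changed

# Trimmed samples copied from the output in this message.
master_before = """carp:
    8 packets received (IPv4)
    0 packets received (IPv6)
    1040 packets sent (IPv4)
    0 packets sent (IPv6)
"""
master_after = """carp:
    8 packets received (IPv4)
    0 packets received (IPv6)
    1896 packets sent (IPv4)
    0 packets sent (IPv6)
"""
backup_before = """carp:
    1084 packets received (IPv4)
    168 packets sent (IPv4)
"""
backup_after = """carp:
    2512 packets received (IPv4)
    168 packets sent (IPv4)
"""

master_delta = diff_stats(parse_carp_stats(master_before),
                          parse_carp_stats(master_after))
backup_delta = diff_stats(parse_carp_stats(backup_before),
                          parse_carp_stats(backup_after))

print("master:", master_delta)
print("backup:", backup_delta)
```

On these samples the only counters that move are "packets sent (IPv4)" on the master and "packets received (IPv4)" on the backup; none of the discard counters change.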