Hi Evgeniy,

Thank you for your reply. The states hard limit was the problem. The default limit is quite low :)
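
For anyone finding this thread later, here is a minimal sketch of how the limit can be raised in pf.conf. The value 100000 is only an illustrative assumption, not a figure from this thread; it should be sized to the peak state count seen under load and to the memory available on the box.

```
# /etc/pf.conf (sketch; 100000 is an assumed value, not one from this thread)
# Raise the maximum number of state table entries above the 10000 default.
set limit states 100000
```

After editing, `pfctl -nf /etc/pf.conf` parses the file without loading it, `pfctl -f /etc/pf.conf` loads it, and `pfctl -sm` should then report the new states hard limit. Comparing "current entries" from `pfctl -si` against that limit while the load is running shows how much headroom is left.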
----------
Tim Korn
Network Ninja

On Thu, Jun 2, 2016 at 3:48 AM, Evgeniy Sudyr <eject.in...@gmail.com> wrote:

> Tim,
>
> from your problem description, I suggest checking whether you are
> hitting the states hard limit (note: run these under load, while you
> can reproduce the issue):
>
> pfctl -si
> pfctl -sm
>
> Default limit is: states hard limit 10000
>
> --
> Evgeniy
>
> On Thu, Jun 2, 2016 at 3:29 AM, Tim Korn <tk...@etsy.com> wrote:
> > Hi. I have a pair of OpenBSD boxes (5.8) set up as a core/firewall. I
> > have ten VLANs tied to a physical NIC (Intel 82599). This is a new
> > setup and it was just recently put in service. Traffic was fine (or at
> > least we didn't notice any issues) until a large job was run which
> > roughly doubled traffic going thru the firewall. Traffic rate is still
> > extremely low... roughly 2k packets per second on the interface in
> > question and around 20Mb. I have other identical OpenBSD boxes that
> > don't use VLANs, and they pass multiple gigs of traffic per second, so
> > I'm having a hard time not leaning towards it being a VLAN issue,
> > however I don't know where to look to prove it.
> >
> > If a host in vlan100 pings a host in vlan101 I see packet loss on the
> > first few packets, then all subsequent packets pass. Stopping and
> > restarting the ping results in the same thing: the first few pings are
> > lost, then responses arrive and never fail again until the ping is
> > stopped and restarted. We see this behavior with pretty much any new
> > connection. I can replicate it consistently with ICMP, TCP, and UDP
> > traffic.
> >
> > PF ruleset is quite basic. Simple *pass in* rules on the VLANs and
> > *pass out* is allowed on all interfaces. ICMP has a rule at the top
> > saying "pass log quick proto icmp". I really don't think there's a pf
> > issue of any kind.
> >
> > I've run a tcpdump to confirm that packets come in on vlan100, and
> > never leave vlan101.
> > Here is an example:
> >
> > Ping from host in vlan100 (you can see the seq start at 9. first 8
> > never left the firewall):
> > [root@pakkit ~]# ping 10.95.1.50
> > PING 10.95.1.50 (10.95.1.50) 56(84) bytes of data.
> > 64 bytes from 10.95.1.50: icmp_seq=9 ttl=63 time=0.263 ms
> > 64 bytes from 10.95.1.50: icmp_seq=10 ttl=63 time=0.341 ms
> > 64 bytes from 10.95.1.50: icmp_seq=11 ttl=63 time=0.335 ms
> > 64 bytes from 10.95.1.50: icmp_seq=12 ttl=63 time=0.348 ms
> > 64 bytes from 10.95.1.50: icmp_seq=13 ttl=63 time=0.348 ms
> >
> > tcpdump on vlan100 showing 13 echo requests:
> > [root@pci-ny2-fw1:~ (master)] tcpdump -neti vlan100 host 10.95.0.5 and host 10.95.1.50
> > tcpdump: listening on vlan100, link-type EN10MB
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > ^C
> > 1049 packets received by filter
> > 0 packets dropped by kernel
> >
> > tcpdump on vlan101 showing only 5 echo requests:
> > [root@pci-ny2-fw1:/etc/ (master)] tcpdump -neti vlan101 host 10.95.0.5 and host 10.95.1.50
> > tcpdump: listening on vlan101, link-type EN10MB
> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
> > ^C
> > 1975 packets received by filter
> > 0 packets dropped by kernel
> >
> > Any help would be greatly appreciated. This is causing massive slow
> > downs for all traffic flowing thru this firewall.
> > Thank you for your time.
> >
> > -Tim
>
> --
> --
> With regards,
> Eugene Sudyr