Good to know it helped. You should probably also look at "set optimization aggressive": it expires idle states sooner, so it will also reduce the number of states, if that works for your use cases.
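
For reference, a minimal pf.conf sketch of the two settings discussed in this thread. The value 100000 is only an illustrative assumption, not a figure anyone here measured; size it from what "pfctl -si" reports under load.

```
# /etc/pf.conf (fragment) -- illustrative sketch, values are assumptions

# Raise the state table hard limit above the default of 10000.
# 100000 is an example value only; pick one based on observed state counts.
set limit states 100000

# Expire idle states sooner. This shrinks the state table, but
# long-idle connections may be dropped early.
set optimization aggressive
```

The file can be syntax-checked without loading it using "pfctl -nf /etc/pf.conf", loaded with "pfctl -f /etc/pf.conf", and the resulting limits confirmed with "pfctl -sm".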
--
Evgeniy

On Thu, Jun 2, 2016 at 2:40 PM, Tim Korn <tk...@etsy.com> wrote:
> Hi Evgeniy,
> Thank you for your reply. The states hard limit was the problem. The
> default limit is quite low :)
>
>
> ----------
> Tim Korn
> Network Ninja
>
>
> On Thu, Jun 2, 2016 at 3:48 AM, Evgeniy Sudyr <eject.in...@gmail.com> wrote:
>>
>> Tim,
>>
>> from your problem description I can suggest you check whether you are
>> hitting the states hard limit with (note - during load, when you can
>> reproduce the issue):
>>
>> pfctl -si
>> pfctl -sm
>>
>> Default limit is: states hard limit 10000
>>
>> --
>> Evgeniy
>>
>> On Thu, Jun 2, 2016 at 3:29 AM, Tim Korn <tk...@etsy.com> wrote:
>> > Hi. I have a pair of OpenBSD boxes (5.8) set up as a core/firewall.
>> > I have ten VLANs tied to a physical NIC (Intel 82599). This is a new
>> > setup and it was just recently put in service. Traffic was fine (or
>> > at least we didn't notice any issues) until a large job was run which
>> > roughly doubled traffic going thru the firewall. Traffic rate is
>> > still extremely low... roughly 2k packets per second on the interface
>> > in question and around 20Mb. I have other identical OpenBSD boxes
>> > that don't use VLANs, and they pass multiple gigs of traffic per
>> > second, so I'm having a hard time not leaning towards it being a VLAN
>> > issue, however I don't know where to look to prove it.
>> >
>> > If a host in vlan100 pings a host in vlan101 I see packet loss on the
>> > first few packets, then all subsequent packets pass. Stopping and
>> > restarting the ping results in the same thing....first few pings
>> > lost, then responses and never fail again until the ping is stopped
>> > and restarted. We see this behavior with pretty much any new
>> > connection. I can replicate it consistently with ICMP, TCP, and UDP
>> > traffic.
>> >
>> > PF ruleset is quite basic. Simple *pass in* rules on the VLANs and
>> > *pass out* is allowed on all interfaces. ICMP has a rule at the top
>> > saying "pass log quick proto icmp". I really don't think there's a pf
>> > issue of any kind.
>> >
>> > I've run a tcpdump to confirm that packets come in on vlan100, and
>> > never leave vlan101. Here is an example:
>> >
>> > Ping from host in vlan100 (you can see the seq start at 9. first 8
>> > never left the firewall):
>> > [root@pakkit ~]# ping 10.95.1.50
>> > PING 10.95.1.50 (10.95.1.50) 56(84) bytes of data.
>> > 64 bytes from 10.95.1.50: icmp_seq=9 ttl=63 time=0.263 ms
>> > 64 bytes from 10.95.1.50: icmp_seq=10 ttl=63 time=0.341 ms
>> > 64 bytes from 10.95.1.50: icmp_seq=11 ttl=63 time=0.335 ms
>> > 64 bytes from 10.95.1.50: icmp_seq=12 ttl=63 time=0.348 ms
>> > 64 bytes from 10.95.1.50: icmp_seq=13 ttl=63 time=0.348 ms
>> >
>> >
>> >
>> > tcpdump on vlan100 showing 13 echo requests:
>> > [root@pci-ny2-fw1:~ (master)] tcpdump -neti vlan100 host 10.95.0.5 and host 10.95.1.50
>> > tcpdump: listening on vlan100, link-type EN10MB
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 00:0c:29:16:f7:bf 00:00:5e:00:01:64 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1b:d8 00:0c:29:16:f7:bf 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > ^C
>> > 1049 packets received by filter
>> > 0 packets dropped by kernel
>> >
>> >
>> > tcpdump on vlan101 showing only 5 echo requests:
>> > [root@pci-ny2-fw1:/etc/ (master)] tcpdump -neti vlan101 host 10.95.0.5 and host 10.95.1.50
>> > tcpdump: listening on vlan101, link-type EN10MB
>> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > 24:6e:96:04:1b:d8 24:6e:96:04:1c:84 0800 98: 10.95.0.5 > 10.95.1.50: icmp: echo request (DF)
>> > 24:6e:96:04:1c:84 00:00:5e:00:01:65 0800 98: 10.95.1.50 > 10.95.0.5: icmp: echo reply
>> > ^C
>> > 1975 packets received by filter
>> > 0 packets dropped by kernel
>> >
>> > Any help would be greatly appreciated. This is causing massive
>> > slowdowns for all traffic flowing thru this firewall. Thank you for
>> > your time.
>> >
>> > -Tim
>> >
>>
>>
>>
>> --
>> --
>> With regards,
>> Eugene Sudyr
>
>

--
--
With regards,
Eugene Sudyr