Hello,

Recently I've been playing around with a CARP + pfsync + Pound application-level proxy. At a high connection rate I've noticed failed connections, and the application-level proxy marked the backend web servers DEAD, i.e. unreachable.
Pound sits on the gateway, accepts connections from the outside world, and opens connections to the backend servers. The state table grew to about 32K states in total. At a very high rate, when Pound tried to reach a backend server with connect(2), it received an "Operation not permitted" error, which was quite strange. Sometimes there are also "broken pipes", again clustered around those state lookup failures. Digging further, I set pf's debug level to misc and noticed state-table lookup failures right before Pound's connect(2) error messages. It looks like this:

Jul 26 10:46:54 lvs1 kernel: pf: BAD state: TCP 192.168.4.55:80 192.168.4.55:80 192.168.4.251:42688 [lo=3773866253 high=3773932711 win=2003 modulator=155307840 wscale=5] [lo=9137549 high=9201645 win=33304 modulator=2788154389 wscale=1] 9:9 S seq=3822349776 ack=9137549 len=0 ackskew=0 pkts=35:42 dir=in,fwd
Jul 26 10:46:54 lvs1 kernel: pf: State failure on: 1 | 5

There are also lots of operation timeouts and "connection reset by peer" errors. When I disable pf there are far fewer of them.

The pf.conf is the following:

--- BEGIN pf.conf ---
if_ext="em0"
if_vvv="fxp0"
if_sync="em1"

ip_pub="192.168.4.55"
ip_vvv="10.0.0.254"
ip_vvv1="10.0.0.1"
ip_vvv2="10.0.0.2"
ip_vvv3="10.0.0.3"

table <vvv> {$ip_vvv1, $ip_vvv2, $ip_vvv3}

# Options: tune the behavior of pf, default values are given.
set timeout { interval 5, frag 30 }
#set timeout { tcp.first 120, tcp.opening 30, tcp.established 86400 }
set timeout { tcp.closing 900, tcp.finwait 30, tcp.closed 60 }
#set timeout { udp.first 60, udp.single 30, udp.multiple 60 }
#set timeout { icmp.first 20, icmp.error 10 }
#set timeout { other.first 60, other.single 30, other.multiple 60 }
set timeout { adaptive.start 30000, adaptive.end 90000 }
set limit { states 100000, frags 2000 }
#set loginterface none
set block-policy return
set require-order yes
set fingerprints "/etc/pf.os"
set debug misc
set skip on lo0

#scrub in all

rdr on $if_ext proto tcp from any to $ip_pub port 10001 -> $ip_vvv1 port 22
rdr on $if_ext proto tcp from any to $ip_pub port 10002 -> $ip_vvv2 port 22
rdr on $if_ext proto tcp from any to $ip_pub port 10003 -> $ip_vvv3 port 22

block in log on $if_ext all

pass in quick on {$if_ext,$if_vvv} proto vrrp
pass out quick on {$if_ext,$if_vvv} proto vrrp
pass out quick on $if_ext proto udp from any to 192.168.4.200 port 123 keep state
pass in quick on $if_ext proto tcp from any to $if_ext:0 port 22 flags S/SA synproxy state (no-sync)
pass in quick on $if_ext proto tcp from any to $ip_pub port 80 flags S/SA modulate state (no-sync)
pass out quick on $if_ext proto udp from $if_ext:0 to port 53 keep state (no-sync)
pass out quick on $if_ext proto udp from any to port 53 keep state
pass out quick on $if_ext proto tcp from $if_ext:0 to port 80 flags S/SA keep state (no-sync)
pass out quick on $if_ext proto tcp from any to port 80 flags S/SA keep state
pass in quick on $if_ext proto tcp from any to <vvv> port 22 flags S/SA synproxy state
#pass out quick on $if_vvv proto tcp from ($if_vvv) to <vvv> port 80 flags S/SA keep state (no-sync)
pass out quick on $if_vvv proto tcp from ($if_vvv) to {$ip_vvv1,$ip_vvv2,$ip_vvv3} port 80 flags S/SA keep state (no-sync)
--- END pf.conf ---

Here I've played around with the TCP timeouts, scrubbing, the adaptive settings, and swapped the last two rules (table vs. individual rules), but it all led nowhere; nothing changed.

I'm testing the proxy with around 10-15 ab instances (Apache Benchmark, part of the port), with 8 or 16 connections per instance and 500 requests per instance in an infinite loop.

Here's an hour's messages log. pf wasn't enabled for the whole hour, but for more than half of it: http://phoemix.harmless.hu/messages-pffail.0.bz2

The questions are: what can cause this high rate of connection failures? What have I done wrong? What is happening? I've never seen such a thing from pf. How could it be fixed so that pf behaves stably under heavier load?

Sincerely,
Gergely Czuczy
mailto: [EMAIL PROTECTED]

--
Weenies test. Geniuses solve problems that arise.
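PS: Plugging the numbers from the BAD state line above into a simplified version of pf's sequence-window check shows why the packet was flagged: the incoming SYN's sequence number falls outside the first [lo= high=] window printed for the state. (This is only a sketch of the comparison; pf's real check in pf_test_state_tcp also accounts for ackskew and window scaling.)

```shell
# Numbers taken verbatim from the "pf: BAD state" line above: the incoming
# SYN's sequence number vs. the first [lo= high=] window of the stale state.
seq=3822349776
lo=3773866253
high=3773932711
if [ "$seq" -ge "$lo" ] && [ "$seq" -le "$high" ]; then
    echo "seq inside expected window"
else
    echo "seq outside expected window"
fi
# prints: seq outside expected window
```

So the new SYN reuses the same address/port pair as a state that is still in the table (note the 9:9, i.e. both sides already closed), but with sequence numbers the old state doesn't expect.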
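PS: To correlate the kernel's state failures with Pound's connect(2) errors, counting the BAD state events per minute in the messages log helps. A sketch, with a tiny inline sample standing in for the real log (the full log is in the bzip2 linked above):

```shell
# Count "pf: BAD state" events per minute. The heredoc below is a stand-in
# sample; point grep at /var/log/messages (or the unpacked log) instead.
cat <<'EOF' > /tmp/messages.sample
Jul 26 10:46:54 lvs1 kernel: pf: BAD state: TCP 192.168.4.55:80 ...
Jul 26 10:46:54 lvs1 kernel: pf: State failure on: 1 | 5
Jul 26 10:47:12 lvs1 kernel: pf: BAD state: TCP 192.168.4.55:80 ...
EOF
grep 'pf: BAD state' /tmp/messages.sample \
    | awk '{ print $1, $2, substr($3, 1, 5) }' \
    | sort | uniq -c
```

On the sample this reports one event in the 10:46 minute and one in the 10:47 minute; on the real log the per-minute counts can be lined up against the timestamps of Pound's error messages.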