Hi!
When I fail over a two-node CARP + pfsync active/passive firewall pair
with
fw1# ifconfig -g carp carpdemote 50
idle TCP sessions hang.
I noticed that the slave does not honour the 'expires in' values of the
master's states and instead uses pf's default (defined with
set timeout tcp.established n).
In my case the problem arises with rules that set tcp.established
longer than the default: after the switchover the states expire
according to the default tcp.established, so the new master soon
forgets these states and the TCP connections hang.
I don't think this is specific to my hardware, networking gear or
ruleset, but to be sure, I set up two firewalls as ESXi guests with
the packet filter configuration reduced to a more-or-less minimal one,
like this:
if_ext = "em0"
if_int = "em1"
if_mgmt = "em2"
if_carp_ext = "carp181"
if_carp_int = "carp182"
server = "10.80.182.11"
icmp_types = "echoreq"
tcpopts = "flags S/SA modulate state"
tcpopts_llc = "flags S/SA modulate state (tcp.established 600)"
set loginterface $if_ext
set timeout tcp.established 300
set skip on lo
block in log on $if_ext label "NIext_default"
block out log on $if_ext label "NOext_default"
block in log on $if_int label "NIint_default"
block out log on $if_int label "NOint_default"
block in log on $if_mgmt label "NIint_default"
block out log on $if_mgmt label "NOint_default"
pass quick on $if_ext proto carp keep state (no-sync)
pass quick on $if_int proto carp keep state (no-sync)
pass quick on $if_mgmt proto pfsync keep state (no-sync)
pass in quick on $if_ext inet proto tcp to $server port { 22 } tag TO_SERVER $tcpopts_llc label "YIext_to_server"
pass in quick on $if_int inet proto tcp from $server port { 22 } tag FROM_SERVER $tcpopts_llc label "YIint_from_server"
pass quick inet proto icmp icmp-type echoreq label "pinging"
pass out quick on $if_int tagged TO_SERVER $tcpopts_llc label "YOint to http server"
pass out quick on $if_ext tagged FROM_SERVER $tcpopts_llc label "YOext from http server"
pass in quick on $if_mgmt inet from 172.19/16 keep state (no-sync)
pass out on $if_mgmt inet from $if_mgmt label "JVext_from_tm_to_mgmt" keep state (no-sync)
pass out on $if_int inet from $if_int label "JVext_from_tm_to_int" keep state (no-sync)
pass out on $if_ext inet from $if_ext label "JVext_from_tm_to_ext" keep state (no-sync)
CARP and pfsync are configured like this:
fw1# cat /etc/hostname.pfsync0
up syncdev em2 syncpeer 10.0.13.159
fw1# cat /etc/hostname.carp18*
inet 10.80.181.1 255.255.255.0 10.80.181.255 advskew 120 vhid 181 carpdev em0 pass lanpw181 description internet
inet 10.80.182.1 255.255.255.0 10.80.182.255 advskew 120 vhid 182 carpdev em1 pass lanpw182 description intranet
(I explore states with pftop and pfctl -vvvss).
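
To compare the remaining lifetimes on both nodes, something like the
following could be used. It is only a sketch: expires_seconds is a
made-up helper name, and it assumes that the verbose state listing
prints a line of the form "age HH:MM:SS, expires in HH:MM:SS, ..."
for each state.

```shell
#!/bin/sh
# expires_seconds (hypothetical helper): read verbose pf state output on
# stdin and print the remaining lifetime of each state in seconds.
# Assumes each state has a line containing "expires in HH:MM:SS,".
expires_seconds() {
    awk '
    / expires in / {
        for (i = 1; i <= NF - 2; i++) {
            if ($i == "expires" && $(i + 1) == "in") {
                t = $(i + 2)
                sub(/,$/, "", t)          # drop the trailing comma
                split(t, p, ":")          # p[1]=hours p[2]=minutes p[3]=seconds
                print p[1] * 3600 + p[2] * 60 + p[3]
            }
        }
    }'
}
```

Run as "pfctl -vvvss | expires_seconds | sort -n" on each node, a state
created by a rule with (tcp.established 600) should show up with
clearly different values on master and slave if the per-rule timeout
is not carried over.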
And to make matters worse, with CARP + pfsync running, long TCP
sessions hang anyway, i.e. even if the master stays master (and isn't
switched over), because the slave, with its smaller 'expires in'
values, also clears the respective states from the operational master.
I can think of two workarounds:
1. do not configure tcp.established per rule (although I have grown
used to giving ssh sessions through the firewall a longer lifetime
than, say, http states: 10 days vs 30 minutes)
2. create pfsync0 devices only when needed and take care not to use
them for longer than the smallest tcp.established in effect (this is
what I am doing now)
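
For the test ruleset above, workaround 1 would look roughly like this.
It is a sketch only: the value 600 is taken from the per-rule setting
in the test ruleset, and it assumes that both nodes load the same
pf.conf so that their defaults agree.

```
# Workaround 1 (sketch): no per-rule timeout, one global value on both nodes.
tcpopts_llc = "flags S/SA modulate state"
# Use the longest lifetime any rule needs, since it now applies to every
# established TCP state.
set timeout tcp.established 600
```

The obvious cost is that all established TCP states, short-lived http
ones included, are then kept for the long timeout.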
I would appreciate comments on whether I am still doing something wrong
pf-wise, whether there are some knobs I am unaware of, or whether
carp+pfsync+pf really needs some more dev-love :)
Best regards
Imre