On 24/07/2011, at 8:27 PM, Jonathan Lassoff wrote:

> On Wed, Apr 20, 2011 at 7:10 AM, David Gwynne <l...@animata.net> wrote:
>>
>> On 20/04/2011, at 11:08 PM, Jonathan Lassoff wrote:
>>
>>> On Wed, Apr 20, 2011 at 4:22 AM, David Gwynne <l...@animata.net> wrote:
>>>> you might be able to upgrade your passive firewall to 4.9 next to the active 4.7 one. it looks like the protocol stayed the same so they should be able to talk to each other.
>>>
>>> This would seem to be the case.
>>>
>>> This (http://undeadly.org/cgi?action=article&sid=20090301211402) is an
>>> absolutely excellent bit of writing about the improvements to pfsync,
>>> BTW. Thanks for letting that be shared.
>>>
>>>> however, it looks like bulk updates were broken in 4.7, which would explain your failover problems. you can work around that by going "pfctl -S /dev/stdout | ssh activefw pfctl -L /dev/stdin" as root on the passive fw.
>>>
>>> As an initial seeding of state? It seems to me that only some of my
>>> flows get affected when failing over (not everything is reset and
>>> traffic can still flow).
>>
>> yes. the pfctl commands will do a bulk update since the in kernel implementation was unreliable back then.
>>
>>> It appears that both firewalls have an approximately congruent set of
>>> states, but usually a "pfctl -ss | wc -l" can be off by several
>>> hundred, to several thousand states at times. My hunch is that state
>>> creation and counter updates are not updated synchronously, so when
>>> failing over there are still some updates in-flight, and for flows
>>> that are moving their sequence numbers at a decent clip I could see
>>> why they might get reset.
>>
>> pf has a bit of fuzz when it does its tcp window matching, so packets can get ahead of the firewall and be ok.
>
> Do you know if there is a way to see how much this fuzz is or if
> there's an offset?

from memory it's 1000 bytes.

> If dropped for being out of a window, will (or can) it get logged to pflog?

again, from memory it's just dropped.

>> i wrote defer, so yes...
>>
>> on my boxes the increase in latency is about .2 to .3ms. if a firewall is missing its peer(s) it will go up to about 1/100th of a second.
>
> So does defer wait for a peer to acknowledge a new state just at the
> time of creation, or does it include state updates about sequence
> numbers as well?

defer only delays the first packet.

> I suspect I'm hitting a similar issue as you were with long-lived
> flows getting reset at failover.

i think my problem is that i run both firewalls with the carp demotion counter set low. when a box is rebooted the carp default is at 0 or 1, which means it takes over traffic before it gets all the states. later code in rc.local demotes it, but by that time some packets have been eaten by the new box. i should fix it, but i'm lazy.

>> thats exactly how i have my stuff configured.
>
> Have you ever had trouble when re-numbering an interface? It seems to
> me like ospfd doesn't pick up changes in interface numbering if
> changed out from under it. Most other OSPF daemons I use would pick
> this up as it changes, but as far as I can tell there's no way to tell
> ospfd to reload interface addressing.

interfaces and addresses moving around hurts me too.

> I'm often needing to add more and more interfaces and ospf interfaces,
> necessitating failing over so as to make it safe to kill and re-start
> ospfd -- in the process it just seems to nip some flows from flowing.

i do that too. let's annoy claudio together!
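
For anyone who wants more than a "pfctl -ss | wc -l" count when checking how far the two firewalls have drifted, here is a minimal sketch that compares two saved "pfctl -ss" dumps and reports which states exist on only one side. It is an illustration, not something from this thread: the names load_states and compare_states are made up, and it assumes each state is a single line of text that reads the same on both peers. That assumption may not hold for every field (the per-state TCP status can legitimately differ for a moment), so treat the result as a rough guide.

```python
import sys


def load_states(text):
    """Return the set of non-empty lines in a state dump.

    Runs of whitespace are collapsed so that column padding
    differences between the two dumps do not count as mismatches.
    """
    return {" ".join(line.split()) for line in text.splitlines() if line.strip()}


def compare_states(active_dump, passive_dump):
    """Compare two state dumps given as strings.

    Assumption: one state per line, textually identical on both peers.
    """
    active = load_states(active_dump)
    passive = load_states(passive_dump)
    return {
        "only_active": sorted(active - passive),    # missing on the passive box
        "only_passive": sorted(passive - active),   # missing on the active box
        "common": len(active & passive),            # present on both
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 statediff.py active.txt passive.txt
    with open(sys.argv[1]) as a, open(sys.argv[2]) as p:
        result = compare_states(a.read(), p.read())
    print("common:", result["common"])
    print("only on active:", len(result["only_active"]))
    print("only on passive:", len(result["only_passive"]))
```

The dumps would be produced by redirecting "pfctl -ss" to a file on each firewall at roughly the same moment. Some difference is expected on a busy pair, since states are created and expired between the two snapshots.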