I'm currently running a redundant firewall solution in our production environment using OpenBSD (version 4.5-stable) with carp(4), pf(4), pfsync(4) and sasyncd(8), which sync the pf state and IPSEC security associations between the two firewalls over a crossover cable. We're also running an ipsec(4) tunnel between our production and internal networks, with a single OpenBSD machine (version 4.5-stable) running pf(4) on our internal network.
In the year since I implemented this solution, we've experienced a problem in which our firewalls begin to act erratically roughly every 4 months, resulting in loss of SSH connectivity, SNMP monitoring failure and the inability to run any command from the console. Despite these problems, both production firewalls are still pingable and continue to filter packets as they should.

     +----| Production Network |----+
     |                              |
 bnx2|                              |bnx2
  +-----+                        +-----+
  | fw1 |-bnx0--------------bnx0-| fw2 |
  +-----+                        +-----+
 bnx1|                              |bnx1
     |                              |
  ---+-------- WAN/Internet --------+---
                    |
              {IPSEC tunnel}
                    |
                 +------+
                 |  fw  |
                 +------+
      +----| Internal Network |----+
      *                            *

These problems can be fixed simply by rebooting the master and then the slave production firewall; however, this is obviously not a long-term solution. Since I'm not able to view or salvage any of the log files, or even run top, while the problem is occurring, I've had a hard time troubleshooting this issue. However, the order of events leading up to the problem seems to be:

1.) Our monitoring reports that the process load of one or both of the firewalls can no longer be checked via SNMP
2.) Our IPSEC tunnel goes down
3.) SSH connectivity fails and console command line usage fails (I'm still able to type a command, but then I'm not able to ctrl-c back to the command line)

Please let me know if you have any ideas why this issue might be occurring.

Thanks in advance.

Regards,
Jeff