On May 31, 2006, at 5:21 PM, Bachman Kharazmi wrote:
> I have a problem when I do "ifconfig carp1 down" on the MASTER host to test whether BACKUP takes over the traffic.
You're not alone. I have a pair of boxes running 3.8 with pf, carp, and so on. While upgrading them to 3.9 I discovered that failover does not happen properly if I do 'ifconfig carp0 down' or 'ifconfig carp1 down'. :( If I do 'ifconfig rl0 down' (rl0 being the physical interface for carp0), things do fail over as expected. I thought this might be an issue with 3.9, so I did a fresh install of 3.8 on both boxes, and the problem still persists. I have not attempted the upgrade to 3.9 again; no time just yet.
I know that 3.8 and 3.9 boxes can't stay in sync with each other. I'm seeing the problem when both boxes are at the same version, either 3.8 or 3.9. I know it worked at one point in my lab, but that was on 3.7.
When I do fail rl0, state is preserved for connections. I had a call going between my VoIP line and my cell phone when I failed rl0, and the call stayed live. This is all from memory, but I recall that when I failed carp0, connections stopped. I think the master node still had carp1 as master while the second node had carp0 as master, or something like that, so each thought it had half and no connections would work.
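To make the "each thought it had half" condition concrete, here is a minimal Python sketch of the check I have in mind. It is only an illustration: the function name is_split and the dictionaries are hypothetical, and the state strings (INIT, BACKUP, MASTER) would have to be read off each box by hand, for example from what ifconfig reports for each carp interface.

```python
def masters(states):
    """Return the set of carp interfaces this node believes it is MASTER for."""
    return {ifname for ifname, state in states.items() if state == "MASTER"}


def is_split(node_a, node_b):
    """True when both nodes hold at least one MASTER role.

    In a healthy pair, one node is MASTER for every carp interface and
    the other is MASTER for none. If each node is MASTER for some of
    them, traffic can enter through one box and need to leave through
    the other.
    """
    return bool(masters(node_a)) and bool(masters(node_b))


# Hypothetical states, roughly matching what I remember seeing
# after 'ifconfig carp0 down' on the first box.
first_box = {"carp0": "INIT", "carp1": "MASTER"}
second_box = {"carp0": "MASTER", "carp1": "BACKUP"}

# What a clean failover would look like instead (also hypothetical).
clean_first = {"carp0": "INIT", "carp1": "BACKUP"}
clean_second = {"carp0": "MASTER", "carp1": "MASTER"}

print(is_split(first_box, second_box))
print(is_split(clean_first, clean_second))
```

The first pair is the broken case as I recall it, and the second pair is what I would expect if taking down one carp interface moved both roles to the other box.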
I'm at a loss as to why this might be happening. It's an annoying issue, too. If I wanted to take a host out of the pool of firewalls, I'd take down the carp interface and leave the physical interfaces up, so access to the box would still work. Now I can't do that. :(
-Chad