Fantastic,

Thanks Stuart, that was really helpful!

Without even knowing it, your suggestion (manipulating carpdemote) has 
also just helped me to resolve /another/ CARP issue I have been battling 
with when using a direct crossover cable between the firewalls.
It is the same issue as:
http://old.nabble.com/Unexpected-carp-failovers-when-using-crossover-cable-as-pfsync-syncdev-in-5.1-p33921868.html

When the backup is rebooted, the pfsync interface goes down, which causes 
carpdemote to increment on the primary:
stfw1 kernel: carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
stfw1 kernel: carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)

While the backup is rebooting, the pfsync link goes up and down a few 
times during POST, NIC BIOS initialisation etc., before OpenBSD starts 
to load. This seems to cause the primary to start attempting a bulk 
update ('carp interlock') before the backup is ready.

When the backup finally comes up, it requests a bulk update, even though 
(I think) the primary is still attempting a bulk update in the opposite 
direction with the CARP interlock in place. That request fails, and the 
backup goes master because the primary has carpdemote=1 while the backup 
has carpdemote=0. Hence multiple masters.

On the Primary we saw;
carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)
carp0: state transition: MASTER -> BACKUP    <- due to multi-master!
carp1: state transition: MASTER -> BACKUP    <- due to multi-master!
carp: pfsync0 demoted group carp by -1 to 0 (pfsync link state up)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync link state up)
carp0: state transition: BACKUP -> MASTER    <- later corrects itself
carp1: state transition: BACKUP -> MASTER    <- later corrects itself

We can see the primary firewall had to quickly drop to 'backup', as the 
secondary firewall had made itself master.

On the secondary we saw;
carp: carp1 demoted group carp by 1 to 149 (carpdev)
carp: pfsync0 demoted group carp by 32 to 181 (pfsync init)
carp: pfsync0 demoted group pfsync by 32 to 32 (pfsync init)
carp: pfsync0 demoted group carp by 1 to 182 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 33 (pfsync bulk start)
carp: carp1 demoted group carp by -1 to 181 (carpdev)
carp: pfsync0 demoted group carp by -1 to 180 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 32 (pfsync bulk done)
carp: pfsync0 demoted group carp by -32 to 148 (pfsync init)
carp: pfsync0 demoted group pfsync by -32 to 0 (pfsync init)
carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP
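
To double-check where each group's counter ends up, the "demoted group 
... by N to T" lines can be replayed with a few lines of awk. This is 
only a minimal sketch: the function name demote_totals is made up for 
illustration, and it assumes the log lines look exactly like the ones 
above.

```shell
# demote_totals: read kernel log lines on stdin and print the last
# reported demotion counter for each interface group, one per line.
demote_totals() {
    awk '
        # Matches lines such as:
        #   carp: pfsync0 demoted group carp by 32 to 181 (pfsync init)
        /demoted group/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "group") {
                    # Fields after "group" are: NAME by DELTA to TOTAL
                    total[$(i + 1)] = $(i + 5)
                    break
                }
            }
        }
        END {
            for (g in total) print g, total[g]
        }
    ' | sort
}
```

Fed the secondary's log above, this prints "carp 148" and "pfsync 0".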

This was fixed by adding;
!ifconfig -g carp carpdemote 1
!ifconfig -g pfsync carpdemote 1

to the 'hostname.if' file of each physical interface, and then adding 
the following to rc.local, along the lines you suggested:

sleep 120
ifconfig -g carp -carpdemote 3
ifconfig -g pfsync -carpdemote 3

NB: the decrement is 3 because there are 3 physical interfaces (INT, EXT 
and pfsync's physical interface), and each one's hostname.if adds 1.

This completely stabilises CARP against a flapping pfsync interface 
during reboots :)
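
For anyone copying this, the delayed half of the fix could be wrapped in 
a small shell function. This is a sketch under assumptions, not exactly 
what we run: the function name undemote and the IFCONFIG override 
variable are hypothetical (the override is only there so the commands 
can be dry-run with a stub), and the hostname.if lines are repeated as 
comments for context.

```shell
# Each of the three hostname.if files (INT, EXT and the pfsync physical
# interface) carries these two lines, so the box boots demoted by 3:
#   !ifconfig -g carp carpdemote 1
#   !ifconfig -g pfsync carpdemote 1

# Hypothetical override so the function can be dry-run, e.g. IFCONFIG=echo.
IFCONFIG=${IFCONFIG:-ifconfig}

# undemote DELAY COUNT: wait DELAY seconds, then lower the demotion
# counter of the carp and pfsync groups by COUNT.
undemote() {
    sleep "$1"
    for group in carp pfsync; do
        "$IFCONFIG" -g "$group" -carpdemote "$2"
    done
}
```

Calling "undemote 120 3" late in boot corresponds to the commands 
above. The count has to equal the number of hostname.if files that add a 
demotion, otherwise the counter never returns to where it started.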

Cheers, Andy.



On 22/07/13 22:26, Stuart Henderson wrote:
> On 2013-07-22, Andy <a...@brandwatch.com> wrote:
>> For example we are connected to various providers in various
>> locations (we have many OpenBSD firewalls and this is only a problem in
>> some locations) where they won't enable PortFast or configure the ports
>> as static access ports.
> I would think this is the minority, and that most places are either on
> switches not smart enough for STP, or where the admins can configure them
> appropriately for the connected devices, in either case the extra delay
> would be unwanted..
> (and how long would you delay for anyway? it depends on switch configuration).
>
> BTW an alternative to "sleep" in the network scripts would be to use
> "!ifconfig -g carp carpdemote" in a hostname.if file, then in rc.local
> maybe a sleep and then "ifconfig -g carp -carpdemote".. However neither of
> these account for the situation where you lose and re-gain link after boot.
