Fantastic, thanks Stuart, that was really helpful!

Without even knowing it, your suggestion of manipulating carpdemote has
also just helped me resolve /another/ CARP issue I have been battling
with when using a direct crossover cable between the firewalls.

It is the same issue as:
http://old.nabble.com/Unexpected-carp-failovers-when-using-crossover-cable-as-pfsync-syncdev-in-5.1-p33921868.html

When the backup is rebooted, the pfsync interface goes down, which
causes carpdemote to increment on the primary:

stfw1 kernel: carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
stfw1 kernel: carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)

While the backup is rebooting, the pfsync interface goes up and down a
few times during POST, NIC BIOS etc, before OpenBSD starts to load.
This seems to cause the primary to start the process of attempting a
bulk update ('carp interlock') before the backup is ready.

When the backup finally comes up it requests a bulk update, which fails
(I think because the primary is still attempting a bulk update in the
opposite direction, with the CARP interlock in place). The backup then
goes master, as the primary has carpdemote=1 while the backup has
carpdemote=0, thus multiple masters.

On the primary we saw:

carp: pfsync0 demoted group carp by 1 to 1 (pfsync link state down)
carp: pfsync0 demoted group pfsync by 1 to 1 (pfsync link state down)
carp0: state transition: MASTER -> BACKUP   <- Due to multi-master!
carp1: state transition: MASTER -> BACKUP   <- Due to multi-master!
carp: pfsync0 demoted group carp by -1 to 0 (pfsync link state up)
carp: pfsync0 demoted group pfsync by -1 to 0 (pfsync link state up)
carp0: state transition: BACKUP -> MASTER   <- Later corrects itself
carp1: state transition: BACKUP -> MASTER   <- Later corrects itself

We can see the primary firewall had to quickly drop to 'backup', as the
secondary firewall made itself master.

On the secondary we saw:

carp: carp1 demoted group carp by 1 to 149 (carpdev)
carp: pfsync0 demoted group carp by 32 to 181 (pfsync init)
carp: pfsync0 demoted group pfsync by 32 to 32 (pfsync init)
carp: pfsync0 demoted group carp by 1 to 182 (pfsync bulk start)
carp: pfsync0 demoted group pfsync by 1 to 33 (pfsync bulk start)
carp: carp1 demoted group carp by -1 to 181 (carpdev)
carp: pfsync0 demoted group carp by -1 to 180 (pfsync bulk done)
carp: pfsync0 demoted group pfsync by -1 to 32 (pfsync bulk done)
carp: pfsync0 demoted group carp by -32 to 148 (pfsync init)
carp: pfsync0 demoted group pfsync by -32 to 0 (pfsync init)
carp0: state transition: BACKUP -> MASTER
carp1: state transition: BACKUP -> MASTER
carp0: state transition: MASTER -> BACKUP
carp1: state transition: MASTER -> BACKUP

This was fixed by adding:

!ifconfig -g carp carpdemote 1
!ifconfig -g pfsync carpdemote 1

to each physical interface's 'hostname.if', and then adding:

sleep 120
ifconfig -g carp -carpdemote 3
ifconfig -g pfsync -carpdemote 3

NB: there are 3 physical interfaces (INT, EXT, and pfsync's physical
interface), hence the 3.

This completely stabilises things when the pfsync interface flaps
during reboots :)

Cheers, Andy.

On 22/07/13 22:26, Stuart Henderson wrote:
> On 2013-07-22, Andy <a...@brandwatch.com> wrote:
>> For example we are connected to various providers in various
>> locations (we have many OpenBSD firewalls and this is only a problem in
>> some locations) where they won't enable port fast/configure as static
>> access ports.
> I would think this is the minority, and that most places are either on
> switches not smart enough for STP, or where the admins can configure
> them appropriately for the connected devices; in either case the extra
> delay would be unwanted..
> (and how long would you delay for anyway? it depends on switch configuration).
>
> BTW an alternative to "sleep" in the network scripts would be to use
> "!ifconfig -g carp carpdemote" in a hostname.if file, then in rc.local
> maybe a sleep and then "ifconfig -g carp -carpdemote".. However neither of
> these account for the situation where you lose and re-gain link after boot.
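
For anyone wanting to try the same workaround, here is a minimal sketch of the "release" half, assuming it is run from rc.local as Stuart suggested (the post above does not say where the sleep and -carpdemote lines were actually placed). The names IFCONFIG, NPHYS, SETTLE and release_demotion are made up for this example; only the ifconfig invocations themselves come from the thread.

```shell
#!/bin/sh
# Sketch only: undo the boot-time demotion that each hostname.if added with
#   !ifconfig -g carp carpdemote 1
#   !ifconfig -g pfsync carpdemote 1
# IFCONFIG, NPHYS and SETTLE are hypothetical knobs, not OpenBSD settings.

IFCONFIG=${IFCONFIG:-ifconfig}  # set IFCONFIG=echo for a dry run
NPHYS=${NPHYS:-3}               # physical interfaces that each demoted by 1
SETTLE=${SETTLE:-120}           # seconds to let links and pfsync settle

release_demotion() {
    # Wait first, so the firewall stays demoted while interfaces flap.
    sleep "$SETTLE"
    for group in carp pfsync; do
        # Remove all NPHYS demotions in one step, as in the post above.
        "$IFCONFIG" -g "$group" -carpdemote "$NPHYS"
    done
}

# Assumed usage: call once near the end of rc.local, e.g.
#   release_demotion
```

With IFCONFIG=echo and SETTLE=0 the function only prints the two commands it would run, which is a cheap way to check the counts before touching a live pair. As Stuart notes, this only covers boot; it does nothing for link loss and recovery later on.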