Please excuse typos, sent from my phone

> On 15 Oct 2014, at 19:13, Marko Cupać <marko.cu...@mimar.rs> wrote:
>
> On Thu, 02 Oct 2014 18:02:23 +0100
> Andy <a...@brandwatch.com> wrote:
>
>> Hi
>>
>> Try setting the advskew to a number greater than 200 and less than
>> 254. This seems to be the most stable.
>>
>> For best practice our primary runs with carp and pfsync values of
>> '1'. And the backup runs with carp and pfsync values of '2'.
>>
>> We do this for two reasons.
>>
>> 1) it is extremely stable!
>>
>> 2) We found that CARP master is almost random/unstable when both
>> firewalls have the same value (esp '0'), because;
>>
>> "When advbase is set to 0 the skew value alone is used to calculate
>> how often advertisements are sent (the advertisement window) using
>> this formula: Window in microseconds = advskew * 1000000 / 256
>>
>> E.g. 100 * 1000000 / 256 = 390625us
>>
>> So it would take much to cause a flip..
>>
>> Setting advbase to 1 on both is better as this is more stable if you
>> want to have the same carp demote counters..
>>
>> Good luck :)
>> Andy
>
> Andy,
>
> thank you for the tip for increasing advskew value, I'm gonna try it out.
>
> I had failover on another pair of firewalls, this time external ones,
> running bgp. Carp is not reverting to master some 5 hours so far.
>
> On master, while down, carp is demoted, pfsync is not:
>> pacija@bgp1:~ $ ifconfig -g carp
>> carp: carp demote count 1
>> pacija@bgp1:~ $ ifconfig -g pfsync
>> pfsync: carp demote count 0
>
> On backup, while master, neither is demoted:
>> pacija@bgp2:~ $ ifconfig -g carp
>> carp: carp demote count 0
>> pacija@bgp2:~ $ ifconfig -g pfsync
>> pfsync: carp demote count 0
>
> In /var/log/messages on downed master, I can see there was some
> turbulence:
>> Oct 14 15:21:19 bgp1 /bsd: carp2: state transition: MASTER -> BACKUP
>> Oct 14 15:21:19 bgp1 /bsd: carp1: state transition: MASTER -> BACKUP
>> Oct 14 15:21:22 bgp1 /bsd: carp1: state transition: BACKUP -> MASTER
>> Oct 14 15:21:22 bgp1 /bsd: carp2: state transition: BACKUP -> MASTER
>> Oct 14 15:22:52 bgp1 /bsd: carp2: state transition: MASTER -> BACKUP
>> Oct 14 15:22:52 bgp1 /bsd: carp1: state transition: MASTER -> BACKUP
>> Oct 14 15:22:53 bgp1 /bsd: carp3: state transition: MASTER -> BACKUP
>> Oct 14 15:23:02 bgp1 /bsd: carp3: state transition: BACKUP -> MASTER
>> Oct 14 15:23:03 bgp1 /bsd: carp1: state transition: BACKUP -> MASTER
>> Oct 14 15:23:03 bgp1 /bsd: carp2: state transition: BACKUP -> MASTER
>> Oct 14 15:23:41 bgp1 /bsd: carp1: state transition: MASTER -> BACKUP
>> Oct 14 15:23:41 bgp1 /bsd: carp2: state transition: MASTER -> BACKUP
>> Oct 14 15:23:41 bgp1 /bsd: carp3: state transition: MASTER -> BACKUP
>> Oct 14 15:23:54 bgp1 /bsd: carp3: state transition: BACKUP -> MASTER
>> Oct 14 15:23:56 bgp1 /bsd: carp2: state transition: BACKUP -> MASTER
>> Oct 14 15:23:56 bgp1 /bsd: carp1: state transition: BACKUP -> MASTER
>> Oct 14 15:26:04 bgp1 /bsd: carp2: state transition: MASTER -> BACKUP
>> Oct 14 15:26:04 bgp1 /bsd: carp1: state transition: MASTER -> BACKUP
>> Oct 14 15:26:04 bgp1 /bsd: carp3: state transition: MASTER -> BACKUP
>
> And in /var/log/daemon there is also bgp flapping at that time:
>> Oct 14 15:22:53 bgp1 bgpd[1380]: nexthop 82.117.192.124 now valid: directly connected
>> Oct 14 15:23:02 bgp1 bgpd[1380]: nexthop 82.117.192.124 now valid: via 82.117.192.124
>> Oct 14 15:23:41 bgp1 bgpd[1380]: nexthop 82.117.192.124 now valid: directly connected
>> Oct 14 15:23:54 bgp1 bgpd[1380]: nexthop 82.117.192.124 now valid: via 82.117.192.124
>> Oct 14 15:26:04 bgp1 bgpd[1380]: nexthop 82.117.192.124 now valid: directly connected

Hi,

You'll see these BGP messages as a result of the routing table changes
(visible in netstat -rn) when a box goes from master to backup or vice
versa. When a box is the backup, the route to the CARP IP will be in
state "connected", as the routing table will have a MAC address for the
CARP IP on the physically connected interface (taking you to the
master). When the box is the master there will be no MAC for the IP, as
it's a local IP, hence the "via".

I've always thought this problematic, as it also causes issues with the
BGP nexthop validation logic: when the box is the master, it considers
the CARP IP not to be in the same broadcast domain as the subnet with
the BGP peer. On old versions anyway, things may have changed..

>
> 82.117.192.124 is address of one of three carp interfaces.
>
> I have 'demote carp' in bgpd.conf, so that master does not reclaim its
> master role before bgp routes are up. The question remains, why is it
> not reverting back to master once everything is ok?
>
> --
> Marko Cupać
> https://www.mimar.rs
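
For reference, the advertisement-window formula quoted earlier in the
thread can be sketched in a few lines of Python. This is only an
illustration, not anything the kernel exposes: the function name
carp_advert_interval_us is made up, and the advbase term and the 0-254
range checks are my assumption, based on carp(4) describing advskew as
1/256ths of a second added on top of advbase seconds.

```python
def carp_advert_interval_us(advskew: int, advbase: int = 0) -> int:
    """Approximate CARP advertisement interval in microseconds.

    With advbase = 0 this reduces to the formula quoted in the thread:
        window_us = advskew * 1000000 / 256
    The advbase term (whole seconds) is an assumption taken from carp(4).
    """
    if not 0 <= advskew <= 254:
        raise ValueError("advskew is assumed to be in the range 0..254")
    if not 0 <= advbase <= 254:
        raise ValueError("advbase is assumed to be in the range 0..254")
    # Integer division: sub-microsecond remainders are dropped.
    return advbase * 1_000_000 + advskew * 1_000_000 // 256


if __name__ == "__main__":
    # The worked example from the thread: advbase 0, advskew 100.
    print(carp_advert_interval_us(100))  # 390625

    # Compare a few skews with advbase 0 and advbase 1.
    for skew in (1, 2, 100, 240):
        print(skew,
              carp_advert_interval_us(skew),
              carp_advert_interval_us(skew, advbase=1))
```

With advbase 0 and small skews the two boxes advertise only a few
milliseconds apart, which fits the unstable master election described
above.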