On Tue, Apr 28, 2015 at 11:28:31AM +0200, Marko Cupać wrote:
> Hi,
> 
> I have a pair of OpenBSD 5.6 firewalls that have been running releases
> happily for years (I think since 5.1). They are in CARP failover mode,
> running BGP sessions with upstream providers and filtering traffic.
> 
> A few days ago I had an Internet outage (the first in years), which
> appears to have been the result of a bgpd crash. I could ping the ISP's
> interface, but then I noticed I had no routes at all (except connected
> ones) in the routing table. Next, I discovered there was no running
> bgpd process. Restarting bgpd gave me routes and Internet connectivity
> back.
> 
> Here's excerpt from messages log:
> 
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sync error
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: Header error, synchronization error
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
> restart of IPv4 unicast, keeping routes
> Apr 17 18:29:18 bgp2 bgpd[24107]: neighbor 82.117.192.121 (sbb): bad nlri 
> prefix
> Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: error in UPDATE message, network unacceptable
> Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
> restart of IPv4 unicast, not restarted, flushing
> Apr 17 18:29:52 bgp2 bgpd[24107]: fatal in RDE: peer_up: bad state
> Apr 17 18:29:52 bgp2 bgpd[32268]: dispatch_imsg in main: pipe closed
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: Cease, administratively down
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): sending 
> notification: Cease, administratively down
> 
> 
> Also from daemon log at the same time:
> 
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sync error
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: Header error, synchronization error
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
> restart of IPv4 unicast, keeping routes
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Established -> Idle, reason: Fatal error
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Idle -> Connect, reason: Start
> Apr 17 18:29:18 bgp2 bgpd[32268]: incremented the demote state of group 'carp'
> Apr 17 18:29:18 bgp2 bgpd[24107]: neighbor 82.117.192.121 (sbb): bad nlri 
> prefix
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Connect -> OpenSent, reason: Connection opened
> Apr 17 18:29:18 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> OpenSent -> Active, reason: Connection closed
> Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: error in UPDATE message, network unacceptable
> Apr 17 18:29:19 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Active -> Idle, reason: Fatal error
> Apr 17 18:29:49 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Idle -> Connect, reason: Start
> Apr 17 18:29:49 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Connect -> OpenSent, reason: Connection opened
> Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): graceful 
> restart of IPv4 unicast, not restarted, flushing
> Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> OpenSent -> OpenConfirm, reason: OPEN message received
> Apr 17 18:29:51 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> OpenConfirm -> Established, reason: KEEPALIVE message received
> Apr 17 18:29:52 bgp2 bgpd[24107]: fatal in RDE: peer_up: bad state
> Apr 17 18:29:52 bgp2 bgpd[32268]: dispatch_imsg in main: pipe closed
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): sending 
> notification: Cease, administratively down
> Apr 17 18:29:52 bgp2 bgpd[32268]: decremented the demote state of group 'carp'
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 82.117.192.121 (sbb): state change 
> Established -> Idle, reason: Stop
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): sending 
> notification: Cease, administratively down
> Apr 17 18:29:52 bgp2 bgpd[9759]: neighbor 178.253.194.253 (orion): state 
> change Established -> Idle, reason: Stop
> Apr 17 18:29:52 bgp2 bgpd[9759]: session engine exiting
> Apr 17 18:29:54 bgp2 bgpd[32268]: kernel routing table 0 (Loc-RIB) decoupled
> Apr 17 18:29:55 bgp2 bgpd[32268]: Terminating
> 
> 
> I would be grateful if someone could explain to me what happened here,
> and also what to do in order to avoid it in the future.
> 

The "fatal in RDE: peer_up: bad state" bug is fixed in 5.7 IIRC. Not
sure if it was backported to 5.6. As a workaround you can disable the
graceful restart capability to not trigger that code path.
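
A minimal sketch of that workaround, assuming the "announce restart"
neighbor option described in bgpd.conf(5) is available on your release
(please check the man page on your system before relying on it). The
peer address and description are taken from your log; everything else
in your existing neighbor block stays as it is:

```
neighbor 82.117.192.121 {
	descr "sbb"
	# ... your existing settings for this peer ...

	# do not announce the graceful restart capability to this peer
	announce restart no
}
```

After editing, "bgpd -n" should check the configuration for syntax
errors without starting the daemon. The same option would probably be
wanted for the other peer (orion) as well.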

Hope that helps.
-- 
:wq Claudio
