I've come across a problem that has bitten me a couple of times when doing a 'birdc6 configure soft': bird6 removes the interface route for a BGP neighbour with BFD enabled, and the router ends up learning that route via OSPF on a different interface.
I've also seen this on connected routers where the OSPF reconvergence clobbers the static route even for interfaces not directly connected on the router where the reload is happening.

I'm running version 1.6.3 - not the latest, but upgrading the routers is a non-trivial exercise at the moment.

The neighbour config (without import/export lists) is simple:

    protocol bgp ns3 from customer {
        description "ns3";
        neighbor aaaa:bbb:ff00:30::1 as 64514;
        ttl security;
        bfd;
    }

It doesn't happen every time, and it doesn't happen with every neighbour, but I can't see why BIRD is withdrawing the route to begin with, and it means we can't read in new BIRD configuration without an outage window.

One telltale is bird6 complaining about receiving a route with a "strange next-hop" because it doesn't have a route for it.

    Jul 23 19:01:27 ns3 bird6[2912]: KRT: Received route ::/0 with strange next-hop aaaa:bbb:ff00:30::
    Jul 23 19:01:28 ns3 bird6[2912]: KRT: Received route ::/0 with strange next-hop aaaa:bbb:ff00:30::

ifdown/ifup on the interface isn't sufficient to bring the route back; it has to be manually added with:

    ip -6 route replace aaaa:bbb:ff00:30::/127 dev eth0.551 metric 256

Some of the relevant logs:

    root@prod-router1:~# grep ns3 /var/log/syslog.1 | tail -6
    Jul 23 20:00:45 prod-router1 bird6: ns3: Connecting to aaaa:bbb:ff00:30::1 from local address aaaa:bbb:ff00:30::
    Jul 23 20:00:45 prod-router1 bird6: ns3: Connection lost (Network is unreachable)
    Jul 23 20:00:45 prod-router1 bird6: ns3: Connect delayed by 5 seconds
    Jul 23 20:00:46 prod-router1 bird6: Disabling protocol ns3
    Jul 23 20:00:46 prod-router1 bird6: ns3: Shutdown requested
    Jul 23 20:00:46 prod-router1 bird6: ns3: Down

Connected route being removed and replaced with a route learned from OSPF from a different device:

    root@prod-router1:/var/log# grep bird6 syslog.1 | grep aaaa:bbb:ff00:30:: | grep -v "Connecting to" | grep -v bfd1:
    Jul 23 18:30:03 prod-router1 bird6: edge > removed [replaced] aaaa:bbb:ff00:30::/127 dev eth0.551
    Jul 23 18:30:03 prod-router1 bird6: kernel1 < added aaaa:bbb:ff00:30::/127 via fe80::42f2:e9ff:feef:a855 on eth2.540
    Jul 23 18:30:03 prod-router1 bird6: edge > added [best] aaaa:bbb:ff00:30::/127 dev eth0.551
    Jul 23 18:30:03 prod-router1 bird6: kernel1 < removed aaaa:bbb:ff00:30::/127 via fe80::42f2:e9ff:feef:a855 on eth2.540
    Jul 23 19:23:04 prod-router1 bird6: edge > removed [replaced] aaaa:bbb:ff00:30::/127 dev eth0.551
    Jul 23 19:23:04 prod-router1 bird6: kernel1 < added aaaa:bbb:ff00:30::/127 via fe80::42f2:e9ff:feef:a855 on eth2.540
    Jul 23 19:23:05 prod-router1 bird6: ns3: Waiting for aaaa:bbb:ff00:30::1 to become my neighbor
    Jul 23 19:23:08 prod-router1 bird6: edge > added [best] aaaa:bbb:ff00:30::/127 dev eth0.551
    Jul 23 19:23:08 prod-router1 bird6: kernel1 < removed aaaa:bbb:ff00:30::/127 via fe80::42f2:e9ff:feef:a855 on eth2.540

Has anyone else experienced this? Any known workarounds?

If a direct route is in the BIRD routing table, will that prevent OSPF from trying to install the same route in the kernel table? I'm wondering if putting the direct routes into the BIRD table would stop that behaviour from OSPF.

Thanks.

-- 
Alasdair Muckart
Network Engineer
Catalyst IT - Expert Open Source Solutions
Mobile: +64 22 638 5141 | DDI: +64 4 897 7794 | www.catalyst.net.nz