Hi list,

I've been experimenting with bird's ECMP features, which were added to git head a while back [1], on my Debian-based Linux system. I tried the setup below with git head as of today.

I have three Linux routers running bird. Two of them I call gw (gateway), and the third acts as a frontend. Each gateway establishes one BGP session to the frontend and advertises fd57::1 and fd57::2 to it. The frontend accepts both advertisements and exports both to the kernel table. All output below comes from the bird instance on the frontend.

I configured bird like this:

log syslog { debug, trace, info, remote, warning, error, auth, fatal, bug };

router id 10.3.101.3;

# Filters
filter kernel_export {
    if net ~ [ fd57::/64{128,128} ] then accept;
    reject;
}

# BGP Filters
filter bgp_import {
    if net ~ [ fd57::/64{128,128} ] then accept;
    reject;
}

filter bgp_export {
    reject;
}

# Local devices
protocol device {
    scan time 10;
}

protocol direct {
    interface "*";
}

protocol kernel {
    import none;
    #learn;
    merge paths on;
    export filter kernel_export;
}

# BGP peers
protocol bgp 'gw1' {
    description "gw1";
    default bgp_local_pref 100;
    local fc57::3 as 65001;
    neighbor fc57::1 as 65000;
    next hop self;
    import filter bgp_import;
    export filter bgp_export;
    hold time 30;
    error wait time 5, 30;
}

protocol bgp 'gw2' {
    description "gw2";
    default bgp_local_pref 100;
    local fc57::3 as 65001;
    neighbor fc57::2 as 65000;
    next hop self;
    import filter bgp_import;
    export filter bgp_export;
    hold time 30;
    error wait time 5, 30;
}

On the system I have this address configuration:

root@debian:~# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 64000 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:01:93:0b:5a:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.10.216.12/24 brd 10.10.216.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc57::3/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::1:93ff:fe0b:5ae9/64 scope link
       valid_lft forever preferred_lft forever

In bird:

root@debian:~# birdc6
BIRD 1.5.0 ready.
bird> show protocols
name     proto    table    state  since       info
device1  Device   master   up     14:53:54
direct1  Direct   master   up     14:53:54
kernel1  Kernel   master   up     14:53:54
static1  Static   master   up     14:53:54
gw1      BGP      master   up     14:53:58    Established
gw2      BGP      master   up     14:53:55    Established
bird> show route
fd57::2/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
                   via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
fd57::1/128        via fc57::1 on eth0 [gw1 14:53:58] ! (100) [AS65000i]
                   via fc57::2 on eth0 [gw2 14:53:55] (100) [AS65000i]
fc57::/64          dev eth0 [direct1 14:53:54] * (240)

As a result, the multipath routes are installed into the kernel routing table as expected once the sessions are up:

root@debian:~# ip -6 route show
fc57::/64 dev eth0  proto kernel  metric 256
fd57::1 via fc57::1 dev eth0  proto bird  metric 1024
fd57::1 via fc57::2 dev eth0  proto bird  metric 1024
fd57::2 via fc57::1 dev eth0  proto bird  metric 1024
fd57::2 via fc57::2 dev eth0  proto bird  metric 1024
fe80::/64 dev eth0  proto kernel  metric 256

However, after some time, bird seems to get confused by the routes it installed itself and removes the multipath route again. This shows up in ip route show:

root@debian:~# ip -6 route show
fc57::/64 dev eth0  proto kernel  metric 256
fd57::1 via fc57::2 dev eth0  proto bird  metric 1024
fd57::2 via fc57::2 dev eth0  proto bird  metric 1024
fe80::/64 dev eth0  proto kernel  metric 256

In bird, however, both routes are still received:

root@ps:~# birdc6 show route all
BIRD 1.5.0 ready.
fd57::2/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
        BGP.local_pref: 100
                   via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
        BGP.local_pref: 100
fd57::1/128        via fc57::1 on eth0 [gw1 16:25:25] ! (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::1 fe80::1:bdff:feab:7f12
        BGP.local_pref: 100
                   via fc57::2 on eth0 [gw2 16:25:23] (100) [AS65000i]
        Type: BGP unicast univ
        BGP.origin: IGP
        BGP.as_path: 65000
        BGP.next_hop: fc57::2 fe80::1:bdff:fe05:64b7
        BGP.local_pref: 100
...

In the bird log, with debug output enabled, I can see:

Jan 11 15:12:46 ps bird6: gw1: Got KEEPALIVE
Jan 11 15:12:49 ps bird6: gw2: Got KEEPALIVE
Jan 11 15:12:50 ps bird6: gw1: Sending KEEPALIVE
Jan 11 15:12:52 ps bird6: gw2: Sending KEEPALIVE
Jan 11 15:12:54 ps bird6: device1: Scanning interfaces
Jan 11 15:12:54 ps bird6: kernel1: Scanning routing table
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: will be updated
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: already seen
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: will be updated
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: already seen
Jan 11 15:12:54 ps bird6: kernel1: Pruning table master
Jan 11 15:12:54 ps bird6: kernel1: fd57::2/128: updating
Jan 11 15:12:54 ps bird6: Netlink: File exists
Jan 11 15:12:54 ps bird6: kernel1: fd57::1/128: updating
Jan 11 15:12:54 ps bird6: Netlink: File exists

After a while the problem fixes itself and both routes are installed again, and then the problem reappears on the next cycle.

Is there a way around this, or is this actually a bug? To me it looks as if bird is scanning its own routes and mistakenly picks up only one of them. Experimenting with "import all", "learn" etc. for the kernel protocol seems to make no difference.

[1] https://gitlab.labs.nic.cz/labs/bird/commit/8d9eef17713a9b38cd42bd59c4ce76c3ef6c2fc2

-- 
Arno Töll
GnuPG Key-ID: 0x9D80F36D