* Nico Schottelius
we got a network in which clients using dhcpcd withdraw the router
advertisements sent by bird too early:

--------------------------------------------------------------------------------
Dec 19 06:33:15 bibimbap daemon.warn dhcpcd[18464]: wlan0: fe80::20d:b9ff:fe48:3bb8: router expired
Dec 19 06:33:15 bibimbap daemon.warn dhcpcd[18464]: wlan0: part of a Router Advertisement expired
Dec 19 06:38:22 bibimbap daemon.warn dhcpcd[18464]: wlan0: fe80::20d:b9ff:fe48:3bb8: router expired
Dec 19 06:38:22 bibimbap daemon.warn dhcpcd[18464]: wlan0: part of a Router Advertisement expired
Dec 19 06:39:30 bibimbap daemon.warn dhcpcd[18464]: wlan0: fe80::20d:b9ff:fe46:3bd4: router expired
Dec 19 06:39:30 bibimbap daemon.warn dhcpcd[18464]: wlan0: part of a Router Advertisement expired
--------------------------------------------------------------------------------

You should probably verify that the client is seeing (all) the RAs sent by BIRD. With such a tiny RA interval, you don't need much packet loss for the default router to expire.

Try: tcpdump -pnvi wlan0 'icmp6[0] == 134'

For what it is worth, wireless networks are particularly tricky, as broadcast/multicast packet delivery is usually much less reliable than unicast. Also, battery-powered devices often ignore broadcast/multicast for extended periods of time in order to stay in low-power modes. (Some implementations explicitly try to reconfirm the default router by sending an RS after coming out of a low-power mode instead of expiring it right away.)
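
To get a feel for how little loss is needed, here is a rough sketch in Python. It assumes each RA is lost independently with a fixed probability, which is optimistic for wireless (losses there tend to come in bursts, for example during power-save periods). The function name and the loss rates are made up for illustration; the 3 RAs per lifetime correspond to one RA every 10 seconds against a 30 second router lifetime.

```python
def expiry_probability(loss_rate: float, ras_per_lifetime: int) -> float:
    """Chance that every RA within one router lifetime is lost.

    Assumes each RA is lost independently with probability loss_rate,
    which is a simplification; real wireless loss tends to be bursty.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if ras_per_lifetime < 1:
        raise ValueError("ras_per_lifetime must be at least 1")
    return loss_rate ** ras_per_lifetime


# Illustrative values only: 3 RAs per 30 s lifetime (one every 10 s).
for loss in (0.1, 0.3, 0.5):
    print(f"loss={loss:.0%}: P(expire) = {expiry_probability(loss, 3):.4f}")
```

Since a new 30 second window starts with every RA, even a small per-window probability adds up to visible expiries over a day.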

The config in bird is:

--------------------------------------------------------------------------------
protocol radv {
     # Pods / bridge
     interface "eth0" {
         max ra interval 10;

         prefix 2a0a:5480::/64 { preferred lifetime 86400; };
         prefix 2a0a:e5c0:13::/64 { skip; };
         default preference high;
     };
     rdnss {
       ns 2a0a:5480:0:a::a;
       ns 2a0a:5480:0:a::b;
       lifetime 86400;
     };
}
--------------------------------------------------------------------------------

This causes RAs to be sent that look like this:

--------------------------------------------------------------------------------
interface wlan0
{
        AdvSendAdvert on;
        # Note: {Min,Max}RtrAdvInterval cannot be obtained with radvdump
        AdvManagedFlag off;
        AdvOtherConfigFlag off;
        AdvReachableTime 0;
        AdvRetransTimer 0;
        AdvCurHopLimit 64;
        AdvDefaultLifetime 30;
        AdvHomeAgentFlag off;
        AdvDefaultPreference low;

        prefix 2a0a:5480::/64
        {
                AdvValidLifetime 86400;
                AdvPreferredLifetime 86400;
                AdvOnLink on;
                AdvAutonomous on;
                AdvRouterAddr off;
        }; # End of prefix definition


        RDNSS 2a0a:5480:0:a::a 2a0a:5480:0:a::b
        {
                AdvRDNSSLifetime 86400;
        }; # End of RDNSS definition

}; # End of interface definition
--------------------------------------------------------------------------------

It seems that the "AdvDefaultLifetime 30;" is wrong.

No, it is correct. You have set "max ra interval 10", and the default AdvDefaultLifetime is 3 times that value.

https://datatracker.ietf.org/doc/html/rfc4861#section-6.2.1
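
As a small worked example of that rule, the sketch below computes the default and checks a value against the limits given in RFC 4861 section 6.2.1 (zero, or between MaxRtrAdvInterval and 9000 seconds). The function names are my own and are not anything BIRD exposes.

```python
# Upper bound for AdvDefaultLifetime from RFC 4861 section 6.2.1.
MAX_DEFAULT_LIFETIME = 9000  # seconds


def default_router_lifetime(max_ra_interval: int) -> int:
    """Default AdvDefaultLifetime: 3 * MaxRtrAdvInterval."""
    return 3 * max_ra_interval


def is_valid_lifetime(lifetime: int, max_ra_interval: int) -> bool:
    """True if lifetime is 0 or within [MaxRtrAdvInterval, 9000]."""
    if lifetime == 0:  # 0 means "do not use me as a default router"
        return True
    return max_ra_interval <= lifetime <= MAX_DEFAULT_LIFETIME


print(default_router_lifetime(10))   # "max ra interval 10" gives 30
print(is_valid_lifetime(86400, 10))  # False: above the 9000 limit
print(is_valid_lifetime(9000, 10))   # True: largest allowed value
```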

--------------------------------------------------------------------------------
/bird/bird.conf:897:31 Default lifetime must be in range 0-9000
--------------------------------------------------------------------------------

I read in the manpage of radvd that indeed 9000 is the max (not sure why
that limit is at 9k though...), but what I am wondering is what is the
right approach to this?

The upper limit of 9000 comes from RFC 4861, see the link above.

p.s.: A low "max ra interval" allows us to run multiple, active routers
in the same network and clients will quickly fall over to the second
router, if one is not functioning correctly.

Setting AdvDefaultLifetime to 9000 would prevent quick failover, since that is what governs the default route timeout, not the RA interval.
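
To put numbers on that, here is a rough sketch. It assumes the router fails silently (no final RA with lifetime 0), that the client relies only on the router lifetime expiring, and that the last RA arrived at most one max RA interval before the failure. The function name is hypothetical.

```python
def failover_window(default_lifetime: int, max_ra_interval: int) -> tuple[int, int]:
    """Seconds from router failure until the client expires the default route.

    Returns (shortest, longest). The client restarts its timer on every RA,
    so the delay depends on how long before the failure the last RA arrived.
    """
    shortest = max(default_lifetime - max_ra_interval, 0)
    longest = default_lifetime
    return shortest, longest


# Current setup: lifetime 30 s with "max ra interval 10".
print(failover_window(30, 10))
# Lifetime raised to the maximum of 9000 s, same RA interval.
print(failover_window(9000, 10))
```

With the lifetime at 9000, the client would keep the dead router for roughly two and a half hours, no matter how often the surviving router sends RAs.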

I'd suggest considering a first-hop redundancy protocol (FHRP) such as VRRP for redundancy instead; then you can have a long AdvDefaultLifetime and quick failover at the same time. With VRRP, everything about the default router (in particular its IPv6 link-local and Ethernet MAC addresses) stays the same following a failover, so clients do not need to change their routing tables to remain connected.

Tore

