Another one:

Since killing bird is now safe-ish, I added one more bird3 to the party.

Then I tried killing the first bird again.

This caused the new bird3 to get stuck with its BGP sessions in the Flush state:

BGP_IPv4 BGP --- flush 2024-12-19 16:35:08 Flush Socket: Connection reset by peer
BGP_IPv6 BGP --- flush 2024-12-19 16:35:06 Flush Socket: Connection reset by peer

No errors are logged on this second bird after the "Connection reset by peer" messages. The first bird keeps retrying the connection; I can see that with tcpdump.

birdc restart BGP_IPvx does nothing
birdc [dis|en]able BGP_IPvx also changes nothing
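
In case it helps anyone watching for the same symptom, here is a minimal sketch that flags protocols that have been sitting in flush for too long. It assumes the status lines have the same column layout as the ones above (name, proto, table, state, date, time, info) and that the timestamp uses the long date format; the function name stuck_in_flush and the 5 minute threshold are made up for illustration and are not part of BIRD.

```python
from datetime import datetime, timedelta


def stuck_in_flush(lines, now, threshold=timedelta(minutes=5)):
    """Return (name, since) for protocols in 'flush' state for at least `threshold`.

    Assumes each line looks like:
    NAME PROTO TABLE STATE YYYY-MM-DD HH:MM:SS INFO...
    """
    stuck = []
    for line in lines:
        fields = line.split()
        # Skip headers, blank lines and anything not in the flush state.
        if len(fields) < 6 or fields[3] != "flush":
            continue
        try:
            since = datetime.strptime(
                f"{fields[4]} {fields[5]}", "%Y-%m-%d %H:%M:%S"
            )
        except ValueError:
            # Timestamp is in some other format; ignore the line.
            continue
        if now - since >= threshold:
            stuck.append((fields[0], since))
    return stuck


# The two status lines from this report, one protocol per line.
sample = [
    "BGP_IPv4 BGP --- flush 2024-12-19 16:35:08 Flush Socket: Connection reset by peer",
    "BGP_IPv6 BGP --- flush 2024-12-19 16:35:06 Flush Socket: Connection reset by peer",
]

for name, since in stuck_in_flush(sample, datetime(2024, 12, 19, 16, 45, 0)):
    print(f"{name} stuck in flush since {since}")
```

This only detects the condition. As described further down, the only way I found to actually recover the sessions is to kill the flushing bird.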

Some more info about this new bird:
- it is a route reflector (RR), so the sessions are iBGP
- has MD5 enabled for the sessions
- import/export tables on
- extended messages on
- add paths on

I also tried killing multiple bird2 peers instead of the bird3-to-bird3 case:
- the ones with a low number of routes (in my case <10k) do not get stuck in Flush
- the ones with one (or more) full tables get stuck in Flush on the RR

Killing birds looks like a theme since I have to kill the Flushing bird3 to get it to reconnect those sessions.

The template from the Flushing bird RR (looks the same for v6):

template bgp BGP_v4_t {
    debug { events };
    disabled off;

    local as ASN;

    enable extended messages on;

    igp metric off;
    prefer older on;

    rr client;
    rr cluster id 0.0.0.1;

    ipv4 {
        add paths on;

        import limit 3000000 action warn;

        import filter BGP_IN4;
        export filter BGP_OUT4;
        import table on;
        export table on;
    };
}

Cheers,

Radu
