Hey!

I am trying to make BGP graceful restart work. First, I noticed that BGP graceful restart only works if BIRD doesn't close the BGP session cleanly. Otherwise, an administrative shutdown is sent and the other end (also BIRD) flushes all routes and doesn't treat this as a graceful restart:

    2016-12-02 10:09:24 <RMT> R1: Received: Administrative shutdown
    2016-12-02 10:09:24 <TRACE> R1: BGP session closed
    2016-12-02 10:09:24 <TRACE> R1: State changed to stop
    2016-12-02 10:09:24 <TRACE> R1 > removed [sole] 203.0.113.0/24 via 192.0.2.1 on eth0

Is that expected behavior?

The second problem I ran into is with BFD. If I kill -9 bird, BFD quickly detects the problem and shuts down the BGP session. This is not treated as a graceful restart either:

    2016-12-02 10:52:50 <TRACE> R1: Neighbor graceful restart detected
    2016-12-02 10:52:50 <TRACE> R1: State changed to start
    2016-12-02 10:52:50 <TRACE> R1: BGP session closed
    2016-12-02 10:52:50 <TRACE> R1: Connect delayed by 5 seconds
    2016-12-02 10:52:51 <TRACE> R1: BFD session down
    2016-12-02 10:52:51 <TRACE> R1: State changed to stop
    2016-12-02 10:52:51 <TRACE> R1 > removed [sole] 203.0.113.0/24 via 192.0.2.1 on eth0

Therefore, BFD seems incompatible with graceful restart. The Juniper implementation has some provisions to make BFD and BGP graceful restart work together:

> So that BFD can maintain its BFD protocol sessions across a BGP
> graceful restart, BGP requests that BFD set the C bit to 1 in
> transmitted BFD packets. When the C bit is set to 1, BFD can
> maintain its session in the forwarding plane in spite of disruptions
> in the control plane. Setting the bit to 1 gives BGP neighbors
> acting as a graceful restart helper the most accurate information
> about whether the forwarding plane is up.
>
> When BGP is acting as a graceful restart helper and the BFD session
> to the BGP peer is lost, one of the following actions takes place:
> - If the C bit received in the BFD packets was 1, BGP immediately
>   flushes all routes, determining that the forwarding plane on the
>   BGP peer has gone down.
> - If the C bit received in the BFD packets was 0, BGP marks all
>   routes as stale but does not flush them because the forwarding
>   plane on the BGP peer might be working and only the control plane
>   has gone down.

Unrelated to BGP restart but related to BFD: if one BGP peer has a temporary network issue, BFD quickly closes the session, and a startup delay is then imposed on the session. When the network outage is resolved and one peer tries to reconnect, the connection is rejected because of this startup delay:

    2016-12-02 11:03:55 <TRACE> R1: State changed to start
    2016-12-02 11:03:55 <TRACE> R1: Startup delayed by 60 seconds due to errors
    2016-12-02 11:04:02 <TRACE> R1: Incoming connection from 192.0.2.1 (port 49205) rejected
    2016-12-02 11:04:07 <TRACE> R1: Incoming connection from 192.0.2.1 (port 36449) rejected

The delay can be configured to a lower value, but is this the expected behavior? The current code is:

    acc = (p->p.proto_state == PS_START || p->p.proto_state == PS_UP) &&
      (p->start_state >= BSS_CONNECT) && (!p->incoming_conn.sk);

Could this be changed to the following?

    acc = (p->p.proto_state == PS_START || p->p.proto_state == PS_UP) &&
      (p->start_state >= BSS_DELAY) && (!p->incoming_conn.sk);

I have put a more detailed summary of my investigations here:

https://github.com/vincentbernat/network-lab/tree/caceb38e8543ec22a7693611bbd84cdf36e92e12/lab-bgp-graceful-restart

-- 
Use uniform input formats.
            - The Elements of Programming Style (Kernighan & Plauger)
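
P.S. To make the question about the accept condition concrete, here is a small standalone sketch that compares the two variants outside of BIRD. It is not BIRD code: struct peer_state, accepts_incoming(), acc_current() and acc_proposed() are hypothetical names, and the PS_* / BSS_* values are stand-ins that assume BSS_DELAY sorts just below BSS_CONNECT, which should be checked against the actual headers.

```c
#include <stdbool.h>

/* Stand-in constants. ASSUMPTION: the ordering mirrors BIRD's own
 * definitions (BSS_PREPARE < BSS_DELAY < BSS_CONNECT); verify this
 * against the real headers before relying on it. */
enum { PS_DOWN, PS_START, PS_UP, PS_STOP };
enum { BSS_PREPARE, BSS_DELAY, BSS_CONNECT, BSS_CONNECT_NOCAP };

/* Hypothetical, stripped-down view of the fields the condition reads. */
struct peer_state {
  int proto_state;    /* one of PS_* */
  int start_state;    /* one of BSS_* */
  bool has_incoming;  /* stands in for p->incoming_conn.sk being set */
};

/* Same shape as the condition in the mail, with the start_state
 * threshold passed in so both variants can share one body. */
bool accepts_incoming(const struct peer_state *p, int min_start_state)
{
  return (p->proto_state == PS_START || p->proto_state == PS_UP) &&
         (p->start_state >= min_start_state) &&
         !p->has_incoming;
}

/* Current condition: threshold is BSS_CONNECT. */
bool acc_current(const struct peer_state *p)
{
  return accepts_incoming(p, BSS_CONNECT);
}

/* Proposed condition: threshold lowered to BSS_DELAY. */
bool acc_proposed(const struct peer_state *p)
{
  return accepts_incoming(p, BSS_DELAY);
}
```

For a peer in PS_START with start_state == BSS_DELAY (the situation in the last log above), acc_current() returns false, which matches the "rejected" lines, while acc_proposed() returns true. The sketch only covers the condition itself; whether accepting a connection during the delay has side effects elsewhere in the BGP state machine is exactly what I am unsure about.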