> Baldur Norddahl
> Sent: Wednesday, February 12, 2020 7:57 PM
>
> On Tue, Feb 11, 2020 at 12:33 AM Lukas Tribus <mailto:li...@ltri.eu> wrote:
> > Therefore, if being down for several minutes is not ok, you should
> > invest in dual links to your transits. And connect those to two
> > different routers. If possible with a guarantee the transits use two
> > routers at their end and that divergent fiber paths are used etc.
>
> That is not my experience *at all*. I have always seen my prefixes
> converge in a couple of seconds upstream (vs 2 different Tier1's).
>
> This is a bit old but probably still thus:
>
> https://labs.ripe.net/Members/vastur/the-shape-of-a-bgp-update
>
> Quote: "To conclude, we observe that BGP route updates tend to
> converge globally in just a few minutes. The propagation of newly
> announced prefixes happens almost instantaneously, reaching 50%
> visibility in just under 10 seconds, revealing a highly responsive
> global system. Prefix withdrawals take longer to converge and generate
> nearly 4 times more BGP traffic, with the visibility dropping below 10% only
> after approximately 2 minutes".
>
> Unfortunately they did not test the case of withdrawal from one router
> while having the prefix still active at another.
>
Yes, that's unfortunate, although I'm thinking that the convergence time
would be highly dependent on the first-hop upstream providers involved in
the "local-repair" for the affected AS. Once that is done, it doesn't
matter that the whole world still routes traffic for the affected AS
towards the original first-hop upstream AS, as long as that upstream has
a valid detour route.

And I guess the topology of the first-hop outskirts of the affected AS
that take part in the "local-repair" would dictate the convergence time.
E.g. if your upstream A box happens to have a direct (usable)
link/session to the upstream B box, you win. However, the more boxes in
the "local-repair" detour that need to be told "A no more, now B is the
way to go", the longer the convergence time.

But if a significant portion of the Internet gets the withdrawal in
about 2 min, I wonder how long it could take for a typical "local-repair"
string of BGP speakers to all get the memo. Realistically, though, how
many BGP speakers could that be? Ranging from a minimum of 2 to a
maximum of, say, ~6?
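To put rough numbers on that "string of BGP speakers" idea, here is a minimal sketch in Python. It assumes the speakers in the detour learn about the change one after another, so the per-hop delays simply add up. The function name and both delay values are made up for illustration; they are not measurements from any real network.

```python
def local_repair_time(speakers: int, per_hop_delay_s: float) -> float:
    """Seconds until the last speaker in a serial chain has switched from A to B.

    Hypothetical model: each speaker processes the update and passes it on
    only after the previous one has finished, so the delays add up.
    """
    if speakers < 1:
        raise ValueError("the chain needs at least one BGP speaker")
    return speakers * per_hop_delay_s


# Illustrative per-hop delays in seconds (assumed, not measured).
FAST_HOP_S = 0.5    # update processed and forwarded almost immediately
SLOW_HOP_S = 30.0   # update held back by a timer or a busy control plane

for n in (2, 6):
    fast = local_repair_time(n, FAST_HOP_S)
    slow = local_repair_time(n, SLOW_HOP_S)
    print(f"{n} speakers: {fast:.1f}s (fast hops) to {slow:.1f}s (slow hops)")
```

Under these assumptions the chain length matters far less than the per-hop delay: 2 to 6 speakers only changes the result by a factor of 3, while a slow hop versus a fast hop changes it by a factor of 60.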
> > When I saw *minutes* of brownouts in connectivity it was always
> > because of ingress prefix convergence (or the lack thereof, due to
> > slow FIB programing, then temporary internal routing loops, nasty
> > things like that, but never external).
>
> That is also a significant problem. In the case of a single transit
> connection per router, two routers and two providers, there will be a
> lot of internal convergence between your two routers in the case of a
> link failure. That is also avoided by having both routers having the same
> provider connections.
> That way a router may still have to invalidate many routes but there
> will be no loops and the router has loop free alternatives loaded into
> memory already (to the other provider). Plus you can use the simple
> trick of having a default route as a fall back.
>
This is a very good point actually. Indeed, since the box has two
transit sessions, if only one of them fails it will still retain all the
prefixes in the FIB. It will just need to reprogram a few next-hops to
point towards the other eBGP/iBGP speakers, whichever offers the best
path. And reprogramming next-hops is significantly faster (with
hierarchical FIBs anyway).

adam
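To illustrate why reprogramming next-hops is cheaper with a hierarchical FIB, here is a toy sketch in Python. It is not how any particular router implements its FIB; the class names, path IDs and addresses are all hypothetical. The flat table stores a next hop per prefix, while the hierarchical one has prefixes point at a shared path entry, so a failover is a single write.

```python
class FlatFib:
    """Each prefix stores its own next hop."""

    def __init__(self):
        self.routes = {}  # prefix -> next hop

    def install(self, prefix, next_hop):
        self.routes[prefix] = next_hop

    def fail_over(self, dead, backup):
        """Rewrite every prefix that used the dead next hop; return write count."""
        writes = 0
        for prefix, next_hop in self.routes.items():
            if next_hop == dead:
                self.routes[prefix] = backup
                writes += 1
        return writes


class HierarchicalFib:
    """Prefixes point at a shared path entry that holds the real next hop."""

    def __init__(self):
        self.paths = {}   # path id -> next hop
        self.routes = {}  # prefix -> path id

    def add_path(self, path_id, next_hop):
        self.paths[path_id] = next_hop

    def install(self, prefix, path_id):
        self.routes[prefix] = path_id

    def lookup(self, prefix):
        return self.paths[self.routes[prefix]]

    def fail_over(self, path_id, backup):
        """Repoint the shared path entry; every prefix using it follows."""
        self.paths[path_id] = backup
        return 1


# Hypothetical setup: 1000 prefixes, all initially via transit A.
prefixes = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1000)]
TRANSIT_A, TRANSIT_B = "192.0.2.1", "198.51.100.1"

flat = FlatFib()
hier = HierarchicalFib()
hier.add_path("best", TRANSIT_A)
for p in prefixes:
    flat.install(p, TRANSIT_A)
    hier.install(p, "best")

flat_writes = flat.fail_over(TRANSIT_A, TRANSIT_B)
hier_writes = hier.fail_over("best", TRANSIT_B)
print(f"flat FIB writes: {flat_writes}, hierarchical FIB writes: {hier_writes}")
```

The sketch deliberately has every prefix share one path entry; on a real box prefixes would be spread over several such entries, but the number of writes would still scale with the number of paths rather than the number of prefixes.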