On Mon, 4 Oct 2021 at 21:58, Michael Thomas <m...@mtcc.com> wrote:

>
> On 10/4/21 11:48 AM, Luke Guillory wrote:
>
>
> I believe the original change was 'automatic' (as in configuration done
> via a web interface). However, now that connection to the outside world is
> down, remote access to those tools don't exist anymore, so the emergency
> procedure is to gain physical access to the peering routers and do all the
> configuration locally.
>
> Assuming that this is what actually happened, what should fb have done
> different (beyond the obvious of not screwing up the immediate issue)? This
> seems like it's a single point of failure. Should all of the BGP speakers
> have been dual homed or something like that? Or should they not have been
> mixing ops and production networks? Sorry if this sounds dumb.
>

Facebook is a huge network. It is doubtful that what is going on is this simple, so I will make no guesses as to what Facebook is or should be doing.

However, the traditional way for us small timers is to have a backdoor using someone else's network. Nowadays this could be a simple 4G/5G router with a VPN to a terminal server that allows the operator to configure the equipment through the monitor port, even when the config is completely destroyed.

Regards,

Baldur