On Wed, 2 Sep 2020 at 10:00, Martijn Schmidt via NANOG wrote:
> I suppose now would be a good time for everyone to re-open their CenturyLink
> ticket and ask why the RFO doesn't address the most important defect, i.e.
> the inability to withdraw announcements even by shutting down the session?
❦ 2 September 2020 10:15 +03, Saku Ytti:
> RFC 7313 might show us a way to reduce the amount of useless work. You might
> want to add a signal that initial convergence is done; you might want to
> add a signal that no installation or best-path run happens until all
> routes are loaded; this would massively
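A minimal Python sketch of the idea Saku describes above; the End-of-RIB handling and all names here are invented for illustration and are not any vendor's implementation:

# Hypothetical illustration only: buffer incoming routes and defer
# best-path selection / FIB installation until the peer signals
# End-of-RIB, instead of re-running the decision process per update.

class DeferredRibIn:
    def __init__(self):
        self.adj_rib_in = {}    # prefix -> path attributes
        self.converged = False  # set once End-of-RIB is seen

    def on_update(self, prefix, attrs):
        # Accept and store the route, but don't run the decision process
        # or touch the FIB while the initial table transfer is ongoing.
        if attrs is None:
            self.adj_rib_in.pop(prefix, None)  # withdrawal
        else:
            self.adj_rib_in[prefix] = attrs
        if self.converged:
            self.run_best_path([prefix])

    def on_end_of_rib(self):
        # Peer says initial convergence is done: one bulk best-path run
        # instead of one per received update.
        self.converged = True
        self.run_best_path(list(self.adj_rib_in))

    def run_best_path(self, prefixes):
        for p in prefixes:
            print(f"best path + FIB install for {p}")  # stand-in for the real work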
On Wed, 2 Sep 2020 at 12:50, Vincent Bernat wrote:
> It seems BIRD contains an implementation of RFC 7313. From the source
> code, it delays removal of stale routes until EoRR, but it doesn't seem
> to delay the work of updating the kernel. Juniper doesn't seem to
> implement it. Cisco seems to im
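For what it's worth, here is a rough Python sketch of the stale-route handling Vincent describes; it is not BIRD's actual code, just an illustration of marking routes stale at the start of an enhanced route refresh and deleting only what is still stale once EoRR arrives:

# Illustrative only, not BIRD's code: keep existing routes through a
# route refresh, mark them stale, and remove only the ones the peer
# did not re-advertise once End-of-Route-Refresh (EoRR) is received.

class EnhancedRefreshRib:
    def __init__(self):
        self.routes = {}  # prefix -> {"attrs": ..., "stale": bool}

    def on_begin_of_rr(self):
        # Start of enhanced route refresh: mark everything stale.
        for entry in self.routes.values():
            entry["stale"] = True

    def on_update(self, prefix, attrs):
        # Re-learned routes lose the stale flag; nothing is removed yet.
        self.routes[prefix] = {"attrs": attrs, "stale": False}

    def on_end_of_rr(self):
        # Only now drop the routes the peer no longer advertises.
        for prefix in [p for p, e in self.routes.items() if e["stale"]]:
            del self.routes[prefix]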
We once moved a 3U server 30 miles between data centers this way. We plugged the
redundant PSU into a UPS, and two people carried it out and put it in a vehicle.
Sent from my iPhone
> On Sep 1, 2020, at 11:58 PM, Christopher Morrow
> wrote:
>
> On Tue, Sep 1, 2020 at 11:53 PM Alain Hebert wrote:
Sure, but I don't care how busy your router is, it shouldn't take hours to
withdraw routes.
-
Mike Hammett
Intelligent Computing Solutions
http://www.ics-il.com
Midwest-IX
http://www.midwest-ix.com
----- Original Message -----
From: "Saku Ytti"
To: "Martijn Schmidt"
Cc: "Outa
Shawn L via NANOG wrote on 02/09/2020 12:15:
We once moved a 3U server 30 miles between data centers this way.
We plugged the redundant PSU into a UPS, and two people carried it out and put
it in a vehicle.
Hopefully none of these server moves that people have been talking about
involved spinning disks.
On Wed, 2 Sep 2020 at 14:40, Mike Hammett wrote:
> Sure, but I don't care how busy your router is, it shouldn't take hours to
> withdraw routes.
Quite, but the discussion is less about how we feel about it and more about
why it happens and what could be done about it.
--
++ytti
I am not buying it. No normal implementation of BGP stays online, replying
to heartbeats and accepting updates from eBGP peers, yet after 5 hours
still fails to process withdrawals from customers.
On Wed, 2 Sep 2020 at 14:11, Saku Ytti wrote:
> On Wed, 2 Sep 2020 at 14:40, Mike Hammett wrote:
>
> > Sure,
On 9/2/20 1:49 PM, Nick Hilliard wrote:
Shawn L via NANOG wrote on 02/09/2020 12:15:
We once moved a 3U server 30 miles between data centers this way.
We plugged the redundant PSU into a UPS, and two people carried it out and put
it in a vehicle.
Hopefully none of these server moves that people have be
On Wed, 2 Sep 2020 at 16:16, Baldur Norddahl wrote:
> I am not buying it. No normal implementation of BGP stays online, replying to
> heartbeats and accepting updates from eBGP peers, yet after 5 hours still fails
> to process withdrawals from customers.
I can imagine writing a BGP implementation like
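Saku's message is cut off above, but as a purely speculative toy example of how a BGP speaker could keep answering keepalives while withdrawals sit unprocessed, consider a design where keepalives are handled in the I/O path and every UPDATE, withdrawals included, goes into one FIFO work queue (all names below are invented, this is not any real BGP stack):

# Toy model only: keepalives are answered immediately, while UPDATEs
# and withdrawals wait in a single FIFO work queue. If churn keeps the
# queue full, the session looks healthy even though a withdrawal may
# wait hours to be processed and propagated.

from collections import deque

class ToyBgpSpeaker:
    def __init__(self):
        self.work_queue = deque()

    def on_message(self, msg):
        if msg["type"] == "KEEPALIVE":
            return {"type": "KEEPALIVE"}  # answered right away
        self.work_queue.append(msg)       # everything else waits its turn
        return None

    def process_some_work(self, budget):
        # The slow path: best path, policy, update generation, etc.
        for _ in range(min(budget, len(self.work_queue))):
            msg = self.work_queue.popleft()
            print("processing", msg["type"], msg.get("prefix"))

speaker = ToyBgpSpeaker()
for i in range(5):
    speaker.on_message({"type": "UPDATE", "prefix": f"192.0.2.{i}/32"})
speaker.on_message({"type": "WITHDRAW", "prefix": "198.51.100.0/24"})
print(speaker.on_message({"type": "KEEPALIVE"}))  # still answered promptly
speaker.process_some_work(budget=2)               # withdrawal still queued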
If the client pays me a shit ton of money to make sure the server
won't turn off, and they pay for the hardware to make it happen, I'd think
about it. It's like a colo move on hard mode.
It's extremely stupid, and I would advise not doing it.
Hell, even when I migrated an E911 server, we had a 20 min
Yeah. This actually would be a fascinating study to understand exactly what
happened. The volume of BGP messages flying around because of the session
churn must have been absolutely massive, especially in a complex internal
infrastructure like 3356 has.
I would say the scale of such an event has t
Cisco had a bug a few years back that affected metro switches such that they
would not withdraw routes upstream. We had an internal outage and one of my
carriers kept advertising our prefixes even though we withdrew the routes. We
tried downing the neighbor and even shutting down the physical in
creative engineers can conjecturbate for days on how some turtle in the
pond might write code that did not withdraw for a month, or other
delightful reasons why CL might have had this really, really bad behavior.
the point is that the actual symptoms and cause really, really should be
in the RFO
randy
Sure. But being good engineers, we love to exercise our brains by thinking
about possibilities and probabilities.
For example, we don't form disaster response plans by saying "well, we
could think about what *could* happen for days, but we'll just wait for
something to occur".
-A
On Wed, Sep 2,
On Wed, Sep 2, 2020 at 12:00 AM Christopher Morrow
wrote:
>
> On Tue, Sep 1, 2020 at 11:53 PM Alain Hebert wrote:
> >
> > As a coincidence... I was *thinking* of moving a 90TB SAN (with
> > mechanicals) to another rack that way... skateboard, long fibers and long
> > power cords =D
> >
>
> we don't form disaster response plans by saying "well, we could think
> about what *could* happen for days, but we'll just wait for something
> to occur".
from an old talk of mine, if it was part of the “plan” it’s an “event,”
if it is not then it’s a “disaster.”
I believe someone on this list reported that updates were also broken. They
could not add prepending or modify communities.
Anyway, I am not saying it cannot happen, because clearly something did
happen. I just don't believe it is a simple case of overload. There has to
be more to it.
On Wed, 2 Sep
A detailed explanation can be found below.
https://blog.thousandeyes.com/centurylink-level-3-outage-analysis/
From: NANOG on behalf of
Baldur Norddahl
Date: Wednesday, September 2, 2020 at 12:09 PM
To: "nanog@nanog.org"
Subject: Re: [outages] Major Level3 (CenturyLink) Issues
While conserving connectivity? 😂
From: Shawn L via NANOG
Sent: Wednesday, 2 September 2020 13:15
To: nanog
Subject: Re: Centurylink having a bad morning?
We once moved a 3U server 30 miles between data centers this way. We plugged the
redundant PSU into a UPS, and two peopl
That is what the 5G router is for...
On Wed, 2 Sep 2020 at 19:47, Michael Hallgren wrote:
> While conserving connectivity? 😂
>
>
> --
> *From:* Shawn L via NANOG
> *Sent:* Wednesday, 2 September 2020 13:15
> *To:* nanog
> *Subject:* Re: Centurylink having a bad morning?
>
Hello NANOG,
Could somebody from Akamai AANP’s network team contact me off-list? I’ve tried
the peering and NOC and got no replies in months.
Thanks
Ahmed
netsupp...@akamai.com
--
TTFN,
patrick
> On Sep 2, 2020, at 2:40 PM, ahmed.dala...@hrins.net wrote:
>
> Hello NANOG,
>
> Could somebody from Akamai AANP’s network team contact me off-list? I’ve
> tried the peering and NOC and got no replies in months.
>
> Thanks
> Ahmed
https://www.youtube.com/watch?v=vQ5MA685ApE
On Wed 02 Sep 2020 20:40:35 GMT, Baldur Norddahl wrote:
> That is what the 5G router is for...
>
> On Wed, 2 Sep 2020 at 19:47, Michael Hallgren wrote:
>
> > While conserving connectivity? 😂
> >
> >
> > --
> > *From:* Shawn L via
❦ 2 September 2020 16:35 +03, Saku Ytti:
>> I am not buying it. No normal implementation of BGP stays online,
>> replying to heartbeats and accepting updates from eBGP peers, yet
>> after 5 hours still fails to process withdrawals from customers.
>
> I can imagine writing a BGP implementation like this
On Wed, Sep 2, 2020 at 3:04 PM Vincent Bernat wrote:
>
> ❦ 2 September 2020 16:35 +03, Saku Ytti:
>
> >> I am not buying it. No normal implementation of BGP stays online,
> >> replying to heartbeats and accepting updates from eBGP peers, yet
> >> after 5 hours still fails to process withdrawals from c
On Wed, 2 Sep 2020, Warren Kumari wrote:
The root issue here is that the *public* RFO is incomplete / unclear.
Something something flowspec something, blocked flowspec, no more
something does indeed explain that something bad happened, but not
what caused the lack of withdrawals / cascading churn
On 30/Aug/20 17:15, vidister via NANOG wrote:
> Operating a CDN inside a Tier1 network is just shitty behaviour.
What's a Tier 1 network :-)?
Mark.
On 30/Aug/20 17:20, Matt Hoppes wrote:
> No clue. They've been progressively getting worse since 2010. I have no idea
> why anyone chooses them, and they shouldn't be considered a Tier 1 carrier with
> the level of issues they have.
For us, the account management took a turn for the worse af
On 31/Aug/20 16:33, Tomas Lynch wrote:
> Maybe we are idealizing these so-called tier-1 carriers and we,
> tier-ns, should treat them as what they really are: another AS. Accept
> that they are going to fail and do our best to mitigate the impact on
> our own networks, i.e. more peering.
Bingo!