Re: Global Akamai Outage

2021-07-25 Thread Saku Ytti
On Sun, 25 Jul 2021 at 21:41, Mark Tinka wrote:
> Are you speaking globally, or for NTT?
Doesn't matter. And I'm not trying to say RPKI is a bad thing. I like that we have good AS:origin mapping that is verifiable and machine readable, that part of the solution will be needed for many applicatio

Re: Global Akamai Outage

2021-07-25 Thread Mark Tinka
On 7/25/21 17:32, Saku Ytti wrote: Steering dangerously off-topic from this thread, we have so far had more operational and availability issues from RPKI than from hijacks. And it is a bit more embarrassing to say 'we cocked up' than to say 'someone leaked to internet, it be like it do'. Ar

Re: Global Akamai Outage

2021-07-25 Thread Randy Bush
> Very often the corrective and preventive actions appear to be
> different versions and wordings of 'don't make mistakes', in this case:
>
> - Reviewing and improving input safety checks for mapping components
> - Validate and strengthen the safety checks for the configuration
>   deployment zoning

Re: Global Akamai Outage

2021-07-25 Thread Saku Ytti
On Sun, 25 Jul 2021 at 18:14, Jared Mauch wrote:
> How can we improve response times when things are routed poorly? Time to
> mitigate hijacks is improved by the majority of providers doing RPKI OV, but
> interprovider response time scales are much longer. I also think about the
> two big CTL long

Re: Global Akamai Outage

2021-07-25 Thread Jared Mauch
Work hat is not on, but context is included from prior workplaces etc.
> On Jul 25, 2021, at 2:22 AM, Saku Ytti wrote:
>
> It doesn't seem like a tenable solution, when the solution is 'do
> better', since I'm sure whoever did those checks did their best in the
> first place. So we must assume

Re: Global Akamai Outage

2021-07-25 Thread Mark Tinka
On 7/25/21 08:18, Saku Ytti wrote: Hey, Not a critique against Akamai specifically, it applies just the same to me. Everything seems so complex and fragile. Very often the corrective and preventive actions appear to be different versions and wordings of 'don't make mistakes', in this case:

Re: Global Akamai Outage

2021-07-25 Thread Miles Fidelman
Indeed. Worth rereading for that reason alone (or in particular).
Miles Fidelman
Hank Nussbacher wrote:
On 23/07/2021 09:24, Hank Nussbacher wrote:
From Akamai. How companies and vendors should report outages:
[07:35 UTC on July 24, 2021] Update:
Root Cause: This configuration directive w

Re: Global Akamai Outage

2021-07-25 Thread Hank Nussbacher
On 25/07/2021 09:18, Saku Ytti wrote: Hey, Not a critique against Akamai specifically, it applies just the same to me. Everything seems so complex and fragile. Complex systems are apt to break and only a very limited set of tier-3 engineers will understand what needs to be done to fix it. K