Re: Cloudflare is down

2013-03-06 Thread danny
On 2013-03-04 08:09, Christopher Morrow wrote: > On Mon, Mar 4, 2013 at 2:31 AM, Saku Ytti wrote: >> I know a lot of vendors are fuzzing with 'codenomicon' and they appear not to >> have flowspec fuzzer. > > I suspect they fuzz where the money is ... number of users of bgp? number of users of flowspec?

RE: Cloudflare is down

2013-03-05 Thread Adam Vitkovsky
> From my point of view, outages are caused by: > 1) operator > 2) software defect > 3) hardware defect From my experience, nowadays the likelihood of an outage as a result of 3) is an order of magnitude lower than that of 2), and the same goes for the ratio of 2) to 1). In other words the vast majority of the outages are caused

Re: Cloudflare is down

2013-03-04 Thread George Herbert
On Mon, Mar 4, 2013 at 10:40 AM, Saku Ytti wrote: > On (2013-03-04 13:23 -0500), Jeff Wheeler wrote: > >> We have lots of stupid people in our industry because so few >> understand "The Way Things Work." > > We have tendency to view mistakes we do as unavoidable human errors and > mistakes other p

Re: Cloudflare is down

2013-03-04 Thread Saku Ytti
On (2013-03-04 12:33 -0800), Constantine A. Murenin wrote: > to use http-acceleration services without DNS tie-ins. Last I > checked, CloudFlare didn't even let you setup just a subdomain for > their service, e.g. they do require complete DNS control from the > registrar-zone level, all the time,

Re: Cloudflare is down

2013-03-04 Thread Constantine A. Murenin
On 3 March 2013 23:31, Saku Ytti wrote: > On (2013-03-03 12:46 -0800), Constantine A. Murenin wrote: > >> Definitely smart to be delegating your DNS to the web-accelerator >> company and a single point of failure, especially if you are not just >> running a web-site, but have some other independen

Re: Cloudflare is down

2013-03-04 Thread Valdis . Kletnieks
On Mon, 04 Mar 2013 20:40:58 +0200, Saku Ytti said: > Most people design only against 3), often with design which actually > increases likelihood of 2) and 1), reducing overall MTBF on design which > strictly theoretically increases it. I have to admit I've always suspected that MTBWTF would be a m

Re: Cloudflare is down

2013-03-04 Thread Saku Ytti
On (2013-03-04 13:23 -0500), Jeff Wheeler wrote: > We have lots of stupid people in our industry because so few > understand "The Way Things Work." We have a tendency to view mistakes we make as unavoidable human errors and mistakes other people make as avoidable stupidity. We should actively plan fo

Re: Cloudflare is down

2013-03-04 Thread Jeff Wheeler
On Mon, Mar 4, 2013 at 9:51 AM, Leo Bicknell wrote: > will fix the problem. It won't. Next time the issue will be > different, and the same undertrained person who missed the packet > size this time will miss the next issue as well. They should all be > sitting around saying, "how can we hire c

Re: Cloudflare is down

2013-03-04 Thread Warren Bailey
+1. Original message From: "Patrick W. Gilmore" Date: 03/04/2013 11:46 AM (GMT-05:00) To: NANOG list Subject: Re: Cloudflare is down On Mar 04, 2013, at 09:51 , Leo Bicknell wrote:

Re: Cloudflare is down

2013-03-04 Thread Patrick W. Gilmore
On Mar 04, 2013, at 09:51 , Leo Bicknell wrote: > Any competent network admin would have stopped and questioned a > 90,000+ byte packet and done more investigation. Competent programmers > writing their internal tools would have flagged that data as out > of rage. The last couple words are the
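The range check mentioned in the quoted text can be sketched roughly as follows. This is only an illustration, not anything taken from the thread or from any vendor tool: the function names are hypothetical, and the one hard fact it relies on is that the IPv4 Total Length field is 16 bits wide, so no IPv4 packet can exceed 65,535 bytes.

```python
# Hypothetical pre-flight check for a packet-length match in a filter rule.
# Names are illustrative; nothing here comes from a real tool.
# The 16-bit IPv4 Total Length field caps a packet at 65,535 bytes.
MAX_IPV4_PACKET_LEN = 65_535


def validate_packet_length_range(low: int, high: int) -> None:
    """Raise ValueError if a packet-length match range cannot occur on the wire."""
    if low > high:
        raise ValueError(f"empty range: {low} > {high}")
    if low < 0 or high > MAX_IPV4_PACKET_LEN:
        raise ValueError(
            f"packet length range {low}-{high} is outside 0-{MAX_IPV4_PACKET_LEN}"
        )


def is_plausible(low: int, high: int) -> bool:
    """Return True when the range passes validation, False otherwise."""
    try:
        validate_packet_length_range(low, high)
    except ValueError:
        return False
    return True
```

With a check like this in front of rule generation, a match on a 90,000+ byte packet would presumably be rejected before it ever reached a router, whatever the router then does with it.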

Re: Cloudflare is down

2013-03-04 Thread Saku Ytti
On (2013-03-04 06:51 -0800), Leo Bicknell wrote: > From what I have heard so far there is something else they could > have done, hire higher quality people. Your solution to mistakes seems to be not to make them. I can understand the train of thought, but I doubt the practicality of such advice.

Re: Cloudflare is down

2013-03-04 Thread Christopher Morrow
On Mon, Mar 4, 2013 at 2:31 AM, Saku Ytti wrote: > I know a lot of vendors are fuzzing with 'codenomicon' and they appear not to > have flowspec fuzzer. I suspect they fuzz where the money is ... number of users of bgp? number of users of flowspec?

Re: Cloudflare is down

2013-03-04 Thread Leo Bicknell
In a message written on Mon, Mar 04, 2013 at 09:31:13AM +0200, Saku Ytti wrote: > Probably only thing you could have done to plan against this, would have > been to have solid dual-vendor strategy, to presume that sooner or later, > software defect will take one vendor completely out. And maybe the

Re: Cloudflare is down

2013-03-03 Thread Saku Ytti
On (2013-03-03 12:46 -0800), Constantine A. Murenin wrote: > Definitely smart to be delegating your DNS to the web-accelerator > company and a single point of failure, especially if you are not just > running a web-site, but have some other independent infrastructure, > too. To be fair, most of u

Re: Cloudflare is down

2013-03-03 Thread Steve
I am not sure of bug... could be normal behavior for how JunOS CLI handle "extended" packet size. Will wait for Juniper comment on incident. Vinod > Date: Sun, 3 Mar 2013 20:02

RE: Cloudflare is down

2013-03-03 Thread Frank Bulk
From: Alex [mailto:dreamwave...@yahoo.com] Sent: Sunday, March 03, 2013 3:54 PM To: Nick Hilliard Cc: nanog@nanog.org Subject: Re: Cloudflare is down Is there any blog or some sort of site that has a up to date list with the latest network outages? Like, not just Cloudflare, but every major outage tha

Re: Cloudflare is down

2013-03-03 Thread Alex
Is there any blog or some sort of site that has an up-to-date list of the latest network outages? Like, not just Cloudflare, but every major outage that has happened lately. It's really nice to see a post-mortem analysis like in this case. Bugs/hidden "features" are not "documented" in most of the

Re: Cloudflare is down

2013-03-03 Thread Florian Weimer
* Constantine A. Murenin: > And how exactly do they expect end-users "clearing the DNS cache"? Do > I call AT&T, and ask them to clear their cache? Sure, and also tell them to clear their BGP cache (aka "route flap dampening"). 8-)

RE: Cloudflare is down

2013-03-03 Thread Vinod K
I am not sure this is a bug... it could be normal behavior for how the JunOS CLI handles an "extended" packet size. I will wait for Juniper's comment on the incident. Vinod > Date: Sun, 3 Mar 2013 20:02:05 + > From: n...@foobar.org > To: arthur.w...@gmail.com > Subject: Re: Cloudflare is down

Re: Cloudflare is down

2013-03-03 Thread Constantine A. Murenin
On 3 March 2013 12:02, Nick Hilliard wrote: > On 03/03/2013 10:46, Arthur Wist wrote: >> Apparently due to a routing issue... > > back up again: http://blog.cloudflare.com/todays-outage-post-mortem-82515 > > tl;dr: outage caused by flowspec filter tickling vendor bug. Definitely smart to be deleg

Re: Cloudflare is down

2013-03-03 Thread Nick Hilliard
On 03/03/2013 10:46, Arthur Wist wrote: > Apparently due to a routing issue... back up again: http://blog.cloudflare.com/todays-outage-post-mortem-82515 tl;dr: outage caused by flowspec filter tickling vendor bug. Nick

Re: Cloudflare is down

2013-03-03 Thread Jay Ashworth
----- Original Message ----- > From: "Arthur Wist" > To: nanog@nanog.org > Sent: Sunday, March 3, 2013 5:46:15 AM > https://twitter.com/CloudFlare/status/308157556113698816 > https://twitter.com/CloudFlare/status/308165820285083648 > > Apparently due to a routing issue... Unless you're in UTC-1

Cloudflare is down

2013-03-03 Thread Arthur Wist
https://twitter.com/CloudFlare/status/308157556113698816 https://twitter.com/CloudFlare/status/308165820285083648 Apparently due to a routing issue... -AW