On Jul 3, 2012, at 9:11 AM, "Dan Golding" <dgold...@ragingwire.com> wrote:
>> -----Original Message-----
>> From: James Downs [mailto:e...@egon.cc]
>>
>>
>> On Jul 2, 2012, at 7:19 PM, Rodrick Brown wrote:
>>
>>> People are acting as if Netflix is part of some critical service they
>>> stream movies for Christ sake. Some acceptable level of loss is fine
>>> for 99.99% of Netflix's user base just like cable, electricity and
>>> running water I suffer a few hours of losses each year from those
>>> services it suck yes, is it the end of the world no..
>>
>> You missed the point.
>
> And very publicly missed the point, too. The Netflix issues led to a
> large discussion of downtime, testing, and fault tolerance that has been
> very useful for the community and could lead to some good content for
> NANOG conferences (/pokes PC). For Netflix (and all other similar
> services) downtime is money and money is downtime. There is a
> quantifiable cost for customer acquisition and a quantifiable churn
> during each minute of downtime. Mature organizations actually calculate
> and track this. The trick is to ensure that you have balanced the cost
> of greater redundancy vs the cost of churn/customer acquisition. If you
> are spending too much on redundancy, it's as big a mistake as spending
> too little.

I totally got the point, and the last bit of my post was just
tongue-in-cheek. As I stated in my original response, it's very
unrealistic to plan for every possible failure scenario given the
constraints most businesses face when implementing BCP today. I doubt
Amazon gave much thought to multiple site outages and clients not being
able to dynamically redeploy their engines because of inaccessibility
from ELB.

>
> Also, I don't think there is an acceptable level of downtime for water.
> Neither do water utilities.
>
> - Dan
>