On Jun 30, 2012 12:25 AM, "joel jaeggli" <joe...@bogus.com> wrote:
>
> On 6/30/12 12:11 AM, Tyler Haske wrote:
>>>
>>> I am not a computer science guy but been around a long time. Data centers
>>> and clouds are like software. Once they reach a certain size, its
>>> impossible to keep the bugs out. You can test and test your heart out and
>>> something will slip by. You can say the same thing about nuclear reactors,
>>> Apollo moon missions, the NorthEast power grid, and most other technology
>>> disasters.
>>
>> How to run a datacenter 101. Have more then one location, preferably
>> far apart. It being Amazon I would expect more. :/
>
> there are 7 regions in ec2 three in north america two in asia one in
> europe and one in south america.
>
> us east coast, the one currently being impacted is further subdivided
> into 5 availability zones.
>
> us east 1d appears to be the only one currently being impacted.
>
> distributing your application is left as an exercise to the reader.
>
+1

Sorry to be the Monday morning quarterback, but the sites that went down learned a valuable lesson in single-point-of-failure analysis. Even a highly redundant, professionally run data center is still a single point of failure.

Geo-redundancy is key. In fact, I would take distributed data centers over RAID, UPS, or any other "fancy pants" © mechanisms any day.

And AWS East also seems to be cursed. I would run out of West for a while. :-)

I would also look into clouds of clouds. ... Who knows. Amazon could have an Enron moment, at which point a corporate entity with a tax ID is now a single point of failure.

Pay your money, take your chances.

CB