On 17/05/13 10:13, Ghiora Drori wrote:
>
> As to reliability: (This is effectively a contract):

No, it isn't (see below).

> https://aws.amazon.com/glacier/#highlights
> Quote: "Amazon Glacier is designed to provide average annual
> durability of 99.999999999% "
> If this is not good enough for you too bad.
>

When you see someone, anyone, saying such a thing, run. As fast and as far as you can.
This level of assurance is called "nine nines" (henceforth 9*9). It amounts to something on the order of one thousandth of a second of downtime a year. Amazon is talking out of their asses in offering it.

First, even if their service is 100% reliable, you will not get 9*9 of service. Your home internet connection is not that reliable. The fiber connecting Israel to the world is not that reliable. BGP, the protocol that is meant to keep the internet alive should a link go down, is not that reliable. No matter what Amazon is doing, nine nines is not the SLA you will be getting.

Now, you might claim that that is not Amazon's fault. THEY are providing 9*9, and it is the rest of the internet that is not reliable enough. This claim is bullshit. They are not. No single server can provide 9*9. Servers fail. Hard disks fail. Memory fails. NICs fail. Network switches fail. In order to provide a 9*9 SLA, you must be able to detect each and every one of those failures and provide an alternative path *in less than 1 millisecond*, and also ensure that only one such failure happens in a year for every customer. It is not impossible to build such a system, but it will not be affordable. The very fact that Amazon is affordable means that they are not providing 9*9, nor anything even close.

Just to give you a taste of how expensive such a system might be, take heed of the following interesting fact. I just ran a ping between two computers connected via a crossed ethernet cable over a 1Gb/s link. The average ping time was 0.431ms. In other words, just the round-trip time (including kernel wakeup and related activities) between two computers connected over a 3 meter cable is nearly half the time you have at your disposal to react to a downtime *per year*. At this rate, you cannot afford to ping a second time in the hope that the machine was just slightly busy, or that the packet was lost. If you do not get a reply within half a millisecond, you must act. That leaves you only half a millisecond to set up the actual diversion.
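For anyone who wants to check the arithmetic, here is a rough sketch in Python. The function names are made up for this illustration, and it treats the quoted percentage as if it were an availability figure, which is the assumption this whole argument rests on. Taken literally, the quoted 99.999999999% has eleven 9 digits, which works out to about 0.3ms a year: the same order of magnitude as the round millisecond used above, only tighter, so a single 0.431ms ping round trip is already more than the whole budget.

```python
from decimal import Decimal

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s, ignoring leap years


def downtime_budget_ms(percent: str) -> float:
    """Yearly downtime, in milliseconds, allowed by an availability percentage.

    The percentage is passed as a string and parsed with Decimal so that
    figures such as "99.999999999" are not blurred by binary rounding.
    """
    unavailable = 1 - Decimal(percent) / 100
    return float(unavailable * SECONDS_PER_YEAR * 1000)


def budget_fraction(latency_ms: float, budget_ms: float) -> float:
    """How much of the yearly budget a single delay of latency_ms consumes."""
    return latency_ms / budget_ms


if __name__ == "__main__":
    # The percentage from the quoted Glacier page, read as availability.
    budget = downtime_budget_ms("99.999999999")
    print(f"downtime budget per year: {budget:.3f} ms")
    # 0.431 ms is the crossed-cable ping time measured above.
    print(f"one ping round trip uses {budget_fraction(0.431, budget):.2f}x of it")
```

Calling downtime_budget_ms("99.9") instead gives the more familiar "three nines" budget of a bit under nine hours a year, which shows how quickly the budget shrinks with every extra nine.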
What about computers that are further away? From my home, pinging a server located at the server farm of the same ISP I'm connected to takes 17ms. This means I cannot react to a server downtime in less than half that time, no matter what. If the server is down, it will take me no less than 8ms to even find out about it. That is, by the time I find out that the server is down, I am already violating my SLA by a factor of 8. The only way to have redundancy is to be on the same segment and use specialized low-latency equipment.

Since the ISP's link itself might go down, and since BGP is nowhere near fast enough to recover, *the only way to provide a 9*9 service is to build a duplicate of the internet*. I think we can all agree that Amazon did not do that, or their service would have been, by several orders of magnitude, more expensive than it is.

However, supposing that money was no object, would that work? The answer is "no". The reason is that external factors were not taken into account. A 9*9 SLA means that the chances of a problem are less than 1:10^11. The chances of a Richter 8+ earthquake, a tsunami, a volcanic eruption or a meteorite strike are all higher than that.

TLDR version: The SLA is not a contractual question. Especially when counting nines, it is a technological infrastructure question. Amazon is not providing the nine nines it seems to be promising, and is therefore lying on its SLA.

> ( I do not work for Amazon)

I do not work for Amazon either. I used to run a service that was a (very humble) competitor to this one (in which we did not offer an SLA for service availability at all, only for the actual data). I currently work for Akamai, for which Amazon is a competitor (though not this particular service). It should be clear that I do not speak on behalf of my employer. All opinions are my own, and only my own.

Shachar
_______________________________________________
Linux-il mailing list
Linux-il@cs.huji.ac.il
http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il