On Tue, Apr 15, 2014 at 6:14 AM, Ken Hancock <ken.hanc...@schange.com> wrote:

> Keep in mind if you lose the wrong two, you can't satisfy quorum. In a
> 5-node cluster with RF=3, it would be impossible to lose 2 nodes without
> affecting quorum for at least some of your data. In a 6 node cluster, once
> you've lost one node, if you were to lose another, you only have a 1-in-5
> chance of not affecting quorum for some of your data.
>

This is why the real highly available way to run Cassandra with QUORUM is RF=5, with 5 "racks".

Briefly, any given node running a JVM based distributed application should be assumed to potentially become transiently unavailable for a short time, for example during long GC pauses or rolling restarts. There is also a chance of non-transient failure (hard down) at any time, and a much smaller chance of two simultaneous non-transient failures.

If you have RF=3 and lose two nodes (one transient, the other non-transient) in a range, that range is now unavailable because quorum is 2 and 3-2 is 1, which is less than 2. If you have RF=5 and lose two nodes in the same way, quorum is 3 and 5-2 is 3, which is equal to 3.

AFAICT, no one actually runs Cassandra in this way because keeping 5 copies of your already denormalized data seems excessive and is difficult to justify to management.

=Rob
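
For anyone who wants to check the arithmetic in this thread, here is a minimal Python sketch. It is not Cassandra code. It assumes a single datacenter, one token per node (no vnodes), and replicas placed on the owner plus the next RF-1 nodes around the ring. All function names are made up for illustration.

```python
def quorum(rf):
    """Replicas that must respond for QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def range_available(rf, replicas_down):
    """True if a range with `rf` replicas can still reach QUORUM."""
    return rf - replicas_down >= quorum(rf)


def replica_sets(num_nodes, rf):
    """Replica set for each primary range on a simple ring.

    Assumption: one token per node, replicas are the owner plus the
    next rf - 1 nodes clockwise.
    """
    return [{(start + i) % num_nodes for i in range(rf)}
            for start in range(num_nodes)]


def safe_second_failures(num_nodes, rf, first_down=0):
    """Nodes whose loss, after `first_down`, leaves every range at QUORUM."""
    sets = replica_sets(num_nodes, rf)
    safe = []
    for second in range(num_nodes):
        if second == first_down:
            continue
        down = {first_down, second}
        if all(range_available(rf, len(rs & down)) for rs in sets):
            safe.append(second)
    return safe


if __name__ == "__main__":
    print(quorum(3), range_available(3, 2))   # 2 False
    print(quorum(5), range_available(5, 2))   # 3 True
    print(safe_second_failures(5, 3))         # []
    print(safe_second_failures(6, 3))         # [3]
```

Under those assumptions the output matches both posts: with 5 nodes and RF=3 there is no safe second failure, with 6 nodes only one of the remaining five nodes (the one directly opposite on the ring) can be lost safely, and with RF=5 a range survives two replicas down.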