> I am worried that if only 1 node is active and online, and the other
> N-1 nodes are inactive, down, and offline, that the cluster will not
> be able to complete the operation, because not all of the data is
> available on the 1 node that is up.

Which is true, but the normal approach is to set your RF to whatever gives you the redundancy you need. The total size of the cluster should not be equal to RF, except in the special case where you do not have the performance need to increase the size of the cluster, but you *do* have the redundancy need.

If you want N copies of your data and want to be able to read and write that data when N-1 nodes are down, then use RF=N and CL.ONE. But N does not need to be, and normally *would* not be, the ring size except incidentally.

Put another way, you can decide the number of nodes whose failure you want to survive, and let *that* be N. Now you have some options:

(1) Use CL.ONE and set RF to N+1.
(2) Use CL.QUORUM and set RF to (N*2)+1.

From the perspective of nodes being down, this is what you should care about. The total number of nodes in the cluster is irrelevant.

If you need more nodes in the cluster because data sizes and/or request rates are so high that you need better performance, you increase the size of the cluster. But the size of the cluster is a property independent of the RF. In fact, if you *are* in this position, then increasing RF would be counter-productive, since a higher RF means a higher load on the nodes (more disk i/o in total, more disk space in total).

The fact that you may initially set RF to the same number as the entire cluster size, because you do not need additional nodes for performance reasons, is an incidental "detail", and there is no need to *tie* the RF to the cluster size.

The reason you would normally change RF is that your decided-upon policy for the desired level of redundancy changes. The reason you would normally change the cluster size is that your cluster needs to be bigger for performance reasons OR that you want to increase RF to be higher than your current cluster size.

It's totally fine if you happen to have exactly 3 nodes and want to use RF=3 so that you survive 2 of those nodes being down.
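
As a sanity check on the arithmetic in options (1) and (2), here is a minimal Python sketch. The function names are made up for illustration and are not part of any Cassandra API; the only fact assumed is that a quorum is floor(RF/2)+1 replicas. It also assumes the worst case, where every failed node holds a replica of the data you are trying to reach.

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond at CL.QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def min_rf(failures: int, consistency: str) -> int:
    """Smallest RF that still serves requests with `failures` replicas down.

    Hypothetical helper: `consistency` is "ONE" or "QUORUM".
    """
    if failures < 0:
        raise ValueError("failures must be >= 0")
    if consistency == "ONE":
        return failures + 1        # option (1): RF = N + 1
    if consistency == "QUORUM":
        return 2 * failures + 1    # option (2): RF = (N * 2) + 1
    raise ValueError("unsupported consistency level: " + consistency)


def tolerated_failures(rf: int, consistency: str) -> int:
    """How many replicas of a row may be down before requests fail."""
    if rf < 1:
        raise ValueError("rf must be >= 1")
    if consistency == "ONE":
        required = 1
    elif consistency == "QUORUM":
        required = quorum(rf)
    else:
        raise ValueError("unsupported consistency level: " + consistency)
    return rf - required
```

Note that the cluster size appears nowhere in these functions; it only has to be at least as large as RF. For instance, tolerated_failures(3, "ONE") returns 2, which is the 3-node, RF=3 case.
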
Nothing wrong with that at all. But what I am saying is that you need to scale RF with your required level of redundancy, but scale the cluster size with your capacity needs. They need not be tied.

If you change your mind and want RF=4 by policy, that will imply changing both cluster size and RF. But the reason you're increasing RF is because you decided that you want RF=4, and that in turn necessitated increasing the cluster size to >= 4. It is not the case that having RF be equal to the cluster size is in and of itself a useful property.

--
/ Peter Schuller (@scode on twitter)