Re: Replicating to all nodes

Peter Schuller Fri, 15 Jul 2011 15:18:54 -0700

> I am worried that if only 1 node is active and online, and the other
> N-1 nodes are inactive, down, and offline, that the cluster will not
> be able to complete the operation, because not all of the data is
> available on the 1 node that is up.


That is true, but the normal approach is to set your RF to whatever
gives you the redundancy you need. The total size of the cluster need
not be equal to RF; the two coincide only in the special case where
you do not have the performance need to increase the size of the
cluster, but you *do* have the redundancy need.

If you want N copies of your data and want to be able to read and
write that data when N-1 nodes are down, then use RF=N and CL.ONE.

But N does not need to be, and normally *would* not be, the ring size
except incidentally.

Put another way, you can decide the number of nodes that you want to
survive failing, and let *that* be N (so N now counts failed nodes,
not copies as above). Now you have some options:

(1) Use CL.ONE and set RF to N+1.
(2) Use CL.QUORUM and set RF to (N*2)+1.

From the perspective of nodes being down, this is what you should care
about. The total number of nodes in the cluster is irrelevant.
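
The two options above can be sketched as a small calculation. This is
only an illustration: the function names are made up and are not part
of any Cassandra API, and "down" is taken in the worst case, where
every failed node is a replica of the same key.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at CL.QUORUM: a strict majority of RF."""
    return rf // 2 + 1


def required_rf(failures: int, consistency: str) -> int:
    """Smallest RF that still serves requests with `failures` replicas down."""
    if consistency == "ONE":
        return failures + 1          # option (1): RF = N + 1
    if consistency == "QUORUM":
        return 2 * failures + 1      # option (2): RF = (N * 2) + 1
    raise ValueError(f"unsupported consistency level: {consistency}")


def survives(rf: int, down: int, consistency: str) -> bool:
    """True if enough replicas remain to satisfy the consistency level."""
    alive = rf - down
    needed = 1 if consistency == "ONE" else quorum(rf)
    return alive >= needed


if __name__ == "__main__":
    for cl in ("ONE", "QUORUM"):
        rf = required_rf(2, cl)
        print(cl, "needs RF =", rf, "to survive 2 failures:",
              survives(rf, 2, cl))
```

Note that the cluster size does not appear anywhere in the
calculation, only RF and the consistency level.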

If you need more nodes in the cluster because data sizes and/or
request rates are so high that you need better performance, you
increase the size of the cluster. But the size of the cluster is a
property independent of the RF. In fact if you *are* in this position,
then increasing RF would be counter-productive since a higher RF means
a higher load on the nodes (more disk i/o in total, more disk space in
total).
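
A back-of-the-envelope sketch of why that is, assuming data is spread
evenly across the ring (real clusters will be less tidy). The function
name and the figures are invented for illustration:

```python
def per_node_storage_gb(dataset_gb: float, rf: int, nodes: int) -> float:
    """Approximate data held per node, assuming a perfectly even spread.

    Every row is stored RF times, so the cluster as a whole holds
    dataset_gb * rf, divided among `nodes` machines.
    """
    if rf > nodes:
        raise ValueError("RF cannot exceed the number of nodes")
    return dataset_gb * rf / nodes


if __name__ == "__main__":
    # Hypothetical 600 GB dataset.
    print(per_node_storage_gb(600, rf=3, nodes=6))   # baseline
    print(per_node_storage_gb(600, rf=5, nodes=6))   # higher RF: more per node
    print(per_node_storage_gb(600, rf=3, nodes=12))  # more nodes: less per node
```

Raising RF pushes the per-node figure up, while adding nodes pulls it
down, which is why the two knobs serve different purposes.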

The fact that you may set RF to the same number of nodes as the entire
cluster size initially, because you do not have the need for
additional nodes for performance reasons, would be an incidental
"detail" and there is no need to *tie* the RF to the cluster size.

The reason you would normally change RF is that your decided-upon
policy for the desired level of redundancy has changed.

The reason you would normally change the cluster size is that your
cluster needs to be bigger for performance reasons OR that you want to
increase RF to be higher than your current cluster size.

It's totally fine if you happen to have exactly 3 nodes and want to
use RF=3 and survive 2 of those nodes being down. Nothing wrong
with that at all. But what I am saying is that you need to scale RF
with your required level of redundancy, but scale the cluster size
with your capacity needs. They need not be tied.

If you change your mind and want RF=4 by policy, that will imply
changing both cluster size and RF. But the reason you're increasing RF
is because you decided that you want RF=4, and that in turn
necessitated increasing the cluster size to >= 4. It is not the case
that having RF be equal to the cluster size is in and of itself a
useful property.
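
To make the direction of the dependency concrete, here is a rough
sketch in which RF comes from policy and the cluster size is derived
from it plus capacity. The function name, the per-node capacity
parameter and the numbers are all assumptions for illustration, and
only disk space is modelled (not request rate):

```python
import math


def min_cluster_size(rf: int, dataset_gb: float, node_capacity_gb: float) -> int:
    """Smallest cluster that satisfies both redundancy and capacity.

    - Redundancy: there must be at least RF nodes to hold RF copies.
    - Capacity: total stored data (dataset * RF) must fit on the nodes.
    """
    nodes_for_capacity = math.ceil(dataset_gb * rf / node_capacity_gb)
    return max(rf, nodes_for_capacity)


if __name__ == "__main__":
    # Small dataset: cluster size equals RF, but only incidentally.
    print(min_cluster_size(rf=3, dataset_gb=100, node_capacity_gb=1000))
    # Policy change to RF=4 forces a fourth node.
    print(min_cluster_size(rf=4, dataset_gb=100, node_capacity_gb=1000))
    # Large dataset: capacity, not RF, determines the size.
    print(min_cluster_size(rf=3, dataset_gb=2000, node_capacity_gb=1000))
```

RF is an input here and the cluster size an output; nothing in the
calculation sets RF from the number of nodes.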

-- 
/ Peter Schuller (@scode on twitter)
