Correct steps how to extend cluster size and RF

Juho Mäkinen Wed, 21 Jul 2010 13:32:33 -0700

I'm just about to extend my current two-node production cluster into
a five-node cluster and I'd like to be sure that my plan is correct.


Currently the cluster has two nodes with RF=2. The target is to add four
nodes, increase RF to 3 and drop one of the old nodes.

My current plan is:
1) Add one node with RF=3 but keep the clients connecting only to the
two old nodes. As I'm doing many reads with ConsistencyLevel.ONE, this
should prevent the clients from getting exceptions about missing keys.

2) Restart both old nodes with a configuration that has RF=3. Subsequent
inserts should then be propagated to the new third node.

3) Execute "nodetool repair" on the new node. After this, all three
nodes should have all the data.

4) Tell the clients that they can now also connect to the new node.

5) Add the three remaining nodes, one at a time, waiting for
bootstrapping to complete on each before adding the next. Also add
the nodes to the client connection list.

6) Execute "nodetool decommission" on the old node that is to be dropped.

7) Execute "nodetool loadbalance" on the nodes if needed.

Can somebody spot any big problem with the plan?
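
As a rough sanity check of the plan above, here is a minimal Python
sketch of where replicas would land at each stage. The token values
and node names are made up for illustration, and the placement rule
(the owner of the token range plus the next RF-1 nodes clockwise on
the ring) is my understanding of RackUnawareStrategy, not code taken
from Cassandra.

```python
from bisect import bisect_left


def replicas(ring, key_token, rf):
    """Return the nodes that would hold a key under the assumed rule.

    ring maps token -> node name. A node is assumed to own the range
    ending at its token, and the remaining replicas go to the next
    nodes clockwise on the ring.
    """
    tokens = sorted(ring)
    # First node whose token is >= the key's token, wrapping around.
    first = bisect_left(tokens, key_token) % len(tokens)
    count = min(rf, len(tokens))
    return [ring[tokens[(first + i) % len(tokens)]] for i in range(count)]


def data_share(node_count, rf):
    """Fraction of all data per node, assuming evenly spaced tokens."""
    return min(rf, node_count) / node_count


# Hypothetical rings for the stages of the plan (tokens and names made up).
start = {0: "old1", 50: "old2"}
after_step_1 = {0: "old1", 50: "old2", 75: "new1"}
after_step_5 = {0: "old1", 17: "new2", 33: "new3",
                50: "old2", 67: "new4", 75: "new1"}
after_step_6 = {t: n for t, n in after_step_5.items() if n != "old2"}

stages = [
    ("start, RF=2", start, 2),
    ("after step 1, RF=3", after_step_1, 3),
    ("after step 5, RF=3", after_step_5, 3),
    ("after step 6, RF=3", after_step_6, 3),
]
for label, ring, rf in stages:
    print(label, replicas(ring, 10, rf), data_share(len(ring), rf))
```

In this model, with three nodes and RF=3 every node is a replica for
every key, which is presumably why the repair in step 3 should finish
before clients start reading from the new node at ConsistencyLevel.ONE.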

I'm also thinking about adding one node in another data center to act
as a live backup node. The idea is that every key would have a copy
on the backup machine. If I'm correct, this can be done with
RackAwareStrategy, as stated on the Operations wiki page. No clients
would be reading from this backup machine. Is this even possible, and
if it is, would it be wise? Or should I just do backups by
snapshotting the cluster files, as suggested on the Operations wiki
page? I'm currently using RackUnawareStrategy and I'm not even sure
whether it can be changed without cluster downtime.

 - Juho Mäkinen
