I plan to run a multi-data-center Cassandra 2 setup with 2-4 nodes per
data center and several tens of data centers. Our keyspaces are replicated
on a certain number of nodes in *each* data center, so each data center
is essentially a logical ring that covers all token ranges. The deployment
is vnode-based, so tokens should get assigned to the nodes automatically.

The documentation at
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
suggests that adding a new node requires cleanup to be run on all other
nodes of the cluster. However, it does not clarify the procedure in a
multi-data-center setup.

My understanding is that nodetool cleanup removes data that no longer
belongs to the node it is run on. When a new data center is being set up,
we are creating completely new replicas and, AFAICT, this does not result
in data movement/rebalancing outside of the new data center, so there is
no cleanup requirement on nodes of other data centers. Can someone confirm
that my understanding is right and that cleanup is not required on nodes
of other data centers?
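
To make the reasoning concrete, here is a minimal Python sketch of how I
picture per-data-center replica placement. It is a toy model, not
Cassandra's actual code: it assumes NetworkTopologyStrategy-style placement
without racks, uses MD5 hashes as a stand-in for the real partitioner, and
all node names and helpers (token, build_ring, replicas, dc1_owners) are
made up for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Deterministic stand-in for the partitioner (assumption: MD5 hash)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def build_ring(nodes: dict, vnodes: int = 8) -> list:
    """Build a sorted ring of (token, node, dc) with `vnodes` tokens per node."""
    ring = []
    for node, dc in nodes.items():
        for i in range(vnodes):
            ring.append((token(f"{node}-{i}"), node, dc))
    ring.sort()
    return ring


def replicas(ring: list, key: str, dc: str, rf: int) -> list:
    """Walk the ring clockwise from the key's token and collect the first
    `rf` distinct nodes that belong to `dc` (racks are ignored here)."""
    tokens = [t for t, _, _ in ring]
    start = bisect.bisect_left(tokens, token(key))
    found = []
    for offset in range(len(ring)):
        _, node, node_dc = ring[(start + offset) % len(ring)]
        if node_dc == dc and node not in found:
            found.append(node)
            if len(found) == rf:
                break
    return found


keys = [f"key{i}" for i in range(1000)]
old = {"a1": "dc1", "a2": "dc1", "a3": "dc1"}
with_new_dc = {**old, "b1": "dc2", "b2": "dc2", "b3": "dc2"}
with_new_node = {**old, "a4": "dc1"}


def dc1_owners(nodes: dict, rf: int = 2) -> dict:
    """Replica nodes in dc1 for every sample key, given a cluster layout."""
    ring = build_ring(nodes)
    return {k: replicas(ring, k, "dc1", rf) for k in keys}


before = dc1_owners(old)
# Adding a whole new data center: dc1 ownership should stay the same.
unchanged_after_new_dc = dc1_owners(with_new_dc) == before
# Adding a node inside dc1: some keys move, which is the cleanup case.
changed_after_new_node = dc1_owners(with_new_node) != before

print("dc1 ownership unchanged after adding dc2:", unchanged_after_new_dc)
print("dc1 ownership changed after adding a4 to dc1:", changed_after_new_node)
```

In this model the dc1 replica sets do not change when dc2 joins, because
the walk simply skips tokens owned by other data centers. Only a node added
to dc1 itself takes over ranges from existing dc1 nodes, which is the case
where I would expect cleanup to be needed.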


Thanks

Vish
