Hi,

Our environment will consist of clusters of 2 to 4 nodes each, with all nodes of a cluster located in the same DC. We want to ensure that every node in a cluster owns 100% of the data. Adding (or removing) a node will be automated, so we want to make sure we are taking the right steps.

Let's say we have node 'A' up and running and want to add another node 'B' to form a cluster. Node A's configuration is:

seed: "IP of A"
listen_address: "IP of A"
num_tokens: 256
rpc_address: 0.0.0.0

The keyspace uses SimpleStrategy with RF: 1.
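To make the "every node owns 100% of the data" goal concrete, here is a rough sketch of the arithmetic as we understand it: with SimpleStrategy in a single DC, each node holds on average RF / N of the data, so full ownership needs RF equal to the number of nodes. The function names are our own and purely illustrative, and the figure is an average (with vnodes, per-node ownership can deviate from it when RF is smaller than N).

```python
def ownership_fraction(replication_factor: int, node_count: int) -> float:
    """Average fraction of the data each node holds (assumes SimpleStrategy, one DC)."""
    if replication_factor < 1 or node_count < 1:
        raise ValueError("replication_factor and node_count must be >= 1")
    # A row cannot have more replicas than there are nodes in the cluster.
    effective_rf = min(replication_factor, node_count)
    return effective_rf / node_count


def required_rf_for_full_ownership(node_count: int) -> int:
    """RF needed so that every node holds a replica of every row."""
    if node_count < 1:
        raise ValueError("node_count must be >= 1")
    return node_count


if __name__ == "__main__":
    # Hypothetical walk through the cluster sizes mentioned above (2 to 4 nodes).
    for nodes in (2, 3, 4):
        rf = required_rf_for_full_ownership(nodes)
        print(nodes, rf, ownership_fraction(rf, nodes))
```

So for the A + B case, RF: 1 would leave each node with roughly half of the data, and RF: 2 is what gives both nodes everything.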
To add node 'B' to the cluster we do the following:

1. Stop Cassandra on B.
2. Update cassandra.yaml: change seed to point to "IP of A".
3. Update cassandra-topology.properties: add node A's IP to it and make it the default one.
4. rm -rf /var/lib/cassandra/*
5. Start Cassandra on B.
6. Wait until nodetool status reports that node B is up.
7. Update the RF of the keyspace to 2.
8. Run nodetool repair on B and wait for it to finish.

Can we update the RF on A before starting Cassandra on B in order to skip steps 7 and 8?

Once the data is in sync on both nodes, we want to make node B a seed node:

9. Update the seed property on A and B to include the IP of node B.
10. Restart Cassandra on both nodes.

When adding more nodes to the cluster, the steps will be the same, except that the seed property will contain all existing nodes in the cluster.

So, are these steps everything we need to do? Is there anything more we need to do? Is there an easier way to do what we want, or are all the steps above mandatory?
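For reference, this is roughly how our automation would build the inputs for steps 7 and 9. It is only a sketch: the keyspace name `my_keyspace`, the example IPs, and both helper names are placeholders of ours, and the script only builds strings (it does not talk to Cassandra).

```python
from typing import Iterable


def alter_keyspace_cql(keyspace: str, replication_factor: int) -> str:
    """Build the CQL statement for step 7 (assumes SimpleStrategy, as in our setup)."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}};"
    )


def seeds_value(existing_seeds: Iterable[str], new_node_ip: str) -> str:
    """Build the comma-separated seed list for step 9, without duplicates."""
    seeds = list(existing_seeds)
    if new_node_ip not in seeds:
        seeds.append(new_node_ip)
    return ",".join(seeds)


if __name__ == "__main__":
    # Placeholder values: node A is 10.0.0.1, node B is 10.0.0.2.
    print(alter_keyspace_cql("my_keyspace", 2))
    print(seeds_value(["10.0.0.1"], "10.0.0.2"))
```

The statement would then be run through cqlsh, followed by nodetool repair on B (step 8), and the seed string written into cassandra.yaml on both nodes before the restart in step 10.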