How much data is on the nodes in cluster 1, and how much disk space is there on cluster 2? Be aware that Cassandra 0.8 has an issue where repair can go crazy and use a lot of space.
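(For the disk space question, a minimal Python sketch of a headroom check you could run on a cluster 2 node before copying a snapshot onto it. The data directory path and the expected snapshot size are assumptions to adjust for your setup; it only reports free space, it does not touch Cassandra.)

    import os

    DATA_DIR = "/var/lib/cassandra/data"   # assumed data_file_directories location
    EXPECTED_SNAPSHOT_GB = 200.0           # assumed size of the cluster 1 snapshot being copied

    st = os.statvfs(DATA_DIR)
    free_gb = st.f_bavail * st.f_frsize / float(1024 ** 3)
    print("free space under %s: %.1f GB" % (DATA_DIR, free_gb))
    if free_gb < 2 * EXPECTED_SNAPSHOT_GB:
        # Leave headroom for the copied sstables plus repair/compaction overhead,
        # which on 0.8 can be substantial.
        print("warning: less than 2x the snapshot size free")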
If you are not regularly running repair I would also repair before the move. The repair after the copy is a good idea but should technically not be necessary. If you can practice the move, watch the repair to see if much is transferred (check the logs). There is always a small transfer, but if you see data being transferred for several minutes I would investigate.

When you start a repair it will repair with the other nodes it replicates data with, so you only need to run it on every RFth node. Start it on one node, watch the logs to see who it talks to, then start it on the first node it does not talk to, and so on.

Add a snapshot before the clean (repair will also snapshot before it runs). Scrub is not needed unless you are migrating or you have file errors.

If your cluster is online, consider running the clean on every RFth node rather than all at once (e.g. 1, 4, 7, 10 then 2, 5, 8, 11). It will have less impact on clients. (There is a small sketch of the token spacing and the every-RFth-node batching after the quoted message below.)

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22/09/2011, at 10:27 AM, Philippe wrote:

> Hello,
> We're currently running on a 3-node RF=3 cluster. Now that we have a better
> grip on things, we want to replace it with a 12-node RF=3 cluster of
> "smaller" servers. So I wonder what the best way to move the data to the new
> cluster would be. I can afford to stop writing to the current cluster for
> whatever time is necessary. Has anyone written up something on this subject?
>
> My plan is the following (nodes in cluster 1 are node1.1->1.3, nodes in
> cluster 2 are node2.1->2.12):
> - stop writing to the current cluster & drain it
> - get a snapshot on each node
> - since it's RF=3, each node should have all the data, so assuming I set the
>   tokens correctly I would move the snapshot from node1.1 to node2.1, 2.2, 2.3
>   and 2.4, then node1.2 -> node2.5, 2.6, 2.7, 2.8, etc. This is because the
>   range for node1.1 is now spread across 2.1->2.4
> - run repair & clean & scrub on each node (more or less in parallel)
>
> What do you think?
> Thanks
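For reference, a minimal Python sketch of the arithmetic discussed above: evenly spaced initial_token values for the 12-node ring (assuming RandomPartitioner, the 0.8 default, with a token space of 0..2**127) and the every-RFth-node batches for repair/cleanup. The node2.x host names are the hypothetical ones from the plan; the script only prints values, it does not talk to a cluster.

    RF = 3
    NEW_NODES = ["node2.%d" % i for i in range(1, 13)]  # node2.1 .. node2.12

    # Evenly spaced initial_token values for a 12-node RandomPartitioner ring.
    RING_SIZE = 2 ** 127
    tokens = [i * RING_SIZE // len(NEW_NODES) for i in range(len(NEW_NODES))]
    for host, token in zip(NEW_NODES, tokens):
        print("%s  initial_token: %d" % (host, token))

    # Every-RFth-node batches for repair/cleanup, i.e.
    # (2.1, 2.4, 2.7, 2.10), (2.2, 2.5, 2.8, 2.11), (2.3, 2.6, 2.9, 2.12).
    batches = [NEW_NODES[i::RF] for i in range(RF)]
    for n, batch in enumerate(batches, 1):
        print("batch %d: %s" % (n, ", ".join(batch)))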