On Sun, Jun 21, 2015 at 8:25 PM, John Wong <gokoproj...@gmail.com> wrote:
> In the case of restoring from snapshots I have restored a 6-node replica
> with just copying snapshot files (along with schema files), run nodetool
> refresh, and should be able to complete in a few hours. But now with
> smaller replica, do I again just pick snapshots from any 3 nodes? What and
> why do I need to fix token range (from what I read)?
>

Unless you have large data sizes, the easiest way to do this process
generically is:

1) copy all sstables from all source nodes to all target nodes, avoiding
   namespace collision between files from different nodes.
2) start target nodes with data in place and appropriate tokens for the new
   cluster (don't bother using refresh, you can just:
   a) start target cluster with no data
   b) create schema
   c) stop target cluster nodes
   d) start target cluster nodes)

As a small caveat, this process assumes that you can deal with the on-disk
bloat in the target cluster until compaction merges. The bloat comes from
having multiple replicas of the same data collapsed onto the same node.

Otherwise, the simplest is to:

1) not use vnodes
2) cut your cluster in half

You accomplish this by copying the data from two source nodes to one target
node, avoiding name collisions.

    A  B  C  D  E  F
     AB    CD    EF

AB has the token of A, CD has the token of C, EF has the token of E.

Because no remaining nodes lose ranges, one does not need to run cleanup.

=Rob
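
For illustration only, the "cut your cluster in half" mapping above can be sketched in Python. This is a planning helper, not a Cassandra tool: the function name `halve_ring` and the token values are hypothetical, and it assumes single-token nodes (no vnodes) listed in ring order. Real tokens would come from the source cluster.

```python
def halve_ring(nodes):
    """Pair adjacent nodes in ring order; each merged node keeps the
    token of the first node in its pair (AB gets A's token, and so on).

    nodes: list of (name, token) tuples in ring order.
    Returns a list of (merged_name, token, source_names) tuples.
    """
    if len(nodes) % 2 != 0:
        raise ValueError("need an even number of source nodes to halve the ring")
    merged = []
    for i in range(0, len(nodes), 2):
        (first, token), (second, _unused_token) = nodes[i], nodes[i + 1]
        merged.append((first + second, token, [first, second]))
    return merged


# Hypothetical tokens, chosen only to make the example readable.
source_ring = [("A", 0), ("B", 10), ("C", 20), ("D", 30), ("E", 40), ("F", 50)]
plan = halve_ring(source_ring)
for name, token, sources in plan:
    print(f"{name}: token={token}, copy sstables from {sources}")
```

Each entry in the plan names the two source nodes whose sstables go to one target node and the token that target should start with. Copying the files without name collisions is still a separate step.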