On Sun, Jun 21, 2015 at 8:25 PM, John Wong <gokoproj...@gmail.com> wrote:

> In the case of restoring from snapshots I have restored a 6-node replica
> with just copying snapshot files (along with schema files), run nodetool
> refresh, and should be able to complete in a few hours. But now with
> smaller replica, do I again just pick snapshots from any 3 nodes? What and
> why do I need to fix token range (from what I read)?
>

Unless you have large data sizes, the easiest generic way to do this
is:

1) copy all sstables from all source nodes to all target nodes, avoiding
file name collisions between files from different nodes.
2) start the target nodes with the data in place and appropriate tokens for
the new cluster. Don't bother using refresh; you can just: a) start the
target cluster with no data, b) create the schema, c) stop the target
cluster nodes, d) start the target cluster nodes again with the data in
place.
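
The "avoiding collisions" part of step 1 can be sketched roughly as below.
This is only an illustration, not a tested tool: it assumes 2.1-style
sstable file names of the form <keyspace>-<table>-<version>-<generation>-<component>
(other versions name files differently), and plan_renames and the node
labels are hypothetical names of mine. It only computes a rename plan; it
does not touch any files. All components of one sstable keep the same new
generation number.

```python
import re
from itertools import count

# Assumption: 2.1-style names, e.g. "ks-cf-ka-1-Data.db".
# Other Cassandra versions use a different layout and will not match.
SSTABLE_NAME = re.compile(r"^(?P<prefix>.+-)(?P<gen>\d+)-(?P<component>[^-]+)$")


def plan_renames(files_by_node):
    """Return {(node, old_name): new_name} with generations unique across nodes.

    files_by_node maps a source node label to the sstable file names
    found in one table directory on that node.
    """
    next_gen = count(1)  # one counter shared by every source node
    plan = {}
    for node in sorted(files_by_node):
        new_gen_for = {}  # (prefix, old generation) -> new generation, per node
        for name in sorted(files_by_node[node]):
            m = SSTABLE_NAME.match(name)
            if m is None:
                continue  # not an sstable component under the assumed pattern
            key = (m["prefix"], m["gen"])
            if key not in new_gen_for:
                new_gen_for[key] = next(next_gen)
            plan[(node, name)] = f"{m['prefix']}{new_gen_for[key]}-{m['component']}"
    return plan


if __name__ == "__main__":
    example = {
        "nodeA": ["ks-cf-ka-1-Data.db", "ks-cf-ka-1-Index.db"],
        "nodeB": ["ks-cf-ka-1-Data.db"],
    }
    for (node, old), new in sorted(plan_renames(example).items()):
        print(node, old, "->", new)
```

Numbering from a single shared counter is just one way to guarantee that no
two source sstables end up with the same name on a target node.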

As a small caveat, this process assumes that you can tolerate the on-disk
bloat in the target cluster until compaction merges the duplicates. The bloat
comes from having multiple replicas of the same data collapsed onto the same
node.

Otherwise, the simplest approach is to:

1) not use vnodes
2) cut your cluster in half

You accomplish this by copying the data from two source nodes to one target
node, avoiding name collisions.

A B C D E F
AB CD EF

AB has the token of A, CD has the token of C, EF has the token of E.

Because no remaining nodes lose ranges, one does not need to run cleanup.
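
The pairing above can be written down as a small sketch. It only mirrors
the assignment described in this message (adjacent source nodes merged,
target takes the first node's token) and does not check replica placement
for your replication factor. halve_ring is a hypothetical helper name, and
the tokens in the example are made-up placeholders; initial_token refers to
the cassandra.yaml setting you would fill in on each target node.

```python
def halve_ring(ring):
    """Pair adjacent source nodes; each target takes the first node's token.

    ring is a list of (node_name, token) tuples in ring order.
    """
    if len(ring) % 2 != 0:
        raise ValueError("expected an even number of source nodes")
    merged = []
    for i in range(0, len(ring), 2):
        (first, token), (second, _unused) = ring[i], ring[i + 1]
        merged.append(
            {
                "target": first + second,     # e.g. "AB"
                "sources": [first, second],   # copy sstables from both
                "initial_token": token,       # token of the first source node
            }
        )
    return merged


if __name__ == "__main__":
    # Placeholder tokens, for illustration only.
    source_ring = [("A", 0), ("B", 10), ("C", 20), ("D", 30), ("E", 40), ("F", 50)]
    for entry in halve_ring(source_ring):
        print(entry["target"], entry["sources"], entry["initial_token"])
```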

=Rob
