On Thu, May 26, 2011 at 3:12 PM, Marcus Bointon <mar...@synchromedia.co.uk> wrote:
> I'd like to make sure I've got the right sequence of operations for adding a
> node without downtime. If I'm going from 2 to 3 nodes:
>
> 1 Calculate new initial_token values using the python script
> 2 Change token values in existing nodes and restart them
> 3 Install/configure new node
> 4 Insert new node's token value
> 5 Set new node to auto-bootstrap
> 6 Start cassandra on new node
> 7 Wait for the ring to rebalance
>
> With token changes (using values from the python script), it's clear that all
> nodes will have some data moved. Does this mean that there's a possibility of
> overlap between regions if token changes are not absolutely simultaneous on
> all nodes? That sounds dangerous to me... Or shouldn't token values be
> changed on nodes containing data?
>
nodetool repair is good. When we add new nodes, we add the new one without
specifying a token. After everything is up and healthy, we determine the new
tokens and see whether any nodes need to be renumbered. If they do, we change
one node at a time and wait until nodetool repair has finished on that node
before moving on to the next.

> Is there a corresponding sequence for removing nodes? I'm guessing draining
> is involved.

Turn the node off, then remove it from the ring using nodetool removetoken.
I've found this to be the most problem-free way. Maybe it's better now ...

http://blog.sasha.dolgy.com/2011/03/apache-cassandra-nodetool.html
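
For the "determine new tokens" step, here is a minimal sketch of the kind of calculation the python script mentioned above performs. It is not that script itself. It assumes the cluster uses RandomPartitioner, whose token space runs from 0 to 2**127, and the names balanced_tokens and tokens_to_change are made up for this illustration.

```python
# Assumption: RandomPartitioner, whose tokens lie in the range 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced token values for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


def tokens_to_change(current_tokens, node_count):
    """Return the balanced tokens that no existing node holds yet.

    current_tokens is the list of tokens the ring uses today (hypothetical
    input, e.g. copied by hand from the output of nodetool ring).
    """
    current = set(current_tokens)
    return [t for t in balanced_tokens(node_count) if t not in current]


if __name__ == "__main__":
    # Going from 2 balanced nodes to 3, as in the question.
    existing = balanced_tokens(2)
    for token in tokens_to_change(existing, 3):
        print("token to assign: %d" % token)
```

In the 2-to-3 case the node at token 0 keeps its token, so only one existing node needs a new token and the remaining value goes to the new node. That fits the one-node-at-a-time approach described above.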