I think we'd need a new operation type (https://issues.apache.org/jira/browse/CASSANDRA-957) to go from "some of the data gets streamed" to "all of the data gets streamed." A node that claims a token already in the ring is assumed to actually have that data, and IMO trying to guess when to break that assumption would be error-prone -- better to have some explicit signal.
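(For illustration only: such an explicit signal could be as simple as a startup property naming the token being replaced. The property name below is hypothetical -- the actual mechanism is whatever CASSANDRA-957 ends up defining.)

    # hypothetical startup flag telling the node it is REPLACING the
    # holder of this token, so all of that range's data must be streamed
    # over rather than assumed present:
    bin/cassandra -Dcassandra.replace_token=<token-of-dead-node>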
On Sun, Jan 30, 2011 at 1:38 AM, Chris Goffinet <c...@chrisgoffinet.com> wrote:
> I was looking over the Operations wiki, and with the many improvements in
> 0.7, I wanted to bring up a thought.
>
> The two options today for replacing a node that has lost all data are:
>
> (Recommended approach) Bring up the replacement node with a new IP address,
> and AutoBootstrap set to true in storage-conf.xml. This will place the
> replacement node in the cluster and find the appropriate position
> automatically. Then the bootstrap process begins. While this process runs,
> the node will not receive reads until finished. Once this process is
> finished on the replacement node, run nodetool removetoken once, supplying
> the token of the dead node, and nodetool cleanup on each node.
>
> (Alternative approach) Bring up a replacement node with the same IP and
> token as the old one, and run nodetool repair. Until the repair process is
> complete, clients reading only from this node may get no data back. Using
> a higher ConsistencyLevel on reads will avoid this.
>
> For nodes that might have a drive failure, but the same IP address, what
> do you think about supplying the node's same token + AutoBootstrap set to
> true? This process works in trunk, but not all the data seems to be
> streamed over from its replicas. This would provide the option to not let
> a node take on reads until replicas stream the SSTables over, and would
> eliminate the alternative approach of forcing higher consistency levels.
>
> -Chris

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
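(For reference, the two procedures quoted above in command form -- a sketch
against 0.7-era nodetool; host names and the token are placeholders:)

    # Recommended: replacement node gets a NEW IP, with AutoBootstrap set
    # to true in the config; after bootstrap finishes, remove the dead
    # node's old token from any live node:
    nodetool -h any-live-node removetoken <token-of-dead-node>
    # then, on each remaining node:
    nodetool -h node1 cleanup

    # Alternative: replacement node reuses the dead node's IP and token:
    nodetool -h replacement-node repair
    # until repair completes, reads hitting only this node may return no
    # data, so read at a higher ConsistencyLevel in the meantime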