2011/9/22 Jonas Borgström <jo...@borgstrom.se>:
> On 09/22/2011 01:25 AM, aaron morton wrote:
> *snip*
>> When you start a repair it will repair with the other nodes it
>> replicates data with. So you only need to run it every RF nodes. Start
>> it on one, watch the logs to see who it talks to and then start it on
>> the first node it does not talk to. And so on.

This is not entirely true because of
https://issues.apache.org/jira/browse/CASSANDRA-2610. Basically, doing
this won't guarantee that the full cluster is in sync (there is a fair
chance it will be, but it's not guaranteed). It will be true in 1.0
(though in 1.0 it will be simpler and more efficient to just run
'nodetool repair --partitioner-range' on every node).

> Is this new in 0.8 or has it always been this way?
>
> From
> http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair
>
> """
> Unless your application performs no deletes, it is vital that production
> clusters run nodetool repair periodically on all nodes in the cluster.
> """
>
> So for a 3 node cluster using RF=3, is it sufficient to run "nodetool
> repair" on one node?

Technically, in the 3-node, RF=3 case, you would need to run repair on 2
nodes to make sure the cluster has been fully repaired. But once you have
more than 3 nodes in the cluster, or RF > 3, it becomes fairly complicated
to know exactly which nodes to pick, so to be safe I would advise sticking
to the wiki instructions (until 1.0 at least).

>
> / Jonas
>
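
For what it's worth, the "repair every node" advice above could be scripted roughly as follows. This is only a minimal sketch: the host names are hypothetical, the helper `repair_commands` is made up for illustration, and the `--partitioner-range` flag is spelled as in the message above (it only applies from 1.0 on). The script just builds and prints the command lines; it does not run anything.

```python
def repair_commands(hosts, partitioner_range=False):
    """Build one `nodetool repair` command line per host (nothing is executed).

    `hosts` is a list of host names or addresses (hypothetical in this sketch).
    """
    commands = []
    for host in hosts:
        # `nodetool -h <host> repair` targets the given node.
        argv = ["nodetool", "-h", host, "repair"]
        if partitioner_range:
            # Flag name taken from the message above; assumed to need 1.0+.
            argv.append("--partitioner-range")
        commands.append(argv)
    return commands


if __name__ == "__main__":
    hosts = ["node1", "node2", "node3"]  # hypothetical host names
    for argv in repair_commands(hosts):
        print(" ".join(argv))
```

To actually run the repairs, each list could be handed to `subprocess.run` one node at a time, so that only one repair is in flight at once.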