GREAT answer, thanks, and one last question... So, I suspect I can expect those rows to finally go away when queried from cassandra-cli once GCGraceSeconds has passed then?
Or will they always be there forever and ever (this can't be true, right)?

Thanks,
Dean

On 10/2/12 9:34 AM, "Sylvain Lebresne" <sylv...@datastax.com> wrote:

>The short version is: there are 2 use cases for nodetool repair:
> 1) For periodic repair of the whole cluster
>(http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair).
>In that case, you should run repair *with* -pr and you should run it
>on *every* node.
> 2) When a node has been down for a long time (for instance long
>enough that hints may have been dropped), and you want to repair that
>node specifically. In that case, you should run repair on that node
>only and you should use it *without* -pr.
>
>As for the gory details, nodetool repair without -pr will repair all
>the ranges of the node on which the repair is done. But when a range is
>repaired, it is repaired on *all* replicas. In other words, a repair on
>node A will also repair parts of other nodes that share a range with
>A. That's why, in case 1) above, where you want to repair the whole
>cluster, a repair without -pr is inefficient, because if you repair A
>and B and both are replicas for the same range, you will duplicate the
>work. Hence repair -pr: on one node it repairs only its primary range
>(but all replicas for said range), and so if you do that on every node,
>you will have effectively repaired the whole cluster without having
>repaired the same range twice.
>
>> Can I run node tool pr repair on just 1/RF of my nodes if I do the
>>correct nodes?
>
>As is hopefully clear from the description above, no.
>
>> Why are the row keys still there though?
>
>http://wiki.apache.org/cassandra/FAQ#range_ghosts
>
>
>--
>Sylvain
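
For anyone following the -pr reasoning quoted above, here is a rough sketch of the counting argument as a toy model. It is not how Cassandra implements repair. It assumes a simple ring where each node owns exactly one primary range and that range is replicated on the next RF-1 nodes; the function names and the 6-node, RF=3 setup are made up for illustration.

```python
from collections import Counter

def ranges_on_node(node, num_nodes, rf):
    """Ranges (named after their primary owner) that `node` holds a replica of.

    Toy assumption: a range owned by node r is also stored on the next
    rf-1 nodes clockwise, so `node` holds the ranges of itself and of
    the rf-1 nodes before it. Requires rf <= num_nodes.
    """
    return [(node - i) % num_nodes for i in range(rf)]

def repair(nodes, num_nodes, rf, primary_only):
    """Count how often each range is repaired when repair runs on `nodes`.

    primary_only=True models `repair -pr` (only the node's primary range);
    primary_only=False models a plain repair (every range the node holds).
    """
    counts = Counter()
    for node in nodes:
        if primary_only:
            counts[node] += 1
        else:
            counts.update(ranges_on_node(node, num_nodes, rf))
    return counts

NUM_NODES, RF = 6, 3  # hypothetical cluster size and replication factor

# Case 1 done the inefficient way: plain repair on every node.
full = repair(range(NUM_NODES), NUM_NODES, RF, primary_only=False)

# Case 1 done as recommended: repair -pr on every node.
pr_all = repair(range(NUM_NODES), NUM_NODES, RF, primary_only=True)

# repair -pr on only 1/RF of the nodes (here nodes 0 and 3).
pr_subset = repair([0, 3], NUM_NODES, RF, primary_only=True)
missed = sorted(set(range(NUM_NODES)) - set(pr_subset))

print("plain repair, every node:", dict(full))
print("repair -pr, every node:  ", dict(pr_all))
print("ranges never repaired with -pr on nodes 0 and 3:", missed)
```

In this model a plain repair on every node touches each range RF times, -pr on every node touches each range once, and -pr on a subset of nodes leaves the other nodes' primary ranges unrepaired, which matches the "no" above.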