You should get off 0.7.4 while you are doing this; this is a pretty good reason: 
https://github.com/apache/cassandra/blob/cassandra-0.7.8/CHANGES.txt#L58

>  Never done a read repair on this cluster before, is that a problem?
Potentially. 
Repair will ensure that your data is fully distributed, and that deletes do not 
mysteriously come back to life: 
http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
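
GCGraceSeconds is a per-CF schema setting, so if you want to check how much time you have to play with, cassandra-cli will show it (MyKeyspace below is just a placeholder for your own keyspace):

    cassandra-cli --host localhost
    describe keyspace MyKeyspace;

Look for "GC grace seconds" against each CF in the output.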
 
Personally I would get a repair to complete before I started this process. 

You may want to make sure everything is compacted as best it can be beforehand; 
see some of the other threads about repair using a lot of disk space. 

* use nodetool to change the compaction threshold down to 2 for the CFs (see the sketch after this list)
* trigger a minor compaction using nodetool flush
* wait and monitor using nodetool compactionstats
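
Roughly, assuming a keyspace called MyKeyspace and a CF called MyCF (both placeholders), and noting that the exact setcompactionthreshold arguments vary a little between 0.7 releases, so check nodetool's usage output first:

    # lower the min/max compaction thresholds so minor compactions kick in with fewer SSTables
    nodetool -h localhost setcompactionthreshold MyKeyspace MyCF 2 4
    # flush the memtables; the newly written SSTables will trigger the minor compaction
    nodetool -h localhost flush MyKeyspace MyCF
    # watch until the pending compactions drain away
    nodetool -h localhost compactionstats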

Then do a repair, one CF at a time, starting with the smallest CF. Monitor disk space 
and nodetool compactionstats, then nodetool netstats.
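
For example (keyspace and CF names are placeholders; on 0.7 repair takes an optional keyspace and CF list, but check nodetool's usage output on your version):

    # repair one CF at a time, smallest first
    nodetool -h localhost repair MyKeyspace SmallestCF
    # in another window, keep an eye on things
    nodetool -h localhost compactionstats
    nodetool -h localhost netstats
    df -h /var/lib/cassandra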


If you have the space somewhere on the network I would just move the files off and then 
put them back (rough commands after the list)…

* drain
* copy the /var/lib/cassandra/data and saved_caches dirs
* copy the yaml 
* blast away
* put things back in place
* start up and run repair
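
A sketch of that, assuming the default /var/lib/cassandra layout, the yaml under /etc/cassandra/, and a box called backuphost with room for the ~500G (all placeholders for whatever you actually have):

    nodetool -h localhost drain
    # stop the cassandra process, then copy things off the node
    rsync -a /var/lib/cassandra/data /var/lib/cassandra/saved_caches backuphost:/backup/cassandra/
    rsync -a /etc/cassandra/cassandra.yaml backuphost:/backup/cassandra/
    # rebuild the disk / partitions, then put it all back
    rsync -a backuphost:/backup/cassandra/data/ /var/lib/cassandra/data/
    rsync -a backuphost:/backup/cassandra/saved_caches/ /var/lib/cassandra/saved_caches/
    rsync -a backuphost:/backup/cassandra/cassandra.yaml /etc/cassandra/
    # start cassandra again, then
    nodetool -h localhost repair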

I know you have RF 3 and 3 nodes; I'm just being cautious. If you don't have the space, 
the current approach is fine. 

You may want to disable Hinted Handoff while you are doing this as you are 
going to run repair anyway when the node comes back. 
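
On 0.7 hinted handoff is controlled from cassandra.yaml, so flip it off and restart the node (and turn it back on when you are done):

    # cassandra.yaml
    hinted_handoff_enabled: false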

Cheers

  
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 19/08/2011, at 11:57 AM, Anand Somani wrote:

> Hi,
> 
> version - 0.7.4
> cluster size = 3
> RF = 3.
> data size on a node ~500G
> 
> I want to do some disk maintenance on a cassandra node, so the process that I 
> came up with is
> drain this node
> back up the system data space
> rebuild the disk partition
> copy data from another node
> copy data from the backed up system data
> restart node
> run nodetool repair
> Is this process sane? Never done a read repair on this cluster before, is 
> that a problem? Should I run it per CF? Would it help if I did this before 
> bringing the node down?
> 
> Any pointers, things to worry about.
> 
> Thanks
> Anand
