OK, so we just lost the data on that node. We are rebuilding the RAID on it,
but once it is up, what is the best way to bring it back into the cluster?

   - just let it come up and run nodetool repair
   - copy data from another node and then run nodetool repair
      -  do I still need to run repair immediately if I copy the data? I want
      to schedule the repair for later, during off-peak hours.

Like I said, I have 500G per node and am on 0.7.4, with a 3-node cluster and RF=3.
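
To make the "repair later" part concrete, here is a rough sketch of what I
would schedule for off-peak hours: one repair per CF, smallest first, as
suggested below. CASS_HOST, CASS_KEYSPACE and the CF names are placeholders,
not real values from this cluster, and I am assuming nodetool repair accepts a
keyspace followed by CF names on this version. It only prints the commands
unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the nodetool commands instead of running them
# unless RUN=1 is set. Host, keyspace and CF names are placeholders.
CASS_HOST="${CASS_HOST:-localhost}"
CASS_KEYSPACE="${CASS_KEYSPACE:-MyKeyspace}"

run() {
    # Dry-run by default so the plan can be reviewed before executing it.
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

repair_cfs() {
    # Repair one column family at a time, in the order given
    # (pass the smallest CF first).
    for cf in "$@"; do
        run nodetool -h "$CASS_HOST" repair "$CASS_KEYSPACE" "$cf"
    done
}

# Example: hypothetical CF names, smallest first.
repair_cfs SmallCF BigCF
```

Between CFs I would check disk space, nodetool compactionstats and nodetool
netstats by hand before moving on to the next one.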


On Thu, Aug 18, 2011 at 9:42 PM, aaron morton <aa...@thelastpickle.com> wrote:

> You should get on 0.7.8 while you are doing this; this is a pretty good
> reason:
> https://github.com/apache/cassandra/blob/cassandra-0.7.8/CHANGES.txt#L58
>
>  Never done a read repair on this cluster before, is that a problem?
>
> Potentially.
> Repair will ensure that your data is distributed, and that deletes don't
> mysteriously come back to life:
> http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds
>
> Personally I would get a repair to complete before I started this process.
>
> You may want to make sure everything is compacted as well as it can be
> beforehand; see some of the other threads about repair using a lot of space.
>
> * use nodetool to change the compaction threshold down to 2 for the CFs
> * trigger a minor compaction using nodetool flush
> * wait and monitor using nodetool compactionstats
>
> Then do a repair, repairing one CF at a time, starting with the smallest CF.
> Monitor disk space and
> nodetool compactionstats
> then
> nodetool netstats
>
>
> If you have the network space I would just move the files and then put them
> back…
>
> * drain
> * copy the /var/lib/cassandra/data and saved_caches dirs
> * copy the yaml
> * blast away
> * put things back in  place
> * start up and run repair
>
> I know you have RF 3 and 3 nodes. I'm being cautious. If you don't have
> the space, the current approach is fine.
>
> You may want to disable Hinted Handoff while you are doing this as you are
> going to run repair anyway when the node comes back.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/08/2011, at 11:57 AM, Anand Somani wrote:
>
> Hi,
>
> version - 0.7.4
> cluster size = 3
> RF = 3.
> data size on a node ~500G
>
> I want to do some disk maintenance on a cassandra node, so the process that
> I came up with is
>
>    - drain this node
>    - back up the system data space
>    - rebuild the disk partition
>    - copy data from another node
>    - copy data from the backed up system data
>    - restart node
>    - run nodetool repair
>
> Is this process sane? Never done a read repair on this cluster before, is
> that a problem? Should I run it per CF? Would it help if I did this before
> bringing the node down?
>
> Any pointers or things to worry about?
>
> Thanks
> Anand
>
>
>
