Hi Karthick, repairs can be tricky.  

You can (and probably should) run repairs as part of routine maintenance, and 
you absolutely should if you lose a node in a bad way.  If you decommission a 
node, for example, no “extra” repair is needed. 

If you are using TWCS, you should probably not run repairs on those column 
families (tables).

We use a combination of scripts and locks to run repairs across an 18-node 
cluster one node at a time.  A full pass typically takes around 2-3 days, so 
we run it once a week.  
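
As a rough illustration of the "scripts and locks" idea (not our actual 
scripts), here is a minimal Python sketch that takes an exclusive lock file 
and then repairs the nodes one at a time.  The host names, the lock path, and 
the choice of `-pr` (primary range only) are assumptions made for the example; 
adjust the nodetool options to whatever suits your cluster and version.

```python
import fcntl
import subprocess
from typing import Callable, Iterable, List


def repair_command(host: str) -> List[str]:
    # Assumption: primary-range repair (-pr), so each token range is
    # repaired once per pass when every node in the cluster is visited.
    return ["nodetool", "-h", host, "repair", "-pr"]


def run_rolling_repair(
    hosts: Iterable[str],
    run: Callable[..., object] = subprocess.run,
    lock_path: str = "/tmp/cassandra-repair.lock",  # hypothetical path
) -> List[str]:
    """Repair hosts sequentially while holding an exclusive lock file."""
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail fast if another run is active.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise RuntimeError("another repair run already holds the lock")

        completed = []
        for host in hosts:
            # check=True stops the pass on the first failed repair.
            run(repair_command(host), check=True)
            completed.append(host)
        return completed  # the lock is released when the file is closed
```

Run from cron once a week with your own host list, e.g. 
`run_rolling_repair(["cass01", "cass02"])` (hypothetical names).  The lock 
keeps a second run from starting while the previous one is still going.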

The great folks at TLP (The Last Pickle) have put together 
http://cassandra-reaper.io/, which makes managing repairs even easier and 
probably more performant since, as I understand it, it uses range repairs. 
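
To give a feel for what a range repair means, here is a small Python sketch 
that splits one token range into equal segments and builds a 
`nodetool repair -st ... -et ...` command for each.  This is only an 
illustration under my own assumptions (the host, the keyspace, and the even 
split are made up for the example); it is not how Reaper itself is 
implemented.

```python
from typing import List, Tuple


def split_range(start: int, end: int, segments: int) -> List[Tuple[int, int]]:
    """Split the token range (start, end] into contiguous segments."""
    if end <= start:
        raise ValueError("end must be greater than start")
    if not 1 <= segments <= end - start:
        raise ValueError("segments must be between 1 and the range width")
    width = end - start
    # Integer arithmetic keeps the bounds exact for 64-bit token values.
    bounds = [start + (width * i) // segments for i in range(segments)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def subrange_commands(
    host: str, keyspace: str, start: int, end: int, segments: int
) -> List[List[str]]:
    # -st / -et limit the repair to the given start and end tokens.
    return [
        ["nodetool", "-h", host, "repair", "-st", str(s), "-et", str(e), keyspace]
        for s, e in split_range(start, end, segments)
    ]
```

Each command repairs a small slice, so a failure only costs you that slice 
rather than the whole run, and the load on the cluster stays smaller and more 
even.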

Good luck,
-B

> On Jan 24, 2018, at 4:57 AM, Karthick V <karthick...@gmail.com> wrote:
> 
> Periodically I have been running a full repair before the GC grace period 
> expires, as mentioned in the best practices. Initially all went well, but as 
> the data size increased, repair duration increased drastically and we are 
> also facing query timeouts during that time. We tried incremental repair but 
> ran into some OOM issues.
> 
> After running a repair process for more than 80 Hours we have ended up with 
> the question
> 
> Why can't we run a repair process if and only if a Cassandra node has had 
> downtime? 
> 
> Say there is no downtime during a GC grace period: do we still face 
> inconsistency among nodes? If yes, then doesn't hinted handoff handle that? 
> 
> Cluster info: two datacenters with 8 machines each, 1TB of disk per machine, 
> C* v2.1.13, and around 420GB of data each.
> 
>> On Wed, Jan 24, 2018 at 2:46 PM, Karthick V <karthick...@gmail.com> wrote:
>> Hi,
