Hi Karthick, repairs can be tricky. You can (and probably should) run repairs as a part of routine maintenance, and absolutely if you lose a node in a bad way. If you decommission a node, for example, no “extra” repair is needed.
If you are using TWCS you should probably not run repairs on those column families. We have a combination of scripts and locks to run repairs across an 18-node cluster one node at a time. A full pass typically takes around 2-3 days, so we run it once a week. The great folks at TLP have put together http://cassandra-reaper.io/, which makes managing repairs even easier and is probably more performant since, as I understand it, it uses range repairs.

Good luck,
-B

> On Jan 24, 2018, at 4:57 AM, Karthick V <karthick...@gmail.com> wrote:
>
> Periodically I have been running the full repair process before the GC grace
> period, as mentioned in the best practices. Initially all went well, but as
> the data size increases the repair duration has increased drastically, and we
> are also facing query timeouts during that time. We have tried incremental
> repair and are facing some OOM issues.
>
> After running a repair process for more than 80 hours we have ended up with
> the question:
>
> Why can't we run a repair process if and only if a Cassandra node had
> downtime?
>
> Say there is no downtime during a GC grace period. Do we still face
> inconsistency among nodes? If yes, then doesn't hinted handoff handle those?
>
> Cluster info: two datacenters with 8 machines each, with a disk size of
> 1TB, C* v_2.1.13, and around 420GB of data each.
>
>> On Wed, Jan 24, 2018 at 2:46 PM, Karthick V <karthick...@gmail.com> wrote:
>> Hi,
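
For reference, the "scripts and locks, one node at a time" approach mentioned in the reply could look roughly like the sketch below. This is not the actual tooling described above, only a minimal illustration under a few assumptions: the lock file path, the function names, and the host names are all hypothetical, and the use of `nodetool repair -pr` (primary range only, so that running it on every node covers the ring once) is an assumption about how such a script might be set up.

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location; any path shared by all repair runs would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "cassandra-repair.lock")


@contextmanager
def repair_lock(path=LOCK_PATH):
    """Hold an exclusive lock file so only one repair run is active at a time."""
    # O_EXCL makes the open fail with FileExistsError if the lock already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)


def build_repair_command(host, keyspace=None):
    """Build a nodetool command that repairs only the node's primary range."""
    cmd = ["nodetool", "-h", host, "repair", "-pr"]
    if keyspace:
        cmd.append(keyspace)
    return cmd


def rolling_repair(hosts, keyspace=None, run=subprocess.call, lock_path=LOCK_PATH):
    """Repair hosts strictly one after another; stop at the first failure.

    Returns the list of hosts whose repair exited with status 0.
    """
    done = []
    with repair_lock(lock_path):
        for host in hosts:
            if run(build_repair_command(host, keyspace)) != 0:
                break  # leave the remaining nodes for a later run
            done.append(host)
    return done
```

A weekly cron job could then call something like `rolling_repair(["node1", "node2"], "my_keyspace")` with the real node list. A second invocation started while the first is still running fails on the lock instead of stacking repairs on the cluster.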