> I think what I feel is that there is a need to know if repair is required
> flag in order for team to manage the cluster.

And again, repair is essentially always required. You should *always* run it
within the period determined by GCGraceSeconds.

> Atleast at minimum, Is there a flag somewhere that tells if repair was run
> within GCGracePeriod?

No, and it's not what you want either, since by the time that flag says
"false", it's already too late :)

This is why my best suggestion for a simple improvement would be to expose
the time since the last successful repair. Currently this information is, to
my knowledge, not exposed by Cassandra, so it is the responsibility of your
deployment strategy to monitor for this. One simple version (not to be used
as-is) might be:

    set -e  # important
    touch /path/to/flagfile.tmp
    nodetool -h localhost repair
    mv /path/to/flagfile.tmp /path/to/flagfile

The mtime of /path/to/flagfile is the indicator of when repair last
succeeded (more precisely, when the last successful repair was started,
since the touch happens first), assuming a recent version of Cassandra where
'nodetool repair' is blocking.

The key point is: what you want to monitor is the time since the last
successful repair. If that time exceeds some triggering high water mark,
someone needs to be informed, because you are X hours away from violating
the requirements imposed by GCGraceSeconds.

(Cassandra could make this easier, but just be clear on what it is that
you're actually looking for. You're *not* looking for "has a write been
timed out ever in the cluster", but rather "are we closer to GCGraceSeconds
than some threshold which we normally should never reach if repairs are
functioning and running as intended".)

-- 
/ Peter Schuller
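
A possible monitoring check to go with the flag file above, as a minimal
sketch only. The function names, the threshold arguments, and the use of GNU
stat (`stat -c %Y`) are assumptions for illustration, not part of Cassandra
or of the script above. It reports a problem once the age of the flag file
exceeds GCGraceSeconds minus a safety margin.

```shell
#!/bin/sh
# Sketch: warn when the last successful repair is getting too old.
# Assumptions: GNU stat is available; function names and thresholds
# are hypothetical and chosen for illustration only.

# Print the age of the flag file in seconds (now minus its mtime).
repair_age_seconds() {
    flagfile=$1
    now=$(date +%s)
    mtime=$(stat -c %Y "$flagfile") || return 2
    echo $((now - mtime))
}

# Usage: check_repair_age FLAGFILE GC_GRACE_SECONDS MARGIN_SECONDS
# Returns 0 if the last successful repair is recent enough,
#         1 if its age exceeds GC_GRACE_SECONDS - MARGIN_SECONDS,
#         2 if no flag file exists (no successful repair recorded).
check_repair_age() {
    flagfile=$1
    gc_grace_seconds=$2
    margin_seconds=$3
    if [ ! -e "$flagfile" ]; then
        echo "CRITICAL: no successful repair recorded at $flagfile"
        return 2
    fi
    age=$(repair_age_seconds "$flagfile") || return 2
    threshold=$((gc_grace_seconds - margin_seconds))
    if [ "$age" -gt "$threshold" ]; then
        echo "WARNING: last successful repair was ${age}s ago (threshold ${threshold}s)"
        return 1
    fi
    echo "OK: last successful repair was ${age}s ago"
    return 0
}
```

For example, `check_repair_age /path/to/flagfile 864000 86400` would start
warning one day before a 10-day GCGraceSeconds is reached. The exit codes
are chosen so that the function could be wrapped by whatever alerting system
you already run; pick the margin so that someone still has time to get a
repair through before the limit.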