Hi,

To put it simply, I have been taught that anything that can be disabled is an optimization, so we don't want to rely on an optimization that can silently fail. This goes for read repair as well, since we cannot be sure that all the data will ever be read. Plus, by default it is configured to trigger only 10% of the time, and never across data centers.
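For reference, those two behaviours are table-level options on 2.1/3.x; the statement below just restates the 3.0 defaults (the keyspace/table name is an example):

-- 3.0 defaults: 10% of reads trigger a DC-local read repair, none cross-DC.
ALTER TABLE my_ks.my_table
    WITH dclocal_read_repair_chance = 0.1
    AND read_repair_chance = 0.0;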
(Anti-entropy) repairs are known to be necessary to make sure data is correctly distributed to all the nodes that are supposed to have it. As Cassandra is built for native tolerance to failure (when correctly configured to do so), a node can miss some data, by design. When the data that missed a node is a tombstone resulting from a delete, it needs to be replicated there before all the other nodes remove it, which happens eventually, after 'gc_grace_seconds' (detailed post about this: thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html). If the tombstone is removed from all the nodes before having been replicated to the node that missed it, that node will eventually re-replicate the data that should have been deleted, i.e. the data the tombstone was shadowing. We call it a zombie.

And hinted handoff *can* and *will* fail. It happened to me in the past in a bad way, and nothing prevents it from happening again, even though hints were greatly improved in 3.0+. From the DataStax docs <https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff.html>: "Hints are flushed to disk every 10 seconds, reducing the staleness of the hints." Which means that, by design, a node going down can lose up to 10 seconds of hints stored for other nodes (some of which might be deletes).

The conclusion is often the same: if you are not running deletes, or if zombie data is not an issue for you, it is quite safe not to run repair within 'gc_grace_seconds' (default 10 days). But repair is, as of now, the only way to ensure low entropy for regular data (not only tombstones) in a Cassandra cluster; all the other optimizations can and will fail at some point. It also provides better consistency when reading at a weak consistency level such as LOCAL_ONE: as repair reduces entropy, the chance of reading the same data everywhere increases.
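To make this concrete, here is a minimal sketch of such a routine on 3.0+ ('my_ks' / 'my_table' are example names, and the cadence is something to adapt, not a recipe). First check the grace period on the table (default 864000 seconds = 10 days), in cqlsh:

SELECT gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'my_ks' AND table_name = 'my_table';

Then make sure every node runs a primary-range repair at least once within that window:

nodetool repair -pr my_ks

'-pr' only repairs the ranges the node is primary for, so running it on all nodes covers the full ring exactly once per cycle.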
C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthak...@conversantmedia.com>:

> Unfortunately, I don't know much about the replication architecture.
> The only thing I know is that the replication is set at the keyspace
> level (i.e. 1, 2, 3 or N replicas), and then there is the consistency
> level, set at the client application level, which determines how many
> acknowledgements are necessary to deem a write successful.
>
> And you might have noticed in the video that anti-entropy is to be done
> as "deemed" necessary and not to be done blindly as a rule. E.g. if your
> data is read-only (never mutated), then there is no need for anti-entropy.
>
> From: eugene miretsky <eugene.miret...@gmail.com>
> Date: Thursday, April 20, 2017 at 5:52 PM
> To: Conversant <jthak...@conversantmedia.com>
> Cc: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Why are automatic anti-entropy repairs required when hinted
> hand-off is enabled?
>
> Thanks Jayesh,
>
> Watched all of those. Still not sure I fully get the theory behind it.
>
> Aside from the 2 failure cases I mentioned earlier, the only other way
> data can become inconsistent is an error when replicating the data in
> the background. Does Cassandra have a retry policy for internal
> replication? Is there a setting to change it?
>
> On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh
> <jthak...@conversantmedia.com> wrote:
>
> I had asked a similar/related question - on how to carry out repair,
> etc., and got some useful pointers. I would highly recommend the YouTube
> video or the SlideShare link below (both are for the same presentation).
>
> https://www.youtube.com/watch?v=1Sz_K8UID6E
>
> http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016
>
> https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html
>
> https://www.datastax.com/dev/blog/repair-in-cassandra
>
> From: eugene miretsky <eugene.miret...@gmail.com>
> Date: Thursday, April 6, 2017 at 3:35 PM
> To: <user@cassandra.apache.org>
> Subject: Why are automatic anti-entropy repairs required when hinted
> hand-off is enabled?
>
> Hi,
>
> As I see it, if hinted handoff is enabled, the only time data can be
> inconsistent is when:
>
> 1. A node is down for longer than the max_hint_window
> 2. The coordinator node crashes before all the hints have been replayed
>
> Why is it still recommended to perform frequent automatic repairs, as
> well as enable read repair? Can't I just run a repair after one of the
> nodes is down? The only problem I see with this approach is a long
> repair job (instead of small incremental repairs). But other than that,
> are there any other issues/corner-cases?
>
> Cheers,
> Eugene