> It happened to me in the future in a bad way, and nothing prevent it
> from happening in the future
Obviously "It happened to me in the past in a bad way"*. Thinking faster than I write... I am quite slow writing :p. To be clear I recommend: - to run repairs within gc_grace_seconds when performing deletes (not TTL, TTLs are fine) - to run repairs 'regularly' when not deleting data (depending on data size and CL in use) Hope that helps, ----------------------- Alain Rodriguez - @arodream - al...@thelastpickle.com France The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com 2017-04-27 13:07 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>: > Hi, > > To put it easy, I have been taught that anything that can be disabled is > an optimization. So we don't want to rely an optimization that can silently > fail. This goes for read repair as well as we cannot be sure that all the > data will be read. Plus it is configured to trigger only 10 % of the time > by default, and not cross data center. > > (Anti-entropy) Repairs are known to be be necessary to make sure data is > correctly distributed on all the nodes that are supposed to have it. > > As Cassandra is built to allow native tolerance to failure (when correctly > configured to do so), it can happen that a node miss a data, by design. > > When this data that miss a node was a tombstone due to a delete, it needs > to be replicated before all the other nodes remove it, which happen > eventually after 'gc_grace_seconds' (detailed post about this > thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html). If > this tombstone is removed from all the nodes before having been replicated > to the node that missed it, this node will eventually replicate the data > that should have been deleted, the data overridden by the tombstone. We > call it a zombie. > > And hinted handoff *can* and *will* fail. It happened to me in the future > in a bad way, and nothing prevent it from happening in the future, even if > they were greatly imrpoved in 3.0+. > > From Datastax doc > <https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff.html>: > "Hints are flushed to disk every 10 seconds, reducing the staleness of the > hints." > > Which means that, by design, a node going down can lose up to 10 seconds > of hints stored for other nodes (in which some might be deletes). > > The conclusion is often the same one, if not running deletes or if zombie > data is not an issue, it is quite safe not to run repair within > 'gc_grace_seconds' (default 10 days). But this is the only way to ensure a > low entropy for regular data (not only tombstones) in a Cassandra cluster > as of now all other optimizations can and will fail at some point. It also > provides a better consistency, if reading with a weak consistency level > such as LOCAL_ONE, as it will reduce entropy, chance to read the same data > everywhere increases. > > C*heers, > ----------------------- > Alain Rodriguez - @arodream - al...@thelastpickle.com > France > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > 2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthak...@conversantmedia.com>: > >> >> >> Unfortunately, I don’t know much about the replication architecture. >> >> The only thing I know is that the replication is set at the keyspace >> level (i.e. 1, 2 or 3 or N replicas) and then >> >> there is the consistency level set at the client application level which >> determines how many acknowledgements >> >> are necessary to deem a write successful. 
>>
>> And you might have noticed in the video that anti-entropy is to be done
>> as "deemed" necessary and not to be done blindly as a rule. E.g. if your
>> data is read-only (never mutated), then there is no need for
>> anti-entropy.
>>
>> *From: *eugene miretsky <eugene.miret...@gmail.com>
>> *Date: *Thursday, April 20, 2017 at 5:52 PM
>> *To: *Conversant <jthak...@conversantmedia.com>
>> *Cc: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Subject: *Re: Why are automatic anti-entropy repairs required when
>> hinted hand-off is enabled?
>>
>> Thanks Jayesh,
>>
>> Watched all of those.
>>
>> Still not sure I fully get the theory behind it.
>>
>> Aside from the 2 failure cases I mentioned earlier, the only other way
>> data can become inconsistent is an error when replicating the data in
>> the background. Does Cassandra have a retry policy for internal
>> replication? Is there a setting to change it?
>>
>> On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh <
>> jthak...@conversantmedia.com> wrote:
>>
>> I had asked a similar/related question - on how to carry out repair,
>> etc. - and got some useful pointers. I would highly recommend the
>> YouTube video or the slideshare link below (both are for the same
>> presentation).
>>
>> https://www.youtube.com/watch?v=1Sz_K8UID6E
>>
>> http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016
>>
>> https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/
>>
>> https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html
>>
>> https://www.datastax.com/dev/blog/repair-in-cassandra
>>
>> *From: *eugene miretsky <eugene.miret...@gmail.com>
>> *Date: *Thursday, April 6, 2017 at 3:35 PM
>> *To: *<user@cassandra.apache.org>
>> *Subject: *Why are automatic anti-entropy repairs required when hinted
>> hand-off is enabled?
>>
>> Hi,
>>
>> As I see it, if hinted handoff is enabled, the only time data can be
>> inconsistent is when:
>>
>> 1. A node is down for longer than the max_hint_window
>> 2. The coordinator node crashes before all the hints have been replayed
>>
>> Why is it still recommended to perform frequent automatic repairs, as
>> well as to enable read repair? Can't I just run a repair after one of
>> the nodes is down? The only problem I see with this approach is a long
>> repair job (instead of small incremental repairs). But other than that,
>> are there any other issues/corner-cases?
>>
>> Cheers,
>>
>> Eugene
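(To make the recommendation at the top of the thread concrete, here is a
minimal sketch of the kind of wrapper that can be scheduled on each node so
that a primary-range repair of every keyspace completes well within
gc_grace_seconds, default 10 days. It assumes Python 3 and that nodetool is
on the PATH; the keyspace name is a placeholder.)

    #!/usr/bin/env python3
    """Run 'nodetool repair -pr' for each listed keyspace on the local node.

    Scheduled (staggered) on every node of the ring, this covers all token
    ranges; the schedule should finish comfortably inside gc_grace_seconds
    when deletes are in use."""
    import subprocess
    import sys

    KEYSPACES = ["demo"]  # placeholder: the keyspaces you delete from

    def repair(keyspace):
        # -pr repairs only the token ranges this node owns as primary, so
        # running it once per node covers the whole ring without overlap.
        cmd = ["nodetool", "repair", "-pr", keyspace]
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        for ks in KEYSPACES:
            try:
                repair(ks)
            except subprocess.CalledProcessError as exc:
                sys.exit(f"repair of {ks} failed: {exc}")

Because -pr only repairs each node's primary ranges, the sketch only makes
sense when run on every node of the cluster, staggered to spread the load.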