Hi,

To put it simply, I have been taught that anything that can be disabled is an optimization, so we don't want to rely on an optimization that can silently fail. This goes for read repair as well, since we cannot be sure that all the data will ever be read. Plus, by default it is configured to trigger only 10% of the time, and never across data centers.
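For reference, those two behaviours are table-level options on 2.1/3.x; the statement below just restates the 3.0 defaults (the keyspace/table name is an example):

-- 3.0 defaults: 10% of reads trigger a DC-local read repair, none cross-DC.
ALTER TABLE my_ks.my_table
    WITH dclocal_read_repair_chance = 0.1
    AND read_repair_chance = 0.0;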
(Anti-entropy) repairs are known to be necessary to make sure data is correctly distributed to all the nodes that are supposed to have it. As Cassandra is built for native tolerance to failure (when correctly configured to do so), a node can miss some data, by design. When the data that missed a node is a tombstone resulting from a delete, it needs to be replicated there before all the other nodes remove it, which happens eventually, after 'gc_grace_seconds' (detailed post about this: thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html). If the tombstone is removed from all the nodes before having been replicated to the node that missed it, that node will eventually re-replicate the data that should have been deleted, i.e. the data the tombstone was shadowing. We call it a zombie.

And hinted handoff *can* and *will* fail. It happened to me in the past in a bad way, and nothing prevents it from happening again, even though hints were greatly improved in 3.0+. From the DataStax docs <https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff.html>: "Hints are flushed to disk every 10 seconds, reducing the staleness of the hints." Which means that, by design, a node going down can lose up to 10 seconds of hints stored for other nodes (some of which might be deletes).

The conclusion is often the same: if you are not running deletes, or if zombie data is not an issue for you, it is quite safe not to run repair within 'gc_grace_seconds' (default 10 days). But repair is, as of now, the only way to ensure low entropy for regular data (not only tombstones) in a Cassandra cluster; all the other optimizations can and will fail at some point. It also provides better consistency when reading at a weak consistency level such as LOCAL_ONE: as repair reduces entropy, the chance of reading the same data everywhere increases.
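To make this concrete, here is a minimal sketch of such a routine on 3.0+ ('my_ks' / 'my_table' are example names, and the cadence is something to adapt, not a recipe). First check the grace period on the table (default 864000 seconds = 10 days), in cqlsh:

SELECT gc_grace_seconds
FROM system_schema.tables
WHERE keyspace_name = 'my_ks' AND table_name = 'my_table';

Then make sure every node runs a primary-range repair at least once within that window:

nodetool repair -pr my_ks

'-pr' only repairs the ranges the node is primary for, so running it on all nodes covers the full ring exactly once per cycle.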
C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthak...@conversantmedia.com>:

> Unfortunately, I don't know much about the replication architecture.
> The only thing I know is that the replication is set at the keyspace
> level (i.e. 1, 2, 3 or N replicas), and then there is the consistency
> level, set at the client application level, which determines how many
> acknowledgements are necessary to deem a write successful.
>
> And you might have noticed in the video that anti-entropy is to be done
> as "deemed" necessary and not to be done blindly as a rule. E.g. if your
> data is read-only (never mutated), then there is no need for anti-entropy.
>
> From: eugene miretsky <eugene.miret...@gmail.com>
> Date: Thursday, April 20, 2017 at 5:52 PM
> To: Conversant <jthak...@conversantmedia.com>
> Cc: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Why are automatic anti-entropy repairs required when hinted
> hand-off is enabled?
>
> Thanks Jayesh,
>
> Watched all of those. Still not sure I fully get the theory behind it.
>
> Aside from the 2 failure cases I mentioned earlier, the only other way
> data can become inconsistent is an error when replicating the data in
> the background. Does Cassandra have a retry policy for internal
> replication? Is there a setting to change it?
>
> On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh
> <jthak...@conversantmedia.com> wrote:
>
> I had asked a similar/related question - on how to carry out repair,
> etc., and got some useful pointers. I would highly recommend the YouTube
> video or the SlideShare link below (both are for the same presentation).
>
> https://www.youtube.com/watch?v=1Sz_K8UID6E
>
> http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016
>
> https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/
>
> https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html
>
> https://www.datastax.com/dev/blog/repair-in-cassandra
>
> From: eugene miretsky <eugene.miret...@gmail.com>
> Date: Thursday, April 6, 2017 at 3:35 PM
> To: <user@cassandra.apache.org>
> Subject: Why are automatic anti-entropy repairs required when hinted
> hand-off is enabled?
>
> Hi,
>
> As I see it, if hinted handoff is enabled, the only time data can be
> inconsistent is when:
>
> 1. A node is down for longer than the max_hint_window
> 2. The coordinator node crashes before all the hints have been replayed
>
> Why is it still recommended to perform frequent automatic repairs, as
> well as enable read repair? Can't I just run a repair after one of the
> nodes is down? The only problem I see with this approach is a long
> repair job (instead of small incremental repairs). But other than that,
> are there any other issues/corner-cases?
>
> Cheers,
> Eugene