>
> It happened to me in the future in a bad way, and nothing prevents it from
> happening in the future


Obviously "It happened to me in the past in a bad way"*. Thinking faster
than I write... I am quite slow writing :p.

To be clear, I recommend:

   - to run repairs within gc_grace_seconds when performing deletes (not
   TTLs; TTLs are fine) - a rough sketch follows this list
   - to run repairs 'regularly' even when not deleting data (how often
   depends on the data size and the consistency level in use)
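
A rough sketch of the first point (the keyspace name is made up and the
scheduling is up to you; '-pr' repairs only each node's primary token
ranges, so it has to be run on every node):

    # Run on every node, staggered, so that a full pass over the cluster
    # completes within gc_grace_seconds (default 864000 s = 10 days):
    nodetool repair -pr my_keyspace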

Hope that helps,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-04-27 13:07 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi,
>
> To put it simply, I have been taught that anything that can be disabled is
> an optimization. So we don't want to rely on an optimization that can
> silently fail. This goes for read repair as well, since we cannot be sure
> that all the data will be read. Plus, by default, it is configured to
> trigger only 10% of the time, and never across data centers.
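>
> For reference, those knobs are per-table options in CQL (Cassandra 3.x
> era; the table name is made up and the values shown are the usual
> defaults):
>
>     ALTER TABLE my_keyspace.my_table
>       WITH read_repair_chance = 0.0           -- cross-DC read repair, off
>       AND dclocal_read_repair_chance = 0.1;   -- 10% chance, local DC only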
>
> (Anti-entropy) Repairs are known to be necessary to make sure data is
> correctly distributed on all the nodes that are supposed to have it.
>
> As Cassandra is built for native fault tolerance (when correctly
> configured for it), a node can miss a write, by design.
>
> When the data that missed a node is a tombstone created by a delete, it
> needs to be replicated before all the other nodes remove it, which happens
> eventually, after 'gc_grace_seconds' (detailed post about this:
> thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html). If
> the tombstone is removed from all the nodes before it has been replicated
> to the node that missed it, that node will eventually re-replicate the data
> that should have been deleted, i.e. the data the tombstone was shadowing.
> We call it a zombie.
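>
> To make the sequence concrete, a minimal sketch (the table, key, and
> timing are made up):
>
>     DELETE FROM my_keyspace.my_table WHERE id = 42;
>     -- Replicas that received the delete now store a tombstone.
>     -- Once gc_grace_seconds has passed, compaction may purge that
>     -- tombstone. If one replica never received it (and no repair ran),
>     -- its old value for id = 42 can be re-replicated later: a zombie.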
>
> And hinted handoff *can* and *will* fail. It happened to me in the future
> in a bad way, and nothing prevents it from happening in the future, even
> though hints were greatly improved in 3.0+.
>
> From the DataStax doc
> <https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesHintedHandoff.html>:
> "Hints are flushed to disk every 10 seconds, reducing the staleness of the
> hints."
>
> Which means that, by design, a node going down can lose up to 10 seconds
> of hints stored for other nodes (some of which might be deletes).
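>
> For reference, that flush interval is a cassandra.yaml setting (the value
> shown is the 3.x default; a sketch, check the yaml of your version):
>
>     # cassandra.yaml
>     hints_flush_period_in_ms: 10000    # hints hit disk every 10 seconds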
>
> The conclusion is often the same: if you are not running deletes, or if
> zombie data is not an issue, it is quite safe not to run repair within
> 'gc_grace_seconds' (default: 10 days). But repair is, as of now, the only
> way to ensure low entropy for regular data (not only tombstones) in a
> Cassandra cluster; all the other optimizations can and will fail at some
> point. Repair also provides better consistency when reading with a weak
> consistency level such as LOCAL_ONE: as entropy is reduced, the chance of
> reading the same data everywhere increases.
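>
> As a reminder, that 10-day window is a per-table setting; a minimal CQL
> sketch (keyspace and table names are made up):
>
>     SELECT gc_grace_seconds FROM system_schema.tables
>       WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_table';
>
>     ALTER TABLE my_keyspace.my_table
>       WITH gc_grace_seconds = 864000;    -- the 10-day default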
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthak...@conversantmedia.com>:
>
>>
>>
>> Unfortunately, I don't know much about the replication architecture.
>>
>> The only thing I know is that replication is set at the keyspace level
>> (i.e. 1, 2, 3 or N replicas), and that there is also the consistency level,
>> set at the client application level, which determines how many
>> acknowledgements are necessary to deem a write successful.
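>>
>> A minimal sketch of those two levels, in case it helps (names and numbers
>> are made up):
>>
>>     -- Replication factor: a keyspace-level property
>>     CREATE KEYSPACE my_keyspace WITH replication =
>>       {'class': 'NetworkTopologyStrategy', 'DC1': 3};
>>
>>     -- Consistency level: set per request by the client (cqlsh syntax)
>>     CONSISTENCY LOCAL_QUORUM;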
>>
>>
>>
>> And you might have noticed in the video that anti-entropy repair is to be
>> run as deemed necessary, not blindly as a rule.
>>
>> E.g. if your data is read-only (never mutated), then there is no need for
>> anti-entropy repair.
>>
>>
>>
>> *From: *eugene miretsky <eugene.miret...@gmail.com>
>> *Date: *Thursday, April 20, 2017 at 5:52 PM
>> *To: *Conversant <jthak...@conversantmedia.com>
>> *Cc: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Subject: *Re: Why are automatic anti-entropy repairs required when
>> hinted hand-off is enabled?
>>
>>
>>
>> Thanks Jayesh,
>>
>>
>>
>> Watched all of those.
>>
>>
>>
>> Still not sure I fully get the theory behind it.
>>
>>
>>
>> Aside from the 2 failure cases I mentioned earlier, the only other way
>> data can become inconsistent is an error when replicating the data in the
>> background. Does Cassandra have a retry policy for internal replication? Is
>> there a setting to change it?
>>
>> On Thu, Apr 6, 2017 at 10:54 PM, Thakrar, Jayesh <
>> jthak...@conversantmedia.com> wrote:
>>
>> I had asked a similar/related question - on how to carry out repair, etc. -
>> and got some useful pointers.
>>
>> I would highly recommend the YouTube video or the SlideShare link below
>> (both are for the same presentation).
>>
>>
>>
>> https://www.youtube.com/watch?v=1Sz_K8UID6E
>>
>>
>>
>> http://www.slideshare.net/DataStax/real-world-repairs-vinay-chella-netflix-cassandra-summit-2016
>>
>>
>>
>> https://www.pythian.com/blog/effective-anti-entropy-repair-cassandra/
>>
>>
>>
>> https://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsRepair.html
>>
>>
>>
>> https://www.datastax.com/dev/blog/repair-in-cassandra
>>
>> *From: *eugene miretsky <eugene.miret...@gmail.com>
>> *Date: *Thursday, April 6, 2017 at 3:35 PM
>> *To: *<user@cassandra.apache.org>
>> *Subject: *Why are automatic anti-entropy repairs required when hinted
>> hand-off is enabled?
>>
>>
>>
>> Hi,
>>
>>
>>
>> As I see it, if hinted handoff is enabled, the only time data can be
>> inconsistent is when:
>>
>>    1. A node is down for longer than the max_hint_window (see the
>>    cassandra.yaml snippet after this list)
>>    2. The coordinator node crashes before all the hints have been
>>    replayed
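>>
>> (For reference, the hint window is a cassandra.yaml setting; the value
>> shown below is the usual default of 3 hours:)
>>
>>     # cassandra.yaml
>>     max_hint_window_in_ms: 10800000    # 3 hours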
>>
>> Why is it still recommended to perform frequent automatic repairs, as
>> well as to enable read repair? Can't I just run a repair after one of the
>> nodes has been down? The only problem I see with this approach is a long
>> repair job (instead of small incremental repairs). But other than that, are
>> there any other issues/corner-cases?
>>
>>
>>
>> Cheers,
>>
>> Eugene
>>
>>
>>
>
>
