Thanks for your responses, and Ben, thanks for the link. You have basically confirmed that if down_time > max_hint_window_in_ms, the only way to bring DC1 up to date is an anti-entropy repair. The read consistency level is irrelevant to the problem I described, as I am reading at LOCAL_QUORUM. In this situation I lost whatever data (if any) had not been transferred across to DC2 before DC1 went down; that is understandable. Read repair does not help either, as we assumed that down_time > max_hint_window_in_ms. Please correct me if I am wrong.
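For reference, this is roughly what I have in mind for bringing DC1 back in sync once the hint window has been exceeded (run on each DC1 node in turn; "my_keyspace" is just a placeholder for our keyspace name):

    nodetool repair my_keyspace

My understanding is that a full repair on each DC1 node will stream any missing data back from the DC2 replicas, but please correct me if that is not the right approach.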
I think I could understand how this works better if I knew the answers to the following questions:

1. What does the output of nodetool status look like when a cluster spans 2 DCs? Will I be able to see ALL nodes, irrespective of the DC they belong to?

2. How are tokens assigned when adding a 2nd DC? Is the token range (-2^63 to 2^63-1 with the default Murmur3 partitioner) assigned to each DC separately, or is it shared across the entire cluster? (I think the latter is correct.)

3. Does the coordinator store 1 hint irrespective of how many replicas happen to be down at the time, and also irrespective of DC2 being down in the scenario I described above? (I think the presentation you sent answers this, but I would like someone to confirm it.)

Thank you in advance,

Vasilis

On Fri, May 30, 2014 at 3:13 AM, Ben Bromhead <b...@instaclustr.com> wrote:

> Short answer:
>
> If time elapsed > max_hint_window_in_ms then hints will stop being
> created. You will need to rely on your read consistency level, read repair
> and anti-entropy repair operations to restore consistency.
>
> Long answer:
>
> http://www.slideshare.net/jasedbrown/understanding-antientropy-in-cassandra
>
> Ben Bromhead
> Instaclustr | www.instaclustr.com | @instaclustr
> <http://twitter.com/instaclustr> | +61 415 936 359
>
> On 30 May 2014, at 8:40 am, Tupshin Harper <tups...@tupshin.com> wrote:
>
> When one node or DC is down, coordinator nodes being written through will
> notice this fact and store hints (hinted handoff is the mechanism), and
> those hints are used to send the data that was not able to be replicated
> initially.
>
> http://www.datastax.com/dev/blog/modern-hinted-handoff
>
> -Tupshin
>
> On May 29, 2014 6:22 PM, "Vasileios Vlachos" <vasileiosvlac...@gmail.com> wrote:
>
> Hello All,
>
> We have plans to add a second DC to our live Cassandra environment.
> Currently RF=3 and we read and write at QUORUM. After adding DC2 we are
> going to be reading and writing at LOCAL_QUORUM.
>
> If my understanding is correct, when a client sends a write request, if
> the consistency level is satisfied on DC1 (that is, RF/2+1 replicas),
> success is returned to the client and DC2 will eventually get the data as
> well. The assumption behind this is that the client always connects to DC1
> for reads and writes, and that there is a site-to-site VPN between DC1 and
> DC2. Therefore, DC1 will almost always return success before DC2 (actually
> I don't know if it is possible for DC2 to be more up-to-date than DC1 with
> this setup...).
>
> Now imagine DC1 loses connectivity and the client fails over to DC2.
> Everything should work fine after that, the only difference being that DC2
> will now be handling the requests directly from the client. After some
> time, say after max_hint_window_in_ms, DC1 comes back up. My question is:
> how do I bring DC1 up to speed with DC2, which is now more up-to-date?
> Will that require a nodetool repair on the DC1 nodes? Also, what is the
> answer when the outage is < max_hint_window_in_ms instead?
>
> Thanks in advance!
>
> Vasilis
>
> --
> Kind Regards,
>
> Vasileios Vlachos
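P.S. In case it helps to make the client side of the scenario concrete, this is roughly how I picture the application connecting to DC1 and writing at LOCAL_QUORUM (a minimal sketch using the DataStax Python driver; the contact points, keyspace and table names are made up):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.policies import DCAwareRoundRobinPolicy
    from cassandra.query import SimpleStatement

    # Pin the client to DC1 so only DC1 nodes act as coordinators.
    cluster = Cluster(
        contact_points=['10.0.1.1', '10.0.1.2'],  # DC1 nodes (placeholder IPs)
        load_balancing_policy=DCAwareRoundRobinPolicy(local_dc='DC1'),
    )
    session = cluster.connect('my_keyspace')

    # LOCAL_QUORUM waits only for a quorum of replicas in the coordinator's
    # own DC; DC2 receives the write asynchronously via normal replication.
    insert = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    session.execute(insert, (42, 'vasilis'))

    # Failing over to DC2 would mean pointing the client at DC2 contact
    # points with local_dc='DC2', so LOCAL_QUORUM is then evaluated in DC2.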