Re: Cassandra cross dc replication row isolation

Alexey Knyshev Tue, 21 May 2019 08:02:17 -0700

Anyone?

Wed, 8 May 2019 at 11:37, Alexey Knyshev <alexey.knys...@gmail.com>:


> Hi, thanks for your answers!
>
> > Are you asking if writes are atomic at the partition level? If so, yes.
> If you have N columns in a simple k/v schema, and you send a write with X/N
> of those columns set, all X will be updated at the same time wherever that
> write goes.
>
> Even for cross-DC replication? (And yes, it has nothing to do with CL at all.)
> I'm not familiar with Cassandra internals and implementation details, but
> if I understand correctly there are at least 3 possible implementations of
> replication:
>
>    1. Row level: each row is eventually consistent as a bunch of cells.
>    A user cannot see a mix of two concurrent updates, only the row as it
>    was before or after a write, never in an intermediate state (writes are
>    linearised). This does not look like Cassandra, because it would
>    require a read before every write, which would have a huge impact on
>    Cassandra's basic properties as it is an AP system in general.
>    2. Write level: each write is applied / replicated atomically, which
>    guarantees isolation across ONLY the columns updated by that write. Enough
>    for my case.
>    3. Cell level: each cell "lives" its own life without any sync with
>    other cells in the row (excluding the PK of course). Worst case scenario.
>
> To clarify my case a bit: I only insert rows with a unique PK (no overwrites
> happen) into the CF and hope that each row will be replicated atomically
> (eventually) to another datacenter with isolation guarantees (whole row or
> nothing).
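
As a rough illustration of the difference between options 2 and 3 above, here is a toy Python model. It is not Cassandra code, and `Cell`, `mutation` and `merge_rows` are made-up names. It assumes the commonly documented behaviour: a mutation to a single partition is applied on a replica as one unit, while conflicting values are reconciled per cell by last-write-wins on the write timestamp. Timestamp ties are ignored in this sketch.

```python
from typing import Dict, NamedTuple


class Cell(NamedTuple):
    value: str
    writetime: int  # write timestamp; one value shared by all cells of a write


Row = Dict[str, Cell]


def mutation(values: Dict[str, str], writetime: int) -> Row:
    """Build one write: every cell in it carries the same writetime."""
    return {col: Cell(val, writetime) for col, val in values.items()}


def merge_rows(current: Row, incoming: Row) -> Row:
    """Apply a whole mutation in one step, resolving each column by last-write-wins."""
    merged = dict(current)
    for col, cell in incoming.items():
        old = merged.get(col)
        if old is None or cell.writetime > old.writetime:
            merged[col] = cell
    return merged


# Case 1: insert with a unique PK. The replica in DC B holds either nothing
# or the complete row, because the mutation is applied as one unit.
before: Row = {}
insert = mutation({"a": "1", "b": "2", "c": "3"}, writetime=100)
after = merge_rows(before, insert)

# Case 2: two later writes touching overlapping columns of the same row.
u1 = mutation({"a": "x1", "b": "x1"}, writetime=200)
u2 = mutation({"b": "y2", "c": "y2"}, writetime=300)

final = merge_rows(merge_rows(after, u1), u2)
final_reordered = merge_rows(merge_rows(after, u2), u1)

print(sorted(after))                        # columns visible after the insert
print({c: v.value for c, v in final.items()})
print(final == final_reordered)             # arrival order does not matter
```

In this model the unique-PK insert is all-or-nothing on a given replica, which matches option 2. The second case shows why option 1 does not hold: the converged row takes `a` from one write and `b`, `c` from the other, so the result is decided per cell even though each write was applied whole.
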
>
> Wed, 8 May 2019 at 01:15, Avinash Mandava <avin...@vorstella.com>:
>
>> Are you asking if writes are atomic at the partition level? If so, yes. If
>> you have N columns in a simple k/v schema, and you send a write with X/N of
>> those columns set, all X will be updated at the same time wherever that
>> write goes.
>>
>> The CL thing is more about how tolerant you are to stale data, i.e. if
>> you write in one DC and you absolutely can't tolerate reads from a remote
>> DC showing stale data, you would have to write at EACH_QUORUM and read at
>> LOCAL_QUORUM. While I'm not one for blanket advice, and certainly you can
>> make the decision on this tradeoff, this is a last resort situation, one of
>> those "supported features" that you ought to be wary of, as it's a bit off
>> from the intended design/usage of the system.
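
To make the EACH_QUORUM / LOCAL_QUORUM tradeoff concrete, a back-of-the-envelope check in Python. The replication factor of 3 per DC and the helper names are assumptions for illustration only; the quorum size floor(RF/2) + 1 and the overlap rule W + R > RF are the usual ones.

```python
def quorum(rf: int) -> int:
    """Quorum size for a given replication factor."""
    return rf // 2 + 1


def read_sees_latest_write(write_acks: int, read_replicas: int, rf: int) -> bool:
    """True when any read set must overlap any acknowledged write set (W + R > RF)."""
    return write_acks + read_replicas > rf


rf_per_dc = {"A": 3, "B": 3}  # assumed replication factors

# EACH_QUORUM write: a quorum of replicas has acknowledged in every DC.
each_quorum_acks = {dc: quorum(rf) for dc, rf in rf_per_dc.items()}

# LOCAL_QUORUM write through a coordinator in A: only A's quorum is awaited.
# The write is still sent to B, but no ack from B is guaranteed before the
# client gets its response.
local_quorum_acks = {"A": quorum(rf_per_dc["A"]), "B": 0}

# LOCAL_QUORUM read in the remote DC B.
read_dc = "B"
r = quorum(rf_per_dc[read_dc])

fresh_with_each_quorum = read_sees_latest_write(
    each_quorum_acks[read_dc], r, rf_per_dc[read_dc]
)
fresh_with_local_quorum = read_sees_latest_write(
    local_quorum_acks[read_dc], r, rf_per_dc[read_dc]
)

print(fresh_with_each_quorum, fresh_with_local_quorum)
```

Under these assumptions only the EACH_QUORUM write guarantees that a LOCAL_QUORUM read in B overlaps the written replicas, and it pays for that by waiting on the remote DC for every write.
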
>>
>> On Tue, May 7, 2019 at 2:58 PM Rahul Singh <rahul.xavier.si...@gmail.com>
>> wrote:
>>
>>> Depends on the consistency level you are setting on write and read.
>>>
>>> What CL are you writing at and what CL are you reading at?
>>>
>>> The consistency level tells the coordinator when to send acknowledgement
>>> of a write and whether to cross DCs to confirm a write. It also tells the
>>> coordinator how many replicas to read and whether or not to cross DCs to
>>> get consensus.
>>>
>>> E.g. LOCAL_QUORUM is different from QUORUM.
>>> LOCAL_QUORUM guarantees the data was saved to a quorum of nodes in the DC
>>> in which the coordinator accepted the write. Similarly, a read would only
>>> check nodes in that DC. QUORUM would check across DCs in the whole cluster.
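
A small sketch of how many acknowledgements the two levels wait for, assuming two DCs with a replication factor of 3 each. The function name `acks_needed` and the DC layout are hypothetical; only the quorum arithmetic is standard.

```python
from typing import Dict


def quorum(n: int) -> int:
    """Quorum size for n replicas."""
    return n // 2 + 1


def acks_needed(cl: str, rf_per_dc: Dict[str, int], local_dc: str) -> int:
    """Number of replica acks the coordinator waits for at a given consistency level."""
    if cl == "LOCAL_QUORUM":
        # Quorum counted over the coordinator's own DC only.
        return quorum(rf_per_dc[local_dc])
    if cl == "QUORUM":
        # Quorum counted over all replicas in the cluster.
        return quorum(sum(rf_per_dc.values()))
    raise ValueError(f"unsupported consistency level in this sketch: {cl}")


rf_per_dc = {"A": 3, "B": 3}  # assumed layout
local = acks_needed("LOCAL_QUORUM", rf_per_dc, local_dc="A")
cluster = acks_needed("QUORUM", rf_per_dc, local_dc="A")

# QUORUM cannot be satisfied from DC A alone when it needs more acks
# than A has replicas.
must_cross_dc = cluster > rf_per_dc["A"]

print(local, cluster, must_cross_dc)
```

With this layout LOCAL_QUORUM waits for 2 replicas in A, while QUORUM needs 4 out of 6 and therefore always includes at least one replica across the WAN link.
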
>>> On May 7, 2019, 12:11 PM -0500, Alexey Knyshev <alexey.knys...@gmail.com>,
>>> wrote:
>>>
>>> Hi there!
>>>
>>> Could someone please explain how Column Family would be replicated and
>>> "visible / readable" in the following scenario? Having multiple
>>> geo-distributed datacenters with significant latency (up to 100ms RTT).
>>> Let's name two of them A and B and consider the following 2 cases:
>>>
>>>    1. Cassandra client X inserts a row into a Column Family (CF) with
>>>    Primary Key = PK (all cells are set - no nulls possible). The write
>>>    coordinator is in DC A. All cells in this write should have the same
>>>    writetime. For simplicity let's assume that the Cassandra coordinator
>>>    node sets the writetime. After some amount of time (< RTT) client Y
>>>    reads the whole row (select * ...) from the same CF with the same PK,
>>>    talking to a coordinator node in another DC (B). Is it possible that
>>>    client Y will get some cells as NULLs? I mean, is it possible to read
>>>    some already replicated cells and get NULLs for others, or does
>>>    Cassandra guarantee row-level isolation / atomic write for that
>>>    insert? Assume that the row (all cells for the same PK) will never be
>>>    updated / deleted afterwards.
>>>    2. Same as in case 1, but after the first write at PK the same client
>>>    (X) updates some columns for the same PK. Will this update be isolated /
>>>    atomically written and eventually visible in another DC? Will a client
>>>    see an isolated state, as it was either before the write or after it?
>>>
>>> Thanks in advance!
>>>
>>>
>>> --
>>> linkedin.com/profile
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>>>
>>> github.com/alexeyknyshev
>>> bitbucket.org/alexeyknyshev
>>>
>>>
>>
>> --
>> www.vorstella.com
>> 408 691 8402
>>
>
>
>


