Re: New Chain for : Does Cassandra use vector clocks

Anthony John Thu, 24 Feb 2011 09:02:30 -0800

If you are correct, and you are probably closer to the code, then a CL of
Quorum does not guarantee consistency.


On Thu, Feb 24, 2011 at 10:54 AM, Sylvain Lebresne <sylv...@datastax.com>wrote:

> On Thu, Feb 24, 2011 at 5:34 PM, Anthony John <chirayit...@gmail.com>wrote:
>
>>  >>Time stamps are not used for conflict resolution - unless it is part
>>> of the application logic!!!
>>>
>>
>> >>What is your definition of conflict resolution ? Because if you update
>> the same column twice (which
>> >>I'll call a conflict), then the timestamps are used to decide which
>> update wins (which I'll call a resolution).
>>
>> I understand what you are saying, and yes semantics is very important
>> here. And yes we are responding to the immediate questions without covering
>> all questions in the thread.
>>
>> The point being made here is that the timestamp of the column is not used
>> by Cassandra to figure out what data to return.
>>
>
> Not quite true.
>
>
>> E.g. - Quorum is 2 nodes, with an RF of 3 over N1/N2/N3.
>> A Quorum write comes in and adds/updates the timestamp (TS2) of a particular
>> data element. It succeeds on N1 but fails on N2/N3. So the write is returned as
>> failed - right ?
>> Now a Quorum read comes in for exactly the same piece of data that the write
>> failed for.
>> So N1 has TS2 but both N2 and N3 have the old TS (say TS1).
>> And the read succeeds - will it return TS1 or TS2 ?
>>
>> I submit it will return TS1 - the old TS.
>>
>
> It all depends on which (first 2) nodes respond to the read (since RF=3,
> that can be any two of N1/N2/N3). If N1 is one of the two that make up the
> quorum, then TS2 will be returned, because Cassandra will compare the
> timestamps and decide what to return based on them. If N2 and N3 respond,
> however, both timestamps will be TS1 and so, after timestamp resolution, it
> will still be TS1 that is returned.
> So yes, timestamps are used for conflict resolution.
>
> In your example, you could get TS1 back because a failed write can leave your
> cluster in an inconsistent state. You'd have to retry the quorum write, and only
> when it succeeds can you be guaranteed that a quorum read will always return
> TS2.
>
> This is because when a write fails, Cassandra doesn't guarantee that the
> write did not make it in (there is no revert).
>
>
>>
>> Are we on the same page with this interpretation ?
>>
>> Regards,
>>
>> -JA
>>
>> On Thu, Feb 24, 2011 at 10:12 AM, Sylvain Lebresne 
>> <sylv...@datastax.com>wrote:
>>
>>> On Thu, Feb 24, 2011 at 4:52 PM, Anthony John <chirayit...@gmail.com>wrote:
>>>
>>>> Sylvain,
>>>>
>>>> Time stamps are not used for conflict resolution - unless it is part of
>>>> the application logic!!!
>>>>
>>>
>>> What is your definition of conflict resolution ? Because if you update
>>> the same column twice (which
>>> I'll call a conflict), then the timestamps are used to decide which
>>> update wins (which I'll call a resolution).
>>>
>>>
>>>> You can have "lost updates" w/Cassandra. You need to use 3rd-party products
>>>> - cages, for example - to get ACID-type consistency.
>>>>
>>>
>>> Then again, you'll have to define what you are calling "lost updates".
>>> Provided you use a reasonable consistency level, Cassandra provides fairly
>>> strong durability guarantees, so for some definitions you don't "lose
>>> updates".
>>>
>>> That being said, I never pretended that Cassandra provided any ACID
>>> guarantee. ACID relates to transactions, which Cassandra doesn't support. If
>>> we're talking about the guarantees of transactions, then by all means,
>>> Cassandra won't provide them. And yes, you can use cages or the like to get
>>> transactions. But that was not the point of the thread, was it ? The thread
>>> is about vector clocks, and that has nothing to do with transactions (vector
>>> clocks certainly don't give you transactions).
>>>
>>> Sorry if I wasn't clear in my mail, but I was only responding to why so
>>> far I don't think vector clocks would really provide much for Cassandra.
>>>
>>> --
>>> Sylvain
>>>
>>>
>>>> -JA
>>>>
>>>>
>>>> On Thu, Feb 24, 2011 at 7:41 AM, Sylvain Lebresne <sylv...@datastax.com
>>>> > wrote:
>>>>
>>>>> On Thu, Feb 24, 2011 at 3:22 AM, Anthony John 
>>>>> <chirayit...@gmail.com>wrote:
>>>>>
>>>>>> Apologies : For some reason my response on the original mail keeps
>>>>>> bouncing back, thus this new one!
>>>>>> > On the other hand, the same article says:
>>>>>> > "For conditional writes to work, the condition must be evaluated at
>>>>>> all update
>>>>>> > sites before the write can be allowed to succeed."
>>>>>> >
>>>>>> > This means, that when doing such an update CL=ALL must be used
>>>>>>
>>>>>> Sorry, but I am confused by that entire thread!
>>>>>>
>>>>>> Questions:-
>>>>>> 1. Does Cassandra implement any kind of data locking - at any
>>>>>> granularity whether it be row/colF/Col ?
>>>>>>
>>>>>
>>>>> No locking, no.
>>>>>
>>>>>
>>>>>> 2. If the answer to 1 above is NO! - how does CL ALL prevent
>>>>>> conflicts ? Concurrent updates on exactly the same piece of data on
>>>>>> different nodes can still mess each other up, right ?
>>>>>>
>>>>>
>>>>> Not sure why you are singling out CL.ALL specifically. But at any CL,
>>>>> updating the same piece of data means the same column value. In that case,
>>>>> the resolution rules are the following:
>>>>>    - If the updates have different timestamps, keep the one with the
>>>>> higher timestamp. That is, the more recent of two updates wins.
>>>>>    - If the timestamps are the same, then it compares the values (byte
>>>>> comparison) and keeps the highest value. This is just to break ties in a
>>>>> consistent manner.
>>>>>
>>>>> So if you do two truly concurrent updates (that is, from two places at
>>>>> the same instant), then you'll end up with one of the updates. This is at
>>>>> the column level.
>>>>>
>>>>> However, if that simple conflict detection/resolution mechanism is not
>>>>> good enough for some of your use cases and you need to keep two concurrent
>>>>> updates, it is easy enough. Just make sure that the updates don't end up in
>>>>> the same column. This is easily achieved by appending some unique
>>>>> identifier to the column name, for instance. And when reading, do a slice
>>>>> and reconcile whatever you get back with whatever logic makes sense. If you
>>>>> do that, congrats, you've roughly emulated what vector clocks would do.
>>>>> Btw, no locking or anything needed.
>>>>>
>>>>> In my experience, for most things the timestamp resolution is enough.
>>>>> If the same user updates their profile picture twice on your web site at
>>>>> the same microsecond, it's usually fine to end up with one of the two
>>>>> pictures. In the rare case where you need something more specific, using
>>>>> the Cassandra data model usually solves the problem easily. The reason for
>>>>> not having vector clocks in Cassandra is that so far, we haven't really
>>>>> found many examples where that is not the case.
>>>>>
>>>>> --
>>>>> Sylvain
>>>>>
>>>>>
>>>>
>>>
>>
>
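
For reference, the resolution rules and the failed-write example discussed in
the quoted messages can be sketched in a few lines of Python. This is only an
illustration of the rules as described in the thread, not Cassandra's actual
code: the Column type, the reconcile function and the quorum_read_outcomes
helper are hypothetical names, and TS1/TS2 are stand-in integer timestamps.

```python
from itertools import combinations
from typing import NamedTuple


class Column(NamedTuple):
    """One replica's copy of a column (hypothetical type, for illustration)."""
    timestamp: int
    value: bytes


def reconcile(a: Column, b: Column) -> Column:
    """Higher timestamp wins; on a tie, the larger value (byte comparison) wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value >= b.value else b


def quorum_read_outcomes(replicas: dict) -> dict:
    """For RF=3 and a quorum of 2, reconcile every pair of replicas that
    could be the first two to answer the read."""
    return {
        pair: reconcile(replicas[pair[0]], replicas[pair[1]])
        for pair in combinations(sorted(replicas), 2)
    }


# State after the failed quorum write: only N1 took the new value.
TS1, TS2 = 1, 2
replicas = {
    "N1": Column(TS2, b"new"),
    "N2": Column(TS1, b"old"),
    "N3": Column(TS1, b"old"),
}

outcomes = quorum_read_outcomes(replicas)
for pair, column in outcomes.items():
    print(pair, "-> timestamp", column.timestamp)
```

Any pair that includes N1 reconciles to TS2, while the N2/N3 pair reconciles
to TS1, which is the "it depends on which nodes respond" behaviour described
above.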
