Hi,

In my opinion, the guarantee provided by Cassandra is: if your write request at Quorum *succeeds*, then the next read requests at Quorum (issued after the write response, and that also succeed) will be consistent (because CL.Write + CL.Read > RF).
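For example, here is a rough sketch with the Python driver (the addresses, keyspace and table names are made up, just to illustrate the write-then-read at Quorum and the retry):

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel, WriteTimeout
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])          # assuming a local cluster with RF=3
session = cluster.connect('my_keyspace')  # hypothetical keyspace and table

write = SimpleStatement(
    "UPDATE users SET name = %s WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)
read = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)

try:
    session.execute(write, ("Ken", 42))   # write at QUORUM: waits for 2 of 3 replicas
except WriteTimeout:
    # the write is in an unknown state (it may have been applied on some
    # replicas only), so retry it before relying on the read guarantee
    session.execute(write, ("Ken", 42))

row = session.execute(read, (42,)).one()  # a QUORUM read after a *successful* QUORUM
                                          # write must overlap it (2 + 2 > 3)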
Of course, while you haven't received a valid response to your write request at Quorum, the cluster is in an inconsistent state, and you have *to retry your write request*. That said, Cassandra provides some other important behaviors that tend to reduce the duration of this inconsistent state:
- the coordinator does not send the request only to the nodes that need to answer to satisfy the CL, but to all nodes that should hold the data (of course with RF=3, only A, B & C are involved)
- during read requests, Cassandra asks one node for the data and the other nodes involved in the CL for a digest; if the digests don't all match, it asks those nodes for the full data, handles the merge, and finally issues a background repair to those nodes. Your write may have succeeded in the meantime.
- according to a chance ratio, Cassandra will *sometimes* send the read to all nodes holding the data, not only the ones involved in the CL, and execute background repairs
- you have to schedule repairs regularly

I'd add that if some nodes do not manage to handle write requests in time, they may be under pressure, and there is only a small chance that they will succeed on a read request :)

And finally, what is time? Measured from where/when? You may send one read after another and still receive the results in the opposite order. Writing at Quorum is not writing within a transaction; you'll certainly have to make some tradeoffs.

Regards,

--
Nicolas

On Wed, Sep 14, 2016 at 21:14, Alexander Dejanovski <a...@thelastpickle.com> wrote:

> My understanding of the described scenario is that the write hasn't
> succeeded when the reads are fired, as B and C haven't processed the
> mutation yet.
>
> There would be 3 clients here and not 2: C1 writes, C2 and C3 read.
>
> So the race condition could still happen in this particular case.
>
> On Wed, Sep 14, 2016 at 21:07, Work <jrother...@codojo.me> wrote:
>
>> Hi Alex:
>>
>> Hmmm ... assuming clock skew is eliminated... and assuming nodes are up
>> and available... and assuming quorum writes and quorum reads and everyone
>> waiting for success (which is NOT the OP's scenario), two different
>> clients will be guaranteed to see all successful writes, or be told that
>> the read failed.
>>
>> C1 writes at quorum to A, B.
>> C2 reads at quorum.
>> So it tries to read from ALL nodes: A, B, C.
>> If A, B respond --> success
>> If A, C respond --> conflict
>> If B, C respond --> conflict
>> Because a quorum (2 nodes) responded, the coordinator will return the
>> latest timestamp and may issue a read repair depending on YAML settings.
>>
>> So where do you see only one client having this guarantee?
>>
>> Regards,
>>
>> James
>>
>> On Sep 14, 2016, at 4:00 AM, Alexander DEJANOVSKI <adejanov...@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> the analysis is valid, and strong consistency the Cassandra way means
>> that one client writing at quorum, then reading at quorum, will always
>> see its previous write.
>> Two different clients have no guarantee to see the same data when using
>> quorum, as illustrated in your example.
>>
>> The only options here are to route requests to specific clients based on
>> some id to guarantee the sequence of operations outside of Cassandra (the
>> same client will always be responsible for a set of ids), or to raise the
>> CL to ALL at the expense of availability (you should not do that).
>>
>> Cheers,
>>
>> Alex
>>
>> On Wed, Sep 14, 2016 at 11:47, Qi Li <ken.l...@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> we are using quorum consistency, and we *suspect* there may be a race
>>> condition during the write. Let's say RF is 3, so the write will wait
>>> for at least 2 nodes to ack. Suppose only 1 node has acked (node A) and
>>> the other 2 nodes (B and C) are still waiting to apply the update, and
>>> two read requests come in:
>>> one read gets its data from nodes B and C, so version 1 is returned;
>>> the other read gets its data from nodes A and B, so the latest version 2
>>> is returned.
>>>
>>> So clients are getting different data at the same time. Is this a valid
>>> analysis? If so, are there any options we can set to deal with this
>>> issue?
>>>
>>> thanks
>>> Ken
>>>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>