Hi,

In my opinion, the guarantee provided by Cassandra is: if your write request at Quorum *succeeds*, then the next read requests at Quorum (issued after the write response, and that also succeed) will be consistent (because CL.Write + CL.Read > RF).
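For example, here is a rough sketch with the Python driver (the addresses, keyspace and table names are made up, just to illustrate the write-then-read at Quorum and the retry):

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel, WriteTimeout
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])          # assuming a local cluster with RF=3
session = cluster.connect('my_keyspace')  # hypothetical keyspace and table

write = SimpleStatement(
    "UPDATE users SET name = %s WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)
read = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM)

try:
    session.execute(write, ("Ken", 42))   # write at QUORUM: waits for 2 of 3 replicas
except WriteTimeout:
    # the write is in an unknown state (it may have been applied on some
    # replicas only), so retry it before relying on the read guarantee
    session.execute(write, ("Ken", 42))

row = session.execute(read, (42,)).one()  # a QUORUM read after a *successful* QUORUM
                                          # write must overlap it (2 + 2 > 3)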
Of course, while you haven't received a valid response to your write request at Quorum, the cluster is in an inconsistent state, and you have *to retry your write request*. That said, Cassandra provides some other important behaviors that tend to reduce the duration of this inconsistent state:
- the coordinator does not send the request only to the nodes that need to answer to satisfy the CL, but to all nodes that should hold the data (of course with RF=3, only A, B & C are involved)
- during read requests, Cassandra asks one node for the data and the other nodes involved in the CL for a digest; if the digests don't all match, it asks those nodes for the full data, handles the merge, and finally issues a background repair to those nodes. Your write may have succeeded in the meantime.
- according to a chance ratio, Cassandra will *sometimes* send the read to all nodes holding the data, not only the ones involved in the CL, and execute background repairs
- you have to schedule repairs regularly

I'd add that if some nodes do not manage to handle write requests in time, they may be under pressure, and there is only a small chance that they will succeed on a read request :)

And finally, what is time? Measured from where/when? You may send one read after another and still receive the results in the opposite order. Writing at Quorum is not writing within a transaction; you'll certainly have to make some tradeoffs.

Regards,

--
Nicolas

On Wed, Sep 14, 2016 at 21:14, Alexander Dejanovski <a...@thelastpickle.com> wrote:

> My understanding of the described scenario is that the write hasn't
> succeeded when the reads are fired, as B and C haven't processed the
> mutation yet.
>
> There would be 3 clients here and not 2: C1 writes, C2 and C3 read.
>
> So the race condition could still happen in this particular case.
>
> On Wed, Sep 14, 2016 at 21:07, Work <jrother...@codojo.me> wrote:
>
>> Hi Alex:
>>
>> Hmmm ... assuming clock skew is eliminated... and assuming nodes are up
>> and available... and assuming quorum writes and quorum reads and everyone
>> waiting for success (which is NOT the OP's scenario), two different
>> clients will be guaranteed to see all successful writes, or be told that
>> the read failed.
>>
>> C1 writes at quorum to A, B.
>> C2 reads at quorum.
>> So it tries to read from ALL nodes: A, B, C.
>> If A, B respond --> success
>> If A, C respond --> conflict
>> If B, C respond --> conflict
>> Because a quorum (2 nodes) responded, the coordinator will return the
>> latest timestamp and may issue a read repair depending on YAML settings.
>>
>> So where do you see only one client having this guarantee?
>>
>> Regards,
>>
>> James
>>
>> On Sep 14, 2016, at 4:00 AM, Alexander DEJANOVSKI <adejanov...@gmail.com>
>> wrote:
>>
>> Hi,
>>
>> the analysis is valid, and strong consistency the Cassandra way means
>> that one client writing at quorum, then reading at quorum, will always
>> see its previous write.
>> Two different clients have no guarantee to see the same data when using
>> quorum, as illustrated in your example.
>>
>> The only options here are to route requests to specific clients based on
>> some id to guarantee the sequence of operations outside of Cassandra (the
>> same client will always be responsible for a set of ids), or to raise the
>> CL to ALL at the expense of availability (you should not do that).
>>
>> Cheers,
>>
>> Alex
>>
>> On Wed, Sep 14, 2016 at 11:47, Qi Li <ken.l...@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> we are using quorum consistency, and we *suspect* there may be a race
>>> condition during the write. Let's say RF is 3, so the write will wait
>>> for at least 2 nodes to ack. Suppose only 1 node has acked (node A) and
>>> the other 2 nodes (B and C) are still waiting to apply the update, and
>>> two read requests come in:
>>> one read gets its data from nodes B and C, so version 1 is returned;
>>> the other read gets its data from nodes A and B, so the latest version 2
>>> is returned.
>>>
>>> So clients are getting different data at the same time. Is this a valid
>>> analysis? If so, are there any options we can set to deal with this
>>> issue?
>>>
>>> thanks
>>> Ken
>>>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>