A read at QUORUM doesn't mean that we read the newest value from a quorum of replicas. It ensures that at least one of the replicas we read from holds the newest value, as long as the write at QUORUM succeeded beforehand and W + R > N.
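
To make the overlap argument concrete, here is a minimal sketch in plain Python. It is not Cassandra code: the helper names (`quorums_always_overlap`, `resolve`, `read_results`) and the use of integer timestamps to pick the newest value are assumptions made for illustration. It uses N = 3 and W = R = 2, as in the example from this thread.

```python
from itertools import combinations

N = 3      # replication factor
W = R = 2  # QUORUM for N = 3


def quorums_always_overlap(n, w, r):
    """True if every write set of size w shares a replica with every read set of size r."""
    nodes = range(n)
    return all(
        set(ws) & set(rs)
        for ws in combinations(nodes, w)
        for rs in combinations(nodes, r)
    )


def resolve(responses):
    """Pick the value with the highest timestamp among (timestamp, value) pairs."""
    return max(responses)[1]


def read_results(replicas, r):
    """Set of values a read could return, over every possible read set of size r."""
    return {
        resolve([replicas[name] for name in read_set])
        for read_set in combinations(replicas, r)
    }


# Write of val2 (timestamp 2) succeeded at QUORUM: two replicas have it.
completed = {"node1": (2, "val2"), "node2": (2, "val2"), "node3": (1, "val1")}

# Write of val2 is still in flight: only node2 has it so far.
in_flight = {"node1": (1, "val1"), "node2": (2, "val2"), "node3": (1, "val1")}

print(quorums_always_overlap(N, W, R))  # overlap is guaranteed when W + R > N
print(read_results(completed, R))       # only val2 is possible
print(read_results(in_flight, R))       # val1 or val2, depending on who answers
```

After the completed write, every read quorum contains at least one replica with val2, so the resolved result is always val2. While the write is in flight, a read that reaches node1 and node3 still sees val1, which is the indeterminism Aaron describes below.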

On Fri, Oct 26, 2012 at 12:00 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> Kind of an interesting question
>
> I think you are saying if a client read resolved only the two nodes as
> said in Aaron's email back to the client and read-repair was kicked off
> because of the inconsistent values and the write did not complete yet and
> I guess you would have two nodes go down to lose the value right after the
> read, and before write was finished such that the client read a value that
> was never stored in the database. The odds of two nodes going out are
> pretty slim though.
>
> Or, what if the node with part of the write went down, as long as the
> client stays up, he would complete his write on the other two nodes.
> Seems to me as long as two nodes don't fail, you are reading at quorum and
> fit with the consistency model since you get a value that will be on two
> nodes in the immediate future.
>
> Thanks,
> Dean
>
> On 10/25/12 9:45 AM, "shankarpnsn" <shankarp...@gmail.com> wrote:
>
> >aaron morton wrote
> >>> 2. You do a write operation (W1) with quorum of val=2
> >>> node1 = val1 node2 = val2 node3 = val1 (write val2 is not complete
> >>> yet)
> >> If the write has not completed then it is not a successful write at the
> >> specified CL as it could fail now.
> >>
> >> Therefore the R + W > N Strong Consistency guarantee does not apply at
> >> this exact point in time. A read to the cluster at this exact point in
> >> time using QUORUM may return val2 or val1. Again the operation W1 has
> >> not completed, if read R' starts and completes while W1 is processing
> >> it may or may not return the result of W1.
> >
> >I agree completely that it is fair to have this indeterminism in case of
> >partial/failed/in-flight writes, based on what nodes respond to a
> >subsequent read.
> >
> >aaron morton wrote
> >> It's important to point out the difference between Read Repair, in the
> >> context of the read_repair_chance setting, and Consistent Reads in the
> >> context of the CL setting. All of this is outside of the processing of
> >> your read request. It is separate from the stuff below.
> >>
> >> Inside the user read request when ReadCallback.get() is called and CL
> >> nodes have responded the responses are compared. If a DigestMismatch
> >> happens then a Row Repair read is started, the result of this read is
> >> returned to the user. This Row Repair read MAY detect differences, if it
> >> does it resolves the super set, sends the delta to the replicas and
> >> returns the super set value to be returned to the client.
> >>
> >>> In this case, for read R1, the value val2 does not have a quorum. Would
> >>> read R1 return val2 or val4 ?
> >>
> >> If val4 is in the memtable on node before the second read the result
> >> will be val4.
> >> Writes that happen between the initial read and the second read after a
> >> Digest Mismatch are included in the read result.
> >
> >Thanks for clarifying this, Aaron. This is very much in line with what I
> >figured out from the code and brings me back to my initial question on the
> >point of when and what the user/client gets to see as the read result. Let
> >us, for now, consider only the repairs initiated as a part of /consistent
> >reads/. If the Row Repair (after resolving and sending the deltas to
> >replicas, but not waiting for a quorum success after the repair) returns
> >the super set value immediately to the user, wouldn't it be a breach of
> >the consistent reads paradigm? My intuition behind saying this is because
> >we would respond to the client without the replicas having confirmed
> >their meeting the consistency requirement.
> >
> >I agree that returning val4 is the right thing to do if quorum (two) nodes
> >among (node1, node2, node3) have the val4 at the second read after digest
> >mismatch. But wouldn't it be incorrect to respond to user with any value
> >when the second read (after mismatch) doesn't find a quorum? So after
> >sending the deltas to the replicas as a part of the repair (still a part
> >of /consistent reads/), shouldn't the value be read again to check for the
> >presence of a quorum after the repair?
> >
> >In the example we had, assume the mismatch is detected during a read R1
> >from coordinator node C, that reaches node1, node2.
> >State seen by C after first read R1:
> ><node1 = val1, node2 = val2, node3 = val1>
> >
> >A second read is initiated as a part of repair for consistent read of R1.
> >This second read observes the values (val1, val2) from (node1, node2) and
> >sends the corresponding row repair delta to node1. I'm guessing C cannot
> >respond back to user with val2 until C knows that node1 has actually
> >written the value val2 thereby meeting the quorum. Is this interpretation
> >correct?
> >
> >--
> >View this message in context:
> >http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/What-does-ReadRepair-exactly-do-tp7583261p7583395.html
> >Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> >Nabble.com.