One more thing: I completely ignored your second write, mainly because I assumed it comes after we have already read. So let's say your current state is
node1 = val1, node2 = val1, node3 = val1

You do a quorum write of val2, which is IN the middle of being applied!!!

node1 = val1, node2 = val2, node3 = val1 (NOTICE the write is not complete yet)

If you read from node1 and node3, you get val1. If you read from node1 and node2, you get val2, as a read repair will happen. I.e., you always get either the older value or the newer value.

If you have two writes come in, like so:

node1 = val1, node2 = val2, node3 = val3

Well, I think you can figure it out when you do a read ;). If your read quorum reads from node1 and node3, you get val3, etc. This is basically how it works. If your scenario is a web page, a user simply hits the refresh button and sees the values changing.

Later,
Dean

From: Manu Zhang <owenzhang1...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, October 24, 2012 8:26 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: What does ReadRepair exactly do?

And we don't send read requests to all three replicas (R1, R2, R3) if CL=QUORUM; just 2 of them, depending on proximity.

On Wed, Oct 24, 2012 at 10:20 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:

The user will meet the required consistency unless you encounter some kind of bug in Cassandra. You will get either the older value or the newer value. If you read at quorum, and maybe a write at CL=1 just happened, you may get the older or the newer value depending on whether the node that received the write was involved in the read. If you read at quorum and you wrote at CL=QUORUM, then you may get the newer value or the older value depending on who gets there first, so to speak. In your scenario, if the read repair reads from R2 just before the write is applied, you get the old value. If it reads from R2 just after the write was applied, it gets the new value.
BOTH of these met the consistency constraint. A better example to clear this up may be the following. If you read a value at CL=QUORUM, and you have a write 20ms later, you get the old value, right? And it met the consistency level, right? NOW, what if the write is 1ms later? What if the write is .00001ms later? It still met the consistency level, right? If it is .00001ms before, you get the new value, as it repairs first with the new node. It is just that when programming, your read may get the newer value or the older value, and generally, if you write the code in a way that works with this, the concept works out great in most cases (in some cases, you need to think a bit differently and solve it other ways).

I hope that clears it up.

Later,
Dean

On 10/24/12 8:02 AM, "shankarpnsn" <shankarp...@gmail.com> wrote:

>Hiller, Dean wrote
>> in general it is okay to get the older or newer value. If you are reading
>> 2 rows however instead of one, that may change.
>
>This is certainly interesting, as it could mean that the user could see a
>value that never met the required consistency. For instance, with 3 replicas
><R1,R2,R3> and a quorum consistency, assume that R1 is initiating a read
>(becomes the coordinator), notices a conflict with R2 (assume R1 has a more
>recent value) and initiates a read repair with its value. Meanwhile, R2 and
>R3 have seen two different writes with newer values than what was computed
>by the read repair. If R1 were to respond back to the user with the value
>that was computed at the time of read repair, wouldn't it be a value that
>never met the consistency constraint? I was thinking if this should trigger
>another round of repair that tries to reach the consistency constraint with
>a newer value or time out, which is the expected case when you don't meet
>the required consistency. Please let me know if I'm missing something here.
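For anyone who wants to play with the node1/node2/node3 scenarios from this thread, here is a rough toy model in Python. It is only a sketch under assumptions, not Cassandra's actual code: the `Cell` class, the `quorum_read` function, and the integer timestamps are all made up for illustration. It assumes the coordinator picks the value with the highest write timestamp among the replicas it contacted and repairs any contacted replica that is behind.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical stand-in for one replica's copy of a value."""
    value: str
    timestamp: int  # assumed write timestamp; higher means newer


def quorum_read(replicas, contacted):
    """Return the newest value among the contacted replicas.

    Contacted replicas holding an older timestamp are overwritten with the
    newest cell, which is the "read repair" step in this toy model.
    """
    newest = max((replicas[name] for name in contacted),
                 key=lambda cell: cell.timestamp)
    for name in contacted:
        if replicas[name].timestamp < newest.timestamp:
            replicas[name] = Cell(newest.value, newest.timestamp)
    return newest.value


# A quorum write of val2 is in flight: only node2 has it so far.
state = {
    "node1": Cell("val1", 1),
    "node2": Cell("val2", 2),
    "node3": Cell("val1", 1),
}
old = quorum_read(state, ["node1", "node3"])  # neither replica has the write
new = quorum_read(state, ["node1", "node2"])  # node2 has it; node1 is repaired

# Two writes in flight: every replica holds a different value.
two_writes = {
    "node1": Cell("val1", 1),
    "node2": Cell("val2", 2),
    "node3": Cell("val3", 3),
}
latest = quorum_read(two_writes, ["node1", "node3"])

print(old, new, latest)
```

In this model the first read returns the older value and the second returns the newer one, and both reads contacted a quorum (2 of 3), which is the point made above: either answer meets the consistency level. Which replicas are contacted is passed in by hand here; in the thread it is said to depend on proximity.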