Consistency Level is a pseudo-enum... you have the choice between:

* ONE
* QUORUM (and there are different types of this)
* ALL

At CL=ONE, only one node is guaranteed to have received the write if the operation succeeds.

At CL=ALL, all nodes that the RF says the data should be stored on must confirm the write before the operation succeeds, but a partial write will complete eventually if at least one node recorded the write.

At CL=QUORUM, at least ((N/2)+1) nodes must confirm the write for the operation to succeed, otherwise it fails; again, a partial write will complete eventually if at least one node recorded the write. Read repair will eventually ensure that the write is replicated across all RF nodes in the cluster. The N in QUORUM above depends on the type of QUORUM you choose; in general think N=RF unless you choose a fancy QUORUM.

To have a consistent read, CL of write + CL of read must be > RF...

* Write at ONE, read at ONE => may not get the most recent write if RF > 1 [fastest write, fastest read] {data loss possible if node lost before read repair}
* Write at QUORUM, read at ONE => may not get the most recent write if RF > 2 [moderate write, fastest read] {multiple nodes must be lost for data loss to be possible}
* Write at ALL, read at ONE => consistent read, writes may be blocked if any node fails [slowest write, fastest read]
* Write at ONE, read at QUORUM => may not get the most recent write if RF > 2 [fastest write, moderate read] {data loss possible if node lost before read repair}
* Write at QUORUM, read at QUORUM => consistent read [moderate write, moderate read] {multiple nodes must be lost for data loss to be possible}
* Write at ALL, read at QUORUM => consistent read, writes may be blocked if any node fails [slowest write, moderate read]
* Write at ONE, read at ALL => consistent read, reads may fail if any node fails [fastest write, slowest read] {data loss possible if node lost before read repair}
* Write at QUORUM, read at ALL => consistent read, reads may fail if any node fails [moderate write, slowest read] {multiple nodes must be lost for data loss to be possible}
* Write at ALL, read at ALL => consistent read, writes may be blocked if any node fails, reads may fail if any node fails [slowest write, slowest read]

Note: You can choose the CL for each and every operation. This is something that you should design into your application (unless you exclusively use QUORUM for all operations, in which case baking the logic in is advised but less necessary).

The other thing to remember is that RF does not have to equal the number of nodes in your cluster... in fact I would recommend designing your app on the basis that RF < number of nodes in your cluster... because at some point, when your data set grows big enough, you will end up with RF < number of nodes.

-Stephen

On 7 November 2011 13:03, Riyad Kalla <rka...@gmail.com> wrote:
> Ah! Ok, I was interpreting what you were saying to mean that if my RF was too
> high, then the ring would die if I lost one.
> Ultimately what I want (I think) is:
> Replication Factor: 5 (aka "all of my nodes")
> Consistency Level: 2
> Put another way, when I write a value, I want it to exist on two servers *at
> least* before I consider that write "successful" enough for my code to
> continue, but in the background I would like Cassandra to keep copying that
> value around at its leisure until all the ring nodes know about it.
> This sounds like what I need. Thanks for pointing me in the right direction.
> Best,
> Riyad
>
> On Mon, Nov 7, 2011 at 5:47 AM, Anthony Ikeda <anthony.ikeda....@gmail.com>
> wrote:
>>
>> Riyad, I'm also just getting to know the different settings and values
>> myself :)
>> I believe, and it also depends on your config, CL.ONE should tolerate the
>> loss of a node if your RF is 5; once you increase the CL, then if you lose a
>> node the CL is not met and you will get exceptions returned.
>> Sent from my iPhone
>> On 07/11/2011, at 4:32, Riyad Kalla <rka...@gmail.com> wrote:
>>
>> Anthony and Jaydeep, thank you for weighing in.
I am glad to see that they
>> are two different values (makes more sense mentally to me).
>> Anthony, what you said caught my attention: "to ensure all nodes have a
>> copy you may not be able to survive the loss of a single node." -- why would
>> this be the case?
>> I assumed (incorrectly?) that a node would simply disappear off the map
>> until I could bring it back up again, at which point all the missing values
>> that it didn't get while it was down, it would slowly retrieve from other
>> members of the ring. Is this the wrong understanding?
>> If forcing a replication factor equal to the number of nodes in my ring
>> will cause a hard stop when one node goes down (as I understood your comment
>> to mean), it seems to me I should go with a much lower replication factor...
>> something along the lines of 3, or roughly ceiling(N / 2), and just deal with
>> the latency when one of the nodes has to route a request to another server
>> when it doesn't contain the value.
>> Is there a better way to accomplish what I want, or is keeping the
>> replication factor that aggressively high generally a bad thing and using
>> Cassandra in the "wrong" way?
>> Thank you for the help.
>> -Riyad
>>
>> On Sun, Nov 6, 2011 at 11:14 PM, chovatia jaydeep
>> <chovatia_jayd...@yahoo.co.in> wrote:
>>>
>>> Hi Riyad,
>>> You can set replication = 5 (number of replicas) and write with CL = ONE.
>>> There is no hard requirement from Cassandra to write with CL=ALL to
>>> replicate the data unless you need it. Considering your example, if you
>>> write with CL=ONE then it will still replicate your data to all 5 replicas
>>> eventually.
>>> Thank you,
>>> Jaydeep
>>> ________________________________
>>> From: Riyad Kalla <rka...@gmail.com>
>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>> Sent: Sunday, 6 November 2011 9:50 PM
>>> Subject: Will writes with < ALL consistency eventually propagate?
>>>
>>> I am new to Cassandra and was curious about the following scenario...
>>>
>>> Let's say I have a ring of 5 servers. Ultimately I would like each server
>>> to be a full replication of the next (master-master-*).
>>>
>>> In a presentation I watched today on Cassandra, the presenter mentioned
>>> that the ring members will shard data and route your requests to the right
>>> host when they come in to a server that doesn't physically contain the value
>>> you wanted. To the client requesting, this is seamless except for the added
>>> latency.
>>>
>>> If I wanted to avoid the routing and latency and ensure every server had
>>> the full data set, do I have to write with a consistency level of ALL and
>>> wait for all of those writes to return in my code, or can I write with a CL
>>> of 1 or 2 and let the ring propagate the rest of the copies to the other
>>> servers in the background after my code has continued executing?
>>>
>>> I don't mind eventual consistency in my case, but I do (eventually) want
>>> all nodes to have all values and cannot tell if this is default behavior, or
>>> if sharding is the default and I can only force duplicates onto the other
>>> servers explicitly with a CL of ALL.
>>>
>>> Best,
>>> Riyad
>>>
>>
>
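[Editor's sketch] The consistency arithmetic in Stephen's post (QUORUM needs (N/2)+1 nodes with N=RF, and a read is guaranteed consistent when write CL + read CL > RF) can be checked mechanically. A minimal Python sketch; the function names here are made up for illustration and are not part of Cassandra or any driver:

```python
# Sketch of the consistency rules from the thread, not Cassandra code.

def quorum(rf):
    """Nodes needed for QUORUM: (N/2)+1, with N=RF and integer division."""
    return rf // 2 + 1

def nodes_for(cl, rf):
    """Replicas that must acknowledge an operation at a given CL."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[cl]

def read_is_consistent(write_cl, read_cl, rf):
    """A read is guaranteed to see the latest write when the write and
    read replica sets must overlap: write CL + read CL > RF."""
    return nodes_for(write_cl, rf) + nodes_for(read_cl, rf) > rf

rf = 5
print(quorum(rf))                                  # 3
print(read_is_consistent("QUORUM", "QUORUM", rf))  # True
print(read_is_consistent("ONE", "ONE", rf))        # False
```

Running each (write CL, read CL) pair through `read_is_consistent` reproduces the matrix above for any RF.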
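[Editor's sketch] Riyad's target setup (RF=5, a write acknowledged by at least two replicas before the client continues, the rest converging in the background) corresponds to writing at CL=TWO. A toy model of that behavior, purely illustrative, with hypothetical names and no real driver calls:

```python
# Toy model of a write at RF=5 / CL=TWO; not Cassandra internals.

def write(rf, cl, live_replicas):
    """Attempt a write. Succeeds only if enough replicas are up to meet
    the CL; returns (acks the client waits for, replicas that will
    eventually hold the value)."""
    if live_replicas < cl:
        # Analogous to Cassandra refusing the write (UnavailableException).
        raise RuntimeError("cannot meet consistency level")
    # The client unblocks after `cl` acknowledgements; the remaining
    # replicas receive the write in the background, and any that are down
    # converge later via hinted handoff / read repair / anti-entropy.
    return cl, rf

acked, eventual = write(rf=5, cl=2, live_replicas=4)
print(acked, eventual)  # 2 5
```

This is the "eventual" part of the thread's subject line: the write blocks on two acks, yet all five replicas end up with the value, exactly as Jaydeep describes for CL=ONE.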