> Are you advising CL.ONE is not worth it when considering
> read performance ?

Consistency is not performance, it's a whole new thing to tune in your application. If you have performance issues, deal with those as performance issues: better code / data model / hardware.

> By the way, I do not have consistency problem at all - data is only written
> once

Nobody expects a consistency problem. Its chief weapon is surprise. Surprise and fear. Its two weapons are fear and surprise. And so forth.
http://www.youtube.com/watch?v=Ixgc_FGam3s

If you write at LOCAL QUORUM in DC 1 and DC 2 is down at the start of the request, a hint will be stored in DC 1. Some time later, when DC 2 comes back, that hint will be sent to DC 2. If in the meantime you read from DC 2 at CL ONE you will not get that change. With Read Repair enabled it will repair in the background and you may get a different response on the next read (I am guessing here, I cannot remember exactly how RR works cross DC).

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 15/09/2011, at 10:07 AM, Pierre Chalamet wrote:

> Thanks Aaron, didn't see your answer before mine.
>
> I do agree for 2/ I might have read error. Good suggestion to use
> EACH_QUORUM - it could be a good trade off to read at this level if ONE
> fails.
>
> Maybe using LOCAL_QUORUM might be a good answer and will avoid headache
> after all. Are you advising CL.ONE is not worth it when considering
> read performance ?
>
> By the way, I do not have consistency problem at all - data is only written
> once (and if more it is always the same data) and read several times across
> DC. I only have replication problems. That's why I'm more inclined to use
> CL.ONE for read if possible.
>
> Thanks,
> - Pierre
>
>
> -----Original Message-----
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Wednesday, September 14, 2011 11:48 PM
> To: user@cassandra.apache.org; pie...@chalamet.net
> Subject: Re: Get CL ONE / NTS
>
> Your current approach to Consistency opens the door to some inconsistent
> behavior.
>
>> 1/ Will I have an error because DC2 does not have any copy of the data ?
> If you read from DC2 at CL ONE and the data is not replicated it will not be
> returned.
>
>> 2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2 ?
> Not at CL ONE. If you use CL EACH QUORUM then the read will go to all the
> DCs. If DC2 is behind DC1 then you will get the data from DC1.
>
>> 3/ In case of partial replication to DC2, will I see sometimes errors
>> about servers not holding the data in DC2 ?
> Depending on the API call and the client, working at CL ONE, you will see
> either errors or missing data.
>
>> 4/ Does Get CL ONE fail as soon as the fastest server to answer tells it
>> does not have the data, or does it wait until all servers tell they do not
>> have the data ?
> yes
>
> Consider
>
> using LOCAL QUORUM for write and read. This will make things a bit more
> consistent without adding inter DC overhead to the request latency. It is
> still possible to not get data in DC2 if it is totally disconnected from DC1.
>
> write at LOCAL QUORUM and read at EACH QUORUM. You will always be able to
> read the data, but requests in DC2 will fail if DC1 is not reachable.
>
> Hope that helps.
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15/09/2011, at 1:33 AM, Pierre Chalamet wrote:
>
>> Hello,
>>
>> I have 2 datacenters. Cassandra is configured as follows:
>> - RackInferringSnitch
>> - NetworkTopologyStrategy for CF
>> - strategy_options: DC1:3 DC2:3
>>
>> Data are written using CL LOCAL_QUORUM so data written from one datacenter
>> will eventually be replicated to the other datacenter. Data is always
>> written exactly once.
>>
>> On the other side, I'd like to improve the read path. I'm currently using
>> CL ONE since data is only written once (ie: timestamp is more or less
>> meaningless in my case).
>>
>> This is where I have some doubts: if data is written on DC1 and a read is
>> attempted from DC2 while the data is still not replicated or only
>> partially replicated (for whatever good reason, since replication is async),
>> what is the behavior of Get with CL ONE / NTS ?
>> 1/ Will I have an error because DC2 does not have any copy of the data ?
>> 2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2 ?
>> 3/ In case of partial replication to DC2, will I see sometimes errors
>> about servers not holding the data in DC2 ?
>> 4/ Does Get CL ONE fail as soon as the fastest server to answer tells it
>> does not have the data, or does it wait until all servers tell they do not
>> have the data ?
>>
>> Thanks a lot,
>> - Pierre
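
For anyone following the hint scenario in this thread, here is a rough toy model in Python. It is only a sketch and not Cassandra code: the `Cluster` class, its methods and the `quorum` helper are all made up for illustration, and hint storage and delivery are reduced to a plain list. It assumes the DC1:3 DC2:3 layout from the original post and that a CL ONE read asks a single replica in the local DC.

```python
# Toy model of the scenario discussed in the thread. Nothing here is
# Cassandra's real API; every name is hypothetical.
RF = {"DC1": 3, "DC2": 3}  # strategy_options from the original post


def quorum(rf):
    """Number of replicas that make a quorum out of rf replicas."""
    return rf // 2 + 1


class Cluster:
    def __init__(self, rf):
        # One dict per replica, grouped by datacenter.
        self.replicas = {dc: [{} for _ in range(n)] for dc, n in rf.items()}
        self.down = set()  # datacenters that are currently unreachable
        self.hints = []    # (dc, key, value) kept until delivered

    def write_local_quorum(self, local_dc, key, value):
        """Write succeeds if the local DC is up; a down DC only gets a hint."""
        if local_dc in self.down:
            raise RuntimeError("local quorum not reachable")
        for dc, nodes in self.replicas.items():
            if dc in self.down:
                self.hints.append((dc, key, value))
            else:
                # Simplification: every live replica applies the write.
                for node in nodes:
                    node[key] = value

    def read_one(self, dc, key):
        """Simplified CL ONE: ask one replica in dc; None means no data."""
        return self.replicas[dc][0].get(key)

    def deliver_hints(self, dc):
        """Replay the hints stored for dc and drop them from the list."""
        remaining = []
        for hint_dc, key, value in self.hints:
            if hint_dc == dc:
                for node in self.replicas[dc]:
                    node[key] = value
            else:
                remaining.append((hint_dc, key, value))
        self.hints = remaining


cluster = Cluster(RF)
cluster.down.add("DC2")                      # DC2 is down when the write starts
cluster.write_local_quorum("DC1", "k", "v")  # acked by DC1, hint kept for DC2
cluster.down.discard("DC2")                  # DC2 is back, hint not yet sent
before = cluster.read_one("DC2", "k")        # the write is not visible yet
cluster.deliver_hints("DC2")
after = cluster.read_one("DC2", "k")         # now the write is visible
print(quorum(RF["DC1"]), before, after)
```

The gap between `before` and `after` is the window described above: the write was acknowledged at LOCAL QUORUM in DC1, yet a CL ONE read in DC2 comes back empty until the hint is replayed. Read repair is left out of the model on purpose.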