Doesn't consistency level ALL=QUORUM at RF=2? I have not had a chance to test your fix, but I don't THINK this is the issue. If it is the issue, how do consistency levels ALL and QUORUM differ at this replication factor?
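To spell out the arithmetic behind that question, here is a minimal sketch. It assumes QUORUM means floor(RF/2) + 1 replicas and ALL means RF replicas; the helper name `replicas_required` is hypothetical and is not part of Cassandra or pelops.

```python
def replicas_required(consistency_level: str, rf: int) -> int:
    """Replicas that must respond at a given level (illustrative helper only)."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        # Assumed definition: a strict majority of the replicas.
        return rf // 2 + 1
    if consistency_level == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {consistency_level}")


if __name__ == "__main__":
    # Compare QUORUM and ALL for a few replication factors.
    for rf in (1, 2, 3):
        print(rf, replicas_required("QUORUM", rf), replicas_required("ALL", rf))
```

Under these assumptions both levels require 2 replicas at RF=2, and they only diverge from RF=3 upward.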
On Sat, Dec 4, 2010 at 12:03 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> I think you are running into
> https://issues.apache.org/jira/browse/CASSANDRA-1316, where when an
> inconsistency on QUORUM/ALL is discovered it always performed the
> repair at QUORUM instead of the original CL. Thus, reading at ALL you
> would see the correct answer on the 2nd read but you weren't
> guaranteed to see it on the first.
>
> This was fixed in 0.6.4 but apparently I botched the merge to the 0.7
> branch. I corrected that just now, so when you update, you should be
> good to go.
>
> On Fri, Dec 3, 2010 at 9:19 PM, Dan Hendry <dan.hendry.j...@gmail.com> wrote:
> > I am seeing fairly strange behavior in my Cassandra cluster.
> >
> > Setup
> > - 3 nodes (let's call them nodes 1, 2 and 3)
> > - RF=2
> > - A set of servers (producers) which write data to the cluster at
> >   consistency level ONE
> > - A set of servers (consumers/processors) which read data from the
> >   cluster at consistency level ALL
> > - Cassandra 0.7 (recent out of the svn branch, post beta 3)
> > - Clients use the pelops library
> >
> > Situation:
> > - Everything is humming along nicely
> > - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors
> >   are the bane of my existence)
> > - Producers continue to happily write to the cluster but consumers start
> >   complaining by throwing TimeOutExceptions and UnavailableExceptions.
> > - I stagger out of bed in the middle of the night and restart Cassandra
> >   on node 3.
> > - The consumers stop complaining and get back to business but generate
> >   garbage data for the period node 3 was down. It's almost like half the
> >   data is missing half the time. (Again, I am reading at consistency
> >   level ALL).
> > - I force the consumers to reprocess data for the period node 3 was
> >   down. They generate accurate output which is different from the first
> >   time round.
> >
> > To be explicit, what seems to be happening is that the first read at
> > consistency level ALL gives "A,C,E" (for example) and the second read at
> > consistency level ALL gives "A,B,C,D,E". Is this a Cassandra bug? Is my
> > knowledge of consistency levels flawed? My understanding is that you
> > could achieve strongly consistent behavior by writing at ONE and reading
> > at ALL.
> >
> > After this experience, my theory (uneducated, untested, and
> > under-researched) is that "strong consistency" applies only to column
> > values, not the set of columns (or super-columns in this case) which
> > make up a row. Any thoughts?
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
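
The expectation in the quoted message, that writing at ONE and reading at ALL should be strongly consistent, rests on the usual overlap rule R + W > N. Here is a minimal sketch of that check. It assumes every read and write actually reaches the number of replicas its level implies, and `read_overlaps_write` is a hypothetical name, not a Cassandra API.

```python
def read_overlaps_write(w: int, r: int, n: int) -> bool:
    """True when any r replicas read must include one of the w replicas written.

    w: replicas acknowledging a write, r: replicas consulted on a read,
    n: replication factor. Illustrative only.
    """
    return r + w > n


if __name__ == "__main__":
    rf = 2  # replication factor from the setup described above
    # Write at ONE (w=1), read at ALL (r=rf).
    print("ONE write, ALL read:", read_overlaps_write(w=1, r=rf, n=rf))
    # Write at ONE, read at ONE, for comparison.
    print("ONE write, ONE read:", read_overlaps_write(w=1, r=1, n=rf))
```

The rule only says that a read at ALL touches a replica holding the latest write. It does not model the repair path described in CASSANDRA-1316.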