That looks like a bug. Can you create a ticket on https://issues.apache.org/jira/browse/CASSANDRA ?
Please include the C* version, the table and insert statements, and if you can repro it using CQL 3.

Thanks
Aaron

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/01/2013, at 8:10 AM, Elden Bishop <ebis...@exacttarget.com> wrote:

> Sure thing. Here is a console dump showing the error. Notice that column
> '9801' is NOT NULL on the first two queries but IS NULL on the last query. I
> get this behavior constantly on any writes that coincide with a flush. The
> column is always readable by itself but disappears depending on the other
> columns being queried.
>
> $
> $ bin/cqlsh -2
> cqlsh>
> cqlsh> SELECT '9801' FROM BUGS.Test WHERE KEY='a';
>
>  9801
> ---------------------
>  0.02271159951509616
>
> cqlsh> SELECT '9801','6814' FROM BUGS.Test WHERE KEY='a';
>
>  9801                | 6814
> ---------------------+--------------------
>  0.02271159951509616 | 0.6612351709326891
>
> cqlsh> SELECT '9801','6814','3333' FROM BUGS.Test WHERE KEY='a';
>
>  9801 | 6814               | 3333
> ------+--------------------+--------------------
>  null | 0.6612351709326891 | 0.8921380283891902
>
> cqlsh> exit;
> $
> $
>
> From: aaron morton <aa...@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Tuesday, January 29, 2013 12:21 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Cass returns Incorrect column data on writes during flushing
>
>> Ie. Query for a single column works but the column does not appear in slice
>> queries depending on the other columns in the query
>>
>> cfq.getKey("foo").getColumn("A") returns "A"
>> cfq.getKey("foo").withColumnSlice("A", "B") returns "B" only
>> cfq.getKey("foo").withColumnSlice("A","B","C") returns "A","B" and "C"
>
> Can you replicate this using cassandra-cli or CQL?
> That makes it clearer what's happening and removes any potential issues with
> the client or your code.
> If you cannot repro it, show your Astyanax code.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 29/01/2013, at 1:15 PM, Elden Bishop <ebis...@exacttarget.com> wrote:
>
>> I'm trying to track down some really worrying behavior. It appears that
>> writing multiple columns while a table flush is occurring can result in
>> Cassandra recording its data in a way that makes columns visible only to
>> some queries but not others.
>>
>> Ie. Query for a single column works but the column does not appear in slice
>> queries depending on the other columns in the query
>>
>> cfq.getKey("foo").getColumn("A") returns "A"
>> cfq.getKey("foo").withColumnSlice("A", "B") returns "B" only
>> cfq.getKey("foo").withColumnSlice("A","B","C") returns "A","B" and "C"
>>
>> This is a permanent condition, meaning that even hours later, with no reads
>> or writes, the DB will return the same results. I can reproduce this 100% of
>> the time by writing multiple columns and then reading a different set of
>> multiple columns. Columns written during the flush may or may not appear.
>>
>> Details
>>
>> # There are no log errors.
>> # All single-column queries return correct data.
>> # Slice queries may or may not return the column, depending on which other
>> columns are in the query.
>> # This is on a stock "unzip and run" installation of Cassandra using default
>> options only; basically doing the Cassandra getting-started tutorial and
>> using the Demo table described in that tutorial.
>> # Cassandra 1.2.0 using Astyanax and Java 1.6.0_37.
>> # There are no errors, but there is always a "flushing high-traffic column
>> family" message that appears right before the incoherent state occurs.
>> # To reproduce, just update multiple columns at the same time, using random
>> rows, and then verify the writes by reading multiple columns. I can
>> generate the error on 100% of runs.
>> Once the state is screwed up, the multi-column read will not contain the
>> column but the single-column read will.
>>
>> Log snippet
>>
>> INFO 15:47:49,066 GC for ParNew: 320 ms for 1 collections, 207199992 used;
>> max is 1052770304
>> INFO 15:47:58,076 GC for ParNew: 330 ms for 1 collections, 232839680 used;
>> max is 1052770304
>> INFO 15:48:00,374 flushing high-traffic column family CFS(Keyspace='BUGS',
>> ColumnFamily='Test') (estimated 50416978 bytes)
>> INFO 15:48:00,374 Enqueuing flush of
>> Memtable-Test@1575891161(4529586/50416978 serialized/live bytes, 279197 ops)
>> INFO 15:48:00,378 Writing Memtable-Test@1575891161(4529586/50416978
>> serialized/live bytes, 279197 ops)
>> INFO 15:48:01,142 GC for ParNew: 654 ms for 1 collections, 239478568 used;
>> max is 1052770304
>> INFO 15:48:01,474 Completed flushing
>> /var/lib/cassandra/data/BUGS/Test/BUGS-Test-ia-45-Data.db (4580066 bytes)
>> for commitlog position ReplayPosition(segmentId=1359415964165,
>> position=7462737)
>>
>>
>> Any ideas on what could be going on? I could not find anything like this in
>> the open bugs, and the only workaround seems to be never doing multi-column
>> reads or writes. I'm concerned that the DB can get into a state where
>> different queries can return such inconsistent results, all with no warning
>> or errors. There is no way to even verify data correctness; every column can
>> seem correct when queried and then disappear during slice queries depending
>> on the other columns in the query.
>>
>>
>> Thanks
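[Editor's note: the thread never includes a complete repro script. For anyone scripting the verification Elden describes, here is a minimal standalone sketch of just the consistency check itself (the `missingFromSlice` helper is hypothetical and not from the thread; the actual Astyanax reads need a live cluster, so plain maps stand in for query results): a column that is visible in a single-column read but absent from a slice read that covered it indicates the reported inconsistency.]

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SliceCheck {
    // Compare single-column read results against a slice read that queried
    // the same columns. Returns the columns that were readable on their own
    // but missing (null) in the slice result -- the symptom in this thread.
    static Set<String> missingFromSlice(Map<String, String> singleReads,
                                        List<String> sliceQueried,
                                        Map<String, String> sliceResult) {
        Set<String> missing = new TreeSet<>();
        for (String col : sliceQueried) {
            if (singleReads.containsKey(col) && !sliceResult.containsKey(col)) {
                missing.add(col);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Values mirror the cqlsh dump above: each column reads fine alone...
        Map<String, String> single = Map.of(
                "9801", "0.02271159951509616",
                "6814", "0.6612351709326891",
                "3333", "0.8921380283891902");
        // ...but '9801' comes back null in the three-column slice.
        Map<String, String> slice = Map.of(
                "6814", "0.6612351709326891",
                "3333", "0.8921380283891902");
        Set<String> missing =
                missingFromSlice(single, List.of("9801", "6814", "3333"), slice);
        System.out.println(missing); // prints [9801]
    }
}
```

In a real harness the two maps would be populated from `cfq.getKey(...).getColumn(...)` and `cfq.getKey(...).withColumnSlice(...)` results; a non-empty return value after a "flushing high-traffic column family" log line would confirm the bug.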