Hi Aaron,

I invalidated the caches but nothing changed. I didn't get the mentioned
log line either, but as I read the code, SliceByNamesReadCommand uses a
NamesQueryFilter, not a SliceQueryFilter.
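As a side note, the row key and column name in the DEBUG traces quoted further down are just hex-encoded ASCII. A quick Python sketch (hex strings copied verbatim from the quoted log) shows which row and column the failing read actually targets:

```python
# Hex strings copied verbatim from the SliceByNamesReadCommand debug trace.
row_key_hex = ("61636162626139322d396638312d343562382d"
               "396637352d393162303337383030393762")
column_hex = "574c303030375030"

# Both are plain ASCII, hex-encoded by cassandra-cli / the server log.
row_key = bytes.fromhex(row_key_hex).decode("ascii")
column = bytes.fromhex(column_hex).decode("ascii")

print(row_key)  # acabba92-9f81-45b8-9f75-91b03780097b
print(column)   # WL0007P0
```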
Next, there is only one SSTable. I can rule out that the row is deleted,
because I deleted all other rows in that CF to reduce data size and speed
up testing. I set GCGraceSeconds to zero and ran a compaction. All other
rows are gone, but I can still access at least one column from the
remaining row. So as far as I understand it, there should not be a
tombstone at row level.

To make it a list:
* One SSTable, one row
* Row is not deleted (other columns can be read, row survives compaction
  with GCGraceSeconds=0)
* Most columns can be read by get['row']['col'] from cassandra-cli
* Some columns cannot be read by get['row']['col'] from cassandra-cli,
  but can be found in the output of sstable2json
* Unreadable data survives compaction with GCGraceSeconds=0 (checked
  with sstable2json)
* Invalidating the caches does not help
* Nothing in the logs

Does that point in any direction where I should look next?

Best,

Thomas

On 10/11/2011 10:30 AM, aaron morton wrote:
> Nothing jumps out. The obvious answer is that the column has been deleted.
> Did you check all the SSTables?
>
> It looks like the query was returned from the row cache, otherwise you
> would see this as well…
>
> DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 1318294191654059:false:354@1318294191654861
>
> Which would mean a version of the column was found.
>
> If you invalidate the cache with nodetool, run the query, and the log
> message appears, it will mean the column was read from (all of the)
> sstables. If you do not get a column returned, I would say there is a
> tombstone in place. It's either a row-level or a column-level one.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 11/10/2011, at 10:35 AM, Thomas Richter wrote:
>
>> Hi Aaron,
>>
>> normally we use Hector to access Cassandra, but for debugging I switched
>> to cassandra-cli.
>>
>> The column cannot be read by a simple
>>
>> get CFName['rowkey']['colname'];
>>
>> The response is "Value was not found".
>> If I query another column, everything is just fine.
>>
>> Server log for the unsuccessful read (keyspace and CF names replaced):
>>
>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,739 CassandraServer.java
>> (line 280) get
>>
>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,744 StorageProxy.java (line
>> 320) Command/ConsistencyLevel is
>> SliceByNamesReadCommand(table='Keyspace',
>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>> superColumnName='null', columnName='null')',
>> columns=[574c303030375030,])/ONE
>>
>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 ReadCallback.java (line
>> 86) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
>>
>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 StorageProxy.java (line
>> 343) reading data locally
>>
>> DEBUG [ReadStage:33] 2011-10-10 23:15:29,751 StorageProxy.java (line
>> 448) LocalReadRunnable reading SliceByNamesReadCommand(table='Keyspace',
>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>> superColumnName='null', columnName='null')', columns=[574c303030375030,])
>>
>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,818 StorageProxy.java (line
>> 393) Read: 67 ms.
>>
>> The log looks fine to me, but no result is returned.
>>
>> Best,
>>
>> Thomas
>>
>> On 10/10/2011 10:00 PM, aaron morton wrote:
>>> How are they unreadable? You need to go into some detail about what is
>>> going wrong.
>>>
>>> What sort of read?
>>> What client?
>>> What is in the logging on the client and server side?
>>>
>>> Try turning the logging up to DEBUG on the server to watch what happens.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 10/10/2011, at 9:23 PM, Thomas Richter wrote:
>>>
>>>> Hi,
>>>>
>>>> no errors in the server logs. The columns are unreadable on all nodes at
>>>> any consistency level (ONE, QUORUM, ALL). We started with 0.7.3 and
>>>> upgraded to 0.7.6-2 two months ago.
>>>>
>>>> Best,
>>>>
>>>> Thomas
>>>>
>>>> On 10/10/2011 10:03 AM, aaron morton wrote:
>>>>> What error are you seeing in the server logs? Are the columns
>>>>> unreadable at all consistency levels, i.e. are the columns unreadable
>>>>> on all nodes?
>>>>>
>>>>> What is the upgrade history of the cluster? What version did it start
>>>>> at?
>>>>>
>>>>> Cheers
>>>>>
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Developer
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 10/10/2011, at 7:42 AM, Thomas Richter wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> here is some further information. Compaction did not help, but the
>>>>>> data is still there when I dump the row with sstable2json.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> On 10/08/2011 11:30 PM, Thomas Richter wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> we are running a 3-node Cassandra (0.7.6-2) cluster and some of our
>>>>>>> column families contain quite large rows (400k+ columns, 4-6 GB row
>>>>>>> size). The replication factor is 3 for all keyspaces. The cluster
>>>>>>> has been running fine for several months now and we never
>>>>>>> experienced any serious trouble.
>>>>>>>
>>>>>>> Some days ago we noticed that some previously written columns could
>>>>>>> not be read. This does not always happen, and only some dozen
>>>>>>> columns out of 400k are affected.
>>>>>>>
>>>>>>> After ruling out application logic as a cause, I dumped the row in
>>>>>>> question with sstable2json and the columns are there (and are not
>>>>>>> marked for deletion).
>>>>>>>
>>>>>>> The next step was setting up a fresh single-node cluster and copying
>>>>>>> the column family data to that node. The columns could not be read
>>>>>>> there either. Right now I'm running a nodetool compact for the CF to
>>>>>>> see if the data can be read afterwards.
>>>>>>>
>>>>>>> Is there any explanation for such behavior? Are there any
>>>>>>> suggestions for further investigation?
>>>>>>>
>>>>>>> TIA,
>>>>>>>
>>>>>>> Thomas
>>>>>>
>>>>>
>>>>
>>>
>>
>
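For anyone wanting to repeat the sstable2json cross-check described in this thread, it can be scripted. The sketch below is a rough approximation, not a definitive parser: it assumes the 0.7-era sstable2json shape where each hex-encoded row key maps to a list of [name, value, timestamp, optional flag] arrays, with a trailing "d" marking a column-level tombstone. The sample dump, row key, and column names here are made up for illustration:

```python
import json

# Hypothetical sample in the assumed shape of Cassandra 0.7 sstable2json
# output: {hex_row_key: [[col_name, value, timestamp, (flag)], ...]},
# where a trailing "d" flag marks a column-level tombstone.
dump = json.loads("""
{
  "61636162": [
    ["574c303030375030", "736f6d652076616c7565", 1318294191654861],
    ["64656164", "", 1318294191654862, "d"]
  ]
}
""")

def find_column(dump, row_key_hex, col_name_hex):
    """Return (found, deleted) for a column in a sstable2json dump."""
    for name, value, timestamp, *flags in dump.get(row_key_hex, []):
        if name == col_name_hex:
            return True, "d" in flags
    return False, False

# The live column is present and not tombstoned; the other one is deleted.
print(find_column(dump, "61636162", "574c303030375030"))  # (True, False)
print(find_column(dump, "61636162", "64656164"))          # (True, True)
```

A check like this would make the "column is in the dump but not readable" observation mechanical instead of eyeballing multi-gigabyte JSON output.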