Hi Aaron,

I guess I found it :-).

I added logging for the IndexInfo used in
SSTableNamesIterator.readIndexedColumns and got negative index positions
for the missing columns. This is why the columns are not
loaded from the sstable.
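
The extra logging was just something like this (info being the
IndexHelper.IndexInfo for the entry about to be read):

// offset is a file position within the row and should never be negative
logger.debug("IndexInfo offset=" + info.offset + " width=" + info.width);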

So I had a look at ColumnIndexer.serializeInternal and there it is:

int endPosition = 0, startPosition = -1;

Should be:

long endPosition = 0, startPosition = -1;
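
With our 4-6GB rows the running offset exceeds Integer.MAX_VALUE, so the
int wraps around and goes negative, which matches the negative positions
I logged. A minimal sketch of the wrap-around (column count and size are
made up, this is not the actual Cassandra code):

public class IndexOffsetOverflow {
    public static void main(String[] args) {
        int intPosition = 0;   // as currently declared in serializeInternal
        long longPosition = 0; // as in the proposed fix
        int columnSize = 16 * 1024; // assume ~16KB per serialized column
        for (int i = 0; i < 400000; i++) { // ~400k columns, as in our rows
            intPosition += columnSize;  // silently wraps past 2^31-1
            longPosition += columnSize;
        }
        System.out.println(intPosition);  // -2036334592: a negative "position"
        System.out.println(longPosition); // 6553600000: the real offset
    }
}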

I'm currently running a compaction with a fixed version to verify.

Best,

Thomas

On 10/12/2011 11:54 PM, aaron morton wrote:
> Sounds a lot like the column is deleted. 
> 
> IIRC this is where the columns from various SSTables are reduced
> https://github.com/apache/cassandra/blob/cassandra-0.8/src/java/org/apache/cassandra/db/filter/QueryFilter.java#L117
> 
> The call to ColumnFamily.addColumn() is where the column instance may be 
> merged with other instances. 
> 
> A 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 13/10/2011, at 5:33 AM, Thomas Richter wrote:
> 
>> Hi Aaron,
>>
>> I cannot read the column with a slice query.
>> The slice query only returns data up to a certain column; after that I
>> only get empty results.
>>
>> I added log output to QueryFilter.isRelevant to see if the filter is
>> dropping the column(s), but they don't even show up there.
>>
>> The next thing I will check is the diff between the columns contained in
>> the json export and the columns fetched with the slice query; maybe that
>> will give more of a clue...
>>
>> Any other ideas where to place more debugging output to see what's
>> happening?
>>
>> Best,
>>
>> Thomas
>>
>> On 10/11/2011 12:46 PM, aaron morton wrote:
>>> kewl, 
>>>
>>>> * Row is not deleted (other columns can be read, row survives compaction
>>>> with GCGraceSeconds=0)
>>>
>>> IIRC row tombstones can hang around for a while (until gc grace has 
>>> passed), and they only have an effect on columns that have a lower 
>>> timestamp. So it's possible to read columns from a row with a tombstone. 
>>>
>>> Can you read the column using a slice range rather than specifying its 
>>> name ? 
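>>> 
>>> Something like this with Hector, for example (just a sketch, assuming
>>> String keys / comparator / values; swap in the serializers that match
>>> your schema):
>>> 
>>> import me.prettyprint.cassandra.serializers.StringSerializer;
>>> import me.prettyprint.hector.api.Cluster;
>>> import me.prettyprint.hector.api.Keyspace;
>>> import me.prettyprint.hector.api.beans.HColumn;
>>> import me.prettyprint.hector.api.factory.HFactory;
>>> import me.prettyprint.hector.api.query.SliceQuery;
>>> 
>>> public class SliceCheck {
>>>     public static void main(String[] args) {
>>>         Cluster cluster = HFactory.getOrCreateCluster("test", "localhost:9160");
>>>         Keyspace ks = HFactory.createKeyspace("Keyspace", cluster);
>>>         StringSerializer se = StringSerializer.get();
>>>         // ask for a range of columns rather than one column by name
>>>         SliceQuery<String, String, String> query =
>>>                 HFactory.createSliceQuery(ks, se, se, se);
>>>         query.setColumnFamily("ColumnFamily");
>>>         query.setKey("rowkey");
>>>         query.setRange("", "", false, 100); // first 100 columns of the row
>>>         for (HColumn<String, String> col : query.execute().get().getColumns()) {
>>>             System.out.println(col.getName());
>>>         }
>>>     }
>>> }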
>>>
>>> Aaron
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 11/10/2011, at 11:15 PM, Thomas Richter wrote:
>>>
>>>> Hi Aaron,
>>>>
>>>> I invalidated the caches but nothing changed. I didn't get the mentioned
>>>> log line either, but as I read the code, SliceByNamesReadCommand uses
>>>> NamesQueryFilter and not SliceQueryFilter.
>>>>
>>>> Next, there is only one SSTable.
>>>>
>>>> I can rule out that the row is deleted because I deleted all other rows
>>>> in that CF to reduce data size and speed up testing. I set
>>>> GCGraceSeconds to zero and ran a compaction. All other rows are gone,
>>>> but I can still access at least one column from the remaining row.
>>>> So as far as I understand it, there should not be a row-level tombstone.
>>>>
>>>> To make it a list:
>>>>
>>>> * One SSTable, one row
>>>> * Row is not deleted (other columns can be read, row survives compaction
>>>> with GCGraceSeconds=0)
>>>> * Most columns can be read by get['row']['col'] from cassandra-cli
>>>> * Some columns cannot be read by get['row']['col'] from cassandra-cli
>>>> but can be found in the output of sstable2json
>>>> * Unreadable data survives compaction with GCGraceSeconds=0 (checked
>>>> with sstable2json)
>>>> * Invalidating the caches does not help
>>>> * Nothing in the logs
>>>>
>>>> Does that point in any direction where I should look next?
>>>>
>>>> Best,
>>>>
>>>> Thomas
>>>>
>>>> On 10/11/2011 10:30 AM, aaron morton wrote:
>>>>> Nothing jumps out. The obvious answer is that the column has been 
>>>>> deleted. Did you check all the SSTables ?
>>>>>
>>>>> It looks like the query was served from the row cache; otherwise you 
>>>>> would see this as well…
>>>>>
>>>>> DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java (line 
>>>>> 123) collecting 0 of 2147483647: 
>>>>> 1318294191654059:false:354@1318294191654861
>>>>>
>>>>> Which would mean a version of the column was found. 
>>>>>
>>>>> If you invalidate the cache with nodetool, run the query, and the log 
>>>>> message appears, it will mean the column was read from (all of the) 
>>>>> sstables. If you do not get a column returned, I would say there is a 
>>>>> tombstone in place. It's either a row-level or a column-level one.  
>>>>>
>>>>> Hope that helps. 
>>>>>
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Developer
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 11/10/2011, at 10:35 AM, Thomas Richter wrote:
>>>>>
>>>>>> Hi Aaron,
>>>>>>
>>>>>> Normally we use Hector to access Cassandra, but for debugging I switched
>>>>>> to cassandra-cli.
>>>>>>
>>>>>> The column cannot be read by a simple
>>>>>> get CFName['rowkey']['colname'];
>>>>>>
>>>>>> The response is "Value was not found".
>>>>>> If I query another column, everything is just fine.
>>>>>>
>>>>>> Server log for an unsuccessful read (keyspace and CF names replaced):
>>>>>>
>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,739 CassandraServer.java
>>>>>> (line 280) get
>>>>>>
>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,744 StorageProxy.java (line
>>>>>> 320) Command/ConsistencyLevel is
>>>>>> SliceByNamesReadCommand(table='Keyspace',
>>>>>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>>>>>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>>>>>> superColumnName='null', columnName='null')',
>>>>>> columns=[574c303030375030,])/ONE
>>>>>>
>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 ReadCallback.java (line
>>>>>> 86) Blockfor/repair is 1/true; setting up requests to localhost/127.0.0.1
>>>>>>
>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 StorageProxy.java (line
>>>>>> 343) reading data locally
>>>>>>
>>>>>> DEBUG [ReadStage:33] 2011-10-10 23:15:29,751 StorageProxy.java (line
>>>>>> 448) LocalReadRunnable reading SliceByNamesReadCommand(table='Keyspace',
>>>>>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>>>>>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>>>>>> superColumnName='null', columnName='null')', columns=[574c303030375030,])
>>>>>>
>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,818 StorageProxy.java (line
>>>>>> 393) Read: 67 ms.
>>>>>>
>>>>>> The log looks fine to me, but no result is returned.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Thomas
>>>>>>
>>>>>> On 10/10/2011 10:00 PM, aaron morton wrote:
>>>>>>> How are they unreadable ? You need to go into some detail about what 
>>>>>>> is going wrong. 
>>>>>>>
>>>>>>> What sort of read ? 
>>>>>>> What client ? 
>>>>>>> What is in the logging on client and server side ? 
>>>>>>>
>>>>>>>
>>>>>>> Try turning the logging up to DEBUG on the server to watch what 
>>>>>>> happens. 
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> -----------------
>>>>>>> Aaron Morton
>>>>>>> Freelance Cassandra Developer
>>>>>>> @aaronmorton
>>>>>>> http://www.thelastpickle.com
>>>>>>>
>>>>>>> On 10/10/2011, at 9:23 PM, Thomas Richter wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> no errors in the server logs. The columns are unreadable on all nodes 
>>>>>>>> at
>>>>>>>> any consistency level (ONE, QUORUM, ALL). We started with 0.7.3 and
>>>>>>>> upgraded to 0.7.6-2 two months ago.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Thomas
>>>>>>>>
>>>>>>>> On 10/10/2011 10:03 AM, aaron morton wrote:
>>>>>>>>> What error are you seeing  in the server logs ? Are the columns 
>>>>>>>>> unreadable at all Consistency Levels ? i.e. are the columns 
>>>>>>>>> unreadable on all nodes.
>>>>>>>>>
>>>>>>>>> What is the upgrade history of the cluster ? What version did it 
>>>>>>>>> start at ? 
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----------------
>>>>>>>>> Aaron Morton
>>>>>>>>> Freelance Cassandra Developer
>>>>>>>>> @aaronmorton
>>>>>>>>> http://www.thelastpickle.com
>>>>>>>>>
>>>>>>>>> On 10/10/2011, at 7:42 AM, Thomas Richter wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> here is some further information. Compaction did not help, but the
>>>>>>>>>> data is still there when I dump the row with sstable2json.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Thomas
>>>>>>>>>>
>>>>>>>>>> On 10/08/2011 11:30 PM, Thomas Richter wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> we are running a 3-node Cassandra (0.7.6-2) cluster and some of our
>>>>>>>>>>> column families contain quite large rows (400k+ columns, 4-6GB row
>>>>>>>>>>> size). The replication factor is 3 for all keyspaces. The cluster
>>>>>>>>>>> has been running fine for several months now and we have never
>>>>>>>>>>> experienced any serious trouble.
>>>>>>>>>>>
>>>>>>>>>>> Some days ago we noticed that some previously written columns could
>>>>>>>>>>> not be read. This does not always happen, and only a few dozen
>>>>>>>>>>> columns out of 400k are affected.
>>>>>>>>>>>
>>>>>>>>>>> After ruling out application logic as a cause I dumped the row in
>>>>>>>>>>> question with sstable2json and the columns are there (and are not 
>>>>>>>>>>> marked
>>>>>>>>>>> for deletion).
>>>>>>>>>>>
>>>>>>>>>>> The next step was setting up a fresh single-node cluster and copying
>>>>>>>>>>> the column family data to that node. The columns could not be read
>>>>>>>>>>> there either. Right now I'm running a nodetool compact on the CF to
>>>>>>>>>>> see if the data can be read afterwards.
>>>>>>>>>>>
>>>>>>>>>>> Is there any explanation for such behavior? Are there any 
>>>>>>>>>>> suggestions
>>>>>>>>>>> for further investigation?
>>>>>>>>>>>
>>>>>>>>>>> TIA,
>>>>>>>>>>>
>>>>>>>>>>> Thomas
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 
> 
