Thanks for the hint. Ticket created: https://issues.apache.org/jira/browse/CASSANDRA-3358
Best,
Thomas

On 10/13/2011 03:27 PM, Sylvain Lebresne wrote:
> JIRA is not read-only, you should be able to create a ticket at
> https://issues.apache.org/jira/browse/CASSANDRA, though
> that probably requires that you create an account.
>
> --
> Sylvain
>
> On Thu, Oct 13, 2011 at 3:20 PM, Thomas Richter <t...@tricnet.de> wrote:
>> Hi Aaron,
>>
>> the fix does the trick. I wonder why nobody else ran into this before...
>> I checked org/apache/cassandra/db/ColumnIndexer.java in 0.7.9, 0.8.7 and
>> 1.0.0-rc2, and all seem to be affected.
>>
>> Looks like the public JIRA is read-only - so I'm not sure how to continue.
>>
>> Best,
>>
>> Thomas
>>
>> On 10/13/2011 10:52 AM, Thomas Richter wrote:
>>> Hi Aaron,
>>>
>>> I guess I found it :-).
>>>
>>> I added logging for the used IndexInfo to
>>> SSTableNamesIterator.readIndexedColumns and got negative index positions
>>> for the missing columns. This is the reason why the columns are not
>>> loaded from the sstable.
>>>
>>> So I had a look at ColumnIndexer.serializeInternal and there it is:
>>>
>>> int endPosition = 0, startPosition = -1;
>>>
>>> Should be:
>>>
>>> long endPosition = 0, startPosition = -1;
>>>
>>> I'm currently running a compaction with a fixed version to verify.
>>>
>>> Best,
>>>
>>> Thomas
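For reference, a minimal self-contained sketch of the overflow described above (illustrative only; the class name and the 300MB chunk size are made up, and only the int-vs-long declaration comes from ColumnIndexer.serializeInternal): once the running byte position in a serialized row passes 2^31 - 1 (about 2.1GB), a 32-bit position wraps negative, which matches the negative IndexInfo positions Thomas logged.

    public class IndexPositionOverflowDemo {
        public static void main(String[] args) {
            int endPosition = 0;           // the buggy 32-bit position
            long fixedEndPosition = 0;     // the fixed 64-bit position
            int chunk = 300 * 1024 * 1024; // pretend each step indexes ~300MB of columns

            for (int i = 0; i < 8; i++) {  // 8 x 300MB = ~2.4GB of row data
                endPosition += chunk;      // silently wraps past Integer.MAX_VALUE
                fixedEndPosition += chunk;
            }
            System.out.println(endPosition);      // -1778384896 (wrapped negative)
            System.out.println(fixedEndPosition); // 2516582400 (the real offset)
        }
    }

Any row whose serialized size exceeds Integer.MAX_VALUE bytes is at risk, which fits the 4-6GB rows described at the bottom of this thread.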
>>> On 10/12/2011 11:54 PM, aaron morton wrote:
>>>> Sounds a lot like the column is deleted.
>>>>
>>>> IIRC this is where the columns from various SSTables are reduced:
>>>> https://github.com/apache/cassandra/blob/cassandra-0.8/src/java/org/apache/cassandra/db/filter/QueryFilter.java#L117
>>>>
>>>> The call to ColumnFamily.addColumn() is where the column instance may be
>>>> merged with other instances.
>>>>
>>>> A
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 13/10/2011, at 5:33 AM, Thomas Richter wrote:
>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> I cannot read the column with a slice query.
>>>>> The slice query only returns data up to a certain column, and after that
>>>>> I only get empty results.
>>>>>
>>>>> I added log output to QueryFilter.isRelevant to see if the filter is
>>>>> dropping the column(s), but it doesn't even show up there.
>>>>>
>>>>> Next thing I will check is the diff between the columns contained in the
>>>>> json export and the columns fetched with the slice query; maybe this
>>>>> gives more of a clue...
>>>>>
>>>>> Any other ideas where to place more debugging output to see what's
>>>>> happening?
>>>>>
>>>>> Best,
>>>>>
>>>>> Thomas
>>>>>
>>>>> On 10/11/2011 12:46 PM, aaron morton wrote:
>>>>>> kewl,
>>>>>>
>>>>>>> * Row is not deleted (other columns can be read, row survives compaction
>>>>>>> with GCGraceSeconds=0)
>>>>>>
>>>>>> IIRC row tombstones can hang around for a while (until gc grace has
>>>>>> passed), and they only have an effect on columns that have a lower
>>>>>> timestamp. So it's possible to read columns from a row with a tombstone.
>>>>>>
>>>>>> Can you read the column using a slice range rather than specifying its
>>>>>> name?
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>>> -----------------
>>>>>> Aaron Morton
>>>>>> Freelance Cassandra Developer
>>>>>> @aaronmorton
>>>>>> http://www.thelastpickle.com
>>>>>>
>>>>>> On 11/10/2011, at 11:15 PM, Thomas Richter wrote:
>>>>>>
>>>>>>> Hi Aaron,
>>>>>>>
>>>>>>> I invalidated the caches but nothing changed.
>>>>>>> I didn't get the mentioned log line either, but as I read the code,
>>>>>>> SliceByNamesReadCommand uses NamesQueryFilter and not SliceQueryFilter.
>>>>>>>
>>>>>>> Next, there is only one SSTable.
>>>>>>>
>>>>>>> I can rule out that the row is deleted because I deleted all other rows
>>>>>>> in that CF to reduce data size and speed up testing. I set
>>>>>>> GCGraceSeconds to zero and ran a compaction. All other rows are gone,
>>>>>>> but I can still access at least one column from the remaining row.
>>>>>>> So as far as I understand it, there should not be a row-level tombstone.
>>>>>>>
>>>>>>> To make it a list:
>>>>>>>
>>>>>>> * One SSTable, one row
>>>>>>> * Row is not deleted (other columns can be read, row survives compaction
>>>>>>> with GCGraceSeconds=0)
>>>>>>> * Most columns can be read by get['row']['col'] from cassandra-cli
>>>>>>> * Some columns can not be read by get['row']['col'] from cassandra-cli,
>>>>>>> but can be found in the output of sstable2json
>>>>>>> * Unreadable data survives compaction with GCGraceSeconds=0 (checked
>>>>>>> with sstable2json)
>>>>>>> * Invalidating the caches does not help
>>>>>>> * Nothing in the logs
>>>>>>>
>>>>>>> Does that point in any direction where I should look next?
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Thomas
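On the json-export vs. slice-query diff mentioned above: assuming the column names from sstable2json and from the slice query have each been dumped into a plain-text file, one name per line (both file names below are hypothetical), a few lines of Java print the columns that exist on disk but were never returned:

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashSet;
    import java.util.Set;

    public class ColumnNameDiff {
        public static void main(String[] args) throws Exception {
            // One column name per line in each file (hypothetical paths).
            Set<String> onDisk =
                new HashSet<>(Files.readAllLines(Paths.get("sstable2json-names.txt")));
            Set<String> returned =
                new HashSet<>(Files.readAllLines(Paths.get("slice-query-names.txt")));

            // Names present in the sstable dump but missing from the slice result.
            onDisk.removeAll(returned);
            for (String name : onDisk) {
                System.out.println(name);
            }
        }
    }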
>>>>>>> On 10/11/2011 10:30 AM, aaron morton wrote:
>>>>>>>> Nothing jumps out. The obvious answer is that the column has been
>>>>>>>> deleted. Did you check all the SSTables?
>>>>>>>>
>>>>>>>> It looks like the query was answered from the row cache, otherwise you
>>>>>>>> would see this as well…
>>>>>>>>
>>>>>>>> DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java
>>>>>>>> (line 123) collecting 0 of 2147483647:
>>>>>>>> 1318294191654059:false:354@1318294191654861
>>>>>>>>
>>>>>>>> Which would mean a version of the column was found.
>>>>>>>>
>>>>>>>> If you invalidate the cache with nodetool, run the query, and the log
>>>>>>>> message appears, it will mean the column was read from (all of the)
>>>>>>>> sstables. If you do not get a column returned, I would say there is a
>>>>>>>> tombstone in place - either a row-level or a column-level one.
>>>>>>>>
>>>>>>>> Hope that helps.
>>>>>>>>
>>>>>>>> -----------------
>>>>>>>> Aaron Morton
>>>>>>>> Freelance Cassandra Developer
>>>>>>>> @aaronmorton
>>>>>>>> http://www.thelastpickle.com
>>>>>>>>
>>>>>>>> On 11/10/2011, at 10:35 AM, Thomas Richter wrote:
>>>>>>>>
>>>>>>>>> Hi Aaron,
>>>>>>>>>
>>>>>>>>> normally we use Hector to access Cassandra, but for debugging I
>>>>>>>>> switched to cassandra-cli.
>>>>>>>>>
>>>>>>>>> The column can not be read by a simple
>>>>>>>>> get CFName['rowkey']['colname'];
>>>>>>>>>
>>>>>>>>> The response is "Value was not found".
>>>>>>>>> If I query another column, everything is just fine.
>>>>>>>>>
>>>>>>>>> Server log for the unsuccessful read (keyspace and CF names replaced):
>>>>>>>>>
>>>>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,739 CassandraServer.java
>>>>>>>>> (line 280) get
>>>>>>>>>
>>>>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,744 StorageProxy.java
>>>>>>>>> (line 320) Command/ConsistencyLevel is
>>>>>>>>> SliceByNamesReadCommand(table='Keyspace',
>>>>>>>>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>>>>>>>>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>>>>>>>>> superColumnName='null', columnName='null')',
>>>>>>>>> columns=[574c303030375030,])/ONE
>>>>>>>>>
>>>>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 ReadCallback.java
>>>>>>>>> (line 86) Blockfor/repair is 1/true; setting up requests to
>>>>>>>>> localhost/127.0.0.1
>>>>>>>>>
>>>>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,750 StorageProxy.java
>>>>>>>>> (line 343) reading data locally
>>>>>>>>>
>>>>>>>>> DEBUG [ReadStage:33] 2011-10-10 23:15:29,751 StorageProxy.java (line
>>>>>>>>> 448) LocalReadRunnable reading
>>>>>>>>> SliceByNamesReadCommand(table='Keyspace',
>>>>>>>>> key=61636162626139322d396638312d343562382d396637352d393162303337383030393762,
>>>>>>>>> columnParent='QueryPath(columnFamilyName='ColumnFamily',
>>>>>>>>> superColumnName='null', columnName='null')',
>>>>>>>>> columns=[574c303030375030,])
>>>>>>>>>
>>>>>>>>> DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,818 StorageProxy.java
>>>>>>>>> (line 393) Read: 67 ms.
>>>>>>>>>
>>>>>>>>> The log looks fine to me, but no result is returned.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Thomas
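A side note for anyone reading these log lines: the row key and the column names are printed hex-encoded. Assuming the names are plain ASCII (as they appear to be here), a tiny standalone decoder makes them readable - the queried column 574c303030375030, for instance, decodes to WL0007P0:

    public class HexDecode {
        public static void main(String[] args) {
            String hex = "574c303030375030"; // column name from the log above
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < hex.length(); i += 2) {
                // Each pair of hex digits is one ASCII byte.
                sb.append((char) Integer.parseInt(hex.substring(i, i + 2), 16));
            }
            System.out.println(sb); // prints: WL0007P0
        }
    }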
>>>>>>>>> On 10/10/2011 10:00 PM, aaron morton wrote:
>>>>>>>>>> How are they unreadable? You need to go into some detail about
>>>>>>>>>> what is going wrong.
>>>>>>>>>>
>>>>>>>>>> What sort of read?
>>>>>>>>>> What client?
>>>>>>>>>> What is in the logging on the client and server side?
>>>>>>>>>>
>>>>>>>>>> Try turning the logging up to DEBUG on the server to watch what
>>>>>>>>>> happens.
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> -----------------
>>>>>>>>>> Aaron Morton
>>>>>>>>>> Freelance Cassandra Developer
>>>>>>>>>> @aaronmorton
>>>>>>>>>> http://www.thelastpickle.com
>>>>>>>>>>
>>>>>>>>>> On 10/10/2011, at 9:23 PM, Thomas Richter wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> no errors in the server logs. The columns are unreadable on all
>>>>>>>>>>> nodes at any consistency level (ONE, QUORUM, ALL). We started with
>>>>>>>>>>> 0.7.3 and upgraded to 0.7.6-2 two months ago.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Thomas
>>>>>>>>>>>
>>>>>>>>>>> On 10/10/2011 10:03 AM, aaron morton wrote:
>>>>>>>>>>>> What error are you seeing in the server logs? Are the columns
>>>>>>>>>>>> unreadable at all Consistency Levels, i.e. are the columns
>>>>>>>>>>>> unreadable on all nodes?
>>>>>>>>>>>>
>>>>>>>>>>>> What is the upgrade history of the cluster? What version did it
>>>>>>>>>>>> start at?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers
>>>>>>>>>>>>
>>>>>>>>>>>> -----------------
>>>>>>>>>>>> Aaron Morton
>>>>>>>>>>>> Freelance Cassandra Developer
>>>>>>>>>>>> @aaronmorton
>>>>>>>>>>>> http://www.thelastpickle.com
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/10/2011, at 7:42 AM, Thomas Richter wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> here is some further information. Compaction did not help, but the
>>>>>>>>>>>>> data is still there when I dump the row with sstable2json.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thomas
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/08/2011 11:30 PM, Thomas Richter wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> we are running a 3-node Cassandra (0.7.6-2) cluster, and some of
>>>>>>>>>>>>>> our column families contain quite large rows (400k+ columns,
>>>>>>>>>>>>>> 4-6GB row size).
>>>>>>>>>>>>>> The replication factor is 3 for all keyspaces. The cluster has
>>>>>>>>>>>>>> been running fine for several months now and we never
>>>>>>>>>>>>>> experienced any serious trouble.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some days ago we noticed that some previously written columns
>>>>>>>>>>>>>> could not be read. This does not always happen, and only some
>>>>>>>>>>>>>> dozen columns out of 400k are affected.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After ruling out application logic as a cause, I dumped the row
>>>>>>>>>>>>>> in question with sstable2json, and the columns are there (and
>>>>>>>>>>>>>> are not marked for deletion).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Next I set up a fresh single-node cluster and copied the column
>>>>>>>>>>>>>> family data to that node. The columns could not be read there
>>>>>>>>>>>>>> either.
>>>>>>>>>>>>>> Right now I'm running a nodetool compact for the CF to see if
>>>>>>>>>>>>>> the data can be read afterwards.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there any explanation for such behavior? Are there any
>>>>>>>>>>>>>> suggestions for further investigation?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TIA,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thomas
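Tying the thread together: a named-column read trusts the per-row column index and seeks straight to the recorded offset, while sstable2json walks the row front to back, which is why the dump still shows columns the index mis-points to. The sketch below is not the actual Cassandra read path, just the raw effect of squeezing a greater-than-2GB offset through an int (file name and offsets are made up; the file stays sparse on most filesystems):

    import java.io.File;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class TruncatedSeekDemo {
        public static void main(String[] args) throws IOException {
            File f = File.createTempFile("bigrow", ".bin");
            f.deleteOnExit();
            RandomAccessFile row = new RandomAccessFile(f, "rw");
            try {
                long offset = 2_500_000_000L;  // a "column" 2.5GB into the row
                row.setLength(offset + 8);     // sparse file: no real 2.5GB written
                row.seek(offset);
                row.writeLong(42);             // the value we want back

                row.seek(offset);              // 64-bit offset: works
                System.out.println(row.readLong()); // 42

                int truncated = (int) offset;  // what a 32-bit index would store
                System.out.println(truncated); // -1794967296
                row.seek(truncated);           // throws IOException: negative seek
            } finally {
                row.close();
            }
        }
    }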