It was not something you did wrong. The key cache format/classes involved
changed, there are a few jira issues around this:

https://issues.apache.org/jira/browse/CASSANDRA-4916
https://issues.apache.org/jira/browse/CASSANDRA-5253

Depending on how you moved between version you may or may not have been
effected.


On Sun, Apr 7, 2013 at 4:56 AM, Arya Goudarzi <gouda...@gmail.com> wrote:

> Yes, I know blowing them away would fix it and that is what I did, but I
> want to understand why this happens in first place. I was upgrading from
> 1.1.10 to 1.2.3
>
>
> On Fri, Apr 5, 2013 at 2:53 PM, Edward Capriolo <edlinuxg...@gmail.com>wrote:
>
>> This has happened before the save caches files were not compatible
>> between 0.6 and 0.7. I have ran into this a couple other times before. The
>> good news is the save key cache is just an optimization, you can blow it
>> away and it is not usually a big deal.
>>
>>
>>
>>
>> On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi <gouda...@gmail.com> wrote:
>>
>>> Here is a chunk of bloom filter sstable skip messages from the node I
>>> enabled DEBUG on:
>>>
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39459
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39483
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39332
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39335
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39438
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39478
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39456
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39469
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39334
>>> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
>>> 737) Bloom filter allows skipping sstable 39406
>>>
>>> This is the last chunk of log before C* gets stuck, right before I stop
>>> the process, remove key caches and start again (This is from another node
>>> that I upgraded 2 days ago):
>>>
>>> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java
>>> (line 166) Opening
>>> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499
>>> (5273270 bytes)
>>>  INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java
>>> (line 166) Opening
>>> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755
>>> (5264359 bytes)
>>>  INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java
>>> (line 166) Opening
>>> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762
>>> (5260887 bytes)
>>>  INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java
>>> (line 166) Opening
>>> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886
>>> (5262864 bytes)
>>>  INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java
>>> (line 112) reading saved cache
>>> /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache
>>>
>>>
>>> I finally upgrade all 12 nodes in our test environment yesterday. This
>>> issue seemed to exists on 7 nodes out of 12 nodes. They didn't alway get
>>> stuck on the same CF loading its saved KeyCache.
>>>
>>>
>>> On Fri, Apr 5, 2013 at 9:56 AM, aaron morton <aa...@thelastpickle.com>wrote:
>>>
>>>> skipping sstable due to bloom filter debug messages
>>>>
>>>> What were these messages?
>>>>
>>>> Do you have the logs from the start up ?
>>>>
>>>> Cheers
>>>>
>>>>    -----------------
>>>> Aaron Morton
>>>> Freelance Cassandra Consultant
>>>> New Zealand
>>>>
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 4/04/2013, at 6:11 AM, Arya Goudarzi <gouda...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have upgraded 2 nodes out of a 12 mode test cluster from 1.1.10 to
>>>> 1.2.3. During startup while tailing C*'s system.log, I observed a series of
>>>> SSTable batch load messages and skipping sstable due to bloom filter debug
>>>> messages which is normal for startup, but when it reached loading saved key
>>>> caches, it gets stuck forever. The I/O wait stays high in the CPU graph and
>>>> I/O ops are sent to disk, but C* never passes that step of loading the key
>>>> cache file successfully. The saved key cache file was about 75MB on one
>>>> node and 125MB on the other node and they were for different CFs.
>>>>
>>>> <image.jpeg>
>>>>
>>>> The CPU I/O wait constantly stayed at 40%~ while system.log was stuck
>>>> at loading one saved key cache file. I have marked that on the graph above.
>>>> The workaround was to delete the saved cache files and things loaded fine
>>>> (See marked Normal Startup).
>>>>
>>>> These machines are m1.xlarge EC2 instances. And this issue happened on
>>>> both nodes upgraded. This did not happen during exercise of upgrade from
>>>> 1.1.6 to 1.2.2 using the same snapshot.
>>>>
>>>> Should I raise a JIRA?
>>>>
>>>> -Arya
>>>>
>>>>
>>>>
>>>
>>
>

Reply via email to