Here is a chunk of bloom filter sstable skip messages from the node I
enabled DEBUG on:

DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39459
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39483
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39332
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39335
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39438
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39478
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39456
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39469
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39334
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39406

This is the last chunk of log before C* gets stuck, right before I stop the
process, remove key caches and start again (This is from another node that
I upgraded 2 days ago):

INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499
(5273270 bytes)
 INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755
(5264359 bytes)
 INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762
(5260887 bytes)
 INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886
(5262864 bytes)
 INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line
112) reading saved cache
/var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache


I finally upgrade all 12 nodes in our test environment yesterday. This
issue seemed to exists on 7 nodes out of 12 nodes. They didn't alway get
stuck on the same CF loading its saved KeyCache.


On Fri, Apr 5, 2013 at 9:56 AM, aaron morton <aa...@thelastpickle.com>wrote:

> skipping sstable due to bloom filter debug messages
>
> What were these messages?
>
> Do you have the logs from the start up ?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/04/2013, at 6:11 AM, Arya Goudarzi <gouda...@gmail.com> wrote:
>
> Hi,
>
> I have upgraded 2 nodes out of a 12 mode test cluster from 1.1.10 to
> 1.2.3. During startup while tailing C*'s system.log, I observed a series of
> SSTable batch load messages and skipping sstable due to bloom filter debug
> messages which is normal for startup, but when it reached loading saved key
> caches, it gets stuck forever. The I/O wait stays high in the CPU graph and
> I/O ops are sent to disk, but C* never passes that step of loading the key
> cache file successfully. The saved key cache file was about 75MB on one
> node and 125MB on the other node and they were for different CFs.
>
> <image.jpeg>
>
> The CPU I/O wait constantly stayed at 40%~ while system.log was stuck at
> loading one saved key cache file. I have marked that on the graph above.
> The workaround was to delete the saved cache files and things loaded fine
> (See marked Normal Startup).
>
> These machines are m1.xlarge EC2 instances. And this issue happened on
> both nodes upgraded. This did not happen during exercise of upgrade from
> 1.1.6 to 1.2.2 using the same snapshot.
>
> Should I raise a JIRA?
>
> -Arya
>
>
>

Reply via email to