Hi all,

Since last November, CPU load on some nodes in our cluster has spiked occasionally. The spikes are triggered by requests for rows that reside on two specific sstables.

We confirmed the following (during a spike):

version: 1.0.7 (current) <- 0.8.6 <- 0.8.5 <- 0.7.8
jdk: Oracle 1.6.0

1. Profiling showed that BloomFilterSerializer#deserialize was the hotspot (70% of the total load by running threads).

   * The stack trace looked like this (simplified):

     90.4% - org.apache.cassandra.db.ReadVerbHandler.doVerb
     90.4% - org.apache.cassandra.db.SliceByNamesReadCommand.getRow
     ...
     90.4% - org.apache.cassandra.db.CollationController.collectTimeOrderedData
     ...
     89.5% - org.apache.cassandra.db.columniterator.SSTableNamesIterator.read
     ...
     79.9% - org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter
     68.9% - org.apache.cassandra.io.sstable.BloomFilterSerializer.deserialize
     66.7% - java.io.DataInputStream.readLong

2. Normally, the call in item 1 should be fast enough that a sampling profiler would not detect it.
3. There is no pressure on Cassandra's JVM heap or on the machine overall.
4. There is only a little I/O traffic on our 8 disks per node (up to 100 tps per disk, measured with "iostat 1 1000").
5. The problematic Data file holds data for only 5 to 10 keys but is large (2.4G).
6. The problematic Filter file is only 256B (which could be normal).

So I am now trying to read the Filter file in the same way BloomFilterSerializer#deserialize does, as closely as I can, to see whether something is wrong with the file.

Could you give me some advice on:

1. What is happening?
2. What is the best way to simulate BloomFilterSerializer#deserialize?
3. Is any more information required to proceed?

Thanks,
Takenori
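
P.S. For reference, here is a minimal sketch of the kind of standalone reader meant by "reading the Filter file in the same way BloomFilterSerializer#deserialize does". It is not Cassandra's code. The class name FilterFileProbe is hypothetical, and the on-disk layout (an int hash count, an int word count, then that many 8-byte longs) is an assumption that should be checked against the BloomFilterSerializer source for the version in use. Under that assumed layout, a 256B file would correspond to 31 words (8 + 31 * 8 = 256).

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Cassandra. Written without Java 7 features.
final class FilterFileProbe {

    /** Values read from the stream plus the time the read took. */
    static final class Parsed {
        final int hashCount;
        final long[] words;
        final long elapsedNanos;

        Parsed(int hashCount, long[] words, long elapsedNanos) {
            this.hashCount = hashCount;
            this.words = words;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /**
     * Reads the ASSUMED layout: int hashCount, int wordCount, then wordCount longs.
     * Throws EOFException if the stream ends before wordCount longs were read.
     */
    static Parsed parse(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        long start = System.nanoTime();
        int hashCount = dis.readInt();
        int wordCount = dis.readInt();
        if (wordCount < 0) {
            throw new IOException("negative word count: " + wordCount);
        }
        long[] words = new long[wordCount];
        for (int i = 0; i < wordCount; i++) {
            words[i] = dis.readLong(); // the call that dominated the profile
        }
        return new Parsed(hashCount, words, System.nanoTime() - start);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FilterFileProbe <path-to-Filter-file>");
            return;
        }
        InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
        try {
            Parsed p = parse(in);
            long setBits = 0;
            for (int i = 0; i < p.words.length; i++) {
                setBits += Long.bitCount(p.words[i]);
            }
            System.out.println("hash count : " + p.hashCount);
            System.out.println("word count : " + p.words.length);
            System.out.println("set bits   : " + setBits);
            System.out.println("read time  : " + (p.elapsedNanos / 1000) + " us");
            // Bytes left over would suggest the assumed layout does not match the file.
            System.out.println("trailing   : " + (in.read() != -1));
        } finally {
            in.close();
        }
    }
}
```

If the word count looks implausible or there are trailing bytes, the layout assumption is probably wrong for this version rather than the file being corrupt.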