Re: CPU hotspot at BloomFilterSerializer#deserialize

Takenori Sato(Cloudian) Tue, 05 Feb 2013 17:33:12 -0800

Hi,

We found that this issue is specific to 1.0.1 through 1.0.8 and was fixed in 1.0.9.


https://issues.apache.org/jira/browse/CASSANDRA-4023

So by upgrading, we should see reasonable performance no matter how large a row we have!


Thanks,
Takenori

(2013/02/05 2:29), aaron morton wrote:

Yes, it contains a big row that goes up to 2GB with more than a million columns.
I've run tests with 10 million small columns and seen reasonable performance. I've not looked at 1 million large columns.
- BloomFilterSerializer#deserialize does readLong iteratively at each page
of size 4K for a given row, which means it could be 500,000 loops (calls
readLong) for a 2G row (from 1.0.7 source).
There is only one Bloom filter per row in an SSTable, not one per column index/page.
It could take a while if there are a lot of sstables in the read.
nodetool cfhistograms will let you know: run it once to reset the counts, then do your test, then run it again.
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com
On 4/02/2013, at 4:13 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
It is interesting the press c* got about having 2 billion columns in a
row. You *can* do it but it brings to light some realities of what
that means.
On Sun, Feb 3, 2013 at 8:09 AM, Takenori Sato <ts...@cloudian.com> wrote:
Hi Aaron,

Thanks for your answers. That helped me get the big picture.
Yes, it contains a big row that goes up to 2GB with more than a million
columns.

Let me confirm that I understand correctly.
- The stack trace is from a Slice By Names query, and the deserialization is
at step 3, "Read the row level Bloom Filter", on your blog.
- BloomFilterSerializer#deserialize does readLong iteratively at each page
of size 4K for a given row, which means it could be 500,000 loops (calls
readLong) for a 2G row (from 1.0.7 source).

Correct?
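
As a quick check of the arithmetic in that estimate, here is a minimal sketch. It assumes, as the message does, one readLong call per 4K page of a 2G row; the function name is made up for illustration and is not part of Cassandra.

```python
def estimated_read_long_calls(row_bytes, page_bytes=4 * 1024):
    """Pages in the row, rounded up: one assumed readLong call per page."""
    return -(-row_bytes // page_bytes)  # ceiling division


calls = estimated_read_long_calls(2 * 1024**3)
print(calls)  # 524288, roughly the 500,000 quoted above
```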
It makes sense that Slice By Names queries against such a wide row could be a CPU
bottleneck. In fact, in our test environment, a
BloomFilterSerializer#deserialize call in such a case takes more than 10ms, up to
100ms.
Get a single named column.
Get the first 10 columns using the natural column order.
Get the last 10 columns using the reversed order.
Interesting. So the query pattern could make a difference?
We thought the only solution was to change the data structure (don't use such
a wide row if it is retrieved by a Slice By Names query).

Anyway, will give it a try!

Best,
Takenori
On Sat, Feb 2, 2013 at 2:55 AM, aaron morton <aa...@thelastpickle.com> wrote:
5. the problematic Data file contains data for only 5 to 10 keys but is
large (2.4G)

So very large rows ?
What does nodetool cfstats or cfhistograms say about the row sizes ?


1. what is happening?
I think this is partially large rows and partially the query pattern. This
is only roughly correct:
http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and my talk here
http://www.datastax.com/events/cassandrasummit2012/presentations

3. any more info required to proceed?

Do some tests with different query techniques…

Get a single named column.
Get the first 10 columns using the natural column order.
Get the last 10 columns using the reversed order.
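
One way to compare the three techniques is a small timing harness like the sketch below. It is only an illustration: `time_queries` is a made-up helper, and the three callables are placeholders that do nothing until you replace them with the real reads from whatever client you use.

```python
import statistics
import time


def time_queries(queries, repeats=20):
    """Run each labelled callable `repeats` times; return median latency in ms."""
    results = {}
    for label, run_query in queries.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query()
            samples.append((time.perf_counter() - start) * 1000.0)
        results[label] = statistics.median(samples)
    return results


# Placeholder callables (assumption): swap each lambda for the real client call.
queries = {
    "single named column": lambda: None,
    "first 10 columns": lambda: None,
    "last 10 columns (reversed)": lambda: None,
}

for label, ms in time_queries(queries).items():
    print(f"{label}: {ms:.3f} ms")
```

The median is used rather than the mean so that one slow outlier (a GC pause, a cold cache) does not dominate the comparison.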

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 31/01/2013, at 7:20 PM, Takenori Sato <ts...@cloudian.com> wrote:

Hi all,
We have a situation where CPU load on some of the nodes in our cluster has spiked occasionally since last November, triggered by requests
for rows that reside on two specific sstables.

We confirmed the following (when spiked):

version: 1.0.7(current) <- 0.8.6 <- 0.8.5 <- 0.7.8
jdk: Oracle 1.6.0

1. profiling showed that BloomFilterSerializer#deserialize was the
hotspot (70% of the total load by running threads)

* the stack trace looked like this (simplified)
90.4% - org.apache.cassandra.db.ReadVerbHandler.doVerb
90.4% - org.apache.cassandra.db.SliceByNamesReadCommand.getRow
...
90.4% - org.apache.cassandra.db.CollationController.collectTimeOrderedData
...
89.5% - org.apache.cassandra.db.columniterator.SSTableNamesIterator.read
...
79.9% - org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter
68.9% - org.apache.cassandra.io.sstable.BloomFilterSerializer.deserialize
66.7% - java.io.DataInputStream.readLong

2. Usually, the call in 1 should be so fast that sampling-based profiling cannot
detect it

3. no pressure on Cassandra's VM heap nor on the machine overall
4. little I/O traffic for our 8 disks/node (up to 100tps/disk by "iostat
1 1000")

5. the problematic Data file contains data for only 5 to 10 keys but is
large (2.4G)

6. the problematic Filter file size is only 256B (could be normal)


So now I am trying to read the Filter file in the same way
BloomFilterSerializer#deserialize does, as closely as I can, in order to see
if something is wrong with the file.

Could you give me some advice on:

1. what is happening?
2. the best way to simulate the BloomFilterSerializer#deserialize
3. any more info required to proceed?
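
For question 2, a rough stand-alone sketch of such a simulation might look like the following. It ASSUMES a serialized layout of a 4-byte hash count, a 4-byte word count, and then that many 8-byte longs, all big-endian as java.io.DataInputStream reads them; that layout should be verified against the BloomFilterSerializer source for the version in use. The function names are made up for illustration.

```python
import io
import struct
import time


def read_bloom_filter(stream):
    """Read [int hash_count][int word_count][word_count x long], big-endian.

    ASSUMED layout; verify against BloomFilterSerializer in your version.
    """
    hash_count, word_count = struct.unpack(">ii", stream.read(8))
    words = []
    for _ in range(word_count):  # one 8-byte read per word, like readLong
        (word,) = struct.unpack(">q", stream.read(8))
        words.append(word)
    return hash_count, words


def write_bloom_filter(hash_count, words):
    """Build a test payload in the same assumed layout."""
    header = struct.pack(">ii", hash_count, len(words))
    return header + struct.pack(">%dq" % len(words), *words)


payload = write_bloom_filter(5, list(range(100000)))
start = time.perf_counter()
hash_count, words = read_bloom_filter(io.BytesIO(payload))
elapsed_ms = (time.perf_counter() - start) * 1000.0
print(hash_count, len(words), "%.1f ms" % elapsed_ms)
```

To try it on a real file, pass `open(path, "rb")` instead of the in-memory payload. A truncated or differently laid out file will raise `struct.error`, which is itself a hint that the assumed layout does not match.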

Thanks,
Takenori
