>  I do not see the out of heap errors but I am taking a bit of a performance 
> hit.
Take a look at nodetool cfhistograms to see how many SSTables are being touched 
per read and the local read latency. 

In general, if reads are hitting more than 4 SSTables it's not great.
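
For example, something like this (the keyspace and column family names are 
placeholders, substitute your own):

    nodetool -h localhost cfhistograms MyKeyspace MyColumnFamily   # placeholder KS/CF names

The SSTables column in the output is the distribution of SSTables touched per 
read, and the read latency column gives the local latencies in microseconds.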

>  BloomFilterFalseRatio is 0.8367977262013025 this was the reason behind 
> bumping bloom_filter_fp_chance.
Not sure I understand the logic there. 


> 
> I also encountered a similar problem. I dumped the JVM heap and analysed it 
> with Eclipse MAT. The plugin told me there are 10334 instances of 
> SSTableReader, consuming 6.6G of memory. I found the CompactionExecutor thread 
> held 8000+ SSTableReader objects. I wonder why there are so many 
> SSTableReaders in memory. 
I would guess you are using Levelled compaction strategy with the default 5MB 
file size. 
When analysing the heap, if bloom filters are taking up too much memory they 
will show up as long[][] 
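
If it is the 5MB default, one thing to try (a sketch only; the keyspace/table 
names are placeholders and you should pick a size that suits your data) is 
bumping the sstable size for LCS so there are far fewer files:

    -- hypothetical keyspace/table names; run in cqlsh
    ALTER TABLE MyKeyspace.MyColumnFamily
      WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 128};

New sstables get written at the larger size, and the existing ones are 
rewritten as compaction churns through them.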

Cheers


-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/06/2013, at 2:36 AM, srmore <comom...@gmail.com> wrote:

> On Wed, Jun 26, 2013 at 12:16 AM, aaron morton <aa...@thelastpickle.com> 
> wrote:
>> bloom_filter_fp_chance value that was changed from default to 0.1, looked at 
>> the filters and they are about 2.5G on disk and I have around 8G of heap.
>> I will try increasing the value to 0.7 and report my results. 
> You need to re-write the sstables on disk using nodetool upgradesstables. 
> Otherwise only the new sstables will have the 0.1 setting. 
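> For reference, that's along the lines of (placeholder keyspace / CF names):
> 
>     nodetool -h localhost upgradesstables MyKeyspace MyColumnFamily   # placeholder names
> 
> though depending on the Cassandra version you may need to force it to rewrite 
> sstables that are already on the current on-disk format (or use nodetool scrub).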
> 
>> I will try increasing the value to 0.7 and report my results. 
> No need to, it will probably be something like "Oh no, really, what, how, 
> please make it stop" :)
> 0.7 will mean reads will hit most / all of the SSTables for the CF. 
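> Rough numbers for context (back-of-the-envelope; Cassandra's actual filter 
> sizing is rounded): a bloom filter needs roughly -ln(p)/(ln 2)^2 bits per key, 
> so p=0.01 is ~9.6 bits/key, p=0.1 is ~4.8 bits/key, and p=0.7 is under 1 
> bit/key, at which point the filter can exclude almost nothing and most reads 
> fall through to every sstable that might hold the row.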
> 
> Changing the bloom_filter_fp_chance to 0.7 did seem to correct the problem in 
> the short run. I do not see the out-of-heap errors, but I am taking a bit of a 
> performance hit. Planning to run some more tests. Also, my 
> BloomFilterFalseRatio is 0.8367977262013025, which was the reason behind 
> bumping bloom_filter_fp_chance.
>  
> 
> I covered a high row count situation in one of my talks at the summit this 
> month; the slide deck is here 
> http://www.slideshare.net/aaronmorton/cassandra-sf-2013-in-case-of-emergency-break-glass
> and the videos will soon be up at Planet Cassandra. 
> 
> This was/is extremely helpful Aaron, I cannot thank you enough for sharing 
> this with the community; eagerly looking forward to the video.
> 
> Rebuild the sstables, then increase the index_interval if you still need to 
> reduce mem pressure. 
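> (index_interval is set in cassandra.yaml, default 128 on the 1.x line; a 
> larger value, e.g.
> 
>     index_interval: 512   # example value only, tune to taste
> 
> samples fewer row keys into the on-heap index summary, trading a little read 
> latency for less heap.)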
>  
> Cheers
> 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> 
> On 22/06/2013, at 1:17 PM, sankalp kohli <kohlisank...@gmail.com> wrote:
> 
>> I will take a heap dump and see what's in there rather than guessing. 
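>> (For example, something like
>> 
>>     jmap -dump:live,format=b,file=cassandra-heap.hprof <pid>   # placeholder file name and pid
>> 
>> and then open the .hprof in Eclipse MAT, as mentioned above.)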
>> 
>> 
>> On Fri, Jun 21, 2013 at 4:12 PM, Bryan Talbot <btal...@aeriagames.com> wrote:
>> bloom_filter_fp_chance = 0.7 is probably way too large to be effective; you'll 
>> likely have issues compacting deleted rows and get poor read performance with 
>> a value that high.  I'd guess that anything larger than 0.1 might as well be 
>> 1.0.
>> 
>> -Bryan
>> 
>> 
>> 
>> On Fri, Jun 21, 2013 at 5:58 AM, srmore <comom...@gmail.com> wrote:
>> 
>> On Fri, Jun 21, 2013 at 2:53 AM, aaron morton <aa...@thelastpickle.com> 
>> wrote:
>>> > nodetool -h localhost flush didn't do much good.
>> Do you have 100's of millions of rows?
>> If so, see recent discussions about raising the bloom_filter_fp_chance and 
>> reducing index sampling. 
>> Yes, I have 100's of millions of rows. 
>>  
>> 
>> If this is an old schema you may be using the very old setting of 0.000744, 
>> which creates much larger bloom filters than the current defaults. 
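>> If you want to check, the current value shows up in "describe" output in 
>> cassandra-cli / DESCRIBE TABLE in cqlsh, and
>> 
>>     nodetool -h localhost cfstats
>> 
>> reports a "Bloom Filter Space Used" figure per CF that shows how large the 
>> filters actually are on that node.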
>> 
>> The bloom_filter_fp_chance value was changed from the default to 0.1; I looked 
>> at the filters and they are about 2.5G on disk, and I have around 8G of heap.
>> I will try increasing the value to 0.7 and report my results. 
>> 
>> It also appears to be a case of hard GC failure (as Rob mentioned), as the 
>> heap is never released even after 24+ hours of idle time; the JVM needs to be 
>> restarted to reclaim the heap.
>> 
>> Cheers
>>  
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 20/06/2013, at 6:36 AM, Wei Zhu <wz1...@yahoo.com> wrote:
>> 
>>> If you want, you can try to force a GC through JConsole: Memory -> Perform 
>>> GC.
>>> 
>>> It theoretically triggers a full GC, but when it actually happens depends on 
>>> the JVM.
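>>> (If you'd rather not attach JConsole, something like
>>> 
>>>     jmap -histo:live <pid>   # placeholder pid
>>> 
>>> also forces a full GC as a side effect of computing the live-object 
>>> histogram.)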
>>> 
>>> -Wei
>>> 
>>> From: "Robert Coli" <rc...@eventbrite.com>
>>> To: user@cassandra.apache.org
>>> Sent: Tuesday, June 18, 2013 10:43:13 AM
>>> Subject: Re: Heap is not released and streaming hangs at 0%
>>> 
>>> On Tue, Jun 18, 2013 at 10:33 AM, srmore <comom...@gmail.com> wrote:
>>> > But then shouldn't the JVM GC it eventually? I can still see Cassandra 
>>> > alive and kicking, but it looks like the heap is locked up even after the 
>>> > traffic has long stopped.
>>> 
>>> No, when GC system fails this hard it is often a permanent failure
>>> which requires a restart of the JVM.
>>> 
>>> > nodetool -h localhost flush didn't do much good.
>>> 
>>> This adds support to the idea that your heap is too full, and not full
>>> of memtables.
>>> 
>>> You could try nodetool -h localhost invalidatekeycache, but that
>>> probably will not free enough memory to help you.
>>> 
>>> =Rob
>> 
>> 
>> 
>> 
> 
> 
