Re: About the heap

Hiller, Dean Thu, 14 Mar 2013 07:22:08 -0700

Oh, and one other way to lower your RAM usage is to scale out: add more machines.
Since bloom filters use up a lot of memory, doubling your cluster can
significantly reduce your per-node RAM usage.  We have switched to LCS but are
being forced to double our cluster as well, which reduces RAM quite a bit.
Though perhaps, like us, you are trying to tune to get more per node as well ;).
But I thought I would let you know in case it wasn't obvious.
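
As a rough illustration of why scaling out helps here, the sketch below uses the
textbook Bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2 bits. This is an
assumption for illustration, not Cassandra's actual sizing code, and the key
count, false-positive rate and node counts are hypothetical. It also assumes
keys are spread evenly across nodes and ignores the replication factor.

```python
import math

def bloom_filter_bytes(num_keys: int, false_positive_rate: float) -> float:
    """Textbook optimal Bloom filter size in bytes for num_keys entries."""
    bits = -num_keys * math.log(false_positive_rate) / (math.log(2) ** 2)
    return bits / 8

# Hypothetical figures, chosen only for illustration.
total_keys = 4_000_000_000
fp_rate = 0.01

def per_node_bytes(nodes: int) -> float:
    # Assumes an even key distribution; replication is ignored.
    return bloom_filter_bytes(total_keys // nodes, fp_rate)

before = per_node_bytes(6)
after = per_node_bytes(12)
print(f"6 nodes:  {before / 2**20:.0f} MiB of bloom filter per node")
print(f"12 nodes: {after / 2**20:.0f} MiB of bloom filter per node")
```

Under these assumptions the filter size is linear in the number of keys a node
holds, so doubling the node count halves the per-node bloom filter memory.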

Dean

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, March 14, 2013 6:41 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: About the heap

"Using half as many m1.xlarge is the way to go."

OK, good to know.

"Are you getting too much GC or running OOM ?"

GC, it is always GC. I never had an OOM as far as I remember.

"Are you using the default GC configuration ?"

Yes, as I don't know a lot about it and think default should be fine.

"Is cassandra logging a lot of GC warnings ?"

Yes, slowing nodes and even causing a node to be marked down from time to
time.

I have this kind of message logged:

INFO [ScheduledTasks:1] 2013-03-13 09:10:15,382 GCInspector.java (line 122) GC 
for ParNew: 212 ms for 1 collections, 4755815744 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-13 09:10:40,000 GCInspector.java (line 122) GC 
for ParNew: 229 ms for 1 collections, 5432008416 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-13 09:10:41,000 GCInspector.java (line 122) GC 
for ParNew: 310 ms for 1 collections, 5434752016 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-13 09:10:52,006 GCInspector.java (line 122) GC 
for ParNew: 215 ms for 1 collections, 5807823960 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-13 09:10:53,007 GCInspector.java (line 122) GC 
for ParNew: 224 ms for 1 collections, 5765842928 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-13 09:11:18,274 GCInspector.java (line 122) GC 
for ParNew: 478 ms for 1 collections, 6011120760 used; max is 8547991552

and even this when things get worse:

INFO [ScheduledTasks:1] 2013-03-11 15:02:12,001 GCInspector.java (line 122) GC 
for ParNew: 626 ms for 1 collections, 7446160296 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-11 15:02:14,002 GCInspector.java (line 122) GC 
for ParNew: 733 ms for 2 collections, 7777586576 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-11 15:02:15,564 GCInspector.java (line 122) GC 
for ParNew: 622 ms for 1 collections, 7967657624 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-11 15:02:54,089 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 8460 ms for 2 collections, 7949200768 used; max is 
8547991552
WARN [ScheduledTasks:1] 2013-03-11 15:02:54,241 GCInspector.java (line 145) 
Heap is 0.9299495348869525 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-03-11 15:03:36,487 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 11637 ms for 2 collections, 8367456784 used; max is 
8547991552
WARN [ScheduledTasks:1] 2013-03-11 15:03:37,194 GCInspector.java (line 145) 
Heap is 0.9788798612046171 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-03-11 15:04:19,499 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 11398 ms for 2 collections, 8472967584 used; max is 
8547991552
WARN [ScheduledTasks:1] 2013-03-11 15:04:20,096 GCInspector.java (line 145) 
Heap is 0.9912232051770751 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-03-11 15:05:02,916 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 11877 ms for 2 collections, 8508628816 used; max is 
8547991552
WARN [ScheduledTasks:1] 2013-03-11 15:05:02,999 GCInspector.java (line 145) 
Heap is 0.99539508950605 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-03-11 15:05:42,449 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 11958 ms for 2 collections, 7557641672 used; max is 
8547991552
WARN [ScheduledTasks:1] 2013-03-11 15:05:42,813 GCInspector.java (line 145) 
Heap is 0.8841423890073588 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2013-03-11 15:05:46,152 GCInspector.java (line 122) GC 
for ParNew: 665 ms for 1 collections, 8023369408 used; max is 8547991552
INFO [ScheduledTasks:1] 2013-03-11 15:06:18,931 GCInspector.java (line 122) GC 
for ConcurrentMarkSweep: 9467 ms for 2 collections, 5797092296 used; max is 
8547991552
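
To read these lines more easily, here is a minimal sketch that pulls the
collector name, pause time and heap occupancy out of a GCInspector INFO line.
The regular expression is an assumption based only on the format of the lines
quoted above, and `parse_gc_line` is a hypothetical helper, not part of
Cassandra.

```python
import re

# Pattern inferred from the GCInspector INFO lines quoted above.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<ms>\d+) ms for (?P<count>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)

def parse_gc_line(line: str):
    """Return (collector, pause_ms, collections, heap_fraction), or None."""
    # Collapse whitespace so entries wrapped by the mail client still match.
    m = GC_LINE.search(" ".join(line.split()))
    if m is None:
        return None
    used, heap_max = int(m["used"]), int(m["max"])
    return m["collector"], int(m["ms"]), int(m["count"]), used / heap_max

sample = """INFO [ScheduledTasks:1] 2013-03-11 15:02:54,089 GCInspector.java (line 122) GC
for ConcurrentMarkSweep: 8460 ms for 2 collections, 7949200768 used; max is
8547991552"""

collector, pause_ms, count, fraction = parse_gc_line(sample)
print(f"{collector}: {pause_ms} ms, {count} collections, heap {fraction:.1%} full")
```

For this entry the used/max ratio comes out at roughly 0.93, in line with the
"Heap is 0.9299... full" warning logged right after it.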

Once again, this is with 1 GB of bloom filters, 1 GB of memtables and 100 MB of caches...

I am not sure how to avoid this.

2013/3/14 aaron morton <aa...@thelastpickle.com>
Because of this I have an unstable cluster and have no other choice than to use
Amazon EC2 xLarge instances when we would rather use twice as many EC2 Large nodes.
m1.xlarge is a MUCH better choice than m1.large.
You get more RAM, better IO and less steal. Using half as many m1.xlarge is
the way to go.

My heap is actually changing from 3-4 GB to 6 GB and sometimes growing to the 
max 8 GB (crashing the node).
How is it crashing ?
Are you getting too much GC or running OOM ?
Are you using the default GC configuration ?
Is cassandra logging a lot of GC warnings ?

If you are running OOM then something has to change. Maybe bloom filters, maybe 
caches.

Enable the GC logging in cassandra-env.sh to check how low a CMS collection
gets the heap, or use some other tool. That will give an idea of how much
memory you are actually using.

Here is some background on what is kept on heap in pre-1.2 versions:
http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/03/2013, at 12:19 PM, Wei Zhu <wz1...@yahoo.com> wrote:

Here is the JIRA I submitted regarding the ancestor.

https://issues.apache.org/jira/browse/CASSANDRA-5342

-Wei

----- Original Message -----
From: "Wei Zhu" <wz1...@yahoo.com>
To: user@cassandra.apache.org
Sent: Wednesday, March 13, 2013 11:35:29 AM
Subject: Re: About the heap

Hi Dean,
The index_interval setting controls how the SSTable is sampled to speed up the
lookup of keys in the SSTable. Here is the code:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/DataTracker.java#L478

Increasing the interval means taking fewer samples: less memory, but slower
lookups on read.
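
The trade-off can be sketched with a toy sampled index. This is a simplified
model, not Cassandra's implementation: `build_sample` and `lookup` are
hypothetical helpers, and the sorted key list stands in for an SSTable's
on-disk index.

```python
from bisect import bisect_right

def build_sample(sorted_keys, index_interval):
    """Keep every index_interval-th key together with its position."""
    return [(sorted_keys[i], i) for i in range(0, len(sorted_keys), index_interval)]

def lookup(sorted_keys, sample, index_interval, key):
    """Find key via the sample; return (position or None, entries scanned)."""
    sampled_keys = [k for k, _ in sample]
    # Last sampled key that is <= the key we are looking for.
    slot = bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None, 0
    start = sample[slot][1]
    scanned = 0
    # Scan at most one interval's worth of entries after the sampled position.
    for pos in range(start, min(start + index_interval, len(sorted_keys))):
        scanned += 1
        if sorted_keys[pos] == key:
            return pos, scanned
    return None, scanned

keys = [f"key{i:06d}" for i in range(100_000)]
for interval in (128, 512, 1024):
    sample = build_sample(keys, interval)
    pos, scanned = lookup(keys, sample, interval, "key099999")
    print(f"interval={interval}: {len(sample)} samples in memory, "
          f"{scanned} entries scanned")
```

A larger interval keeps fewer samples in memory, but each lookup may have to
scan more entries between two sampled positions.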

I did a heap dump on my production system, which paused the node for about 10
seconds. I found something interesting: with LCS, a single compaction can
involve thousands of SSTables, and the ancestors are recorded in case
something goes wrong during the compaction. But those are never removed after
the compaction is done. In our case, it takes about 1 GB of heap memory to store
that. I am going to submit a JIRA for that.

Here is the culprit:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L58

Enjoy looking at Cassandra code:)

-Wei

----- Original Message -----
From: "Dean Hiller" <dean.hil...@nrel.gov>
To: user@cassandra.apache.org
Sent: Wednesday, March 13, 2013 11:11:14 AM
Subject: Re: About the heap

Going to 1.2.2 helped us quite a bit, as did switching from STCS to LCS, which
gave us smaller bloom filters.

As for the key cache: there is an entry in cassandra.yaml called index_interval,
set to 128.  I am not sure if that is related to the key cache.  I think it is.
By raising it to 512 or maybe even 1024, you will consume less RAM there as
well, though I ran this test in QA and my key cache size stayed the same, so I
am really not sure (I am actually checking out the Cassandra code now to dig a
little deeper into this property).

Dean

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, March 13, 2013 10:11 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: About the heap

Hi,

I would like to know everything that is in the heap.

We are talking about C* 1.1.6 here.

Theory :

- Memtable (1024 MB)
- Key Cache (100 MB)
- Row Cache (disabled, and serialized with JNA activated anyway, so should be 
off-heap)
- BloomFilters (about 1.03 GB - from cfstats, adding up all the "Bloom Filter
Space Used" values and assuming they are shown in bytes - 1103765112)
- Anything else ?

So my heap should be fluctuating between 1.15 GB and 2.15 GB and growing slowly
(as new data adds new bloom filters).
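
A rough check of this arithmetic, as a minimal sketch. It assumes that GB here
means 2^30 bytes, that memtables range from empty up to their 1024 MB cap, and
that nothing else is on the heap (which is exactly the open question).

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

bloom_filters = 1_103_765_112   # bytes, summed from cfstats as listed above
key_cache = 100 * MIB           # configured key cache size
memtables_max = 1024 * MIB      # memtables assumed to vary between 0 and this cap

# Lowest expected heap use: bloom filters and key cache, memtables empty.
floor_bytes = bloom_filters + key_cache
# Highest expected heap use: the same plus completely full memtables.
ceiling_bytes = floor_bytes + memtables_max

print(f"expected floor:   {floor_bytes / GIB:.2f} GiB")
print(f"expected ceiling: {ceiling_bytes / GIB:.2f} GiB")
```

Under these assumptions the range comes out at about 1.13 to 2.13 GiB, close to
the rounded figures above and well below the 3-8 GB actually observed.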

My heap is actually changing from 3-4 GB to 6 GB and sometimes growing to the 
max 8 GB (crashing the node).

Because of this I have an unstable cluster and have no other choice than to use
Amazon EC2 xLarge instances when we would rather use twice as many EC2 Large nodes.

What am I missing ?

Practice :

Is there an easy way to dump the heap without inducing any load, so that I can
analyse it with MAT (or anything else that you could advise) ?

Alain
