Thank you for your responses. In another recent test, the heap actually filled 
up and we got an out-of-memory error. When we analyzed the heap, almost all of 
it was memtables. Is there any known issue with 1.1.5 that causes 
memtable_total_space_in_mb not to be respected, or not to default to 1/3 of 
the heap size? Or is it possible that the load in the test is so high that 
Cassandra cannot keep up with flushing, even though it starts flushing once 
memtables reach 1/3 of the heap?
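
One way we plan to check whether flushing is keeping up during the test (a 
rough sketch; the host is a placeholder, and these are standard nodetool 
calls):

    # look for a growing Pending/Blocked count on the FlushWriter pool
    nodetool -h localhost tpstats | grep -i flush

    # check the live memtable size per CF ("Memtable Data Size" in the output)
    nodetool -h localhost cfstats

If FlushWriter shows blocked tasks while memtables keep growing, that would 
suggest the disks cannot keep up with the flush rate, rather than the 
threshold being ignored.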

We recently switched to LeveledCompaction; however, the earlier heap warning 
occurred while we were still running SizeTiered.
The latest test ran on high-performance hardware: 32 cores, 128 GB RAM, and 
seven 1 TB disks (regular) in RAID-0. Earlier tests were run on lesser 
hardware with the same load, but there was no memory problem. We are running 
more tests to check whether this is always reproducible.

Answering some of the earlier questions, in case it helps:

We have Cassandra 1.1.5 running in production. Upgrading to the latest 1.2.x 
release is on the roadmap, but until then this needs to be figured out.

> - How much data do you have per node?
We are running into these errors while running QA tests that start with no 
data. These are roughly 4-hour tests that end up adding under 1 GB of data on 
each node of a 4-node ring, or a 2-node ring.

> - What is the value of "index_interval" (cassandra.yaml)?
It's the default value of 128.

Thanks,
Arindam

-----Original Message-----
From: Aaron Morton [mailto:aa...@thelastpickle.com] 
Sent: Monday, October 28, 2013 12:09 AM
To: Cassandra User
Subject: Re: Heap almost full

> 1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1:  WARN GCInspector.java (line 
> 145) Heap is 0.8287082580489245 full.  You may need to reduce memtable and/or 
> cache sizes.  Cassandra will now flush up to the two largest memtables to 
> free up memory.  Adjust flush_largest_memtables_at threshold in 
> cassandra.yaml if you don't want Cassandra to do this automatically
This means that the CMS GC was unable to free memory quickly; you have not run 
out of heap yet, but you may under heavy load. 

CMS uses CPU resources to do its job; how much CPU do you have available? 
To check the behaviour of the CMS collector, use JConsole or another tool to 
watch the heap size; you should see a nice saw-tooth graph. It should grow 
gradually and then drop quickly to below roughly 3 GB. If the heap does not 
drop low enough after a CMS collection, you will spend more time in GC. 
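
If you prefer the command line over JConsole, something like this (a sketch; 
jstat ships with the JDK, and <cassandra-pid> is a placeholder for the actual 
process id) prints heap utilisation every second, so you can watch the old 
generation ("O" column) rise and drop in the same saw-tooth pattern:

    # sample GC/heap utilisation once per second
    jstat -gcutil <cassandra-pid> 1000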

You may also want to adjust flush_largest_memtables_at to 0.8 to give CMS a 
chance to do its work. It defaults to 0.75.
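
Roughly, that is a one-line change in cassandra.yaml (a sketch only; the path 
below is an example, back up the file first, and the node needs a restart for 
the change to take effect):

    # raise the emergency-flush threshold from the 0.75 default to 0.80
    sed -i.bak 's/^flush_largest_memtables_at:.*/flush_largest_memtables_at: 0.80/' conf/cassandra.yaml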

> In 1.2+ bloomfilters are off-heap, you can use vnodes...
+1 for 1.2 with off-heap bloom filters. 

> - increasing the heap to 10GB.

-1 
Unless you have a node under heavy memory pressure, pre-1.2, with 1+ billion 
rows and lots of bloom filters, increasing the heap is not the answer. It will 
increase the time taken by ParNew and CMS collections and just kicks the 
problem down the road. 

Cheers
 
-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/10/2013, at 8:32 am, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> If you are starting with Cassandra I really advise you to start with 1.2.11
> 
> In 1.2+ bloomfilters are off-heap, you can use vnodes...
> 
> "I summed up the bloom filter usage reported by nodetool cfstats in all the 
> CFs and it was under 50 MB."
> 
> This is quite a small value. Are you sure there is no error in your 
> conversion from the bytes reported in cfstats?
> 
> If you are trying to understand this could you tell us :
> 
> - How much data do you have per node?
> - What is the value of "index_interval" (cassandra.yaml)?
> 
> If you are trying to fix this, you can try :
> 
> - changing the "memtable_total_space_in_mb" to 1024
> - increasing the heap to 10GB.
> 
> Hope this will help somehow :).
> 
> Good luck
> 
> 
> 2013/10/16 Arindam Barua <aba...@247-inc.com>
>  
> 
> During performance testing on our 4-node Cassandra 1.1.5 cluster, we are 
> seeing warning logs about the heap being almost full [1]. I'm trying to 
> figure out why, and how to prevent it.
> 
>  
> 
> The tests are being run on a Cassandra ring consisting of 4 dedicated boxes 
> with 32 GB of RAM each.
> 
> The heap size is set to 8 GB as recommended.
> 
> All the other relevant settings I know of are the default ones:
> 
> - memtable_total_space_in_mb is not set in the yaml, so it should default to 
> 1/3 of the heap size.
> 
> - The key cache should be 100 MB at the most. I checked the key cache the day 
> after the tests were run via nodetool info, and it reported 4.5 MB being used.
> 
> - The row cache is not being used.
> 
> - I summed up the bloom filter usage reported by nodetool cfstats across all 
> the CFs and it was under 50 MB.
> 
>  
> 
> The resident size of the cassandra process according to top is 8.4g even now. 
> I did a heap histogram using jmap, but I'm not sure how to interpret those 
> results usefully [2].
> 
>  
> 
> Performance test details:
> 
> - The test is write-only, and writes a relatively large amount of data to one 
> CF.
> 
> - There is some other constant background traffic that writes smaller amounts 
> of data to many CFs and does some reads.
> 
> The total number of CFs is 114, but quite a few of them are not used.
> 
>  
> 
> Thanks,
> 
> Arindam
> 
>  
> 
> [1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1:  WARN GCInspector.java (line 
> 145) Heap is 0.8287082580489245 full.  You may need to reduce memtable and/or 
> cache sizes.  Cassandra will now flush up to the two largest memtables to 
> free up memory.  Adjust flush_largest_memtables_at threshold in 
> cassandra.yaml if you don't want Cassandra to do this automatically
> 
>  
> 
> [2] Object Histogram:
> 
> num   #instances   #bytes     Class description
> ------------------------------------------------------------------------
> 1:    152855       86035312   int[]
> 2:    13395        45388008   long[]
> 3:    49517        9712000    java.lang.Object[]
> 4:    120094       8415560    char[]
> 5:    145106       6965088    java.nio.HeapByteBuffer
> 6:    40525        5891040    * ConstMethodKlass
> 7:    231258       5550192    java.lang.Long
> 8:    40525        5521592    * MethodKlass
> 9:    134574       5382960    java.math.BigInteger
> 10:   36692        4403040    java.net.SocksSocketImpl
> 11:   3741         4385048    * ConstantPoolKlass
> 12:   63875        3538128    * SymbolKlass
> 13:   104048       3329536    java.lang.String
> 14:   132636       3183264    org.apache.cassandra.db.DecoratedKey
> 15:   97466        3118912    java.util.concurrent.ConcurrentHashMap$HashEntry
> 16:   97216        3110912    com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Node
