We ran into similar heap issues a while ago on 1.0.11. I am not sure
whether you have the luxury of upgrading to at least 1.2.9; we did not.
After a lot of painful attempts and weeks of testing (just as in your
case), the following settings worked. They did not completely relieve the
heap pressure, but they helped a lot; we still see some heap issues, but at
least it is a bit more stable. Unlike you, we had very heavy reads and
writes, so it is good to know that this also happens under light load. I
had been thinking it was purely a symptom of heavy load.

-XX:NewSize=1200M
-XX:SurvivorRatio=4
-XX:MaxTenuringThreshold=2
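
For what it's worth, on our nodes these ended up in conf/cassandra-env.sh
next to the existing JVM_OPTS lines, roughly as below. Treat this as a
sketch of our layout rather than a drop-in snippet: depending on your
version the young-gen size may already be driven by HEAP_NEWSIZE / -Xmn,
in which case adjust that instead of adding NewSize.

JVM_OPTS="$JVM_OPTS -XX:NewSize=1200M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=2"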


Not sure whether this will help you or not, but I think it's worth a try.

-sandeep


On Wed, Oct 30, 2013 at 4:34 AM, Jason Tang <ares.t...@gmail.com> wrote:

> What's the configuration of the following parameters?
> memtable_flush_queue_size:
> concurrent_compactors:
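>
> (For context, in a stock 1.0.x cassandra.yaml these are roughly as in the
> two lines below; concurrent_compactors is usually left commented out and
> defaults to the number of cores. Please double-check your own file rather
> than taking these values as given.)
>
> memtable_flush_queue_size: 4
> # concurrent_compactors: 1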
>
>
> 2013/10/30 Piavlo <lolitus...@gmail.com>
>
>> Hi,
>>
>> Below I try to give a full picture to the problem I'm facing.
>>
>> This is a 12-node cluster, running on EC2 with m2.xlarge instances (17G
>> RAM, 2 CPUs).
>> Cassandra version is 1.0.8.
>> The cluster normally serves between 1500 and 3000 reads per second
>> (depending on the time of day) and 800 to 1700 writes per second,
>> according to OpsCenter.
>> RF=3, no row caches are used.
>>
>> Memory-relevant configs from cassandra.yaml:
>> flush_largest_memtables_at: 0.85
>> reduce_cache_sizes_at: 0.90
>> reduce_cache_capacity_to: 0.75
>> commitlog_total_space_in_mb: 4096
>>
>> relevant JVM options used are:
>> -Xms8000M -Xmx8000M -Xmn400M
>> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>> -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly
>>
>> Now what happens is that with these settings, after a cassandra process
>> restart, GC works fine at the beginning and the used-heap graph looks
>> like a saw with perfect teeth. Eventually the teeth start to shrink until
>> they become barely noticeable, and then cassandra starts to spend a lot
>> of CPU time doing GC. Such a cycle takes about 2 weeks, and then I need
>> to restart the cassandra process to restore performance.
>> During all this time there are no memory-related messages in the
>> cassandra system.log, except an occasional "GC for ParNew: a little above
>> 200ms".
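>>
>> (A quick way to watch this directly, if anyone wants to see the same
>> picture without the graphs, is to sample the generations with jstat; the
>> pid lookup below is just an example, adjust it to whatever matches your
>> cassandra process:
>>
>> jstat -gcutil $(pgrep -f CassandraDaemon) 10000
>>
>> The O column is the old-gen occupancy, which is the number that slowly
>> creeps up here between restarts.)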
>>
>> Things I've already done trying to reduce this eventual heap pressure:
>> 1) reducing bloom_filter_fp_chance, which cut the total per node from
>> ~700MB to ~280MB, based on all Filter.db files on the node (see the
>> nodetool note right after this list).
>> 2) reducing key cache sizes, and dropping key caches for CFs which do not
>> have many reads.
>> 3) increasing the heap size from 7000M to 8000M.
>> None of these really helped; only the heap increase from 7000M to 8000M
>> stretched the cycle until excessive GC from ~9 days to ~14 days.
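>>
>> (The nodetool note: the per-CF bloom filter numbers can be cross-checked
>> without summing Filter.db files, since nodetool cfstats reports the bloom
>> filter space used per column family; something like the line below,
>> though the exact field names may differ slightly between versions:
>>
>> nodetool -h localhost cfstats | grep -E 'Column Family:|Bloom Filter Space Used'
>> )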
>>
>> I've tried to graph over time the data that is supposed to be in the heap
>> vs. the actual heap size, by summing all CFs' bloom filter sizes + all
>> CFs' key cache capacities multiplied by the average key size + all CFs'
>> reported memtable data sizes (I've overestimated the data size a bit on
>> purpose, to be on the safe side).
>> Here is a link to a graph showing the last 2 days of metrics for a node
>> which could not effectively do GC, after which the cassandra process was
>> restarted.
>> http://awesomescreenshot.com/0401w5y534
>> You can clearly see that before and after the restart, the size of the
>> data that is supposed to be in the heap is pretty much the same, which
>> makes me think that what I really need is GC tuning.
>>
>> Also, I don't think this is due to the total number of keys each node
>> holds, which is between 200 and 300 million for all CF key estimates
>> summed on a node.
>> The nodes have data sizes between 45G and 75G, in line with those key
>> counts, and all nodes start suffering heavy GC load after about 14 days.
>> Also, the excessive GC and heap usage are not affected by the load, which
>> varies depending on the time of day (see the read/write rates at the
>> beginning of the mail).
>> So again, based on this, I assume this is not due to a large number of
>> keys or too much load on the cluster, but due to a pure GC
>> misconfiguration issue.
>>
>> Things I remember that I've tried for GC tuning:
>> 1) Changing -XX:MaxTenuringThreshold=1 to values like 8 - did not help.
>> 2) Adding -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
>> -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10
>> -XX:ParallelGCThreads=2 -XX:ParallelCMSThreads=1
>>     - this actually made things worse.
>> 3) Adding -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=8 - did not help.
>>
>> Also, since it takes about 2 weeks to verify that a GC setting change did
>> not help, trying all the possibilities is a painfully slow process :)
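>>
>> (One thing that should shorten that feedback loop, and I'd welcome
>> opinions on whether this is the right set of flags, is verbose GC
>> logging, so that tenuring and promotion into the old gen become visible
>> within hours instead of weeks. Roughly, with the log path just an
>> example:
>>
>> -Xloggc:/var/log/cassandra/gc.log
>> -XX:+PrintGCDetails -XX:+PrintGCDateStamps
>> -XX:+PrintTenuringDistribution
>> )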
>> I'd highly appreciate any help and hints on GC tuning.
>>
>> tnx
>> Alex
>>
>
