On 10/30/2013 03:21 PM, srmore wrote:
We ran into similar heap issues a while ago on 1.0.11. I am not sure
whether you have the luxury of upgrading to at least 1.2.9; we did not.
After a lot of painful attempts and weeks of testing (just as in your
case), the following settings worked. They did not completely relieve
the heap pressure, but they helped a lot; we still see some heap issues,
but it is at least somewhat stable now. Unlike in your case, we had very
heavy reads and writes.
Well, that's about all I can get with the current 12-node EC2 m2.xlarge
cluster; if reads/writes increased significantly during peak times, the
nodes would be locked at 100% user + system CPU usage, regardless of GC.
If you are also on EC2, what cluster spec and read/write rates do you have?
But it's good to know that this happens under light load; I was thinking
this was a symptom of heavy load.
-XX:NewSize=1200M
Wow, 1200M is a really huge NewSize; that would blow my ParNew pauses
sky high. I remember that back when I increased it from 300M to 400M,
I already noticed an increase in ParNew pause times.
Is such a high NewSize really recommended? I guess it also partially
depends on the hardware specs. Do you record the ParNew GC pause times
with the 1200M setting?
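(For reference, something like the following in cassandra-env.sh should
record the actual pause times - the log path is just an example, and
PrintTenuringDistribution is optional but handy when tuning
MaxTenuringThreshold:)
-Xloggc:/var/log/cassandra/gc.log
-XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintTenuringDistribution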
-XX:SurvivorRatio=4
To my knowledge this setting does not have any effect without
-XX:-UseAdaptiveSizePolicy
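(One way to at least see what the JVM actually ends up with for these
flags is -XX:+PrintFlagsFinal; the flag set below is just an
illustration, substitute your own:)
java -Xmn1200M -XX:SurvivorRatio=4 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
  -XX:+PrintFlagsFinal -version | grep -E 'UseAdaptiveSizePolicy|SurvivorRatio'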
-XX:MaxTenuringThreshold=2
I had already changed this from 1 to 8, and with 8 it was worse. Do you
think 2 could make any better difference?
tnx for your help
Alex
Not sure whether this will help you or not, but I think it's worth a try.
-sandeep
On Wed, Oct 30, 2013 at 4:34 AM, Jason Tang <ares.t...@gmail.com> wrote:
What's the configuration of the following parameters?
memtable_flush_queue_size:
concurrent_compactors:
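(For reference, these are set in cassandra.yaml; the values below are
just what I believe the 1.0.x defaults are, not a recommendation:)
memtable_flush_queue_size: 4
#concurrent_compactors: 1   # defaults to the number of CPU cores when left commented out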
2013/10/30 Piavlo <lolitus...@gmail.com>
Hi,
Below I'll try to give a full picture of the problem I'm facing.
This is a 12-node cluster, running on EC2 with m2.xlarge
instances (17G RAM, 2 CPUs).
Cassandra version is 1.0.8.
The cluster normally handles between 1500 and 3000 reads per second
(depending on the time of day) and between 800 and 1700 writes per
second, according to OpsCenter.
RF=3; no row caches are used.
Memory-relevant configs from cassandra.yaml:
flush_largest_memtables_at: 0.85
reduce_cache_sizes_at: 0.90
reduce_cache_capacity_to: 0.75
commitlog_total_space_in_mb: 4096
relevant JVM options used are:
-Xms8000M -Xmx8000M -Xmn400M
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled -XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=80
-XX:+UseCMSInitiatingOccupancyOnly
Now, what happens with these settings is that after a Cassandra
process restart GC works fine at the beginning, and the heap
usage looks like a saw with perfect teeth. Eventually the teeth
start to shrink until they become barely noticeable, and then
Cassandra starts to spend lots of CPU time doing GC. One such
cycle takes about 2 weeks, and then I need to restart the
Cassandra process to restore performance.
During all this time there are no memory-related messages in the
Cassandra system.log, except an occasional "GC for ParNew" of a
little above 200ms.
Things I've already done to try to reduce this eventual heap
pressure:
1) Tuning bloom_filter_fp_chance (allowing a higher false-positive
rate), which reduced the bloom filters from ~700MB to ~280MB total
per node, based on all Filter.db files on the node.
2) Reducing key cache sizes, and dropping the key caches of CFs
that do not have many reads.
3) Increasing the heap size from 7000M to 8000M.
None of these really helped; only the heap increase from 7000M to
8000M made a difference, extending the cycle until excessive GC
from ~9 days to ~14 days.
I've tried to graph, over time, the data that is supposed to be
in the heap vs the actual heap usage, by summing up all CFs'
bloom filter sizes + all CFs' key cache capacities multiplied by
the average key size + all CFs' reported memtable data sizes
(I've deliberately overestimated the data size a bit to be on the
safe side).
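(For illustration, that kind of per-node estimate can be pulled from
nodetool cfstats roughly like this - the field labels and the
100-byte average key size are assumptions, adjust them to your
output and data:)
nodetool -h localhost cfstats | awk -F': ' '
  /Bloom Filter Space Used/ { bloom += $2 }
  /Key cache capacity/      { keys  += $2 * 100 }  # 100 = assumed average key size in bytes
  /Memtable Data Size/      { mem   += $2 }
  END { printf "estimated heap-resident data: %.0f MB\n", (bloom + keys + mem) / 1048576 }'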
Here is a link to a graph showing the last 2 days of metrics for
a node which could not GC effectively and then had its Cassandra
process restarted:
http://awesomescreenshot.com/0401w5y534
You can clearly see that before and after the restart the size of
the data that is supposed to be in the heap is pretty much the
same, which makes me think that what I really need is GC tuning.
Also, I suppose this is not due to the total number of keys each
node holds, which is between 200 and 300 million keys per node,
summing the key estimates of all CFs on a node.
The nodes have data sizes between 45G and 75G, roughly in
proportion to their number of keys, and all nodes start having
heavy GC load after about 14 days.
Also, the excessive GC and heap usage are not affected by the
load, which varies depending on the time of day (see the
read/write rates at the beginning of the mail).
So again, based on this, I assume the problem is not due to a
large number of keys or too much load on the cluster, but rather
a pure GC misconfiguration issue.
Things I remember trying for GC tuning:
1) Changing -XX:MaxTenuringThreshold=1 to values like 8 - did
not help.
2) Adding -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing
-XX:CMSIncrementalDutyCycleMin=0
-XX:CMSIncrementalDutyCycle=10 -XX:ParallelGCThreads=2
-XX:ParallelCMSThreads=1 -
this actually made things worse.
3) Adding -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=8 -
did not help.
Also, since it takes about 2 weeks to verify that a GC setting
change did not help, trying all the possibilities is a painfully
slow process :)
I'd highly appreciate any help and hints on GC tuning.
tnx
Alex