Hi, 

Has anybody done any memory usage analysis for Cassandra?

How much memory does Cassandra need to manage 300 GB of data load? How much 
extra memory will be needed when doing compaction?

Regarding mmap, memory usage is determined by the OS, so it has nothing to 
do with the JVM heap size, am I right?

I have a Cassandra cluster of 13 nodes, each with 200~300 GB of data.
JVM settings:
JVM_OPTS=" \
        -ea \
        -Xms6G \
        -Xmx6G \
        -XX:TargetSurvivorRatio=90 \
        -XX:+AggressiveOpts \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:+HeapDumpOnOutOfMemoryError \
        -XX:SurvivorRatio=128 \
        -XX:MaxTenuringThreshold=0 \
        -XX:+PrintGC -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
        -Dcom.sun.management.jmxremote.port=4993 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"
KeysCache settings for the 3 column families are 5,000,000, 1,000,000, and 1,000,000.

Some nodes run for 1 to 2 days, then get very slow due to bad GC 
performance, and then crash. This happens quite a lot, almost every day. 
Here is a fragment of the gc.log:

 (concurrent mode failure): 6014591K->6014591K(6014592K), 25.4846400 secs] 
6289343K->6282274K(6289344K), [CMS Perm : 17290K->17287K(28988K)], 25.4848970 
secs] [Times: user=37.76 sys=0.12, real=25.49 secs] 
69695.771: [Full GC 69695.771: [CMS: 6014591K->6014591K(6014592K), 21.0911470 
secs] 6289343K->6282177K(6289344K), [CMS Perm : 17287K->17287K(28988K)], 
21.0913910 secs] [Times: user=21.01 sys=0.12, real=21.09 secs] 
69716.902: [GC [1 CMS-initial-mark: 6014591K(6014592K)] 6287620K(6289344K), 
0.2759980 secs] [Times: user=0.28 sys=0.00, real=0.28 secs] 
69717.178: [CMS-concurrent-mark-start]
69717.203: [Full GC 69717.203: [CMS69721.345: [CMS-concurrent-mark: 4.152/4.167 
secs] [Times: user=16.64 sys=0.01, real=4.17 secs] 
 (concurrent mode failure): 6014592K->6014591K(6014592K), 25.3649330 secs] 
6289343K->6282200K(6289344K), [CMS Perm : 17287K->17287K(28988K)], 25.3651670 
secs] [Times: user=37.67 sys=0.13, real=25.37 secs] 
69742.598: [Full GC 69742.598: [CMS: 6014591K->6014592K(6014592K), 21.0942430 
secs] 6289343K->6282398K(6289344K), [CMS Perm : 17290K->17287K(28988K)], 
21.0944950 secs] [Times: user=21.00 sys=0.12, real=21.10 secs] 
69763.721: [Full GC 69763.721: [CMS: 6014592K->6014591K(6014592K), 21.0978230 
secs] 6289343K->6282553K(6289344K), [CMS Perm : 17290K->17287K(28988K)], 
21.0980600 secs] [Times: user=20.99 sys=0.12, real=21.09 secs] 
69784.830: [GC [1 CMS-initial-mark: 6014591K(6014592K)] 6287995K(6289344K), 
0.2765360 secs] [Times: user=0.28 sys=0.00, real=0.28 secs] 
69785.107: [CMS-concurrent-mark-start]
69785.123: [Full GC 69785.123: [CMS69789.244: [CMS-concurrent-mark: 4.132/4.136 
secs] [Times: user=16.49 sys=0.03, real=4.13 secs] 
 (concurrent mode failure): 6014591K->6014591K(6014592K), 26.0883660 secs] 
6289343K->6282549K(6289344K), [CMS Perm : 17290K->17287K(28988K)], 26.0886060 
secs] [Times: user=38.28 sys=0.15, real=26.09 secs] 
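
To make the fragment above easier to read, here is a minimal Python sketch that pulls the whole-heap before/after figures out of each Full GC entry and reports how much each collection actually freed. The function name `summarize_full_gcs` and the regular expression are illustrative assumptions, not part of Cassandra or any JVM tooling, and the pattern only targets the CMS log layout shown in this post.

```python
import re

# In this log layout the whole-heap figure of a Full GC entry looks like
# "beforeK->afterK(totalK), [CMS Perm". The old-generation and perm-gen
# figures are followed by different text, so they do not match.
HEAP_RE = re.compile(r"(\d+)K->(\d+)K\((\d+)K\), \[CMS Perm")


def summarize_full_gcs(log_text):
    """Return a list of (freed_kb, occupancy_after) per Full GC entry."""
    flat = " ".join(log_text.split())  # undo the hard line wrapping
    results = []
    for before, after, total in HEAP_RE.findall(flat):
        before, after, total = int(before), int(after), int(total)
        results.append((before - after, after / total))
    return results


# First two entries copied from the gc.log fragment above.
sample = """(concurrent mode failure): 6014591K->6014591K(6014592K), 25.4846400 secs]
6289343K->6282274K(6289344K), [CMS Perm : 17290K->17287K(28988K)], 25.4848970
secs] [Times: user=37.76 sys=0.12, real=25.49 secs]
69695.771: [Full GC 69695.771: [CMS: 6014591K->6014591K(6014592K), 21.0911470
secs] 6289343K->6282177K(6289344K), [CMS Perm : 17287K->17287K(28988K)],
21.0913910 secs] [Times: user=21.01 sys=0.12, real=21.09 secs]"""

for freed_kb, occupancy in summarize_full_gcs(sample):
    print(f"freed {freed_kb} KB, heap still {occupancy:.1%} full")
```

On these two entries each Full GC frees only about 7 MB and leaves the heap about 99.9% full after pauses of more than 20 seconds, which suggests the heap is filled with live data rather than collectable garbage.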

Anybody got an idea?
