In a 4-node cluster running Cassandra 1.1.5 on the Sun JVM 1.6.0_29-b11 (64-bit), the nodes are often getting "stuck" in a state where CMS collections of the old space are constantly running.
The JVM configuration uses the standard settings in cassandra-env -- the relevant settings are included below. The max heap is currently set to 5 GB with 800 MB for the new size. I don't believe the cluster is overly busy, and it seems to be performing well enough apart from this issue.

When nodes get into this state they never seem to leave it (by freeing up old-space memory) without restarting Cassandra. They typically enter this state while running "nodetool repair -pr", but once they start doing this, restarting them only "fixes" it for a couple of hours. Compactions are completing and are generally not queued up.

All CFs are using STCS. The busiest CF consumes about 100 GB of space on disk, is write heavy, and all columns have a TTL of 3 days. Overall, there are 41 CFs, including those used for the system keyspace and secondary indexes. The number of SSTables per node currently varies from 185 to 212.

Apart from frequent log warnings about "GCInspector - Heap is 0.xxx full..." and "StorageService - Flushing CFS(...) to relieve memory pressure", there are no log entries to indicate there is a problem.

Does the memory needed vary depending on the amount of data stored? If so, how can I predict how much JVM space is needed? I don't want to make the heap too large, as that's bad too. Maybe there's a memory leak related to compaction that doesn't allow metadata to be purged?

-Bryan

Each host has 12 GB of RAM, with ~6 GB used by Java and ~6 GB for the OS and buffer cache.

$> free -m
             total       used       free     shared    buffers     cached
Mem:         12001      11870        131          0          4       5778
-/+ buffers/cache:       6087       5914
Swap:            0          0          0

JVM settings in cassandra-env:

MAX_HEAP_SIZE="5G"
HEAP_NEWSIZE="800M"

# GC tuning options
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"

jstat shows about 12 full collections per minute with old-gen usage constantly over 75%, so CMS is always above the CMSInitiatingOccupancyFraction threshold.

$> jstat -gcutil -t 22917 5000 4
Timestamp         S0     S1     E      O      P     YGC     YGCT    FGC     FGCT      GCT
       132063.0  34.70   0.00  26.03  82.29  59.88  21580  506.887  17523  3078.941  3585.829
       132068.0  34.70   0.00  50.02  81.23  59.88  21580  506.887  17524  3079.220  3586.107
       132073.1   0.00  24.92  46.87  81.41  59.88  21581  506.932  17525  3079.583  3586.515
       132078.1   0.00  24.92  64.71  81.40  59.88  21581  506.932  17527  3079.853  3586.785

Other hosts not currently experiencing the high CPU load have an old gen less than 75% full.

$> jstat -gcutil -t 6063 5000 4
Timestamp         S0     S1     E      O      P     YGC     YGCT     FGC     FGCT      GCT
       520731.6   0.00  12.70  36.37  71.33  59.26  46453  1688.809  14785  2130.779  3819.588
       520736.5   0.00  12.70  53.25  71.33  59.26  46453  1688.809  14785  2130.779  3819.588
       520741.5   0.00  12.70  68.92  71.33  59.26  46453  1688.809  14785  2130.779  3819.588
       520746.5   0.00  12.70  83.11  71.33  59.26  46453  1688.809  14785  2130.779  3819.588
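For reference, a minimal shell sketch of how the "stuck" state can be watched for automatically is below. It only assumes the jstat column layout shown in the samples above (without -t, the old-gen occupancy O is the fourth column) and a threshold matching the CMSInitiatingOccupancyFraction setting in cassandra-env; the script name, PID argument, and sample counts are placeholders, not part of any Cassandra tooling.

#!/bin/sh
# Sketch: flag a Cassandra JVM whose CMS old gen stays above the
# CMSInitiatingOccupancyFraction threshold for a sustained stretch.
# PID argument, threshold, and sample counts are assumptions; adjust as needed.

PID=${1:?usage: $0 <cassandra-pid>}
THRESHOLD=75       # matches -XX:CMSInitiatingOccupancyFraction=75 above
NEEDED=12          # 12 consecutive samples at 5 s each = one minute
INTERVAL=5000      # sample interval in milliseconds, same as the jstat runs above

# Without -t, column 4 of `jstat -gcutil` is old-gen occupancy (O) in percent.
jstat -gcutil "$PID" "$INTERVAL" | awk -v limit="$THRESHOLD" -v need="$NEEDED" '
    NR == 1         { next }     # skip the header line
    $4 + 0 > limit  { if (++n == need)
                          print "old gen above " limit "% for " need " consecutive samples" }
    $4 + 0 <= limit { n = 0 }    # reset once CMS actually reclaims space
'

Run against the PID from the first jstat sample above it should print a warning within a minute, while staying quiet on a host like the second one whose old gen drops back under the threshold.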