OK, let me try asking the question a different way... How does Cassandra use memory, and how can I plan how much is needed? I have a 1 GB memtable and a 5 GB total heap, and that's still not enough even though the number of concurrent connections and the garbage generation rate are fairly low.
If I were using MySQL or Oracle, I could compute how much memory could be used by N concurrent connections, how much is allocated for caching, temp spaces, etc. How can I do this for Cassandra? Currently it seems like the memory used scales with the number of bytes stored and not with how busy the server actually is. That's not such a good thing.

-Bryan

On Thu, Oct 18, 2012 at 11:06 AM, Bryan Talbot <btal...@aeriagames.com> wrote:

> In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11
> (64-bit), the nodes are often getting "stuck" in state where CMS
> collections of the old space are constantly running.
>
> The JVM configuration is using the standard settings in cassandra-env --
> relevant settings are included below. The max heap is currently set to 5
> GB with 800MB for new size. I don't believe that the cluster is overly
> busy and seems to be performing well enough other than this issue. When
> nodes get into this state they never seem to leave it (by freeing up old
> space memory) without restarting cassandra. They typically enter this
> state while running "nodetool repair -pr" but once they start doing this,
> restarting them only "fixes" it for a couple of hours.
>
> Compactions are completing and are generally not queued up. All CF are
> using STCS. The busiest CF consumes about 100GB of space on disk, is write
> heavy, and all columns have a TTL of 3 days. Overall, there are 41 CF
> including those used for system keyspace and secondary indexes. The number
> of SSTables per node currently varies from 185-212.
>
> Other than frequent log warnings about "GCInspector - Heap is 0.xxx
> full..." and "StorageService - Flushing CFS(...) to relieve memory
> pressure" there are no other log entries to indicate there is a problem.
>
> Does the memory needed vary depending on the amount of data stored? If
> so, how can I predict how much jvm space is needed? I don't want to make
> the heap too large as that's bad too. Maybe there's a memory leak related
> to compaction that doesn't allow meta-data to be purged?
>
>
> -Bryan
>
>
> 12 GB of RAM in host with ~6 GB used by java and ~6 GB for OS and buffer
> cache.
> $> free -m
>              total       used       free     shared    buffers     cached
> Mem:         12001      11870        131          0          4       5778
> -/+ buffers/cache:       6087       5914
> Swap:            0          0          0
>
>
> jvm settings in cassandra-env
> MAX_HEAP_SIZE="5G"
> HEAP_NEWSIZE="800M"
>
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
>
>
> jstat shows about 12 full collections per minute with old heap usage
> constantly over 75% so CMS is always over the
> CMSInitiatingOccupancyFraction threshold.
>
> $> jstat -gcutil -t 22917 5000 4
> Timestamp     S0     S1      E      O      P    YGC     YGCT    FGC     FGCT      GCT
>  132063.0  34.70   0.00  26.03  82.29  59.88  21580  506.887  17523 3078.941 3585.829
>  132068.0  34.70   0.00  50.02  81.23  59.88  21580  506.887  17524 3079.220 3586.107
>  132073.1   0.00  24.92  46.87  81.41  59.88  21581  506.932  17525 3079.583 3586.515
>  132078.1   0.00  24.92  64.71  81.40  59.88  21581  506.932  17527 3079.853 3586.785
>
>
> Other hosts not currently experiencing the high CPU load have a heap less
> than .75 full.
>
> $> jstat -gcutil -t 6063 5000 4
> Timestamp     S0     S1      E      O      P    YGC     YGCT    FGC     FGCT      GCT
>  520731.6   0.00  12.70  36.37  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520736.5   0.00  12.70  53.25  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520741.5   0.00  12.70  68.92  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520746.5   0.00  12.70  83.11  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
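
For anyone who wants to compare nodes the same way, the jstat samples and heap settings quoted above can be boiled down to a few numbers. This is only a rough sketch in Python, not anything Cassandra ships: the function names `parse_jstat` and `summarize` are made up for illustration, and it assumes that MAX_HEAP_SIZE and HEAP_NEWSIZE map directly onto the total heap and the new generation, so that the old generation is simply the difference.

```python
# Samples copied from the jstat output of the "stuck" node quoted above.
JSTAT_SAMPLE = """\
Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
132063.0 34.70 0.00 26.03 82.29 59.88 21580 506.887 17523 3078.941 3585.829
132068.0 34.70 0.00 50.02 81.23 59.88 21580 506.887 17524 3079.220 3586.107
132073.1 0.00 24.92 46.87 81.41 59.88 21581 506.932 17525 3079.583 3586.515
132078.1 0.00 24.92 64.71 81.40 59.88 21581 506.932 17527 3079.853 3586.785
"""


def parse_jstat(text):
    """Turn `jstat -gcutil -t` output into a list of {column: float} dicts."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]


def summarize(rows, heap_mb, new_mb, cms_fraction):
    """Summarize the sampled window between the first and last row.

    Assumption: old generation capacity = heap_mb - new_mb.
    """
    first, last = rows[0], rows[-1]
    elapsed_s = last["Timestamp"] - first["Timestamp"]
    old_mb = heap_mb - new_mb
    return {
        # How fast the FGC counter advances over the window.
        "fgc_per_minute": (last["FGC"] - first["FGC"]) / elapsed_s * 60.0,
        # Share of wall-clock time reported under FGCT in the window.
        "fgc_time_fraction": (last["FGCT"] - first["FGCT"]) / elapsed_s,
        "old_capacity_mb": old_mb,
        # Old-gen usage at which CMSInitiatingOccupancyFraction kicks in.
        "cms_trigger_mb": old_mb * cms_fraction / 100.0,
        # Lowest old-gen usage seen in the window, from the O column.
        "min_old_used_mb": old_mb * min(r["O"] for r in rows) / 100.0,
    }


if __name__ == "__main__":
    summary = summarize(parse_jstat(JSTAT_SAMPLE),
                        heap_mb=5120, new_mb=800, cms_fraction=75)
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

With a 5 GB heap and 800 MB new size this puts the old generation at 4320 MB and the 75% CMS trigger at 3240 MB. If the lowest old-gen usage in a window stays above the trigger, as it does in the four samples from the stuck node, CMS has no chance to go idle on that node.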