Hi,

We have a table in our production Cassandra cluster that is spread across 11369 SSTables. The average SSTable count for the other tables is around 15, and their read latency is much lower. I tried to run a manual compaction (nodetool compact my_keyspace my_table), but then the node starts spending ~90% of its time in GC and the compaction advances extremely slowly (it would take a couple of weeks to finish). I checked IO stats with "iotop" and there is almost no IO going on.
We're running Cassandra on EC2 (m1.xlarge) which has 15G of memory, using DataStax Community AMI. Our Cassandra version is 2.1.2. We didn't change Cassandra configuration from the default in the AMI, so Cassandra calculated 3760M for the heap size.

Why does Cassandra fall into this "90% CPU time in GC" state and how can I tune Cassandra so that it can finish the compaction successfully? I would appreciate any help.

You can find the output of "nodetool cfstats" and the command line with which Cassandra was started below.

Thanks,
Mikhail

java -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms3760M -Xmx3760M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/hotspot_compiler -XX:CMSWaitDuration=10000 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Djava.rmi.server.hostname=10.199.0.60 -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.8.jar -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms3760M -Xmx3760M -Xmn400M -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler -XX:CMSWaitDuration=10000 -XX:+UseCondCardMark -Djava.net.preferIPv4Stack=true -Djava.rmi.server.hostname=10.199.0.60 -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.rmi.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dlogback.configurationFile=logback.xml -Dcassandra.logdir=/var/log/cassandra -Dcassandra.storagedir= -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid <skip -cp> -XX:HeapDumpPath=/var/lib/cassandra/java_1418933869.hprof -XX:ErrorFile=/var/lib/cassandra/hs_err_1418933869.log org.apache.cassandra.service.CassandraDaemon

Keyspace: my_keyspace
    Read Count: 6026
    Read Latency: 50.02909243279124 ms.
    Write Count: 14
    Write Latency: 0.13921428571428573 ms.
    Pending Flushes: 0
        Table: my_table
        SSTable count: 11369
        Space used (live): 35.59 GB
        Space used (total): 40.42 GB
        Space used by snapshots (total): 50.15 GB
        SSTable Compression Ratio: 0.5274547291978071
        Memtable cell count: 40
        Memtable data size: 1.56 KB
        Memtable switch count: 10
        Local read count: 6026
        Local read latency: 50.030 ms
        Local write count: 14
        Local write latency: 0.140 ms
        Pending flushes: 0
        Bloom filter false positives: 3272
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 149.38 MB
        Compacted partition minimum bytes: 104 bytes
        Compacted partition maximum bytes: 86.08 KB
        Compacted partition mean bytes: 507 bytes
        Average live cells per slice (last five minutes): 2.236134174692793
        Maximum live cells per slice (last five minutes): 87.0
        Average tombstones per slice (last five minutes): 0.0
        Maximum tombstones per slice (last five minutes): 0.0
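
For a sense of scale, here is a rough back-of-the-envelope calculation from the cfstats figures above. It is only a sketch: the function names are mine, and it assumes the GB/MB figures reported by nodetool are binary units (1 GB = 1024 MB), which I have not verified.

```python
# Rough per-SSTable averages from the "nodetool cfstats" output above.
# Assumption: sizes are binary units (1 GB = 1024 MB, 1 MB = 1024 KB).

SSTABLE_COUNT = 11369
LIVE_SPACE_GB = 35.59        # "Space used (live)"
BLOOM_FILTER_MB = 149.38     # "Bloom filter space used"


def avg_sstable_size_mb(live_space_gb: float, sstable_count: int) -> float:
    """Average on-disk size of one SSTable, in MB."""
    return live_space_gb * 1024 / sstable_count


def avg_bloom_filter_kb(bloom_filter_mb: float, sstable_count: int) -> float:
    """Average bloom filter size per SSTable, in KB."""
    return bloom_filter_mb * 1024 / sstable_count


if __name__ == "__main__":
    size_mb = avg_sstable_size_mb(LIVE_SPACE_GB, SSTABLE_COUNT)
    bloom_kb = avg_bloom_filter_kb(BLOOM_FILTER_MB, SSTABLE_COUNT)
    print(f"average SSTable size: {size_mb:.2f} MB")
    print(f"average bloom filter per SSTable: {bloom_kb:.2f} KB")
```

Under that assumption the average SSTable is only about 3.2 MB, so the table consists of a very large number of very small files rather than a few large ones.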