I have a 6 node cluster running on AWS. We are using m1.large instances with heap size set to 3G.
5 of the 6 nodes seem quite healthy. The 6th one, however, is logging GCInspector GC for ConcurrentMarkSweep every 15 seconds or so. There is nothing going on on this box: no repairs and almost no user activity. But the CPU is almost continuously at 50% or more. The only messages in the log at all are these:

INFO 2014-02-17 22:58:53,429 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 213 ms for 1 collections, 1964940024 used; max is 3200253952
INFO 2014-02-17 22:59:07,431 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 250 ms for 1 collections, 1983269488 used; max is 3200253952
INFO 2014-02-17 22:59:21,522 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 280 ms for 1 collections, 1998214480 used; max is 3200253952
INFO 2014-02-17 22:59:36,527 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 305 ms for 1 collections, 2013065592 used; max is 3200253952
INFO 2014-02-17 22:59:50,529 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 334 ms for 1 collections, 2028069232 used; max is 3200253952

We don't see any of these messages on the other nodes in the cluster. We are seeing similar behaviour in both our production and QA clusters. Production is running Cassandra 1.2.9 and QA is running 1.2.13.

Here are some of the Cassandra settings that I would think might be relevant:

flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
in_memory_compaction_limit_in_mb: 64

Does anyone have any ideas why we are seeing this so selectively on one box? Any cures?

--
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
------------------
608.661.1184
john.pye...@singlewire.com
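
For anyone who wants to compare this node against the healthy ones, here is a minimal sketch of pulling the numbers out of the GCInspector lines quoted above. It assumes the log format is exactly as shown in this message; the names `parse_gc_lines` and `summarize` are made up for illustration and are not part of Cassandra.

```python
import re
from datetime import datetime

# Matches the GCInspector ConcurrentMarkSweep line format quoted in this message.
# Assumption: other Cassandra versions may log a different layout.
GC_LINE = re.compile(
    r"INFO (?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*?"
    r"GCInspector GC for ConcurrentMarkSweep: (?P<ms>\d+) ms for "
    r"(?P<n>\d+) collections, (?P<used>\d+) used; max is (?P<max>\d+)"
)

def parse_gc_lines(text):
    """Return one dict per ConcurrentMarkSweep line found in text."""
    rows = []
    for m in GC_LINE.finditer(text):
        rows.append({
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
            "pause_ms": int(m.group("ms")),
            "used": int(m.group("used")),
            "max": int(m.group("max")),
        })
    return rows

def summarize(rows):
    """Return (heap occupancy per line, seconds between consecutive lines)."""
    occupancy = [r["used"] / r["max"] for r in rows]
    gaps = [(b["time"] - a["time"]).total_seconds()
            for a, b in zip(rows, rows[1:])]
    return occupancy, gaps

# The five lines from the log excerpt in this message.
SAMPLE = """\
INFO 2014-02-17 22:58:53,429 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 213 ms for 1 collections, 1964940024 used; max is 3200253952
INFO 2014-02-17 22:59:07,431 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 250 ms for 1 collections, 1983269488 used; max is 3200253952
INFO 2014-02-17 22:59:21,522 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 280 ms for 1 collections, 1998214480 used; max is 3200253952
INFO 2014-02-17 22:59:36,527 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 305 ms for 1 collections, 2013065592 used; max is 3200253952
INFO 2014-02-17 22:59:50,529 [ScheduledTasks:1] GCInspector GC for ConcurrentMarkSweep: 334 ms for 1 collections, 2028069232 used; max is 3200253952
"""

if __name__ == "__main__":
    rows = parse_gc_lines(SAMPLE)
    occupancy, gaps = summarize(rows)
    for row, occ in zip(rows, occupancy):
        print(f"{row['time']}  {row['pause_ms']} ms  {occ:.1%} of max heap")
    print("seconds between collections:", gaps)
```

On the five lines above, this reports collections 14 to 15 seconds apart, with reported heap use rising from roughly 61% to 63% of the maximum and the collection time climbing from 213 ms to 334 ms. Running the same script over system.log from a healthy node would show whether that node ever logs these lines at all.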