Hi all,

We have a node with a commit log directory of ~4G. During start-up of the node, while the commit log is being replayed, the used heap space grows constantly and ends with an OOM error.
The heap size and new heap size properties are 1G and 256M respectively. We are using the default settings for commitlog_sync, commitlog_sync_period_in_ms and commitlog_segment_size_in_mb.

The log shows that Cassandra is stuck on MutationStage:

Active   Pending   Completed   Blocked
16       385       196         0

The stack trace is:

ERROR [metrics-meter-tick-thread-1] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[metrics-meter-tick-thread-1,5,main]
java.lang.OutOfMemoryError: Java heap space
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.addWaiter(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(Unknown Source)
    at java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.offer(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.add(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor.reExecutePeriodic(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

ERROR [MutationStage:8] 2014-08-12 19:15:10,181 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.duplicate(Unknown Source)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:62)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:99)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
    at org.apache.cassandra.db.RangeTombstoneList.addAll(RangeTombstoneList.java:188)
    at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:219)
    at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:184)
    at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
    at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
    at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
    at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
    at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:352)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

ERROR [MutationStage:8] 2014-08-12 19:15:12,080 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:8,5,main]
java.lang.IllegalThreadStateException
    at java.lang.Thread.start(Unknown Source)
    at org.apache.cassandra.service.CassandraDaemon$2.uncaughtException(CassandraDaemon.java:204)
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.handleOrLog(DebuggableThreadPoolExecutor.java:220)
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.logExceptionsAfterExecute(DebuggableThreadPoolExecutor.java:203)
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:183)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Increasing the heap space to 2G solves the problem, but we want to know whether it could be solved without increasing the heap space. Has anyone experienced a similar problem? If so, are there any tuning options in cassandra.yaml?

Any help will be much appreciated. If you need more information, feel free to ask.

Thanks,
Jivko Donev