Hi,

On Linux with Cassandra version 2.0.2 I had an OOM after a heavy load followed by some 15 days of idle running (not exactly idle, but very, very low activity). Two of the four machines in the cluster had this OOM.
I checked the heap dump (9 GB) and it tells me:

One instance of "java.util.concurrent.ScheduledThreadPoolExecutor" loaded by "<system class loader>" occupies 8.927.175.368 (94,53%) bytes. The instance is referenced by org.apache.cassandra.io.sstable.SSTableReader @ 0x7fadf89e0, loaded by "sun.misc.Launcher$AppClassLoader @ 0x683e6ad30". The memory is accumulated in one instance of "java.util.concurrent.RunnableScheduledFuture[]" loaded by "<system class loader>".

So I inspected the SSTableReader instance and found that its 'ScheduledThreadPoolExecutor syncExecutor' object is holding about 600k ScheduledFutureTasks. According to the SSTableReader code, these tasks must have been created by the call to syncExecutor.scheduleAtFixedRate. That suggests that none of these tasks ever gets to run because one (and only one) earlier task is probably blocking. Then again, the single thread that executes these tasks seems to be in a 'normal' state at the time of the OOM and is executing with the stack trace below:

Thread 0x696777eb8
  at org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/AtomicSortedColumns; (AtomicSortedColumns.java:58)
  at org.apache.cassandra.db.AtomicSortedColumns$1.create(Lorg/apache/cassandra/config/CFMetaData;Z)Lorg/apache/cassandra/db/ColumnFamily; (AtomicSortedColumns.java:55)
  at org.apache.cassandra.db.ColumnFamily.cloneMeShallow(Lorg/apache/cassandra/db/ColumnFamily$Factory;Z)Lorg/apache/cassandra/db/ColumnFamily; (ColumnFamily.java:70)
  at org.apache.cassandra.db.Memtable.resolve(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (Memtable.java:187)
  at org.apache.cassandra.db.Memtable.put(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (Memtable.java:158)
  at org.apache.cassandra.db.ColumnFamilyStore.apply(Lorg/apache/cassandra/db/DecoratedKey;Lorg/apache/cassandra/db/ColumnFamily;Lorg/apache/cassandra/db/index/SecondaryIndexManager$Updater;)V (ColumnFamilyStore.java:840)
  at org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;ZZ)V (Keyspace.java:373)
  at org.apache.cassandra.db.Keyspace.apply(Lorg/apache/cassandra/db/RowMutation;Z)V (Keyspace.java:338)
  at org.apache.cassandra.db.RowMutation.apply()V (RowMutation.java:201)
  at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(Lorg/apache/cassandra/service/QueryState;)Lorg/apache/cassandra/transport/messages/ResultMessage; (ModificationStatement.java:477)
  at org.apache.cassandra.cql3.QueryProcessor.processInternal(Ljava/lang/String;)Lorg/apache/cassandra/cql3/UntypedResultSet; (QueryProcessor.java:178)
  at org.apache.cassandra.db.SystemKeyspace.persistSSTableReadMeter(Ljava/lang/String;Ljava/lang/String;ILorg/apache/cassandra/metrics/RestorableMeter;)V (SystemKeyspace.java:938)
  at org.apache.cassandra.io.sstable.SSTableReader$2.run()V (SSTableReader.java:342)
  at java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; (Executors.java:471)
  at java.util.concurrent.FutureTask.runAndReset()Z (FutureTask.java:304)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Ljava/util/concurrent/ScheduledThreadPoolExecutor$ScheduledFutureTask;)Z (ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V (ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:615)
  at java.lang.Thread.run()V (Thread.java:724)
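To illustrate the kind of pile-up I mean, here is a small standalone program I put together (my own sketch, not Cassandra code; the class name is made up and the 600k count is just taken from the numbers above). A single-threaded ScheduledThreadPoolExecutor whose running task blocks simply keeps every other scheduled task parked in its internal queue:

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SyncExecutorPileUp
{
    public static void main(String[] args) throws Exception
    {
        // One worker thread, like SSTableReader's syncExecutor
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

        // The first periodic task blocks forever, standing in for a task
        // that is stuck (for example behind a rate limiter or a slow write)
        executor.scheduleAtFixedRate(new Runnable()
        {
            public void run()
            {
                try
                {
                    Thread.sleep(Long.MAX_VALUE);
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
            }
        }, 0, 1, TimeUnit.SECONDS);

        // Many more periodic tasks, standing in for one read-meter sync
        // task per SSTableReader instance
        Runnable noop = new Runnable() { public void run() {} };
        for (int i = 0; i < 600000; i++)
            executor.scheduleAtFixedRate(noop, 0, 1, TimeUnit.SECONDS);

        Thread.sleep(5000);

        // Every one of those tasks is still sitting in the executor's
        // internal RunnableScheduledFuture[] queue, because the single
        // worker thread never becomes free to run them
        System.out.println("queued tasks: " + executor.getQueue().size());
        executor.shutdownNow();
    }
}

The queue should still report all 600,000 tasks after the sleep, which is the same RunnableScheduledFuture[] accumulation the heap dump shows, only here the tasks retain almost nothing.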
Since each of these tasks is throttled by meterSyncThrottle.acquire(), I suspect the RateLimiter is introducing the delay. The RateLimiter instance attributes are:

Type   | Name                 | Value
long   | nextFreeTicketMicros | 3016022567383
double | maxPermits           | 100.0
double | storedPermits        | 99.0
long   | offsetNanos          | 334676357831746

My guess is that these attributes in practice result in blocking behaviour: if nextFreeTicketMicros lies that far ahead of the limiter's clock, acquire() will sleep until it is reached, keeping the single worker thread stuck while the queued tasks pile up, and eventually resulting in the OOM.

Can someone make sense of this? I hope it helps in finding the cause of the problem, and maybe it can be avoided in the future. I still have the heap dump, so I can provide more information if needed.

Regards,
Ignace Desimpel
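P.S. For completeness, a second standalone sketch of the throttling behaviour I suspect, assuming the throttle is Guava's RateLimiter created at 100 permits/second (the maxPermits value above); again this is my own test code, not taken from Cassandra:

import com.google.common.util.concurrent.RateLimiter;

public class MeterSyncThrottleDemo
{
    public static void main(String[] args)
    {
        // 100 permits per second, matching the maxPermits value in the dump
        RateLimiter limiter = RateLimiter.create(100.0);

        int acquires = 500;
        long start = System.nanoTime();

        // Once the ~100 stored permits are used up, every further acquire()
        // sleeps for about 10 ms before returning
        for (int i = 0; i < acquires; i++)
            limiter.acquire();

        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d acquires took %.1f s (about %.0f ms each)%n",
                          acquires, seconds, 1000.0 * seconds / acquires);

        // At roughly 10 ms per acquire, a single worker thread needs on the
        // order of 600000 / 100 = 6000 seconds just to work through a backlog
        // of 600k tasks even when the limiter's clock is healthy; with
        // nextFreeTicketMicros far in the future it would never catch up
    }
}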