Hello,

We have a 6-node ring running Cassandra 0.8.6 on RHEL 6.1. The first node also runs OpsCenter Community. This node has crashed a few times recently with "OutOfMemoryError: Java heap space" while several compactions of a few 200-300 GB SSTables were running. We are using an 8 GB Java heap on a host with 96 GB of RAM.
I would appreciate any help in figuring out the root cause and a solution.

Feng Qu

INFO [GossipTasks:1] 2012-02-22 13:15:59,135 Gossiper.java (line 697) InetAddress /10.89.74.67 is now dead.
INFO [ScheduledTasks:1] 2012-02-22 13:16:12,114 StatusLogger.java (line 65) ReadStage 0 0 0
ERROR [CompactionExecutor:10538] 2012-02-22 13:16:12,115 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[CompactionExecutor:10538,1,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:123)
    at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:57)
    at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:664)
    at org.apache.cassandra.db.compaction.CompactionIterator.getCollatingIterator(CompactionIterator.java:92)
    at org.apache.cassandra.db.compaction.CompactionIterator.<init>(CompactionIterator.java:68)
    at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:553)
    at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:507)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:142)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:108)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
INFO [GossipTasks:1] 2012-02-22 13:16:12,115 Gossiper.java (line 697) InetAddress /10.2.128.55 is now dead.
ERROR [Thread-734] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-734,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-68450] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-68450,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-731] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-731,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-736] 2012-02-22 13:16:48,186 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-736,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-723] 2012-02-22 13:16:47,746 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-723,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
    at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)