> several compactions on few 200-300 GB SSTables
Sounds like some big files. Out of interest, how much data do you have per 
node? 
Also, do you have wide rows? You can check via nodetool cfstats. 

In cases where OOM / GC is related to compaction, these are the steps I take 
first. They are heavy-handed and will probably increase the IO load. Once you 
stabilise, you should see if you can increase the settings again.

In cassandra.yaml:
* Set concurrent_compactors to 2. This will reduce the number of concurrent 
compactions. 
* If you have wide rows, reduce in_memory_compaction_limit_in_mb to 32 or 
lower. 

(As you are on 0.8.X, also check that memtable_total_space_in_mb is enabled.)
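
As a rough sketch, the settings above might look like this in cassandra.yaml. 
The first two values are the ones suggested here. The 
memtable_total_space_in_mb value is only an illustrative assumption and should 
be sized for your own heap.

```yaml
# Limit how many compactions run at the same time.
concurrent_compactors: 2

# Rows larger than this are compacted with the slower on-disk path rather
# than in memory. Lower it if you have wide rows.
in_memory_compaction_limit_in_mb: 32

# Cap on the total heap used by memtables.
# 2048 is an assumed example value, not a recommendation from this thread.
memtable_total_space_in_mb: 2048
```

The node needs a restart to pick up changes made in cassandra.yaml.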

Hope that helps. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/02/2012, at 10:14 AM, Feng Qu wrote:

> Hello, 
> 
> We have a 6-node ring running 0.8.6 on RHEL 6.1. The first node also runs 
> OpsCenter community. This node has crashed a few times recently with 
> "OutOfMemoryError: Java heap space" while several compactions on few 200-300 
> GB SSTables were running. We are using an 8GB Java heap on a host with 96GB 
> RAM. 
> 
> I would appreciate help figuring out the root cause and a solution.
>  
> Feng Qu
> 
> 
>  INFO [GossipTasks:1] 2012-02-22 13:15:59,135 Gossiper.java (line 697) 
> InetAddress /10.89.74.67 is now dead.
>  INFO [ScheduledTasks:1] 2012-02-22 13:16:12,114 StatusLogger.java (line 65) 
> ReadStage                         0         0         0
> ERROR [CompactionExecutor:10538] 2012-02-22 13:16:12,115 
> AbstractCassandraDaemon.java (line 139) Fatal exception in thread 
> Thread[CompactionExecutor:10538,1,
> main]
> java.lang.OutOfMemoryError: Java heap space
>         at 
> org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:123)
>         at 
> org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:57)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:664)
>         at 
> org.apache.cassandra.db.compaction.CompactionIterator.getCollatingIterator(CompactionIterator.java:92)
>         at 
> org.apache.cassandra.db.compaction.CompactionIterator.<init>(CompactionIterator.java:68)
>         at 
> org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:553)
>         at 
> org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:507)
>         at 
> org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:142)
>         at 
> org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:108)
>         at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>         at java.util.concurrent.FutureTask.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
> Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>  INFO [GossipTasks:1] 2012-02-22 13:16:12,115 Gossiper.java (line 697) 
> InetAddress /10.2.128.55 is now dead.
> ERROR [Thread-734] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 
> 139) Fatal exception in thread Thread[Thread-734,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>         at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at 
> org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
>         at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
> ERROR [Thread-68450] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java 
> (line 139) Fatal exception in thread Thread[Thread-68450,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>         at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at 
> java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown 
> Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at 
> org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
>         at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
> ERROR [Thread-731] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 
> 139) Fatal exception in thread Thread[Thread-731,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>         at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at 
> java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown 
> Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at 
> org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
>         at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
> ERROR [Thread-736] 2012-02-22 13:16:48,186 AbstractCassandraDaemon.java (line 
> 139) Fatal exception in thread Thread[Thread-736,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>         at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at 
> org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
>         at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
> ERROR [Thread-723] 2012-02-22 13:16:47,746 AbstractCassandraDaemon.java (line 
> 139) Fatal exception in thread Thread[Thread-723,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
> down
>         at 
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at 
> java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown 
> Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at 
> org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
> 
