Hi Gabe,

What you need to do is the following:
1. Adjust cassandra.yaml so that while this node is starting up it is not contacted by other nodes, e.g. set the Thrift port to 9061 and the storage port to 7001.
2. Move your commit logs into a tmp sub-folder, e.g. commitlog/tmp.
3. Copy a small number of commit logs back into the main commit log folder (be careful to copy each <id>.log and its <id>.log.header file together).
4. Start up the node. When it has successfully started, and you therefore know it has processed those commit logs, shut it down and go back to step 3.
5. When no commit logs remain in tmp, revert cassandra.yaml and restart; your node should be up again.

You might want to read http://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/

With version 0.8 you can set a global memory threshold for the memtables, so this kind of problem should be greatly reduced.

Best,
Dominic

On 20 June 2011 23:24, Gabriel Ki <gab...@gmail.com> wrote:
> Hi,
>
> Cassandra: 0.7.6-2
> I was restarting a node and ran into OOM while replaying the commit log. I
> am not able to bring the node up again.
>
> DEBUG 15:11:43,501 forceFlush requested but everything is clean    <-------- For this I don't know what to do.
> java.lang.OutOfMemoryError: Java heap space
>         at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:123)
>         at org.apache.cassandra.io.sstable.SSTableWriter$IndexWriter.<init>(SSTableWriter.java:395)
>         at org.apache.cassandra.io.sstable.SSTableWriter.<init>(SSTableWriter.java:76)
>         at org.apache.cassandra.db.ColumnFamilyStore.createFlushWriter(ColumnFamilyStore.java:2238)
>         at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:166)
>         at org.apache.cassandra.db.Memtable.access$000(Memtable.java:49)
>         at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:189)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
>
> Any help will be appreciated.
>
> If I update the schema while a node is down, the new schema is loaded
> before the flushing when the node is brought up again, correct?
>
> Thanks,
> -gabe
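PS: the batching in steps 2-4 can be sketched in shell. This is a minimal sketch, not a tested tool: `replay_batch` is a hypothetical helper, and the `CommitLog-<n>.log` names are stand-ins for whatever `<id>.log` / `<id>.log.header` pairs your node actually has. The demo at the bottom runs against throwaway directories so nothing touches a real install:

```shell
#!/bin/sh
# Steps 2-4 above: stash all commit logs, then feed them back to the
# node a few <id>.log / <id>.log.header pairs at a time.
# (Step 1, editing cassandra.yaml to move the Thrift and storage
# ports so peers leave the node alone, is done by hand before this.)

# replay_batch: move up to $3 log/header pairs from the tmp stash ($2)
# back into the live commit log directory ($1).
replay_batch() {
    live=$1; stash=$2; n=$3
    for log in $(ls "$stash"/*.log 2>/dev/null | head -n "$n"); do
        # keep each <id>.log together with its <id>.log.header (step 3)
        mv "$log" "$log.header" "$live"/
    done
}

# Demo on throwaway directories with made-up segment names:
live=$(mktemp -d)
stash=$live/tmp
mkdir -p "$stash"
for i in 1 2 3 4 5 6 7; do
    : > "$stash/CommitLog-$i.log"
    : > "$stash/CommitLog-$i.log.header"
done

replay_batch "$live" "$stash" 3
echo "pairs moved back: $(ls "$live" | grep -c '\.log$')"   # prints 3
```

In real use you would call replay_batch, start the node, wait for it to finish replaying, stop it, and repeat until the stash is empty (step 5).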