Thanks a lot Jonathan! That seems to be it, since the exact same configuration w/ the same data starts up and works fine on a different server.
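For anyone landing on this thread with the same error: the GC log lines quoted below and the "unable to create new native thread" message can be checked against each other with a few lines of Java. This is only a rough sketch and not anything from the Cassandra code base; the class name HeapVsThreadCheck and the helper occupancyAfterGc are made up for illustration. It parses the last GC line from the log (the heap is only about 45% full after that collection) and prints the JVM's own thread counts through the standard ThreadMXBean. The OS-level limit is typically per user rather than per JVM, so it has to be checked outside Java (on Linux, for instance, with `ulimit -u` in the shell of the user that runs Cassandra; that platform is an assumption, the thread does not name the OS).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class HeapVsThreadCheck {
    // Matches GC log entries such as "[GC 470141K->469436K(1046464K), 0.0383520 secs]".
    private static final Pattern GC_LINE =
            Pattern.compile("\\[GC (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    // Returns heap occupancy after the collection as a fraction of the
    // total heap size, or -1 if the line is not a GC entry of this form.
    static double occupancyAfterGc(String gcLine) {
        Matcher m = GC_LINE.matcher(gcLine);
        if (!m.find()) {
            return -1.0;
        }
        long afterKb = Long.parseLong(m.group(2));
        long totalKb = Long.parseLong(m.group(3));
        return (double) afterKb / totalKb;
    }

    public static void main(String[] args) {
        // Last line of the GC log quoted in this thread.
        String lastGcLine = "21.629: [GC 470141K->469436K(1046464K), 0.0383520 secs]";
        System.out.printf("Heap in use after last GC: %.1f%%%n",
                100 * occupancyAfterGc(lastGcLine));

        // Thread counts for this JVM only; the OS limit may count
        // threads from every process owned by the same user.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Live threads in this JVM: " + threads.getThreadCount());
        System.out.println("Peak threads in this JVM: " + threads.getPeakThreadCount());
    }
}
```

If the heap figure is well under 100% while the error still mentions native threads, that points away from heap size, which fits with raising the heap to 1.5GB making no difference.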
-Aram

On Wed, Dec 1, 2010 at 5:24 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> Stack trace looks like an OS-level thread limit causing problems, not
> actually memory.
>
> On Wed, Dec 1, 2010 at 7:05 PM, Aram Ayazyan <ayaz...@gmail.com> wrote:
>> Hi Aaron,
>>
>> OOM is happening both after the system has been running for a while as
>> well as when I restart it afterwards. The only way to make it run
>> after it has crashed is to remove everything from the data and commitlog
>> directories. Unfortunately I don't have the original log from when
>> Cassandra crashed earlier, but might have some soon if another node
>> crashes.
>>
>> This particular exception happened during start-up:
>> ERROR [main] 2010-12-01 14:58:37,795 CassandraDaemon.java (line 242) Exception encountered during startup.
>> java.lang.OutOfMemoryError: unable to create new native thread
>>         at java.lang.Thread.start0(Native Method)
>>         at java.lang.Thread.start(Thread.java:597)
>>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService.<init>(PeriodicCommitLogExecutorService.java:57)
>>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService.<init>(PeriodicCommitLogExecutorService.java:40)
>>         at org.apache.cassandra.db.commitlog.CommitLog.<init>(CommitLog.java:117)
>>         at org.apache.cassandra.db.commitlog.CommitLog.<init>(CommitLog.java:71)
>>         at org.apache.cassandra.db.commitlog.CommitLog$CLHandle.<clinit>(CommitLog.java:85)
>>         at org.apache.cassandra.db.commitlog.CommitLog.instance(CommitLog.java:80)
>>         at org.apache.cassandra.db.ColumnFamilyStore.maybeSwitchMemtable(ColumnFamilyStore.java:469)
>>         at org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:517)
>>         at org.apache.cassandra.db.Table.flush(Table.java:431)
>>         at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:291)
>>         at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:172)
>>         at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:115)
>>         at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)
>>
>> And here is the full GC log: http://pastebin.com/XGRSRcBd (all 21
>> seconds of it).
>>
>> Thank you,
>> Aram
>>
>> On Wed, Dec 1, 2010 at 4:55 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>>> Do you have a log message for the OOM? And some GC messages around it? Have
>>> you tried watching the server with jconsole?
>>> Is the OOM happening on system start or after it's been running? Or both?
>>> Do you have any row/key caches? I cannot remember if 0.6.* has this, but
>>> have you enabled the save cache feature?
>>> Aaron
>>>
>>> On 02 Dec, 2010, at 01:28 PM, Aram Ayazyan <ayaz...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> We have a small cluster of 3 Cassandra servers running w/ full
>>> replication. Every once in a while we get an OutOfMemory exception and
>>> have to restart servers. Sometimes just restarting doesn’t do it and
>>> we have to clean the commitlog or data directory.
>>>
>>> We are running Cassandra 0.6.8. There is only 1 keyspace and 3 column
>>> families. There are fewer than 1000 keys across all column families.
>>> There is roughly 1 write request per second and 1 read request. Each
>>> server is allocated 1GB. Size of all files in data directory of the
>>> only column family is ~300MB. MemtableThroughputInMB is throttled way
>>> down to 2 and BinaryMemtableThroughputInMB to 8 (w/ higher values we
>>> were running out of memory extremely fast, this way it works for a
>>> couple of days w/o crashing).
>>>
>>> Last time this issue happened, I didn’t clear the commitlog/data
>>> folders, enabled gc logging and restarted Cassandra. It crashes really
>>> fast, but what is really strange is that it seems like it still has
>>> plenty of memory when the error happens, last 3 lines from gc log:
>>> 21.408: [GC 437098K->436592K(1046464K), 0.0986800 secs]
>>> 21.520: [GC 453616K->453117K(1046464K), 0.0967770 secs]
>>> 21.629: [GC 470141K->469436K(1046464K), 0.0383520 secs]
>>> The full log is here: http://pastebin.com/XGRSRcBd
>>>
>>> I’ve tried increasing the memory up to 1.5GB, but it still doesn’t start.
>>>
>>> Any ideas what might be the problem here?
>>>
>>> Thank you,
>>> Aram
>>>
>>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com