0.5 has a bug that allows it to OOM itself by replaying the commit log too fast. You should upgrade to 0.6.1.

On Mon, Apr 26, 2010 at 12:11 PM, elsif <elsif.t...@gmail.com> wrote:
>
> Hello. I have a six node cassandra cluster running on modest hardware
> with 1G of heap assigned to cassandra. After inserting about 245
> million rows of data, cassandra failed with a
> java.lang.OutOfMemoryError: Java heap space error. I raised the java
> heap to 2G, but still get the same error when trying to restart cassandra.
>
> I am using Cassandra 0.5.1 with Sun jre1.6.0_18.
>
> Any thoughts on how to resolve this issue are greatly appreciated.
>
> Here are log excerpts from two of the nodes:
>
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490
> SliceQueryFilter.java (line 116) collecting SuperColumn(dcf9f19e
> [0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490
> SliceQueryFilter.java (line 116) collecting SuperColumn(dd04bf9c
> [0a011d0c,0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490
> SliceQueryFilter.java (line 116) collecting SuperColumn(dd08981a
> [0a011d0c,0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490
> SliceQueryFilter.java (line 116) collecting SuperColumn(dd7f7ac9
> [0a011d0c,0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,490
> SliceQueryFilter.java (line 116) collecting SuperColumn(dde1d4cf
> [0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491
> SliceQueryFilter.java (line 116) collecting SuperColumn(de32aec3
> [0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491
> SliceQueryFilter.java (line 116) collecting SuperColumn(de378105
> [0a011d0c,0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491
> SliceQueryFilter.java (line 116) collecting SuperColumn(deb5d591
> [0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491
> SliceQueryFilter.java (line 116) collecting SuperColumn(ded75dee
> [0a011d0c,0a011d0d,])
> DEBUG [HINTED-HANDOFF-POOL:1] 2010-04-23 16:19:20,491
> SliceQueryFilter.java (line 116) collecting SuperColumn(defe3445
> [0a011d0c,0a011d0d,])
> INFO [FLUSH-TIMER] 2010-04-23 16:20:00,071 ColumnFamilyStore.java (line
> 393) IpTag has reached its threshold; switching in a fresh Memtable
> INFO [FLUSH-TIMER] 2010-04-23 16:20:00,072 ColumnFamilyStore.java (line
> 1035) Enqueuing flush of Memtable(IpTag)@7816
> INFO [FLUSH-SORTER-POOL:1] 2010-04-23 16:20:00,072 Memtable.java (line
> 183) Sorting Memtable(IpTag)@7816
> INFO [FLUSH-WRITER-POOL:1] 2010-04-23 16:20:00,107 Memtable.java (line
> 192) Writing Memtable(IpTag)@7816
> DEBUG [Timer-0] 2010-04-23 16:20:00,130 LoadDisseminator.java (line 39)
> Disseminating load info ...
> ERROR [ROW-MUTATION-STAGE:41] 2010-04-23 16:20:00,348
> CassandraDaemon.java (line 71) Fatal exception in thread
> Thread[ROW-MUTATION-STAGE:41,5,main]
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.Arrays.copyOfRange(Unknown Source)
>     at java.lang.String.<init>(Unknown Source)
>     at java.lang.StringBuilder.toString(Unknown Source)
>     at
> org.apache.cassandra.db.marshal.AbstractType.getColumnsString(AbstractType.java:87)
>     at
> org.apache.cassandra.db.ColumnFamily.toString(ColumnFamily.java:344)
>     at
> org.apache.commons.lang.ObjectUtils.toString(ObjectUtils.java:241)
>     at org.apache.commons.lang.StringUtils.join(StringUtils.java:3073)
>     at org.apache.commons.lang.StringUtils.join(StringUtils.java:3133)
>     at
> org.apache.cassandra.db.RowMutation.toString(RowMutation.java:263)
>     at java.lang.String.valueOf(Unknown Source)
>     at java.lang.StringBuilder.append(Unknown Source)
>     at
> org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46)
>     at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:38)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
> Source)
>     at java.lang.Thread.run(Unknown Source)
>
> ---
>
> DEBUG [main] 2010-04-23 17:15:45,501 CommitLog.java (line 312) Reading
> mutation at 57527476
> DEBUG [main] 2010-04-23 17:16:11,375 CommitLog.java (line 340) replaying
> mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5c0,])}
> DEBUG [main] 2010-04-23 17:16:45,293 CommitLog.java (line 312) Reading
> mutation at 57527686
> DEBUG [main] 2010-04-23 17:16:45,294 CommitLog.java (line 340) replaying
> mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
> DEBUG [main] 2010-04-23 17:16:54,311 CommitLog.java (line 312) Reading
> mutation at 57527919
> DEBUG [main] 2010-04-23 17:17:46,344 CommitLog.java (line 340) replaying
> mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c5fb,])}
> DEBUG [main] 2010-04-23 17:17:55,530 CommitLog.java (line 312) Reading
> mutation at 57528129
> DEBUG [main] 2010-04-23 17:18:20,266 CommitLog.java (line 340) replaying
> mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
> DEBUG [main] 2010-04-23 17:18:38,273 CommitLog.java (line 312) Reading
> mutation at 57528362
> DEBUG [main] 2010-04-23 17:21:53,966 CommitLog.java (line 340) replaying
> mutation for system.Tracking: {ColumnFamily(HintsColumnFamily [7af4c607,])}
> DEBUG [main] 2010-04-23 17:24:48,032 CommitLog.java (line 312) Reading
> mutation at 57528572
> ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,932
> CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP
> Connection(idle),5,RMI Runtime]
> java.lang.OutOfMemoryError: Java heap space
> ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,952
> CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP
> Connection(idle),5,RMI Runtime]
> java.lang.OutOfMemoryError: Java heap space
> ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,952
> CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP
> Connection(idle),5,RMI Runtime]
> java.lang.OutOfMemoryError: Java heap space
>     at java.io.BufferedInputStream.<init>(Unknown Source)
>     at java.io.BufferedInputStream.<init>(Unknown Source)
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
>     at
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown
> Source)
>     at java.lang.Thread.run(Unknown Source)
> ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,966
> CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP
> Connection(idle),5,RMI Runtime]
> java.lang.OutOfMemoryError: Java heap space
> ERROR [main] 2010-04-23 17:36:38,966 CassandraDaemon.java (line 184)
> Exception encountered during startup.
> java.lang.OutOfMemoryError: Java heap space
> ERROR [RMI TCP Connection(idle)] 2010-04-23 17:36:38,981
> CassandraDaemon.java (line 71) Fatal exception in thread Thread[RMI TCP
> Connection(idle),5,RMI Runtime]
> java.lang.OutOfMemoryError: Java heap space
>
>
> Here is my current configuration:
>
> <Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>
> <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
> <ReplicationFactor>3</ReplicationFactor>
> <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>
> <CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
> <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
> <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
> <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
> <MemtableSizeInMB>64</MemtableSizeInMB>
> <MemtableObjectCountInMillions>0.1</MemtableObjectCountInMillions>
> <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
> <ConcurrentReads>8</ConcurrentReads>
> <ConcurrentWrites>100</ConcurrentWrites>
> <CommitLogSync>periodic</CommitLogSync>
> <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
> <GCGraceSeconds>864000</GCGraceSeconds>
> <BinaryMemtableSizeInMB>256</BinaryMemtableSizeInMB>
>
> Ring status:
>
> Address       Status   Load       Range   Ring
>                                   f
> 10.1.29.12    Down     7.26 GB    0       |<--|
> 10.1.29.13    Up       3.97 GB    3       |   ^
> 10.1.29.14    Up       7.73 GB    6       v   |
> 10.1.29.15    Down     14.27 GB   9       |   ^
> 10.1.29.16    Up       15.42 GB   c       v   |
> 10.1.29.17    Down     12.67 GB   f       |-->|
>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com