Hi,

We are running a cluster of 4 nodes, all with the same sizing: 2 cores, 16 GB of RAM, and 1 TB of disk space.

Every node runs Cassandra 2.0.17 and Oracle Java 1.7.0_45 on CentOS 6 with kernel 2.6.32-431.17.1.el6.x86_64.
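For completeness, these are the commands we used to confirm the versions on each node (nothing exotic):

    java -version       # Oracle Java 1.7.0_45
    uname -r            # 2.6.32-431.17.1.el6.x86_64
    nodetool version    # ReleaseVersion: 2.0.17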

Two nodes are running just fine, while the other two have started to go OOM on every start.

This is the error we get:

INFO [ScheduledTasks:1] 2016-05-26 18:15:58,460 StatusLogger.java (line 70) ReadRepairStage 0 0 116 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,462 StatusLogger.java (line 70) MutationStage 31 1369 20526 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,590 StatusLogger.java (line 70) ReplicateOnWriteStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,591 StatusLogger.java (line 70) GossipStage 0 0 335 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:04,195 StatusLogger.java (line 70) CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,526 StatusLogger.java (line 70) MigrationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) MemoryMeter 1 4 26 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) ValidationExecutor 0 0 0 0 0
DEBUG [MessagingService-Outgoing-/10.255.235.19] 2016-05-26 18:16:06,518 OutboundTcpConnection.java (line 290) attempting to connect to /10.255.235.19
INFO [GossipTasks:1] 2016-05-26 18:16:22,912 Gossiper.java (line 992) InetAddress /10.255.235.28 is now DOWN
INFO [ScheduledTasks:1] 2016-05-26 18:16:22,952 StatusLogger.java (line 70) FlushWriter 1 5 47 0 25
INFO [ScheduledTasks:1] 2016-05-26 18:16:22,953 StatusLogger.java (line 70) InternalResponseStage 0 0 0 0 0
ERROR [ReadStage:27] 2016-05-26 18:16:29,250 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:27,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
    at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:434)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:387)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1619)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1438)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:340)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:89)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
ERROR [ReadStage:32] 2016-05-26 18:16:29,357 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:32,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
    at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:434)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:387)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1619)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1438)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:340)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:89)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)

We are observing that heap usage is never reclaimed: it keeps increasing until it reaches the limit, then the OOM errors appear, and after a short while the node crashes.
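This is roughly how we have been watching the heap fill up (a sketch; the pid file path below is from our init script, adjust as needed):

    # Print per-generation utilisation and GC counters every 5 seconds.
    # Assumption: the pid file location matches your init script.
    PID=$(cat /var/run/cassandra/cassandra.pid)
    jstat -gcutil "$PID" 5000
    # In our case the old-generation column (O) only climbs and never drops.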

These are the relevant settings in cassandra-env.sh on one of the crashing nodes:

MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="200M"
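If I read cassandra-env.sh correctly, those settings translate into a fixed 6G heap (-Xms6G -Xmx6G) with a 200M young generation (-Xmn200M). As an idea on our side, not in the current config, we are thinking of adding a heap dump on OOM so we can inspect what is retaining the old generation:

    # Assumption/idea, not in our current cassandra-env.sh; the dump
    # path is arbitrary, pick a volume with enough free space:
    JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
    JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/var/lib/cassandra/java_heapdump.hprof"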

This is the complete error log: http://pastebin.com/QGaACyhR

This is cassandra-env.sh: http://pastebin.com/6SLeVmtv

This is cassandra.yaml: http://pastebin.com/wb1axHtV

Can anyone help?

Regards,

Paolo Crosato

--
Paolo Crosato
Software engineer/Custom Solutions
e-mail: paolo.cros...@targaubiest.com
