Hi,
we are running a cluster of 4 nodes, each with the same sizing: 2 cores, 16 GB of RAM and 1 TB of disk space.
Every node runs Cassandra 2.0.17, Oracle Java version "1.7.0_45", and CentOS 6 with kernel 2.6.32-431.17.1.el6.x86_64.
Two nodes are running just fine, but the other two have started to go OOM on every start.
This is the error we get:
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,460 StatusLogger.java (line 70) ReadRepairStage 0 0 116 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,462 StatusLogger.java (line 70) MutationStage 31 1369 20526 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,590 StatusLogger.java (line 70) ReplicateOnWriteStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:15:58,591 StatusLogger.java (line 70) GossipStage 0 0 335 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:04,195 StatusLogger.java (line 70) CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,526 StatusLogger.java (line 70) MigrationStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) MemoryMeter 1 4 26 0 0
INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) ValidationExecutor 0 0 0 0 0
DEBUG [MessagingService-Outgoing-/10.255.235.19] 2016-05-26 18:16:06,518 OutboundTcpConnection.java (line 290) attempting to connect to /10.255.235.19
INFO [GossipTasks:1] 2016-05-26 18:16:22,912 Gossiper.java (line 992) InetAddress /10.255.235.28 is now DOWN
INFO [ScheduledTasks:1] 2016-05-26 18:16:22,952 StatusLogger.java (line 70) FlushWriter 1 5 47 0 25
INFO [ScheduledTasks:1] 2016-05-26 18:16:22,953 StatusLogger.java (line 70) InternalResponseStage 0 0 0 0 0
ERROR [ReadStage:27] 2016-05-26 18:16:29,250 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:27,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
    at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
    at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:434)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:387)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1619)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1438)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:340)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:89)
    at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
ERROR [ReadStage:32] 2016-05-26 18:16:29,357 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:32,5,main]
java.lang.OutOfMemoryError: Java heap space
    (same stack trace as the one above)
We are observing that the heap is never freed: it keeps growing until it hits the limit, then the OOM errors appear and, after a short while, the node crashes.
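If it helps, we can also collect GC statistics from the failing nodes; a minimal sketch of how we could sample them (the pid lookup and the 5-second interval are just examples):

# sample GC / heap occupancy of the Cassandra JVM every 5 seconds
# (the pgrep pattern is only an example; any way of getting the pid works)
jstat -gcutil $(pgrep -f CassandraDaemon) 5000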
These are the relevant settings in cassandra-env.sh for one of the crashing nodes:
MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="200M"
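If a heap dump would help with the diagnosis, we could also add something like the following to cassandra-env.sh on the failing nodes (the dump path is just an example):

# write a heap dump when the JVM goes OOM, so we can inspect what is filling the heap
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
# example path only; any location with enough free space works
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/var/lib/cassandra/java_heapdump.hprof"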
This is the complete error log: http://pastebin.com/QGaACyhR
This is cassandra-env.sh: http://pastebin.com/6SLeVmtv
This is cassandra.yaml: http://pastebin.com/wb1axHtV
Can anyone help?
Regards,
Paolo Crosato
--
Paolo Crosato
Software engineer/Custom Solutions
e-mail: paolo.cros...@targaubiest.com