Hi Paolo,

a) was there any large insertion done?
b) are there a lot of files in the saved_caches directory?
c) would you consider increasing HEAP_NEWSIZE to, say, 1200M? (rough sketch below)
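For (b) and (c), a rough sketch of what I mean is below. The paths assume a
default package install (/var/lib/cassandra for data, /etc/cassandra/conf for
config), so please adjust them to your actual layout:

    # (b) count the cache files that have piled up
    ls -l /var/lib/cassandra/saved_caches | wc -l

    # (c) in /etc/cassandra/conf/cassandra-env.sh, keep the max heap as it is
    #     and raise only the young generation size
    MAX_HEAP_SIZE="6G"
    HEAP_NEWSIZE="1200M"

    # restart the node, then keep an eye on heap usage
    nodetool info | grep -i heap

The idea behind (c) is that a 200M new generation on a 6G heap is quite small,
so short-lived read/write garbage tends to get promoted into the old generation
early and the heap only grows until a full GC or an OOM.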


Regards,
Mike Yeap

On Fri, May 27, 2016 at 12:39 AM, Paolo Crosato <paolo.cros...@targaubiest.com> wrote:

> Hi,
>
> we are running a cluster of 4 nodes, each with the same sizing: 2 cores, 16 GB of RAM and 1 TB of disk space.
>
> On every node we are running Cassandra 2.0.17, Oracle Java version "1.7.0_45", and CentOS 6 with kernel 2.6.32-431.17.1.el6.x86_64.
>
> Two nodes are running just fine, but the other two have started to go OOM on every startup.
>
> This is the error we get:
>
> INFO [ScheduledTasks:1] 2016-05-26 18:15:58,460 StatusLogger.java (line 70) ReadRepairStage                   0         0            116         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:15:58,462 StatusLogger.java (line 70) MutationStage                    31      1369          20526         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:15:58,590 StatusLogger.java (line 70) ReplicateOnWriteStage             0         0              0         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:15:58,591 StatusLogger.java (line 70) GossipStage                       0         0            335         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:16:04,195 StatusLogger.java (line 70) CacheCleanupExecutor              0         0              0         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:16:06,526 StatusLogger.java (line 70) MigrationStage                    0         0              0         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) MemoryMeter                       1         4             26         0                 0
> INFO [ScheduledTasks:1] 2016-05-26 18:16:06,527 StatusLogger.java (line 70) ValidationExecutor                0         0              0         0                 0
> DEBUG [MessagingService-Outgoing-/10.255.235.19] 2016-05-26 18:16:06,518 OutboundTcpConnection.java (line 290) attempting to connect to /10.255.235.19
> INFO [GossipTasks:1] 2016-05-26 18:16:22,912 Gossiper.java (line 992) InetAddress /10.255.235.28 is now DOWN
> INFO [ScheduledTasks:1] 2016-05-26 18:16:22,952 StatusLogger.java (line 70) FlushWriter                       1         5             47         0                25
> INFO [ScheduledTasks:1] 2016-05-26 18:16:22,953 StatusLogger.java (line 70) InternalResponseStage             0         0              0         0                 0
> ERROR [ReadStage:27] 2016-05-26 18:16:29,250 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:27,5,main]
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
>     at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
>     at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
>     at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
>     at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
>     at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:434)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:387)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
>     at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
>     at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
>     at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
>     at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
>     at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
>     at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
>     at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1619)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1438)
>     at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:340)
>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:89)
>     at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
> ERROR [ReadStage:32] 2016-05-26 18:16:29,357 CassandraDaemon.java (line 258) Exception in thread Thread[ReadStage:32,5,main]
> java.lang.OutOfMemoryError: Java heap space
>     at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
>     at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
>     at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
>     at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
>     at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
>     at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at com.google.common.collect.AbstractIterator.next(AbstractIterator.java:153)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:434)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:387)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
>     at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>     at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
>     at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
>     at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
>     at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
>     at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
>     at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
>     at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
>     at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
>     at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
>     at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1619)
>     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1438)
>     at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:340)
>     at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:89)
>     at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:47)
>
> We are observing that the heap is never freed: it keeps increasing until it reaches the limit, then the OOM errors appear and, after a short while, the node crashes.
>
> These are the relevant settings in cassandra_env for one of the crashing
> nodes:
>
> MAX_HEAP_SIZE="6G"
> HEAP_NEWSIZE="200M"
>
> This is the complete error log http://pastebin.com/QGaACyhR
>
> This is cassandra_env http://pastebin.com/6SLeVmtv
>
> This is cassandra.yaml http://pastebin.com/wb1axHtV
>
> Can anyone help?
>
> Regards,
>
> Paolo Crosato
>
> --
> Paolo Crosato
> Software engineer/Custom Solutions
> e-mail: paolo.cros...@targaubiest.com
>
>
