What kind of disks are you running here? Are you getting a lot of GC before
the OOM?
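
If it helps, here is a rough sketch for pulling GC pause times out of system.log. It assumes the GCInspector lines report each pause as a number followed by "ms", which may differ between versions. The function name, the 200 ms threshold, and the regex are made up for illustration and are not part of Cassandra.

```python
import re

# Assumption: GCInspector lines in system.log report each pause as
# "<number>ms" or "<number> ms". Adjust the pattern if your format differs.
GC_PAUSE = re.compile(r"GCInspector.*?(\d+)\s?ms")


def gc_pauses(lines, threshold_ms=200):
    """Return the GC pause durations (in ms) at or above threshold_ms."""
    pauses = []
    for line in lines:
        match = GC_PAUSE.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            pauses.append(int(match.group(1)))
    return pauses
```

Passing an open log file to gc_pauses() returns the long pauses in order, so a run of large values just before the OOM timestamp would suggest the heap was under GC pressure for a while rather than exhausted by a single read.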

Patrick

On Wed, Mar 4, 2015 at 9:26 AM, Jan <cne...@yahoo.com> wrote:

> Hi Roni,
>
> You mentioned:
> DC1 servers have 32GB of RAM and 10GB of HEAP. DC2 machines have 16GB of
> RAM and 5GB HEAP.
>
> Best practice would be to:
> a)  have a consistent type of node across both DCs (CPUs, memory, heap
> and disk)
> b)  increase the C* heap on the DC2 servers to 8GB
>
> This does not address the leveled compaction issue.
> Hope this helps.
>
> Jan/
>
>
>
>
>   On Wednesday, March 4, 2015 8:41 AM, Roni Balthazar <
> ronibaltha...@gmail.com> wrote:
>
>
> Hi there,
>
> We are running a C* 2.1.3 cluster with 2 datacenters: DC1: 30 servers /
> DC2: 10 servers.
> DC1 servers have 32GB of RAM and 10GB of HEAP. DC2 machines have 16GB
> of RAM and 5GB HEAP.
> DC1 nodes have about 1.4TB of data and DC2 nodes 2.3TB.
> DC2 is used only for backup purposes. There are no reads on DC2.
> All writes and reads are on DC1 using LOCAL_ONE, and the RF is DC1: 2 and
> DC2: 1.
> All keyspaces use STCS (on average 20~30 SSTables per table on
> both DCs) except one that uses LCS (DC1: avg 4K~7K SSTables / DC2:
> avg 3K~14K SSTables).
>
> Basically we are running into 2 problems:
>
> 1) High SSTable count on the keyspace using LCS (this KS has 500GB~600GB
> of data on each DC1 node).
> 2) 2 servers in DC1 and 4 servers in DC2 went down with
> the OOM error message below:
>
> ERROR [SharedPool-Worker-111] 2015-03-04 05:03:26,394
> JVMStabilityInspector.java:94 - JVM state determined to be unstable.
> Exiting forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
>         at
> org.apache.cassandra.db.composites.CompoundSparseCellNameType.copyAndMakeWith(CompoundSparseCellNameType.java:186)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.composites.AbstractCompoundCellNameType$CompositeDeserializer.readNext(AbstractCompoundCellNameType.java:286)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.AtomDeserializer.readNext(AtomDeserializer.java:104)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:426)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:350)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:142)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:44)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> ~[guava-16.0.jar:na]
>         at
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> ~[guava-16.0.jar:na]
>         at
> org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:172)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:155)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> ~[guava-16.0.jar:na]
>         at
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> ~[guava-16.0.jar:na]
>         at
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:203)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:107)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:81)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:69)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:320)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1915)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1748)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:342)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:57)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1486)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2171)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_31]
>         at
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>         at
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105)
> ~[apache-cassandra-2.1.3.jar:2.1.3]
>
> So I am asking how to debug this issue and what the best practices
> are in this situation.
>
> Regards,
>
> Roni
>
>
>
