The last run I attempted used 135GB of RAM allocated to the JVM (arguments
below), and while there are OOM errors, there is no stack trace in either
the system or debug log.  On direct-memory runs there is a stack trace; the
last direct-memory run used a 60GB heap and 60GB off-heap (that is the run
whose stack trace is attached).  The HPROF file is 135GB and I'm trying to
generate the heap information from it now, but it's been running for 2
hours.
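For reference, a couple of ways to pull a class histogram / leak report out
of an HPROF this size (just a sketch; the jhat heap size is a guess on my
part, and the dump path is the one from -XX:HeapDumpPath below):

    # jhat ships with JDK 8; it needs a working heap in the same ballpark as the dump
    jhat -J-Xmx140G /var/lib/cassandra/java_1533559747.hprof

    # or a headless Eclipse MAT parse that writes a leak-suspects report
    ./ParseHeapDump.sh /var/lib/cassandra/java_1533559747.hprof org.eclipse.mat.api:suspects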

The closest I can get to the stack trace for the 135GB heap run is below.

ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2018-08-07 00:34:13,980
> JVMStabilityInspector.java:82 - Exiting due to error while processing
> commit log during initialization.
> java.lang.OutOfMemoryError: Java heap space
> ERROR [MessagingService-Incoming-/10.1.1.11] 2018-08-07 00:34:13,980
> CassandraDaemon.java:228 - Exception in thread
> Thread[MessagingService-Incoming-/10.1.1.11,5,main]
> java.lang.OutOfMemoryError: Java heap space
> ERROR [HintsWriteExecutor:1] 2018-08-07 00:34:13,981
> CassandraDaemon.java:228 - Exception in thread
> Thread[HintsWriteExecutor:1,5,main]
> java.lang.OutOfMemoryError: Java heap space
> ERROR [MessagingService-Incoming-/10.1.1.13] 2018-08-07 00:34:13,980
> CassandraDaemon.java:228 - Exception in thread
> Thread[MessagingService-Incoming-/10.1.1.13,5,main]
> java.lang.OutOfMemoryError: Java heap space
> INFO  [Service Thread] 2018-08-07 00:34:13,982 StatusLogger.java:101 -
> system.schema_triggers                    0,0



JVM Arguments: [-Xloggc:/var/log/cassandra/gc.log, -ea,
> -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42,
> -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:StringTableSize=1000003,
> -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:+UseTLAB,
> -XX:+ResizeTLAB, -XX:+UseNUMA, -XX:+PerfDisableSharedMem,
> -Djava.net.preferIPv4Stack=true, -Xms135G, -Xmx135G, -XX:+UseParNewGC,
> -XX:+UseConcMarkSweepGC, -XX:+CMSParallelRemarkEnabled,
> -XX:SurvivorRatio=8, -XX:MaxTenuringThreshold=1,
> -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly,
> -XX:CMSWaitDuration=10000, -XX:+CMSParallelInitialMarkEnabled,
> -XX:+CMSEdenChunksRecordAlways, -XX:+CMSClassUnloadingEnabled,
> -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC,
> -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime,
> -XX:+PrintPromotionFailure, -XX:+UseGCLogFileRotation,
> -XX:NumberOfGCLogFiles=10, -XX:GCLogFileSize=10M, -Xmn2048M,
> -XX:+UseCondCardMark,
> -XX:CompileCommandFile=/etc/cassandra/hotspot_compiler,
> -javaagent:/usr/share/cassandra/lib/jamm-0.3.0.jar,
> -Dcassandra.jmx.local.port=7199,
> -Dcom.sun.management.jmxremote.authenticate=false,
> -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password,
> -Djava.library.path=/usr/share/cassandra/lib/sigar-bin,
> -Dlogback.configurationFile=logback.xml,
> -Dcassandra.logdir=/var/log/cassandra,
> -Dcassandra.storagedir=/var/lib/cassandra,
> -Dcassandra-pidfile=/var/run/cassandra/cassandra.pid,
> -XX:HeapDumpPath=/var/lib/cassandra/java_1533559747.hprof,
> -XX:ErrorFile=/var/lib/cassandra/hs_err_1533559747.log]
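
One thing I notice in that list: -XX:MaxDirectMemorySize isn't set, and as
far as I know HotSpot then defaults the direct memory limit to roughly the
-Xmx value, so the 135G heap run also allows on the order of 135G of direct
buffers.  If we wanted an explicit cap it would be a single extra line, e.g.
(hypothetical value, not something these nodes currently run):

    -XX:MaxDirectMemorySize=32G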



On Tue, Aug 7, 2018 at 11:12 AM, Jeff Jirsa <jji...@gmail.com> wrote:

> That's a direct memory OOM - it's not the heap, it's the offheap.
>
> You can see that gpsmessages.addressreceivedtime_idx
> is holding about 2GB of offheap memory (most of it for the bloom filter),
> but none of the others look like they're holding a ton offheap (either in
> bloom filter, memtable, etc).  With what JVM args are you starting
> cassandra (how much direct memory are you allocating)? Are all of your OOMs
> in direct memory?
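>
> For reference, a rough way to total that up from the cfstats output below
> (just a sketch, using the keyspace name from your output):
>
>     nodetool cfstats userinfo | grep 'Off heap memory used (total)' | awk '{s+=$NF} END {print s}'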
>
>
>
> On Tue, Aug 7, 2018 at 6:30 AM, Laszlo Szabo <
> laszlo.viktor.sz...@gmail.com> wrote:
>
>> Hi,
>>
>> Thanks for the fast response!
>>
>> We are not using any materialized views, but there are several indexes.
>> I don't have a recent heap dump, and it will be about 24 hours before I
>> can generate an interesting one, but most of the memory was allocated to
>> byte buffers, so that's not entirely helpful.
>>
>> nodetool cfstats is also below.
>>
>> I also see a lot of flushing happening, but it seems like there are too
>> many small allocations for the flushes to be effective.  Here are the messages I see:
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459
>>> ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%)
>>> on-heap, 0.014KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,460
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,461
>>> ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%)
>>> on-heap, 0.011KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465
>>> ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%)
>>> on-heap, 0.188KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%)
>>> on-heap, 0.024KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%)
>>> on-heap, 0.188KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472
>>> ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%)
>>> on-heap, 0.013KiB (0%) off-heap
>>
>>
>>>
>>
>> Stack traces from errors are below.
>>
>>
>> java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_181]
>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_181]
>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_181]
>>         at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_181]
>>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[na:1.8.0_181]
>>         at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.doFlush(BufferedDataOutputStreamPlus.java:323) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.flush(BufferedDataOutputStreamPlus.java:331) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:409) [apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:380) [apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
>>
>> ERROR [MutationStage-226] 2018-08-06 07:16:08,236 JVMStabilityInspector.java:142 - JVM state determined to be unstable. Exiting forcefully due to:
>> java.lang.OutOfMemoryError: Direct buffer memory
>>         at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_181]
>>         at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_181]
>>         at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_181]
>>         at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:139) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:104) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:40) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.Memtable.put(Memtable.java:269) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1332) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:618) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:425) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:222) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:68) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
>>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.1.jar:3.11.1]
>>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.1.jar:3.11.1]
>>         at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,459
>>> ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%)
>>> on-heap, 0.014KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,460
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,461
>>> ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%)
>>> on-heap, 0.011KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,465
>>> ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%)
>>> on-heap, 0.188KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='user_history') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:915 - Enqueuing flush of user_history: 0.000KiB (0%)
>>> on-heap, 0.024KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='tweets') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,470
>>> ColumnFamilyStore.java:915 - Enqueuing flush of tweets: 0.000KiB (0%)
>>> on-heap, 0.188KiB (0%) off-heap
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472
>>> ColumnFamilyStore.java:1305 - Flushing largest CFS(Keyspace='userinfo',
>>> ColumnFamily='gpsmessages') to free up room. Used total: 0.54/0.05, live:
>>> 0.00/0.00, flushing: 0.40/0.04, this: 0.00/0.00
>>
>> DEBUG [SlabPoolCleaner] 2018-08-06 07:16:08,472
>>> ColumnFamilyStore.java:915 - Enqueuing flush of gpsmessages: 0.000KiB (0%)
>>> on-heap, 0.013KiB (0%) off-heap
>>
>>
>> Total number of tables: 40
>> ----------------
>> Keyspace : userinfo
>> Read Count: 143301
>> Read Latency: 14.945587623254548 ms.
>> Write Count: 2754603904
>> Write Latency: 0.020883145284324698 ms.
>> Pending Flushes: 0
>> Table (index): gpsmessages.gpsmessages_addresscount_idx
>> SSTable count: 9
>> Space used (live): 19043463189
>> Space used (total): 19043463189
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 6259448
>> SSTable Compression Ratio: 0.3704785164266614
>> Number of partitions (estimate): 1025
>> Memtable cell count: 309066
>> Memtable data size: 13602774
>> Memtable off heap memory used: 0
>> Memtable switch count: 0
>> Local read count: 0
>> Local read latency: NaN ms
>> Local write count: 46025778
>> Local write latency: 0.034 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 2504
>> Bloom filter off heap memory used: 2432
>> Index summary off heap memory used: 320
>> Compression metadata off heap memory used: 6256696
>> Compacted partition minimum bytes: 43
>> Compacted partition maximum bytes: 44285675122
>> Compacted partition mean bytes: 30405277
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 0
>>
>> Table (index): gpsmessages.gpsmessages_addresses_idx
>> SSTable count: 18
>> Space used (live): 409514565570
>> Space used (total): 409514565570
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 153405673
>> SSTable Compression Ratio: 0.4447731157134059
>> Number of partitions (estimate): 6013125
>> Memtable cell count: 1110334
>> Memtable data size: 67480140
>> Memtable off heap memory used: 0
>> Memtable switch count: 0
>> Local read count: 0
>> Local read latency: NaN ms
>> Local write count: 147639252
>> Local write latency: 0.015 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 34175400
>> Bloom filter off heap memory used: 34175256
>> Index summary off heap memory used: 7432177
>> Compression metadata off heap memory used: 111798240
>> Compacted partition minimum bytes: 61
>> Compacted partition maximum bytes: 322381140
>> Compacted partition mean bytes: 36692
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 0
>>
>> Table (index): gpsmessages.addressreceivedtime_idx
>> SSTable count: 10
>> Space used (live): 52738155477
>> Space used (total): 52738155477
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 1909362628
>> SSTable Compression Ratio: 0.4106961621795128
>> Number of partitions (estimate): 1338730016
>> Memtable cell count: 308990
>> Memtable data size: 13410867
>> Memtable off heap memory used: 0
>> Memtable switch count: 0
>> Local read count: 0
>> Local read latency: NaN ms
>> Local write count: 46012614
>> Local write latency: 0.012 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 1687550888
>> Bloom filter off heap memory used: 1687550808
>> Index summary off heap memory used: 213249180
>> Compression metadata off heap memory used: 8562640
>> Compacted partition minimum bytes: 36
>> Compacted partition maximum bytes: 2759
>> Compacted partition mean bytes: 54
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 0
>>
>> Table: gpsmessages
>> SSTable count: 13
>> Space used (live): 337974446627
>> Space used (total): 337974446627
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 77833540
>> SSTable Compression Ratio: 0.5300637241381126
>> Number of partitions (estimate): 22034
>> Memtable cell count: 308904
>> Memtable data size: 72074512
>> Memtable off heap memory used: 0
>> Memtable switch count: 110
>> Local read count: 0
>> Local read latency: NaN ms
>> Local write count: 45996652
>> Local write latency: 0.281 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 67904
>> Bloom filter off heap memory used: 67800
>> Index summary off heap memory used: 11756
>> Compression metadata off heap memory used: 77753984
>> Compacted partition minimum bytes: 73
>> Compacted partition maximum bytes: 1155149911
>> Compacted partition mean bytes: 13158224
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 13699
>>
>> Table: user_history
>> SSTable count: 17
>> Space used (live): 116361158882
>> Space used (total): 116361158882
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 29562319
>> SSTable Compression Ratio: 0.5683114352331539
>> Number of partitions (estimate): 1337206
>> Memtable cell count: 773277
>> Memtable data size: 40623368
>> Memtable off heap memory used: 0
>> Memtable switch count: 57
>> Local read count: 209
>> Local read latency: NaN ms
>> Local write count: 145853733
>> Local write latency: 0.020 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 3844416
>> Bloom filter off heap memory used: 3844280
>> Index summary off heap memory used: 800991
>> Compression metadata off heap memory used: 24917048
>> Compacted partition minimum bytes: 61
>> Compacted partition maximum bytes: 464228842
>> Compacted partition mean bytes: 72182
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 66702
>>
>> Table: users
>> SSTable count: 3
>> Space used (live): 89945186
>> Space used (total): 89945186
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 2092053
>> SSTable Compression Ratio: 0.5712127629253333
>> Number of partitions (estimate): 1365645
>> Memtable cell count: 3556
>> Memtable data size: 150903
>> Memtable off heap memory used: 0
>> Memtable switch count: 42
>> Local read count: 143087
>> Local read latency: 6.094 ms
>> Local write count: 250971
>> Local write latency: 0.024 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 1709848
>> Bloom filter off heap memory used: 1709824
>> Index summary off heap memory used: 372125
>> Compression metadata off heap memory used: 10104
>> Compacted partition minimum bytes: 36
>> Compacted partition maximum bytes: 310
>> Compacted partition mean bytes: 66
>> Average live cells per slice (last five minutes): 1.0
>> Maximum live cells per slice (last five minutes): 1
>> Average tombstones per slice (last five minutes): 1.0
>> Maximum tombstones per slice (last five minutes): 1
>> Dropped Mutations: 114
>>
>> Table: tweets
>> SSTable count: 18
>> Space used (live): 1809145656486
>> Space used (total): 1809145656486
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 435915908
>> SSTable Compression Ratio: 0.5726200929451171
>> Number of partitions (estimate): 26217889
>> Memtable cell count: 710146
>> Memtable data size: 31793929
>> Memtable off heap memory used: 0
>> Memtable switch count: 399
>> Local read count: 5
>> Local read latency: NaN ms
>> Local write count: 2322829524
>> Local write latency: 0.019 ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used: 35019224
>> Bloom filter off heap memory used: 35019080
>> Index summary off heap memory used: 16454076
>> Compression metadata off heap memory used: 384442752
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 3379391
>> Compacted partition mean bytes: 124766
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 697696
>>
>> ----------------
>>
>>
>> On Mon, Aug 6, 2018 at 8:57 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>>
>>>
>>> Upgrading to 3.11.3 may fix it (there were some memory recycling bugs
>>> fixed recently), but analyzing the heap will be the best option
>>>
>>> If you can print out the heap histogram and stack trace, or open a heap
>>> dump in YourKit, VisualVM, or MAT and show us what’s at the top of the
>>> reclaimed objects, we may be able to figure out what’s going on
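>>>
>>> For the histogram, something like the following against the live process
>>> should do it (pid is a placeholder):
>>>
>>>     jmap -histo <cassandra-pid> | head -40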
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Aug 6, 2018, at 5:42 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>>
>>> Are you using materialized views or secondary indices?
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>> On Aug 6, 2018, at 3:49 PM, Laszlo Szabo <laszlo.viktor.sz...@gmail.com>
>>> wrote:
>>>
>>> Hello All,
>>>
>>> I'm having JVM unstable / OOM errors when attempting to auto bootstrap a
>>> 9th node to an existing 8 node cluster (256 tokens).  Each machine has 24
>>> cores, 148GB RAM, and 10TB of storage (2TB used).  Under normal operation the 8 nodes
>>> have JVM memory configured with Xms35G and Xmx35G, and handle 2-4 billion
>>> inserts per day.  There are never updates, deletes, or sparsely populated
>>> rows.
>>>
>>> For the bootstrap node, I've tried memory values from 35GB to 135GB in
>>> 10GB increments. I've tried using both memtable_allocation_types
>>> (heap_buffers and offheap_buffers).  I've not tried modifying the
>>> memtable_cleanup_threshold but instead have tried memtable_flush_writers
>>> from 2 to 8.  I've tried memtable_(off)heap_space_in_mb from 20000 to
>>> 60000.  I've tried both CMS and G1 garbage collection with various
>>> settings.
>>>
>>> Typically, after streaming about ~2TB of data, CPU load will hit a
>>> maximum, and the "nodetool info" heap memory will, over the course of an
>>> hour, approach the maximum.  At that point, CPU load will drop to a single
>>> thread with minimal activity until the system becomes unstable and
>>> eventually the OOM error occurs.
>>>
>>> Excerpt of the system log is below, and what I consistently see is the
>>> MemtableFlushWriter and the MemtableReclaimMemory pending queues grow as
>>> the memory becomes depleted, but the number of completed seems to stop
>>> changing a few minutes after the CPU load spikes.
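>>>
>>> The same per-pool pending/completed counters can be watched live as well,
>>> e.g. (just a convenience, it reports the same numbers StatusLogger prints):
>>>
>>>     watch -n 10 nodetool tpstats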
>>>
>>> One other data point: there seems to be a huge number of mutations
>>> that occur after most of the stream has occurred.  concurrent_writes is set
>>> at 256, with the queue getting as high as 200K before dropping down.
>>>
>>> Any suggestions for yaml changes or jvm changes?  jvm.options is
>>> currently the default with the memory set to the max; the current YAML
>>> file is below.
>>>
>>> Thanks!
>>>
>>>
>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,329 StatusLogger.java:51 -
>>>> MutationStage                     1         2      191498052         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,331 StatusLogger.java:51 -
>>>> ViewMutationStage                 0         0              0         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,338 StatusLogger.java:51 -
>>>> PerDiskMemtableFlushWriter_0         0         0           5865         0
>>>>                0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,343 StatusLogger.java:51 -
>>>> ReadStage                         0         0              0         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,347 StatusLogger.java:51 -
>>>> ValidationExecutor                0         0              0         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,360 StatusLogger.java:51 -
>>>> RequestResponseStage              0         0              8         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,380 StatusLogger.java:51 -
>>>> Sampler                           0         0              0         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,382 StatusLogger.java:51 - 
>>>> *MemtableFlushWriter
>>>>              8     74293           4716         0  *               0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,388 StatusLogger.java:51 -
>>>> ReadRepairStage                   0         0              0         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,389 StatusLogger.java:51 -
>>>> CounterMutationStage              0         0              0         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,404 StatusLogger.java:51 -
>>>> MiscStage                         0         0              0         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,407 StatusLogger.java:51 -
>>>> CompactionExecutor                8        13            493         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,410 StatusLogger.java:51 -
>>>> InternalResponseStage             0         0             16         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,413 StatusLogger.java:51 -
>>>>  *MemtableReclaimMemory             1      6066            356
>>>>  0     *            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,421 StatusLogger.java:51 -
>>>> AntiEntropyStage                  0         0              0         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,430 StatusLogger.java:51 -
>>>> CacheCleanupExecutor              0         0              0         0
>>>>            0
>>>> INFO  [ScheduledTasks:1] 2018-08-06 17:49:26,431 StatusLogger.java:51 -
>>>> PendingRangeCalculator            0         0              9         0
>>>>            0
>>>> INFO  [Service Thread] 2018-08-06 17:49:26,436 StatusLogger.java:61 -
>>>> CompactionManager                 8        19
>>>
>>>
>>>
>>>
>>>  Current Yaml
>>>
>>> num_tokens: 256
>>>
>>> hinted_handoff_enabled: true
>>>
>>> hinted_handoff_throttle_in_kb: 10240
>>>
>>> max_hints_delivery_threads: 8
>>>
>>> hints_flush_period_in_ms: 10000
>>>
>>> max_hints_file_size_in_mb: 128
>>>
>>> batchlog_replay_throttle_in_kb: 10240
>>>
>>> authenticator: AllowAllAuthenticator
>>>
>>> authorizer: AllowAllAuthorizer
>>>
>>> role_manager: CassandraRoleManager
>>>
>>> roles_validity_in_ms: 2000
>>>
>>> permissions_validity_in_ms: 2000
>>>
>>> credentials_validity_in_ms: 2000
>>>
>>> partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>>
>>> data_file_directories:
>>>
>>>     - /data/cassandra/data
>>>
>>> commitlog_directory: /data/cassandra/commitlog
>>>
>>> cdc_enabled: false
>>>
>>> disk_failure_policy: stop
>>>
>>> commit_failure_policy: stop
>>>
>>> prepared_statements_cache_size_mb:
>>>
>>> thrift_prepared_statements_cache_size_mb:
>>>
>>> key_cache_size_in_mb:
>>>
>>> key_cache_save_period: 14400
>>>
>>> row_cache_size_in_mb: 0
>>>
>>> row_cache_save_period: 0
>>>
>>> counter_cache_size_in_mb:
>>>
>>> counter_cache_save_period: 7200
>>>
>>> saved_caches_directory: /data/cassandra/saved_caches
>>>
>>> commitlog_sync: periodic
>>>
>>> commitlog_sync_period_in_ms: 10000
>>>
>>> commitlog_segment_size_in_mb: 32
>>>
>>> seed_provider:
>>>
>>>     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>>>
>>>       parameters:
>>>
>>>           - seeds: "10.1.1.11,10.1.1.12,10.1.1.13"
>>>
>>> concurrent_reads: 128
>>>
>>> concurrent_writes: 256
>>>
>>> concurrent_counter_writes: 96
>>>
>>> concurrent_materialized_view_writes: 32
>>>
>>> disk_optimization_strategy: spinning
>>>
>>> memtable_heap_space_in_mb: 61440
>>>
>>> memtable_offheap_space_in_mb: 61440
>>>
>>> memtable_allocation_type: heap_buffers
>>>
>>> commitlog_total_space_in_mb: 81920
>>>
>>> memtable_flush_writers: 8
>>>
>>>
>>>
>>
>
