Hi,

We have an application that reads data from a set of external sources and
loads it into our Cassandra cluster. The load runs fine for some time
(~24h), and then some servers in the cluster start flapping between down
and up, and finally they run out of memory.
The cluster consists of 5 m4.xlarge machines with 16 GB of memory; Cassandra
has an 8 GB heap. All machines are under high load while data is being
written, with a load between 6 and 20.

I have tried sifting through the information available from nodetool, but I
have been unable to find anything that helps me determine what is causing
the OOM. I am quite new to Cassandra, so I may well be overlooking the
obvious. Any pointers on how to proceed with identifying the problem would
be much appreciated :)

In the following I have included information from 10.61.70.110 when it was
flapping.

Status for the ddp keyspace (the only keyspace containing any real data):
-----------------------------------------------------------------------------
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.61.70.108  65.97 GB   256          59,1%             de79a554-9296-4575-8b79-2089f92069cd  rack1
UN  10.61.70.110  58.95 GB   256          63,3%             310460f6-b7ce-45a7-be63-a7dd409f6b17  rack1
UN  10.61.70.72   58.17 GB   256          60,3%             44fd4f8e-18cd-4487-8174-3a22fb9ed24f  rack1
UN  10.61.70.107  58.69 GB   256          58,5%             f8118fc2-e340-45db-a06e-a5842107d6c8  rack1
UN  10.61.70.64   68 GB      256          58,7%             84bee9fe-2adc-48aa-915c-f43d972f5a2f  rack1
-----------------------------------------------------------------------------


Snippet from system.log:
-----------------------------------------------------------------------------
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,547 MessagingService.java:980 - MUTATION messages were dropped in last 5000 ms: 5776 for internal timeout and 0 for cross node timeout
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,547 StatusLogger.java:52 - Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,551 StatusLogger.java:56 - MutationStage                    32   4881705    17826061870         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,552 StatusLogger.java:56 - ViewMutationStage                 0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,552 StatusLogger.java:56 - ReadStage                         0         0        3266887         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,553 StatusLogger.java:56 - RequestResponseStage              0         0      389429305         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,553 StatusLogger.java:56 - ReadRepairStage                   0         0         322804         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 - CounterMutationStage              0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 - MiscStage                         0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 - CompactionExecutor                4        54          31305         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 - MemtableReclaimMemory             0         0           3310         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 - PendingRangeCalculator            0         0             10         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 - GossipStage                       0         0         338170         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,555 StatusLogger.java:56 - SecondaryIndexManagement          0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,556 StatusLogger.java:56 - HintsDispatcher                   1         4           6264         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,558 StatusLogger.java:56 - MigrationStage                    0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,559 StatusLogger.java:56 - MemtablePostFlush                 0         0           3451         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,560 StatusLogger.java:56 - ValidationExecutor                0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,561 StatusLogger.java:56 - Sampler                           0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,563 StatusLogger.java:56 - MemtableFlushWriter               0         0           3310         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,564 StatusLogger.java:56 - InternalResponseStage             0         0        1873184         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,565 StatusLogger.java:56 - AntiEntropyStage                  0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,565 StatusLogger.java:56 - CacheCleanupExecutor              0         0              0         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:56 - Native-Transport-Requests         3         2      212872837         0              1796
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:66 - CompactionManager                 4        32
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,566 StatusLogger.java:78 - MessagingService                n/a       0/0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,567 StatusLogger.java:88 - Cache Type                     Size                 Capacity               KeysToSave
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:90 - KeyCache                   88654200                104857600                      all
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:96 - RowCache                          0                        0                      all
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,568 StatusLogger.java:103 - Table                       Memtable ops,data
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 - ddp.fingerprint_by_content_uuid_mv      99624,17726363
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 - ddp.sync                            4747,1551
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,569 StatusLogger.java:106 - ddp.log_portal                            0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 - ddp.meta_data                    2083,3829424
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 - ddp.configuration                   199,13915
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 - ddp.uuids_by_related_uuid     414431,17721698
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,570 StatusLogger.java:106 - ddp.file_by_file_id                       0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,571 StatusLogger.java:106 - ddp.fingerprint_by_content_type_mv       83007,3317002
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,571 StatusLogger.java:106 - ddp.heartbeat                     35855,13407
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,574 StatusLogger.java:106 - ddp.concept                               0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,575 StatusLogger.java:106 - ddp.classification_scheme                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 - ddp.rendering_relations_mv        9665,1201123
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 - ddp.file                             41,19826
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 - ddp.rendering_relations       276049,13465909
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,576 StatusLogger.java:106 - ddp.semantic_group                        0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 - ddp.rendering                  23344,68861340
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 - ddp.thesauri                             44,8
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 - ddp.file_download                         0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,577 StatusLogger.java:106 - ddp.uuids_by_related_uuid_mv     172394,14524712
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 - ddp.fingerprint               124884,54955730
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 - ddp.ddp_status                            0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 - ddp.log_portal_mv                         0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,578 StatusLogger.java:106 - system_distributed.parent_repair_history                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,579 StatusLogger.java:106 - system_distributed.repair_history                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,579 StatusLogger.java:106 - system.compaction_history             20,4595
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 - system.hints                              0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 - system.schema_aggregates                  0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,580 StatusLogger.java:106 - system.IndexInfo                          0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,581 StatusLogger.java:106 - system.schema_columnfamilies                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,581 StatusLogger.java:106 - system.schema_triggers                    0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 - system.size_estimates            50400,764904
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 - system.schema_functions                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 - system.paxos                              0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,582 StatusLogger.java:106 - system.views_builds_in_progress                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 - system.built_views                        0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 - system.peer_events                        0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 - system.range_xfers                        0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,583 StatusLogger.java:106 - system.peers                              0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,584 StatusLogger.java:106 - system.batches                188441,33975764
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,584 StatusLogger.java:106 - system.schema_keyspaces                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 - system.schema_usertypes                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 - system.local                              0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,585 StatusLogger.java:106 - system.sstable_activity            1376,24807
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,587 StatusLogger.java:106 - system.available_ranges                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,587 StatusLogger.java:106 - system.batchlog                           0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,588 StatusLogger.java:106 - system.schema_columns                     0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,589 StatusLogger.java:106 - system_schema.columns                     0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,589 StatusLogger.java:106 - system_schema.types                       0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 - system_schema.indexes                     0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 - system_schema.keyspaces                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,590 StatusLogger.java:106 - system_schema.dropped_columns                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 - system_schema.aggregates                  0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 - system_schema.triggers                    0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 - system_schema.tables                      0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,592 StatusLogger.java:106 - system_schema.views                       0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 - system_schema.functions                   0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 - system_auth.roles                         0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 - system_auth.role_members                  0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,593 StatusLogger.java:106 - system_auth.resource_role_permissons_index                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 - system_auth.role_permissions                 0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 - system_traces.sessions                    0,0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,594 StatusLogger.java:106 - system_traces.events                      0,0
-----------------------------------------------------------------------------
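
With this many StatusLogger rows it is easy to miss the one pool that is backed up. The following is only a minimal sketch, assuming the log format shown above; the function name `backed_up_pools` and the `min_pending` threshold are made up for illustration and are not part of Cassandra or nodetool. It pulls the Pending column out of the thread pool rows and lists the pools above the threshold. The regex tolerates the line wrapping introduced by mail clients.

```python
import re

# One thread pool row: pool name followed by five integer columns
# (Active, Pending, Completed, Blocked, All Time Blocked).
# \s+ also matches newlines, so rows wrapped by a mail client still parse.
POOL_ROW = re.compile(
    r"StatusLogger\.java:\d+\s+-\s+"
    r"(?P<pool>[\w-]+)\s+(?P<active>\d+)\s+(?P<pending>\d+)\s+"
    r"(?P<completed>\d+)\s+(?P<blocked>\d+)\s+(?P<all_blocked>\d+)"
)


def backed_up_pools(log_text, min_pending=1000):
    """Return (pool, pending) pairs with pending >= min_pending, largest first.

    min_pending is an arbitrary illustrative threshold, not a Cassandra default.
    """
    rows = [(m["pool"], int(m["pending"])) for m in POOL_ROW.finditer(log_text)]
    flagged = [row for row in rows if row[1] >= min_pending]
    return sorted(flagged, key=lambda row: row[1], reverse=True)


# Three rows copied from the snippet above, one of them still wrapped.
sample = """\
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,551 StatusLogger.java:56 -
MutationStage                    32   4881705    17826061870         0
            0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,552 StatusLogger.java:56 - ReadStage                         0         0        3266887         0                 0
INFO  [ScheduledTasks:1] 2016-04-12 17:35:35,554 StatusLogger.java:56 - CompactionExecutor                4        54          31305         0                 0
"""

if __name__ == "__main__":
    for pool, pending in backed_up_pools(sample):
        print(f"{pool}: {pending} pending")
```

On the rows above, the only pool over the threshold is MutationStage with 4881705 pending tasks.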

compactionstats:
-----------------------------------------------------------------------------
pending tasks: 32
                                     id   compaction type   keyspace                   table   completed       total    unit   progress
   3bf89310-00d1-11e6-a5a3-f125ce747d55        Compaction        ddp     rendering_relations     1,09 GB     5,43 GB   bytes     20,10%
   9c5bd6b1-00d4-11e6-a5a3-f125ce747d55        Compaction        ddp               meta_data   243,64 MB   338,56 MB   bytes     71,96%
   50308cb0-00d2-11e6-a5a3-f125ce747d55        Compaction        ddp   uuids_by_related_uuid     1,37 GB     2,17 GB   bytes     63,11%
   e8965d90-00c3-11e6-a5a3-f125ce747d55        Compaction        ddp               rendering    19,02 GB    24,09 GB   bytes     78,96%
-----------------------------------------------------------------------------
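
To put a number on how much compaction work is still outstanding, the rows above can be reduced to "bytes remaining" per table. This is a rough sketch only: `remaining_bytes` and `to_bytes` are hypothetical helper names, the parser assumes the human-readable layout and decimal comma shown above, and it assumes the units are binary (1 GB = 1024^3 bytes), which I have not verified.

```python
import re

# Assumption: binary units (1 KB = 1024 bytes).
UNIT_BYTES = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

# id, compaction type, keyspace, table, completed, total, "bytes".
# \s+ also matches newlines, so mail-wrapped rows still parse.
ROW = re.compile(
    r"(?P<id>[0-9a-f-]{36})\s+(?P<type>\w+)\s+(?P<keyspace>\w+)\s+(?P<table>\w+)\s+"
    r"(?P<done>[\d.,]+)\s+(?P<done_unit>[KMGT]B)\s+"
    r"(?P<total>[\d.,]+)\s+(?P<total_unit>[KMGT]B)\s+bytes"
)


def to_bytes(number, unit):
    """Convert e.g. ("1,09", "GB") to bytes; accepts a decimal comma or dot."""
    return float(number.replace(",", ".")) * UNIT_BYTES[unit]


def remaining_bytes(text):
    """Map table name -> bytes still to be compacted (total minus completed)."""
    result = {}
    for m in ROW.finditer(text):
        done = to_bytes(m["done"], m["done_unit"])
        total = to_bytes(m["total"], m["total_unit"])
        result[m["table"]] = total - done
    return result


# Two rows copied from the output above, one of them still wrapped.
sample = """\
   9c5bd6b1-00d4-11e6-a5a3-f125ce747d55        Compaction        ddp
        meta_data   243,64 MB   338,56 MB   bytes     71,96%
   e8965d90-00c3-11e6-a5a3-f125ce747d55        Compaction        ddp               rendering    19,02 GB    24,09 GB   bytes     78,96%
"""

if __name__ == "__main__":
    for table, left in remaining_bytes(sample).items():
        print(f"{table}: {left / 1024**3:.2f} GB remaining")
```

This only covers the compactions currently running; the 32 pending tasks reported at the top are not included.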

Thank you in advance :)

Yours sincerely,
  Bo Madsen
