Of course, let me explain the situation. I have a general question without direct
relation to the “Unexplained stuck memtable flush” problem. I would like to
know how I can identify the moment when all nodes across all data centers are
in sync.
* It is a little tricky to simply wait e.g. 1 day, 2 days or longer for the
synchronization to finish.
My question is: do you see an easy way to check that an expected large
synchronization (e.g. after extending the cluster by adding a new data center
or by adding 50% more nodes) has finished, from the data sync perspective?
Thanks for sharing your best practices.
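For illustration, below is the kind of check I have in mind, as a minimal
Java/JMX sketch. It is only a sketch: the host names are placeholders, and the
MBean name is my assumption based on Cassandra's standard thread-pool metrics,
so please verify it first (e.g. in jconsole).

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Sketch only: poll every node and report how much repair work is still
    // pending. The idea: the cluster is "in sync" once AntiEntropyStage (and
    // hints/streaming) report zero pending work on all nodes in all DCs.
    public class SyncCheck {
        public static void main(String[] args) throws Exception {
            String[] nodes = { "dc1-node1", "dc2-node1" }; // placeholder hosts
            for (String node : nodes) {
                JMXServiceURL url = new JMXServiceURL(
                        "service:jmx:rmi:///jndi/rmi://" + node + ":7199/jmxrmi");
                try (JMXConnector conn = JMXConnectorFactory.connect(url)) {
                    MBeanServerConnection mbs = conn.getMBeanServerConnection();
                    // Assumed MBean name, following Cassandra's metrics naming.
                    ObjectName pendingRepair = new ObjectName(
                            "org.apache.cassandra.metrics:type=ThreadPools,"
                            + "path=internal,scope=AntiEntropyStage,name=PendingTasks");
                    Number pending = (Number) mbs.getAttribute(pendingRepair, "Value");
                    System.out.printf("%s: AntiEntropyStage pending = %d%n",
                            node, pending.longValue());
                }
            }
        }
    }

I guess nodetool tpstats and nodetool netstats expose the same information; I am
mostly interested in how others decide that the cluster is fully in sync.
Regards,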
Jiri
From: Bowen Song via user <[email protected]>
Sent: Tuesday, November 5, 2024 1:39 PM
To: [email protected]
Cc: Bowen Song <[email protected]>
Subject: Re: [External]Unexplained stuck memtable flush
Hi Jiri,
Thank you for taking a look at this issue. But I'm sorry, I don't really
understand your message. Can you please elaborate?
Cheers,
Bowen
On 05/11/2024 12:34, Jiri Steuer (EIT) wrote:
Hi all,
Is there an easy way to check the moment/milestone when the data across
multiple data centers is in sync (in the case that other application and user
access is disabled)? I am thinking about monitoring the throughput, or
something else? Thanks for feedback
J. Steuer
From: Bowen Song via user <[email protected]>
Sent: Tuesday, November 5, 2024 1:12 PM
To: [email protected]; [email protected]
Cc: Bowen Song <[email protected]>
Subject: [External]Unexplained stuck memtable flush
Hi all,
We have a cluster running Cassandra 4.1.1. We are seeing the memtable flush
randomly getting stuck. This has happened twice in the last 10 days, to two
different nodes in the same cluster. This started to happen after we enabled
CDC, and each time it got stuck, there was at least one repair running
involving the affected node.
The signs of the stuck memtable flush are most obvious in the "StatusLogger"
logs.
At the beginning, the MemtablePostFlush and MemtableFlushWriter pools got
stuck, with 1 and 2 active tasks respectively and a small number of pending
tasks.
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:65 - Pool Name                     Active  Pending  Completed  Blocked  All Time Blocked
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - ReadStage                          0        0   34052333        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - CompactionExecutor                 0        0    1019777        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - MutationStage                      0        0   14930764        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - MemtableReclaimMemory              0        0      21877        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - PendingRangeCalculator             0        0        177        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - Repair#61                          0        0       1344        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - GossipStage                        0        0     889452        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - SecondaryIndexManagement           0        0          1        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - HintsDispatcher                    0        0         19        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - Repair-Task                        0        0         65        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - RequestResponseStage               0        0   44447834        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - Native-Transport-Requests          0        0    8967921        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - MigrationStage                     0        0          5        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - MemtableFlushWriter                2       10      21781        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - MemtablePostFlush                  1       11      47856        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - PerDiskMemtableFlushWriter_0       0        0      21769        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - Sampler                            0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - ValidationExecutor                 0        0      36651        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - ViewBuildExecutor                  0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - InternalResponseStage              0        0        240        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - AntiEntropyStage                   1     1773     120067        0                 0
INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - CacheCleanupExecutor               0        0          0        0                 0
The number of pending tasks slowly grows over time, while the number of
completed tasks does not increase at all.
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:65 - Pool Name                     Active  Pending  Completed  Blocked  All Time Blocked
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - ReadStage                          0        0   39905462        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - CompactionExecutor                 0        0    1170100        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - MutationStage                      0        0   16976992        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#76                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#74                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#72                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - GossipStage                        0        0    1026877        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair-Task                        0        0         83        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - RequestResponseStage               0        0   51122191        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - ValidationExecutor                 0        0      36651        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - MemtablePostFlush                  1       54      47856        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - MemtableFlushWriter                2       53      21781        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - CacheCleanupExecutor               0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#67                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - MemtableReclaimMemory              0        0      21877        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - PendingRangeCalculator             0        0        177        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#82                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Repair#61                          0        0       1344        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - SecondaryIndexManagement           0        0          1        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - HintsDispatcher                    0        0         20        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Native-Transport-Requests          0        0   10145020        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - MigrationStage                     0        0          5        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - Sampler                            0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - PerDiskMemtableFlushWriter_0       0        0      21769        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - ViewBuildExecutor                  0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - InternalResponseStage              0        0        996        0                 0
INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - AntiEntropyStage                   1     9694     120067        0                 0
Then things got worse: the MutationStage also got stuck.
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:65 - Pool Name                     Active  Pending  Completed  Blocked  All Time Blocked
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - ReadStage                          0        0   40186426        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - CompactionExecutor                 0        0    1171479        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - MutationStage                    128   201414   16988212        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#76                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#74                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#72                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - GossipStage                        0        0    1032988        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair-Task                        0        0         85        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - RequestResponseStage               0        0   51307497        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - ValidationExecutor                 0        0      36651        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - MemtablePostFlush                  1       54      47856        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - MemtableFlushWriter                2       53      21781        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - CacheCleanupExecutor               0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#67                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - MemtableReclaimMemory              0        0      21877        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#84                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - PendingRangeCalculator             0        0        177        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#82                          0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Repair#61                          0        0       1344        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - SecondaryIndexManagement           0        0          1        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - HintsDispatcher                    0        0         20        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Native-Transport-Requests          0        0   10194549        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - MigrationStage                     0        0          5        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - Sampler                            0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - PerDiskMemtableFlushWriter_0       0        0      21769        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - ViewBuildExecutor                  0        0          0        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - InternalResponseStage              0        0       1087        0                 0
INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - AntiEntropyStage                   1    10364     120067        0                 0
Writes on this node start to time out and fail. If left untouched, it only gets
worse, eventually leading to a JVM OOM and crash.
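As an aside, this state is easy to watch for automatically: the telltale sign
is a pending count that rises while the completed count stays flat. A minimal
watchdog sketch, assuming Cassandra's standard ThreadPools metrics over JMX
(the pool name, attribute names and interval would need verifying against your
version):

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;

    // Sketch: sample MemtablePostFlush twice; if completed tasks did not move
    // while pending tasks grew, the flush is likely wedged. Run inside the
    // Cassandra JVM (e.g. as an agent), or swap in a remote JMX connection.
    public class FlushWatchdog {
        private static final String POOL =
                "org.apache.cassandra.metrics:type=ThreadPools,path=internal,"
                + "scope=MemtablePostFlush,name=";

        public static void main(String[] args) throws Exception {
            MBeanServerConnection mbs = ManagementFactory.getPlatformMBeanServer();
            long completedBefore = read(mbs, "CompletedTasks");
            long pendingBefore = read(mbs, "PendingTasks");
            Thread.sleep(10 * 60 * 1000); // sample interval: 10 minutes
            long completedAfter = read(mbs, "CompletedTasks");
            long pendingAfter = read(mbs, "PendingTasks");
            if (completedAfter == completedBefore && pendingAfter > pendingBefore) {
                System.err.println("MemtablePostFlush appears stuck - alert!");
            }
        }

        private static long read(MBeanServerConnection mbs, String metric) throws Exception {
            return ((Number) mbs.getAttribute(new ObjectName(POOL + metric), "Value")).longValue();
        }
    }

nodetool tpstats exposes the same counters, if polling from a shell is easier.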
By inspecting the heap dump created at OOM, we can see that both of the
MemtableFlushWriter threads are stuck on line 1190 of ColumnFamilyStore.java
(https://github.com/apache/cassandra/blob/8d91b469afd3fcafef7ef85c10c8acc11703ba2d/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1190):

    // mark writes older than the barrier as blocking progress, permitting them to exceed our memory limit
    // if they are stuck waiting on it, then wait for them all to complete
    writeBarrier.markBlocking();
    writeBarrier.await(); // <----------- stuck here
And the MemtablePostFlush thread is stuck on line 1094 in the same file
(https://github.com/apache/cassandra/blob/8d91b469afd3fcafef7ef85c10c8acc11703ba2d/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1094):

    try
    {
        // we wait on the latch for the commitLogUpperBound to be set, and so that waiters
        // on this task can rely on all prior flushes being complete
        latch.await(); // <----------- stuck here
    }
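To make the failure mode concrete: the flush cannot proceed until every write
that started before the barrier has completed, so a single write that never
finishes is enough to wedge both flush stages forever. A conceptual sketch of
that contract (this is not Cassandra's actual OpOrder code, just an
illustration using java.util.concurrent.Phaser):

    import java.util.concurrent.Phaser;

    // Conceptual sketch only, NOT Cassandra's OpOrder implementation. It shows
    // the contract the flush relies on: every write that began before the
    // barrier must complete before the flush can continue.
    public class WriteBarrierSketch {
        // One party is pre-registered for the flusher itself.
        private final Phaser inFlightWrites = new Phaser(1);

        // Each write registers before touching the memtable and deregisters when done.
        public void write(Runnable mutation) {
            inFlightWrites.register();
            try {
                mutation.run();
            } finally {
                inFlightWrites.arriveAndDeregister();
            }
        }

        // The flush-side "barrier": blocks until every registered write has
        // arrived. A write that never completes (for whatever reason) keeps
        // this call waiting indefinitely, which is the behaviour seen above.
        public void awaitPreBarrierWrites() {
            inFlightWrites.arriveAndAwaitAdvance();
        }
    }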
Our top suspect is CDC interacting with repair, since this started to happen
shortly after we enabled CDC on the nodes, and a repair was running each time.
But we have not been able to reproduce this in a testing cluster, and we don't
know what the next step is to troubleshoot this issue. So I'm posting it to the
mailing lists, hoping someone may know something about it or can point me in
the right direction.
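In case it helps anyone hitting the same thing: the wait sites above can also
be captured live, without waiting for the OOM heap dump, with jstack, or
programmatically with something like the sketch below. It must run inside the
affected JVM (e.g. attached as an agent); the thread-name filters are
assumptions based on the pool names above.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Sketch: print the flush-related threads so the exact wait site
    // (writeBarrier.await() / latch.await()) is captured while the node is stuck.
    public class FlushStackDumper {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            // true, true: include locked monitors and ownable synchronizers,
            // which show what each thread is waiting on and who holds it.
            for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
                String name = info.getThreadName();
                if (name.contains("MemtableFlushWriter") || name.contains("MemtablePostFlush")) {
                    System.out.println(info);
                }
            }
        }
    }

In practice, jstack <pid> against the stuck node captures the same information
with less effort.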
p.s.: sorry about posting this to both the user & dev mailing lists. It's an
end-user-facing issue but involves Cassandra internals, so I couldn't decide
which list is the better fit.
Cheers,
Bowen