I ran into this a few months ago, and in my case I tracked it down to an issue with ZFS not unlinking commitlogs properly.
https://issues.apache.org/jira/browse/CASSANDRA-19564 On Tue, Nov 5, 2024 at 6:05 AM Dmitry Konstantinov <netud...@gmail.com> wrote: > I am speaking about a thread dump (stack traces for all threads), not a > heap dump. The heap dump should contain thread stacks info. > Thread dump (stack traces) is small and does not have sensitive info. > > Regards, > Dmitry > > On Tue, 5 Nov 2024 at 13:53, Bowen Song via user < > user@cassandra.apache.org> wrote: > >> It's about 18GB in size and may contain a huge amount of sensitive data >> (e.g. all the pending writes), so I can't share it. However, if there's any >> particular piece of information you would like to have, I'm more than happy >> to extract the info from the dump and and share it here. >> On 05/11/2024 13:01, Dmitry Konstantinov wrote: >> >> Hi Bowen, would it be possible to share a full thread dump? >> >> Regards, >> Dmitry >> >> On Tue, 5 Nov 2024 at 12:12, Bowen Song via user < >> user@cassandra.apache.org> wrote: >> >>> Hi all, >>> >>> We have a cluster running Cassandra 4.1.1. We are seeing the memtable >>> flush randomly getting stuck. This has happened twice in the last 10 days, >>> to two different nodes in the same cluster. This started to happen after we >>> enabled CDC, and each time it got stuck, there was at least one repair >>> running involving the affected node. >>> >>> The signs of the stuck memtable flush is most obvious from the >>> "StatusLogger" logs. >>> >>> At the beginning, the MemtablePostFlush and MemtableFlushWriter got >>> stuck, they have 1 and 2 active tasks each, and a small number of pending >>> tasks. >>> >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:65 - >>> Pool Name Active Pending Completed Blocked >>> All Time Blocked >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> ReadStage 0 0 34052333 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> CompactionExecutor 0 0 1019777 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> MutationStage 0 0 14930764 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> MemtableReclaimMemory 0 0 21877 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> PendingRangeCalculator 0 0 177 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> Repair#61 0 0 1344 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> GossipStage 0 0 889452 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> SecondaryIndexManagement 0 0 1 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> HintsDispatcher 0 0 19 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> Repair-Task 0 0 65 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> RequestResponseStage 0 0 44447834 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> Native-Transport-Requests 0 0 8967921 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> MigrationStage 0 0 5 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> MemtableFlushWriter 2 10 21781 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> MemtablePostFlush 1 11 47856 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> PerDiskMemtableFlushWriter_0 0 0 21769 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> Sampler 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> ValidationExecutor 0 0 36651 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> ViewBuildExecutor 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> InternalResponseStage 0 0 240 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> AntiEntropyStage 1 1773 120067 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 03:50:15,224 StatusLogger.java:69 - >>> CacheCleanupExecutor 0 0 0 >>> 0 0 >>> >>> The number of pending tasks slowly grows larger over time, and the >>> number of completed tasks do not increase at all. >>> >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:65 - >>> Pool Name Active Pending Completed Blocked >>> All Time Blocked >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> ReadStage 0 0 39905462 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> CompactionExecutor 0 0 1170100 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> MutationStage 0 0 16976992 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#76 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#74 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#72 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> GossipStage 0 0 1026877 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair-Task 0 0 83 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> RequestResponseStage 0 0 51122191 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> ValidationExecutor 0 0 36651 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> MemtablePostFlush 1 54 47856 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> MemtableFlushWriter 2 53 21781 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> CacheCleanupExecutor 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#67 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> MemtableReclaimMemory 0 0 21877 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> PendingRangeCalculator 0 0 177 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#82 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Repair#61 0 0 1344 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> SecondaryIndexManagement 0 0 1 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> HintsDispatcher 0 0 20 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Native-Transport-Requests 0 0 10145020 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> MigrationStage 0 0 5 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> Sampler 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,941 StatusLogger.java:69 - >>> PerDiskMemtableFlushWriter_0 0 0 21769 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - >>> ViewBuildExecutor 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - >>> InternalResponseStage 0 0 996 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 16:33:05,942 StatusLogger.java:69 - >>> AntiEntropyStage 1 9694 120067 >>> 0 0 >>> >>> Then, things start to get worse, the MutationStage also got stuck. >>> >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:65 - >>> Pool Name Active Pending Completed Blocked >>> All Time Blocked >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> ReadStage 0 0 40186426 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> CompactionExecutor 0 0 1171479 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> MutationStage 128 201414 16988212 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#76 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#74 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#72 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> GossipStage 0 0 1032988 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair-Task 0 0 85 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> RequestResponseStage 0 0 51307497 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> ValidationExecutor 0 0 36651 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> MemtablePostFlush 1 54 47856 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> MemtableFlushWriter 2 53 21781 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> CacheCleanupExecutor 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#67 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> MemtableReclaimMemory 0 0 21877 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#84 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> PendingRangeCalculator 0 0 177 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#82 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Repair#61 0 0 1344 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> SecondaryIndexManagement 0 0 1 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> HintsDispatcher 0 0 20 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Native-Transport-Requests 0 0 10194549 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> MigrationStage 0 0 5 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> Sampler 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> PerDiskMemtableFlushWriter_0 0 0 21769 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> ViewBuildExecutor 0 0 0 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> InternalResponseStage 0 0 1087 >>> 0 0 >>> INFO [ScheduledTasks:1] 2024-11-03 17:05:40,974 StatusLogger.java:69 - >>> AntiEntropyStage 1 10364 120067 >>> 0 0 >>> >>> Writes on this node starts to timeout and fail. But if left untouched, >>> it's only gonna get worse, and eventually lead to JVM OOM and crash. >>> >>> By inspecting the heap dump created at OOM, we can see that both of the >>> MemtableFlushWriter threads are stuck on line 1190 >>> <https://github.com/apache/cassandra/blob/8d91b469afd3fcafef7ef85c10c8acc11703ba2d/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1190> >>> in the ColumnFamilyStore.java: >>> >>> // mark writes older than the barrier as blocking progress, >>> permitting them to exceed our memory limit >>> // if they are stuck waiting on it, then wait for them all >>> to complete >>> writeBarrier.markBlocking(); >>> writeBarrier.await(); // <----------- stuck here >>> >>> And the MemtablePostFlush thread is stuck on line 1094 >>> <https://github.com/apache/cassandra/blob/8d91b469afd3fcafef7ef85c10c8acc11703ba2d/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1094> >>> in the same file. >>> >>> try >>> { >>> // we wait on the latch for the commitLogUpperBound to >>> be set, and so that waiters >>> // on this task can rely on all prior flushes being >>> complete >>> latch.await(); // <----------- stuck here >>> } >>> Our top suspect is CDC interacting with repair, since this started to >>> happen shortly after we enabled CDC on the nodes, and each time repair was >>> running. But we have not been able to reproduce this in a testing cluster, >>> and don't know what's the next step to troubleshoot this issue. So I'm >>> posting it in the mailing lists and hoping someone may know something about >>> it or point me to the right direction. >>> >>> p.s.: sorry about posting this to both the user & dev mailing lists. >>> It's an end-user related issue but involves Cassandra internals, so I can't >>> decide which one is best suited. >>> >>> Cheers, >>> Bowen >>> >>> >>> >> >> -- >> Dmitry Konstantinov >> >> > > -- > Dmitry Konstantinov >