[jira] [Commented] (CASSANDRA-20877) FINALIZED incremental local repair sessions are not cleaned up in case of a range movement

Dmitry Konstantinov (Jira) Sat, 27 Sep 2025 10:10:15 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023329#comment-18023329
 ]


Dmitry Konstantinov commented on CASSANDRA-20877:
-------------------------------------------------

4.0 test results: 
[https://ci-cassandra.apache.org/job/Cassandra-devbranch-before-5/2696/]

Failed tests:
dtest-novnode.commitlog_test.TestCommitLog.test_mv_lock_contention_during_replay
dtest.global_row_key_cache_test.TestGlobalRowKeyCache.test_functional
org.apache.cassandra.net.ProxyHandlerConnectionsTest.testExpireSome-cdc
dtest-novnode.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
dtest-novnode.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
dtest.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
dtest.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage

None of these failures are related to the changes.
4.1/5.0/trunk: MR and tests are in progress.

> FINALIZED incremental local repair sessions are not cleaned up in case of a 
> range movement 
> -------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20877
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Repair
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> * The system.repairs table is local to each Cassandra node.
>  * This table is cleaned up by a periodically running 
> org.apache.cassandra.repair.consistent.LocalSessions#cleanup() job.
>  * The job runs every cassandra.repair_cleanup_interval_seconds (default: 
> 10 minutes).
>  * The job should delete repair sessions in the FINALIZED state that are 
> older than cassandra.repair_delete_timeout_seconds (default: 1 day).
>  * Before a FINALIZED session is deleted, the 
> org.apache.cassandra.repair.consistent.LocalSessions#isSuperseded check is 
> executed to verify that all ranges and tables covered by the session have 
> since been re-repaired by a more recent session. If the session is not 
> superseded, its record is not deleted from the table and a log message is 
> printed:
> {code:java}
> Skipping delete of FINALIZED LocalSession {repairSessionId} because it has 
> not been superseded by a more recent session{code}
>  * The isSuperseded logic allows a repair session record to be deleted only 
> if all of the session's ranges are covered by some newer session on the node.
> If a new node is added, a set of ranges moves to it and the data for these 
> ranges is no longer repaired on the old nodes, so isSuperseded always 
> returns false for the last session executed before the node was added.
> In a big cluster where many nodes are added while incremental repair runs 
> regularly, this leaves many non-removable old records in the system.repairs 
> table. This may slow down startup of Cassandra nodes, especially if a large 
> number of tokens has historically been used on the cluster.
> A similar issue occurs with table removal: the logic considers the last 
> session executed for a removed table as non-superseded and keeps it forever.
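
The range-movement case described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual LocalSessions code: the SupersededSketch, Range and Session names are hypothetical, token ranges are modelled as plain (left, right] intervals of longs without wraparound, tables are ignored, and a range counts as covered only when a single range of a newer session contains it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical simplified model; names and logic are illustrative assumptions,
// not the real org.apache.cassandra.repair.consistent.LocalSessions code.
class SupersededSketch {

    // A token range (left, right], without wraparound (assumption).
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        // True if this range fully contains the other range.
        boolean contains(Range other) {
            return left <= other.left && other.right <= right;
        }
    }

    // A finalized repair session as recorded on one node (tables omitted).
    static final class Session {
        final long repairedAt;
        final List<Range> ranges;

        Session(long repairedAt, List<Range> ranges) {
            this.repairedAt = repairedAt;
            this.ranges = ranges;
        }
    }

    // A session is treated as superseded only if every one of its ranges is
    // contained in a range of some strictly newer session on the same node.
    static boolean isSuperseded(Session session, List<Session> allSessions) {
        for (Range range : session.ranges) {
            if (!coveredByNewer(range, session.repairedAt, allSessions)) {
                return false;
            }
        }
        return true;
    }

    private static boolean coveredByNewer(Range range, long repairedAt, List<Session> allSessions) {
        for (Session other : allSessions) {
            if (other.repairedAt <= repairedAt) {
                continue; // only newer sessions can supersede
            }
            for (Range candidate : other.ranges) {
                if (candidate.contains(range)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Last session before a new node joined: the old node owned both ranges.
        Session beforeMove = new Session(1L, Arrays.asList(new Range(0, 100), new Range(100, 200)));
        // After the join, (100, 200] belongs to the new node, so the old node
        // only ever repairs (0, 100] from now on.
        Session afterMove = new Session(2L, Arrays.asList(new Range(0, 100)));

        // (100, 200] is never covered again on this node, so the old session
        // is never considered superseded and its record is never deleted.
        System.out.println(isSuperseded(beforeMove, Arrays.asList(beforeMove, afterMove)));
    }
}
```

In this model, no later session on the old node can ever cover (100, 200], which mirrors why the pre-movement record stays in system.repairs regardless of how many repairs follow.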



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

