[
https://issues.apache.org/jira/browse/CASSANDRA-20877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023329#comment-18023329
]
Dmitry Konstantinov commented on CASSANDRA-20877:
-------------------------------------------------
4.0 test results:
[https://ci-cassandra.apache.org/job/Cassandra-devbranch-before-5/2696/]
Failed tests:
dtest-novnode.commitlog_test.TestCommitLog.test_mv_lock_contention_during_replay
dtest.global_row_key_cache_test.TestGlobalRowKeyCache.test_functional
org.apache.cassandra.net.ProxyHandlerConnectionsTest.testExpireSome-cdc
dtest-novnode.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
dtest.cqlsh_tests.test_cqlsh.TestCqlsh.test_unicode_invalid_request_error
org.apache.cassandra.distributed.upgrade.DropCompactStorageTest.testDropCompactStorage
None of these failures are related to the changes.
4.1/5.0/trunk - MR and tests are in progress.
> FINALIZED incremental local repair sessions are not cleaned up in case of a
> range movement
> -------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20877
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> * The system.repairs table is local to each Cassandra node.
> * This table is cleaned up by a periodically running
> org.apache.cassandra.repair.consistent.LocalSessions#cleanup() job.
> * The job runs every cassandra.repair_cleanup_interval_seconds (default:
> 10 minutes).
> * The job should delete repair sessions in the FINALIZED state that are older
> than cassandra.repair_delete_timeout_seconds (default: 1 day).
> * Before a FINALIZED session is deleted, the
> org.apache.cassandra.repair.consistent.LocalSessions#isSuperseded check is
> executed to verify that all ranges and tables covered by the session
> have since been re-repaired by a more recent session. If the session is not
> superseded, its deletion from the table is skipped and a log message is
> printed:
> {code:java}
> Skipping delete of FINALIZED LocalSession {repairSessionId} because it has
> not been superseded by a more recent session{code}
> * The isSuperseded logic allows a repair session record to be deleted only if
> all of the session's ranges are covered by some newer session on the node.
> When a new node is added, a set of ranges moves to it, and the data in these
> ranges is no longer repaired on the old nodes, so isSuperseded always
> returns false for the last session executed before the node was added.
> In a big cluster where many nodes are added while incremental repair runs
> regularly, a lot of non-removable old records accumulate in the
> system.repairs table. This may slow down startup of Cassandra nodes,
> especially if a large number of tokens has historically been used on the
> cluster.
> A similar issue occurs on table removal: the logic considers the last session
> executed for a removed table as non-superseded and keeps it forever.
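
To make the failure mode described above concrete, here is a minimal toy model of
the superseded check. It is only a sketch, not the actual LocalSessions code: the
class SupersededSketch, its Range and Session types, the flat non-wrapping token
line, and the rule that the union of newer sessions' ranges counts as coverage are
all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model; none of these types exist in Cassandra.
final class SupersededSketch {
    /** Half-open token range [start, end) on a simplified, non-wrapping token line. */
    static final class Range {
        final long start;
        final long end;
        Range(long start, long end) { this.start = start; this.end = end; }
    }

    /** Minimal stand-in for one FINALIZED row of system.repairs. */
    static final class Session {
        final String table;
        final long repairedAt;
        final List<Range> ranges;
        Session(String table, long repairedAt, Range... ranges) {
            this.table = table;
            this.repairedAt = repairedAt;
            this.ranges = Arrays.asList(ranges);
        }
    }

    /** True only if every range of the session was re-repaired later for the same table. */
    static boolean isSuperseded(Session session, List<Session> all) {
        List<Range> newer = new ArrayList<>();
        for (Session other : all) {
            if (other.table.equals(session.table) && other.repairedAt > session.repairedAt) {
                newer.addAll(other.ranges);
            }
        }
        newer.sort(Comparator.comparingLong((Range r) -> r.start));
        for (Range r : session.ranges) {
            if (!covers(newer, r)) {
                return false; // some part of the range was never re-repaired on this node
            }
        }
        return true;
    }

    /** Sweeps ranges sorted by start and checks that they cover the target without a gap. */
    static boolean covers(List<Range> sortedByStart, Range target) {
        long covered = target.start;
        for (Range r : sortedByStart) {
            if (r.start > covered) {
                break; // gap: nothing later can start earlier
            }
            covered = Math.max(covered, r.end);
            if (covered >= target.end) {
                return true;
            }
        }
        return covered >= target.end;
    }
}
```

In this model, a session that repaired [0, 100) is superseded by a later session
over [0, 100), but not by a later session over [0, 50) only, which is what the old
node sees after [50, 100) has moved to a new node. A session for a table that no
later session mentions is likewise never superseded.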