[
https://issues.apache.org/jira/browse/CASSANDRA-20995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jaydeepkumar Chovatia updated CASSANDRA-20995:
----------------------------------------------
Epic Link: CASSANDRA-19918
> Auto-repair scheduler waits for repair interval to pass before cleaning up
> orphaned node history
> ------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20995
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Kristijonas Zalys
> Assignee: Kristijonas Zalys
> Priority: Normal
> Time Spent: 20m
> Remaining Estimate: 0h
>
> When a node in the ring goes down, the auto-repair schedulers on all other
> nodes in the cluster start voting for this node's auto-repair history to be
> removed. Once more than 50% of the cluster has voted to delete the down node,
> its auto-repair history is deleted from the auto-repair system tables.
> This cleanup process is important to maintain continuous execution of repair
> across the entire cluster. If a node goes down, it will no longer perform
> repair and will not update its repair history in the system tables. However,
> as it is still present in the auto-repair history table, the down node will
> still be considered as a candidate to run repair. As a result, it will occupy
> space in the auto-repair queue and in cases with low auto-repair parallelism
> may even completely block auto-repair within the cluster.
> This is exactly what happened on one of our small clusters where auto-repair
> parallelism was just one node at a time. A node got replaced but its repair
> history did not get cleaned up which caused the entire auto-repair system to
> grind to a halt.
> Upon investigation we found out that the root cause lies in the ordering of
> operations within the auto-repair scheduler:
> # The scheduler will check when the local node last ran repair.
> # If the time elapsed since then is shorter than the repair interval, it will
> immediately short-circuit.
> # Otherwise, it will proceed with computing the auto-repair queue and
> determining if it's the local node's turn to run repair.
> Importantly, the auto-repair history cleanup happens inside of the
> auto-repair queue algorithm. This means that a given node will clean up
> orphaned entries in auto-repair history only once its repair interval passes.
> For example: if you use auto-repair parallelism of 1 node and a repair
> interval of 24 hours, the orphaned data will not get cleaned up for up to 24
> hours and consequently auto-repair may get stuck for up to 24 hours as well.
>
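The ordering problem described above can be illustrated with a minimal sketch. This is not Cassandra's actual scheduler code: the names SchedulerState, cleanup_orphans, run_once and the cleanup_first flag are all hypothetical, the voting step is omitted, and the reordered variant is only one possible arrangement, not necessarily the fix adopted for this issue.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerState:
    """Hypothetical stand-in for one node's view of auto-repair state."""
    last_repair_ts: float   # when the local node last ran repair (seconds)
    repair_interval: float  # minimum seconds between local repairs
    history: set = field(default_factory=set)     # node ids in repair history
    live_nodes: set = field(default_factory=set)  # node ids still in the ring


def cleanup_orphans(state):
    """Drop history entries for nodes no longer in the ring (voting omitted)."""
    orphans = state.history - state.live_nodes
    state.history -= orphans
    return orphans


def run_once(state, now, cleanup_first=False):
    """One scheduler pass; returns the set of orphaned entries removed."""
    removed = set()
    if cleanup_first:
        # Assumed alternative ordering: clean up before the interval check.
        removed = cleanup_orphans(state)
    if now - state.last_repair_ts < state.repair_interval:
        # Short circuit: with the reported ordering, cleanup is never reached.
        return removed
    if not cleanup_first:
        # Reported ordering: cleanup happens as part of the queue computation.
        removed = cleanup_orphans(state)
    # Queue computation and the "is it my turn?" check would follow here.
    return removed
```

With a 24 hour repair interval and a pass one hour after the last local repair, run_once(state, now=3600) removes nothing and the orphaned entry stays in history, while the same call with cleanup_first=True removes it right away.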