> Since the auto repair is running from within Cassandra, we might have more control over this and implement a proper cleanup of such snapshots.

Rightly said, Alexander. With internal knowledge of Cassandra, we can do a lot more. For example, to improve incremental repair reliability, Andy T and Chris L have developed a new token-split algorithm on top of the MVP, based on the unrepaired data in SSTables. It will be added to the MVP soon, as they are still writing test cases, and it requires access to internal SSTable data structures.

Jaydeep

On Mon, Oct 28, 2024 at 10:51 PM Jaydeep Chovatia <chovatia.jayd...@gmail.com> wrote:

> > That's inaccurate, we can check the replica set for the subrange we're
> > about to run and see if it overlaps with the replica set of other ranges
> > which are being processed already.
>
> We can definitely check the replicas for the subrange we plan to run and
> see if they overlap with the ongoing one. I am saying that for a smaller
> cluster if we want to repair multiple token ranges in parallel, it is tough
> to guarantee that replica sets won't overlap.
>
> > Jira to auto-delete snapshots at X% disk full ?
>
> Sure, just created a new JIRA
> https://issues.apache.org/jira/browse/CASSANDRA-20035
>
> Jaydeep
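
To make the replica-overlap point from the quoted exchange concrete, here is a rough sketch in Python. It is not the auto-repair code: `pick_parallel_ranges`, the node names, and the token ranges are all made up for illustration, and it assumes each token range maps to a known set of replica nodes. It greedily picks ranges whose replica sets are disjoint from those already being repaired.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

TokenRange = Tuple[int, int]  # hypothetical (start, end) token pair


def pick_parallel_ranges(
    replicas_by_range: Dict[TokenRange, FrozenSet[str]],
    max_parallel: int,
) -> List[TokenRange]:
    """Greedily pick ranges whose replica sets do not overlap.

    A range is accepted only if none of its replicas is already busy
    with a previously accepted range.
    """
    picked: List[TokenRange] = []
    busy: Set[str] = set()
    for token_range, replicas in replicas_by_range.items():
        if len(picked) >= max_parallel:
            break
        if busy.isdisjoint(replicas):
            picked.append(token_range)
            busy |= replicas
    return picked


# Hypothetical 3-node cluster with RF=3: every range lives on all nodes.
all_nodes = frozenset({"n1", "n2", "n3"})
small_cluster = {
    (0, 100): all_nodes,
    (100, 200): all_nodes,
    (200, 300): all_nodes,
}

# Hypothetical 6-node ring with RF=3: each range lives on 3 adjacent nodes.
six_node_ring = {
    (0, 100): frozenset({"n1", "n2", "n3"}),
    (100, 200): frozenset({"n2", "n3", "n4"}),
    (200, 300): frozenset({"n3", "n4", "n5"}),
    (300, 400): frozenset({"n4", "n5", "n6"}),
    (400, 500): frozenset({"n5", "n6", "n1"}),
    (500, 600): frozenset({"n6", "n1", "n2"}),
}

print(pick_parallel_ranges(small_cluster, max_parallel=3))
print(pick_parallel_ranges(six_node_ring, max_parallel=3))
```

In the 3-node example every range shares all replicas, so only one range can run at a time no matter how much parallelism is requested. In the 6-node example only two of the six ranges are disjoint. That is the small-cluster limitation described in the quoted reply.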