> On May 16, 2025, at 10:22 AM, Mike Sun <m...@msun.io> wrote:
> 
> The Cassandra docs 
> <https://cassandra.apache.org/doc/5.0/cassandra/managing/operating/repair.html>
>  advise:
>> 
>> At a minimum, repair should be run often enough that the gc grace period 
>> never expires on unrepaired data. Otherwise, deleted data could reappear. 
>> With a default gc grace period of 10 days, repairing every node in your 
>> cluster at least once every 7 days will prevent this, while providing enough 
>> slack to allow for delays.
> 
> I don't think repairing at least once every 7 days if gc_grace_seconds is 10 
> days is adequate to guarantee no risk of data resurrection.
> 
> I wrote this post to explain my reasoning:
> https://msun.io/cassandra-scylla-repairs/
> 
> Would appreciate any feedback, thanks!
> Mike Sun


To summarize the blog for those who haven’t read it: 

Running repairs once every gc_grace_seconds is actually insufficient because it 
doesn’t account for the duration of the repair process itself and the specific 
timing of when data ranges (tokens) are repaired. A tombstone created for data 
just after its specific token was scanned by one repair can expire before the 
next repair cycle (which only begins gc_grace_seconds later) manages to reach 
and process that particular token.
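
To make the timing concrete, here is a rough sketch of that worst case. It is 
my own simplification, not code from the blog: it assumes a token is scanned 
at the very start of one repair and at the very end of the next, so the gap 
between its two repairs is the start-to-start interval plus the repair 
duration. The function names are hypothetical.

```python
from datetime import timedelta


def worst_case_gap(repair_interval: timedelta, repair_duration: timedelta) -> timedelta:
    """Longest time a single token can go between two repairs.

    Assumption: the token is scanned at the very start of one repair and
    at the very end of the next, so the gap is interval + duration.
    """
    return repair_interval + repair_duration


def tombstones_are_safe(gc_grace: timedelta,
                        repair_interval: timedelta,
                        repair_duration: timedelta) -> bool:
    """True if the worst-case gap stays inside the gc grace period."""
    return worst_case_gap(repair_interval, repair_duration) < gc_grace


gc_grace = timedelta(days=10)

# Start every 7 days, each repair takes 2 days: worst-case gap is 9 days.
print(tombstones_are_safe(gc_grace, timedelta(days=7), timedelta(days=2)))    # True

# Start every 10 days, each repair takes 6 hours: worst-case gap exceeds 10 days.
print(tombstones_are_safe(gc_grace, timedelta(days=10), timedelta(hours=6)))  # False
```

Under this model, a 7-day schedule with a 10-day gc grace period leaves just 
under 3 days for the repair itself.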


You need to complete the repair within the gc_grace_seconds window. Having 
repair run for 3 days would be a surprise. We can certainly adjust the wording, 
but the intent of that wording isn’t “start it every 7 days regardless of how 
long it runs”, it’s “finish it every 7 days” (successfully).



Yes, it’s not enough to start the repair every 7 days; it needs to complete 
successfully between the time the tombstone is written and the expiration of 
gc_grace_seconds.
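
One way to check this against a repair history is sketched below. This is 
only an illustration under assumptions of mine: the RepairRun record and its 
fields are hypothetical (not any Cassandra API), failed runs are ignored 
entirely, and the exposure between two consecutive successful runs is 
measured conservatively from the start of the earlier run to the end of the 
later one.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class RepairRun(NamedTuple):
    """Hypothetical record of one full-cluster repair run."""
    start: datetime
    end: datetime
    succeeded: bool


def max_exposure(runs: list[RepairRun]) -> Optional[timedelta]:
    """Largest start-to-end span across consecutive *successful* runs.

    Conservative assumption: a token may be repaired right at the start of
    one successful run and right at the end of the next successful run.
    Returns None when there are fewer than two successful runs.
    """
    ok = sorted((r for r in runs if r.succeeded), key=lambda r: r.start)
    spans = [later.end - earlier.start for earlier, later in zip(ok, ok[1:])]
    return max(spans, default=None)


gc_grace = timedelta(days=10)
history = [
    RepairRun(datetime(2025, 5, 1), datetime(2025, 5, 2), True),
    RepairRun(datetime(2025, 5, 8), datetime(2025, 5, 9), False),  # failed run
    RepairRun(datetime(2025, 5, 15), datetime(2025, 5, 16), True),
]

exposure = max_exposure(history)
# The failed run does not count, so the exposure spans May 1 to May 16.
print(exposure, exposure < gc_grace)  # 15 days, 0:00:00 False
```

Here the runs were started every 7 days, but because the middle one failed, 
the schedule still exceeds a 10-day gc grace period.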

