Hi everyone,
We have migrated some of our clusters from Cassandra 3.11.11 to 4.0.1. We
run repairs periodically, triggered by some automation. Each repair run is
a full (`-full`), sequential (`-seq`), primary-range (`-pr`) repair over a
portion of the ring, and we finish iterating over the full token range in
about a week.

With Cassandra 4.0.1 we have started seeing snapshots created during repair
accumulate until they exhaust the disk space. We also see errors with the
message "Could not create snapshot at node x (x being the node's IP
address). Check the logs on the repair participants for further details".
Checking the logs of the mentioned node shows nothing relevant.

I have the following questions:

   1. Did Cassandra 3.11.11 perform automatic cleanup of snapshots? Why
   don't we see this on our Cassandra 3.11.11 clusters?
   2. Is there a way to clear just the repair snapshots? Repair snapshots
   are created with a GUID as the tag, so it is difficult to use
   `nodetool clearsnapshot -t <tag>` with a given tag.
   3. If we run `nodetool clearsnapshot --all` while a repair job is
   running, what will happen?
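For question 2, the workaround we have been considering is to parse
`nodetool listsnapshots` and clear each tag that looks like a GUID. This is
only a sketch under two assumptions: that repair snapshots are exactly the
ones whose tag is a bare UUID, and that `listsnapshots` prints the tag in
the first column after its header lines; both should be verified against
your own output before running anything like this:

```shell
#!/bin/sh
# Sketch only: assumes repair snapshots are the ones whose tag is a bare
# UUID, and that `nodetool listsnapshots` prints the tag in column 1
# after a two-line header. Verify both assumptions on your cluster first.
uuid_re='^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'

# Succeeds when the tag has the GUID shape that repair snapshots use.
is_repair_snapshot() {
    printf '%s\n' "$1" | grep -Eq "$uuid_re"
}

# Only talk to nodetool when it is actually on PATH.
if command -v nodetool >/dev/null 2>&1; then
    nodetool listsnapshots | awk 'NR > 2 { print $1 }' | sort -u |
    while read -r tag; do
        if is_repair_snapshot "$tag"; then
            echo "clearing snapshot $tag"
            nodetool clearsnapshot -t "$tag"
        fi
    done
fi
```

This clears one tag per `clearsnapshot` call, so it is slow on nodes with
many accumulated snapshots, but it avoids touching any manually created
snapshots with human-readable tags.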

Thanks
-- 
Muhammad Soliman
Senior Site Reliability Engineer
Booking.com <https://www.booking.com/>
Making it easier for everyone
to experience the world.
