Hello,

We have some tables with a significant number of TTL'd rows that have expired
by now (and more than gc_grace_seconds has passed since they expired).  We have
stopped writing more data to these tables quite a while ago, so background
compaction isn't running.  The compaction strategy is the default
SizeTiered one.
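
To make "droppable" concrete, the condition we have in mind can be sketched
roughly as below.  This is a simplified illustration, not Cassandra's actual
code: the function name, variable names and example values are hypothetical,
and the real purge check also has to consider whether other SSTables may
still hold older data for the same partition.

```python
def is_droppable(expires_at: int, gc_grace_seconds: int, now: int) -> bool:
    """Return True once more than gc_grace_seconds has passed since expiry.

    All arguments are in seconds since the epoch (or plain seconds for
    gc_grace_seconds).  Hypothetical helper, for illustration only.
    """
    return expires_at + gc_grace_seconds < now


# Assumed example values: a row written with a 1-day TTL into a table
# that uses a 10-day gc_grace_seconds.
write_time = 1_500_000_000
ttl = 86_400
gc_grace_seconds = 864_000
expires_at = write_time + ttl

# One second past the grace period the expired row counts as droppable.
droppable_later = is_droppable(expires_at, gc_grace_seconds,
                               expires_at + gc_grace_seconds + 1)
# Exactly at the end of the grace period it does not yet count.
droppable_at_boundary = is_droppable(expires_at, gc_grace_seconds,
                                     expires_at + gc_grace_seconds)
```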

Now we would like to get rid of all the droppable tombstones in these
tables.  What would be the approach that puts the least stress on the
cluster?

We've considered a few, but the most promising ones seem to be these two:
`nodetool scrub` or `nodetool upgradesstables -a`.  We are using Cassandra
version 3.0.

Now, this docs page recommends using upgradesstables wherever possible:
https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsScrub.html
What is the reason behind it?

From the source code I can see that Scrubber is the class that is going to
drop the tombstones (and report the total number in the logs):
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/compaction/Scrubber.java#L308

I couldn't find similar handling in the upgradesstables code path.  Is the
assumption correct that upgradesstables will not drop the tombstones as a
side effect of rewriting the files?

Any drawbacks of using scrub for this task?

Thanks,
-- 
Oleksandr "Alex" Shulgin | Senior Software Engineer | Team Flux | Data
Services | Zalando SE | Tel: +49 176 127-59-707
