Scrub takes a very long time and does not remove the tombstones. You should do 
garbage cleaning instead; it removes the tombstones immediately.

Thanks,
Charu

From: Oleksandr Shulgin <oleksandr.shul...@zalando.de>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, September 10, 2018 at 6:53 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Drop TTLd rows: upgradesstables -a or scrub?

Hello,

We have some tables with a significant amount of TTLd rows that have expired by 
now (and more than gc_grace_seconds has passed since the TTL).  We have stopped 
writing more data to these tables quite a while ago, so background compaction 
isn't running.  The compaction strategy is the default SizeTiered one.
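
To make the condition above concrete, here is a minimal sketch of the rule as 
stated in this message: a TTLd cell is treated as droppable once its TTL has 
expired and more than gc_grace_seconds has passed since then.  This is an 
illustration only, not Cassandra's actual purge logic (which also takes other 
factors into account, such as overlapping SSTables); the function and 
parameter names are hypothetical.

```python
def is_droppable(write_time_s: int, ttl_s: int, gc_grace_s: int, now_s: int) -> bool:
    """Return True if the TTL expired more than gc_grace_s seconds ago.

    All arguments are in seconds; the names are illustrative, not Cassandra's.
    """
    expires_at = write_time_s + ttl_s          # moment the TTL runs out
    return now_s > expires_at + gc_grace_s     # grace period fully elapsed


# Example: written at t=0 with a 1-day TTL and a 10-day gc_grace_seconds.
DAY = 86_400
print(is_droppable(0, 1 * DAY, 10 * DAY, now_s=12 * DAY))
print(is_droppable(0, 1 * DAY, 10 * DAY, now_s=5 * DAY))
```

The first call checks a point in time after the grace period has elapsed, the 
second a point where the TTL has expired but the grace period has not.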

Now we would like to get rid of all the droppable tombstones in these tables.  
What would be the approach that puts the least stress on the cluster?

We've considered a few, but the most promising ones seem to be these two: 
`nodetool scrub` or `nodetool upgradesstables -a`.  We are using Cassandra 
version 3.0.

Now, this docs page recommends using upgradesstables wherever possible: 
https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsScrub.html
What is the reason behind it?

From the source code I can see that Scrubber is the class which is going to drop 
the tombstones (and report the total number in the logs): 
https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/compaction/Scrubber.java#L308

I couldn't find similar handling in the upgradesstables code path.  Is the 
assumption correct that this one will not drop the tombstones as a side effect 
of rewriting the files?

Any drawbacks of using scrub for this task?

Thanks,
--
Oleksandr "Alex" Shulgin | Senior Software Engineer | Team Flux | Data Services 
| Zalando SE | Tel: +49 176 127-59-707
