Hi Nick,

The right approach depends on your compaction strategy, on how your tombstones are generated (DELETE statements or TTLs), and on your version of Cassandra.
If you're working with TTLs, your best option is definitely TWCS with the unsafe_aggressive_sstable_expiration flag that was introduced by CASSANDRA-13418 <https://issues.apache.org/jira/browse/CASSANDRA-13418>. It drops all fully expired SSTables even when their timestamps overlap with other SSTables. If you use different TTLs, you can also enable unchecked_tombstone_compaction to trigger single-SSTable compactions more often (and adjust tombstone_threshold to your particular workload). You can lower gc_grace_seconds to 3 hours (no lower, otherwise you'll reduce the hint window) to avoid keeping tombstones on disk. That's the easy case.

If you're generating tombstones from DELETE statements, it gets trickier: a tombstone has to be compacted together with the data it shadows before either can be evicted. You also cannot reduce gc_grace_seconds below your repair cycle, as that would make it possible for deleted data to come back (zombie data).

LCS doesn't get along very well with tombstones, as they can get "stuck" in one level while the data they shadow sits in another. LCS major compactions also take a long time to run (and are single threaded). TWCS doesn't suit data that isn't TTLed (your tombstones will likely end up in a different time window than the data they shadow). That leaves us with STCS.

If you want to be as aggressive as possible there and purge your deletes ASAP, you'll need to:

- run repair very often to secure your deletions
- reduce gc_grace_seconds to a value that's slightly higher than your repair cycle
- run major compactions with the -s flag, which creates one file per size tier instead of a single big file

The best idea I can think of here is to trigger a major compaction right after a successful repair.
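
To make the two cases above more concrete, here is a rough Python sketch that only builds the CQL statements and nodetool commands as strings; it doesn't talk to a cluster. The table name ks.events, the 1-day window, the 0.2 tombstone_threshold (the default) and the one-hour margin over the repair cycle are placeholders I picked for illustration, so adjust them to your workload. It also assumes a Cassandra version that includes CASSANDRA-13418.

```python
# 3 hours: the floor suggested above for TTL-only tables (hint window).
HINT_WINDOW_SECONDS = 3 * 60 * 60


def twcs_ttl_statement(table, window_unit="DAYS", window_size=1,
                       tombstone_threshold=0.2):
    """TTL-only case: TWCS with aggressive expiration and a short gc_grace."""
    options = {
        "class": "TimeWindowCompactionStrategy",
        "compaction_window_unit": window_unit,        # placeholder value
        "compaction_window_size": str(window_size),   # placeholder value
        "unsafe_aggressive_sstable_expiration": "true",
        "unchecked_tombstone_compaction": "true",
        "tombstone_threshold": str(tombstone_threshold),
    }
    rendered = ", ".join(f"'{k}': '{v}'" for k, v in options.items())
    return (f"ALTER TABLE {table} WITH compaction = {{{rendered}}} "
            f"AND gc_grace_seconds = {HINT_WINDOW_SECONDS};")


def stcs_delete_statement(table, repair_cycle_seconds, margin_seconds=3600):
    """DELETE case: STCS, gc_grace slightly above the repair cycle.

    margin_seconds is an arbitrary safety margin chosen for this sketch.
    """
    gc_grace = repair_cycle_seconds + margin_seconds
    return (f"ALTER TABLE {table} WITH compaction = "
            f"{{'class': 'SizeTieredCompactionStrategy'}} "
            f"AND gc_grace_seconds = {gc_grace};")


def repair_then_compact(keyspace, table):
    """Commands to run in order; only compact if the repair succeeded."""
    return [
        ["nodetool", "repair", keyspace, table],
        ["nodetool", "compact", "-s", keyspace, table],
    ]


if __name__ == "__main__":
    print(twcs_ttl_statement("ks.events"))
    # Assumes a daily repair cycle (86400 s) purely as an example.
    print(stcs_delete_statement("ks.events", 86400))
    for command in repair_then_compact("ks", "events"):
        print(" ".join(command))
```

Note that the repair has to cover the whole cluster within gc_grace_seconds, so in practice the commands would be driven by whatever you use to schedule repairs, with the compaction step skipped whenever the repair fails.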

We have a few posts on our blog <http://thelastpickle.com/blog/> that cover tombstones and compaction strategies (search for "tombstone" on that page), notably this one:
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html

Cheers,

On Sat, Mar 16, 2019 at 1:04 AM Nick Hatfield <[email protected]> wrote:

> Hey guys,
>
> Can someone give me some idea or link some good material for determining a
> good / aggressive tombstone strategy? I want to make sure my tombstones are
> getting purged as soon as possible to reclaim disk.
>
> Thanks

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
