On Mon, 10 Sep 2018, 19:40 Jeff Jirsa, <jji...@gmail.com> wrote:

> I think it's important to describe exactly what's going on for people who
> just read the list but who don't have context. This blog does a really good
> job:
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
> , but briefly:
>
> - When a TTL expires, we treat it as a tombstone, because it may have been
> written ON TOP of another piece of live data, so we need to get that
> deletion marker to all hosts, just like a manual explicit delete
> - Tombstones in sstable A may shadow data in sstable B, so doing anything
> on just one sstable MAY NOT remove the tombstone - we can't get rid of the
> tombstone if sstable A overlaps another sstable with the same partition
> (which we identify via bloom filter) that has any data with a lower
> timestamp (we don't check the sstable for a shadowed value, we just look at
> the minimum live timestamp of the table)
>
> "nodetool garbagecollect" looks for sstables that overlap (on partition
> keys) and combines them, which makes tombstones past GCGS purgeable and
> should remove them (and data shadowed by them).
>
> If you're on a version without nodetool garbagecollect, you can
> approximate it using user defined compaction (
> http://thelastpickle.com/blog/2016/10/18/user-defined-compaction.html ) -
> it's a JMX endpoint that lets you tell Cassandra to compact one or more
> sstables together based on parameters you choose. This is somewhat like
> upgradesstables or scrub, but you can combine sstables as well. If you
> choose candidates intelligently (notably, oldest sstables first, or
> sstables you know overlap), you can likely manually clean things up pretty
> quickly. At one point, I had a jar that would compact a single sstable at
> a time, oldest sstable first, and it pretty much worked for this purpose
> most of the time.
>
> If you have room, a "nodetool compact" on STCS will also work, but it'll
> give you one huge sstable, which will be unfortunate long term (probably
> less of a problem if you're no longer writing to this table).
>

That's a really nice refresher, thanks Jeff!

From the nature of the data at hand and because of the SizeTiered
compaction, I would expect that more or less all sstables overlap with
each other.

Even if we were able to identify the overlapping ones (how?), I expect
that we would have to do the equivalent of a major compaction, but (maybe)
in multiple stages. Not sure that's really worth the trouble for us.
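
To make the purge rule from Jeff's summary concrete, here is a minimal Python sketch of it. It is a toy model, not Cassandra code: the `SSTable` class, `blockers` and `can_purge` are hypothetical names, a plain set of partition keys stands in for the bloom filter, and treating equal timestamps as blocking is a conservative assumption on my part. It only covers tombstones that are already past gc_grace_seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Toy stand-in for sstable metadata (hypothetical, not a Cassandra class)."""
    name: str
    partitions: frozenset  # stand-in for the bloom filter: partition keys present
    min_timestamp: int     # minimum write timestamp of any data in the sstable


def blockers(tombstone_ts, partition, compacting, all_sstables):
    """Names of sstables outside the compaction that keep the tombstone alive.

    An sstable blocks the purge if it may contain the same partition and its
    minimum timestamp is not newer than the tombstone, since it might hold
    data that the tombstone shadows.
    """
    return [
        s.name
        for s in all_sstables
        if s not in compacting
        and partition in s.partitions
        and s.min_timestamp <= tombstone_ts  # assumption: ties block the purge
    ]


def can_purge(tombstone_ts, partition, compacting, all_sstables):
    """True if compacting only `compacting` could drop the tombstone."""
    return not blockers(tombstone_ts, partition, compacting, all_sstables)


a = SSTable("A", frozenset({"k1", "k2"}), min_timestamp=100)
b = SSTable("B", frozenset({"k2", "k3"}), min_timestamp=50)
c = SSTable("C", frozenset({"k4"}), min_timestamp=10)

# A tombstone for k2 with timestamp 80 lives in sstable A.
print(blockers(80, "k2", [a], [a, b, c]))      # B overlaps and has older data
print(can_purge(80, "k2", [a, b], [a, b, c]))  # compacting A and B together
```

In this model, rewriting A alone keeps the tombstone because B holds the same partition with an older minimum timestamp, while compacting A and B together makes it droppable. If nearly every sstable shares partitions with every other one, as I expect in our case, the set of blockers is close to "everything else", which is why this looks like a major compaction in stages.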

Thanks,
--
Alex

> On Mon, Sep 10, 2018 at 10:29 AM Charulata Sharma (charshar)
> <chars...@cisco.com.invalid> wrote:
>
>> Scrub takes a very long time and does not remove the tombstones. You
>> should do garbage cleaning. It immediately removes the tombstones.
>>
>>
>>
>> Thanks,
>>
>> Charu
>>
>>
>>
>> *From: *Oleksandr Shulgin <oleksandr.shul...@zalando.de>
>> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Date: *Monday, September 10, 2018 at 6:53 AM
>> *To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
>> *Subject: *Drop TTLd rows: upgradesstables -a or scrub?
>>
>>
>>
>> Hello,
>>
>>
>>
>> We have some tables with a significant number of TTLd rows that have
>> expired by now (and more than gc_grace_seconds has passed since the TTL
>> expired).  We stopped writing more data to these tables quite a while
>> ago, so background compaction isn't running.  The compaction strategy is
>> the default SizeTiered one.
>>
>>
>>
>> Now we would like to get rid of all the droppable tombstones in these
>> tables.  What would be the approach that puts the least stress on the
>> cluster?
>>
>>
>>
>> We've considered a few, but the most promising ones seem to be these two:
>> `nodetool scrub` or `nodetool upgradesstables -a`.  We are using Cassandra
>> version 3.0.
>>
>>
>>
>> Now, this docs page recommends using upgradesstables wherever possible:
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsScrub.html
>>
>> What is the reason behind it?
>>
>>
>>
>> From the source code I can see that Scrubber is the class which is going
>> to drop the tombstones (and report the total number in the logs):
>> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/compaction/Scrubber.java#L308
>>
>>
>>
>> I couldn't find similar handling in the upgradesstables code path.  Is
>> the assumption correct that this one will not drop the tombstones as a
>> side effect of rewriting the files?
>>
>>
>>
>> Any drawbacks of using scrub for this task?
>>
>>
>>
>> Thanks,
>> --
>>
>> Oleksandr "Alex" Shulgin | Senior Software Engineer | Team Flux | Data
>> Services | Zalando SE | Tel: +49 176 127-59-707
>>
>>
>>
>
