On Tue, Sep 11, 2018 at 11:07 AM Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> a single (largish) SSTable or any other SSTable for a table, which does
> not get any writes (with e.g. deletes) anymore, will most likely not be
> part of an automatic minor compaction anymore, thus may stay forever on
> disk, if I don’t miss anything crucial here.

I would also expect that, but that's totally fine for us.

> Might be different though, if you are entirely writing TTL-based, cause
> single SSTable based automatic tombstone compaction may kick in here, but
> I’m not really experienced with that.

Yes, we were writing with a TTL of 2 years to these tables, and in about 1
year from now 100% of the data in them will expire.  We would be able to
simply truncate them at that point.

Now that you mention single-SSTable tombstone compaction again, I don't
think this is happening in our case.  For example, on one of the nodes I
see the estimated droppable tombstones ratio range from 0.24 to slightly
over 1 (1.09).  Yet apparently no single-SSTable compaction was triggered,
since the data files are all 6 months old now.  We are using all the
default settings for tombstone_threshold, tombstone_compaction_interval
and unchecked_tombstone_compaction.
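
To make my reading of these three settings concrete, here is a rough Python
sketch of the eligibility check as I understand it.  The default values
(0.2, 86400 seconds, false) are the documented ones; the SSTableInfo class
and the function name are hypothetical, made up for illustration.  The
overlap step is a deliberate simplification: as far as I understand, the
real check makes a finer estimate when overlaps exist, so this sketch is
more pessimistic than Cassandra itself.

```python
from dataclasses import dataclass

# Documented defaults for the compaction subproperties.
TOMBSTONE_THRESHOLD = 0.2
TOMBSTONE_COMPACTION_INTERVAL_S = 86400  # 1 day
UNCHECKED_TOMBSTONE_COMPACTION = False


@dataclass
class SSTableInfo:
    """Hypothetical summary of one SSTable; not a real Cassandra type."""
    age_seconds: float
    droppable_ratio: float          # estimated droppable tombstones ratio
    overlaps_other_sstables: bool   # assumption: known from key ranges


def is_tombstone_compaction_candidate(
    sstable: SSTableInfo,
    threshold: float = TOMBSTONE_THRESHOLD,
    interval_s: float = TOMBSTONE_COMPACTION_INTERVAL_S,
    unchecked: bool = UNCHECKED_TOMBSTONE_COMPACTION,
) -> bool:
    # Too young: wait at least tombstone_compaction_interval.
    if sstable.age_seconds < interval_s:
        return False
    # Not enough droppable tombstones to be worth the rewrite.
    if sstable.droppable_ratio <= threshold:
        return False
    # With unchecked_tombstone_compaction the overlap check is skipped.
    if unchecked:
        return True
    # Simplification: treat any overlap as blocking the compaction.
    return not sstable.overlaps_other_sstables


# One of our files: 6 months old, ratio 1.09, assumed to overlap.
old_file = SSTableInfo(180 * 86400, 1.09, overlaps_other_sstables=True)
default_result = is_tombstone_compaction_candidate(old_file)
unchecked_result = is_tombstone_compaction_candidate(old_file, unchecked=True)
```

Under these assumptions the age and ratio conditions are both met for our
files, so the overlap check would be the only thing left that could keep
them from being picked up.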

Does this mean that all these SSTable files do indeed overlap and because
we don't allow unchecked_tombstone_compaction, no actual compaction is

> We had been suffering a lot with storing timeseries data with STCS and disk
> capacity to have the cluster working smoothly and automatic minor
> compactions kicking out aged timeseries data according to our retention
> policies in the business logic. TWCS is unfortunately not an option for us.
> So, we did run major compactions every X weeks to reclaim disk space, thus
> from an operational perspective, by far not nice. Thus, finally decided to
> change STCS min_threshold from default 4 to 2, to let minor compactions
> kick in more frequently. We can live with the additional IO/CPU this is
> causing, thus is our current approach to disk space and sizing issues we
> had in the past.

For our new generation of tables we have switched to use TWCS, that's the
reason we don't write anymore to those old tables which are still using

