On Tue, Sep 11, 2018 at 11:07 AM Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

>
> a single (largish) SSTable or any other SSTable for a table, which does
> not get any writes (with e.g. deletes) anymore, will most likely not be
> part of an automatic minor compaction anymore, thus may stay forever on
> disk, if I don’t miss anything crucial here.
>

I would also expect that, but that's totally fine for us.


> Might be different though, if you are entirely writing TTL-based, cause
> single SSTable based automatic tombstone compaction may kick in here, but
> I’m not really experienced with that.
>

Yes, we were writing with a TTL of 2 years to these tables, and in about 1
year from now 100% of the data in them will have expired.  We would be able
to simply truncate them at that point.

Now that you mention single-SSTable tombstone compaction again, I don't
think this is happening in our case.  For example, on one of the nodes I
see the estimated droppable tombstones ratio range from 0.24 to slightly
over 1 (1.09).  Yet apparently no single-SSTable compaction was triggered:
the data files are all 6 months old now.  We are using the default settings
for tombstone_threshold, tombstone_compaction_interval and
unchecked_tombstone_compaction.

Does this mean that all these SSTable files do indeed overlap and, because
we don't allow unchecked_tombstone_compaction, no actual compaction is
triggered?
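
To make the question concrete, here is a rough Python sketch of how I
understand the single-SSTable tombstone compaction check.  It is a
simplified model, not Cassandra's actual code: the function name and the
uncovered_fraction parameter (the share of an SSTable's data that is not
covered by overlapping SSTables) are hypothetical, made up for
illustration.  The defaults are the documented ones (0.2, 1 day, false).

```python
from dataclasses import dataclass
from typing import Optional

DAY_SECONDS = 86_400


@dataclass
class TombstoneOptions:
    # Documented defaults of the compaction sub-options.
    tombstone_threshold: float = 0.2
    tombstone_compaction_interval: int = DAY_SECONDS  # seconds
    unchecked_tombstone_compaction: bool = False


def worth_compacting(droppable_ratio: float,
                     age_seconds: int,
                     uncovered_fraction: float,
                     opts: Optional[TombstoneOptions] = None) -> bool:
    """Simplified model of the single-SSTable tombstone compaction check.

    uncovered_fraction is a hypothetical stand-in for the estimate of how
    much of the SSTable is NOT covered by overlapping SSTables (0.0 to 1.0).
    """
    opts = opts or TombstoneOptions()
    # The SSTable must be older than the configured interval.
    if age_seconds < opts.tombstone_compaction_interval:
        return False
    # The estimated droppable ratio must exceed the threshold.
    if droppable_ratio <= opts.tombstone_threshold:
        return False
    # With the unchecked option, overlap is not considered at all.
    if opts.unchecked_tombstone_compaction:
        return True
    # Otherwise only tombstones outside the overlapping range count.
    return uncovered_fraction * droppable_ratio > opts.tombstone_threshold


six_months = 182 * DAY_SECONDS
fully_overlapped = worth_compacting(1.09, six_months, uncovered_fraction=0.0)
unchecked = worth_compacting(
    1.09, six_months, uncovered_fraction=0.0,
    opts=TombstoneOptions(unchecked_tombstone_compaction=True))
```

Under this model an SSTable with a ratio of 1.09 that is fully overlapped
by others is never picked (fully_overlapped is False) unless
unchecked_tombstone_compaction is enabled (unchecked is True), which would
match what we observe.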

> We had been suffering a lot with storing timeseries data with STCS and disk
> capacity to have the cluster working smoothly and automatic minor
> compactions kicking out aged timeseries data according to our retention
> policies in the business logic. TWCS is unfortunately not an option for us.
> So, we did run major compactions every X weeks to reclaim disk space, thus
> from an operational perspective, by far not nice. Thus, finally decided to
> change STCS min_threshold from default 4 to 2, to let minor compactions
> kick in more frequently. We can live with the additional IO/CPU this is
> causing, thus is our current approach to disk space and sizing issues we
> had in the past.
>

For our new generation of tables we have switched to TWCS; that's the
reason we no longer write to those old tables, which are still using
STCS.

Cheers,
--
Alex
