On Tue, Sep 11, 2018 at 11:07 AM Steinmaurer, Thomas < thomas.steinmau...@dynatrace.com> wrote:
> a single (largish) SSTable or any other SSTable for a table, which does
> not get any writes (with e.g. deletes) anymore, will most likely not be
> part of an automatic minor compaction anymore, thus may stay forever on
> disk, if I don’t miss anything crucial here.

I would also expect that, but that's totally fine for us.

> Might be different though, if you are entirely writing TTL-based, cause
> single SSTable based automatic tombstone compaction may kick in here, but
> I’m not really experienced with that.

Yes, we were writing with a TTL of 2 years to these tables, and in about 1 year from now 100% of the data in them will have expired. We would be able to simply truncate them at that point.

Now that you mention single-SSTable tombstone compaction again, I don't think this is happening in our case. For example, on one of the nodes I see the estimated droppable tombstones ratio range from 0.24 to slightly over 1 (1.09). Yet apparently no single-SSTable compaction was triggered, because the data files are all 6 months old now.

We are using all the default settings for tombstone_threshold, tombstone_compaction_interval and unchecked_tombstone_compaction. Does this mean that all these SSTable files do indeed overlap and, because we don't allow unchecked_tombstone_compaction, no actual compaction is triggered?

> We had been suffering a lot with storing timeseries data with STCS and disk
> capacity to have the cluster working smoothly and automatic minor
> compactions kicking out aged timeseries data according to our retention
> policies in the business logic. TWCS is unfortunately not an option for us.
> So, we did run major compactions every X weeks to reclaim disk space, thus
> from an operational perspective, by far not nice. Thus, finally decided to
> change STCS min_threshold from default 4 to 2, to let minor compactions
> kick in more frequently. We can live with the additional IO/CPU this is
> causing, thus is our current approach to disk space and sizing issues we
> had in the past.

For our new generation of tables we have switched to TWCS; that's the reason we no longer write to those old tables, which are still using STCS.

Cheers,
--
Alex
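
To make the question about the default settings concrete, here is a rough sketch of how the three options are commonly understood to interact. It is a simplified model, not Cassandra's actual code: the `SSTableInfo` type, the function name and the `overlaps_others` flag are hypothetical, the default values (0.2, one day, false) are what I believe the documentation states and should be checked against the version in use, and the real overlap handling is more nuanced than a plain yes/no.

```python
from dataclasses import dataclass

# Assumed defaults; verify against the documentation for your Cassandra version.
TOMBSTONE_THRESHOLD = 0.2                 # tombstone_threshold
TOMBSTONE_COMPACTION_INTERVAL_S = 86_400  # tombstone_compaction_interval (1 day)
UNCHECKED_TOMBSTONE_COMPACTION = False    # unchecked_tombstone_compaction


@dataclass
class SSTableInfo:
    """Hypothetical summary of one SSTable, for illustration only."""
    age_seconds: int        # time since the data file was created
    droppable_ratio: float  # estimated droppable tombstones ratio
    overlaps_others: bool   # whether it overlaps other SSTables of the table


def is_tombstone_compaction_candidate(
    sst: SSTableInfo,
    threshold: float = TOMBSTONE_THRESHOLD,
    interval_s: int = TOMBSTONE_COMPACTION_INTERVAL_S,
    unchecked: bool = UNCHECKED_TOMBSTONE_COMPACTION,
) -> bool:
    """Simplified model of the single-SSTable tombstone compaction check."""
    # The SSTable must be older than the compaction interval.
    if sst.age_seconds < interval_s:
        return False
    # The estimated droppable ratio must exceed the threshold.
    if sst.droppable_ratio <= threshold:
        return False
    # With the unchecked option, the overlap check is skipped entirely.
    if unchecked:
        return True
    # Simplification: treat any overlap as blocking. The real check
    # estimates how much could still be dropped despite the overlap.
    return not sst.overlaps_others


six_months = 182 * 86_400
old_overlapping = SSTableInfo(six_months, 1.09, overlaps_others=True)
print(is_tombstone_compaction_candidate(old_overlapping))
print(is_tombstone_compaction_candidate(old_overlapping, unchecked=True))
```

Under this model, a 6-month-old file with a ratio of 1.09 passes the age and threshold checks, so overlap with other SSTables would be the only thing left to hold it back, and enabling unchecked_tombstone_compaction would remove that obstacle.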