On Tue, Sep 11, 2018 at 9:47 AM Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote:
> On Tue, Sep 11, 2018 at 9:31 AM Steinmaurer, Thomas <
> thomas.steinmau...@dynatrace.com> wrote:
>
>> As far as I remember, in newer Cassandra versions, with STCS, nodetool
>> compact offers a '-s' command-line option to split the output into files
>> with 50%, 25% ... in size, thus in this case, not a single largish SSTable
>> anymore. By default, without -s, it is a single SSTable though.
>>
>
> Thanks Thomas, I've also spotted the option while testing this approach.
> I understand that doing major compactions is generally not recommended, but
> do you see any real drawback of having a single SSTable file in case we
> stopped writing new data to the table?
>

A related question: given that we are not writing new data to these
tables, it would make sense to exclude them from the routine repair,
regardless of which option we end up using to remove the tombstones.

However, I've just checked the timestamps of the SSTable files on one of
the nodes, and to my surprise I can find some files written only a few
weeks ago (most of the files are from half a year ago, which is expected
because that is when we were adding this DC). But we stopped writing to
the tables about a year ago, and we repair the cluster every week.

What could explain these new SSTable files suddenly appearing? They
shouldn't be there even due to overstreaming, because repair would first
need to find some differences in the Merkle trees, and I don't see how
that could actually happen in our case.

Any ideas?

Thanks,
--
Alex