Hello,

Our setup is as follows:

Apache Cassandra: 3.0.17
Cassandra Reaper: 1.3.0-BETA-20180830
Compaction: {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_size': '30',
  'compaction_window_unit': 'DAYS'
}

We have two column families which differ only in the way data is written: one is always written with a TTL (of 2 years), the other without a TTL. The data is time-series-like, append-only, with no explicit updates or deletes. The data goes back as far as ~15 months.

We have scheduled a non-incremental repair using Cassandra Reaper to run every week.

Now we are observing an unexpected effect: often *all* of the SSTable files on disk are modified (touched by repair), for both the TTL'd and non-TTL'd tables.

This is not expected, since the old files from past months have already been repaired a number of times. If it is an effect caused by over-streaming, why does Cassandra find any differences in the files from past months in the first place? We expect that after a file from 2 months ago (or earlier) has been fully repaired once, there is no possibility of any more differences being discovered.

Is this not a reasonable assumption?

Regards,
--
Alex