Thanks for the explanation.

From: Marcus Eriksson [mailto:krum...@gmail.com]
Sent: Thursday, March 17, 2016 12:56 AM
To: user@cassandra.apache.org
Subject: Re: DTCS Question
On Wed, Mar 16, 2016 at 6:49 PM, Anubhav Kale <anubhav.k...@microsoft.com> wrote:

> I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does STCS within the DTCS windows). It also introduced a field called MAX_WINDOW_SIZE which defaults to one day. So in my data folders, I may see SS Tables that span beyond a day (generated through old data through repairs or commit logs), but whenever I see a message in logs “Compacted Foo” (meaning the SS Table under question was definitely a result of compaction), the “Foo” SS Table should never have data beyond a day. Is this understanding accurate?

No - not until https://issues.apache.org/jira/browse/CASSANDRA-10496 (read for explanation)

> If we have issues with repairs pulling in old data, should MAX_WINDOW_SIZE instead be set to a larger value so that we don’t run the risk of too many SS Tables lying around and never getting compacted?

No, with CASSANDRA-10280 that old data will get compacted if needed (assuming you have default settings). If the remote node is correctly date tiered, the streamed sstable will also be correctly date tiered. Then that streamed sstable will be put in a time window and if there are enough sstables in that old window, we do a compaction.

/Marcus
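
The last point (a streamed sstable lands in a time window, and a window with enough sstables gets compacted) can be illustrated with a rough sketch. This is not Cassandra's code: it assumes fixed one-day windows and one representative timestamp per sstable, whereas real DTCS windows grow with age. The function name, the `(name, timestamp)` tuples, and the threshold of 4 are hypothetical choices for illustration only.

```python
from collections import defaultdict

def windows_to_compact(sstables, window_seconds=86400, min_threshold=4):
    """Group sstables into fixed-size time windows and return the windows
    that hold at least `min_threshold` sstables.

    `sstables` is an iterable of (name, timestamp_seconds) pairs.
    Simplification: real DTCS windows are not all the same size.
    """
    buckets = defaultdict(list)
    for name, timestamp in sstables:
        # Integer division assigns each sstable to its window index.
        buckets[timestamp // window_seconds].append(name)
    # Only windows with enough sstables are candidates for compaction.
    return {w: names for w, names in buckets.items() if len(names) >= min_threshold}

# Four sstables in an old window (one of them could have been streamed in
# by repair), plus a single sstable in the next window.
tables = [("a", 10), ("b", 20), ("c", 30), ("d", 40), ("e", 86400 + 5)]
candidates = windows_to_compact(tables)
print(candidates)
```

In this toy model the old window becomes a compaction candidate only because it holds four sstables; the window containing just `e` is left alone.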