Great!!! Thanks Andrei!!! That's the answer I was looking for :)
Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From: "Andrei Ivanov" <aiva...@iponweb.net>
Date: Thu, 23 Apr, 2015 at 11:57 pm
Subject: Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

Just in case it helps - we are running C* with sstable sizes of something like 2.5 TB and ~4 TB/node. No evident problems except the time it takes to compact.

Andrei.

On Wed, Apr 22, 2015 at 5:36 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

Thanks Robert!!

The JIRA was very helpful in understanding how the tombstone threshold is implemented. The ticket also says that running major compaction weekly is an alternative. What I actually want to understand is what happens if I run major compaction on a CF with 500 GB of data and a single giant file is created. Do you see any problems with Cassandra processing such a huge file? Is there a maximum sstable size beyond which performance etc. degrades? What are the implications?

Thanks
Anuj Wadehra

Sent from Yahoo Mail on Android

From: "Robert Coli" <rc...@eventbrite.com>
Date: Fri, 17 Apr, 2015 at 10:55 pm
Subject: Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

On Tue, Apr 14, 2015 at 8:29 PM, Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:

By automatic tombstone compaction, I am referring to the tombstone_threshold sub-property under the compaction strategy in CQL. It is 0.2 by default. What I understand from the Datastax documentation is that even if an sstable does not find sstables of similar size (STCS), an automatic tombstone compaction will trigger on that sstable when 20% of its data is tombstones. This compaction works on a single sstable only.

Overall system behavior is discussed here:

https://issues.apache.org/jira/browse/CASSANDRA-6654?focusedCommentId=13914587&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13914587

They are talking about LCS, but the principles apply, with an overlay of how STCS behaves.

=Rob
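
For readers following the thread, the single-sstable tombstone check described above can be sketched roughly as below. This is a simplified illustration in Python, not Cassandra's actual code: the SSTableStats class and the worth_tombstone_compaction function are hypothetical names, and the sketch assumes only the documented defaults of tombstone_threshold (0.2) and tombstone_compaction_interval (1 day). The real check also considers overlap with other sstables, which is left out here.

```python
from dataclasses import dataclass

# Documented defaults for the compaction sub-properties (assumed here).
TOMBSTONE_THRESHOLD = 0.2                 # estimated droppable tombstone ratio
TOMBSTONE_COMPACTION_INTERVAL = 86400     # minimum sstable age in seconds (1 day)


@dataclass
class SSTableStats:
    """Hypothetical summary of one sstable, for illustration only."""
    droppable_tombstone_ratio: float  # estimated share of droppable tombstones
    age_seconds: int                  # time since the sstable was created


def worth_tombstone_compaction(
    sstable: SSTableStats,
    threshold: float = TOMBSTONE_THRESHOLD,
    min_age: int = TOMBSTONE_COMPACTION_INTERVAL,
) -> bool:
    """Return True if this sstable alone is a candidate for tombstone compaction.

    Simplification: the overlap check against other sstables is ignored.
    """
    # Recently written sstables are skipped so they are not recompacted repeatedly.
    if sstable.age_seconds < min_age:
        return False
    # Only a ratio strictly above the threshold qualifies.
    return sstable.droppable_tombstone_ratio > threshold


if __name__ == "__main__":
    old_and_dirty = SSTableStats(droppable_tombstone_ratio=0.25, age_seconds=2 * 86400)
    print(worth_tombstone_compaction(old_and_dirty))
```

Under these assumptions, an sstable qualifies by its own tombstone ratio regardless of whether STCS can find similar-sized sstables to group it with, which is the point made in the original question.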