Any comments on the side effects of major compaction, especially when the generated sstable is 100+ GB?
Since Cassandra 1.2, automatic tombstone compaction runs even on a single sstable if its tombstone percentage exceeds the tombstone_threshold sub-property specified in the compaction strategy. So even if the huge sstable is not compacted with any newer sstable, its tombstones will still be collected. Is there any other disadvantage of having a giant sstable of hundreds of GB? I understand that sstables have a summary and an index, which help locate the correct data blocks directly in a large data file. Are there still any disadvantages?

Thanks
Anuj Wadehra

From: "Anuj Wadehra" <anujw_2...@yahoo.co.in>
Date: Mon, 13 Apr, 2015 at 12:33 am
Subject: Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

No.

Anuj Wadehra

On Monday, 13 April 2015 12:23 AM, Sebastian Estevez <sebastian.este...@datastax.com> wrote:

Have you tried user defined compactions via JMX?

On Apr 12, 2015 1:40 PM, "Anuj Wadehra" <anujw_2...@yahoo.co.in> wrote:

Recently we faced an issue where every repair operation added hundreds of sstables (CASSANDRA-9146). To bring the situation under control and make sure reads were not impacted, we were left with no option but to run a major compaction so that the thousands of tiny sstables were compacted.

Queries:

Does major compaction have any drawbacks now that automatic tombstone compaction has been implemented in 1.2 via the tombstone_threshold sub-property (CASSANDRA-3442)? I understand that the huge sstable created by a major compaction won't be compacted with new data any time soon, but is that a problem if purgeable data is removed via automatic tombstone compaction?

If a major compaction results in a huge file, say 500 GB, what are the drawbacks?

If one big sstable is a problem, is there any way of solving it?
We tried running sstablesplit after the major compaction to split the big sstable, but because the new sstables were all the same size, they were compacted back into a single huge sstable once Cassandra was started again after sstablesplit finished.

Thanks
Anuj Wadehra
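
The sstablesplit behaviour described above can be illustrated with a rough model of how size-tiered compaction groups sstables. The Python sketch below is a simplification for illustration only, not Cassandra's actual code: the function names are hypothetical, and the defaults (bucket_low 0.5, bucket_high 1.5, min_threshold 4, min_sstable_size 50 MB) are assumed to match the usual size-tiered settings. It shows why a set of equally sized split outputs becomes a compaction candidate straight away, while a single huge sstable has no similarly sized peers and is left alone.

```python
def bucket_by_size(sizes_mb, bucket_low=0.5, bucket_high=1.5,
                   min_sstable_size_mb=50):
    """Group sstable sizes into buckets of 'similar' size (simplified model)."""
    buckets = []  # each bucket tracks its members and their running average
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = bucket["avg"]
            # An sstable joins a bucket if it is close to the bucket average...
            similar = avg * bucket_low < size < avg * bucket_high
            # ...or if both it and the bucket are below the "small" cutoff.
            both_small = size < min_sstable_size_mb and avg < min_sstable_size_mb
            if similar or both_small:
                bucket["members"].append(size)
                bucket["avg"] = sum(bucket["members"]) / len(bucket["members"])
                break
        else:
            # No existing bucket fits, so this sstable starts a new one.
            buckets.append({"avg": float(size), "members": [size]})
    return [b["members"] for b in buckets]


def compaction_candidates(sizes_mb, min_threshold=4, **bucket_options):
    """Return the buckets that hold enough sstables to trigger a compaction."""
    return [b for b in bucket_by_size(sizes_mb, **bucket_options)
            if len(b) >= min_threshold]


if __name__ == "__main__":
    # One 500 GB sstable after a major compaction: nothing to compact it with.
    print(compaction_candidates([500_000]))
    # Ten equally sized outputs from a split: one bucket, immediately eligible.
    print(compaction_candidates([50] * 10))
```

Under this model, splitting into chunks of one size only recreates a full bucket, so the pieces are merged again at the next compaction check. Splitting into deliberately different sizes would, in principle, spread them across buckets, though that is an inference from the model rather than something confirmed in this thread.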