Any comments on the side effects of major compaction, especially when the 
generated sstable is 100+ GB? 


Since Cassandra 1.2, automatic tombstone compaction runs even on a single 
sstable if its tombstone percentage exceeds the tombstone_threshold sub-property 
specified in the compaction strategy. So even if the huge sstable is not 
compacted with any new sstable, its tombstones will still be collected. Is there 
any other disadvantage of having a giant sstable of hundreds of GB? I understand 
that sstables have a summary and an index, which help locate the correct data 
blocks directly in a large data file. Are there still any disadvantages?
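
To make the question concrete, here is a rough sketch of the single-sstable 
tombstone check as I understand it. The class and function names are made up 
for illustration; the defaults (tombstone_threshold = 0.2, 
tombstone_compaction_interval = 86400 seconds) are the documented ones as far 
as I know. The real check also looks at overlap with other sstables, which 
this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class SSTableStats:
    """Hypothetical summary of one sstable; not a Cassandra class."""
    name: str
    size_gb: float
    droppable_tombstone_ratio: float  # estimated fraction of droppable tombstones
    age_seconds: int                  # time since the sstable was written


def eligible_for_tombstone_compaction(
    sstable: SSTableStats,
    tombstone_threshold: float = 0.2,            # assumed default
    tombstone_compaction_interval: int = 86400,  # assumed default, in seconds
) -> bool:
    """Return True if the sstable alone would qualify for a tombstone compaction.

    Simplification: overlap with other sstables is not considered here.
    """
    old_enough = sstable.age_seconds >= tombstone_compaction_interval
    enough_tombstones = sstable.droppable_tombstone_ratio > tombstone_threshold
    return old_enough and enough_tombstones


# Hypothetical 500 GB sstable left behind by a major compaction.
big = SSTableStats("big-Data.db", 500.0, 0.35, 7 * 86400)
fresh = SSTableStats("fresh-Data.db", 500.0, 0.35, 3600)
clean = SSTableStats("clean-Data.db", 500.0, 0.05, 7 * 86400)
```

Under these assumptions only `big` qualifies: `fresh` is younger than the 
interval and `clean` is below the threshold, regardless of size.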


Thanks

Anuj Wadehra



From:"Anuj Wadehra" <anujw_2...@yahoo.co.in>
Date:Mon, 13 Apr, 2015 at 12:33 am
Subject:Re: Drawbacks of Major Compaction now that Automatic Tombstone 
Compaction Exists

No.


Anuj Wadehra




On Monday, 13 April 2015 12:23 AM, Sebastian Estevez 
<sebastian.este...@datastax.com> wrote:



Have you tried user defined compactions via JMX?

On Apr 12, 2015 1:40 PM, "Anuj Wadehra" <anujw_2...@yahoo.co.in> wrote:

Recently we faced an issue where every repair operation added hundreds of 
sstables (CASSANDRA-9146). To bring the situation under control and make sure 
reads were not impacted, we were left with no option but to run a major 
compaction so that the thousands of tiny sstables were compacted.

Queries:
Does major compaction have any drawback now that automatic tombstone compaction 
was implemented in 1.2 via the tombstone_threshold sub-property (CASSANDRA-3442)? 
I understand that the huge sstable created by a major compaction won't be 
compacted with new data any time soon, but is that a problem if purgeable data 
is removed via automatic tombstone compaction? If a major compaction results in 
a huge file, say 500 GB, what are the drawbacks?

If one big sstable is a problem, is there any way of solving it? We tried 
running sstablesplit after the major compaction to split the big sstable, but 
because the new sstables were all the same size, they were compacted into a 
single huge sstable again once Cassandra was started after sstablesplit 
finished.
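
For what it is worth, the re-compaction after sstablesplit is what I would 
expect from size-tiered bucketing. Below is a simplified sketch, assuming 
SizeTieredCompactionStrategy with its usual options (bucket_low = 0.5, 
bucket_high = 1.5, min_sstable_size = 50 MB, min_threshold = 4, 
max_threshold = 32). The function names and the sizes are hypothetical, and 
the real strategy has more rules than this.

```python
def size_tiered_buckets(sizes_mb, bucket_low=0.5, bucket_high=1.5,
                        min_sstable_size=50):
    """Group sstable sizes (in MB) into buckets of similar size.

    A size joins a bucket when it lies within
    (avg * bucket_low, avg * bucket_high) of that bucket's average,
    or when both are below min_sstable_size.
    """
    buckets = []  # each entry is [running_average, [member sizes]]
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg, members = bucket
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                members.append(size)
                bucket[0] = sum(members) / len(members)
                break
        else:
            # No existing bucket fits, so this size starts a new one.
            buckets.append([size, [size]])
    return [members for _, members in buckets]


def compaction_candidates(sizes_mb, min_threshold=4, max_threshold=32):
    """Return buckets that hold enough sstables to trigger a compaction."""
    return [
        bucket[:max_threshold]
        for bucket in size_tiered_buckets(sizes_mb)
        if len(bucket) >= min_threshold
    ]


# Hypothetical: one 500 GB sstable split into ten equal 50 GB pieces.
after_split = [51200] * 10
# Hypothetical: the unsplit giant next to a few small, newer sstables.
giant_and_small = [512000, 100, 110, 5000]
```

With `after_split`, all ten pieces land in one bucket, which is well over 
min_threshold, so they are merged again. With `giant_and_small`, no bucket 
reaches four members, so the giant sstable simply sits there without peers.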



Thanks

Anuj Wadehra


