What are the timestamps on those SSTables? Do those tables use TTL?

To answer your last question, I've seen that scenario happen under load testing on column families with TTL. Large loads within the TTL window cause normal compaction to build up larger and larger SSTables. When the load falls off, a couple of very large tables are left behind, and under normal load the small tables get TTL'd before any of them grows large enough to land in the same bucket as the large ones. That bucket never reaches min_threshold, so data that's months old and should have been TTL'd never gets the chance to be purged.
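To make the grouping concrete, here is a rough sketch of size-tiered bucketing applied to the five data file sizes from the original question. It is my own simplification, not the actual Cassandra code: the function name stcs_buckets is made up, and I'm assuming the usual defaults of bucket_low = 0.5, bucket_high = 1.5 and a 50 MB min_sstable_size.

```python
def stcs_buckets(sizes_gb, bucket_low=0.5, bucket_high=1.5,
                 min_sstable_size_gb=0.05):
    """Group SSTable sizes into buckets of 'similar' size.

    Simplified illustration only (hypothetical helper, assumed defaults):
    a table joins a bucket when its size lies between bucket_low and
    bucket_high times the bucket's current average size, or when both
    the table and the bucket average are below min_sstable_size.
    """
    buckets = []  # each entry is [average_size, [member sizes]]
    for size in sorted(sizes_gb):
        for bucket in buckets:
            avg = bucket[0]
            similar = avg * bucket_low < size < avg * bucket_high
            both_small = (size < min_sstable_size_gb
                          and avg < min_sstable_size_gb)
            if similar or both_small:
                members = bucket[1]
                members.append(size)
                bucket[0] = sum(members) / len(members)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size, [size]])
    return [members for _, members in buckets]


sizes = [1.70, 0.18, 0.16, 0.05, 8.61]  # GB, from the original question

# With the assumed defaults, the 8.61 GB table sits alone in its bucket.
print(stcs_buckets(sizes))

# With a much larger bucket_high, everything lands in a single bucket.
print(stcs_buckets(sizes, bucket_high=20))
```

In this sketch a bucket only becomes a compaction candidate once it holds min_threshold tables (4 by default), so with the default bucket_high none of the buckets above would qualify yet.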
There's a nice blog post that covers bucket_high and the bucketing algorithm. Yes, I believe setting bucket_high large enough will cause the one large table to be grouped with the others. However, if you're not using TTL, I don't think there's an issue: the small tables simply need to build up until there are another three medium-sized (1.7 GB) tables, which in turn need to build up until there are four larger (8 GB) tables.

http://shrikantbang.wordpress.com/2014/04/22/size-tiered-compaction-strategy-in-apache-cassandra/

On Mon, Jul 7, 2014 at 12:13 PM, John Sanda <john.sa...@gmail.com> wrote:

> I have a write-heavy table that is using size tiered compaction. I am
> running C* 1.2.9. There is an SSTable that is not getting compacted. It is
> disproportionately larger than the other SSTables. The data file sizes are,
>
> 1.70 GB
> 0.18 GB
> 0.16 GB
> 0.05 GB
> 8.61 GB
>
> If I set the bucket_high compaction property on the table to a
> sufficiently large value, will the 8.61 GB get compacted? What if any
> drawbacks are there to increasing the bucket_high property?
>
> In what scenarios could I wind up with such a disproportionately large
> SSTable like this? One thing that comes to mind is major compactions, but I
> have not that.
>
> - John

--
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
ken.hanc...@schange.com