On Thu, Jul 11, 2013 at 2:46 AM, Tomàs Núnez <tomas.nu...@groupalia.com> wrote:
> Hi
>
> About a year ago, we did a major compaction in our cassandra cluster (a
> n00b mistake, I know), and since then we've had huge sstables that never
> get compacted, and we were condemned to repeat the major compaction process
> every once in a while (we are using SizeTieredCompaction strategy, and
> we've not yet evaluated LeveledCompaction, because it has its downsides,
> and we've had no time to test all of them in our environment).
>
> I was trying to find a way to solve this situation (that is, do something
> like a major compaction that writes small sstables, not huge ones as major
> compaction does), and I couldn't find it in the documentation.
>

https://github.com/pcmanus/cassandra/tree/sstable_split

1) run sstable_split on One Big SSTable (being careful to avoid name
   collisions if done with the node running)
2) stop node
3) remove One Big SSTable
4) start node

This approach is significantly more i/o efficient than your online
solution, but it does require a node restart and messing around directly
with SSTables.

Your online solution is clever!

If you choose to use this tool, please let us know the result. With some
feedback, pcmanus (Sylvain) is likely to merge it into Cassandra as a
useful tool for dealing with situations like this one.

=Rob
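
P.S. To illustrate why One Big SSTable never gets compacted under SizeTieredCompaction, and why splitting it helps, here is a rough Python sketch of size-tiered bucketing. It is a simplification under stated assumptions, not Cassandra's actual code: the function names are made up for illustration, the parameter values (bucket_low=0.5, bucket_high=1.5, min_threshold=4) are what I believe the defaults to be, and it ignores min_sstable_size and max_threshold entirely.

```python
def size_tiered_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group sstable sizes into buckets of similar size (simplified sketch).

    A size joins the first bucket whose running average `avg` satisfies
    bucket_low * avg <= size <= bucket_high * avg; otherwise it starts
    a new bucket of its own.
    """
    buckets = []  # each entry: {"avg": float, "members": [sizes]}
    for size in sorted(sizes):
        for bucket in buckets:
            avg = bucket["avg"]
            if bucket_low * avg <= size <= bucket_high * avg:
                bucket["members"].append(size)
                bucket["avg"] = sum(bucket["members"]) / len(bucket["members"])
                break
        else:
            # No existing bucket is close enough in size.
            buckets.append({"avg": float(size), "members": [size]})
    return [b["members"] for b in buckets]


def compaction_candidates(sizes, min_threshold=4):
    """Return only the buckets with enough members to trigger a compaction."""
    return [b for b in size_tiered_buckets(sizes) if len(b) >= min_threshold]


# Hypothetical sizes in GB: four small sstables plus one huge one left
# behind by a major compaction.
before_split = [1, 1, 1, 1, 200]
# The same data after splitting the 200 GB sstable into four 50 GB pieces.
after_split = [1, 1, 1, 1, 50, 50, 50, 50]

print(compaction_candidates(before_split))
print(compaction_candidates(after_split))
```

In the first case the 200 GB sstable sits alone in its bucket and is never picked; after the split, the four 50 GB pieces form a bucket that meets the threshold, so they take part in normal compaction again.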