> So the manual compaction did help somewhat but did not get the nodes down to
> the size of their raw data. There are still multiple SSTables on most nodes.
>
> At 4:02pm, ran nodetool cleanup on every node.
>
> At 4:12pm, nodes are taking up the expected amount of space and all nodes are
> using exactly 1 SSTable (fully compacted):

One thing to keep in mind is that SSTables are not actually removed from disk
until the garbage collector has identified the relevant in-memory structures
as garbage (there is a note on the wiki about this somewhere; it's a way to
avoid the complexity of keeping track of when an sstable becomes safe to
delete).

I may be wrong, but I did a quick check and did not find an obvious GC trigger
in the codepath for the 'cleanup' command. So while I'm not sure why the
cleanup would necessarily help other than generally generating garbage and
perhaps triggering a GC, a delay in actually freeing disk space can probably
be attributed to the GC.

(The reason I don't understand why cleanup would help is that even if cleanup
did trigger sufficient garbage generation that CMS kicks in and does a
mark/sweep, thus triggering the deletion of old sstables, presumably the
cleanup itself would produce new sstables that would then have to wait anyway.
Unless there is some code path to avoid doing that if nothing at all changes
in the sstables as a result of the cleanup... I don't know.)

--
/ Peter Schuller
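
To make the deferred-deletion idea above concrete, here is a minimal sketch of
the general technique: a data file is removed only once the collector has
reclaimed the in-memory handle that refers to it. This is not Cassandra's
code (Cassandra is Java and runs on the JVM); it is a Python stand-in, and
the names SSTableHandle, _delete_file and demo are hypothetical. It assumes
CPython, where automatic collection can be switched off so that the delay is
visible.

```python
import gc
import os
import tempfile
import weakref


def _delete_file(path):
    # Runs only after the owning handle has been garbage collected.
    if os.path.exists(path):
        os.remove(path)


class SSTableHandle:
    """Hypothetical in-memory handle for one on-disk data file."""

    def __init__(self, path):
        self.path = path
        # A self-reference forms a cycle, so reference counting alone cannot
        # free the handle; only a collector pass can. This stands in for a
        # tracing collector such as the JVM's.
        self._cycle = self
        # Register the file deletion to run when the handle is collected.
        # The callback receives only the path, never the handle itself.
        weakref.finalize(self, _delete_file, path)


def demo():
    """Return (file exists before GC, file exists after GC)."""
    gc.disable()  # keep an automatic collection from running mid-demo
    try:
        fd, path = tempfile.mkstemp(suffix=".db")
        os.close(fd)
        handle = SSTableHandle(path)
        del handle  # the file is now obsolete, e.g. after a compaction
        exists_before_gc = os.path.exists(path)
        gc.collect()  # explicit collector pass reclaims the handle
        exists_after_gc = os.path.exists(path)
        return exists_before_gc, exists_after_gc
    finally:
        gc.enable()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only the ordering: dropping the last reference
does not free the disk space by itself; the space is freed when a collector
pass actually runs, which may be some time later.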