Hi,

> If Cassandra only compacts one table at a time, then I should be safe if I keep as much free space as there is data in the largest table. If Cassandra can compact multiple tables simultaneously, then it seems that I need as much free space as all the tables put together, which means no more than 50% utilization.

That depends on your configuration. By default, Cassandra runs one concurrent compaction per CPU core. See concurrent_compactors for details.

> Also, what happens if a node gets low on disk space and there isn’t enough available for compaction?

Before a compaction runs, it checks whether there is enough disk space, based on its own estimate of what it needs. If there isn't, the compaction is not executed.

> Is there a way to salvage a node that gets into a state where it cannot compact its tables?

If you carefully run some cleanups, you'll free up some room, since each node then keeps only the data in its new range.

On Fri, Nov 29, 2013 at 12:21 PM, Robert Wille <rwi...@fold3.com> wrote:

> I’m trying to estimate our disk space requirements and I’m wondering about
> disk space required for compaction.
>
> My application mostly inserts new data and performs updates to existing
> data very infrequently, so there will be very few bytes removed by
> compaction. It seems that if a major compaction occurs, that performing the
> compaction will require as much disk space as is currently consumed by the
> table.
>
> So here’s my question. If Cassandra only compacts one table at a time,
> then I should be safe if I keep as much free space as there is data in the
> largest table. If Cassandra can compact multiple tables simultaneously,
> then it seems that I need as much free space as all the tables put
> together, which means no more than 50% utilization. So, how much free space
> do I need? Any rules of thumb anyone can offer?
>
> Also, what happens if a node gets low on disk space and there isn’t enough
> available for compaction? If I add new nodes to reduce the amount of data
> on each node, I assume the space won’t be reclaimed until a compaction
> event occurs. Is there a way to salvage a node that gets into a state where
> it cannot compact its tables?
>
> Thanks
>
> Robert
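
To make the free-space arithmetic in this thread concrete, here is a rough Python sketch. It is not how Cassandra computes anything internally. It assumes the pessimistic case from the question: a major compaction reclaims almost nothing, so it needs about as much free space as the table occupies, and at most `concurrent_compactors` tables compact at the same time. The function names and the table sizes are made up for illustration.

```python
def worst_case_headroom(table_sizes_gb, concurrent_compactors):
    """Free space (GB) needed if the N largest tables are compacted at once.

    Assumption: each compaction needs as much free space as the table
    currently occupies, and each compactor works on a different table.
    """
    largest = sorted(table_sizes_gb.values(), reverse=True)
    return sum(largest[:concurrent_compactors])


def max_safe_utilization(table_sizes_gb, concurrent_compactors):
    """Fraction of the disk that data may occupy under the same assumption."""
    total = sum(table_sizes_gb.values())
    headroom = worst_case_headroom(table_sizes_gb, concurrent_compactors)
    return total / (total + headroom)


# Hypothetical per-node table sizes in GB.
sizes = {"events": 400, "logs": 250, "users": 100}

one_at_a_time = worst_case_headroom(sizes, 1)    # largest table only
two_at_a_time = worst_case_headroom(sizes, 2)    # two largest tables
all_at_once = max_safe_utilization(sizes, 3)     # every table at once
print(one_at_a_time, two_at_a_time, all_at_once)
```

With one compactor the headroom equals the largest table. With as many compactors as tables it equals all the data on the node, which is the 50% utilization figure from the question.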