> What is strange: every time I run repair, the data takes almost 3 times
> more space - 270G; then I run compaction and get 100G back.
https://issues.apache.org/jira/browse/CASSANDRA-2699 outlines the main issues with repair. In short: in your case the limited granularity of Merkle trees is causing too much data to be streamed (effectively duplicate data). https://issues.apache.org/jira/browse/CASSANDRA-3912 may be a band-aid for you, in that it allows the granularity to be much finer and the process to be more incremental. A toy sketch of the granularity effect follows at the end of this mail.

A 'nodetool compact' decreases disk space temporarily, as you have noticed, but it may also have a long-term negative effect on steady-state disk space usage, depending on your workload. If your workload is not limited to insertions only (i.e., you have overwrites or deletes), a major compaction will tend to push steady-state disk usage up: you are creating a single sstable bigger than would normally occur, and it takes more total disk space before that sstable participates in a compaction again.
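For illustration, here is a toy Python sketch of why coarse Merkle leaves over-stream. It is not Cassandra's actual implementation; the leaf counts, token space, and row sizes are made-up numbers. The point is just that any leaf whose hash differs gets streamed in full, even when only one row in that range actually differs:

    import hashlib

    def leaf_hashes(rows, num_leaves):
        # Bucket rows by token into a fixed number of leaf ranges and
        # hash each bucket, mimicking a coarse Merkle tree's leaves.
        # `rows` maps token -> value; tokens are ints in [0, 2**32).
        buckets = [hashlib.sha1() for _ in range(num_leaves)]
        width = 2**32 // num_leaves
        for token in sorted(rows):
            buckets[min(token // width, num_leaves - 1)].update(rows[token])
        return [b.hexdigest() for b in buckets]

    def rows_to_stream(local, remote, num_leaves):
        # Every row under a mismatched leaf is streamed, not just the
        # row that actually differs.
        width = 2**32 // num_leaves
        mismatched = {
            i for i, (a, b) in enumerate(zip(leaf_hashes(local, num_leaves),
                                             leaf_hashes(remote, num_leaves)))
            if a != b
        }
        return {t: v for t, v in local.items()
                if min(t // width, num_leaves - 1) in mismatched}

    # Two replicas that differ in exactly one row out of ~100,000.
    local = {t: b"x" * 100 for t in range(0, 2**32, 2**32 // 100_000)}
    remote = dict(local)
    remote[0] = b"y" * 100

    coarse = rows_to_stream(local, remote, num_leaves=32)
    fine = rows_to_stream(local, remote, num_leaves=32768)
    print(len(coarse), len(fine))  # ~3000 rows streamed vs. a handful

With 32 leaves, the single bad row drags roughly 3,000 neighbours along with it; with 32k leaves, only a handful are streamed. That is the same effect, in miniature, as the extra data you are seeing on disk after repair.

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)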