> What is strange is that every time I run repair, data takes almost 3
> times more space - 270G; then I run compaction and get 100G back.

https://issues.apache.org/jira/browse/CASSANDRA-2699 outlines the
main issues with repair. In short: in your case the limited
granularity of the Merkle trees is causing far too much data to be
streamed, effectively re-transmitting data that is already in sync.
https://issues.apache.org/jira/browse/CASSANDRA-3912 may be a
band-aid for you, in that it allows the granularity to be much finer
and the process to be more incremental.
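
To make the granularity problem concrete, here is a rough
back-of-the-envelope sketch in Python. The numbers and the
overstream_estimate helper are illustrative assumptions on my part
(Cassandra has historically capped the repair Merkle tree at 2**15
leaf ranges), and it assumes each out-of-sync row lands in a distinct
leaf:

    # Estimate repair over-streaming caused by Merkle tree granularity.
    # A mismatch in a leaf streams that leaf's entire range, not just
    # the rows that actually differ.

    def overstream_estimate(data_bytes, leaf_ranges, mismatched_rows, row_bytes):
        leaf_bytes = data_bytes / float(leaf_ranges)  # data covered per leaf
        streamed = mismatched_rows * leaf_bytes       # whole leaves get streamed
        actual = mismatched_rows * row_bytes          # data truly out of sync
        return streamed, actual

    # 100G of data, 2**15 leaves, 1000 stale rows of ~1KB each:
    streamed, actual = overstream_estimate(100 * 2**30, 2**15, 1000, 1024)
    print("streamed %.1f GB to fix %.1f MB of real differences"
          % (streamed / 2.0**30, actual / 2.0**20))

Under those assumptions that is roughly 3 GB streamed to reconcile
about 1 MB of genuinely divergent data, which is the flavor of
blow-up that can make the on-disk data set grow several times over
after a repair.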

A 'nodetool compact' decreases disk space temporarily, as you have
noticed, but it may also have a long-term negative effect on
steady-state disk space usage, depending on your workload. If your
workload is not insert-only (i.e., you have overwrites or deletes), a
major compaction will tend to push steady-state disk usage up: you
are creating a single sstable much bigger than would normally exist,
and much more total data must accumulate on disk before that sstable
takes part in a compaction again, so its obsolete rows linger.
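
To illustrate with rough numbers (assuming size-tiered compaction
with the default min compaction threshold of 4; the figures are only
indicative):

    # After a major compaction you have one ~100G sstable. Size-tiered
    # compaction only merges sstables of similar size, so that sstable's
    # obsolete rows are not purged until enough similarly sized peers
    # accumulate to trigger another compaction.

    MIN_THRESHOLD = 4  # default size-tiered min compaction threshold

    def space_before_next_compaction(big_sstable_gb):
        peers = MIN_THRESHOLD - 1  # similar-size sstables still needed
        return big_sstable_gb * (1 + peers)

    print("%.0fG on disk before a 100G sstable is compacted again"
          % space_before_next_compaction(100))
    # -> 400G on disk before a 100G sstable is compacted again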

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
