Rob, default compression is Snappy compression, and I have seen compression ranging between 2-4% (just as the docs say). I got the storage part. Does this mean that SSTables are decompressed as a result of compaction/repair? Is that the reason CPU utilization spikes up a little?

-SR

From: as...@outlook.com
To: user@cassandra.apache.org
Subject: RE: validation compaction
Date: Tue, 14 Oct 2014 17:09:14 -0500
Thanks Rob.

Date: Mon, 13 Oct 2014 13:42:39 -0700
Subject: Re: validation compaction
From: rc...@eventbrite.com
To: user@cassandra.apache.org

On Mon, Oct 13, 2014 at 1:04 PM, S C <as...@outlook.com> wrote:

I have started repairing a 10 node cluster, with one of the tables having > 1TB of data. I notice that the validation compaction actually shows > 3TB in the "nodetool compactionstats" bytes total. However, I have less than 1TB of data on the machine. If I take the 3 replicas into consideration, then 3TB makes sense. Per my understanding, validation only cares about data local to the machine running the validation compaction. Am I missing something here? Any help is much appreciated.

Compression is enabled by default; it's showing the uncompressed data size. Your 1TB of data would be 3TB without compression.

=Rob
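[Editor's illustration] A quick way to sanity-check Rob's arithmetic is to divide the on-disk size by the table's compression ratio (the "SSTable Compression Ratio" that nodetool cfstats reports, i.e. compressed size / uncompressed size). A minimal Python sketch of that calculation follows; the 0.33 ratio and the 1TB figure are assumed values for illustration, not numbers taken from this thread:

    # Sketch: "bytes total" in nodetool compactionstats counts uncompressed bytes,
    # while the SSTable data files on disk are compressed.
    on_disk_bytes = 1 * 1024**4   # ~1 TB of compressed SSTable data on disk (assumed)
    compression_ratio = 0.33      # hypothetical compressed/uncompressed ratio from nodetool cfstats

    uncompressed_bytes = on_disk_bytes / compression_ratio
    print(f"Expected compactionstats 'bytes total': ~{uncompressed_bytes / 1024**4:.1f} TB")

With those assumed numbers this prints roughly 3 TB, which matches the figure seen in "nodetool compactionstats" for ~1 TB of compressed data on the node.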