We have a table defined using SizeTieredCompactionStrategy that is used to
store time series data. Over a period of a few days we wrote approximately
200,000 unique time-based entries for each of 700 identifiers, i.e. 700 wide
rows with 200,000 entries in each. The table was empty when we started, and
there were no updates to any entries, no deletions, and no tombstones
were created.

Our estimates suggested that this should have required about 7GB of disk
space, but when we looked on disk there were 8 sstables taking up 11GB.

Running nodetool compact on the column family reduced it to a single sstable
that does match our 7GB estimate.
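
As a rough sanity check on the figures above, the per-entry cost they imply can be worked out as follows. This is only a back-of-the-envelope sketch: it assumes 1GB means 10^9 bytes and that the size is spread evenly across entries, and the names used (bytes_per_entry and so on) are purely illustrative.

```python
# Figures taken from the description above.
# Assumption: "GB" means 10**9 bytes, and size is spread evenly over entries.
ROWS = 700
ENTRIES_PER_ROW = 200_000
GB = 10**9


def bytes_per_entry(total_gb, rows=ROWS, entries_per_row=ENTRIES_PER_ROW):
    """Average on-disk bytes per entry for a given total size in GB."""
    return total_gb * GB / (rows * entries_per_row)


total_entries = ROWS * ENTRIES_PER_ROW
compacted = bytes_per_entry(7)     # single sstable after nodetool compact
uncompacted = bytes_per_entry(11)  # the 8 sstables before compaction

print(f"total entries:            {total_entries:,}")
print(f"bytes/entry (compacted):  {compacted:.1f}")
print(f"bytes/entry (8 sstables): {uncompacted:.1f}")
print(f"extra bytes/entry:        {uncompacted - compacted:.1f}")
```

Under those assumptions the compacted table works out to about 50 bytes per entry, versus roughly 79 bytes per entry when the same data was spread across 8 sstables.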

I'd like to understand what accounts for the other 4GB when the data was
stored as multiple sstables. Is it because the individual sstables overlap?

