Cassandra does "minor" compactions with a minimum of 4 sstables in the same "bucket," with buckets doubling in size as you compact. So you only ever rewrite all the data during your weekly-ish major compaction, which is for tombstone cleanup and anti-entropy.
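To make the bucketing concrete, here is a rough sketch -- not the actual Cassandra code; the class names, the "within 50% of the bucket's average size" grouping rule, and the threshold constant are just illustrative assumptions -- of how sstables of similar size end up in the same bucket and only get minor-compacted once a bucket reaches the 4-sstable minimum:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not Cassandra's real implementation.
// Assumes a size-tiered rule where an sstable joins a bucket if its size
// is within roughly 50% of the bucket's average, and a minimum of 4
// sstables per minor compaction.
public class SizeTieredSketch {

    static final int MIN_COMPACTION_THRESHOLD = 4;

    // Group sstable sizes (in bytes) into buckets of roughly similar size.
    static List<List<Long>> bucket(List<Long> sstableSizes) {
        List<List<Long>> buckets = new ArrayList<>();
        for (long size : sstableSizes) {
            boolean placed = false;
            for (List<Long> b : buckets) {
                long avg = (long) b.stream().mapToLong(Long::longValue).average().orElse(0);
                // Assumed rule: join if within 0.5x .. 1.5x of the bucket average.
                if (size > avg / 2 && size < avg * 3 / 2) {
                    b.add(size);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                List<Long> fresh = new ArrayList<>();
                fresh.add(size);
                buckets.add(fresh);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        // A handful of freshly flushed ~64 MB sstables plus some older, larger ones.
        List<Long> sizes = List.of(64L << 20, 66L << 20, 63L << 20, 65L << 20,
                                   256L << 20, 260L << 20, 1L << 30);
        for (List<Long> b : bucket(new ArrayList<>(sizes))) {
            if (b.size() >= MIN_COMPACTION_THRESHOLD) {
                System.out.println("minor compaction candidate: " + b.size() + " sstables");
            } else {
                System.out.println("bucket left alone: " + b.size() + " sstables");
            }
        }
    }
}
```

The point of the sketch is just that each minor compaction only touches a handful of similarly-sized files, so the cost of merging grows with the size of that bucket, not with the size of the whole ColumnFamily.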
-Jonathan

On Tue, Mar 30, 2010 at 12:54 AM, Julian Simon <jsi...@jules.com.au> wrote:
> Forgive me as I'm probably a little out of my depth in trying to
> assess this particular design choice within Cassandra, but...
>
> My understanding is that Cassandra never updates data "in place" on
> disk - instead it completely re-creates the data files during a
> "flush". Stop me if I'm wrong already ;-)
>
> So imagine we have a large data set in our ColumnFamily and we're
> constantly adding data to it.
>
> Every [x] minutes or [y] bytes, the compaction process is triggered,
> and the entire data set is written to disk.
>
> So as our data set grows over time, the compaction process will result
> in an increasingly large IO operation to write all that data to disk
> each time.
>
> We could easily be talking about single data files in the
> many-gigabyte size range, no? Or is there a file size limit that I'm
> not aware of?
>
> If not, is this an efficient approach to take for large data sets?
> Seems like we would become awfully IO bound, writing the entire thing
> from scratch each time.
>
> Do let me know if I've gotten it all wrong ;-)
>
> Cheers,
> Jules