Awesome tip on TTL. We can use this as a catch-all to make sure all columns are purged based on time. It fits our use case well. I forgot this feature existed.

On Jun 22, 2011, at 7:11 PM, Eric tamme wrote:

>>> Second, compacting such large files is an IO killer. What can be tuned
>>> other than compaction_threshold to help optimize this and prevent the files
>>> from getting too big?
>>>
>>> Thanks!
>>
>
> Just a personal implementation note - I make heavy use of column TTL,
> so I have very specifically tuned cassandra to having a pretty
> constant max disk usage based on my data insertion rate, the TTL, the
> memtable flush threshold, and min compaction threshold. My data
> basically lives for 7 days and depending on where it is in the
> compaction cycle goes from 130 gigs per node up to 160gigs per node.
>
> If setting TTL is an option for you, It is one way to auto purge data
> and keep overall size in check.
>
> -Eric
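
For anyone sizing nodes the same way, here is a rough back-of-envelope sketch of the relationship Eric describes between insertion rate, TTL, and disk usage. The model is an assumption, not something Cassandra computes: it treats the low-water mark as live data only (insertion rate times TTL) and the high-water mark as live data plus expired data that compaction has not yet dropped. The function and parameter names are made up for illustration, and it ignores replication, indexes, and other on-disk overhead.

```python
# Back-of-envelope sketch; the model and all names here are assumptions,
# not something Cassandra computes or exposes.

def steady_state_disk_gb(ingest_gb_per_day, ttl_days, purge_lag_days):
    """Estimate the per-node disk range when every column carries a TTL.

    Floor: only live (unexpired) data is on disk.
    Peak: data that expired during the last purge_lag_days is still on
    disk, waiting for a compaction to drop it.
    """
    floor_gb = ingest_gb_per_day * ttl_days
    peak_gb = ingest_gb_per_day * (ttl_days + purge_lag_days)
    return floor_gb, peak_gb


def implied_rates(floor_gb, peak_gb, ttl_days):
    """Invert the model: recover ingest rate and purge lag from observed sizes."""
    ingest_gb_per_day = floor_gb / ttl_days
    purge_lag_days = (peak_gb - floor_gb) / ingest_gb_per_day
    return ingest_gb_per_day, purge_lag_days


# Figures from the quoted message: 7-day TTL, 130 GB to 160 GB per node.
ingest, lag = implied_rates(floor_gb=130.0, peak_gb=160.0, ttl_days=7.0)
print(f"ingest ~{ingest:.1f} GB/day per node, purge lag ~{lag:.1f} days")
```

Under those assumptions, the quoted numbers would correspond to roughly 18.6 GB/day of on-disk data per node, with expired data lingering for about a day and a half before compaction removes it. Treat this as a sanity check on the sizing, not a measurement.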