Re: Customized Compaction Strategy: Dev Questions

Russell Bradberry Wed, 04 Jun 2014 10:06:04 -0700

hmm, I see. So something similar to Capped Collections in MongoDB.



On June 4, 2014 at 1:03:46 PM, Redmumba (redmu...@gmail.com) wrote:

Not quite; if I'm at say 90% disk usage, I'd like to drop the oldest sstable 
rather than simply run out of space.

The problem with using TTLs is that I have to try and guess how much data is 
being put in--since this is auditing data, the usage can vary wildly depending 
on time of year, verbosity of auditing, etc..  I'd like to maximize the disk 
space--not optimize the cleanup process.

Andrew


On Wed, Jun 4, 2014 at 9:47 AM, Russell Bradberry <rbradbe...@gmail.com> wrote:
You mean this:

https://issues.apache.org/jira/browse/CASSANDRA-5228

?



On June 4, 2014 at 12:42:33 PM, Redmumba (redmu...@gmail.com) wrote:

Good morning!

I've asked (and seen other people ask) about the ability to drop old sstables, 
basically creating a FIFO-like clean-up process.  Since we're using Cassandra 
as an auditing system, this is particularly appealing to us because it means we 
can maximize the amount of auditing data we can keep while still allowing 
Cassandra to clear old data automatically.

My idea is this: perform compaction based on the range of dates available in 
the sstable (or just metadata about when it was created).  For example, a major 
compaction could create a combined sstable per day--so that, say, 60 days of 
data after a major compaction would contain 60 sstables.

My question then is, will this be possible by simply implementing a separate 
AbstractCompactionStrategy?  Does this sound feasilble at all?  Based on the 
implementation of Size and Leveled strategies, it looks like I would have the 
ability to control what and how things get compacted, but I wanted to verify 
before putting time into it.

Thank you so much for your time!

Andrew

Re: Customized Compaction Strategy: Dev Questions

Reply via email to