If the versions can be guaranteed to be adjacent (i.e. if the latest version is V, the prior version is V-1), you could issue a delete at the same time as an insert: when inserting version V, also delete version V-N-(buffer), where buffer >= 0.
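
As a rough sketch of that idea (not a tested recipe), the helper below builds the pair of statements for one write against the `foo` table from the quoted message. The function name `versioned_write`, the `%s` placeholder style, and the assumption that versions are non-negative integers are all mine, for illustration only; how you execute the statements (e.g. together in a batch) is up to your driver.

```python
def versioned_write(rowkey, family, qualifier, version, value, n, buffer=0):
    """Build (cql, params) pairs for writing `version` while keeping the last `n`.

    Hypothetical helper: assumes versions are adjacent, non-negative integers,
    so the version to expire is simply version - n - buffer.
    """
    if n < 1 or buffer < 0:
        raise ValueError("need n >= 1 and buffer >= 0")

    statements = [(
        "INSERT INTO foo (rowkey, family, qualifier, version, value) "
        "VALUES (%s, %s, %s, %s, %s)",
        (rowkey, family, qualifier, version, value),
    )]

    expired = version - n - buffer
    if expired >= 0:  # nothing to delete until enough versions exist
        statements.append((
            "DELETE FROM foo WHERE rowkey = %s AND family = %s "
            "AND qualifier = %s AND version = %s",
            (rowkey, family, qualifier, expired),
        ))
    return statements
```

With n = 3 and buffer = 0, writing version 10 also deletes version 7, leaving 8, 9 and 10. A larger buffer delays the delete, which leaves some slack for out-of-order writes at the cost of temporarily storing more than N versions.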

In general guaranteeing that is probably hard, so this seems like something that would be nice to have C* manage for you. Unfortunately we don't have anything on the roadmap to help with this. A custom compaction strategy might do the trick, or permitting some filter during compaction that can omit/tombstone certain records based on the input data. This latter option probably wouldn't be too hard to implement, although it might not offer any guarantees about expiring records in order without incurring extra compaction cost (you could reasonably easily guarantee the most recent N are present, but the cleaning up of older records might happen haphazardly, in no particular order, and without any promptness guarantees, if you want to do it cheaply).

Feel free to file a ticket, or submit a patch!

On Fri, Jul 18, 2014 at 1:32 AM, Clint Kelly <clint.ke...@gmail.com> wrote:

> Hi everyone,
>
> I am trying to design a schema that will keep the N-most-recent
> versions of a value. Currently my table looks like the following:
>
> CREATE TABLE foo (
>     rowkey text,
>     family text,
>     qualifier text,
>     version long,
>     value blob,
>     PRIMARY KEY (rowkey, family, qualifier, version))
> WITH CLUSTER ORDER BY (rowkey ASC, family ASC, qualifier ASC, version
> DESC));
>
> Is there any standard design pattern for updating such a layout such
> that I keep the N-most-recent (version, value) pairs for every unique
> (rowkey, family, qualifier)? I can't think of any way to do this
> without doing a read-modify-write. The best thing I can think of is
> to use TTL to approximate the desired behavior (which will work if I
> know how often we are writing new data to the table). I could also
> use "LIMIT N" in my queries to limit myself to only N items, but that
> does not address any of the storage-size issues.
>
> In case anyone is curious, this question is related to some work that
> I am doing translating a system built on HBase (which provides this
> "keep the N-most-recent-version-of-a-cell" behavior) to Cassandra
> while providing the user with as-similar-as-possible an interface.
>
> Best regards,
> Clint