You could turn gc_grace_seconds down to zero and tune compaction options for this CF to keep the tombstone count down.
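
As a rough sketch of what that could look like (the table name is taken from the trace in the original question; the compaction values are illustrative assumptions, not tested recommendations):

```cql
-- Sketch only: purge tombstones as soon as compaction sees them, and make
-- size-tiered compaction more eager to rewrite tombstone-heavy sstables.
-- The threshold and interval values below are assumptions for illustration.
ALTER TABLE AccountBalances
  WITH gc_grace_seconds = 0
  AND compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': '0.1',
    'tombstone_compaction_interval': '3600'
  };
```

Keep in mind that gc_grace_seconds = 0 is generally only safe on a single node, or when you can tolerate a replica that missed a delete bringing the old data back once the tombstone has been purged elsewhere.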
But... This query looks a lot like a ledger. If that is so, treat it as such and skip the updates by:
- modifying the schema to include a timeuuid as part of a compound key (and using that timeuuid for order)
- selecting the most recent row via LIMIT 1
- would you even need Paxos at this point? (please read Pat Helland's Building on Quicksand: http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-components-postattachments/00-09-20-52-14/BuildingOnQuicksand_2D00_V3_2D00_081212h_2D00_pdf.pdf particularly section 6.2)
- using a TTL to keep the table tame if it's high volume

This 'immutable' approach plays much nicer with Cassandra's strong points.

On Sun, May 25, 2014 at 2:01 PM, Charlie Mason <charlie....@gmail.com> wrote:

> Hi All,
>
> I have a table which has one column per user. It receives a lot of
> updates to these columns throughout their lifetime. They are always
> updates on a few specific columns. Firstly, is Cassandra storing a
> tombstone for each of these old column values?
>
> I have run a simple select and seen the following tracing results:
>
> activity                                                                                  | timestamp    | source    | source_elapsed
> ------------------------------------------------------------------------------------------+--------------+-----------+---------------
> execute_cql3_query                                                                        | 19:48:36,582 | 127.0.0.1 | 0
> Parsing SELECT Account, Balance FROM AccountBalances WHERE Account = 'test9' LIMIT 10000; | 19:48:36,582 | 127.0.0.1 | 56
> Preparing statement                                                                       | 19:48:36,582 | 127.0.0.1 | 181
> Executing single-partition query on accountbalances                                       | 19:48:36,583 | 127.0.0.1 | 878
> Acquiring sstable references                                                              | 19:48:36,583 | 127.0.0.1 | 895
> Merging memtable tombstones                                                               | 19:48:36,583 | 127.0.0.1 | 918
> Key cache hit for sstable 569                                                             | 19:48:36,583 | 127.0.0.1 | 997
> Seeking to partition beginning in data file                                               | 19:48:36,583 | 127.0.0.1 | 1034
> Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones                 | 19:48:36,583 | 127.0.0.1 | 1383
> Merging data from memtables and 1 sstables                                                | 19:48:36,583 | 127.0.0.1 | 1402
> Read 1 live and 123780 tombstoned cells                                                   | 19:48:36,710 | 127.0.0.1 | 128631
> Request complete                                                                          | 19:48:36,711 | 127.0.0.1 | 129276
>
> As you can see, that's an awful lot of tombstoned cells. That's after a
> full compaction as well. Just so you are aware, this table is updated
> using a Paxos IF statement.
>
> It still seems fairly snappy, however I am concerned it's only going to
> get worse.
>
> Would I be better off adding a time-based key to the primary key, then
> doing a separate insert and then deleting the original? If I did the
> query with a limit of one, it should always find the first rows before
> hitting a tombstone. Is that correct?
>
> Thanks,
>
> Charlie M

--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
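
For reference, the ledger-style layout suggested in the reply could be sketched roughly as follows. The table name AccountLedger, the column name Changed, the decimal type for Balance, and the 30-day TTL are all assumptions made for illustration; none of them come from the original schema.

```cql
-- Hypothetical ledger table: one immutable row per balance change,
-- clustered newest-first so the current balance is the first row read.
CREATE TABLE AccountLedger (
    Account text,
    Changed timeuuid,   -- assumed column name; orders entries by time
    Balance decimal,    -- assumed type; the original type is not shown
    PRIMARY KEY (Account, Changed)
) WITH CLUSTERING ORDER BY (Changed DESC);

-- Append a new entry instead of updating in place.
-- The TTL (30 days, an assumed value) keeps old entries from piling up.
INSERT INTO AccountLedger (Account, Changed, Balance)
VALUES ('test9', now(), 42.50)
USING TTL 2592000;

-- Read only the most recent entry for the account.
SELECT Account, Balance FROM AccountLedger
WHERE Account = 'test9' LIMIT 1;
```

Because rows are only ever inserted, reads with LIMIT 1 should stop at the newest live row rather than scanning past old values. One thing to watch with this sketch: if an account goes longer than the TTL without a new entry, its latest balance would expire too, so the TTL would need to be chosen with that in mind.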