Hi all,

I have a table with one row per user. It receives a lot of updates over its lifetime, always to a few specific columns. First question: is Cassandra storing a tombstone for each of these old column values?
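For concreteness, the table looks roughly like this (reconstructed from the query in the trace below; the column types are my guesses):

```sql
CREATE TABLE AccountBalances (
    Account text PRIMARY KEY,
    Balance decimal
);

-- updates are conditional (lightweight transaction / Paxos), e.g.:
UPDATE AccountBalances SET Balance = 90.00
WHERE Account = 'test9' IF Balance = 100.00;
```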
I have run a simple select and seen the following tracing results:

 activity                                                                                    | timestamp    | source    | source_elapsed
---------------------------------------------------------------------------------------------+--------------+-----------+----------------
 execute_cql3_query                                                                          | 19:48:36,582 | 127.0.0.1 |              0
 Parsing SELECT Account, Balance FROM AccountBalances WHERE Account = 'test9' LIMIT 10000;   | 19:48:36,582 | 127.0.0.1 |             56
 Preparing statement                                                                         | 19:48:36,582 | 127.0.0.1 |            181
 Executing single-partition query on accountbalances                                         | 19:48:36,583 | 127.0.0.1 |            878
 Acquiring sstable references                                                                | 19:48:36,583 | 127.0.0.1 |            895
 Merging memtable tombstones                                                                 | 19:48:36,583 | 127.0.0.1 |            918
 Key cache hit for sstable 569                                                               | 19:48:36,583 | 127.0.0.1 |            997
 Seeking to partition beginning in data file                                                 | 19:48:36,583 | 127.0.0.1 |           1034
 Skipped 0/1 non-slice-intersecting sstables, included 0 due to tombstones                   | 19:48:36,583 | 127.0.0.1 |           1383
 Merging data from memtables and 1 sstables                                                  | 19:48:36,583 | 127.0.0.1 |           1402
 Read 1 live and 123780 tombstoned cells                                                     | 19:48:36,710 | 127.0.0.1 |         128631
 Request complete                                                                            | 19:48:36,711 | 127.0.0.1 |         129276

As you can see, that is an awful lot of tombstoned cells, and that is after a full compaction as well. Just so you are aware, this table is updated using a Paxos IF statement. It still seems fairly snappy, but I am concerned it is only going to get worse.

Would I be better off adding a time-based key to the primary key, then doing a separate insert and deleting the original row? If I did the query with a limit of one, it should always find the live row before hitting a tombstone. Is that correct?

Thanks,
Charlie M
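P.S. In case it helps, here is a sketch of the alternative I am considering. The clustering column name and types are made up, and the `?` stands for the previous row's timeuuid:

```sql
-- Time-based clustering key, newest value first in the partition.
CREATE TABLE AccountBalances (
    Account text,
    Updated timeuuid,
    Balance decimal,
    PRIMARY KEY (Account, Updated)
) WITH CLUSTERING ORDER BY (Updated DESC);

-- Insert the new value, then delete the superseded row.
INSERT INTO AccountBalances (Account, Updated, Balance)
VALUES ('test9', now(), 90.00);

DELETE FROM AccountBalances
WHERE Account = 'test9' AND Updated = ?;

-- Read the latest value; with DESC ordering this should hit the
-- newest (live) row first.
SELECT Balance FROM AccountBalances WHERE Account = 'test9' LIMIT 1;
```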