On Cassandra 1.1.5 with a write-heavy workload, we're having problems getting rows to be compacted away (removed) even though all of their columns have expired TTLs. We've tried size-tiered and now leveled compaction and are seeing the same symptom: the data stays around essentially forever.
Currently we write all columns with a TTL of 72 hours (259200 seconds) and expect to add 10 GB of data to this CF per day per node. Each node currently has 73 GB for the affected CF and shows no indication that old rows will be removed on their own.

Why aren't rows being removed? Below is some data from a sample row which should have been removed several days ago but is still around, even though it has been involved in numerous compactions since it expired.

$> ./bin/nodetool -h localhost getsstables metrics request_summary 459fb460-5ace-11e2-9b92-11d67b6163b4
/virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db

$> ls -alF /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db
-rw-rw-r-- 1 sandra sandra 5252320 Jan 16 08:42 /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db

$> ./bin/sstable2json /virtual/cassandra/data/data/metrics/request_summary/metrics-request_summary-he-386179-Data.db -k $(echo -n 459fb460-5ace-11e2-9b92-11d67b6163b4 | hexdump -e '36/1 "%x"')
{
"34353966623436302d356163652d313165322d396239322d313164363762363136336234": [
  ["app_name","50f21d3d",1357785277207001,"d"],
  ["client_ip","50f21d3d",1357785277207001,"d"],
  ["client_req_id","50f21d3d",1357785277207001,"d"],
  ["mysql_call_cnt","50f21d3d",1357785277207001,"d"],
  ["mysql_duration_us","50f21d3d",1357785277207001,"d"],
  ["mysql_failure_call_cnt","50f21d3d",1357785277207001,"d"],
  ["mysql_success_call_cnt","50f21d3d",1357785277207001,"d"],
  ["req_duration_us","50f21d3d",1357785277207001,"d"],
  ["req_finish_time_us","50f21d3d",1357785277207001,"d"],
  ["req_method","50f21d3d",1357785277207001,"d"],
  ["req_service","50f21d3d",1357785277207001,"d"],
  ["req_start_time_us","50f21d3d",1357785277207001,"d"],
  ["success","50f21d3d",1357785277207001,"d"]]
}

Decoding the column timestamps shows that the columns were written at "Thu, 10 Jan 2013 02:34:37 GMT" and that their TTL expired at "Sun, 13 Jan 2013 02:34:37 GMT". The date of the SSTable shows that it was generated on Jan 16, which is 3 days after all columns had TTL-ed out.

The schema shows that gc_grace is set to 0, since this data is write-once, read-seldom and is never updated or deleted.

create column family request_summary
  with column_type = 'Standard'
  and comparator = 'UTF8Type'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 0
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'NONE'
  and bloom_filter_fp_chance = 1.0
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};

Thanks in advance for help in understanding why rows such as this are not removed!

-Bryan
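P.S. For anyone who wants to double-check the decoding of the sample row above, here is a minimal Python sketch. It assumes the third field of each column is the write timestamp in microseconds since the epoch, and that the hex string in the second field is the expiry (local deletion) time in seconds since the epoch. That second assumption is my reading of the sstable2json output, not something I have confirmed in the Cassandra source. The variable names are just for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
FMT = "%a, %d %b %Y %H:%M:%S GMT"

# Values copied from the sstable2json output for the sample row.
ROW_KEY = "459fb460-5ace-11e2-9b92-11d67b6163b4"
WRITE_TIMESTAMP_US = 1357785277207001  # third field, assumed microseconds
EXPIRY_HEX = "50f21d3d"                # second field, assumed epoch seconds in hex

# Hex form of the row key, as passed to sstable2json with -k.
key_hex = ROW_KEY.encode("ascii").hex()

# Integer arithmetic via timedelta avoids float rounding on the microseconds.
written_at = EPOCH + timedelta(microseconds=WRITE_TIMESTAMP_US)
expired_at = EPOCH + timedelta(seconds=int(EXPIRY_HEX, 16))

# Gap between write and expiry, ignoring the sub-second part of the write time.
ttl_seconds = int(EXPIRY_HEX, 16) - WRITE_TIMESTAMP_US // 1_000_000

print(key_hex)
print("written:", written_at.strftime(FMT))
print("expired:", expired_at.strftime(FMT))
print("ttl seconds:", ttl_seconds)
```

Under those assumptions the gap between the two values comes out to exactly 259200 seconds, which matches the 72 hour TTL we set on write.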