Twitter tried a timestamp-based compaction strategy in https://issues.apache.org/jira/browse/CASSANDRA-2735. The conclusion was, "this actually resulted in a lot more compactions than the SizeTieredCompactionStrategy. The increase in IO was not acceptable for our use and therefore stopped working on this patch."
2012/4/3 Radim Kolar <h...@filez.com>:
> there is problem with size tiered compaction design. It compacts together
> tables of similar size.
>
> sometimes it might happen that you will have some sstables sitting on disk
> forever (Feb 23) because no other similar sized tables were created and
> probably never be. because flushed sstable is about 11-16 mb.
>
> next level about 90 MB
> then 5x 90 MB gets compacted to 400 MB sstable
> and 5x400 MB ~ 2 GB
>
> problem is that 400 MB sstable is too small to be compacted against these 3x
> 720 MB ones.
>
> -rw-r--r-- 1 root wheel 165M Feb 23 17:03 resultcache-hc-13086-Data.db
> -rw-r--r-- 1 root wheel 772M Feb 23 17:04 resultcache-hc-13087-Data.db
> -rw-r--r-- 1 root wheel 156M Feb 23 17:06 resultcache-hc-13091-Data.db
> -rw-r--r-- 1 root wheel 716M Feb 23 17:18 resultcache-hc-13096-Data.db
> -rw-r--r-- 1 root wheel 734M Feb 23 17:29 resultcache-hc-13101-Data.db
> -rw-r--r-- 1 root wheel 5.0G Mar 14 09:38 resultcache-hc-13923-Data.db
> -rw-r--r-- 1 root wheel 1.9G Mar 16 22:41 resultcache-hc-14084-Data.db
> -rw-r--r-- 1 root wheel 1.9G Mar 21 15:11 resultcache-hc-14460-Data.db
> -rw-r--r-- 1 root wheel 1.9G Mar 27 05:22 resultcache-hc-14694-Data.db
> -rw-r--r-- 1 root wheel 2.0G Mar 31 04:57 resultcache-hc-14851-Data.db
> -rw-r--r-- 1 root wheel 112M Mar 31 06:30 resultcache-hc-14922-Data.db
> -rw-r--r-- 1 root wheel 577M Apr 1 19:25 resultcache-hc-14943-Data.db
>
> compaction strategy needs to compact sstables by timestamp too. older tables
> should have increased chance to get compacted.
> for example - table from today will be compacted with other table in range
> (0.5-1.5) of its size, and this range will get increased with sstable age. -
> 1 month old will have range for example (0.2 - 1.8).

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
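
For reference, the age-widened size range proposed in the quoted message could look roughly like the sketch below. This is not Cassandra code. The function names are hypothetical, and the linear interpolation between the two quoted points, (0.5-1.5) for a table from today and (0.2-1.8) at one month, is an assumption, since the message gives only those two examples. It also compares one sstable against one candidate, which is a simplification of how a size-tiered strategy groups tables.

```python
def size_range(age_days, fresh=(0.5, 1.5), month_old=(0.2, 1.8), month_days=30):
    """Return the (low, high) size-ratio range for an sstable of the given age.

    Assumption: the range widens linearly from `fresh` to `month_old`
    over `month_days` days and stops widening after that.
    """
    t = min(max(age_days, 0) / month_days, 1.0)
    low = fresh[0] + t * (month_old[0] - fresh[0])
    high = fresh[1] + t * (month_old[1] - fresh[1])
    return low, high


def can_compact_together(size_mb, age_days, candidate_mb):
    """True if the candidate's size falls inside the age-adjusted range."""
    low, high = size_range(age_days)
    return low * size_mb <= candidate_mb <= high * size_mb


# A 400 MB sstable against the 716 MB one from the listing:
print(can_compact_together(400, 0, 716))   # fresh table, range (0.5-1.5)
print(can_compact_together(400, 30, 716))  # one month old, range (0.2-1.8)
```

With a fresh 400 MB table the upper bound is 600 MB, so the 716 MB table is out of range. At one month the bound is about 720 MB and the pair becomes eligible, which is the behavior the proposal asks for. Whether the extra compactions are worth the IO is a separate question, per the CASSANDRA-2735 result above.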