https://issues.apache.org/jira/browse/CASSANDRA-2575
On Thu, Apr 21, 2011 at 11:56 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> I suggest as a workaround making the forceUserDefinedCompaction method
> ignore disk space estimates and attempt the requested compaction even
> if it guesses it will not have enough space. This would allow you to
> submit the 2 sstables you want manually.
>
> On Thu, Apr 21, 2011 at 8:34 AM, Shotaro Kamio <kamios...@gmail.com> wrote:
>> Hi Aaron,
>>
>>
>> Maybe my previous description was not good. It's not a compaction
>> threshold problem. In fact, Cassandra tries to compact 7 sstables in
>> the minor compaction, but it decreases the number of sstables one by
>> one due to insufficient disk space. In the end, it compacts a single
>> file, as in the new log below.
>>
>> compactionstats on a node says:
>>
>> compaction type: Minor
>> column family: foobar
>> bytes compacted: 133473101929
>> bytes total in progress: 170000743825
>> pending tasks: 12
>>
>> The disk usage reaches 78%. It's a really tough situation. But I guess
>> the data contains a lot of duplicates, because we feed the same data
>> again and again and do repairs.
>>
>>
>> Another thing I'm wondering about is the file-selection algorithm.
>> For example, one of the disks has 235G of free space. It contains
>> sstables of 61G, 159G, 191G, 196G and 197G. The one Cassandra keeps
>> trying to compact forever is the 159G sstable, but there is a smaller
>> sstable: ideally, it should try compacting 61G + 159G.
>> A more intelligent algorithm is required to find the optimal
>> combination. And if Cassandra kept statistics on the amount of deleted
>> and obsolete data in each sstable, that would help it find a more
>> efficient file combination.
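[The smallest-first selection Shotaro suggests could be sketched roughly as follows. This is a hypothetical, greedy illustration with plain sizes standing in for SSTableReaders and a simplified free-space estimate; none of these names are real Cassandra APIs.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SmallestFirstSelection {
    // Sort candidates ascending by size and take as many as fit in the
    // free-space estimate, instead of dropping the largest one by one.
    static List<Long> pickCompactionSet(List<Long> sstableSizes, long freeSpace) {
        List<Long> sorted = new ArrayList<>(sstableSizes);
        Collections.sort(sorted);
        List<Long> picked = new ArrayList<>();
        long total = 0;
        for (long size : sorted) {
            if (total + size > freeSpace)
                break;
            picked.add(size);
            total += size;
        }
        // a "compaction" of fewer than two sstables cannot merge anything
        return picked.size() >= 2 ? picked : Collections.<Long>emptyList();
    }

    public static void main(String[] args) {
        // the sizes from the example: 61G, 159G, 191G, 196G, 197G with 235G free
        List<Long> sizes = List.of(61L, 159L, 191L, 196L, 197L);
        System.out.println(pickCompactionSet(sizes, 235L)); // [61, 159]
    }
}
```

This greedy pass is not guaranteed to find the optimal combination, but for the sizes quoted above it does pick the 61G + 159G pair (220G, which fits in the 235G free) instead of looping on the 159G file alone.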
>>
>>
>> Regards,
>> Shotaro
>>
>>
>> * Minor compaction log
>> -----
>> WARN [CompactionExecutor:1] 2011-04-21 21:44:08,554 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-773-Data.db'),
>> SSTableReader(path='foobar-f-1452-Data.db'),
>> SSTableReader(path='foobar-f-1620-Data.db'),
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1643-Data.db'),
>> SSTableReader(path='foobar-f-1690-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> WARN [CompactionExecutor:1] 2011-04-21 21:44:28,565 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-773-Data.db'),
>> SSTableReader(path='foobar-f-1452-Data.db'),
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1643-Data.db'),
>> SSTableReader(path='foobar-f-1690-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> WARN [CompactionExecutor:1] 2011-04-21 21:44:48,576 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-773-Data.db'),
>> SSTableReader(path='foobar-f-1452-Data.db'),
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1643-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> WARN [CompactionExecutor:1] 2011-04-21 21:45:08,586 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-1452-Data.db'),
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1643-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> WARN [CompactionExecutor:1] 2011-04-21 21:45:28,596 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1643-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> WARN [CompactionExecutor:1] 2011-04-21 21:45:48,607 CompactionManager.java
>> (line 405) insufficient space to compact all requested files
>> SSTableReader(path='foobar-f-1642-Data.db'),
>> SSTableReader(path='foobar-f-1814-Data.db')
>> ------
>>
>>
>>
>> On Thu, Apr 21, 2011 at 7:20 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>> I want to check whether you are talking about minor compactions or
>>> major (nodetool) compactions.
>>> What compaction settings do you have for this CF? You can increase
>>> the min compaction threshold and reduce the frequency of compactions:
>>> http://wiki.apache.org/cassandra/StorageConfiguration
>>> It seems like compaction is running continually; are there pending
>>> tasks in the o.a.c.db.CompactionManager MBean?
>>> How bad is your disk-space problem?
>>> For the code change, AFAIK it's not possible for Cassandra to know
>>> whether there are tombstones in the SSTable which can be purged until
>>> the rows are read. Perhaps the file could hold the earliest
>>> deleted-at time somewhere (same for TTL), but I do not think we do
>>> that now.
>>> Hope that helps.
>>> Aaron
>>>
>>> On 20 Apr 2011, at 21:25, Shotaro Kamio wrote:
>>>
>>> Hi,
>>>
>>> I found that our cluster repeats compacting a single file forever
>>> (Cassandra 0.7.5). We are wondering if the compaction logic is wrong.
>>> I'd like to have comments from you guys.
>>>
>>> Situation:
>>> - After trying to repair a column family, our cluster's disk usage is
>>> quite high. Cassandra cannot compact all sstables at once, and I think
>>> it ends up repeatedly compacting a single file (you can check the
>>> attached log below).
>>> - Our data doesn't have deletes, so the compaction of a single file
>>> doesn't free any disk space.
>>>
>>> We are approaching a full disk, but I believe that the repair
>>> operation created a lot of duplicate data on disk which requires
>>> compaction. However, most of the nodes are stuck compacting a single
>>> file.
>>> The only thing we can do is to restart the nodes.
>>>
>>> My question is why the compaction doesn't stop.
>>>
>>> I looked at the logic in CompactionManager.java:
>>> -----------------
>>> String compactionFileLocation =
>>>     table.getDataFileLocation(cfs.getExpectedCompactedFileSize(sstables));
>>> // If the compaction file path is null that means we have no
>>> // space left for this compaction.
>>> // try again w/o the largest one.
>>> List<SSTableReader> smallerSSTables = new ArrayList<SSTableReader>(sstables);
>>> while (compactionFileLocation == null && smallerSSTables.size() > 1)
>>> {
>>>     logger.warn("insufficient space to compact all requested files "
>>>                 + StringUtils.join(smallerSSTables, ", "));
>>>     smallerSSTables.remove(cfs.getMaxSizeFile(smallerSSTables));
>>>     compactionFileLocation =
>>>         table.getDataFileLocation(cfs.getExpectedCompactedFileSize(smallerSSTables));
>>> }
>>> if (compactionFileLocation == null)
>>> {
>>>     logger.error("insufficient space to compact even the two smallest files, aborting");
>>>     return 0;
>>> }
>>> -----------------
>>>
>>> The while condition is: smallerSSTables.size() > 1
>>> Should this be "smallerSSTables.size() > 2"?
>>>
>>> In my understanding, compaction of a single file frees disk space
>>> only when the sstable contains a lot of tombstones, and only if those
>>> tombstones are removed by the compaction. If Cassandra knows the
>>> sstable has tombstones that can be removed, it's worth compacting it.
>>> Otherwise, it might free space if you are lucky; in the worst case, it
>>> leads to an infinite loop like ours.
>>>
>>> What do you think of the code change?
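[The proposed change can be sketched in isolation. This is a hypothetical illustration, not the actual CompactionManager code: plain sizes stand in for SSTableReaders, and the free-space check is reduced to a simple sum against a byte budget.]

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ShrinkCandidates {
    // Drop the largest candidate while the estimated output does not fit,
    // but never shrink below two files; return an empty list to signal
    // that the attempt should be aborted instead of degenerating to a
    // single-sstable "compaction" that rewrites the same bytes forever.
    static List<Long> shrinkToFit(List<Long> sstableSizes, long freeSpace) {
        List<Long> smaller = new ArrayList<>(sstableSizes);
        long expected = smaller.stream().mapToLong(Long::longValue).sum();
        while (expected > freeSpace && smaller.size() > 2) {   // was: > 1
            smaller.remove(Collections.max(smaller));
            expected = smaller.stream().mapToLong(Long::longValue).sum();
        }
        return expected <= freeSpace ? smaller : Collections.<Long>emptyList();
    }

    public static void main(String[] args) {
        // with the original "> 1" condition this input would shrink all the
        // way down to the single 61G file; with "> 2" it aborts instead
        System.out.println(shrinkToFit(List.of(61L, 159L, 191L), 200L)); // []
        System.out.println(shrinkToFit(List.of(61L, 159L, 191L), 500L)); // [61, 159, 191]
    }
}
```

With `> 2`, the loop stops once two candidates remain, which matches the error message that already follows it ("insufficient space to compact even the two smallest files, aborting").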
>>>
>>>
>>> Best regards,
>>> Shotaro
>>>
>>>
>>> * Cassandra compaction log
>>> -------------------------
>>> WARN [CompactionExecutor:1] 2011-04-20 01:03:14,446 CompactionManager.java
>>> (line 405) insufficient space to compact all requested files
>>> SSTableReader(path='foobar-f-3020-Data.db'),
>>> SSTableReader(path='foobar-f-3034-Data.db')
>>> INFO [CompactionExecutor:1] 2011-04-20 03:47:29,833 CompactionManager.java
>>> (line 482) Compacted to foobar-tmp-f-3035-Data.db.
>>> 260,646,760,319 to 260,646,760,319 (~100% of original) bytes for
>>> 6,893,896 keys. Time: 9,855,385ms.
>>>
>>> WARN [CompactionExecutor:1] 2011-04-20 03:48:11,308 CompactionManager.java
>>> (line 405) insufficient space to compact all requested files
>>> SSTableReader(path='foobar-f-3020-Data.db'),
>>> SSTableReader(path='foobar-f-3035-Data.db')
>>> INFO [CompactionExecutor:1] 2011-04-20 06:31:41,193 CompactionManager.java
>>> (line 482) Compacted to foobar-tmp-f-3036-Data.db.
>>> 260,646,760,319 to 260,646,760,319 (~100% of original) bytes for
>>> 6,893,896 keys. Time: 9,809,882ms.
>>>
>>> WARN [CompactionExecutor:1] 2011-04-20 06:32:22,476 CompactionManager.java
>>> (line 405) insufficient space to compact all requested files
>>> SSTableReader(path='foobar-f-3020-Data.db'),
>>> SSTableReader(path='foobar-f-3036-Data.db')
>>> INFO [CompactionExecutor:1] 2011-04-20 09:20:29,903 CompactionManager.java
>>> (line 482) Compacted to foobar-tmp-f-3037-Data.db.
>>> 260,646,760,319 to 260,646,760,319 (~100% of original) bytes for
>>> 6,893,896 keys. Time: 10,087,424ms.
>>> -------------------------
>>> You can see that the compacted size is always the same: it keeps
>>> compacting the same single sstable.
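[Aaron's suggestion earlier in the thread, recording the earliest deleted-at time in each sstable, could look roughly like this. All class, field, and method names here are hypothetical illustrations, not real Cassandra APIs.]

```java
import java.util.concurrent.TimeUnit;

public class TombstoneMetadata {
    static final long NO_TOMBSTONES = Long.MAX_VALUE;

    final long earliestDeletedAtMillis; // NO_TOMBSTONES if the file has none
    final int gcGraceSeconds;

    TombstoneMetadata(long earliestDeletedAtMillis, int gcGraceSeconds) {
        this.earliestDeletedAtMillis = earliestDeletedAtMillis;
        this.gcGraceSeconds = gcGraceSeconds;
    }

    // true only if at least one tombstone is past gc_grace_seconds, i.e. a
    // lone compaction of this sstable could actually reclaim disk space
    boolean worthCompactingAlone(long nowMillis) {
        if (earliestDeletedAtMillis == NO_TOMBSTONES)
            return false; // no deletes at all, as in Shotaro's workload
        return earliestDeletedAtMillis
                + TimeUnit.SECONDS.toMillis(gcGraceSeconds) <= nowMillis;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        int gcGrace = 864000; // Cassandra's default gc_grace_seconds (10 days)
        TombstoneMetadata noDeletes = new TombstoneMetadata(NO_TOMBSTONES, gcGrace);
        TombstoneMetadata oldDeletes =
                new TombstoneMetadata(now - TimeUnit.DAYS.toMillis(20), gcGrace);
        System.out.println(noDeletes.worthCompactingAlone(now));  // false
        System.out.println(oldDeletes.worthCompactingAlone(now)); // true
    }
}
```

With metadata like this written at flush/compaction time, the single-file fallback could be skipped outright for sstables like the ones in the log above, where nothing is purgeable and the output is always ~100% of the input.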
>>>
>>>
>>
>>
>>
>> --
>> Shotaro Kamio
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com