Hi Simon,

I know for sure that cleanup (like compaction) needs to copy the entire SSTable (Data + Index), except for the parts being evicted by the cleanup. As SSTables are immutable, the only way to remove data is to copy the data we want to keep into a new file before deleting the old SSTable. Given this, it is understandable that you see tmp files taking almost as much space as the original ones. That's why you can read in many places that it is good to keep over 50% of the disk space free at any moment.
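As a rough sanity check of that headroom rule, here is a minimal shell sketch (the data directory path is an assumption based on the default install location; adjust it to yours) that compares the total size of the Data.db components against the free space on the same filesystem:

```shell
#!/bin/sh
# Sketch: compare total live *-Data.db size against free disk space,
# since cleanup may need to rewrite a full copy of each SSTable.
check_headroom() {
  dir="$1"
  # Total size (KB) of all Data.db components under the directory.
  data_kb=$(find "$dir" -name '*-Data.db' -exec du -k {} + 2>/dev/null \
    | awk '{s += $1} END {print s + 0}')
  # Free space (KB) on the filesystem holding the directory.
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 {print $4}')
  echo "sstables: ${data_kb} KB, free: ${free_kb} KB"
  if [ "$free_kb" -gt "$data_kb" ]; then
    echo "enough headroom for a full copy"
  else
    echo "warning: cleanup could run out of disk"
  fi
}

# /var/lib/cassandra/data is the default data dir and an assumption here.
CASSANDRA_DATA="${CASSANDRA_DATA:-/var/lib/cassandra/data}"
if [ -d "$CASSANDRA_DATA" ]; then
  check_headroom "$CASSANDRA_DATA"
fi
```

This is pessimistic on purpose: it checks for enough room to copy every SSTable at once, while a sequential cleanup only needs room for the largest one at a time.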
That being said:

- I am not sure why you have one tmp + one tmplink file for each SSTable (like 'tmplink-lb-59517-big-Data.db' + 'tmp-lb-59517-big-Data.db'). There must be a reason I am not aware of (I don't believe it's a bug, that would be quite gross). Maybe someone else knows about this?
- If the new SSTable size is almost equal (or equal) to the old SSTable size, it means there was not much data to clean up. Remember that cleanup only deletes data outside of the token ranges owned by each node (primary + replicas). Such data only appears when adding nodes or otherwise changing the token ranges.

> And do I have a choice to do cleaning with less disk space?

I would say yes. As it seems you are using C* 2.1+, you could try making the cleanup process sequential instead of parallel (as it was pre-C* 2.1) by doing this:

nodetool cleanup -j 1

More information: http://cassandra.apache.org/doc/latest/tools/nodetool/cleanup.html

It could reduce the amount of free space needed, as only one cleanup would run at any moment, meaning only one SSTable is processed at a time. Under these conditions, the maximum amount of extra disk space used by this process would be bounded by the size of the biggest existing SSTable.

The other thing I would explore is the reason why Cassandra is maintaining 'tmplink-lb-59517-big-Data.db' and 'tmp-lb-59517-big-Data.db'. I haven't run cleanup for a while and I don't know your Cassandra version, which makes it hard to investigate properly.

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-06-20 6:51 GMT+01:00 wxn...@zjqunshuo.com <wxn...@zjqunshuo.com>:

> Hi,
> Cleaning up is generating temporary files which are occupying large disk
> space. I noticed that for every source SSTable file, it generates 4
> temporary files, and two of them are almost as large as the source SSTable
> file.
> If there are two concurrent cleanup tasks running, I have to keep
> the remaining disk space at least twice as large as the combined size of the
> two SSTable files being cleaned up.
> Is this expected? And do I have a choice to do cleaning with less disk space?
>
> Below are the temporary files generated during cleanup:
> -rw-r--r-- 2 root root 798M Jun 20 13:34 tmplink-lb-59516-big-Index.db
> -rw-r--r-- 2 root root 798M Jun 20 13:34 tmp-lb-59516-big-Index.db
> -rw-r--r-- 2 root root 219G Jun 20 13:34 tmplink-lb-59516-big-Data.db
> -rw-r--r-- 2 root root 219G Jun 20 13:34 tmp-lb-59516-big-Data.db
> -rw-r--r-- 2 root root 978M Jun 20 13:33 tmplink-lb-59517-big-Index.db
> -rw-r--r-- 2 root root 978M Jun 20 13:33 tmp-lb-59517-big-Index.db
> -rw-r--r-- 2 root root 245G Jun 20 13:34 tmplink-lb-59517-big-Data.db
> -rw-r--r-- 2 root root 245G Jun 20 13:34 tmp-lb-59517-big-Data.db
>
> Cheers,
> -Simon
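One detail worth checking on the tmp/tmplink question: the link count of 2 in the `ls -l` output above hints that each tmp/tmplink pair might be two hard links to the same inode, in which case the pair only occupies the space once even though `ls` reports the size twice. That is an assumption to verify, not a confirmed explanation. A quick shell check (the file names in the usage comment are taken from the listing above):

```shell
#!/bin/sh
# Sketch: report whether two paths are hard links to the same file.
# If they are, `ls` shows the size for each name, but the bytes
# exist on disk only once.
same_file() {
  # `-ef` is true when both paths resolve to the same device and inode.
  if [ "$1" -ef "$2" ]; then
    echo "hard links: space is counted once"
  else
    echo "independent files: space is counted twice"
  fi
}

# Usage against the files from the listing (run in the table's data dir):
#   same_file tmp-lb-59517-big-Data.db tmplink-lb-59517-big-Data.db
```

If the pair turns out to be hard-linked, the temporary files would cost roughly half the space the listing suggests.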