Hi Dongfeng,

> 3: Restarting the node does NOT remove those files. I stopped and restarted
> C* many times and it did nothing.
> Finally, my solution was to manually delete those old files. I actually
> deleted them while C* is running and did not see any errors/warnings in
> system.log. My guess is that those files are not in C* metadata so C* does
> not know their existence.

This was a good move. If all the data is TTLed after 8 days, then any sstable
older than 8 days is no longer relevant; this is a guarantee. I would probably
have stopped the node though. Glad it worked.

> Automatic compaction by C* does not work in a timely manner for me

You might want to try setting "unchecked_tombstone_compaction=true" in this
table's compaction options. This will allow a more aggressive tombstone
eviction, and should be quite safe. Not sure why this is not yet a Cassandra
default. Single sstable compactions will trigger, removing tombstones after
10 days (gc_grace_seconds). So any data older than 8 days (TTL) + 10 days
(gc_grace_seconds) = 18 days should be eventually (and quite quickly) removed.

Major compaction (nodetool compact) produces a very big sstable that will no
longer be compacted until there are 3 other files of the same size (using
the defaults). I think running a major compaction delays the issue (and might
make it worse) but does not solve it.

It is also good to know that, from my own experience, compaction does a lot
better in 2.1.X.

> B: We have tested the procedure with 2.1.11 in our DEV environment quite
> some time ago. Due to priority changes, we only started applying it to
> production lately. By rule, I had to re-test it if I switch to 2.1.14, and
> I don't see much benefit in doing it.

As an example, if you are planning to take advantage of the incremental
repair features (new in 2.1) or DTCS, you probably want to jump to 2.1.14
because of:

"FIX 2.1.14 - DTCS repair both unrepaired / repaired sstables - incremental only
https://issues.apache.org/jira/browse/CASSANDRA-11113

FIX 2.1.14 - Avoid major compaction mixing repaired and unrepaired sstables in DTCS
https://issues.apache.org/jira/browse/CASSANDRA-11113

FIX 2.1.12 - A lot of sstables using range repairs due to anticompaction - incremental only
https://issues.apache.org/jira/browse/CASSANDRA-10422

FIX 2.1.12 - repair hang when replica is down - incremental only
https://issues.apache.org/jira/browse/CASSANDRA-10288"

I would probably go through the 2.1.11 --> 2.1.14 changes and see if it is
worth it. I am not saying you shouldn't test it, but if migrating to 2.1.11
worked for you, I guess 2.1.14 will work as well. I am quite confident, but
as I won't be responsible for it or for fixing any issue that might show up,
it is up to you :-).

Another way is to add one more step, from 2.1.11 to 2.1.14, but I see no
value in this as you would then have to test the 2.1.11 --> 2.1.14 upgrade.

> Since we are at 2.0.6, we have to migrate twice, from 2.0.6 to 2.0.17,
> then to 2.1.11.

Glad to see you did not miss that. I pointed it out, just in case :-).

Good luck with this all,

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-06-01 17:52 GMT+01:00 Dongfeng Lu <dlu66...@yahoo.com>:

> Alain,
>
> Thanks for responding to my question.
>
> 1 & 2: I think it is a bug, but as you said, maybe no one will dig into it.
> I just hope it has been fixed in the later versions.
> 3: Restarting the node does NOT remove those files. I stopped and
> restarted C* many times and it did nothing.
> 4: Thanks for the links. I will probably try DTCS in the near future.
>
> A: Automatic compaction by C* does not work in a timely manner for me. I
> set TTL to 8 days, and hoped that I would only have data files with
> timestamps within the last 2 weeks or so. However, I often saw files created
> 2 months ago with 50GB in size.
>
> In the final step of the upgrade, I am supposed to run upgradesstables,
> which is like a compaction. I know compaction takes a long time to run. In
> order to reduce the amount of time during the actual upgrade, I ran a manual
> compaction to cut down the size, by 80% in my case.
>
> B: We have tested the procedure with 2.1.11 in our DEV environment quite
> some time ago. Due to priority changes, we only started applying it to
> production lately. By rule, I had to re-test it if I switch to 2.1.14, and
> I don't see much benefit in doing it.
>
> C: Yes, I noticed the statement "When upgrading to Cassandra 2.1 all nodes
> must be on at least Cassandra 2.0.7 to support rolling start." Since we are
> at 2.0.6, we have to migrate twice, from 2.0.6 to 2.0.17, then to 2.1.11.
>
> Finally, my solution was to manually delete those old files. I actually
> deleted them while C* is running and did not see any errors/warnings in
> system.log. My guess is that those files are not in C* metadata so C* does
> not know their existence.
>
> Thanks,
> Dongfeng
>
>
> On Wednesday, June 1, 2016 6:36 AM, Alain RODRIGUEZ <arodr...@gmail.com>
> wrote:
>
>
> Hi,
>
> About your main concern:
>
> 1. True, those files should have been removed. Yet Cassandra 2.0 is no
> longer supported, even less such an old version (2.0.6), so I think no one
> is going to dig into this issue. To fix it, an upgrade will probably be
> enough.
>
> > I don't usually run manual compaction, and relied completely on Cassandra
> > to automatically do it. A couple of days ago in preparation for an upgrade
> > to Cassandra 2.1.11, I ran a manual, complete compaction
>
> 2. As you might know, sstables are immutable, meaning compacting (merging
> row shards) has to be done somewhere else, not in place. Those -tmp- files
> are basically the result of ongoing compactions. It is perfectly normal.
> Yet '-tmp-' files are supposed to be removed once compaction is done.
>
> 3. Restarting the node will most probably solve your issue. To be sure to
> indeed free disk space, make sure you have no snapshot of those old
> sstables.
>
> 4. The advantage of DTCS is that data of different ages is not mixed,
> meaning Cassandra can drop a fully expired sstable without compacting. It
> sounds like a good fit. Yet this compaction strategy is the most recent one
> and some things are still being fixed. I still think it is safe to use it.
> Make sure you read first:
> https://labs.spotify.com/2014/12/18/date-tiered-compaction/ And/Or
> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
>
> You also might want to have a look at https://github.com/jeffjirsa/twcs.
>
> Some other off-topic, but maybe useful questions / info
>
> A - Why do you need a manual compaction before upgrading? I really can't
> see any reason for it.
> B - Why upgrade to Cassandra 2.1.11 when 2.1.14 is available and brings
> some more bug fixes (compared to 2.1.11)?
> C - It is recommended to move to 2.0.last before going to 2.1.X. You might
> run into some issues. Either make sure to test that it works or go
> incrementally: 2.0.6 --> 2.0.17 --> 2.1.14. I would probably do both: test
> it and go incrementally. I would not go with 2.0.6 --> 2.1.14 without
> testing it first anyway.
>
> Hope it is all clear and that a restart will solve your issue.
>
> C*heers,
>
> -----------------------
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-05-17 0:06 GMT+01:00 Dongfeng Lu <dlu66...@yahoo.com>:
>
> Forgive me if that has been answered somewhere, but I could not find a
> concise or clear answer.
>
> I am using Cassandra 2.0.6 on a 3 node cluster. I don't usually run manual
> compaction, and relied completely on Cassandra to automatically do it. A
> couple of days ago in preparation for an upgrade to Cassandra 2.1.11, I ran
> a manual, complete compaction. The compaction ran for many hours, but it
> did complete successfully, and the "load" in "nodetool status" dropped 80%.
> However, I did not see a big drop in disk usage, even after waiting for a
> couple of days. There are still many old data files left on the disk. For
> instance, here is a list of data files for one table.
>
> -bash-4.1$ ls -ltr *-Data.db
> -rw-r--r-- 1 cassandra cassandra 36441245112 Jan 19 05:42 keyspace-event_index-jb-620839-Data.db
> -rw-r--r-- 1 cassandra cassandra 48117578123 Jan 25 05:17 keyspace-event_index-jb-649329-Data.db
> -rw-r--r-- 1 cassandra cassandra 8731574747 Jan 27 18:30 keyspace-event_index-jb-662597-Data.db
> -rw-r--r-- 1 cassandra cassandra 835204478 Feb 2 07:20 keyspace-event_index-jb-670851-Data.db
> -rw-r--r-- 1 cassandra cassandra 39496133 Feb 2 15:29 keyspace-event_index-tmp-jb-672828-Data.db
> ... about 110 files listed here, removed for clarity ...
> -rw-r--r-- 1 cassandra cassandra 149344563 May 9 20:53 keyspace-event_index-tmp-jb-827472-Data.db
> -rw-r--r-- 11 cassandra cassandra 20149715779 May 15 04:18 keyspace-event_index-jb-829601-Data.db
> -rw-r--r-- 11 cassandra cassandra 7153875910 May 15 11:15 keyspace-event_index-jb-830446-Data.db
> -rw-r--r-- 11 cassandra cassandra 3051908121 May 16 03:08 keyspace-event_index-jb-831112-Data.db
> -rw-r--r-- 11 cassandra cassandra 6109582092 May 16 06:11 keyspace-event_index-jb-831709-Data.db
> -rw-r--r-- 11 cassandra cassandra 2922532233 May 16 07:14 keyspace-event_index-jb-831873-Data.db
> -rw-r--r-- 11 cassandra cassandra 1766025989 May 16 08:31 keyspace-event_index-jb-832111-Data.db
> -rw-r--r-- 8 cassandra cassandra 2922259593 May 16 11:39 keyspace-event_index-jb-832693-Data.db
> -rw-r--r-- 8 cassandra cassandra 1224495235 May 16 11:50 keyspace-event_index-jb-832764-Data.db
> -rw-r--r-- 7 cassandra cassandra 2051385733 May 16 12:57 keyspace-event_index-jb-832975-Data.db
> -rw-r--r-- 6 cassandra cassandra 853824939 May 16 13:12 keyspace-event_index-jb-833100-Data.db
> -rw-r--r-- 5 cassandra cassandra 763243638 May 16 14:58 keyspace-event_index-jb-833203-Data.db
> -rw-r--r-- 3 cassandra cassandra 99076639 May 16 16:29 keyspace-event_index-jb-833222-Data.db
> -rw-r--r-- 2 cassandra cassandra 254935385 May 16 17:21 keyspace-event_index-jb-833233-Data.db
> -rw-r--r-- 2 cassandra cassandra 66006223 May 16 17:51 keyspace-event_index-jb-833238-Data.db
> -rw-r--r-- 1 cassandra cassandra 50204322 May 16 18:18 keyspace-event_index-jb-833243-Data.db
> -rw-r--r-- 2 cassandra cassandra 16078537 May 16 18:26 keyspace-event_index-jb-833244-Data.db
>
> However, it looks to me that Cassandra knows that the first 115 files are
> old, as they are not used when creating a snapshot. Here is the newly
> created snapshot.
>
> -bash-4.1$ ls -ltr snapshots/20160516-1800/*-Data.db
> -rw-r--r-- 11 cassandra cassandra 20149715779 May 15 04:18 snapshots/20160516-1800/keyspace-event_index-jb-829601-Data.db
> -rw-r--r-- 11 cassandra cassandra 7153875910 May 15 11:15 snapshots/20160516-1800/keyspace-event_index-jb-830446-Data.db
> -rw-r--r-- 11 cassandra cassandra 3051908121 May 16 03:08 snapshots/20160516-1800/keyspace-event_index-jb-831112-Data.db
> -rw-r--r-- 11 cassandra cassandra 6109582092 May 16 06:11 snapshots/20160516-1800/keyspace-event_index-jb-831709-Data.db
> -rw-r--r-- 11 cassandra cassandra 2922532233 May 16 07:14 snapshots/20160516-1800/keyspace-event_index-jb-831873-Data.db
> -rw-r--r-- 11 cassandra cassandra 1766025989 May 16 08:31 snapshots/20160516-1800/keyspace-event_index-jb-832111-Data.db
> -rw-r--r-- 8 cassandra cassandra 2922259593 May 16 11:39 snapshots/20160516-1800/keyspace-event_index-jb-832693-Data.db
> -rw-r--r-- 8 cassandra cassandra 1224495235 May 16 11:50 snapshots/20160516-1800/keyspace-event_index-jb-832764-Data.db
> -rw-r--r-- 7 cassandra cassandra 2051385733 May 16 12:57 snapshots/20160516-1800/keyspace-event_index-jb-832975-Data.db
> -rw-r--r-- 6 cassandra cassandra 853824939 May 16 13:12 snapshots/20160516-1800/keyspace-event_index-jb-833100-Data.db
> -rw-r--r-- 5 cassandra cassandra 763243638 May 16 14:58 snapshots/20160516-1800/keyspace-event_index-jb-833203-Data.db
> -rw-r--r-- 3 cassandra cassandra 99076639 May 16 16:29 snapshots/20160516-1800/keyspace-event_index-jb-833222-Data.db
> -rw-r--r-- 2 cassandra cassandra 254935385 May 16 17:21 snapshots/20160516-1800/keyspace-event_index-jb-833233-Data.db
> -rw-r--r-- 2 cassandra cassandra 66006223 May 16 17:51 snapshots/20160516-1800/keyspace-event_index-jb-833238-Data.db
> -rw-r--r-- 2 cassandra cassandra 16336415 May 16 17:59 snapshots/20160516-1800/keyspace-event_index-jb-833239-Data.db
> -rw-r--r-- 2 cassandra cassandra 1947026 May 16 18:00 snapshots/20160516-1800/keyspace-event_index-jb-833240-Data.db
> -bash-4.1$
>
> You can see that only files dated "May 15 04:18" or later exist in the
> snapshot folder.
>
> My questions:
>
> 1. I believe Cassandra should have deleted all 115 old data files. What
> could have prevented those files from being deleted? What can I do to make
> sure old files will be deleted in future compactions?
> 2. What are those files with "-tmp-"? What is the implication of their
> existence? Does it mean a compaction failed?
> 3. Since Cassandra knows what files are actually used, is there some
> utility that I can use to delete those old files? I can delete them
> manually, but that would be error-prone.
> 4. The table uses SizeTieredCompactionStrategy, and contains data with a
> TTL of 8 days. Will switching to DateTieredCompactionStrategy after
> upgrading to 2.1.11 offer much better compaction performance?
>
> Thanks,
> Dongfeng
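
On question 3 (finding the old files without picking them by hand), here is a
minimal dry-run sketch in Python. It is not a Cassandra utility: the function
names, and the idea of using file modification time as the age signal, are
assumptions made for illustration. The defaults reuse the figures from this
thread (an 8-day TTL plus 10 days of gc_grace_seconds, so 18 days). It only
lists candidates and never deletes anything, and it is only meaningful if
every write to the table carries the TTL.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_stale_sstables(table_dir, ttl_days=8, gc_grace_days=10, now=None):
    """Return the *-Data.db files in table_dir (not in subdirectories such as
    snapshots/) whose modification time is older than ttl_days + gc_grace_days.
    """
    now = time.time() if now is None else now
    cutoff = now - (ttl_days + gc_grace_days) * SECONDS_PER_DAY
    stale = []
    # Path.glob with a plain pattern is non-recursive, so snapshots/ is skipped.
    for path in sorted(Path(table_dir).glob("*-Data.db")):
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    return stale


def is_tmp_sstable(path):
    """True if the file name carries the '-tmp-' marker of an in-progress
    (or abandoned) compaction output."""
    return "-tmp-" in Path(path).name


def report(table_dir, **kwargs):
    """Print the candidates, one per line, without touching any file."""
    for path in find_stale_sstables(table_dir, **kwargs):
        kind = "tmp" if is_tmp_sstable(path) else "data"
        print(f"{kind}\t{path.stat().st_size}\t{path.name}")
```

Calling report() on the table's data directory (the path depends on your
installation) prints one line per candidate. Comparing that output against a
fresh snapshot listing, as done above, before removing anything, and stopping
the node first, would be the cautious way to use it.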