Hi Maik,

Yes. When old and new data are mixed together, an old SSTable will not be dropped until the new SSTable that overlaps it is fully expired.
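As a rough illustration of that rule, here is a small Python sketch fed with three entries from the sstableexpiredblockers output quoted further down. It is a simplification and not Cassandra's actual code: it treats every SSTable as overlapping every other one, ignores memtables, and the value of `now` is an assumed wall-clock time close to when the output was captured. The names `SSTable` and `droppable` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    name: str
    min_ts: int   # minTS: oldest write timestamp, microseconds
    max_ts: int   # maxTS: newest write timestamp, microseconds
    max_ldt: int  # maxLDT: latest local deletion time, seconds


def droppable(sstables, gc_before):
    """Return the names of SSTables that could be dropped whole.

    Simplified rule: an SSTable is a candidate once all of its data is
    past gc_before, but it is only dropped if every write in it is older
    than the oldest write held by any SSTable that is still live.
    """
    expired = [s for s in sstables if s.max_ldt < gc_before]
    live = [s for s in sstables if s.max_ldt >= gc_before]
    oldest_live_ts = min((s.min_ts for s in live), default=None)
    return [
        s.name
        for s in expired
        if oldest_live_ts is None or s.max_ts < oldest_live_ts
    ]


# Values copied from the sstableexpiredblockers output in the original mail.
blocker = SSTable("mc-39281", 1490958782530000, 1532707837676719, 1557154990)
old_a = SSTable("mc-22922", 1490958570018000, 1492250905633001, 1523786905)
old_b = SSTable("mc-22921", 1492256093314000, 1492775013454001, 1524311013)

now = 1536800000                # ASSUMPTION: roughly mid-September 2018
gc_before = now - 864000        # gc_grace_seconds from the table definition

with_blocker = droppable([blocker, old_a, old_b], gc_before)
without_blocker = droppable([old_a, old_b], gc_before)
print(with_blocker)
print(without_blocker)
```

With the big unrepaired file present, neither old SSTable qualifies, because its newest write is more recent than the oldest write in mc-39281. Take that file out of the picture and both become droppable.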
There are a couple of ways for you to reclaim the storage:

1.) If this is a one-time thing, you can manually run one of the commands that rewrite SSTables, such as nodetool compact, scrub or garbagecollect. (Note that garbagecollect was only added in Cassandra 3.10, so it is not available on 3.0.9.)

2.) If you think this will be recurring, you should probably set unchecked_tombstone_compaction to true. From the documentation:

    unchecked_tombstone_compaction (default: false)
    The single sstable compaction has quite strict checks for whether it should be started, this option disables those checks and for some use cases this might be needed. Note that this does not change anything for the actual compaction, tombstones are only dropped if it is safe to do so - it might just rewrite an sstable without being able to drop any tombstones.

In both cases this will trigger numerous compactions, so make sure you have enough I/O capacity or throttle your compaction threads.

If you are going to make the DDL change (option 2), I would recommend reading some more about how TWCS works first:
http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html

Thanks
Sri Rathan

> Hello,
>
> we work with Cassandra version 3.0.9 and have a problem in a table with TWCS. The command “nodetool repair” always creates new files with old data. This prevents the old data from being deleted.
>
> The layout of the table is as follows:
>
> cqlsh> desc stat.spa
>
> CREATE TABLE stat.spa (
>     region int,
>     id int,
>     date text,
>     hour int,
>     zippedjsonstring blob,
>     PRIMARY KEY ((region, id), date, hour)
> ) WITH CLUSTERING ORDER BY (date ASC, hour ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
>         'compaction_window_size': '1', 'compaction_window_unit': 'DAYS',
>         'max_threshold': '100', 'min_threshold': '4',
>         'tombstone_compaction_interval': '86460'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
> Currently the oldest data is from 2017/04/15 and does not get removed:
>
> $ for f in *Data.db; do
>     meta=$(sudo sstablemetadata "$f")
>     echo -e "Max:" $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3 | cut -c 1-10) '+%Y/%m/%d %H:%M') \
>         "Min:" $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3 | cut -c 1-10) '+%Y/%m/%d %H:%M') \
>         $(echo "$meta" | grep droppable) $(echo "$meta" | grep "Repaired at") ' \t ' \
>         $(ls -lh "$f" | awk '{print $5" "$6" "$7" "$8" "$9}')
>   done | sort
> Max: 2017/04/15 12:08 Min: 2017/03/31 13:09 Estimated droppable tombstones: 1.7731048805815162 Repaired at: 1525685601400 42K May 7 19:56 mc-22922-big-Data.db
> Max: 2017/04/17 13:49 Min: 2017/03/31 13:09 Estimated droppable tombstones: 1.9600207684319835 Repaired at: 1525685601400 116M May 7 13:31 mc-15096-big-Data.db
> Max: 2017/04/21 13:43 Min: 2017/04/15 13:34 Estimated droppable tombstones: 1.9090909090909092 Repaired at: 1525685601400 11K May 7 19:56 mc-22921-big-Data.db
> Max: 2017/05/23 21:45 Min: 2017/04/21 14:00 Estimated droppable tombstones: 1.8360655737704918 Repaired at: 1525685601400 21M May 7 19:56 mc-22919-big-Data.db
> Max: 2017/06/12 15:19 Min: 2017/04/25 14:45 Estimated droppable tombstones: 1.8091397849462365 Repaired at: 1525685601400 19M May 7 14:36 mc-17095-big-Data.db
> Max: 2017/06/15 15:26 Min: 2017/05/10 14:37 Estimated droppable tombstones: 1.76536312849162 Repaired at: 1529612605539 9.3M Jun 21 22:31 mc-25372-big-Data.db
> …
>
> After a “nodetool repair” run, a new big data file is created that includes old data from 2017/07/31.
>
> Max: 2018/07/27 18:10 Min: 2017/03/31 13:13 Estimated droppable tombstones: 0.08392555471691247 Repaired at: 0 11G Sep 11 22:02 mc-39281-big-Data.db
> …
> Max: 2018/08/16 18:18 Min: 2018/08/06 12:19 Estimated droppable tombstones: 0.0 Repaired at: 1534525730510 123M Aug 17 23:46 mc-36847-big-Data.db
> Max: 2018/08/17 19:20 Min: 2017/07/31 12:04 Estimated droppable tombstones: 0.03385963490004347 Repaired at: 0 11G Sep 11 21:43 mc-39265-big-Data.db
> Max: 2018/08/17 19:20 Min: 2018/07/24 12:33 Estimated droppable tombstones: 0.0 Repaired at: 1534525730510 135M Sep 11 21:44 mc-39270-big-Data.db
> …
> Max: 2018/09/06 17:30 Min: 2018/08/28 12:17 Estimated droppable tombstones: 0.0 Repaired at: 1536690786879 129M Sep 11 21:10 mc-39238-big-Data.db
> Max: 2018/09/07 18:22 Min: 2017/04/23 12:48 Estimated droppable tombstones: 0.1548442441468401 Repaired at: 0 8.0G Sep 11 21:33 mc-39258-big-Data.db
> Max: 2018/09/07 18:22 Min: 2018/09/07 12:15 Estimated droppable tombstones: 0.0 Repaired at: 1536690786879 72M Sep 11 21:34 mc-39262-big-Data.db
> Max: 2018/09/08 18:20 Min: 2018/08/22 12:17 Estimated droppable tombstones: 0.0 Repaired at: 0 2.8G Sep 11 21:47 mc-39272-big-Data.db
>
> The tool sstableexpiredblockers shows that the file mc-39281-big-Data.db blocks 95 expired files from getting dropped, for example the oldest file mc-22922-big-Data.db:
>
> [BigTableReader(path='.../stat/spa-.../mc-39281-big-Data.db') (minTS = 1490958782530000, maxTS = 1532707837676719, maxLDT = 1557154990)
> blocks 95 expired sstables from getting dropped:
> [BigTableReader(path='.../stat/spa-.../mc-36936-big-Data.db') (minTS = 1500027128958000, maxTS = 1503666765807229, maxLDT = 1535202765)
> [BigTableReader(path='.../stat/spa-.../mc-22921-big-Data.db') (minTS = 1492256093314000, maxTS = 1492775013454001, maxLDT = 1524311013)
> [BigTableReader(path='.../stat/spa-.../mc-36947-big-Data.db') (minTS = 1492255708403000, maxTS = 1501937182477001, maxLDT = 1533473182)
> [BigTableReader(path='.../stat/spa-.../mc-32582-big-Data.db') (minTS = 1493028031639000, maxTS = 1499175057476001, maxLDT = 1530711057)
> [BigTableReader(path='.../stat/spa-.../mc-32560-big-Data.db') (minTS = 1500210297826000, maxTS = 1501416691390001, maxLDT = 1532952691)
> [BigTableReader(path='.../stat/spa-.../mc-32528-big-Data.db') (minTS = 1490958761762000, maxTS = 1504358072394248, maxLDT = 1535894072)
> [BigTableReader(path='.../stat/spa-.../mc-32572-big-Data.db') (minTS = 1500027103795000, maxTS = 1500297137808001, maxLDT = 1531833137)
> [BigTableReader(path='.../stat/spa-.../mc-36935-big-Data.db') (minTS = 1500038582669000, maxTS = 1503839159485824, maxLDT = 1535375159)
> [BigTableReader(path='.../stat/spa-.../mc-22922-big-Data.db') (minTS = 1490958570018000, maxTS = 1492250905633001, maxLDT = 1523786905)
> [BigTableReader(path='.../stat/spa-.../mc-33470-big-Data.db') (minTS = 1499940836241000, maxTS = 1500040376685000, maxLDT = 1531576376)
>
> Why does the repair cause such turbulence in the new data files, and how can we remove the old data?
>
> Kind Regards
> Maik Cäsar
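For option 2 in the reply above, the DDL change could look roughly like the statement produced by the sketch below. One thing to watch: ALTER TABLE replaces the whole compaction map, so the existing TWCS settings have to be restated alongside the new option. The helper name `alter_compaction_cql` is made up for this example, and the option values are simply copied from the table definition in the original mail; review them before running anything against a real cluster.

```python
# Compaction options currently set on stat.spa, copied from the
# "desc stat.spa" output in the original mail.
current_options = {
    "class": "org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy",
    "compaction_window_size": "1",
    "compaction_window_unit": "DAYS",
    "max_threshold": "100",
    "min_threshold": "4",
    "tombstone_compaction_interval": "86460",
}


def alter_compaction_cql(table, options):
    """Build an ALTER TABLE statement that sets the full compaction map.

    Hypothetical helper for illustration only; it does no escaping and
    assumes plain option names and values.
    """
    body = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {table} WITH compaction = {{{body}}};"


# Keep every existing option and add the one suggested in the reply.
new_options = {**current_options, "unchecked_tombstone_compaction": "true"}
statement = alter_compaction_cql("stat.spa", new_options)
print(statement)
```

The printed statement can be pasted into cqlsh. As the reply notes, expect a burst of compactions once the option takes effect.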