Hi all, I am using Cassandra 4.0, and a table called 'minute' is configured to use TWCS (TimeWindowCompactionStrategy).
Since the table uses TWCS, we do not have cassandra-reaper perform any repairs on it, and no manual compaction is run against it either. When we insert data, we set a TTL of one day on records written from Monday to Thursday, and a custom TTL of three days on records written on Friday (a simplified sketch of the write pattern follows the table definition). This is our table definition:

CREATE TABLE test_db.minute (
    market smallint,
    sin bigint,
    field smallint,
    slot timestamp,
    close frozen<pricerecord>,
    high frozen<pricerecord>,
    low frozen<pricerecord>,
    open frozen<pricerecord>,
    PRIMARY KEY ((market, sin, field), slot)
) WITH CLUSTERING ORDER BY (slot ASC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '2', 'compaction_window_unit': 'HOURS', 'max_threshold': '32', 'min_threshold': '4', 'unsafe_aggressive_sstable_expiration': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 86400
    AND extensions = {}
    AND gc_grace_seconds = 86400
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';
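For clarity, the writes look roughly like this (a simplified sketch; the key values are illustrative and the pricerecord columns are omitted):

-- Monday to Thursday: rows are written with a one-day TTL
INSERT INTO test_db.minute (market, sin, field, slot)
VALUES (1, 12345, 2, '2021-10-11 14:25:00+0000')
USING TTL 86400;

-- Friday: rows are written with a custom three-day TTL
INSERT INTO test_db.minute (market, sin, field, slot)
VALUES (1, 12345, 2, '2021-10-15 14:25:00+0000')
USING TTL 259200;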
But as of Oct 15th (Friday), data files created on Monday (Oct 11) still exist on disk:

# ls -ltrh *Data.db
-rw-r--r-- 1 cassandra cassandra 702M Oct 11 06:02 nb-158077-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.8G Oct 11 08:13 nb-158282-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 11 10:08 nb-158490-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.1G Oct 11 12:15 nb-158717-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 11 14:08 nb-158921-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 11 16:08 nb-159136-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.7G Oct 11 18:13 nb-159349-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.4G Oct 11 20:05 nb-159522-big-Data.db
-rw-r--r-- 1 cassandra cassandra 347M Oct 11 22:41 nb-159539-big-Data.db
-rw-r--r-- 1 cassandra cassandra 646M Oct 12 02:04 nb-159580-big-Data.db
-rw-r--r-- 1 cassandra cassandra 606M Oct 12 03:53 nb-159600-big-Data.db
-rw-r--r-- 1 cassandra cassandra 761M Oct 12 06:03 nb-159629-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.7G Oct 12 08:07 nb-159818-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 12 10:08 nb-160034-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.0G Oct 12 12:08 nb-160244-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.4G Oct 12 14:10 nb-160466-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.6G Oct 12 16:09 nb-160691-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.6G Oct 12 18:06 nb-160885-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.3G Oct 12 20:05 nb-161068-big-Data.db
-rw-r--r-- 1 cassandra cassandra 379M Oct 12 22:38 nb-161086-big-Data.db
-rw-r--r-- 1 cassandra cassandra 357M Oct 13 00:04 nb-161102-big-Data.db
-rw-r--r-- 1 cassandra cassandra 610M Oct 13 02:03 nb-161125-big-Data.db
-rw-r--r-- 1 cassandra cassandra 658M Oct 13 04:07 nb-161148-big-Data.db
-rw-r--r-- 1 cassandra cassandra 737M Oct 13 06:02 nb-161176-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.7G Oct 13 08:07 nb-161360-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.4G Oct 13 10:08 nb-161582-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.9G Oct 13 12:15 nb-161802-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.5G Oct 13 14:09 nb-162014-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.6G Oct 13 16:08 nb-162238-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.5G Oct 13 18:08 nb-162426-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.3G Oct 13 20:06 nb-162600-big-Data.db
-rw-r--r-- 1 cassandra cassandra 354M Oct 13 22:41 nb-162616-big-Data.db
-rw-r--r-- 1 cassandra cassandra 393M Oct 14 00:07 nb-162635-big-Data.db
-rw-r--r-- 1 cassandra cassandra 632M Oct 14 02:07 nb-162658-big-Data.db
-rw-r--r-- 1 cassandra cassandra 598M Oct 14 03:56 nb-162678-big-Data.db
-rw-r--r-- 1 cassandra cassandra 763M Oct 14 06:02 nb-162708-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.7G Oct 14 08:07 nb-162902-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.1G Oct 14 10:08 nb-163112-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.7G Oct 14 12:07 nb-163319-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 14 14:09 nb-163538-big-Data.db
-rw-r--r-- 1 cassandra cassandra 4.3G Oct 14 16:08 nb-163755-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.2G Oct 14 18:06 nb-163937-big-Data.db
-rw-r--r-- 1 cassandra cassandra 3.1G Oct 14 20:06 nb-164103-big-Data.db
-rw-r--r-- 1 cassandra cassandra 365M Oct 14 22:39 nb-164121-big-Data.db
-rw-r--r-- 1 cassandra cassandra 412M Oct 15 00:01 nb-164141-big-Data.db
-rw-r--r-- 1 cassandra cassandra 628M Oct 15 01:58 nb-164163-big-Data.db
-rw-r--r-- 1 cassandra cassandra 27M Oct 15 04:01 nb-164187-big-Data.db
-rw-r--r-- 1 cassandra cassandra 623M Oct 15 04:02 nb-164188-big-Data.db

I was expecting the data from Monday to Wednesday to have already expired and been deleted. From what I can see, the clean-up of expired data only happens on Sunday. Database performance is fine from Monday to Wednesday, but on Thursday and Friday we get many slow-query alerts. After Cassandra cleans up the expired data in 'minute' on Sunday, performance improves again, and then it drops once more the following Thursday and Friday.

In debug.log I can see:

DEBUG [CompactionExecutor:11] 2021-10-15 04:42:41,089 TimeWindowCompactionStrategy.java:124 - TWCS skipping check for fully expired SSTables

So Cassandra does seem to know that some of the SSTables have already expired, but the expired data files are not being deleted. I ran sstableexpiredblockers against the table (exact invocation in the P.S. below), and it reported nothing blocking the clean-up. Looking at one of the old data files with sstablemetadata:

# sstablemetadata nb-158282-big-Data.db
SSTable: /var/lib/cassandra/data/test_db/minute-d7955270f31d11ea88fabb8dcc37b800/nb-158282-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1499178300000000 (07/04/2017 14:25:00)
Maximum timestamp: 1633939179628350 (10/11/2021 07:59:39)
SSTable min local deletion time: 1634018406 (10/12/2021 06:00:06)
SSTable max local deletion time: 1634198378 (10/14/2021 07:59:38)
Compressor: org.apache.cassandra.io.compress.LZ4Compressor
Compression ratio: 0.2617497307180066
TTL min: 86400 (1 day)
TTL max: 259200 (3 days)
First token: -9156529758809369559 (322:20632605:6)
Last token: 9211448821245734928 (3870:24273549:7)
minClusteringValues: [2017-07-04T14:25:00.000Z]
maxClusteringValues: [2021-10-11T07:58:00.000Z]
Estimated droppable tombstones: 1.1312506927559127

I would appreciate any help identifying why these expired SSTables are not being cleaned up based on their TTL.

Thanks,
Eric
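P.S. For completeness, the blocker check mentioned above was run roughly like this (keyspace and table name as arguments):

# sstableexpiredblockers test_db minute

It did not report any SSTable blocking the expired ones from being dropped.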