Thank you for the information!

On Thu, Jun 20, 2019 at 9:50 AM Alexander Dejanovski <a...@thelastpickle.com> wrote:
> Léo,
>
> If a major compaction isn't a viable option, you can give the Instaclustr
> SSTable tools a go to target the partitions with the most tombstones:
> https://github.com/instaclustr/cassandra-sstable-tools/tree/cassandra-2.2#ic-purge
>
> It generates a report like this:
>
> Summary:
> +---------+---------+
> |         | Size    |
> +---------+---------+
> | Disk    | 1.9 GB  |
> | Reclaim | 11.7 MB |
> +---------+---------+
>
> Largest reclaimable partitions:
> +--------------+--------+---------+-----------------+
> | Key          | Size   | Reclaim | Generations     |
> +--------------+--------+---------+-----------------+
> | 001.2.340862 | 3.2 kB | 3.2 kB  | [534, 438, 498] |
> | 001.2.946243 | 2.9 kB | 2.8 kB  | [534, 434, 384] |
> | 001.1.527557 | 2.8 kB | 2.7 kB  | [534, 519, 394] |
> | 001.2.181797 | 2.6 kB | 2.6 kB  | [534, 424, 343] |
> | 001.3.475853 | 2.7 kB | 28 B    | [524, 462]      |
> | 001.0.159704 | 2.7 kB | 28 B    | [440, 247]      |
> | 001.1.311372 | 2.6 kB | 28 B    | [424, 458]      |
> | 001.0.756293 | 2.6 kB | 28 B    | [428, 358]      |
> | 001.2.681009 | 2.5 kB | 28 B    | [440, 241]      |
> | 001.2.474773 | 2.5 kB | 28 B    | [524, 484]      |
> | 001.2.974571 | 2.5 kB | 28 B    | [386, 517]      |
> | 001.0.143176 | 2.5 kB | 28 B    | [518, 368]      |
> | 001.1.185198 | 2.5 kB | 28 B    | [517, 386]      |
> | 001.3.503517 | 2.5 kB | 28 B    | [426, 346]      |
> | 001.1.847384 | 2.5 kB | 28 B    | [436, 396]      |
> | 001.0.949269 | 2.5 kB | 28 B    | [516, 356]      |
> | 001.0.756763 | 2.5 kB | 28 B    | [440, 249]      |
> | 001.3.973808 | 2.5 kB | 28 B    | [517, 386]      |
> | 001.0.312718 | 2.4 kB | 28 B    | [524, 467]      |
> | 001.3.632066 | 2.4 kB | 28 B    | [432, 377]      |
> | 001.1.946590 | 2.4 kB | 28 B    | [519, 389]      |
> | 001.1.798591 | 2.4 kB | 28 B    | [434, 388]      |
> | 001.3.953922 | 2.4 kB | 28 B    | [432, 375]      |
> | 001.2.585518 | 2.4 kB | 28 B    | [432, 375]      |
> | 001.3.284942 | 2.4 kB | 28 B    | [376, 432]      |
> +--------------+--------+---------+-----------------+
>
> Once you've identified these partitions, you can run a compaction on the
> SSTables that contain them (identified using "nodetool getsstables").
> Note that user-defined compactions are only available for STCS.
> Also, ic-purge will perform a compaction, but without writing to disk (it
> should look like a validation compaction), so it is rightfully described
> in the docs as an "intensive process" (though not more so than a repair).
>
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> On Thu, Jun 20, 2019 at 9:17 AM Alexander Dejanovski <a...@thelastpickle.com> wrote:
>
>> My bad on the date formatting, it should have been: %Y/%m/%d
>> Otherwise the SSTables aren't ordered properly.
>>
>> You have 2 SSTables that claim to cover timestamps from 1940 to 2262,
>> which is weird.
>> Aside from that, you have big overlaps all over the SSTables, so that's
>> probably why your tombstones are sticking around.
>>
>> Your best shot here will be a major compaction of that table, since it
>> doesn't seem so big. Remember to use the --split-output flag on the
>> compaction command to avoid ending up with a single SSTable after that.
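>>
>> Roughly, that would look something like this (the keyspace and table
>> names below are placeholders to adapt to your schema):
>>
>> ```
>> # Major compaction of a single table; -s / --split-output makes Cassandra
>> # write several size-tiered SSTables instead of one huge output file.
>> nodetool compact --split-output my_keyspace my_table
>> ```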
>>
>> Cheers,
>>
>> -----------------
>> Alexander Dejanovski
>> France
>> @alexanderdeja
>>
>> Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>>
>> On Thu, Jun 20, 2019 at 8:13 AM Léo FERLIN SUTTON <lfer...@mailjet.com.invalid> wrote:
>>
>>> On Thu, Jun 20, 2019 at 7:37 AM Alexander Dejanovski <a...@thelastpickle.com> wrote:
>>>
>>>> Hi Leo,
>>>>
>>>> The overlapping SSTables are indeed the most probable cause, as
>>>> suggested by Jeff.
>>>> Do you know if the tombstone compactions actually triggered? (Did the
>>>> SSTable names change?)
>>>
>>> Hello!
>>>
>>> I believe they have changed. I do not remember the SSTable names, but
>>> the "last modified" timestamps have changed recently for these tables.
>>>
>>>> Could you run the following command to list the SSTables and provide us
>>>> the output? It will display their timestamp ranges along with the
>>>> estimated droppable tombstones ratio.
>>>>
>>>> for f in *Data.db; do meta=$(sstablemetadata -gc_grace_seconds 259200 $f); echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y %H:%M:%S') $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y %H:%M:%S') $(echo "$meta" | grep droppable) $(ls -lh $f); done | sort
>>>
>>> Here are the results:
>>>
>>> ```
>>> 04/01/2019 22:53:12 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 16G Apr 13 14:35 md-147916-big-Data.db
>>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 218M Jun 20 05:57 md-167948-big-Data.db
>>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:57 md-167942-big-Data.db
>>> 05/01/2019 08:03:24 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 4.6G May 1 08:39 md-152253-big-Data.db
>>> 05/09/2018 06:35:03 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 22:09 md-147948-big-Data.db
>>> 05/21/2019 05:28:01 03/06/2018 16:46:16 Estimated droppable tombstones: 0.45150604672159905 -rw-r--r-- 1 cassandra cassandra 1.1G Jun 20 05:55 md-167943-big-Data.db
>>> 05/22/2019 11:54:33 03/06/2018 16:46:16 Estimated droppable tombstones: 0.30826566640798975 -rw-r--r-- 1 cassandra cassandra 7.6G Jun 20 04:35 md-167913-big-Data.db
>>> 06/13/2019 00:02:40 03/06/2018 16:46:08 Estimated droppable tombstones: 0.20980847354256815 -rw-r--r-- 1 cassandra cassandra 6.9G Jun 20 04:51 md-167917-big-Data.db
>>> 06/17/2019 05:56:12 06/16/2019 20:33:52 Estimated droppable tombstones: 0.6114260192855792 -rw-r--r-- 1 cassandra cassandra 257M Jun 20 05:29 md-167938-big-Data.db
>>> 06/18/2019 11:21:55 03/06/2018 17:48:22 Estimated droppable tombstones: 0.18655813086540254 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:52 md-167940-big-Data.db
>>> 06/19/2019 16:53:04 06/18/2019 11:22:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 425M Jun 19 17:08 md-167782-big-Data.db
>>> 06/20/2019 04:17:22 06/19/2019 16:53:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 146M Jun 20 04:18 md-167921-big-Data.db
>>> 06/20/2019 05:50:23 06/20/2019 04:17:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 42M Jun 20 05:56 md-167946-big-Data.db
>>> 06/20/2019 05:56:03 06/20/2019 05:50:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 2 cassandra cassandra 4.8M Jun 20 05:56 md-167947-big-Data.db
>>> 07/03/2018 17:26:54 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 27G Apr 13 17:45 md-147919-big-Data.db
>>> 09/09/2018 18:55:23 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 18:57 md-147926-big-Data.db
>>> 11/30/2018 11:52:33 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 14G Apr 13 13:53 md-147908-big-Data.db
>>> 12/20/2018 07:30:03 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 9.3G Apr 13 13:28 md-147906-big-Data.db
>>> ```
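>>>
>>> (Side note: the list is sorted on the month-first date strings, so it is
>>> not in strict chronological order. A year-first variant of the same loop,
>>> untested, should sort properly:)
>>>
>>> ```
>>> # Same listing, but with a year-first date format so "sort" orders the
>>> # SSTables chronologically by their maximum timestamp.
>>> for f in *Data.db; do meta=$(sstablemetadata -gc_grace_seconds 259200 $f); echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3| cut -c 1-10) '+%Y/%m/%d %H:%M:%S') $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3| cut -c 1-10) '+%Y/%m/%d %H:%M:%S') $(echo "$meta" | grep droppable) $(ls -lh $f); done | sort
>>> ```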
>>>
>>>> You could also check the min and max tokens in each SSTable (not sure
>>>> if you get that info from 3.0 sstablemetadata) so that you can detect
>>>> the SSTables that overlap on token ranges with the ones that carry the
>>>> tombstones and have earlier timestamps. This way you'll be able to
>>>> trigger manual compactions targeting those specific SSTables.
>>>
>>> I have checked and I don't believe the info is available in the 3.0.X
>>> version of sstablemetadata :(
>>>
>>>> The rule for a tombstone to be purged is that there is no SSTable
>>>> outside the compaction that could possibly contain the partition and
>>>> that has older timestamps.
>>>
>>> Is there a way to log these checks and decisions made by the compaction
>>> thread?
>>>
>>>> Is this a follow-up to your previous issue, where you were trying to
>>>> perform a major compaction on an LCS table?
>>>
>>> In some way.
>>>
>>> We are trying to globally reclaim the disk space used up by our
>>> tombstones (on more than one table). We have recently started to purge
>>> old data in our Cassandra cluster, and since (on cloud providers) disk
>>> space isn't cheap, we are trying to make sure the data correctly expires
>>> and the disk space is reclaimed!
>>>
>>> The major compaction on the LCS table was one of our unsuccessful
>>> attempts (too long and too much disk space used, so abandoned), and we
>>> are currently trying to tweak the compaction parameters to speed things
>>> up.
>>>
>>> Regards.
>>>
>>> Leo
>>>
>>>> On Thu, Jun 20, 2019 at 7:02 AM Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>>> Probably overlapping sstables
>>>>>
>>>>> Which compaction strategy?
>>>>>
>>>>> > On Jun 19, 2019, at 9:51 PM, Léo FERLIN SUTTON <lfer...@mailjet.com.invalid> wrote:
>>>>> >
>>>>> > I have used the following command to check if I had droppable tombstones:
>>>>> > `/usr/bin/sstablemetadata --gc_grace_seconds 259200 /var/lib/cassandra/data/stats/tablename/md-sstablename-big-Data.db`
>>>>> >
>>>>> > I checked every sstable in a loop and had 4 sstables with droppable tombstones:
>>>>> >
>>>>> > ```
>>>>> > Estimated droppable tombstones: 0.1558453651124074
>>>>> > Estimated droppable tombstones: 0.20980847354256815
>>>>> > Estimated droppable tombstones: 0.30826566640798975
>>>>> > Estimated droppable tombstones: 0.45150604672159905
>>>>> > ```
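>>>>> >
>>>>> > (A minimal sketch of such a loop, reusing the data directory from the
>>>>> > command above; adjust the path to your own table:)
>>>>> >
>>>>> > ```
>>>>> > # Print the estimated droppable tombstones for every SSTable of the
>>>>> > # table, using the same 3-day (259200 s) gc_grace_seconds as above.
>>>>> > for f in /var/lib/cassandra/data/stats/tablename/md-*-big-Data.db; do
>>>>> >   echo "== $f"
>>>>> >   /usr/bin/sstablemetadata --gc_grace_seconds 259200 "$f" | grep droppable
>>>>> > done
>>>>> > ```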
>>>>> > I changed my compaction configuration this morning (via JMX) to
>>>>> > force a tombstone compaction. These are my settings on this node:
>>>>> >
>>>>> > ```
>>>>> > {
>>>>> >   "max_threshold":"32",
>>>>> >   "min_threshold":"4",
>>>>> >   "unchecked_tombstone_compaction":"true",
>>>>> >   "tombstone_threshold":"0.1",
>>>>> >   "class":"org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy"
>>>>> > }
>>>>> > ```
>>>>> >
>>>>> > The threshold is lower than the droppable tombstone ratio of these
>>>>> > sstables, and I expected the setting `unchecked_tombstone_compaction=true`
>>>>> > to force Cassandra to run a "tombstone compaction", yet about 24h later
>>>>> > all the tombstones are still there.
>>>>> >
>>>>> > ## About the cluster:
>>>>> >
>>>>> > The compaction backlog is clear and here are our Cassandra settings:
>>>>> >
>>>>> > Cassandra 3.0.18
>>>>> > concurrent_compactors: 4
>>>>> > compaction_throughput_mb_per_sec: 150
>>>>> > sstable_preemptive_open_interval_in_mb: 50
>>>>> > memtable_flush_writers: 4
>>>>> >
>>>>> > Any idea what I might be missing?
>>>>> >
>>>>> > Regards,
>>>>> >
>>>>> > Leo
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>>> For additional commands, e-mail: user-h...@cassandra.apache.org