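Yes, clean-up will reduce the disk space used on the existing nodes: it rewrites each sstable into a new one that contains only the data the node still owns. In your example, the 250 GB that now belongs to the new node would be dropped from the rewritten sstable.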
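
To make the arithmetic concrete, here is a toy Python model of what clean-up does to a single sstable. It is only a sketch, not Cassandra code: `Row`, `cleanup`, `total_gb`, the integer tokens and the token ranges are all made up for illustration, and ring wrap-around is ignored. On a real cluster you would run `nodetool cleanup` on each existing node after the new node has joined.

```python
# Toy model of clean-up, not Cassandra code: every name here is hypothetical.
from typing import NamedTuple


class Row(NamedTuple):
    token: int    # partition token, simplified to a small integer
    size_gb: int  # on-disk size of this partition


def cleanup(sstable, owned_ranges):
    """Return a new sstable holding only the rows whose token is still owned."""
    def owned(token):
        # Ranges are (start, end]; ring wrap-around is ignored in this sketch.
        return any(start < token <= end for start, end in owned_ranges)
    return [row for row in sstable if owned(row.token)]


def total_gb(sstable):
    return sum(row.size_gb for row in sstable)


# A 1000 GB sstable: four partitions of 250 GB each.
old_sstable = [Row(10, 250), Row(35, 250), Row(60, 250), Row(85, 250)]

# Assumption: after the new node joins, this node keeps (0, 75]
# and hands (75, 100] over to the new node.
owned_after_bootstrap = [(0, 75)]

new_sstable = cleanup(old_sstable, owned_after_bootstrap)
print(total_gb(old_sstable), "GB ->", total_gb(new_sstable), "GB")
# prints: 1000 GB -> 750 GB
```

The old sstable is left untouched in the model and a new, smaller one is produced, which mirrors the "re-write into new sstables" behaviour described above.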

Sean R. Durity
DB Solutions
Staff Systems Engineer – Cassandra

From: Lapo Luchini <l...@lapo.it>
Sent: Friday, December 30, 2022 4:12 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Best compaction strategy for rarely used data

On 2022-12-29 21:54, Durity, Sean R via user wrote:
> At some point you will end up with large sstables (like 1 TB) that won’t
> compact because there are not 4 similar-sized ones able to be compacted

Yes, that's exactly what's happening.

I'll see maybe just one more compaction, since the biggest sstable is already more than 20% of residual free space.

> For me, the backup strategy shouldn’t drive the rest.

Mhh, yes, that makes sense.

> And if your data is ever-growing
> and never deleted, you will be adding nodes to handle the extra data as
> time goes by (and running clean-up on the existing nodes).

What will happen when adding new nodes, as you say, though?
If I have a 1 TB sstable with 250 GB of data that will no longer be useful (as a new node will be the new owner), will that sstable be reduced to 750 GB by "cleanup" or will it retain old data?

Thanks,

--
Lapo Luchini
l...@lapo.it