Yes, clean-up will reduce the disk space used on the existing nodes by 
rewriting only the data that each node still owns into new sstables.
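
To make the arithmetic in the question concrete, here is a minimal sketch of what clean-up does conceptually. It is only a toy model: the Row class, the small integer tokens, and the single (0, 75] range are made up for illustration, and real clusters use much larger token ranges (and usually vnodes). The actual operation is run per node with "nodetool cleanup".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    token: int    # hypothetical, simplified partition token
    size_gb: int  # on-disk size of the partition, in GB


def cleanup(sstable, owned_ranges):
    """Toy model of clean-up: build a new sstable that keeps only the rows
    whose token falls inside a range (lo, hi] the node still owns."""
    return [
        row for row in sstable
        if any(lo < row.token <= hi for lo, hi in owned_ranges)
    ]


# A 1 TB sstable, modelled as four 250 GB partitions.
old_sstable = [Row(10, 250), Row(35, 250), Row(60, 250), Row(85, 250)]

# Assumption: after a new node joins, this node owns only (0, 75];
# the range (75, 100] now belongs to the new node.
owned_after_bootstrap = [(0, 75)]

new_sstable = cleanup(old_sstable, owned_after_bootstrap)

size_before = sum(row.size_gb for row in old_sstable)
size_after = sum(row.size_gb for row in new_sstable)
print(f"before: {size_before} GB, after: {size_after} GB")
```

In this model the 250 GB that moved to the new node is simply not copied into the new sstable, so the 1 TB file is replaced by a 750 GB one.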


Sean R. Durity
DB Solutions
Staff Systems Engineer – Cassandra

From: Lapo Luchini <l...@lapo.it>
Sent: Friday, December 30, 2022 4:12 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Best compaction strategy for rarely used data


On 2022-12-29 21:54, Durity, Sean R via user wrote:
> At some point you will end up with large sstables (like 1 TB) that won’t
> compact because there are not 4 similar-sized ones able to be compacted

Yes, that's exactly what's happening.
I'll see maybe just one more compaction, since the biggest sstable is
already more than 20% of residual free space.

> For me, the backup strategy shouldn’t drive the rest.

Mhh, yes, that makes sense.

> And if your data is ever-growing
> and never deleted, you will be adding nodes to handle the extra data as
> time goes by (and running clean-up on the existing nodes).

What will happen when adding new nodes, as you say, though?
If I have a 1TB sstable with 250GB of data that will no longer be useful
(as a new node will be the new owner), will that sstable be reduced to
750GB by "cleanup" or will it retain the old data?

Thanks,

--
Lapo Luchini
l...@lapo.it

