Another solution: distribute the data across more tables. For example, you could create multiple tables based on the value or hash bucket of one of the columns; by doing this, the current data volume and compaction overhead would be divided among the underlying tables. Keep in mind, though, that there is a practical limit on the number of tables in Cassandra (a few hundred).
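A rough sketch of the routing I have in mind, in Python. The table prefix `events`, the bucket count of 16 and the helper names are all made up for illustration; the application (not Cassandra) would have to do this on every read and write, and the bucket count cannot be changed later without moving data.

```python
import zlib

NUM_BUCKETS = 16        # hypothetical; keep the table count far below a few hundred
BASE_TABLE = "events"   # hypothetical table name prefix


def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Return a stable bucket index in [0, num_buckets) for a column value."""
    # crc32 gives the same result in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def table_for(key, base=BASE_TABLE, num_buckets=NUM_BUCKETS):
    """Name of the table that should hold rows for this key."""
    return f"{base}_{bucket_for(key, num_buckets):02d}"


def all_tables(base=BASE_TABLE, num_buckets=NUM_BUCKETS):
    """Every underlying table, e.g. for creating the schema up front."""
    return [f"{base}_{i:02d}" for i in range(num_buckets)]
```

Each of the 16 tables then compacts on its own, so the largest sstables should be roughly 1/16 of what a single table would produce, assuming the keys hash evenly.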
I wish STCS simply had a maximum sstable size, so that sstables bigger than this limit would not be compacted at all. That would have solved most similar problems.

---- On Fri, 30 Dec 2022 21:43:27 +0330 Durity, Sean R via user <user@cassandra.apache.org> wrote ---

Yes, clean-up will reduce the disk space on the existing nodes by re-writing only the data that the node now owns into new sstables.

Sean R. Durity
DB Solutions
Staff Systems Engineer – Cassandra

From: Lapo Luchini <mailto:l...@lapo.it>
Sent: Friday, December 30, 2022 4:12 AM
To: mailto:user@cassandra.apache.org
Subject: [EXTERNAL] Re: Best compaction strategy for rarely used data

On 2022-12-29 21:54, Durity, Sean R via user wrote:
> At some point you will end up with large sstables (like 1 TB) that won’t
> compact because there are not 4 similar-sized ones able to be compacted

Yes, that's exactly what's happening.

I'll see maybe just one more compaction, since the biggest sstable is already more than 20% of residual free space.

> For me, the backup strategy shouldn’t drive the rest.

Mhh, yes, that makes sense.

> And if your data is ever-growing
> and never deleted, you will be adding nodes to handle the extra data as
> time goes by (and running clean-up on the existing nodes).

What will happen when adding new nodes, as you say, though?
If I have a 1TB sstable with 250GB of data that will be no longer useful (as a new node will be the new owner), will that sstable be reduced to 750GB by "cleanup" or will it retain old data?

Thanks,

--
Lapo Luchini
mailto:l...@lapo.it