Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

C. Scott Andreas Thu, 20 Mar 2025 21:23:19 -0700
While it's true that COMPACT STORAGE was deprecated in Cassandra 4.0+, it was not removed. It's largely supported and works just fine for many use cases. Feedback from users indicated that removing it entirely would be disruptive because it required special effort from Cassandra users on upgrade to 4.x. Much of COMPACT STORAGE was reintroduced in Cassandra 4.x, and works just fine – including in 5.0 and trunk.

You can find details on that work here: https://issues.apache.org/jira/browse/CASSANDRA-16217

Excerpting the docs added at the time:
https://github.com/apache/cassandra/commit/0a49d25078665da0ec30d9e69a036de163deb9c3#diff-6a259516d12c778b2b08ec408b884a88bf94d14a9f6dd8d943c6e4376288da6dR479-R480

––––––
A compact storage table is one defined with the COMPACT STORAGE option. This option is only maintained for backward compatibility for definitions created before CQL version 3 and shouldn't be used for new tables. 4.0 supports partially COMPACT STORAGE. There is no support for super column family. Since Cassandra 3.0, compact storage tables have the exact same layout internally than non compact ones (for the same schema obviously), and declaring a table with this option creates limitations for the table which are largely arbitrary (and exists for historical reasons). Amongst those limitations:

- A compact storage table cannot use collections nor static columns.
- If a compact storage table has at least one clustering column, then it must have *exactly* one column outside of the primary key ones. This implies you cannot add or remove columns after creation in particular.
- A compact storage table is limited in the indexes it can create, and no materialized view can be created on it.
––––––

On Mar 20, 2025, at 8:43 PM, Thomas Elliott <[email protected]>
wrote:

I'm now wondering how your Cassandra is starting with this table configuration. Part of migrating from 3.x should have dropped COMPACT STORAGE from your tables. Do you find any notice of exceptions when Cassandra starts up? Was the description of table1 that you shared from the current server or from before the upgrade?

.thomas

On Thu, Mar 20, 2025 at 6:59 PM Thomas Elliott < [email protected] > wrote:

William,

Longtime listener, first time poster.

Something stands out here for me. This table uses COMPACT STORAGE, which is deprecated in Cassandra 4.x and is a hold-over from the old Thrift API. I'm sure that this is eliciting some gasps from the mailing list. You might actually be better served by creating a new keyspace and table, using Leveled Compaction instead of STCS and LZ4 instead of Snappy. After copying the data over to the new table, run a compaction and repair on the table, and I expect that you should see the footprint decrease.

What I'm concerned about is that you have Compact Storage using this legacy Thrift API, which isn't compatible with 4.0. You also have wide columns and a large partition, which exacerbates the compact storage issue.

I have to keep coming back to the fact that this is a single instance with a single replica and a small volume of data, so I shouldn't be thinking about the scalability of your data model. That said, your data footprint shouldn't differ by 10x between Cassandra 3.x and 4.x. It should be bigger because 4.0 introduces additional metadata, but not THAT much bigger.

I took a stab at a potential new schema for you and this might work better...

CREATE TABLE keyspace1.table1 (
    blob1 blob,                  -- Partition key
    blob2 blob,                  -- Clustering key
    blue3 blob,                  -- Data column
    PRIMARY KEY (blob1, blob2)   -- Composite primary key
) WITH CLUSTERING ORDER BY (blob2 ASC)   -- Maintain clustering order
  AND compaction = {
      'class': 'LeveledCompactionStrategy',   -- Optimized for write-heavy workloads
      'sstable_size_in_mb': '160'             -- Default size for LCS
  }
  AND compression = {
      'class': 'LZ4Compressor',
      'chunk_length_in_kb': '64'
  }
  AND caching = {
      'keys': 'ALL',                  -- Cache all partition keys
      'rows_per_partition': 'NONE'    -- Avoid caching rows to save memory
  }
  AND gc_grace_seconds = 864000
  AND default_time_to_live = 0
  AND speculative_retry = '99p'
  AND bloom_filter_fp_chance = 0.01
  AND crc_check_chance = 1.0
  AND memtable_flush_period_in_ms = 0;

All the best,
.thomas

On Thu, Mar 20, 2025 at 8:36 AM Michalis Kotsiouros
(EXT) via user < [email protected] > wrote:

Hello William,

Does your data get updated? You mentioned in one email that you switched from TTL 0 to 10 days. This might mean that some data is never deleted because at some point it was written with no TTL (TTL set to 0). This has occurred in some systems in the past. You may use the sstableexpiredblockers tool to check for this.

BR
MK

From: William Crowell via user < [email protected] >
Sent: March 20, 2025 13:58
To: [email protected]
Cc: William Crowell < [email protected] >; Sartor Fabien (DIN) < [email protected] >; [email protected]
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Stephane and Fabien,

Good morning and thank you for your replies. Your English is fine. I just appreciate the help.

Apache Cassandra was overkill for this application, and this was just something handed to me. This is an issue because of the limited amount of disk space specified when the volume that Cassandra resides on was created, and it cannot be expanded. We still have plenty of disk space, but with this growth we will eventually run out. Trying to be proactive here.

I do not see any snapshots when running nodetool listsnapshots, nor any backup data in the backup folder.

It could be an actual increase in data size, and I am not aware of any data being injected into the table by anyone else. That’s something to check into though.

We initially created this table with Cassandra 3.x and then migrated to 4.1.3. I will paste the table description below, and I have changed the table and column names only. I would never define a table like this with the primary key being 2 blobs. I would have at least created a uuid field as a primary key. The compression ratio is not very good because there are 3 blob columns in the table. When I first looked at this, I felt LeveledCompactionStrategy would have been a better fit than STCS.

I do agree and see that droppable tombstones are below 5%, which is expected and should not significantly impact the data size.

Have we recently deleted a large amount of data? Not to my knowledge. We define TTLs of 10 days, after which Cassandra deletes the records for us.

How do you determine the data size before compaction? We look at the disk space being used and noticed the increase: df -H

Do these archives contain the full dataset before the update? Yes they do.

I will try the suggestions you both mentioned. Here is the table definition:

cqlsh:keyspace1> desc table table1

/*
Warning: Table keyspace1.table1 omitted because it has constructs not compatible with CQL (was created via legacy API).
Approximate structure, for reference:
(this should not be used to reproduce this schema)

CREATE TABLE keyspace1.table1 (
    blob1 blob,
    blob2 blob,
    blue3 blob,
    PRIMARY KEY (blob1, blob2)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (blob2 ASC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'NONE', 'rows_per_partition': '120'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND memtable = 'default'
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';
*/

Regards,
William Crowell

From: Sartor Fabien (DIN) via user <
[email protected] >
Date: Thursday, March 20, 2025 at 6:56 AM
To: [email protected] < [email protected] >
Cc: Sartor Fabien (DIN) < [email protected] >
Subject: RE: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Dear William,

I'm also sorry for my poor English!

Compression ratio 0.566… I guess it means that if you have 1MB of data on your laptop, you get 0.566MB in Cassandra. If you put blobs in the table, that is OK! If it is plain English text, it seems low.

I understand the answers to the questions below from Bowen:

• snapshots could've been created automatically,
  o such as by dropping or truncating tables when auto_snapshots is set to true: SET TO TRUE
  o or compaction when snapshot_before_compaction is set to true: SET TO FALSE
• backups, which could've been created automatically, e.g. when incremental_backups is set to true: SET TO FALSE
• mixing repaired and unrepaired sstables, which is usually caused by incremental repairs, even if it had only been run once: no incremental repairs done
• partially upgraded cluster, e.g. mixed Cassandra version in the same cluster: Only 1 node in the cluster.
• token ring change (e.g. adding or removing nodes) without "nodetool cleanup": Only 1 node in the cluster.
• changes made to the compression table properties: No change done

I don't have the answers to these questions below from Bowen:

• do you have any snapshots/backup data in the folder?
  o You can try to run: nodetool listsnapshots
• actual increase in data size of business data?
  o maybe a developer used the prod env to inject data and didn't notice it?

I started working in Cassandra administration only a few months ago, so I may be wrong!

My team recently migrated Cassandra from version 3 to 4.1.4. They didn't observe this behavior. All tables use the Leveled Compaction Strategy, and we do not perform manual compactions.

What I can see is that droppable tombstones are below 5%, which is expected and should not significantly impact the data size. Do we agree with this statement? Have you recently deleted a large amount of data?

Could you run the following command on the 80GB+ SSTable and some others?

$CASSANDRA_HOME/tools/bin/sstablemetadata

An SSTable of 80GB+ seems very large. An SSTable ensures that you don't have duplicate data inside, meaning this 80GB+ consists of useful data. So my question is: How do you determine the data size before compaction?

I understand that you perform backups. Could you check the backup archives from before the update? Do these archives contain the full dataset before the update?

Additionally, try parsing the business data and verify that all the data is intact. You can use the following command to inspect a few records:

$CASSANDRA_HOME/tools/bin/sstabledump -d

Another possibility is to load all the SSTables into another Cassandra instance using:

$CASSANDRA_HOME/bin/sstableloader

Then, check if you get the same data size.

Thank you,
Best regards,
Fabien

From: Tasmaniedemon < [email protected] >
Sent: Thursday, 20 March 2025 09:06
To: [email protected]
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

CAUTION. This message comes from a sender outside the State. Do not click on links or open attachments unless you fully trust this sender.

Hi,

Could you give more details about how the tables are used and modeled on this single-node Cassandra? Did you start using Cassandra with version 3, or had you already migrated from a previous version (2.x)?

To be honest, I would suggest using the latest release available, and rebuilding and reloading a fresh new cluster with a very low
num_token (and 3 nodes :-)

May I ask why only a single-node Cassandra? Is scalability not intended?

Sorry for my poor English :-)

Kind regards
Stephane

On 19/03/2025 at 14:15, William Crowell via user wrote:

Bowen, Fabien, Stéphane, and Luciano,

A bit more information here...

We have not run incremental repairs, and we have not made any changes to the compression properties on the tables. When we first started the database the TTL on the records was set to 0, but now it is set to 10 days.

We do have one table in a keyspace that is occupying 84.1GB of disk space:

ls -l /var/lib/cassandra/data/keyspace1/table1
…
-rw-rw-r--. 1 xxxxxxxx xxxxxxxxx 84145170181 Mar 18 08:28 nb-163033-big-Data.db
…

Regards,
William Crowell

From: William Crowell via user <[email protected]>
Date: Friday, March 14, 2025 at 10:53 AM
To: [email protected] <[email protected]>
Cc: William Crowell <[email protected]>, Bowen Song <[email protected]>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Bowen,

This is just a single Cassandra node. Unfortunately, I cannot get on the box at the moment, but the following configuration is in cassandra.yaml:

snapshot_before_compaction: false
auto_snapshot: true
incremental_backups: false

The only other configuration parameter that had been changed other than the keystore and truststore was num_tokens (default: 16):

num_tokens: 256

I also noticed the compression ratio on the largest table is not good: 0.566085855123187

Regards,
William Crowell

From: Bowen Song via user <[email protected]>
Date: Friday, March 14, 2025 at 10:13 AM
To: William Crowell via user <[email protected]>
Cc: Bowen Song <[email protected]>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

A few suspects:

* snapshots, which could've been created automatically, such as by dropping or truncating tables when auto_snapshots is set to true, or compaction when snapshot_before_compaction is set to true
* backups, which could've been created automatically, e.g. when incremental_backups is set to true
* mixing repaired and unrepaired sstables, which is usually caused by incremental repairs, even if it had only been run once
* partially upgraded cluster, e.g. mixed Cassandra version in the same cluster
* token ring change (e.g. adding or removing nodes) without "nodetool cleanup"
* actual increase in data size
* changes made to the compression table properties

To find the root cause, you will need to check the file/folder sizes to find out what is using the extra disk space, and may also need to review the cassandra.yaml file (or post it here with sensitive information removed) and any actions you've made to the cluster prior to the first appearance of the issue.

Also, manually running major compactions is not advised.

On 12/03/2025 20:26,
William Crowell via user wrote:

Hi.

A few months ago, I upgraded a single node Cassandra instance from version 3 to 4.1.3. This instance is not very large, with about 15 to 20 gigabytes of data on version 3, but after the update it has gone up substantially to over 100GB. I do a compaction once a week and take a snapshot, but with the increase in data the compaction has become a much lengthier process. I also did an sstableupgrade as part of the upgrade.

Any reason for the increased size of the database on the file system?

I am using the default STCS compaction strategy. My “nodetool cfstats” on a heavily used table looks like this:

Keyspace : xxxxxxxx
    Read Count: 48089
    Read Latency: 12.52872569610514 ms
    Write Count: 1616682825
    Write Latency: 0.0067135265490310386 ms
    Pending Flushes: 0
        Table: sometable
        SSTable count: 13
        Old SSTable count: 0
        Space used (live): 104005524836
        Space used (total): 104005524836
        Space used by snapshots (total): 0
        Off heap memory used (total): 116836824
        SSTable Compression Ratio: 0.566085855123187
        Number of partitions (estimate): 14277177
        Memtable cell count: 81033
        Memtable data size: 13899174
        Memtable off heap memory used: 0
        Memtable switch count: 13171
        Local read count: 48089
        Local read latency: NaN ms
        Local write count: 1615681213
        Local write latency: 0.005 ms
        Pending flushes: 0
        Percent repaired: 0.0
        Bytes repaired: 0.000KiB
        Bytes unrepaired: 170.426GiB
        Bytes pending repair: 0.000KiB
        Bloom filter false positives: 125
        Bloom filter false ratio: 0.00494
        Bloom filter space used: 24656936
        Bloom filter off heap memory used: 24656832
        Index summary off heap memory used: 2827608
        Compression metadata off heap memory used: 89352384
        Compacted partition minimum bytes: 73
        Compacted partition maximum bytes: 61214
        Compacted partition mean bytes: 11888
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0
        Droppable tombstone ratio: 0.04983

This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.

CAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
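
The figures quoted in this thread can be cross-checked with a little arithmetic. The Python sketch below does not come from any of the posters; it only reuses numbers reported above ("Space used (live)", the size of the largest Data.db file, the compression ratio, "Bytes unrepaired", and the 15 to 20 GB pre-upgrade estimate). The function names are hypothetical, and treating "Bytes unrepaired" as an uncompressed size is an assumption that the thread does not confirm.

```python
# Values copied from the thread; nothing here queries a live cluster.
SPACE_USED_LIVE_BYTES = 104_005_524_836   # "Space used (live)" from nodetool cfstats
LARGEST_SSTABLE_BYTES = 84_145_170_181    # nb-163033-big-Data.db from ls -l
COMPRESSION_RATIO = 0.566085855123187     # "SSTable Compression Ratio"
BYTES_UNREPAIRED_GIB = 170.426            # "Bytes unrepaired" (assumed uncompressed)
PRE_UPGRADE_GB = (15, 20)                 # "about 15 to 20 gigabytes" on version 3

GB = 10**9    # decimal gigabyte, the unit df -H reports
GIB = 2**30   # binary gibibyte, the unit cfstats uses for "Bytes unrepaired"


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes."""
    return n_bytes / GB


def to_gib(n_bytes):
    """Convert bytes to binary gibibytes."""
    return n_bytes / GIB


def growth_range(after_bytes, before_gb_range):
    """Return (min, max) growth factor for a (low, high) pre-upgrade estimate in GB."""
    low_gb, high_gb = before_gb_range
    return after_bytes / (high_gb * GB), after_bytes / (low_gb * GB)


def estimated_on_disk_gib(uncompressed_gib, ratio):
    """Estimate compressed size, assuming ratio = compressed / uncompressed."""
    return uncompressed_gib * ratio


if __name__ == "__main__":
    lo, hi = growth_range(SPACE_USED_LIVE_BYTES, PRE_UPGRADE_GB)
    share = LARGEST_SSTABLE_BYTES / SPACE_USED_LIVE_BYTES
    est = estimated_on_disk_gib(BYTES_UNREPAIRED_GIB, COMPRESSION_RATIO)
    print(f"live: {to_gb(SPACE_USED_LIVE_BYTES):.1f} GB "
          f"({to_gib(SPACE_USED_LIVE_BYTES):.1f} GiB)")
    print(f"growth vs. pre-upgrade estimate: {lo:.1f}x to {hi:.1f}x")
    print(f"largest SSTable share of live data: {share:.0%}")
    print(f"unrepaired bytes x compression ratio: {est:.1f} GiB")
```

Under those assumptions, the live data is about 104.0 GB (96.9 GiB), roughly 5.2 to 6.9 times the pre-upgrade estimate, and the single 84.1GB SSTable accounts for about 81% of it. Multiplying "Bytes unrepaired" by the compression ratio gives about 96.5 GiB, close to the live size, which suggests the 170 GiB figure describes the same data before compression rather than additional growth.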