A few suspects:
* snapshots, which could've been created automatically, such as by
dropping or truncating tables when auto_snapshot is set to true, or
compaction when snapshot_before_compaction is set to true
* backups, which could've been created automatically, e.g. when
incremental_backups is set to true
* mixing repaired and unrepaired sstables, which is usually caused by
incremental repairs, even if it has only been run once
* a partially upgraded cluster, e.g. mixed Cassandra versions in the same
cluster
* token ring changes (e.g. adding or removing nodes) without running
"nodetool cleanup" afterwards
* actual increase in data size
* changes made to the table's compression properties
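
As a quick way to check the snapshot and backup settings named above, here is a
minimal Python sketch that pulls them out of cassandra.yaml using only the
standard library. Note that the snapshot-on-drop setting is spelled
auto_snapshot in cassandra.yaml. The file path is an assumption, so adjust it
to your installation; the helper name read_settings is made up for this example,
and the parsing only handles simple top-level "key: value" lines.

```python
import re
from pathlib import Path

# Settings from the list above, spelled as they appear in cassandra.yaml.
SETTINGS = ("auto_snapshot", "snapshot_before_compaction", "incremental_backups")


def read_settings(yaml_text, names=SETTINGS):
    """Return {name: value} for top-level 'name: value' lines that are not commented out."""
    found = {}
    for line in yaml_text.splitlines():
        # Commented-out lines start with '#', so they never match this pattern.
        match = re.match(r"^([A-Za-z_]+)\s*:\s*([^#]*)", line)
        if match and match.group(1) in names:
            found[match.group(1)] = match.group(2).strip()
    return found


if __name__ == "__main__":
    # Hypothetical location; package installs and tarball installs differ.
    path = Path("/etc/cassandra/cassandra.yaml")
    if path.exists():
        for name, value in read_settings(path.read_text()).items():
            print(f"{name}: {value}")
```

A setting that is missing from the output is commented out or absent in the
file, in which case Cassandra falls back to its built-in default.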
To find the root cause, you will need to check the file/folder sizes to
find out what is using the extra disk space, and may also need to review
the cassandra.yaml file (or post it here with sensitive information
removed) and any actions you've taken on the cluster prior to the first
appearance of the issue.
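
To make the file/folder size check concrete, here is a rough Python sketch that
walks the data directory and adds up file sizes per table, split into live
sstables, snapshots and backups. It assumes the usual
<data_dir>/<keyspace>/<table>-<id>/ layout with "snapshots" and "backups"
subdirectories, and the default path /var/lib/cassandra/data is only a guess;
pass your own data directory as the first argument. The function names are
made up for this example.

```python
import os
import sys
from collections import defaultdict


def classify(rel_parts):
    """Label a file by the subdirectory it sits in under a table directory."""
    if "snapshots" in rel_parts:
        return "snapshots"
    if "backups" in rel_parts:
        return "backups"
    return "live"


def usage_by_table(data_dir):
    """Sum file sizes per (keyspace, table directory, category) under data_dir."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(data_dir):
        rel = os.path.relpath(root, data_dir)
        parts = rel.split(os.sep)
        if rel == "." or len(parts) < 2:
            continue  # not inside a keyspace/table directory yet
        keyspace, table = parts[0], parts[1]
        category = classify(parts[2:])
        for name in files:
            try:
                totals[(keyspace, table, category)] += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file disappeared while scanning, e.g. compacted away
    return dict(totals)


if __name__ == "__main__":
    # Assumed default path; override it on the command line.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/cassandra/data"
    rows = sorted(usage_by_table(data_dir).items(), key=lambda kv: kv[1], reverse=True)
    for (keyspace, table, category), size in rows:
        print(f"{size / 2**30:10.2f} GiB  {keyspace}/{table}  [{category}]")
```

Keep in mind that snapshots and incremental backups are hard links to sstable
files, so this sketch counts a file once under each directory it appears in. A
snapshot only costs extra disk space once the original sstables it links to
have been compacted away.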
Also, manually running major compactions is not advised.
On 12/03/2025 20:26, William Crowell via user wrote:
Hi. A few months ago, I upgraded a single node Cassandra instance
from version 3 to 4.1.3. This instance is not very large with about
15 to 20 gigabytes of data on version 3, but after the update it has
gone up substantially to over 100 GB. I do a compaction once a week
and take a snapshot, but with the increase in data it makes the
compaction a much lengthier process. I also ran sstableupgrade as
part of the upgrade. Any reason for the increased size of the
database on the file system?
I am using the default STCS compaction strategy. My “nodetool
cfstats” on a heavily used table looks like this:
Keyspace : xxxxxxxx
Read Count: 48089
Read Latency: 12.52872569610514 ms
Write Count: 1616682825
Write Latency: 0.0067135265490310386 ms
Pending Flushes: 0
Table: sometable
SSTable count: 13
Old SSTable count: 0
Space used (live): 104005524836
Space used (total): 104005524836
Space used by snapshots (total): 0
Off heap memory used (total): 116836824
SSTable Compression Ratio: 0.566085855123187
Number of partitions (estimate): 14277177
Memtable cell count: 81033
Memtable data size: 13899174
Memtable off heap memory used: 0
Memtable switch count: 13171
Local read count: 48089
Local read latency: NaN ms
Local write count: 1615681213
Local write latency: 0.005 ms
Pending flushes: 0
Percent repaired: 0.0
Bytes repaired: 0.000KiB
Bytes unrepaired: 170.426GiB
Bytes pending repair: 0.000KiB
Bloom filter false positives: 125
Bloom filter false ratio: 0.00494
Bloom filter space used: 24656936
Bloom filter off heap memory used: 24656832
Index summary off heap memory used: 2827608
Compression metadata off heap memory used: 89352384
Compacted partition minimum bytes: 73
Compacted partition maximum bytes: 61214
Compacted partition mean bytes: 11888
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0
Droppable tombstone ratio: 0.04983