Hi,

Could you give more details about the tables and the data model on this single-node Cassandra instance?

Did you start using Cassandra with version 3, or had you already migrated from an earlier version (2.x)?

To be honest, I would suggest using the latest available release, and rebuilding and reloading a fresh new cluster with a very low num_tokens (and 3 nodes :-)

May I ask why you run only a single Cassandra node? Is scalability not a goal?

Sorry for my poor English :-)

Kind regards

Stephane



On 19/03/2025 14:15, William Crowell via user wrote:

Bowen, Fabien, Stéphane, and Luciano,

A bit more information here...

We have not run incremental repairs, and we have not made any changes to the compression properties on the tables.

When we first started the database, the TTL on the records was set to 0, but now it is set to 10 days.

We do have one table in a keyspace that is occupying 84.1GB of disk space:

ls -l /var/lib/cassandra/data/keyspace1/table1

…

-rw-rw-r--. 1 xxxxxxxx xxxxxxxxx 84145170181 Mar 18 08:28 nb-163033-big-Data.db

…

Regards,

William Crowell


From: William Crowell via user <user@cassandra.apache.org>
Date: Friday, March 14, 2025 at 10:53 AM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Cc: William Crowell <wcrow...@perforce.com>, Bowen Song <bo...@bso.ng>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Bowen,

This is just a single Cassandra node.  Unfortunately, I cannot get on the box at the moment, but the following configuration is in cassandra.yaml:

snapshot_before_compaction: false

auto_snapshot: true

incremental_backups: false

Apart from the keystore and truststore settings, the only configuration parameter that had been changed was num_tokens (default: 16):

num_tokens: 256

I also noticed the compression ratio on the largest table is not good:  0.566085855123187
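As a rough cross-check of the figures quoted in this thread, the arithmetic below multiplies "Bytes unrepaired" by the "SSTable Compression Ratio" and compares the result with "Space used (live)" from the cfstats output in the original message. It is only a sketch and rests on two assumptions: that "Bytes unrepaired" reports the uncompressed data size, and that the compression ratio is compressed size divided by uncompressed size.

```python
GIB = 1024 ** 3

# Values copied from the cfstats output quoted in this thread.
bytes_unrepaired = 170.426 * GIB        # "Bytes unrepaired" (assumed to be uncompressed)
compression_ratio = 0.566085855123187   # "SSTable Compression Ratio" (assumed compressed / uncompressed)
space_used_live = 104_005_524_836       # "Space used (live)", in bytes

# If both assumptions hold, this should land close to the on-disk size.
estimated_on_disk = bytes_unrepaired * compression_ratio
relative_gap = abs(estimated_on_disk - space_used_live) / space_used_live

print(f"estimated compressed size: {estimated_on_disk / 1e9:.1f} GB")
print(f"reported live size:        {space_used_live / 1e9:.1f} GB")
print(f"relative gap:              {relative_gap:.2%}")
```

If the two numbers agree closely, the on-disk size is explained by the amount of underlying data together with this ratio, rather than by snapshots or backups.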

Regards,

William Crowell

From: Bowen Song via user <user@cassandra.apache.org>
Date: Friday, March 14, 2025 at 10:13 AM
To: William Crowell via user <user@cassandra.apache.org>
Cc: Bowen Song <bo...@bso.ng>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

A few suspects:

* snapshots, which could've been created automatically, such as by dropping or truncating tables when auto_snapshot is set to true, or by compaction when snapshot_before_compaction is set to true

* backups, which could've been created automatically, e.g. when incremental_backups is set to true

* mixing repaired and unrepaired sstables, which is usually caused by incremental repairs, even if one has only been run once

* partially upgraded cluster, e.g. mixed Cassandra versions in the same cluster

* token ring change (e.g. adding or removing nodes) without "nodetool cleanup"

* actual increase in data size

* changes made to the compression table properties

To find the root cause, you will need to check the file/folder sizes to find out what is using the extra disk space. You may also need to review the cassandra.yaml file (or post it here with sensitive information removed) and any actions you took on the cluster before the issue first appeared.
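The size check can be sketched as a small script. This is a minimal illustration, not an official tool: it assumes the usual on-disk layout of `<data_dir>/<keyspace>/<table>/` with `snapshots` and `backups` subdirectories inside each table directory, and the function names are made up for this example. Snapshot and incremental-backup files are normally hard links to live SSTables, so the per-category sums can count the same disk blocks more than once.

```python
import os
from collections import defaultdict


def classify(sub_parts):
    """Label a path below a table directory as 'snapshots', 'backups' or 'live'."""
    if "snapshots" in sub_parts:
        return "snapshots"
    if "backups" in sub_parts:
        return "backups"
    return "live"


def usage_by_table(data_dir):
    """Sum file sizes per '<keyspace>/<table>' directory, split by category."""
    totals = defaultdict(lambda: defaultdict(int))
    for root, _dirs, files in os.walk(data_dir):
        parts = os.path.relpath(root, data_dir).split(os.sep)
        if len(parts) < 2:
            continue  # still at the data_dir or keyspace level
        table = os.path.join(parts[0], parts[1])
        category = classify(parts[2:])
        for name in files:
            try:
                totals[table][category] += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished while scanning, e.g. removed by compaction
    return totals


def report(data_dir):
    """Print one line per table, largest total first."""
    totals = usage_by_table(data_dir)
    for table, cats in sorted(totals.items(), key=lambda kv: -sum(kv[1].values())):
        sizes = ", ".join(f"{c}={n / 1e9:.2f} GB" for c, n in sorted(cats.items()))
        print(f"{table}: {sizes}")
```

Calling `report("/var/lib/cassandra/data")` (adjust the path to match data_file_directories in cassandra.yaml) lists which tables, and which category within each table, hold the most bytes.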

Also, manually running major compactions is not advised.
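One common reason for this advice with STCS is that a major compaction leaves a single very large SSTable that then has no similar-sized peers to be compacted with, so expired data inside it can linger. The sketch below is a simplified illustration of size-tiered bucketing, not Cassandra's actual implementation. It assumes the documented defaults (bucket_low 0.5, bucket_high 1.5, min_threshold 4) and ignores min_sstable_size and max_threshold. Only the 84145170181-byte file comes from this thread; every other size is hypothetical.

```python
def stcs_buckets(sizes, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of similar size (simplified STCS)."""
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            # An SSTable joins a bucket if its size is close to the bucket average.
            if avg * bucket_low < size < avg * bucket_high:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Only buckets holding at least min_threshold SSTables are eligible."""
    return [b for b in stcs_buckets(sizes) if len(b) >= min_threshold]


GB = 10 ** 9
BIG = 84_145_170_181  # the large Data.db file reported in this thread
# Hypothetical sizes for the remaining SSTables, for illustration only.
sizes = [BIG, 6 * GB, 5 * GB, 3 * GB, 2 * GB, 1 * GB, 1 * GB, 1 * GB, 1 * GB]

candidates = compaction_candidates(sizes)
big_is_candidate = any(BIG in bucket for bucket in candidates)
print("eligible buckets:", candidates)
print("large SSTable eligible for compaction:", big_is_candidate)
```

With these made-up sizes, only the four small SSTables form an eligible bucket, and the large file sits alone in its own bucket until several SSTables of comparable size exist.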

On 12/03/2025 20:26, William Crowell via user wrote:

    Hi.  A few months ago, I upgraded a single node Cassandra instance
    from version 3 to 4.1.3.  This instance is not very large, with
    about 15 to 20 gigabytes of data on version 3, but after the
    upgrade it has gone up substantially to over 100 GB.  I run a
    compaction once a week and take a snapshot, but with the increase
    in data the compaction has become a much lengthier process.  I also
    ran sstableupgrade as part of the upgrade.  Any reason for the
    increased size of the database on the file system?

    I am using the default STCS compaction strategy.  My “nodetool
    cfstats” on a heavily used table looks like this:

    Keyspace : xxxxxxxx

      Read Count: 48089

      Read Latency: 12.52872569610514 ms

      Write Count: 1616682825

      Write Latency: 0.0067135265490310386 ms

      Pending Flushes: 0

              Table: sometable

              SSTable count: 13

              Old SSTable count: 0

              Space used (live): 104005524836

              Space used (total): 104005524836

              Space used by snapshots (total): 0

              Off heap memory used (total): 116836824

              SSTable Compression Ratio: 0.566085855123187

              Number of partitions (estimate): 14277177

              Memtable cell count: 81033

              Memtable data size: 13899174

              Memtable off heap memory used: 0

              Memtable switch count: 13171

              Local read count: 48089

              Local read latency: NaN ms

              Local write count: 1615681213

              Local write latency: 0.005 ms

              Pending flushes: 0

              Percent repaired: 0.0

              Bytes repaired: 0.000KiB

              Bytes unrepaired: 170.426GiB

              Bytes pending repair: 0.000KiB

              Bloom filter false positives: 125

              Bloom filter false ratio: 0.00494

              Bloom filter space used: 24656936

              Bloom filter off heap memory used: 24656832

              Index summary off heap memory used: 2827608

              Compression metadata off heap memory used: 89352384

              Compacted partition minimum bytes: 73

              Compacted partition maximum bytes: 61214

              Compacted partition mean bytes: 11888

              Average live cells per slice (last five minutes): NaN

              Maximum live cells per slice (last five minutes): 0

              Average tombstones per slice (last five minutes): NaN

              Maximum tombstones per slice (last five minutes): 0

              Dropped Mutations: 0

              Droppable tombstone ratio: 0.04983

    /This e-mail may contain information that is privileged or
    confidential. If you are not the intended recipient, please delete
    the e-mail and any attachments and notify us immediately./

*CAUTION:*This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
