Why not use the *nodetool import* tool? Create a new table, copy the SSTable files from the old table's directory into the new table's directory, and then run nodetool import:

nodetool import [keyspace] [target_table] [target_table_directory]

edi
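A minimal sketch of that procedure (the table name, schema, and directory paths are placeholders standing in for this thread's obscured schema; nodetool import requires the SSTables to match the target table's definition):

cqlsh -e "CREATE TABLE keyspace1.table1_new (blob1 blob, blob2 blob, blob3 blob, PRIMARY KEY (blob1, blob2));"

# stage a copy of the old table's SSTables where nodetool import can pick them up
mkdir /tmp/table1_sstables
cp /var/lib/cassandra/data/keyspace1/table1-<old_table_id>/nb-*-big-* /tmp/table1_sstables/

# import them into the new table (the files are moved into the live data directory)
nodetool import keyspace1 table1_new /tmp/table1_sstables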
On Mon, Mar 31, 2025 at 5:22 PM Stéphane Alleaume <tasmaniede...@free.fr> wrote:

Hi

I would do the same.

Kind regards
Stéphane

On 31 March 2025 16:13:01 GMT+02:00, "Durity, Sean R via user" <user@cassandra.apache.org> wrote:

I would use dsbulk for the unload and load of the data. It is very fast and specific to a table.

Sean R. Durity

From: William Crowell via user <user@cassandra.apache.org>
Sent: Monday, March 31, 2025 10:05 AM
To: user@cassandra.apache.org; Stéphane Alleaume <crystallo...@gmail.com>
Cc: William Crowell <wcrow...@perforce.com>; sc...@paradoxica.net; asf.telli...@gmail.com
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Good morning. What would be the best way to dump a large table (about 90GB), drop the table, recreate the table, and then reload the data?

I am looking through this documentation: https://cassandra.apache.org/doc/4.1/cassandra/operating/backups.html

To do the restore, it looks like I would need to take a snapshot and then drop the table, recreate the table with the CQL that Thomas provided, and then use either sstableloader or nodetool import to reload it. Does that make sense?

Regards,

William Crowell
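For reference, a rough sketch of the dsbulk round trip suggested above (dsbulk is the separately distributed DataStax Bulk Loader, not bundled with Cassandra; the dump directory is a placeholder, and at this size it needs at least the uncompressed size of the table free):

dsbulk unload -k keyspace1 -t table1 -url /backup/table1_dump

# drop and recreate the table, then:

dsbulk load -k keyspace1 -t table1 -url /backup/table1_dump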
From: William Crowell via user <user@cassandra.apache.org>
Date: Monday, March 24, 2025 at 3:07 PM
To: user@cassandra.apache.org; Stéphane Alleaume <crystallo...@gmail.com>
Cc: William Crowell <wcrow...@perforce.com>; sc...@paradoxica.net; asf.telli...@gmail.com
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Surprisingly, sstableexpiredblockers returned nothing. I am looking into recreating a new keyspace and table using LCS instead of STCS.

Regards,

William Crowell

From: William Crowell via user <user@cassandra.apache.org>
Date: Friday, March 21, 2025 at 7:36 AM
To: Stéphane Alleaume <crystallo...@gmail.com>; user@cassandra.apache.org
Cc: William Crowell <wcrow...@perforce.com>; sc...@paradoxica.net; asf.telli...@gmail.com
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Stéphane,

I will check. What issues could that cause if the driver on the client side was outdated?

Regards,

William Crowell

From: Stéphane Alleaume <crystallo...@gmail.com>
Date: Friday, March 21, 2025 at 7:32 AM
To: user@cassandra.apache.org
Cc: sc...@paradoxica.net; asf.telli...@gmail.com; William Crowell <wcrow...@perforce.com>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Hi,

I thought that LCS compaction was not advised for high-write workloads?

Do you know if the driver version has been updated on the client side?

Have a nice day.

Kind regards,
Stephane

On Fri, Mar 21, 2025 at 12:13, William Crowell via user <user@cassandra.apache.org> wrote:

Thomas and Scott,

Thank you both for your replies.

I am not noticing any exceptions in the logs, but I will check again. That description of table1 is from the current server, with the table and column names obscured, but that table definition is what we are running with. You are correct here… scalability was perhaps intended initially, but they never added nodes to the ring.

I would say that the increase in disk usage is around 10-15%: 90GB on 3.x to a little over 100GB with 4.x. I am wondering how much extra metadata is being stored with 4.x?

I am also wondering whether we would be better off creating a new keyspace and table using LCS instead of STCS, and LZ4 instead of Snappy, as Thomas suggested.

Regards,

William Crowell

From: C. Scott Andreas <sc...@paradoxica.net>
Date: Friday, March 21, 2025 at 12:21 AM
To: user@cassandra.apache.org
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

While it's true that COMPACT STORAGE was deprecated in Cassandra 4.0+, it was not removed. It's largely supported and works just fine for many use cases.

Feedback from users indicated that removing it entirely would be disruptive, because it required special effort from Cassandra users on upgrade to 4.x. Much of COMPACT STORAGE was reintroduced in Cassandra 4.x and works just fine, including in 5.0 and trunk.

You can find details on that work here: https://issues.apache.org/jira/browse/CASSANDRA-16217

Excerpting the docs added at the time:
https://github.com/apache/cassandra/commit/0a49d25078665da0ec30d9e69a036de163deb9c3#diff-6a259516d12c778b2b08ec408b884a88bf94d14a9f6dd8d943c6e4376288da6dR479-R480

––––––

A compact storage table is one defined with the COMPACT STORAGE option. This option is only maintained for backward compatibility for definitions created before CQL version 3 and shouldn't be used for new tables. 4.0 partially supports COMPACT STORAGE. There is no support for super column families. Since Cassandra 3.0, compact storage tables have exactly the same internal layout as non-compact ones (for the same schema, obviously), and declaring a table with this option creates limitations for the table which are largely arbitrary (and exist for historical reasons).

Amongst those limitations:

- A compact storage table cannot use collections nor static columns.
- If a compact storage table has at least one clustering column, then it must have *exactly* one column outside of the primary key ones. This implies you cannot add or remove columns after creation in particular.
- A compact storage table is limited in the indexes it can create, and no materialized view can be created on it.

––––––
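As an aside: if the aim is only to shed the legacy flag rather than rebuild the table, CQL also offers a one-way migration. A hedged sketch (verify support on your exact 4.1.x release first, and take a snapshot beforehand, since it cannot be reverted; keyspace1.table1 is this thread's obscured name):

cqlsh -e "ALTER TABLE keyspace1.table1 DROP COMPACT STORAGE;"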
On Mar 20, 2025, at 8:43 PM, Thomas Elliott <asf.telli...@gmail.com> wrote:

I'm now wondering how your Cassandra is starting with this table configuration. Part of migrating from 3.x should have dropped COMPACT STORAGE from your tables.

Do you find any exceptions when Cassandra starts up? Was the description of table1 that you shared from the current server, or from before the upgrade?

.thomas

On Thu, Mar 20, 2025 at 6:59 PM Thomas Elliott <asf.telli...@gmail.com> wrote:

William,

Longtime listener, first time poster.

Something stands out here for me: this table uses COMPACT STORAGE, which is deprecated in Cassandra 4.x and was a hold-over from the old Thrift API. I'm sure this is eliciting some gasps from the mailing list.

You might actually be better served by creating a new keyspace and table, using Leveled Compaction instead of STCS and LZ4 instead of Snappy.

After copying the data over to the new table, run a compaction and repair on the table, and I expect that you should see the footprint decrease.

What concerns me is that you have COMPACT STORAGE using this legacy Thrift API, which isn't compatible with 4.0. You also have wide columns and a large partition, which exacerbates the compact storage issue.

I have to keep coming back to the fact that this is a single instance, a single replica, and a small volume of data, so I shouldn't be thinking about the scalability of your data model. That said, your data footprint between Cassandra 3.x and 4.x shouldn't be a 10x difference. It should be somewhat bigger, because 4.0 introduces additional metadata, but not THAT much bigger.

I took a stab at a potential new schema for you, and this might work better...

CREATE TABLE keyspace1.table1 (
    blob1 blob,                      -- Partition key
    blob2 blob,                      -- Clustering key
    blob3 blob,                      -- Data column
    PRIMARY KEY (blob1, blob2)       -- Composite primary key
)
WITH CLUSTERING ORDER BY (blob2 ASC) -- Maintain clustering order
AND compaction = {
    'class': 'LeveledCompactionStrategy', -- Read-optimized; reduces space amplification
    'sstable_size_in_mb': '160'           -- Default size for LCS
}
AND compression = {
    'class': 'LZ4Compressor',
    'chunk_length_in_kb': '64'
}
AND caching = {
    'keys': 'ALL',                   -- Cache all partition keys
    'rows_per_partition': 'NONE'     -- Avoid caching rows to save memory
}
AND gc_grace_seconds = 864000
AND default_time_to_live = 0
AND speculative_retry = '99p'
AND bloom_filter_fp_chance = 0.01
AND crc_check_chance = 1.0
AND memtable_flush_period_in_ms = 0;

All the best,

.thomas
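A short sketch of those follow-up steps once the data has been copied into the new table (table1_new is a placeholder name; copy the data first, e.g. with dsbulk as suggested elsewhere in the thread):

nodetool compact keyspace1 table1_new
nodetool repair keyspace1 table1_new
nodetool tablestats keyspace1.table1_new   # compare "Space used" against the old table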
On Thu, Mar 20, 2025 at 8:36 AM Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> wrote:

Hello William,

Does your data get updated? You mentioned in one email that you switched from TTL 0 to 10 days. This might mean that some data are not deleted, because at some point in time they were written with no TTL (TTL set to 0). This has occurred in some systems in the past.

You may use the sstableexpiredblockers tool to check for this.

BR,
MK
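For reference, the tool ships in the tools directory of the Cassandra distribution and takes the keyspace and table as arguments (using this thread's placeholder names):

$CASSANDRA_HOME/tools/bin/sstableexpiredblockers keyspace1 table1

It lists the SSTables whose overlapping data is blocking fully expired SSTables from being dropped during compaction.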
From: William Crowell via user <user@cassandra.apache.org>
Sent: March 20, 2025 13:58
To: user@cassandra.apache.org
Cc: William Crowell <wcrow...@perforce.com>; Sartor Fabien (DIN) <fabien.sar...@etat.ge.ch>; tasmaniede...@free.fr
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Stephane and Fabien,

Good morning, and thank you for your replies. Your English is fine; I just appreciate the help.

Apache Cassandra was overkill for this application, and this was just something handed to me. The reason this is an issue is the limited amount of disk space that was specified when creating the volume Cassandra resides on, and it cannot be expanded. We still have plenty of disk space, but with this growth we will eventually run out. Trying to be proactive here.

I do not have any snapshots when running nodetool listsnapshots, nor backup data in the backup folder. It could be an actual increase in data size, and I am not aware of any data being injected into the table by anyone else. That's something to check into, though.

We initially created this table with Cassandra 3.x and then migrated to 4.1.3.

I will paste the table description below; I have changed the table and column names only. I would never define a table like this, with the primary key being two blobs. I would have at least created a uuid field as a primary key.

The compression ratio is not very good because there are 3 blob columns in the table. When I first looked at this, I felt LeveledCompactionStrategy would have been a better fit than STCS.

I do agree, and see that droppable tombstones are below 5%, which is expected and should not significantly impact the data size.

Have we recently deleted a large amount of data? Not to my knowledge. We define TTLs of 10 days, and Cassandra deletes the records for us.

How do you determine the data size before compaction? We look at the disk space being used and noticed the increase: df -H

Do these archives contain the full dataset before the update? Yes, they do.

I will try the suggestions you both mentioned. Here is the table definition:

cqlsh:keyspace1> desc table table1

/*
Warning: Table keyspace1.table1 omitted because it has constructs not compatible with CQL (was created via legacy API).
Approximate structure, for reference:
(this should not be used to reproduce this schema)

CREATE TABLE keyspace1.table1 (
    blob1 blob,
    blob2 blob,
    blob3 blob,
    PRIMARY KEY (blob1, blob2)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (blob2 ASC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'NONE', 'rows_per_partition': '120'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND memtable = 'default'
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';
*/

Regards,

William Crowell
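As a cross-check on df -H, Cassandra's own accounting can be compared against the filesystem. A sketch assuming the default data directory layout:

nodetool tablestats keyspace1.table1       # "Space used (live)" vs "Space used (total)"
du -sh /var/lib/cassandra/data/keyspace1/table1-*/

If "total" exceeds "live", obsolete SSTables are still awaiting deletion; if du reports more than both, the extra bytes sit outside the live set (snapshots, backups, or orphaned files).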
From: Sartor Fabien (DIN) via user <user@cassandra.apache.org>
Date: Thursday, March 20, 2025 at 6:56 AM
To: user@cassandra.apache.org
Cc: Sartor Fabien (DIN) <fabien.sar...@etat.ge.ch>
Subject: RE: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Dear William,

I'm also sorry for my poor English!

Compression ratio 0.566… I guess it means that if you have 1MB of data on your laptop, you get 0.566MB in Cassandra. If you are putting blobs in the table, that is OK! If it is plain English text, it seems low.

Here is what I understand for Bowen's questions below:

• snapshots could've been created automatically,
  o such as by dropping or truncating tables when auto_snapshot is set to true: SET TO TRUE
  o or by compaction when snapshot_before_compaction is set to true: SET TO FALSE
• backups, which could've been created automatically, e.g. when incremental_backups is set to true: SET TO FALSE
• mixing repaired and unrepaired sstables, which is usually caused by incremental repairs, even if only run once: no incremental repairs done
• partially upgraded cluster, e.g. mixed Cassandra versions in the same cluster: only 1 node in the cluster
• token ring change (e.g. adding or removing nodes) without "nodetool cleanup": only 1 node in the cluster
• changes made to the compression table properties: no change done

I don't have the answers to these questions below from Bowen:

• do you have any snapshots/backup data in the folder?
  o You can try to run: nodetool listsnapshots
• actual increase in the size of business data?
  o Maybe a developer used the prod environment to inject data and didn't notice it?

I started working in Cassandra administration a few months ago, so I may be wrong!

My team recently migrated Cassandra from version 3 to 4.1.4. They didn't observe this behavior. All tables use the Leveled Compaction Strategy, and we do not perform manual compactions.

What I can see is that droppable tombstones are below 5%, which is expected and should not significantly impact the data size. Do we agree with this statement?

Have you recently deleted a large amount of data?

Could you run the following command on the 80GB+ SSTable and some others?

$CASSANDRA_HOME/tools/bin/sstablemetadata

An SSTable of 80GB+ seems very large. An SSTable ensures that you don't have duplicate data inside, meaning this 80GB+ consists of useful data.

So my question is: how do you determine the data size before compaction?

I understand that you perform backups. Could you check the backup archives from before the update? Do these archives contain the full dataset before the update?

Additionally, try parsing the business data and verify that all the data is intact. You can use the following command to inspect a few records:

$CASSANDRA_HOME/tools/bin/sstabledump -d

Another possibility is to load all the SSTables into another Cassandra instance using:

$CASSANDRA_HOME/bin/sstableloader

Then check if you get the same data size.

Thank you,
Best regards,
Fabien
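For concreteness, both tools take the path to a Data.db file. A sketch using the large SSTable mentioned elsewhere in the thread (the exact directory name will differ per table ID):

$CASSANDRA_HOME/tools/bin/sstablemetadata /var/lib/cassandra/data/keyspace1/table1-*/nb-163033-big-Data.db

# -d dumps the internal row representation; sample a few records rather than all 84GB
$CASSANDRA_HOME/tools/bin/sstabledump -d /var/lib/cassandra/data/keyspace1/table1-*/nb-163033-big-Data.db | head -n 50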
From: Tasmaniedemon <tasmaniede...@free.fr>
Sent: Thursday, March 20, 2025 09:06
To: user@cassandra.apache.org
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Hi,

Could you give more details about the use of the tables and the modeling on this single-node Cassandra?

Did you begin using Cassandra on version 3, or had you already migrated from a previous version (2.x)?

To be honest, I would suggest using the latest release available, and rebuilding and reloading a fresh new cluster with a very low num_tokens (and 3 nodes :-)

May I ask why only a single-node Cassandra? Is scalability not intended?

Sorry for my poor English :-)

Kind regards,
Stephane

On 19/03/2025 at 14:15, William Crowell via user wrote:

Bowen, Fabien, Stéphane, and Luciano,

A bit more information here...

We have not run incremental repairs, and we have not made any changes to the compression properties on the tables.

When we first started the database, the TTL on the records was set to 0, but now it is set to 10 days.

We do have one table in a keyspace that is occupying 84.1GB of disk space:

ls -l /var/lib/cassandra/data/keyspace1/table1
…
-rw-rw-r--. 1 xxxxxxxx xxxxxxxxx 84145170181 Mar 18 08:28 nb-163033-big-Data.db
…

Regards,

William Crowell

From: William Crowell via user <user@cassandra.apache.org>
Date: Friday, March 14, 2025 at 10:53 AM
To: user@cassandra.apache.org
Cc: William Crowell <wcrow...@perforce.com>, Bowen Song <bo...@bso.ng>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

Bowen,

This is just a single Cassandra node. Unfortunately, I cannot get on the box at the moment, but the following configuration is in cassandra.yaml:

snapshot_before_compaction: false
auto_snapshot: true
incremental_backups: false

The only other configuration parameter that had been changed, other than the keystore and truststore, was num_tokens (default: 16):

num_tokens: 256

I also noticed the compression ratio on the largest table is not good: 0.566085855123187

Regards,

William Crowell

From: Bowen Song via user <user@cassandra.apache.org>
Date: Friday, March 14, 2025 at 10:13 AM
To: William Crowell via user <user@cassandra.apache.org>
Cc: Bowen Song <bo...@bso.ng>
Subject: Re: Increased Disk Usage After Upgrading From Cassandra 3.x.x to 4.1.3

A few suspects:

* snapshots, which could've been created automatically, such as by dropping or truncating tables when auto_snapshot is set to true, or by compaction when snapshot_before_compaction is set to true
* backups, which could've been created automatically, e.g. when incremental_backups is set to true
* mixing repaired and unrepaired sstables, which is usually caused by incremental repairs, even if they had only been run once
* partially upgraded cluster, e.g. mixed Cassandra versions in the same cluster
* token ring change (e.g. adding or removing nodes) without "nodetool cleanup"
* actual increase in data size
* changes made to the compression table properties

To find the root cause, you will need to check the file/folder sizes to find out what is using the extra disk space, and may also need to review the cassandra.yaml file (or post it here with sensitive information removed) and any actions you've made to the cluster prior to the first appearance of the issue.

Also, manually running major compactions is not advised.
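A hedged sketch of that file-level check, assuming the default /var/lib/cassandra layout:

nodetool listsnapshots
du -sh /var/lib/cassandra/data/*/*/snapshots 2>/dev/null   # snapshot space per table
du -sh /var/lib/cassandra/data/*/*/backups 2>/dev/null     # incremental backup space
find /var/lib/cassandra/data -name '*-Data.db' -size +10G -exec ls -lh {} \;   # unusually large SSTables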
On 12/03/2025 20:26, William Crowell via user wrote:

Hi. A few months ago, I upgraded a single-node Cassandra instance from version 3 to 4.1.3. This instance is not very large, with about 15 to 20 gigabytes of data on version 3, but after the upgrade that has gone up substantially, to over 100GB. I do a compaction once a week and take a snapshot, but with the increase in data, the compaction is a much lengthier process. I also ran sstableupgrade as part of the upgrade. Any reason for the increased size of the database on the file system?

I am using the default STCS compaction strategy. My "nodetool cfstats" on a heavily used table looks like this:

Keyspace : xxxxxxxx
    Read Count: 48089
    Read Latency: 12.52872569610514 ms
    Write Count: 1616682825
    Write Latency: 0.0067135265490310386 ms
    Pending Flushes: 0
        Table: sometable
        SSTable count: 13
        Old SSTable count: 0
        Space used (live): 104005524836
        Space used (total): 104005524836
        Space used by snapshots (total): 0
        Off heap memory used (total): 116836824
        SSTable Compression Ratio: 0.566085855123187
        Number of partitions (estimate): 14277177
        Memtable cell count: 81033
        Memtable data size: 13899174
        Memtable off heap memory used: 0
        Memtable switch count: 13171
        Local read count: 48089
        Local read latency: NaN ms
        Local write count: 1615681213
        Local write latency: 0.005 ms
        Pending flushes: 0
        Percent repaired: 0.0
        Bytes repaired: 0.000KiB
        Bytes unrepaired: 170.426GiB
        Bytes pending repair: 0.000KiB
        Bloom filter false positives: 125
        Bloom filter false ratio: 0.00494
        Bloom filter space used: 24656936
        Bloom filter off heap memory used: 24656832
        Index summary off heap memory used: 2827608
        Compression metadata off heap memory used: 89352384
        Compacted partition minimum bytes: 73
        Compacted partition maximum bytes: 61214
        Compacted partition mean bytes: 11888
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0
        Droppable tombstone ratio: 0.04983
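As a rough sanity check on these numbers (treating "Bytes unrepaired" as the approximate uncompressed data size, which holds here since nothing is repaired):

170.426 GiB ≈ 183,000,000,000 bytes uncompressed
183,000,000,000 × 0.566 (compression ratio) ≈ 103,600,000,000 bytes

That is within about 0.5% of "Space used (live)" = 104,005,524,836 bytes, so the ~104GB on disk is consistent with the reported compression ratio. The question is why the uncompressed dataset grew, not where the compressed bytes are going.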