I suspect that you are CPU bound rather than IO bound. There are a lot of areas to look into, but I would start with a few. I could not tell much from the results you shared, since no writes were happening at the time they were captured. Switching to a different compaction strategy will most likely make things worse for you: as of now you only touch one sstable per read, and STCS is the least expensive compaction strategy.
For starters:

1) Revise cassandra.yaml for the common disk settings, i.e., concurrent_reads, concurrent_writes, etc.:
https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html

2) Ensure that you optimize your OS for C*:
https://docs.datastax.com/en/dse/6.0/dse-admin/datastax_enterprise/config/configRecommendedSettings.html

What I would do next is monitor the system. The bottleneck you described is triggered by clients and is out of your control. So:

3) Monitor system resources. If you have DSE, use OpsCenter. Otherwise you can use dstat; something like 'dstat -taf' would do it. You will have to run this for a long period of time, until the timeouts occur, so that you get a general idea of which resources are saturating.

4) If it is CPU bound, reduce contention by setting concurrent_compactors to 1 in cassandra.yaml.

5) Monitor GC. There are a lot of tools you can use for this; most of the time it is the GC that is not tuned well. If you are not using G1GC, you might want to switch to it. You can read about GC briefly here:
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsTuneJVM.html
https://docs.datastax.com/en/dse-trblshoot/doc/troubleshooting/gcPauses.html

6) This may sound naive, but check the logs to see if there is anything interesting there; you can see the GC pauses there as well.

Ali Hubail
Petrolink International Ltd.

Confidentiality warning: This message and any attachments are intended only for the persons to whom this message is addressed, are confidential, and may be privileged. If you are not the intended recipient, you are hereby notified that any review, retransmission, conversion to hard copy, copying, modification, circulation or other use of this message and any attachments is strictly prohibited. If you receive this message in error, please notify the sender immediately by return email, and delete this message and any attachments from your system.
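To make the settings above concrete, here is a minimal cassandra.yaml fragment covering the parameters mentioned in steps 1 and 4, plus the compaction throttle discussed later in this thread. The values are illustrative starting points taken from the documentation's rules of thumb, not recommendations; tune them against your own hardware and monitoring results:

```yaml
# Illustrative cassandra.yaml fragment -- example values only;
# tune against your own hardware and measurements.

# Docs rule of thumb: roughly 16 x number of data drives for reads,
# 8 x number of CPU cores for writes.
concurrent_reads: 32
concurrent_writes: 32

# Step 4: on a CPU-bound node, reduce contention from background
# compaction by allowing only one compaction to run at a time.
concurrent_compactors: 1

# Throttle for compaction IO in MB/s (16 is a common default;
# 0 disables throttling entirely).
compaction_throughput_mb_per_sec: 16
```

After changing these, watch the same dstat/OpsCenter metrics again so you can tell whether the change actually moved the bottleneck.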
Petrolink International Limited its subsidiaries, holding companies and affiliates disclaims all responsibility from and accepts no liability whatsoever for the consequences of any unauthorized person acting, or refraining from acting, on any information contained in this message. For security purposes, staff training, to assist in resolving complaints and to improve our customer service, email communications may be monitored and telephone calls may be recorded.

rajasekhar kommineni <rajaco...@gmail.com> wrote on 09/20/2018 01:14 PM:

Hi Ali,

Please find my answers:

1) The table holds customer history data: we receive transaction data every day for multiple vendors, and a batch job is executed that updates the data if the customer made any transactions that day; an insert happens if it is a new customer. Reads happen when the customer visits, to calculate the relevancy of items based on the transactions he has made. I attached the tablestats & tablehistograms output to a file.
2) RAM: 30 GB, CPU: 4, hard drive: Amazon EBS
3) Attached output to file

Thanks,

On Sep 20, 2018, at 10:53 AM, Ali Hubail <ali.hub...@petrolink.com> wrote:

Hello Rajasekhar,

It's not really clear to me what your workload is. As I understand it, you do heavy writes, but what about reads? So, could you:

1) Execute:
nodetool tablestats
nodetool tablehistograms
nodetool compactionstats
We should be able to see the latency, the workload type, and the number of sstables used per read.
2) Specify your hardware specs, i.e., memory size, CPU, number of drives (for data sstables), and type of hard drives (SSD/HDD).
3) Share cassandra.yaml (make sure to sanitize it).

You have a lot of updates, and your data is most likely scattered across different sstables. Size-tiered compaction strategy (STCS) is much less expensive than leveled compaction strategy (LCS).
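As a rough sketch of what to look for in that nodetool output: the "SSTables" column of nodetool tablehistograms shows how many sstables a read touches at each percentile. The sample below is fabricated for illustration (the real column layout varies slightly between Cassandra versions), with a small awk one-liner to pull out the p99 figure:

```shell
# Fabricated sample of `nodetool tablehistograms <ks> <table>` output;
# the real layout varies slightly between Cassandra versions.
cat > histograms.txt <<'EOF'
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)
50%             1.00             24.60            372.00              1331                 42
75%             1.00             35.43            654.95              2299                 72
95%             2.00             61.21           1955.67              9887                310
99%             4.00             88.15           4055.27             24601                770
EOF

# Pull out the p99 sstables-per-read figure: a high value here means
# reads are touching many sstables, i.e. compaction is not keeping up
# or the strategy fits the workload poorly.
awk '$1 == "99%" { print "sstables per read (p99): " $2 }' histograms.txt
```

If p99 sstables-per-read stays near 1, as reported earlier in this thread, the read path itself is probably not the problem.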
Stopping background compaction should be approached with caution; I think your problem has more to do with why STCS compaction is taking more resources than you expect.

Regards,

Ali Hubail
Petrolink International Ltd

rajasekhar kommineni <rajaco...@gmail.com> wrote on 09/19/2018 04:44 PM:

Hello,

Can anyone respond to my questions? Is it a good idea to disable auto compaction and schedule it every 3 days? I am unable to control compaction, and it is causing timeouts. Also, will reducing or increasing compaction_throughput_mb_per_sec eliminate the timeouts?

Thanks,

> On Sep 17, 2018, at 9:38 PM, rajasekhar kommineni <rajaco...@gmail.com> wrote:
>
> Hello Folks,
>
> I need advice in deciding the compaction strategy for my C* cluster. There are multiple jobs that will load the data with fewer inserts and more updates, but no deletes.
> Currently I am using Size Tiered compaction, but I am seeing auto compactions kick in after the data load, and also read timeouts during compaction.
>
> Can anyone suggest a good compaction strategy for my cluster that will reduce the timeouts?
>
> Thanks,

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
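On the GC point raised upthread (steps 5 and 6): Cassandra's GCInspector logs long GC pauses to system.log, so grepping for them is a quick first check before reaching for heavier tooling. A small self-contained sketch — the log lines below are fabricated examples, and the exact message wording differs between Cassandra versions:

```shell
# Fabricated system.log excerpt; real GCInspector message wording
# differs between Cassandra versions.
cat > system.log <<'EOF'
INFO  [Service Thread] 2018-09-20 13:01:02,123 GCInspector.java:284 - ParNew GC in 412ms.
INFO  [ScheduledTasks:1] 2018-09-20 13:02:10,456 StatusLogger.java:51 - Pool Name  Active  Pending
WARN  [Service Thread] 2018-09-20 13:03:15,789 GCInspector.java:284 - ConcurrentMarkSweep GC in 2754ms.
EOF

# List GC pauses; multi-second pauses that line up in time with the
# client-side read timeouts are a strong hint that GC tuning, not
# compaction strategy, is the real problem.
grep GCInspector system.log
```

If the pause timestamps correlate with the timeout windows after the batch load, tune the JVM (or switch to G1GC, as suggested upthread) before changing the compaction strategy.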