Are there specific queries that are slow? Partition-key queries should have 
read latencies in the single-digit milliseconds (or faster). If that is not 
what you are seeing, I would first review the data model and queries to make 
sure that the data is modeled properly for Cassandra. Without metrics, I would 
start at 16-20 GB of RAM (JVM heap) for Cassandra on each node (or 31 GB if you 
can get 64 GB per host).

Since these are VMs, is there any chance they are competing for resources on 
the same physical host? In my (limited) VM experience, VMs can be 10x slower 
than physical hosts with local SSDs. (They don't have to be slower, but it can 
be harder to get visibility into the actual bottlenecks.)

I would also look to see what consistency level is being used with the queries. 
In most cases LOCAL_QUORUM or LOCAL_ONE is preferred.
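
To make the cost of each consistency level concrete, here is a minimal sketch of the replica arithmetic. The replication factor of 3 is an assumption (the thread does not state it), and the function names are purely illustrative, not part of any driver API.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def replicas_required(consistency_level: str, local_rf: int) -> int:
    """Replica responses needed in the local datacenter for one request."""
    level = consistency_level.upper()
    if level == "LOCAL_ONE":
        return 1
    if level == "LOCAL_QUORUM":
        return quorum(local_rf)
    raise ValueError(f"unsupported consistency level: {consistency_level}")


# Assumption: replication factor 3 in the local datacenter.
for level in ("LOCAL_ONE", "LOCAL_QUORUM"):
    print(level, replicas_required(level, local_rf=3))
```

Under that assumption, LOCAL_ONE waits for one replica and LOCAL_QUORUM for two, so a single slow node hurts a LOCAL_QUORUM request less than it would a request that has to wait for all three replicas.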

Does the app use prepared statements that are only prepared once per app 
invocation? Are there any lightweight transactions (LWT, e.g. "IF EXISTS") in 
your code?
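
As a rough illustration of "prepared once per app invocation", the sketch below caches prepared statements by their CQL text. It assumes only that the session object exposes a prepare(cql) method, as the Session class in the DataStax Python driver does. The PreparedStatementCache class itself is hypothetical, not something the driver provides.

```python
import threading


class PreparedStatementCache:
    """Prepare each distinct CQL string once and reuse the result."""

    def __init__(self, session):
        # Assumption: `session` has a prepare(cql) method that returns
        # a reusable prepared-statement object.
        self._session = session
        self._statements = {}
        self._lock = threading.Lock()

    def get(self, cql):
        """Return the cached statement, preparing it on first use only."""
        with self._lock:
            statement = self._statements.get(cql)
            if statement is None:
                statement = self._session.prepare(cql)
                self._statements[cql] = statement
            return statement
```

The application would create one cache next to its long-lived session and call cache.get(...) wherever it executes a query, instead of calling prepare() inside the request path.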


Sean Durity

From: Attila Wind <attilaw@swf.technology>
Sent: Friday, March 5, 2021 9:48 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] underutilized servers


Hi guys,

I have a DevOps related question - hope someone here could give some 
ideas/pointers...

We are running a 3-node Cassandra cluster.
Recently we realized we have performance issues, and based on our 
investigation it seems the bottleneck is the Cassandra cluster. The application 
layer spends a lot of time waiting for Cassandra operations. So queries are 
running slowly on the Cassandra side, yet according to our monitoring the 
Cassandra servers still have lots of free resources...

The Cassandra machines are virtual machines (we own the physical hosts too) 
built with KVM, each with 6 CPU cores (3 physical) and 32 GB RAM dedicated to it.
We are using Ubuntu Linux 18.04 everywhere, the same version on both the 
physical and virtual hosts.
We are running Cassandra 4.0-alpha4.

What we see is

  *   CPU load is around 20-25% - so we have lots of spare capacity
  *   iowait is around 2-5% - so disk bandwidth should be fine
  *   network load is around 50% of the full available bandwidth
  *   loadavg peaks around 4 - 4.5 but is typically around 3 (with 6 CPUs, a 
loadavg of 6 would represent 100% load)
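
The load-average reasoning in the last bullet can be written out as a small calculation. This is only a sketch of that arithmetic, using the figures quoted above. The function name is illustrative.

```python
def load_utilization(loadavg: float, cpu_count: int) -> float:
    """Express a load average as a percentage of the available CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return 100.0 * loadavg / cpu_count


# Figures from the message: 6 CPUs, typical loadavg ~3, peak ~4.5.
for label, value in (("typical", 3.0), ("peak", 4.5)):
    print(f"{label}: {load_utilization(value, 6):.0f}%")
# prints:
# typical: 50%
# peak: 75%
```

Note that on Linux the load average also counts tasks in uninterruptible sleep (typically waiting on I/O), so it can sit well above the reported CPU percentage without the CPUs being busy.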

and still, query performance is slow ... and we do not understand what could be 
holding Cassandra back from fully utilizing the server resources...

We are clearly missing something!
Does anyone have any ideas or tips?

thanks!
--
Attila Wind

http://www.linkedin.com/in/attilaw 
Mobile: +49 176 43556932

