Re: How to debug node load unbalance

2021-03-05 Thread Lapo Luchini

Thanks for the explanation, Kane!

In case anyone is curious I decommissioned node7 and things re-balanced 
themselves automatically: https://i.imgur.com/EOxzJu9.png

(node8 received 422 GiB, while the others received 82-153 GiB each,
as reported by "nodetool netstats -H")

Lapo

On 2021-03-03 23:59, Kane Wilson wrote:
Well, that looks like your problem. They are logical racks, and they come 
into play when NetworkTopologyStrategy is deciding which replicas to put 
data on. NTS ensures the next replica goes on the first node in a different 
rack when traversing the ring, the idea being to keep at most one replica 
per rack (so that a whole rack can go down without you losing QUORUM).
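
For anyone curious how "first node in a different rack when traversing the
ring" plays out, here is a deliberately simplified Python sketch of
rack-aware replica selection. It is NOT Cassandra's actual implementation
(which also handles datacenters, vnodes and several edge cases); it just
shows why a node that sits alone in its own rack attracts far more data
than the others:

# Simplified sketch of rack-aware replica placement, NetworkTopologyStrategy-style.
# Not Cassandra's real code: ignores datacenters, vnodes and many edge cases.

def pick_replicas(ring, start_index, rf):
    """ring: list of (node, rack) tuples in token order; rf: replication factor."""
    replicas, racks_used, skipped = [], set(), []
    i = start_index
    while len(replicas) < rf and len(replicas) + len(skipped) < len(ring):
        node, rack = ring[i % len(ring)]
        if rack not in racks_used:
            replicas.append(node)       # first node of an unused rack gets the replica
            racks_used.add(rack)
        else:
            skipped.append(node)        # rack already holds a replica; remember in ring order
        i += 1
    # Fewer racks than rf: fall back to the skipped nodes in ring order.
    replicas.extend(skipped[: rf - len(replicas)])
    return replicas

# Toy example: node7 is alone in rack "r3", so with RF=3 it ends up holding a
# replica for every token range - one way a single node becomes overloaded.
ring = [("node1", "r1"), ("node2", "r2"), ("node3", "r1"), ("node4", "r2"),
        ("node5", "r1"), ("node6", "r2"), ("node7", "r3"), ("node8", "r1")]
for start in range(len(ring)):
    print(start, pick_replicas(ring, start, rf=3))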






underutilized servers

2021-03-05 Thread Attila Wind

Hi guys,

I have a DevOps related question - hope someone here could give some 
ideas/pointers...


We are running a 3-node Cassandra cluster.
Recently we realized we have performance issues, and based on the 
investigation we did, it seems our bottleneck is the Cassandra cluster. 
The application layer is waiting a lot on Cassandra ops. So queries are 
running slow on the Cassandra side, yet according to our monitoring the 
Cassandra servers still have lots of free resources...


The Cassandra machines are virtual machines (we own the physical 
hosts too) built with KVM - with 6 CPU cores (3 physical) and 32 GB RAM 
dedicated to each.
We are using the Ubuntu Linux 18.04 distro - the same version everywhere 
(physical and virtual hosts)

We are running Cassandra 4.0-alpha4

What we see is

 * CPU load is around 20-25% - so we have lots of spare capacity
 * iowait is around 2-5% - so disk bandwidth should be fine
 * network load is around 50% of the full available bandwidth
 * loadavg is max around 4 - 4.5 but typically around 3 (with 6 cores, a
   loadavg of 6 would represent 100% load)

and still, query performance is slow ... and we do not understand what 
could hold Cassandra back from fully utilizing the server resources...


We are clearly missing something!
Anyone any idea / tip?

thanks!

--
Attila Wind

http://www.linkedin.com/in/attilaw 
Mobile: +49 176 43556932




RE: underutilized servers

2021-03-05 Thread Durity, Sean R
Are there specific queries that are slow? Partition-key queries should have 
read latencies in the single digits of ms (or faster). If that is not what you 
are seeing, I would first review the data model and queries to make sure that 
the data is modeled properly for Cassandra. Without metrics, I would start at 
16-20 GB of RAM for Cassandra on each node (or 31 GB if you can get 64 GB per 
host).

Since these are VMs, is there any chance they are competing for resources on 
the same physical host? In my (limited) VM experience, VMs can be 10x slower 
than physical hosts with local SSDs. (They don't have to be slower, but it can 
be harder to get visibility to the actual bottlenecks.)

I would also look to see what consistency level is being used with the queries. 
In most cases LOCAL_QUORUM or LOCAL_ONE is preferred.

Does the app use prepared statements that are only prepared once per app 
invocation? Any LWT/"if exists" in your code?
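
For illustration, here is a minimal sketch of that prepared-statement
pattern with an explicit consistency level, using the Python driver. The
contact points, keyspace, table and query are made up for the example:

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel

# Hypothetical contact points / keyspace, just for illustration.
cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect("my_keyspace")

# Prepare ONCE per application instance (e.g. at startup), not per request.
get_user = session.prepare("SELECT * FROM users WHERE user_id = ?")
get_user.consistency_level = ConsistencyLevel.LOCAL_ONE  # or LOCAL_QUORUM

def load_user(user_id):
    # Reuse the prepared statement; only the bind values change per call.
    return session.execute(get_user, [user_id]).one()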


Sean Durity







Re: underutilized servers

2021-03-05 Thread Bowen Song
Based on my personal experience, the combination of slow read queries 
and low CPU usage is often an indicator of bad table schema design 
(e.g. large partitions) or bad queries (e.g. without a partition key). 
Check the Cassandra logs first: is there any long stop-the-world GC? 
Tombstone warnings? Anything else out of the ordinary? Check the 
output of "nodetool tpstats": are there any pending or blocked tasks? 
Which thread pool(s) are they in? Is there a high number of dropped 
messages? If you can't find anything useful in the Cassandra server 
logs and "nodetool tpstats", grab a few slow queries from your 
application's log and run them manually in cqlsh. Are the results 
very large? How long do they take?
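
If cqlsh is awkward to reach from where the application runs, the same
check can be done with a small Python script using the driver's built-in
tracing (the contact point, table and query below are hypothetical -
substitute one of the slow queries from your application log). The trace
shows where the time goes, much like "TRACING ON" in cqlsh:

import time
from cassandra.cluster import Cluster

cluster = Cluster(["10.0.0.1"])   # hypothetical contact point
session = cluster.connect()
query = "SELECT * FROM my_keyspace.events WHERE account_id = 42"

start = time.time()
result = session.execute(query, trace=True)  # ask the server to record a trace
rows = list(result)                           # materialise to measure the full fetch
print(f"{len(rows)} rows in {time.time() - start:.3f}s")

# Server-side breakdown: SSTables read, tombstones scanned, queueing time, etc.
trace = result.get_query_trace()
print(f"coordinator={trace.coordinator}, total duration={trace.duration}")
for event in trace.events:
    print(f"{event.source_elapsed}  {event.source}  {event.description}")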



Regarding some of your observations:

> CPU load is around 20-25% - so we have lots of spare capacity

Is it a small number of threads each using nearly 100% of a CPU core? If 
so, what are those threads? (I find the ttop command from the sjk tool 
very helpful)


> network load is around 50% of the full available bandwidth

This sounds alarming to me. May I ask what the full available bandwidth 
is? Do you have a lot of CPU time spent in sys (vs. user) mode?






Re: underutilized servers

2021-03-05 Thread Attila Wind

Thanks for the answers @Sean and @Bowen !!!

First of all, this article describes something very similar to what we are 
experiencing - let me share it:

https://www.senticore.com/overcoming-cassandra-write-performance-problems/
we are studying that now

Furthermore

 * yes, we have some unbalanced data which needs to be improved - this is
   on our backlog, so it should get done
 * and yes, we can clearly see that this unbalanced data is slowing down
   everything in Cassandra (there is proof of it in our
   Prometheus+Grafana based monitoring)
 * we will definitely do this optimization now (luckily we already have
   a plan)

@Sean:

 * "Since these are VMs, is there any chance they are competing for
   resources on the same physical host?"
   We are splitting the physical hardware into 2 VMs - and resources
   (cpu cores, disks, ram) all assigned in a dedicated fashion to the
   VMs without intersection
   BUT!!
   You are right... There is one thing we are sharing: network
   bandwidth... and actually that one does not come up in the "iowait"
   part for sure. We will further analyze into this direction
   definitely because from the monitoring as far as I see yeppp, we
   might hit the wall here
 * consistency level: we are using LOCAL_ONE
 * "Does the app use prepared statements that are only prepared once
   per app invocation?"
   Yes and yes :-)
 * "Any LWT/”if exists” in your code?"
   No. We go with RF=2 so we even can not use this (as LWT goes with
   QUORUM and in our case this would mean we could not tolerate losing
   a node... not good... so no)

@Bowen:

 * The bandwidth limit is 1 Gbit/sec (so roughly 120 MB/sec) BUT it is the
   limit of the physical host - so our 2 VMs are competing here. Possibly
   the Cassandra VM gets ~50-70% of it... (rough numbers in the sketch
   after this list)
 * The CPU's "system" value shows 8-12%
 * "nodetool tpstats"
   whooa I never used it, we definitely need some learning here to even
   understand the output... :-) But I copy that here to the bottom ...
   maybe clearly shows something to someone who can read it...
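
To put rough numbers on the shared-bandwidth concern above, a quick
back-of-the-envelope sketch (the link speed and the 60% share are
assumptions, not measurements):

# Rough arithmetic for the shared 1 Gbit/s link (assumed numbers).
link_gbit_per_s = 1.0
link_mb_per_s = link_gbit_per_s * 1000 / 8            # ~125 MB/s theoretical maximum
cassandra_share = 0.6                                  # assume the Cassandra VM gets ~50-70%
cassandra_mb_per_s = link_mb_per_s * cassandra_share   # ~75 MB/s left for Cassandra

# That budget is shared by client traffic, replication (RF=2 means every
# write also travels to a second node) and any repair/streaming activity.
print(f"link: ~{link_mb_per_s:.0f} MB/s, Cassandra VM: ~{cassandra_mb_per_s:.0f} MB/s")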

so, "nodetool tpstats" from one of the nodes

Pool Name                      Active  Pending  Completed  Blocked  All time blocked
ReadStage                           0        0    6248406        0                 0
CompactionExecutor                  0        0     168525        0                 0
MutationStage                       0        0   25116817        0                 0
MemtableReclaimMemory               0        0      17636        0                 0
PendingRangeCalculator              0        0          7        0                 0
GossipStage                         0        0     324388        0                 0
SecondaryIndexManagement            0        0          0        0                 0
HintsDispatcher                     1        0         75        0                 0
Repair-Task                         0        0          1        0                 0
RequestResponseStage                0        0   31186150        0                 0
Native-Transport-Requests           0        0   22827219        0                 0
CounterMutationStage                0        0   12560992        0                 0
MemtablePostFlush                   0        0      19259        0                 0
PerDiskMemtableFlushWriter_0        0        0      17636        0                 0
ValidationExecutor                  0        0         48        0                 0
Sampler                             0        0          0        0                 0
ViewBuildExecutor                   0        0          0        0                 0
MemtableFlushWriter                 0        0      17636        0                 0
InternalResponseStage               0        0      44658        0                 0
AntiEntropyStage                    0        0        161        0                 0
CacheCleanupExecutor                0        0          0        0                 0


Message type        Dropped  Latency waiting in queue (micros)
                                 50%      95%        99%        Max
READ_RSP                 18  1629.72  8409.01  155469.30  386857.37
RANGE_REQ                 0     0.00     0.00       0.00       0.00
PING_REQ                  0     0.00     0.00       0.00       0.00
_SAMPLE                   0     0.00     0.00       0.00       0.00
VALIDATION_RSP            0     0.00     0.00       0.00       0.00
SCHEMA_PULL_RSP          0     0.00     0.00       0.00       0.00
SYNC_RSP                  0     0.00     0.00       0.00       0.00
SCHEMA_VERSION_REQ       0     0.00     0.00       0.00       0.00
HINT_RSP                 0   943.13  3379.39    5839.59   52066.35
BATCH_REMOVE_RSP         0     0.0

Re: underutilized servers

2021-03-05 Thread daemeon reiydelle
You did not specify read and write consistency levels; the default would be
to hit two nodes (one for data, one for a digest) with every read. A network
load figure of 50% is not too helpful on its own: 1 Gbit? 10 Gbit? 50% of
each direction, or an average of both?
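
As a rough illustration of that "one data + one digest" read path (a
simplified sketch for a two-replica read, not Cassandra's actual code; the
"newest wins" reconciliation is a stand-in):

import hashlib

def digest(row: bytes) -> str:
    return hashlib.md5(row).hexdigest()

def coordinator_read(replicas, key):
    data_replica, digest_replica = replicas[0], replicas[1]
    row = data_replica[key]                      # full data request to one replica
    remote_digest = digest(digest_replica[key])  # digest-only request to the other
    if digest(row) != remote_digest:
        # Replicas disagree: fetch full data from both and reconcile (read repair).
        row = max(data_replica[key], digest_replica[key])  # stand-in for "newest wins"
        print("digest mismatch -> read repair")
    return row

# Two toy replicas that disagree on one key.
r1 = {"k": b"value-v2"}
r2 = {"k": b"value-v1"}
print(coordinator_read([r1, r2], "k"))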

Iowait is not great for a system of this size: assuming that you have 3
VMs on THREE SEPARATE physical systems and WITHOUT network-attached storage
...


*Daemeon Reiydelle*
*email: daeme...@gmail.com *
*LI: https://www.linkedin.com/in/daemeonreiydelle/
*
*San Francisco 1.415.501.0198/Skype daemeon.c.m.reiydelle*

"Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming “Wow! What a Ride!" - Hunter S. Thompson




Re: underutilized servers

2021-03-05 Thread Erick Ramirez
The tpstats you posted show that the node is dropping reads and writes,
which means your disk can't keep up with the load - i.e. the disk is the
bottleneck. If you haven't already, place data and commitlog on separate
disks so they're not competing for the same IO bandwidth. Note that it's
OK to have them on the same disk/volume if you have NVMe SSDs, since those
are a lot more difficult to saturate.

The challenge with monitoring is that it typically only checks disk stats
every 5 minutes (for example). But your app traffic is bursty in nature, so
stats averaged over a period of time are irrelevant; the only thing that
matters is what the disk IO is at the time you hit peak loads.
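
One cheap way to catch that burstiness is to sample the dropped-message
counters more often than the usual scrape interval. A rough Python sketch
(assumes it runs on the Cassandra host with nodetool on the PATH; the poll
interval is arbitrary):

import re
import subprocess
import time

# Poll 'nodetool tpstats' every few seconds and report message types whose
# "Dropped" counter increased - a sign the node was overloaded at that moment.
DROPPED_LINE = re.compile(r"^([A-Z_]+)\s+(\d+)\s", re.MULTILINE)

def dropped_counts():
    out = subprocess.run(["nodetool", "tpstats"], capture_output=True, text=True).stdout
    return {msg: int(n) for msg, n in DROPPED_LINE.findall(out)}

previous = dropped_counts()
while True:
    time.sleep(10)
    current = dropped_counts()
    for msg, count in current.items():
        delta = count - previous.get(msg, 0)
        if delta > 0:
            print(f"{time.strftime('%H:%M:%S')} {msg}: +{delta} dropped")
    previous = current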

The dropped reads and mutations tell you the node is overloaded. Provided
your nodes are configured correctly, the only way out of this situation is
to correctly size your cluster and add more nodes -- your cluster needs to
be sized for peak loads, not average throughput. Cheers!