On 19/03/2017 02:54, S G wrote:
Forgot to mention that this vmstat picture is for the client-cluster
reading from Cassandra.
Hi SG,
Your numbers are low: 15k req/sec would be OK for a single node, but for
a 12-node cluster something is going wrong. How do you measure the
throughput?
As suggested by others, to achieve good results you have to add threads
and client VMs. Cassandra scales horizontally, not vertically, i.e. the
performance of each single node will not go up, but if you spread the
load by adding nodes, the global cluster performance will.
Theoretically,
assuming the data and the load are spread over the cluster (*1)
and, from what you said, each request takes 2 ms on average (*2),
you should get 500 req/sec in each thread (1000 ms / 2 ms),
40 threads should reach 20k req/sec on each client VM stress application (*3)
and 10 client VMs should reach 200k req/sec on the whole cluster (*4)
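
To make the arithmetic above explicit, here is a small Python sketch. It
assumes each thread issues requests synchronously (one at a time, waiting
for the answer before sending the next one) and that nothing else is the
bottleneck. The function name and its parameters are only illustrative,
they do not come from any Cassandra tool.

```python
def estimate_throughput(avg_latency_ms, threads_per_vm, client_vms):
    """Back-of-the-envelope throughput estimate, in requests per second.

    Assumption: every thread is synchronous, so it completes
    1000 / avg_latency_ms requests per second.
    """
    per_thread = 1000.0 / avg_latency_ms   # req/sec for one thread
    per_vm = per_thread * threads_per_vm   # req/sec for one client VM
    cluster = per_vm * client_vms          # req/sec over all client VMs
    return per_thread, per_vm, cluster


if __name__ == "__main__":
    per_thread, per_vm, cluster = estimate_throughput(2.0, 40, 10)
    print(f"per thread: {per_thread:.0f} req/sec")
    print(f"per VM:     {per_vm:.0f} req/sec")
    print(f"cluster:    {cluster:.0f} req/sec")
```

With 2 ms, 40 threads and 10 VMs this gives the 500 / 20k / 200k figures
above. It is an upper bound on what the clients can ask for, not a promise
of what the cluster will deliver.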
=====
(*1) the partition key (first PK column) must spread the data across all
nodes, and your testing code must spread the load by selecting evenly
spread data.
(This point is very important: can you give information on your schema
and your data?)
(*2) to achieve better single-client throughput, maybe you could
prepare the requests, since you are always executing the same requests
(*3) => run more client test applications on each VM
(*4) add more client VMs (Patrick's suggestion)
With (*3) and (*4) the throughput of each client will not get better, but
the global cluster throughput will.
=====
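
To illustrate point (*1), here is a toy Python simulation of how requests
land on 12 nodes when the test reads evenly spread keys versus always the
same key. It is only a sketch under loud assumptions: it uses an MD5 hash
modulo the node count as a stand-in for the real partitioner, and it
ignores the token ring and replication. The key names are made up.

```python
import hashlib
from collections import Counter

NODES = 12  # cluster size mentioned in this thread


def node_for(key, nodes=NODES):
    """Map a partition key to a node index.

    Simplification: hash modulo node count, NOT the real Cassandra
    token ring or partitioner.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


def load_per_node(keys, nodes=NODES):
    """Count how many requests each node would receive."""
    counts = Counter(node_for(k, nodes) for k in keys)
    return [counts.get(n, 0) for n in range(nodes)]


# Test reading many different keys: the load is shared by all nodes.
spread = load_per_node(f"user-{i}" for i in range(12000))

# Test reading the same key again and again: one node takes everything.
hot = load_per_node("user-42" for _ in range(12000))

if __name__ == "__main__":
    print("evenly spread keys:", spread)
    print("single hot key:    ", hot)
```

If the stress test looks like the second case, adding nodes or clients
will not raise the global throughput, because the same replicas serve
every request.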
There are other factors to take into account if you are also writing to
the cluster: read path, tombstones, replication, repairs, etc., but
that's not the case here?
Performance testing goes to the limit of our understanding of the system
and is very difficult
... hence interesting :)
--
best,
Alain