If throughput decreases as you add more load, it's probably due to disk 
latency. Can you test with SSDs? Are you using VMware ESXi?
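If you want to confirm whether the disk is the problem, watching per-device stats during the stress run usually shows it. A minimal sketch, assuming the sysstat package is installed; look at the await/w_await and %util columns for the commitlog device:

# extended per-device I/O stats, refreshed every 5 seconds
iostat -x 5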

--
Jacques-Henri Berthemet

From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 2:15 PM
To: user <user@cassandra.apache.org>
Subject: RE: yet another benchmark bottleneck

As I mentioned, I already tested increasing client threads, running many stress-client 
instances on one node, and running two stress clients on two separate nodes; in all 
cases the sum of the throughputs is less than 130K. I've been tuning all aspects of the 
OS and Cassandra (whatever I've seen in the config files!) for two days, still no luck!


---- On Mon, 12 Mar 2018 16:38:22 +0330 Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote ----

What happens if you increase the number of client threads?
Can you add another instance of cassandra-stress on another host?
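For example, something along these lines run from a second client host as well (the thread count here is just a value to experiment with, not a recommendation):

cassandra-stress write n=1000000 -rate threads=500 -mode native cql3 -node X.X.X.X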

--
Jacques-Henri Berthemet

From: onmstester onmstester 
[mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 12:50 PM
To: user <user@cassandra.apache.org>
Subject: RE: yet another benchmark bottleneck

No luck even with 320 threads for write.


---- On Mon, 12 Mar 2018 14:44:15 +0330 Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote ----

It makes more sense now; 130K is not that bad.

According to cassandra.yaml, you should be able to increase the number of write 
threads in Cassandra:
# On the other hand, since writes are almost never IO bound, the ideal
# number of "concurrent_writes" is dependent on the number of cores in
# your system; (8 * number_of_cores) is a good rule of thumb.
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32

Jumping directly to 160 (8 x your 20 cores) would be a bit high with spinning disks; 
maybe start with 64 just to see if it gets better.
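For example, on a package install where cassandra.yaml lives under /etc/cassandra (adjust the path for your setup, and assuming the value is still at the default 32):

sudo sed -i 's/^concurrent_writes: 32$/concurrent_writes: 64/' /etc/cassandra/cassandra.yaml
# drain and restart the node so the new value takes effect
nodetool drain && sudo systemctl restart cassandra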

--
Jacques-Henri Berthemet

From: onmstester onmstester 
[mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 12:08 PM
To: user <user@cassandra.apache.org>
Subject: RE: yet another benchmark bottleneck

RF=1
No errors or warnings.
Actually it's 300 Mbit/s and 130K op/s; I missed a 'K' in the first mail. But anyway, 
the point is: more than half of the node's resources (CPU, memory, disk, network) are 
unused and I can't increase write throughput.


---- On Mon, 12 Mar 2018 14:25:12 +0330 Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote ----

Any errors/warnings in the Cassandra logs? What’s your RF?
Using 300 MB/s of network bandwidth for only 130 op/s looks very high.

--
Jacques-Henri Berthemet

From: onmstester onmstester 
[mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 11:38 AM
To: user <user@cassandra.apache.org>
Subject: RE: yet another benchmark bottleneck

1.2 TB, 15K RPM.
Latency reported by the stress tool is 7.6 ms; disk latency is 2.6 ms.


---- On Mon, 12 Mar 2018 14:02:29 +0330 Jacques-Henri Berthemet 
<jacques-henri.berthe...@genesys.com> wrote ----

What’s your disk latency? What kind of disk is it?

--
Jacques-Henri Berthemet

From: onmstester onmstester 
[mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 10:48 AM
To: user <user@cassandra.apache.org>
Subject: Re: yet another benchmark bottleneck

Running two instances of Apache Cassandra on the same server, each with its own 
commit log disk, did not help. The combined CPU/RAM usage of both instances is 
less than half of the available resources, disk usage is less than 20%, and 
network is still less than 300 Mb in Rx.


---- On Mon, 12 Mar 2018 09:34:26 +0330 onmstester onmstester 
<onmstes...@zoho.com> wrote ----

Apache Cassandra 3.11.1
Yes, I'm doing a single-host test.


---- On Mon, 12 Mar 2018 09:24:04 +0330 Jeff Jirsa 
<jji...@gmail.com> wrote ----



It would help to know your version. 130 ops/second sounds like a ridiculously low 
rate. Are you doing a single-host test?

On Sun, Mar 11, 2018 at 10:44 PM, onmstester onmstester 
<onmstes...@zoho.com> wrote:


I'm going to benchmark Cassandra's write throughput on a node with the following 
spec:

  *   CPU: 20 cores
  *   Memory: 128 GB (32 GB as Cassandra heap)
  *   Disk: 3 separate disks for OS, data, and commitlog
  *   Network: 10 Gb (tested with iperf)
  *   OS: Ubuntu 16

Running cassandra-stress:
cassandra-stress write n=1000000 -rate threads=1000 -mode native cql3 -node X.X.X.X

From two nodes with the same spec as above, I cannot get throughput above 130 
op/s. The clients are using less than 50% of their CPU, and the Cassandra node uses:

  *   60% of CPU
  *   30% of memory
  *   30-40% util in iostat for the commitlog disk
  *   300 Mb of network bandwidth

I suspect the network, because no matter how many clients I run, Cassandra always 
uses less than 300 Mb. I've done all the tuning mentioned by DataStax.
Increasing wmem_max and rmem_max did not help either.
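(The wmem_max/rmem_max change was of this general form; the values shown here are only illustrative, not the exact ones from my test:)

sudo sysctl -w net.core.wmem_max=16777216
sudo sysctl -w net.core.rmem_max=16777216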

