Re: Benchmarking Cassandra with YCSB

Thibaut Britz Tue, 15 Feb 2011 11:59:49 -0800

Cassandra is very CPU-hungry, so you might be hitting a CPU bottleneck.
What's your CPU usage during these tests?
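
If you don't have monitoring in place, something like the following
gives a rough number. It's only a minimal Python sketch, assuming a
Linux node where /proc/stat is readable. The function names are
illustrative, and treating iowait as idle time is my own choice, so
adjust as needed. Run it on each node while the benchmark is going:

```python
# Sketch: CPU utilisation between two samples of the aggregate "cpu" line
# in /proc/stat (Linux only). Field order on that line is:
#   user nice system idle iowait irq softirq steal guest guest_nice
import time


def parse_cpu_line(line):
    """Return (busy_jiffies, total_jiffies) from a '/proc/stat' 'cpu ...' line."""
    # Only the first 8 counters are summed: guest time is already
    # included in user/nice, so adding it again would double-count.
    fields = [int(x) for x in line.split()[1:9]]
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
    total = sum(fields)
    return total - idle, total


def cpu_utilisation(before, after):
    """Fraction of time (0.0 to 1.0) the CPUs were busy between two samples."""
    busy0, total0 = parse_cpu_line(before)
    busy1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    return 0.0 if elapsed <= 0 else (busy1 - busy0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which aggregates all cores."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # sampling interval; an arbitrary choice
    second = read_cpu_line()
    print(f"CPU busy: {100 * cpu_utilisation(first, second):.1f}%")
```

If the busy figure stays close to 100% while throughput is flat, the
CPU is the likely limit.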



On Tue, Feb 15, 2011 at 8:45 PM, Markus Klems <mar...@klems.eu> wrote:
> Hi there,
>
> we are currently benchmarking a Cassandra 0.6.5 cluster with 3
> High-Mem Quadruple Extra Large EC2 nodes
> (http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
> (replication factor is 3, random partitioner). We assigned 32 GB RAM
> to the JVM and left 32 GB RAM for the Ubuntu Linux filesystem buffer.
> We also raised the max user processes limit to a very large number
> via ulimit -u 999999.
>
> Our goal is to achieve max throughput by increasing YCSB's threadcount
> parameter (i.e. the number of parallel benchmarking client threads).
> However, this only improves Cassandra throughput for low numbers of
> threads. If we move to higher threadcounts, throughput does not
> increase and even decreases. Do you have any idea why this is
> happening, and possibly suggestions on how to scale throughput to much
> higher numbers? Why is throughput hitting a wall, anyway? And where
> does the latency/throughput tradeoff come from?
>
> Here is our YCSB configuration:
> recordcount=300000
> operationcount=1000000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> readallfields=true
> readproportion=0.5
> updateproportion=0.5
> scanproportion=0
> insertproportion=0
> threadcount=500
> target=10000
> hosts=EC2-1,EC2-2,EC2-3
> requestdistribution=uniform
>
> These are typical results for threadcount=1:
> Loading workload...
> Starting test.
>  0 sec: 0 operations;
>  10 sec: 11733 operations; 1168.28 current ops/sec; [UPDATE
> AverageLatency(ms)=0.64] [READ AverageLatency(ms)=1.03]
>  20 sec: 24246 operations; 1251.68 current ops/sec; [UPDATE
> AverageLatency(ms)=0.48] [READ AverageLatency(ms)=1.11]
>
> These are typical results for threadcount=10:
> 10 sec: 30428 operations; 3029.77 current ops/sec; [UPDATE
> AverageLatency(ms)=2.11] [READ AverageLatency(ms)=4.32]
>  20 sec: 60838 operations; 3041.91 current ops/sec; [UPDATE
> AverageLatency(ms)=2.15] [READ AverageLatency(ms)=4.37]
>
> These are typical results for threadcount=100:
> 10 sec: 29070 operations; 2895.42 current ops/sec; [UPDATE
> AverageLatency(ms)=20.53] [READ AverageLatency(ms)=44.91]
>  20 sec: 53621 operations; 2455.84 current ops/sec; [UPDATE
> AverageLatency(ms)=23.11] [READ AverageLatency(ms)=55.39]
>
> These are typical results for threadcount=500:
> 10 sec: 30655 operations; 3053.59 current ops/sec; [UPDATE
> AverageLatency(ms)=72.71] [READ AverageLatency(ms)=187.19]
>  20 sec: 68846 operations; 3814.14 current ops/sec; [UPDATE
> AverageLatency(ms)=65.36] [READ AverageLatency(ms)=191.75]
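
On the latency/throughput question: each benchmark thread waits for its
current request to finish before sending the next one, so Little's Law
suggests throughput is roughly thread count divided by mean latency. A
rough Python sketch of that arithmetic, using the first 10-second
sample from each run quoted above. The function names are mine, and
weighting READ and UPDATE latency equally is an assumption taken from
the 50/50 proportions in the config:

```python
# Little's Law for a closed-loop benchmark: throughput ~= threads / mean latency.
# Latencies and measured rates are copied from the quoted 10-second samples.


def predicted_ops_per_sec(threads, update_ms, read_ms, read_fraction=0.5):
    """Throughput implied if every thread always has one request in flight."""
    mean_latency_ms = read_fraction * read_ms + (1.0 - read_fraction) * update_ms
    return threads / (mean_latency_ms / 1000.0)


# (threadcount, UPDATE latency ms, READ latency ms, measured ops/sec)
SAMPLES = [
    (1, 0.64, 1.03, 1168.28),
    (10, 2.11, 4.32, 3029.77),
    (100, 20.53, 44.91, 2895.42),
    (500, 72.71, 187.19, 3053.59),
]


def report(samples=SAMPLES):
    """Return (threads, predicted ops/sec, measured ops/sec) for each sample."""
    rows = []
    for threads, update_ms, read_ms, measured in samples:
        predicted = predicted_ops_per_sec(threads, update_ms, read_ms)
        rows.append((threads, round(predicted), measured))
    return rows


if __name__ == "__main__":
    for threads, predicted, measured in report():
        print(f"threads={threads:>3}  predicted={predicted:>5} ops/sec  "
              f"measured={measured:.0f} ops/sec")
```

Where the predicted figure tracks the measured one, the extra threads
are mostly just waiting in a queue: throughput stays at whatever the
bottleneck can sustain and latency grows roughly in proportion to the
thread count.
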
>
> We never measured more than ~6000 ops/sec. Are there ways to tune
> Cassandra that we are not aware of? We made some modifications to the
> Cassandra 0.6.5 core for experimental reasons, so it's not easy to
> switch to 0.7.x or 0.8.x. However, if this might solve the scaling
> issues, we might consider porting our modifications to a newer
> Cassandra version...
>
> Thanks,
>
> Markus Klems
>
> Karlsruhe Institute of Technology, Germany
>
