Good point. When we looked at the EC2 nodes, we measured roughly 120% CPU 
utilization. Since that figure sums the per-core percentages and our EC2 
nodes have 8 virtual cores each (i.e. up to 800%), we read it as Cassandra 
keeping little more than one core busy rather than the nodes being CPU-bound.
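
For what it's worth, the per-core view makes this easier to judge than the 
summed figure; something like the following shows it on the nodes (mpstat 
assumes the sysstat package is installed):

  mpstat -P ALL 5   # per-core utilization, sampled every 5 seconds
  top               # then press '1' to toggle the per-CPU view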

Maybe Cassandra 0.6.5 does not make good use of multi-core systems?

On 15.02.2011, at 20:59, Thibaut Britz <thibaut.br...@trendiction.com> wrote:

> Cassandra is very CPU hungry so you might be hitting a CPU bottleneck.
> What's your CPU usage during these tests?
> 
> 
> On Tue, Feb 15, 2011 at 8:45 PM, Markus Klems <mar...@klems.eu> wrote:
>> Hi there,
>> 
>> we are currently benchmarking a Cassandra 0.6.5 cluster with 3
>> High-Mem Quadruple Extra Large EC2 nodes
>> (http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
>> (replication factor is 3, random partitioner). We assigned 32 GB RAM
>> to the JVM and left 32 GB RAM for the Ubuntu Linux filesystem buffer.
>> We also raised the per-user process limit to a very large number via
>> ulimit -u 999999.
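>> 
>> For reference, this is roughly how we set it (Cassandra 0.6.x layout;
>> file and flag names from memory, so treat it as a sketch):
>> 
>>   # bin/cassandra.in.sh: give the JVM half of the 64 GB box and leave
>>   # the rest to the filesystem buffer
>>   JVM_OPTS="$JVM_OPTS -Xms32G -Xmx32G"
>> 
>>   # shell limit, set before starting the daemon
>>   ulimit -u 999999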
>> 
>> Our goal is to achieve max throughput by increasing YCSB's threadcount
>> parameter (i.e. the number of parallel benchmarking client threads).
>> However, this only improves Cassandra throughput for low numbers of
>> threads. If we move to higher threadcounts, throughput stops increasing
>> and even decreases. Do you have any idea why this is happening, and
>> possibly suggestions on how to scale throughput to much higher numbers?
>> Why is throughput hitting a wall anyway? And where does the
>> latency/throughput tradeoff come from?
>> 
>> Here is our YCSB configuration:
>> recordcount=300000
>> operationcount=1000000
>> workload=com.yahoo.ycsb.workloads.CoreWorkload
>> readallfields=true
>> readproportion=0.5
>> updateproportion=0.5
>> scanproportion=0
>> insertproportion=0
>> threadcount=500
>> target=10000
>> hosts=EC2-1,EC2-2,EC2-3
>> requestdistribution=uniform
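>> 
>> We start the runs roughly like this (classpath and the binding class are
>> placeholders from memory, so treat it as a sketch; -threads and -target
>> override the threadcount and target properties above):
>> 
>>   java -cp build/ycsb.jar:db/cassandra-0.6/lib/* com.yahoo.ycsb.Client -t \
>>     -db <cassandra-binding-class> \
>>     -P our-workload.properties \
>>     -threads 500 -target 10000 -s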
>> 
>> These are typical results for threadcount=1:
>> Loading workload...
>> Starting test.
>>  0 sec: 0 operations;
>>  10 sec: 11733 operations; 1168.28 current ops/sec; [UPDATE
>> AverageLatency(ms)=0.64] [READ AverageLatency(ms)=1.03]
>>  20 sec: 24246 operations; 1251.68 current ops/sec; [UPDATE
>> AverageLatency(ms)=0.48] [READ AverageLatency(ms)=1.11]
>> 
>> These are typical results for threadcount=10:
>> 10 sec: 30428 operations; 3029.77 current ops/sec; [UPDATE
>> AverageLatency(ms)=2.11] [READ AverageLatency(ms)=4.32]
>>  20 sec: 60838 operations; 3041.91 current ops/sec; [UPDATE
>> AverageLatency(ms)=2.15] [READ AverageLatency(ms)=4.37]
>> 
>> These are typical results for threadcount=100:
>> 10 sec: 29070 operations; 2895.42 current ops/sec; [UPDATE
>> AverageLatency(ms)=20.53] [READ AverageLatency(ms)=44.91]
>>  20 sec: 53621 operations; 2455.84 current ops/sec; [UPDATE
>> AverageLatency(ms)=23.11] [READ AverageLatency(ms)=55.39]
>> 
>> These are typical results for threadcount=500:
>> 10 sec: 30655 operations; 3053.59 current ops/sec; [UPDATE
>> AverageLatency(ms)=72.71] [READ AverageLatency(ms)=187.19]
>>  20 sec: 68846 operations; 3814.14 current ops/sec; [UPDATE
>> AverageLatency(ms)=65.36] [READ AverageLatency(ms)=191.75]
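>> 
>> A rough sanity check: in each run throughput comes out close to
>> threadcount divided by the mean operation latency (Little's law). At
>> threadcount=500 the mean latency is about 0.5*65 ms + 0.5*192 ms = 128 ms,
>> and 500 / 0.128 s = ~3900 ops/sec, which is roughly what we see above. So
>> past ~10 threads, adding client threads mostly adds latency, not throughput.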
>> 
>> We never measured more than ~6000 ops/sec. Are there ways to tune
>> Cassandra that we are not aware of? We made some modifications to the
>> Cassandra 0.6.5 core for experimental reasons, so it's not easy to
>> switch to 0.7.x or 0.8.x. However, if that might solve the scaling
>> issues, we would consider porting our modifications to a newer
>> Cassandra version...
>> 
>> Thanks,
>> 
>> Markus Klems
>> 
>> Karlsruhe Institute of Technology, Germany
>> 
