In fact, with my cassandra-0.6.2, I can only get about 40~50 reads/s with the Key/Row cache disabled.
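For reference, disabling both caches in a 0.6-style storage-conf.xml looks roughly like the sketch below; the keyspace/ColumnFamily names are just the stock examples, and the exact attributes may differ between 0.6.x releases:

    <Keyspace Name="Keyspace1">
      <!-- Key and row caches set to zero entries, i.e. disabled,
           for the benchmark run (example CF only) -->
      <ColumnFamily Name="Standard1" CompareWith="BytesType"
                    KeysCached="0" RowsCached="0"/>
    </Keyspace>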
On Sun, Jul 18, 2010 at 1:02 AM, Schubert Zhang <zson...@gmail.com> wrote:
> Hi Jonathan,
> The 7k reads/s is very high, could you please explain more about your
> benchmark?
>
> 7,000 reads/s means the average latency of each read is only 0.143ms.
> Considering the 2 disks in the benchmark, that may be 0.286ms per disk.
>
> But in most random-read applications on a very large dataset, the OS cache
> and the Cassandra Key/Row cache are not so effective. So I guess that for a
> random-read test on a large dataset (such as 1TB), the result may not be so
> good.
>
>
> On Sat, Jul 17, 2010 at 9:07 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> On Fri, Jul 16, 2010 at 6:06 PM, Oren Benjamin <o...@clearspring.com>
>> wrote:
>> > The first goal was to reproduce the test described on spyced here:
>> http://spyced.blogspot.com/2010/01/cassandra-05.html
>> >
>> > Using Cassandra 0.6.3, a 4GB/160GB cloud server
>> (http://www.rackspacecloud.com/cloud_hosting_products/servers/pricing)
>> with the default storage-conf.xml and cassandra.in.sh, here's what I got:
>> >
>> > Reads: 4,800/s
>> > Writes: 9,000/s
>> >
>> > Pretty close to the result posted on the blog, with slightly lower
>> write performance (perhaps due to the availability of only a single disk
>> for both commitlog and data).
>>
>> You're getting as close as you are because you're comparing 0.6
>> numbers with 0.5. For 0.6 on the test machine used in the blog post
>> (quad core, 2 disks, 4GB) we were getting 7k reads and 14k writes.
>>
>> In our tests we saw a 5-15% performance penalty from adding a
>> virtualization layer. Things like only having a single disk are going
>> to stack on top of that.
>>
>> > The above was single-node testing. I'd expect to be able to add nodes
>> and scale throughput. Unfortunately, I seem to be running into a cap of
>> 21,000 reads/s regardless of the number of nodes in the cluster.
>>
>> This is what I would expect if a single machine is handling all the
>> Thrift requests. Are you spreading the client connections across all the
>> machines?
>>
>> > The disk performance of the cloud servers has been extremely spotty...
>> Is this normal for the cloud?
>>
>> Yes.
>>
>> > And if so, what's the solution re Cassandra?
>>
>> The larger the instance you're using, the closer you are to having the
>> entire machine, meaning fewer other users are competing with you for
>> disk i/o.
>>
>> Of course, when you're renting the entire machine's worth, it can be
>> more cost-effective to just use dedicated hardware.
>>
>> > However, Cassandra routes to the nearest node topologically and not to
>> the best-performing one, so "bad" nodes will always result in high-latency
>> reads.
>>
>> Cassandra routes reads around nodes with temporarily poor performance
>> in 0.7, btw.
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
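P.S. Regarding the question above about spreading the client connections: below is a minimal sketch of what that can look like from a single Python client, assuming the Thrift-generated 0.6 bindings and the stock Keyspace1/Standard1 schema. The node addresses, key range, and column name are placeholders, and a real benchmark would run many such loops in parallel across threads or processes:

    import random
    import time

    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from cassandra import Cassandra
    from cassandra.ttypes import ColumnPath, ConsistencyLevel, NotFoundException

    NODES = ['10.0.0.1', '10.0.0.2', '10.0.0.3']  # placeholder node addresses
    NUM_KEYS = 1000000                            # keys assumed already inserted
    NUM_READS = 100000

    def connect(host):
        # One buffered Thrift connection per node (0.6 uses unframed transport).
        transport = TTransport.TBufferedTransport(TSocket.TSocket(host, 9160))
        transport.open()
        return Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))

    clients = [connect(h) for h in NODES]
    path = ColumnPath(column_family='Standard1', column='c1')

    start = time.time()
    for _ in xrange(NUM_READS):
        key = str(random.randint(0, NUM_KEYS - 1))
        client = random.choice(clients)  # pick a coordinator node at random
        try:
            # 0.6 Thrift API: the keyspace is passed on each call
            client.get('Keyspace1', key, path, ConsistencyLevel.ONE)
        except NotFoundException:
            pass
    elapsed = time.time() - start
    print '%d reads in %.1fs => %.0f reads/s' % (NUM_READS, elapsed, NUM_READS / elapsed)

The point is only that every node in NODES acts as a coordinator for a share of the requests, so no single machine has to handle all of the Thrift traffic.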