Re: High performance disk io

Hiller, Dean Wed, 22 May 2013 07:33:51 -0700

Well, if you just want to lower your I/O util %, you could always just add more 
nodes to the cluster ;).

Dean

From: Igor <i...@4friends.od.ua>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, May 22, 2013 8:06 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: High performance disk io

Hello

What level of read performance do you expect? We have a limit of 15 ms for the 
99th percentile, with average read latency near 0.9 ms. For some CFs the 99th 
percentile is actually 2 ms, for others 10 ms; this depends on the data volume 
you read in each query.
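
To make a target like this concrete, here is a minimal Python sketch of checking a set of read latencies against a 99th-percentile limit. The function names, the nearest-rank percentile method, and the sample values are all assumptions for illustration; this is not how Cassandra itself reports latency.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so very small pct values still work.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_reads(latencies_ms, p99_limit_ms=15.0):
    """Return the mean, the 99th percentile, and whether the limit is met.
    The 15 ms default mirrors the limit mentioned above."""
    p99 = percentile(latencies_ms, 99)
    mean = sum(latencies_ms) / len(latencies_ms)
    return {"mean_ms": mean, "p99_ms": p99, "ok": p99 <= p99_limit_ms}


if __name__ == "__main__":
    # Hypothetical samples: mostly fast reads with a couple of slow ones.
    samples = [0.9] * 98 + [20.0] * 2
    print(summarize_reads(samples))
```

A low average can hide a slow tail, which is why the limit is set on the 99th percentile rather than on the mean.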

Tuning read performance involved cleaning up the data model, tuning 
cassandra.yaml, switching from Hector to Astyanax, and tuning OS parameters.

On 05/22/2013 04:40 PM, Christopher Wirt wrote:
Hello,

We’re looking at deploying a new ring where we want the best possible read 
performance.

We’ve set up a cluster of 6 nodes with replication factor 3, 32GB of memory, 
8GB heap and 800MB keycache, each holding 40-50GB of data on a 200GB SSD, with 
a 500GB SATA drive for the OS and commitlog.
There are three column families:
ColFamily1 50% of the load and data
ColFamily2 35% of the load and data
ColFamily3 15% of the load and data

At the moment we are still seeing around 20% disk utilisation, and occasionally 
as high as 40-50% on some nodes at peak time. We are conducting some semi-live 
testing.
CPU looks fine, memory is fine, and the keycache hit rate is about 80% (could be 
better, so maybe we should increase the keycache size?).

Anyway, we’re looking into what we can do to improve this.

One conversation we are having at the moment is around the SSD disk setup.

We are considering moving to have 3 smaller SSD drives and spreading the data 
across those.

The possibilities are:
-We have a RAID0 of the smaller SSDs and hope that improves performance.
Will this actually yield better throughput?

-We mount the SSDs to different directories and define multiple data 
directories in cassandra.yaml.
Will not having a RAID controller layer improve the throughput?

-We mount the SSDs to different column family directories and have a single 
data directory declared in cassandra.yaml.
I think this is quite an attractive idea.
What are the drawbacks? Would the system column families be on the main SATA drive?

-We don’t change anything and just keep upping our keycache.
-Anything else you guys can think of.

Ideas and thoughts welcome. Thanks for your time and expertise.

Chris
