Thanks Mark for the response. My comments inline...
From: Mark Nelson <mark.nel...@inktank.com>
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Rados bench result when increasing OSDs
On 10/21/2013 09:13 AM, Guang Yang wrote:
> Dear ceph-users,
Hi!
> Recently I deployed a ceph cluster with RadosGW, from a small one (24 OSDs)
> to a much bigger one (330 OSDs).
>
> When using rados bench to test the small cluster (24 OSDs), it showed the
> average latency was around 3ms (object size is 5K), while for the larger one
> (330 OSDs), the average latency was around 7ms (object size 5K), twice
> that of the small cluster.
Did you have the same number of concurrent requests going?
[yguang] Yes. I ran the test with 3 or 5 concurrent requests; that does not
change the result.
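For reference, the same kind of measurement can also be scripted directly
against librados with the python-rados bindings. Below is a minimal sketch,
not the actual test setup: the pool name, write counts and thread count are
placeholders; only the 5K object size matches the test.

# Small-object write benchmark with a fixed number of concurrent writers,
# using the python-rados bindings.  Pool name and counts are placeholders.
import threading
import time

import rados

POOL = "bench-test"      # placeholder pool name
OBJ_SIZE = 5 * 1024      # 5K objects, as in the test
CONCURRENCY = 5          # fixed number of concurrent writers
WRITES_PER_THREAD = 200

payload = b"x" * OBJ_SIZE
latencies = []
lock = threading.Lock()

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

def writer(tid):
    ioctx = cluster.open_ioctx(POOL)      # one ioctx per thread, to keep it simple
    for i in range(WRITES_PER_THREAD):
        name = "bench-%d-%d" % (tid, i)
        start = time.time()
        ioctx.write_full(name, payload)   # synchronous full-object write
        with lock:
            latencies.append(time.time() - start)
        ioctx.remove_object(name)         # clean up the test object
    ioctx.close()

threads = [threading.Thread(target=writer, args=(t,)) for t in range(CONCURRENCY)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("writes: %d, average latency: %.2f ms"
      % (len(latencies), 1000.0 * sum(latencies) / len(latencies)))

cluster.shutdown()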
>
> The OSDs within the two clusters have the same configuration: SAS disks, with
> two partitions per disk, one for the journal and the other for metadata.
>
> For PG numbers, the small cluster was tested with a pool having 100 PGs, and
> for the large cluster the pool has 43333 PGs (as I will further scale the
> cluster, I chose a much larger PG count).
Forgive me if this is a silly question, but were the pools using the
same level of replication?
[yguang] Yes, both have 3 replicas.
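Just to be able to double-check that quickly on both sides, the pool size can
be read from the command line; a small sketch (conf paths and pool names are
placeholders):

# Confirm both benchmark pools use the same replication size before
# comparing latency numbers.  Conf paths and pool names are placeholders.
import subprocess

def pool_size(conf, pool):
    # "ceph osd pool get <pool> size" prints a line like "size: 3"
    out = subprocess.check_output(
        ["ceph", "--conf", conf, "osd", "pool", "get", pool, "size"])
    return int(out.decode().split(":")[1])

small = pool_size("/etc/ceph/small.conf", "bench-small")   # 24-OSD cluster
large = pool_size("/etc/ceph/large.conf", "bench-large")   # 330-OSD cluster

print("replication sizes: small=%d large=%d" % (small, large))
if small != large:
    print("warning: pools differ in replication, latencies are not comparable")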
>
> Does my test result make sense? That is, as the PG number and OSD count
> increase, might the latency increase as well?
You wouldn't necessarily expect a larger cluster to show higher latency
if the nodes, pools, etc. were all configured exactly the same,
especially if you were using the same amount of concurrency. It's
possible that you have some slow drives on the larger cluster that could
be causing the average latency to increase. If there are more disks per
node, that could do it too.
[yguang] Glad to know this :) I will need to gather more information on whether
there are any slow disks, and will get back on this.
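One quick first pass is "ceph osd perf", which reports per-OSD commit/apply
latency; something like the sketch below can flag outliers. The exact JSON
field names are an assumption and may differ between Ceph versions.

# Flag OSDs whose commit latency is far above the cluster median, as a
# first pass at spotting slow disks.  The JSON field names below are an
# assumption and may differ between Ceph versions.
import json
import subprocess

out = subprocess.check_output(["ceph", "osd", "perf", "--format", "json"])
perf = json.loads(out.decode())

stats = [(o["id"], o["perf_stats"]["commit_latency_ms"])
         for o in perf["osd_perf_infos"]]

latencies = sorted(lat for _, lat in stats)
median = latencies[len(latencies) // 2]

for osd_id, commit_ms in sorted(stats, key=lambda s: -s[1]):
    if commit_ms > max(3 * median, 50):   # arbitrary threshold for this sketch
        print("osd.%d looks slow: commit latency %d ms (cluster median %d ms)"
              % (osd_id, commit_ms, median))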
Are there any other differences you can think of?
[yguang] Another difference is that, for the large cluster, as we expect to scale
it to more than a thousand OSDs, we pre-created a large number of PGs (43333).
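As a back-of-the-envelope check, that pre-created PG count puts a much heavier
PG load on each OSD of the large cluster than the 100-PG pool does on the small
one, measured against the commonly cited target of roughly 100 PG copies per OSD:

# PG copies per OSD for the two test pools, counting replicas.  A common
# rule of thumb aims for roughly 100 PG copies per OSD across all pools.
def pg_copies_per_osd(pgs, replicas, osds):
    return pgs * replicas / float(osds)

print("small cluster: %.0f PG copies per OSD" % pg_copies_per_osd(100, 3, 24))      # ~12
print("large cluster: %.0f PG copies per OSD" % pg_copies_per_osd(43333, 3, 330))   # ~394

Whether that extra per-PG overhead explains any of the latency difference I
cannot say, but it is another variable between the two setups.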
Thanks,
Guang