Nicolas,

Does atop show anything out of the ordinary when you run the benchmark (both on
the Ceph nodes and the node you run the benchmark from)?
It should give a good indication of what could be limiting your performance.

I would highly recommend against using a 9-disk RAID0 for the OSD disks:
* I expect it to be significantly slower.
* Failure of one disk would mean re-syncing 9x the amount of data (see the
  rough numbers sketched below). This could take ages, during which you have
  significantly reduced performance.
* There seems to be a significant chance of a catastrophic failure losing all
  data.

If you really want to use RAID, I would use RAID10 and do 2 replicas instead
of 3.
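
To put rough numbers on the re-sync point, here is a quick back-of-the-envelope
sketch in Python. The disk size and backfill speed are made-up example figures,
not measurements from your cluster:

# Rough comparison: losing one disk out of 9 standalone OSDs versus
# losing one disk in a 9-disk RAID0 used as a single big OSD.
# Disk size and recovery speed are arbitrary example assumptions.

disk_tb = 4.0            # assumed capacity per disk, in TB
backfill_mb_s = 100.0    # assumed sustained recovery speed, in MB/s

def recovery_hours(data_tb, speed_mb_s):
    """Hours needed to re-replicate the given amount of data."""
    return data_tb * 1024 * 1024 / speed_mb_s / 3600

# One OSD per disk: a single disk failure re-replicates ~1 disk of data.
print("recover 1 disk : ~%.0f h" % recovery_hours(disk_tb, backfill_mb_s))

# 9-disk RAID0 as one OSD: any single disk failure kills the whole array,
# so ~9 disks worth of data has to be re-replicated (and with 9 spindles
# you also have roughly 9x the chance of hitting such a failure).
print("recover RAID0  : ~%.0f h" % recovery_hours(9 * disk_tb, backfill_mb_s))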


Cheers,
Robert van Leeuwen





________________________________
From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com] on 
behalf of nicolasc [nicolas.cance...@surfsara.nl]
Sent: Thursday, December 12, 2013 5:23 PM
To: Craig Lewis; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Impact of fancy striping

Hi James, Robert, Craig,

Thank you for those informative answers! You all pointed out interesting
issues.

I know losing 1 SAS disk in RAID0 means losing all journals, but this is for 
testing so I do not care.

I do not think sequential write speed to the RAID0 array is the bottleneck (I
benchmarked it at more than 500MB/s). However, I failed to realize that the
synchronous writes of several OSDs would become random instead of sequential;
thank you for explaining that.
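
As a quick sanity check on how big that effect is, a tiny Python sketch (the
IOPS figure is just Robert's 100-200 IOPS ballpark for a single spindle, and
the 4k write size is an assumption):

# Several journals interleaved on one spinning disk turn per-journal
# sequential streams into random IO. Figures below are rough assumptions.

seq_mb_s = 500.0      # measured sequential write speed of the RAID0 array
random_iops = 150.0   # ballpark random IOPS for a single SAS spindle
write_kb = 4.0        # typical small synchronous journal write

random_mb_s = random_iops * write_kb / 1024
print("sequential ceiling : ~%.0f MB/s" % seq_mb_s)     # ~500 MB/s
print("random 4k ceiling  : ~%.1f MB/s" % random_mb_s)  # ~0.6 MB/s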

I want to try this setup with several journals on a single partition (to
mitigate seek time), and I also want to try replacing my 9 OSDs (per node) with
a big RAID0 array of 9 disks, leaving replication to Ceph. But first I wanted
to get an idea of SSD performance, so I created a 1GB RAMdisk for every OSD
journal.

Shockingly, even with every journal on a dedicated RAMdisk, I still witnessed
less than 100MB/s sequential writes with 4MB blocks. This is when writing to an
RBD image, regardless of the format, the size, the striping pattern, or whether
the image is mounted (with XFS on it) or accessed directly.

So, maybe my journal setup is not satisfactory, but the bottleneck seems to be
somewhere else. Any idea at all about striping? Or maybe the pool/PG config? (I
blindly followed the PG ratios indicated in the docs.)
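
For reference, a minimal sketch of the usual rule of thumb from the docs,
assuming that is what the ratios refer to: roughly (number of OSDs x 100)
divided by the replica count, rounded up to a power of two. The node count in
the example is made up:

# Rule-of-thumb placement group count from the Ceph docs:
#   total_pgs ~= (num_osds * 100) / pool_size, rounded up to a power of two.

def suggested_pg_num(num_osds, pool_size, pgs_per_osd=100):
    target = num_osds * pgs_per_osd / float(pool_size)
    pg_num = 1
    while pg_num < target:
        pg_num *= 2
    return pg_num

# Example: 3 nodes x 9 OSDs with 3 replicas (3 nodes is an assumption).
print(suggested_pg_num(num_osds=27, pool_size=3))   # -> 1024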

Thank you all for your help. Best regards,

Nicolas Canceill
Scalable Storage Systems
SURFsara (Amsterdam, NL)




On 12/06/2013 07:31 PM, Robert van Leeuwen wrote:

If I understand correctly you have one SAS disk as a journal for multiple OSDs.
If you do small synchronous writes it will become an IO bottleneck pretty
quickly: due to multiple journals on the same disk it will no longer be
sequential writes to one journal, but 4k writes to x journals, making it fully
random.
I would expect a performance of 100 to 200 IOPS max.
Doing an iostat -x or atop should show this bottleneck immediately.
This is also the reason to go with SSDs: they have reasonable random IO
performance.

Cheers,
Robert van Leeuwen

Sent from my iPad



On 6 Dec. 2013, at 17:05, "nicolasc" <nicolas.cance...@surfsara.nl> wrote:



