Hi Bradley, I did a similar benchmark recently, and my results are no better than yours.
My setup: 3 servers (CPU: Intel Xeon E5-2609 0 @ 2.40GHz, RAM: 32GB). I used only 2 SATA 7.2K RPM disks (2 TB) plus a 400 GB SSD for OSDs in total. The servers are connected with 10 Gbps Ethernet. Replication level: 2.

I launched 3 VMs acting as Ceph clients, then used fio to run 4K random read/write benchmarks on all VMs at the same time:

For 4K random read: 176 IOPS
For 4K random write: 474 IOPS

I don't know why the random read performance was so poor. Can someone help me out?

Here is my fio configuration:

[global]
iodepth=64
runtime=300
ioengine=libaio
direct=1
size=10G
directory=/mnt
filename=bench
ramp_time=40
invalidate=1
exec_prerun="echo 3 > /proc/sys/vm/drop_caches"

[rand-write-4k]
bs=4K
rw=randwrite

[rand-read-4k]
bs=4K
rw=randread

[seq-read-64k]
bs=64K
rw=read

[seq-write-64k]
bs=64K
rw=write

My benchmark script: https://gist.github.com/kazhang/8344180

Regards,
Kai

At 2014-01-09 01:25:17, "Bradley Kite" <bradley.k...@gmail.com> wrote:

Hi there,

I am new to Ceph and still learning its performance capabilities, but I would like to share my performance results in the hope that they are useful to others, and also to see if there is room for improvement in my setup.

Firstly, a little about my setup: 3 servers (quad-core CPU, 16GB RAM), each with 4 SATA 7.2K RPM disks (4 TB) plus a 160 GB SSD.

I have mapped a 10 GB volume to a 4th server which is acting as a Ceph client. Due to Ceph's thin provisioning, I used "dd" to write to the entire block device to ensure that the Ceph volume is fully allocated. dd writes sequentially at around 95 MB/sec, which shows the network can run at full capacity. Each server is connected to the switch by a single 1 Gbps Ethernet link.

I then used "fio" to benchmark the raw block device. The reason for this is that I also need to compare Ceph against a traditional iSCSI SAN, and the internal "rados bench" tool cannot be used for that.

The replication level for the pool I am testing against is 2. I have tried two setups for the OSDs: first with the journal running on a partition on the SSD, and second using "bcache" (http://bcache.evilpiepirate.org) to provide a write-back cache in front of the 4 TB drives.

In all tests, fio was configured to do direct I/O with 256 parallel I/Os.

With the journal on the SSD:
4K random read: around 1200 IOPS, 5 MB/s.
4K random write: around 300 IOPS, 1.2 MB/s.

Using bcache for each OSD (journal is just a file on the OSD):
4K random read: around 2200 IOPS, 9 MB/s.
4K random write: around 300 IOPS, 1.2 MB/s.

By comparison, a 12-disk RAID5 iSCSI SAN is doing ~4000 read IOPS and ~2000 write IOPS (but with 15K RPM SAS disks).

What is interesting is that bcache definitely has a positive effect on the read IOPS, but something else is the bottleneck for the writes. It looks to me like I have missed something in the configuration which brings down the write IOPS, since 300 IOPS is very poor. If, however, I turn off direct I/O in the fio tests, the write performance jumps to around 4000 IOPS. It makes no difference to the read performance, which is to be expected.

I have tried increasing the number of threads in each OSD, but that has made no difference. I have also tried images with different (smaller) stripe sizes (--order) instead of the default 4 MB, but it doesn't make any difference.

Do these figures look reasonable to others? What kind of IOPS should I be expecting?
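For reference, a minimal shell sketch of the raw-block-device methodology described above, assuming a 10 GB image named "bench" in the default "rbd" pool (the thread does not give the actual image or pool names; the fio parameters simply mirror the job files quoted further below):

  # create and map the test image (names are assumptions, not from the thread)
  rbd create rbd/bench --size 10240      # size is in MB, so 10 GB
  rbd map rbd/bench                      # appears as a /dev/rbdN device
                                         # (the job files below use /dev/rbd1)

  # write the whole device once to defeat thin provisioning
  dd if=/dev/zero of=/dev/rbd1 bs=4M oflag=direct

  # direct-I/O 4K random write against the raw block device
  fio --name=rand-write-4k --filename=/dev/rbd1 --ioengine=posixaio \
      --direct=1 --iodepth=256 --rw=randwrite --bs=4k \
      --runtime=60 --ramp_time=30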
Additional info is below:

Ceph 0.72.2 running on CentOS 6.5 (with a custom 3.10.25 kernel for bcache support).

3 servers of the following spec:
CPU: quad-core Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM: 16GB
Disks: 4x 4 TB Seagate Constellation (7.2K RPM) plus 1x Intel 160 GB DC S3500 SSD

The test pool has 400 placement groups (and placement groups for placement, pgp_num).

fio configuration - reads:

[global]
rw=randread
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-read
write_lat_log=fio-2-random-read
write_iops_log=fio-2-random-read

fio configuration - writes:

[global]
rw=randwrite
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-write
write_lat_log=fio-2-random-write
write_iops_log=fio-2-random-write
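Neither post mentions benchmarking the journal SSD on its own, but a common way to rule it out as the write bottleneck is a synchronous 4K direct-write test against a spare partition on the SSD, since OSD journal writes are synchronous. A sketch, assuming /dev/sdb2 is an unused partition on the journal SSD (a hypothetical device name; note this overwrites whatever is on that partition):

  fio --name=journal-sync-test --filename=/dev/sdb2 \
      --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting

If this reports only a few hundred IOPS, the SSD itself cannot absorb synchronous 4K writes quickly and would cap RBD write IOPS regardless of the rest of the cluster; if it reports many thousands, the write bottleneck lies elsewhere.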