Hi there,

I am new to Ceph and still learning its performance capabilities, but I
would like to share my performance results in the hope that they are useful
to others, and also to see if there is room for improvement in my setup.

Firstly, a little about my setup:

3 servers (quad-core CPU, 16GB RAM), each with 4 SATA 7.2K RPM disks (4TB)
plus a 160GB SSD.

I have mapped a 10GB volume to a 4th server which is acting as a Ceph
client. Because of Ceph's thin provisioning, I used "dd" to write to the
entire block device first, to ensure that the volume is fully allocated. dd
writes sequentially at around 95 MB/s, which shows the network can run at
full capacity.
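
For reference, the volume was created, mapped and pre-filled along these
lines (pool and image names here are only placeholders, not my exact ones):

rbd create rbd/bench-vol --size 10240              # 10GB image
rbd map rbd/bench-vol                              # appears as e.g. /dev/rbd1 on the client
dd if=/dev/zero of=/dev/rbd1 bs=4M oflag=direct    # touch every block so the image is fully allocated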

Each machine is connected to the switch by a single 1Gbps Ethernet link.

I then used "fio" to benchmark the raw block device. The reason for this is
that I also need to compare ceph against a traditional iscsi SAN and the
internal "rados bench" tools cannot be used for this.

The replication level for the pool I am testing against is 2.
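
(Pool name below is just a placeholder - replication was checked and set
with the standard commands:)

ceph osd pool get bench-pool size     # show current replica count
ceph osd pool set bench-pool size 2   # two copies of each object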

I have tried two setups with regard to the OSDs - firstly with the journal
running on a partition on the SSD, and secondly using "bcache" (
http://bcache.evilpiepirate.org) to provide a write-back cache in front of
the 4TB drives.
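
In case it is relevant, the two setups were configured roughly as follows
(device names, partition labels and UUIDs below are only placeholders, not
my exact ones). Journal on the SSD, set per OSD in ceph.conf:

[osd.0]
osd journal = /dev/disk/by-partlabel/ceph-journal-0

bcache, with the SSD as the cache device and each 4TB drive as a backing
device in writeback mode:

make-bcache -C /dev/sde                                    # SSD as cache device
make-bcache -B /dev/sdb                                    # 4TB drive as backing device
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach   # attach backing device to the cache set
echo writeback > /sys/block/bcache0/bcache/cache_mode      # enable write-back caching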

In all tests, fio was configured to do direct I/O with 256 parallel I/Os.

With the journal on the SSD:

4k random read: around 1200 IOPS, ~5 MB/s.
4k random write: around 300 IOPS, ~1.2 MB/s.

Using bcache for each OSD (journal is just a file on the OSD):
4k random read: around 2200 IOPS, ~9 MB/s.
4k random write: around 300 IOPS, ~1.2 MB/s.

By comparison, a 12-disk RAID5 iSCSI SAN is doing ~4000 read IOPS and ~2000
write IOPS (but with 15K RPM SAS disks).

What is interesting is that bcache definitely has a positive effect on the
read IOPS, but something else is the bottleneck for the writes.

It looks to me like I have missed something in the configuration which is
holding down the write IOPS - 300 IOPS is very poor. If, however, I turn
off direct I/O in the fio tests, the performance jumps to around 4000 IOPS.
It makes no difference to the read performance, which is to be expected.
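
For the non-direct run the only change to the fio job file was in the
global section:

direct=0    # buffered I/O instead of O_DIRECT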

I have tried increasing the number of threads in each OSD but that has made
no difference.
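
For reference, the settings I was adjusting were along these lines in
ceph.conf (the values shown are only examples):

[osd]
osd op threads = 8      # default is 2
osd disk threads = 2    # default is 1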

I have also tried images with different (smaller) stripe sizes (--order)
instead of the default 4MB, but it doesn't make any difference.
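
For example, recreating the image with 1MB objects instead of the default
4MB (the order is log2 of the object size in bytes; the image name is a
placeholder):

rbd create rbd/bench-vol-1m --size 10240 --order 20   # 2^20 = 1MB objects (default order 22 = 4MB)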

Do these figures look reasonable to others? What kind of IOPS should I be
expecting?

Additional info is below:

Ceph 0.72.2 running on CentOS 6.5 (with a custom 3.10.25 kernel for bcache
support)
3 servers of the following spec:
CPU: Quad Core Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM: 16GB
Disks: 4x 4TB Seagate Constellation (7.2K RPM) plus 1x Intel 160GB DC S3500
SSD

Test pool has 400 placement groups (and 400 placement groups for placement).
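
(The pool was created with the usual command; the pool name is a
placeholder:)

ceph osd pool create bench-pool 400 400    # pg_num and pgp_num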

fio configuration - read:
[global]
rw=randread
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-read
write_lat_log=fio-2-random-read
write_iops_log=fio-2-random-read

fio configuration - writes:
[global]
rw=randwrite
filename=/dev/rbd1
ioengine=posixaio
iodepth=256
direct=1
runtime=60
ramp_time=30
blocksize=4k
write_bw_log=fio-2-random-write
write_lat_log=fio-2-random-write
write_iops_log=fio-2-random-write