Hi Pedro,

Without knowing much about your actual Ceph config file setup (ceph.conf) or 
any other factors (pool/replication setup), I'd say you're probably suffering 
because the journal is sitting on your OSDs. That is, if you created the OSDs 
without specifying an SSD (or other disk) as the journal location, the journal 
ends up ON the same disk. This means each IO is first written to the journal 
and then written again to the same disk to actually store your data. Add to 
this your replication settings (probably 3 if you're running everything at 
defaults) and that would explain those numbers.
A good explanation can be found here -> 
http://www.sebastien-han.fr/blog/2014/02/17/ceph-io-patterns-the-bad/
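
If that turns out to be the case, the journal location can be given when the 
OSD is created, or in ceph.conf before creating it. Something along these 
lines (hostname, devices and paths are just placeholders, assuming 
ceph-deploy):

ceph-deploy osd prepare node1:sdb:/dev/ssd1    # data on sdb, journal on the SSD

or in ceph.conf:

[osd]
osd journal = /ssd/ceph-$id/journal    # path on a faster device
osd journal size = 10240               # MB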

Cheers

Kris

From: Pedro Miranda <potter...@gmail.com>
Date: Mon, 20 Apr 2015 10:25:38 +0100
To: <ceph-users@lists.ceph.com>
Subject: [ceph-users] RADOS Bench slow write speed

Hi all!!
I'm setting up a Ceph (version 0.80.6) cluster and I'm benchmarking the 
infrastructure and Ceph itself. I've got 3 rack servers (Dell R630), each with 
its own disks in enclosures.
The cluster network bandwidth is 10 Gbps, the bandwidth between the RAID 
controller (Dell H830) and the enclosure (MD1400) is 48 Gbps, and the 
negotiated speed with each disk (configured as JBOD) is 6 Gbps.
We use 4 TB SAS disks, 7200 RPM, with a sustained bandwidth of 175 MB/s 
(confirmed with fio).
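
(For reference, a raw-disk sequential check of that kind looks roughly like 
this; the device name is a placeholder:)

fio --name=seqread --filename=/dev/sdX --ioengine=libaio --direct=1 \
    --rw=read --bs=4M --iodepth=32 --runtime=60 --group_reporting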

I tried to simulate a Ceph I/O pattern by writing/reading a large number of 
4 MB files on a single disk with XFS, using one instance of fio.

[writetest]
ioengine=libaio
directory=/var/local/xfs
filesize=4m
iodepth=256
rw=write
direct=1
numjobs=1
loops=1
nrfiles=18000
bs=4096k
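
Each job was run from that file with a plain fio invocation, e.g. (the job 
file name is just whatever it was saved as):

fio writetest.fio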

For the other tests (sequential reads, random writes, random reads), only the 
"rw" changes.

I got the following bandwidth:
Sequential write speed: 137 MB/s
Sequential read speed:  144.5 MB/s
Random write speed: 134 MB/s
Random read speed: 144 MB/s

OK, there is some overhead associated with writing a large number of 4 MB files.

Now I move on to benchmarking with an OSD on top of the disk, so I've got the 
cluster with only 1 OSD (separate partitions for journal and data) and run 
rados bench.
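
The runs were roughly as follows (pool name and PG count are only illustrative):

ceph osd pool create bench 128
rados bench -p bench 60 write -t 1 --no-cleanup    # keep objects for the read tests
rados bench -p bench 60 seq -t 1                   # sequential reads
rados bench -p bench 60 rand -t 1                  # random reads
(and the same three runs again with the default -t 16)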

1 Thread:
Writes: 27.7 MB/s
Reads: 90 MB/s
Random reads: 82 MB/s

16 Threads (default):
Writes: 37 MB/s
Reads: 79.2 MB/s
Random reads:  71.4 MB/s


As you can see, writes are awfully slow. What I have noticed are very high 
latencies (rados bench reports latency in seconds):

Total writes made:      16868
Write size:             4194304
Bandwidth (MB/sec):     37.449
Stddev Bandwidth:       24.606
Max bandwidth (MB/sec): 120
Min bandwidth (MB/sec): 0
Average Latency:        1.70862
Stddev Latency:         1.13098
Max latency:            6.33929
Min latency:            0.20767

Is this throughput to be expected from 4 TB, 7.2K RPM disks? Or is there 
anything in the Ceph configuration that might be changed to decrease the 
observed latency? Any suggestions?

Appreciated,
Pedro Miranda.


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com